LaCie 5big Network 2 (10 TB)
- CPU armv5tel (Feroceon 88FR131 rev 1 (v5l))
- RAM 512 MB
- Flash ROM
- NIC 2x Marvell Gigabit Ethernet 10/100/1000 Base-TX
- USB
- internal HDD supports BASIC (1 drive), RAID 0 or 1 (2 drives), RAID 0 or 5 (3 drives), RAID 0, 5 or 6 (4 or 5 drives)
- SATA Controller
- Drive capacity: 0, 5, 10, and 15 TB models available
- Fan Ultra-quiet cooling system with self-stabilizing oil-pressure bearing technology
- Initial firmware v2.0.5, upgraded to v2.2.8
cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md4 : active raid5 sda2[0] sde2[4] sdd2[3] sdc2[2] sdb2[1]
      7805956096 blocks super 1.0 level 5, 512k chunk, algorithm 2 [5/5] [UUUUU]
md3 : active raid1 sda5[0] sde5[4] sdd5[3] sdc5[2] sdb5[1]
      255936 blocks [5/5] [UUUUU]
md2 : active raid1 sda9[0] sde9[4] sdd9[3] sdc9[2] sdb9[1]
      875456 blocks [5/5] [UUUUU]
md1 : active raid1 sda8[0] sde8[4] sdd8[3] sdc8[2] sdb8[1]
      843328 blocks [5/5] [UUUUU]
md0 : active raid1 sde7[3] sdd7[4] sdc7[2] sdb7[1] sda7[0]
      16000 blocks [5/5] [UUUUU]
Links
Unofficial wiki
Most of the info here comes from that wiki
- Section for Lacie 2big network 2
- Section for Lacie 5big network 2, which contains much less info, but most of the 2big info also applies to the 5big
Root
Custom capsule
I created a custom capsule (a custom firmware), which is very easy to do thanks to the provided script.
Not much to say: just execute the script and answer a few questions.
Then, to flash it, you can use LaCieNetworkAssistant, or just drop the capsule file into the Share/Update share and reboot the NAS. I prefer the latter method, as it is less dependent on network operations.
Note that if you use LaCieNetworkAssistant and it fails to update the firmware, the advice is to disable IPConf support. To do that on Linux, once the assistant is launched, right-click on its icon in the task bar => Preferences.
Example:
New capsule built: '/home/phil/Downloads/lacie/capsule/lacie/5bignetwork2_2.2.8.1.capsule'

After upgrading you can:
- use SSH/SFTP with root privileges with your Lacie NAS with login 'root'
  (login: root | password: [same as admin password] | Port: 2222)
- use Transmission Web Interface (url: http://YOUR.LACIE.NAS.IP:9091/transmission/web/)
  Don't forget to change the transmission download path.
- access, after configuring port forwarding in your router, from a external network:
  - Lacie Dashboard: http://YOUR.NETWORK.EXTERNAL.IP/
  - Transmission Web Interface: http://YOUR.NETWORK.EXTERNAL.IP:9091/transmission/web/
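If the public share is reachable over SMB from a Linux client, dropping the capsule into the Update folder can also be scripted. A minimal sketch, assuming the share is called Share and using the capsule path from the example above (share name, NAS IP and credentials are assumptions, adjust to your setup):
# Hypothetical sketch: push the capsule into the Update folder of the NAS share
cd /home/phil/Downloads/lacie/capsule/lacie/
smbclient //YOUR.LACIE.NAS.IP/Share -U admin -c 'cd Update; put 5bignetwork2_2.2.8.1.capsule'
# ...then reboot the NAS (from the Dashboard) so it picks up the capsule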
Authentication
The root password can be permanently changed by editing /usr/lib/python2.6/site-packages/unicorn/authentication/local/user.py and looking for 'root:$1...'
This step is automated when you create a custom capsule, see sshd.i.txt
SSH runs on port 2222 and you can make use of /root/ssh/authorized_keys as usual
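For the manual route, one approach that should work is to generate a new MD5 crypt hash on any Linux machine and substitute it on the 'root:$1...' line of user.py; the exact layout of that line is not reproduced here. A sketch:
# Generate a new MD5 crypt ($1$) hash for the desired password
openssl passwd -1 'new_root_password'
# Paste the resulting '$1$...' string in place of the old hash in user.py
# (the surrounding line format in user.py is an assumption, check the file first)

# Then the usual SSH usage, just on port 2222
ssh -p 2222 root@YOUR.LACIE.NAS.IP
ssh-copy-id -p 2222 root@YOUR.LACIE.NAS.IP   # if ssh-copy-id is available on your client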
Misc
Serial port
There is probably a serial port on the board giving access to the U-Boot console
New disks
In case we need to play with new disks, keep in mind their UUID must be injected into the EEPROM, see here
LaCieNetworkAssistant
These are the tools provided on the CD-ROM. A Linux version is included, in several packagings: self-extracting archive, tar.gz, rpm and deb.
But they are only for i386 and won't work as such on an amd64 system.
Note that I think we can live without them, everything can be done via the web interface and the shared drive.
To try the deb, we have to force things a bit (it would probably be better to repackage it):
sudo dpkg -i --force-architecture --force-depends /media/cdrom/Linux/LaCieNetworkAssistant-1.4.1-Linux.deb
It provides a few binaries:
/usr/bin/LaCieNetworkAssistant
/usr/bin/zsudo
/usr/bin/tarTine
/usr/bin/LCtftpd
Installing the existing ia32 libraries is not enough, some are missing:
$ ldd LaCieNetworkAssistant | grep "not found"
        libsmbclient.so.0 => not found
To solve it, you can download the i386 version and copy libsmbclient.so.0 to /usr/lib32
But this one has its own dependencies:
$ ldd libsmbclient.so.0 | grep "not found"
        libtalloc.so.2 => not found
        libwbclient.so.0 => not found
So, same thing: download and copy libsmbclient.so.0, libtalloc.so.2, libtalloc.so.2.0.7 and libwbclient.so.0 to /usr/lib32
I also got an error linked to libtdb1, which is in ia32-libs, so again, get it and copy libtdb.so.1 and libtdb.so.1.2.9 to /usr/lib32
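A compact way to do the same thing, assuming you fetch the i386 .deb files instead of unpacking by hand (the .deb file names below are placeholders for whatever versions you download):
# Hypothetical sketch: extract the i386 .debs locally and copy the libs to /usr/lib32
mkdir -p /tmp/i386libs
for deb in libsmbclient_*_i386.deb libtalloc2_*_i386.deb libwbclient0_*_i386.deb libtdb1_*_i386.deb; do
    dpkg-deb -x "$deb" /tmp/i386libs
done
sudo cp -a /tmp/i386libs/usr/lib/lib*.so.* /usr/lib32/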
And now:
export GTK_PATH='/usr/lib32/gtk-2.0'
LaCieNetworkAssistant
Reset
See the LaCie website; it's possible to reset to the default firmware, with or without data loss.
Transmission
I restored a vanilla transmission via the custom capsule to get the web interface.
NEVER launch or stop the Transmission daemon via the LaCie web interface: it would restore settings.json to its defaults. Note that this behaviour might be permanently changed by mangling /etc/initng/transmission.i and /usr/lib/python2.6/site-packages/unicorn/download/torrent.py
There should be a way to disable the LaCie interface, see /usr/lib/unicorn/webapp2/controller/download.py /usr/lib/unicorn/updaterapp/modules/download.py /usr/lib/unicorn/webapp/modules/neko/download.py and /usr/lib/unicorn/unicorn.conf
Once the web interface is active, you can also activate the remote control interface:
- Stop the daemon
ngc --stop transmission
- Edit /lacie/torrent_dir/transmission/settings.json
"rpc-enabled": true, "rpc-password": "your_password", # note that it will be encrypted next time automatically "rpc-port": 9091, "rpc-username": "your_name", "rpc-whitelist-enabled": "false", "rpc-authentication-required": "true",
Options are explained here
ngc --start transmission
Now you can use a remote client:
apt-get install transgui
And edit the other settings, among others the download-dir, to point at some /shares/...
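If you prefer a command-line client over transgui, transmission-remote should also work against the same RPC interface. A quick sketch, where the NAS IP, credentials and share path are placeholders matching the settings.json values above:
# List torrents on the NAS via the RPC interface
transmission-remote YOUR.LACIE.NAS.IP:9091 -n your_name:your_password -l
# Add a torrent, downloading into a share (path is a placeholder)
transmission-remote YOUR.LACIE.NAS.IP:9091 -n your_name:your_password \
    -a /path/to/file.torrent -w /shares/SomeShare/downloads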
Two faulty disks on a 5-disk RAID5
That's pretty shitty.
One drive failed and the box sent me an email to tell me the array was degraded.
That's the second time this has happened, while the LaCie 5big is barely one year old.
So I wrote to support again and they sent me a new drive upfront, good.
But adding the new drive and rebuilding obviously implies a thorough read of all the other drives, and... yet another drive gave hardware read errors and the array collapsed completely.
I got a laconic email saying "array is inactive", and on the web interface all data seemed to have disappeared.
Fortunately I had rooted my box, so I could SSH in and look at the logs.
/var/log/messages looked like this (excerpts):
ata1.15: exception Emask 0x1 SAct 0x0 SErr 0x0 action 0x6
ata1.15: edma_err_cause=00000084 pp_flags=00000001, dev error, EDMA self-disable
ata1.01: status: { DRDY ERR }
ata1.01: error: { UNC }
end_request: I/O error, dev sdc, sector 2107042224
raid5:md4: read error not correctable (sector 2102993328 on sdc2).
raid5: Disk failure on sdc2, disabling device.
raid5: Operation continuing on 3 devices.
I/O error in filesystem ("md4") meta-data dev md4 block 0x0 ("xfs_unmountfs_writesb") error 5 buf count 4096
I/O error in filesystem ("md4") meta-data dev md4 block 0x1d171d2b8 ("xlog_iodone") error 5 buf count 4096
Filesystem "md4": Log I/O Error Detected. Shutting down filesystem: md4
LaCie-5big hald: unmounted /dev/md4 from '/media/internal_1' on behalf of uid 0
Structure of the array is the following:
[root@LaCie-5big /]# cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md3 : active raid1 sdd5[3] sda5[0] sde5[4] sdc5[2] sdb5[1]
      255936 blocks [5/5] [UUUUU]
md2 : active raid1 sdd9[3] sda9[0] sde9[4] sdc9[2] sdb9[1]
      875456 blocks [5/5] [UUUUU]
md1 : active raid1 sdd8[3] sda8[0] sde8[4] sdc8[2] sdb8[1]
      843328 blocks [5/5] [UUUUU]
md0 : active raid1 sdd7[4] sde7[3] sdc7[2] sdb7[1] sda7[0]
      16000 blocks [5/5] [UUUUU]
And /dev/md4 is missing, normally constructed from /dev/sd[abcde]2
A page I found with some useful tips: https://raid.wiki.kernel.org/index.php/RAID_Recovery
Getting some more info:
mdadm --examine /dev/sda2 >> raid_sdx2.status
mdadm --examine /dev/sdb2 >> raid_sdx2.status
mdadm --examine /dev/sdc2 >> raid_sdx2.status
mdadm --examine /dev/sdd2 >> raid_sdx2.status
mdadm --examine /dev/sde2 >> raid_sdx2.status

$ cat raid_sdx2.status | egrep 'Event|/dev/sd'
/dev/sda2:
         Events : 1306184
/dev/sdb2:
         Events : 1306184
/dev/sdc2:
         Events : 1306177
/dev/sdd2:
         Events : 1306184
/dev/sde2:
         Events : 1306184

[root@LaCie-5big ~]# cat raid_sdx2.status | grep Role
   Device Role : Active device 0
   Device Role : Active device 1
   Device Role : Active device 2
   Device Role : spare
   Device Role : Active device 4

[root@LaCie-5big ~]# cat raid_sdx2.status | grep State
          State : clean
    Array State : AA..A ('A' == active, '.' == missing)
          State : clean
    Array State : AA..A ('A' == active, '.' == missing)
          State : clean
    Array State : AAAAA ('A' == active, '.' == missing)
          State : clean
    Array State : AA..A ('A' == active, '.' == missing)
          State : clean
    Array State : AA..A ('A' == active, '.' == missing)
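The same information can be collected with a small loop (same commands as above, just scripted):
# Collect mdadm superblock info for the five data partitions in one go
for d in a b c d e; do
    mdadm --examine /dev/sd${d}2
done > raid_sdx2.status
egrep 'Event|/dev/sd|Role|State' raid_sdx2.status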
So /dev/sdc2 had dropped out and is out-of-sync.
Before messing with it, better to shut down the Transmission server:
ngc --stop transmission
As a first attempt I tried to force reassembling the array:
mdadm --assemble --force /dev/md4 /dev/sda2 /dev/sdb2 /dev/sdc2 /dev/sdd2 /dev/sde2
mdadm: forcing event count in /dev/sdc2(2) from 1306177 upto 1306184
mdadm: clearing FAULTY flag for device 2 in /dev/md4 for /dev/sdc2
mdadm: /dev/md4 has been started with 4 drives (out of 5) and 1 spare.
As soon as the array appears again, the box mounts the corresponding shares which become accessible again.
It also starts trying to resync the new drive (/dev/sdd2) and... it crashes again after a few hours when it hits the hardware errors on /dev/sdc2:
[root@LaCie-5big ~]# cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md4 : active raid5 sda2[0] sdd2[6] sde2[5] sdc2[2] sdb2[1]
      7805956096 blocks super 1.0 level 5, 512k chunk, algorithm 2 [5/4] [UUU_U]
      [>....................]  recovery =  0.0% (306560/1951489024) finish=30321.4min speed=1072K/sec
So it's better to start the array without the new drive, so that we at least have a chance to save as much data as possible:
mdadm --stop /dev/md4
mdadm --assemble --force /dev/md4 /dev/sda2 /dev/sdb2 /dev/sdc2 /dev/sde2
If the array refuses to be stopped, that's because it's still in use:
You need to unmount any share mounted over the network, including the one mounted by the media box:
tango3[~]# umount /tmp/mnt/smb/0

umount /dev/md4
umount /dev/md4
umount /dev/md4
umount /dev/md4
umount /dev/md4
umount /dev/md4
mdadm --stop /dev/md4
mdadm --assemble --force /dev/md4 /dev/sda2 /dev/sdb2 /dev/sdc2 /dev/sde2
At this point I could save a fair amount of data, but as soon as I tried to access files mapped onto the faulty area, the array collapsed again.
So I tried a different approach:
Stop the array.
tango3[~]# umount /tmp/mnt/smb/0

umount /dev/md4
umount /dev/md4
umount /dev/md4
umount /dev/md4
umount /dev/md4
umount /dev/md4
mdadm --stop /dev/md4
Run ddrescue to copy /dev/sdc2 (the faulty drive) to /dev/sdd2 (the new one).
But ddrescue is not available on the box, nor is screen, which would be useful...
So I took them from Debian squeeze (oldstable).
Newer versions require a newer libc & libstdc++6.
Extract the binaries and drop them on the box.
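A sketch of that extraction and transfer, done from a Linux client. The .deb file names are placeholders, the gddrescue/screen armel packages from squeeze are an assumption, the exact paths of the binaries inside the packages may differ, and scp on port 2222 assumes the rooted SSH access described above:
# Extract the binaries from the squeeze armel packages without installing them
dpkg-deb -x gddrescue_*_armel.deb /tmp/nasbins
dpkg-deb -x screen_*_armel.deb /tmp/nasbins
# Copy them onto the NAS over the rooted SSH (adjust the extracted paths if needed)
scp -P 2222 /tmp/nasbins/sbin/ddrescue /tmp/nasbins/usr/bin/screen root@YOUR.LACIE.NAS.IP:/root/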
If the needed library is left in the current directory, screen has to be called as:
LD_LIBRARY_PATH=. ./screen
And now we can call ddrescue:
./ddrescue -d /dev/sdc2 /dev/sdd2 /root/ddrescue.log
Thanks to the logfile, it can be interrupted and restarted from where it left off.
To be continued....