LaCie 5big Network 2


LaCie 5big Network 2, 10 TB

  • CPU armv5tel (Feroceon 88FR131 rev 1 (v5l))
  • RAM 512 MB
  • Flash ROM
  • NIC 2x Marvell Gigabit Ethernet 10/100/1000Base-TX
  • USB
  • internal HDD supports BASIC (1 drive), RAID 0 or 1 (2 drives), RAID 0 or 5 (3 drives), RAID 0, 5 or 6 (4 or 5 drives)
  • SATA Controller
  • Drive capacity 0, 5, 10 or 15 TB models available
  • Fan Ultra-quiet cooling system with self-stabilizing oil-pressure bearing technology
  • Initial firmware v2.0.5, upgraded to v2.2.8

cat /proc/mdstat

Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] 
md4 : active raid5 sda2[0] sde2[4] sdd2[3] sdc2[2] sdb2[1]
     7805956096 blocks super 1.0 level 5, 512k chunk, algorithm 2 [5/5] [UUUUU]
     
md3 : active raid1 sda5[0] sde5[4] sdd5[3] sdc5[2] sdb5[1]
     255936 blocks [5/5] [UUUUU]
     
md2 : active raid1 sda9[0] sde9[4] sdd9[3] sdc9[2] sdb9[1]
     875456 blocks [5/5] [UUUUU]
     
md1 : active raid1 sda8[0] sde8[4] sdd8[3] sdc8[2] sdb8[1]
     843328 blocks [5/5] [UUUUU]
     
md0 : active raid1 sde7[3] sdd7[4] sdc7[2] sdb7[1] sda7[0]
     16000 blocks [5/5] [UUUUU]

Links

Unofficial wiki

Most of the info here comes from that wiki

Root

Custom capsule

I created a custom capsule (a custom firmware), which is very easy to do thanks to the provided script.
Not much to say: just execute the script and answer a few questions.
Then, to flash it, you can use LaCieNetworkAssistant or the following method, which I prefer as it is less dependent on network operations (a command-line sketch follows the list):

  • In the dashboard, create a share named "Share"
  • Create a folder in that share named "Update"
  • Drop the capsule file into the share Share/Update
  • Reboot the NAS
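
If you prefer the command line to drag-and-drop for dropping the capsule, something like this should work from a Linux client (a sketch, assuming smbclient is installed, the "Share" share created above and the admin credentials):

smbclient //YOUR.LACIE.NAS.IP/Share -U admin -c 'mkdir Update; cd Update; put 5bignetwork2_2.2.8.1.capsule'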

Note that if you use LaCieNetworkAssistant and it fails to update the firmware, the advice is to disable IPConf support. To do that on Linux, once the assistant is launched, right-click on its icon in the task bar => Preferences.
Example:

New capsule built: '/home/phil/Downloads/lacie/capsule/lacie/5bignetwork2_2.2.8.1.capsule'
After upgrading you can:
- use SSH/SFTP with root privileges with your Lacie NAS with login 'root' (login: root | password: [same as admin password] | Port: 2222)
- use Transmission Web Interface (url: http://YOUR.LACIE.NAS.IP:9091/transmission/web/)
  Don't forget to change the transmission download path.
- access, after configuring port forwarding in your router, from a external network:
  - Lacie Dashboard: http://YOUR.NETWORK.EXTERNAL.IP/
  - Transmission Web Interface: http://YOUR.NETWORK.EXTERNAL.IP:9091/transmission/web/

The script has been reported to work with capsule 2.2.9.3, producing a rooted capsule 2.2.9.3.1.

Authentication

The root password can be permanently changed by editing /usr/lib/python2.6/site-packages/unicorn/authentication/local/user.py and looking for 'root:$1...'.
This step is automated when you create a custom capsule; see sshd.i.txt.
SSH runs on port 2222 and you can make use of /root/ssh/authorised_keys as usual.
Note that a direct edit of user.py may be reverted by a firmware update, unless you customize the new capsule directly.
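
For reference, a minimal sketch of doing it by hand (the capsule script automates all of this); openssl passwd -1 produces an md5crypt hash, and the paths and port are the ones mentioned on this page:

# On any Linux box: generate an md5crypt hash for the new root password
openssl passwd -1 'my-new-root-password'
# -> prints something like $1$AbCdEfGh$...; paste it over the old hash in the
#    'root:$1...' entry of user.py on the NAS (keep a backup first)
cp /usr/lib/python2.6/site-packages/unicorn/authentication/local/user.py /root/user.py.bak
vi /usr/lib/python2.6/site-packages/unicorn/authentication/local/user.py

# Then, from your PC, log in or install an SSH key:
ssh -p 2222 root@YOUR.LACIE.NAS.IP
cat ~/.ssh/id_rsa.pub | ssh -p 2222 root@YOUR.LACIE.NAS.IP 'cat >> /root/ssh/authorised_keys'
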
Note that some (all?) firmwares have an extra user called "partner" with the same rights as root.
Default passwords for root and partner are unknown AFAIK.
Their respective md5crypt hashes are:

$1$$1RDUuTsVHjre9juUvuICX.
$1$AhmQ/2rZ$1cYuUexBvzYmM.Zk4R/6y.

Misc

Serial port

There is probably a serial port on the board giving access to the U-Boot console.

New disks

In case we need to play with new disks, keep in mind that their UUID must be injected into the EEPROM, see here

LaCieNetworkAssistant

This is the tool provided on the CD-ROM. A Linux version is even included, in several packagings: self-extracting archive, tar.gz, rpm and deb.
But it is only built for i386 and won't work as such on an amd64 system.
Note that I think we can live without it; everything can be done via the web interface and the shared drive.

To try the deb anyway, we have to force things a bit (it would probably be better to repackage it):

sudo dpkg -i --force-architecture --force-depends /media/cdrom/Linux/LaCieNetworkAssistant-1.4.1-Linux.deb

It provides a few binaries:

/usr/bin/LaCieNetworkAssistant
/usr/bin/zsudo
/usr/bin/tarTine
/usr/bin/LCtftpd

Installing the existing ia32 libraries is not enough, some are missing:

$ ldd LaCieNetworkAssistant |grep "not found"
libsmbclient.so.0 => not found

To solve it, you can download the i386 package providing libsmbclient.so.0 and copy the library to /usr/lib32.
But this one has its own dependencies:

$ ldd libsmbclient.so.0 |grep "not found"
libtalloc.so.2 => not found
libwbclient.so.0 => not found

So, same thing: download and copy libsmbclient.so.0, libtalloc.so.2, libtalloc.so.2.0.7 and libwbclient.so.0 to /usr/lib32.
I also got an error related to libtdb1, which is in ia32-libs, so again, get it and copy libtdb.so.1 and libtdb.so.1.2.9 to /usr/lib32.
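
One way to get those files is to pull them out of the corresponding i386 .deb packages (a sketch; the package names are assumptions and the exact versions will differ between distributions):

# extract the i386 packages locally, then copy the needed libraries to /usr/lib32
for pkg in libsmbclient_*_i386.deb libtalloc2_*_i386.deb libwbclient0_*_i386.deb libtdb1_*_i386.deb; do
    dpkg -x "$pkg" extracted/
done
sudo cp extracted/usr/lib/libsmbclient.so.0* extracted/usr/lib/libtalloc.so.2* \
        extracted/usr/lib/libwbclient.so.0* extracted/usr/lib/libtdb.so.1* /usr/lib32/
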
And now:

export GTK_PATH='/usr/lib32/gtk-2.0'
LaCieNetworkAssistant

Reset

See the LaCie website: it is possible to reset to the default firmware, with or without data loss.
A reset without data loss will move all data into /Share and make it accessible only to the admin, fine.
But when I applied the procedure I didn't have to know the admin password; I was simply prompted to configure a new one!

Transmission

I restored a vanilla transmission via the custom capsule to get the web interface.
NEVER start or stop the Transmission daemon via the LaCie web interface: it would restore settings.json to its defaults. Note that this behaviour might be permanently changed by mangling /etc/initng/transmission.i and /usr/lib/python2.6/site-packages/unicorn/download/torrent.py.
There should be a way to disable the LaCie interface; see /usr/lib/unicorn/webapp2/controller/download.py, /usr/lib/unicorn/updaterapp/modules/download.py, /usr/lib/unicorn/webapp/modules/neko/download.py and /usr/lib/unicorn/unicorn.conf.
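
Before touching anything, it doesn't hurt to keep a backup of the current Transmission configuration (same settings.json path as used in the steps below):

cp /lacie/torrent_dir/transmission/settings.json /root/settings.json.bak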

Once the web interface is active, you can also activate the remote control interface:

  • Stop the daemon
ngc --stop transmission
  • Edit /lacie/torrent_dir/transmission/settings.json
"rpc-enabled": true,
"rpc-password": "your_password", # note that it will be encrypted next time automatically
"rpc-port": 9091,
"rpc-username": "your_name",
"rpc-whitelist-enabled": "false",
"rpc-authentication-required": "true",

Options are explained here

  • Restart the daemon
ngc --start transmission


Now you can use a remote client:

apt-get install transgui

And edit the other settings, among others the download-dir, pointing it to some /shares/... location.
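
Alternatively, the transmission-remote command-line client (Debian package transmission-cli) talks to the same RPC interface; a quick check, using the credentials configured above:

apt-get install transmission-cli
transmission-remote YOUR.LACIE.NAS.IP:9091 --auth your_name:your_password --list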

Two faulty disks on a 5-disk RAID5

Situation

That's pretty shitty.
One drive failed and the box sent me an email to tell the array was degraded.
That's the second time this has happened, while the LaCie 5big is barely one year old.
So I wrote to the support again and they sent me a new drive upfront, good.
But adding the new drive and rebuilding the array obviously implied a thorough read of all the other drives and... yet another drive returned hardware read errors and the array collapsed completely.
I got a laconic email saying "array is inactive", and on the web interface all the data seemed to have disappeared.

Fortunately I had rooted my box, so I could SSH in and look at the logs.
/var/log/messages looked like this (excerpts):

ata1.15: exception Emask 0x1 SAct 0x0 SErr 0x0 action 0x6
ata1.15: edma_err_cause=00000084 pp_flags=00000001, dev error, EDMA self-disable
ata1.01: status: { DRDY ERR }
ata1.01: error: { UNC }
end_request: I/O error, dev sdc, sector 2107042224
raid5:md4: read error not correctable (sector 2102993328 on sdc2).
raid5: Disk failure on sdc2, disabling device.
raid5: Operation continuing on 3 devices.
I/O error in filesystem ("md4") meta-data dev md4 block 0x0       ("xfs_unmountfs_writesb") error 5 buf count 4096
I/O error in filesystem ("md4") meta-data dev md4 block 0x1d171d2b8       ("xlog_iodone") error 5 buf count 4096
Filesystem "md4": Log I/O Error Detected.  Shutting down filesystem: md4
LaCie-5big hald: unmounted /dev/md4 from '/media/internal_1' on behalf of uid 0

The structure of the arrays is the following:

[root@LaCie-5big /]# cat /proc/mdstat 
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] 
md3 : active raid1 sdd5[3] sda5[0] sde5[4] sdc5[2] sdb5[1]
      255936 blocks [5/5] [UUUUU]
      
md2 : active raid1 sdd9[3] sda9[0] sde9[4] sdc9[2] sdb9[1]
      875456 blocks [5/5] [UUUUU]
      
md1 : active raid1 sdd8[3] sda8[0] sde8[4] sdc8[2] sdb8[1]
      843328 blocks [5/5] [UUUUU]
      
md0 : active raid1 sdd7[4] sde7[3] sdc7[2] sdb7[1] sda7[0]
      16000 blocks [5/5] [UUUUU]

And /dev/md4 is missing; it is normally built from /dev/sd[abcde]2.
A page I found with some useful tips: https://raid.wiki.kernel.org/index.php/RAID_Recovery
Getting some more info:

mdadm --examine /dev/sda2 >> raid_sdx2.status
mdadm --examine /dev/sdb2 >> raid_sdx2.status
mdadm --examine /dev/sdc2 >> raid_sdx2.status
mdadm --examine /dev/sdd2 >> raid_sdx2.status
mdadm --examine /dev/sde2 >> raid_sdx2.status

$ cat raid_sdx2.status |egrep 'Event|/dev/sd'
/dev/sda2:
         Events : 1306184
/dev/sdb2:
         Events : 1306184
/dev/sdc2:
         Events : 1306177
/dev/sdd2:
         Events : 1306184
/dev/sde2:
         Events : 1306184

[root@LaCie-5big ~]# cat raid_sdx2.status |grep Role
   Device Role : Active device 0
   Device Role : Active device 1
   Device Role : Active device 2
   Device Role : spare
   Device Role : Active device 4

[root@LaCie-5big ~]# cat raid_sdx2.status |grep State
          State : clean
   Array State : AA..A ('A' == active, '.' == missing)
          State : clean
   Array State : AA..A ('A' == active, '.' == missing)
          State : clean
   Array State : AAAAA ('A' == active, '.' == missing)
          State : clean
   Array State : AA..A ('A' == active, '.' == missing)
          State : clean
   Array State : AA..A ('A' == active, '.' == missing)

So /dev/sdc2 had dropped and is out-of-sync.

Getting data back

Before messing with it, it's better to shut down the Transmission daemon:

ngc --stop transmission

As a first attempt I tried to force reassembling the array:

mdadm --assemble --force /dev/md4 /dev/sda2 /dev/sdb2 /dev/sdc2 /dev/sdd2 /dev/sde2
mdadm: forcing event count in /dev/sdc2(2) from 1306177 upto 1306184
mdadm: clearing FAULTY flag for device 2 in /dev/md4 for /dev/sdc2
mdadm: /dev/md4 has been started with 4 drives (out of 5) and 1 spare.

As soon as the array appears again, the box mounts the corresponding shares, which become accessible again.
It also starts trying to resync the new drive (/dev/sdd2) and... it crashes again after a few hours, when it hits the hardware errors on /dev/sdc2.

[root@LaCie-5big ~]# cat /proc/mdstat 
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] 
md4 : active raid5 sda2[0] sdd2[6] sde2[5] sdc2[2] sdb2[1]
      7805956096 blocks super 1.0 level 5, 512k chunk, algorithm 2 [5/4] [UUU_U]
      [>....................]  recovery =  0.0% (306560/1951489024) finish=30321.4min speed=1072K/sec

So it's better to start the array without the new drive, so that at least we have a chance to save as much data as possible:

mdadm --stop /dev/md4
mdadm --assemble --force /dev/md4 /dev/sda2 /dev/sdb2 /dev/sdc2 /dev/sde2

If the array refuses to be stopped, that's because it's in use:
you need to unmount any share mounted over the network, including the one mounted by the media box.

tango3[~]# umount /tmp/mnt/smb/0
umount /dev/md4
umount /dev/md4
umount /dev/md4
umount /dev/md4
umount /dev/md4
umount /dev/md4
mdadm --stop /dev/md4
mdadm --assemble --force /dev/md4 /dev/sda2 /dev/sdb2 /dev/sdc2 /dev/sde2

Fixing with ddrescue

At this point I could save quite some data, but as soon as I tried to access files mapped onto the faulty area, the array collapsed again.
So I tried a different approach.
First, stop the array:

tango3[~]# umount /tmp/mnt/smb/0
umount /dev/md4
umount /dev/md4
umount /dev/md4
umount /dev/md4
umount /dev/md4
umount /dev/md4
mdadm --stop /dev/md4

Then run ddrescue to copy /dev/sdc2 (the faulty drive) to /dev/sdd2 (the new one).
But ddrescue is not available on the box, nor is screen, which would be useful...
So I took them from Debian squeeze (oldstable).
Newer versions require a newer libc and libstdc++6.

Extract the binaries and drop them on the box. I also copied screenrc to /etc/; not sure whether it's needed.
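
For reference, roughly how to fetch and push them (a sketch; the exact package file names are assumptions, and the squeeze armel .debs can be found in the Debian archive or on snapshot.debian.org):

# on a PC: extract the armel binaries from the squeeze packages
dpkg -x gddrescue_*_armel.deb tmp/     # ddrescue ends up under tmp/sbin/ or tmp/usr/bin/
dpkg -x screen_*_armel.deb tmp/        # screen ends up under tmp/usr/bin/
# copy them to the NAS over the rooted SSH (port 2222); adjust the paths to where the binaries landed
scp -P 2222 tmp/sbin/ddrescue tmp/usr/bin/screen root@YOUR.LACIE.NAS.IP:/root/
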
If the required library is left in the current directory, screen has to be called as:

LD_LIBRARY_PATH=. ./screen

And now we can call ddrescue:

./ddrescue -d /dev/sdc2 /dev/sdd2 /root/ddrescue.log

Thanks to the logfile, it can be interrupted and restarted from where it left off.

Current status
rescued:     1998 GB,  errsize:   13824 B,  current rate:        0 B/s
   ipos:     1082 GB,   errors:       8,    average rate:   39354 kB/s
   opos:     1082 GB,     time from last successful read:     3.5 m
Finished

Not that bad.
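
If some areas remained unreadable, ddrescue could be re-run on the error areas only with a few extra retries (a sketch; -r sets the number of retry passes):

./ddrescue -d -r3 /dev/sdc2 /dev/sdd2 /root/ddrescue.log

Then the array can be assembled with the cloned drive in place of the faulty one: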

# mdadm --assemble /dev/md4 /dev/sda2 /dev/sdb2 /dev/sdd2 /dev/sde2                                                                                 
mdadm: /dev/md4 has been started with 4 drives (out of 5).

And now disk sdc can be removed and replaced.
Hot removal should be OK, but let's do it cleanly:

# cat /proc/mdstat 
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] 
md4 : active raid5 sda2[0] sde2[5] sdd2[2] sdb2[1]
      7805956096 blocks super 1.0 level 5, 512k chunk, algorithm 2 [5/4] [UUU_U]
md3 : active raid1 sda5[0] sde5[4] sdd5[3] sdc5[2] sdb5[1]
      255936 blocks [5/5] [UUUUU]
md2 : active raid1 sda9[0] sde9[4] sdd9[3] sdc9[2] sdb9[1]
      875456 blocks [5/5] [UUUUU]
md1 : active raid1 sda8[0] sde8[4] sdd8[3] sdc8[2] sdb8[1]
      843328 blocks [5/5] [UUUUU]
md0 : active raid1 sde7[3] sdd7[4] sdc7[2] sdb7[1] sda7[0]
      16000 blocks [5/5] [UUUUU]
# mdadm /dev/md0 --fail /dev/sdc7
mdadm: set /dev/sdc7 faulty in /dev/md0
# mdadm /dev/md0 --remove /dev/sdc7
mdadm: hot removed /dev/sdc7 from /dev/md0
# mdadm /dev/md1 --fail /dev/sdc8  
mdadm: set /dev/sdc8 faulty in /dev/md1
# mdadm /dev/md1 --remove /dev/sdc8
mdadm: hot removed /dev/sdc8 from /dev/md1
# mdadm /dev/md2 --fail /dev/sdc9  
mdadm: set /dev/sdc9 faulty in /dev/md2
# mdadm /dev/md2 --remove /dev/sdc9
mdadm: hot removed /dev/sdc9 from /dev/md2
# mdadm /dev/md3 --fail /dev/sdc5  
mdadm: set /dev/sdc5 faulty in /dev/md3
# mdadm /dev/md3 --remove /dev/sdc5
mdadm: hot removed /dev/sdc5 from /dev/md3
# cat /proc/mdstat                 
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] 
md4 : active raid5 sda2[0] sde2[5] sdd2[2] sdb2[1]
      7805956096 blocks super 1.0 level 5, 512k chunk, algorithm 2 [5/4] [UUU_U]
md3 : active raid1 sda5[0] sde5[4] sdd5[3] sdb5[1]
      255936 blocks [5/4] [UU_UU]
md2 : active raid1 sda9[0] sde9[4] sdd9[3] sdb9[1]
      875456 blocks [5/4] [UU_UU]
md1 : active raid1 sda8[0] sde8[4] sdd8[3] sdb8[1]
      843328 blocks [5/4] [UU_UU]
md0 : active raid1 sde7[3] sdd7[4] sdb7[1] sda7[0]
      16000 blocks [5/4] [UU_UU]

Then physically remove sdc.
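
For completeness: once a replacement disk is in place (and assuming it shows up as /dev/sdc again and the box does not handle it automatically, e.g. via the EEPROM/UUID mechanism mentioned above), re-adding it by hand would look roughly like this:

# copy the partition layout from a healthy member, then re-add each partition to its array
sfdisk -d /dev/sda | sfdisk /dev/sdc    # double-check the device names first!
mdadm /dev/md0 --add /dev/sdc7
mdadm /dev/md1 --add /dev/sdc8
mdadm /dev/md2 --add /dev/sdc9
mdadm /dev/md3 --add /dev/sdc5
mdadm /dev/md4 --add /dev/sdc2
cat /proc/mdstat                        # watch the rebuild progress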

Checking the remaining drives thoroughly:

# smartctl -t long /dev/sda
# smartctl -t long /dev/sdb
# smartctl -t long /dev/sdd
# smartctl -t long /dev/sde

Five hours later...

# smartctl -l xselftest /dev/sda
# smartctl -l xselftest /dev/sdb
# smartctl -l xselftest /dev/sdd
# smartctl -l xselftest /dev/sde

They all say something like:

smartctl 5.40 2011-04-07 r5807 [arm-unknown-linux-gnueabi] (local build)
Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net
=== START OF READ SMART DATA SECTION ===
General Purpose Logging (GPL) feature set supported
SMART Extended Self-test Log Version: 1 (1 sectors)
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%     20001         -