LaCie 5big Network 2
Latest revision as of 15:27, 13 October 2019
LaCie 5big Network 2, 10 TB
- CPU armv5tel (Feroceon 88FR131 rev 1 (v5l))
- RAM 512 MB
- Flash ROM
- NIC 2x Marvell Gigabit Ethernet 10/100/1000 Base-TX
- USB
- internal HDD supports BASIC (1 drive), RAID 0 or 1 (2 drives), RAID 0 or 5 (3 drives), RAID 0, 5 or 6 (4 or 5 drives)
- SATA Controller
- Drive Capacity 0, 5, 10, and 15 TB capacities available
- Fan Ultra-quiet cooling system with self-stabilizing oil-pressure bearing technology
- Initial firmware v2.0.5, upgraded to v2.2.8
cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md4 : active raid5 sda2[0] sde2[4] sdd2[3] sdc2[2] sdb2[1]
      7805956096 blocks super 1.0 level 5, 512k chunk, algorithm 2 [5/5] [UUUUU]
md3 : active raid1 sda5[0] sde5[4] sdd5[3] sdc5[2] sdb5[1]
      255936 blocks [5/5] [UUUUU]
md2 : active raid1 sda9[0] sde9[4] sdd9[3] sdc9[2] sdb9[1]
      875456 blocks [5/5] [UUUUU]
md1 : active raid1 sda8[0] sde8[4] sdd8[3] sdc8[2] sdb8[1]
      843328 blocks [5/5] [UUUUU]
md0 : active raid1 sde7[3] sdd7[4] sdc7[2] sdb7[1] sda7[0]
      16000 blocks [5/5] [UUUUU]
Links
Unofficial wiki
Most of the info here comes from that wiki
- Section for Lacie 2big network 2
- Section for Lacie 5big network 2 contains much less info, but most of the 2big information also applies to the 5big
Root
Custom capsule
I created a custom capsule, a custom firmware which is very easy to do thanks to the provided script.
update: the website seems to have disappeared, see the original page and script on Archive.org.
Not much to say, just execute the script and answer to a few questions.
Then to flash it, you can use the LaCieNetworkAssistant, or the following method which I prefer, as it depends less on network operations:
- In the dashboard, create a share named "Share"
- Create a folder in that share named "Update"
- Drop the capsule file into the share Share/Update
- Reboot the NAS
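The four steps above can be sketched as a small script. This is only an illustration: `$SHARE` is a hypothetical stand-in for wherever the NAS share "Share" is mounted on your client (e.g. an SMB mount point), and `$CAPSULE` for your capsule file.

```shell
# Sketch of the flash-via-share method above; adjust SHARE to your real mount point.
SHARE="${SHARE:-./Share}"                            # e.g. /media/lacie/Share on a real client
CAPSULE="${CAPSULE:-5bignetwork2_2.2.8.1.capsule}"   # the capsule produced by the script

mkdir -p "$SHARE/Update"            # the NAS looks for capsules in Share/Update
if [ -f "$CAPSULE" ]; then
    cp "$CAPSULE" "$SHARE/Update/"  # drop the capsule into the share
fi
# Finally, reboot the NAS (e.g. from the dashboard) so it picks up the capsule.
```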
Note that if you use LaCieNetworkAssistant and it fails to update the firmware, the advice is to disable IPConf support. To do that on Linux, once the assistant is launched, right-click its icon in the task bar => Preferences.
Example:
New capsule built: '/home/phil/Downloads/lacie/capsule/lacie/5bignetwork2_2.2.8.1.capsule'

After upgrading you can:
- use SSH/SFTP with root privileges on your LaCie NAS
  (login: root | password: [same as admin password] | port: 2222)
- use the Transmission Web Interface (url: http://YOUR.LACIE.NAS.IP:9091/transmission/web/);
  don't forget to change the transmission download path
- access, after configuring port forwarding in your router, from an external network:
  - LaCie Dashboard: http://YOUR.NETWORK.EXTERNAL.IP/
  - Transmission Web Interface: http://YOUR.NETWORK.EXTERNAL.IP:9091/transmission/web/
The script has been reported to work with capsule 2.2.10.1 producing a rooted capsule 2.2.10.1.1
Authentication
root password can be permanently changed by editing /usr/lib/python2.6/site-packages/unicorn/authentication/local/user.py & looking for 'root:$1...'
This step is automated when you create a custom capsule, see sshd.i.txt
SSH runs on port 2222 and you can make use of /root/ssh/authorised_keys as usual
Note that direct edits to user.py may be reverted by a firmware update, unless you customize the new capsule directly.
Note that some (all?) firmwares have an extra user called "partner" with the same rights as root.
Default passwords for root and partner are unknown AFAIK.
Their respective md5crypt hashes are:
$1$$1RDUuTsVHjre9juUvuICX.
$1$AhmQ/2rZ$1cYuUexBvzYmM.Zk4R/6y.
The partner account can be removed by editing /usr/lib/python2.6/site-packages/unicorn/authentication/local/user.py (search for "partner") and running deluser partner, or possibly by changing ExecMode in /usr/lib/python2.6/site-packages/exec_mode/exec_mode.py.
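For reference, an md5crypt hash in the same $1$... format as the ones above can be generated with `openssl passwd -1` and swapped into user.py with sed. A sketch, run here against a stand-in copy of the file (the salt, password, and file name are made up for the demo; the real file is the user.py path mentioned above):

```shell
# Generate a new md5crypt hash; salt and password here are demo values
NEWHASH=$(openssl passwd -1 -salt AhmQ2rZx 'mysecret')

# Stand-in copy of the relevant user.py line, using the known root hash
printf "password = 'root:%s'\n" '$1$$1RDUuTsVHjre9juUvuICX.' > user.py.demo

# Swap the old hash for the new one ('|' delimiter avoids clashing with '/' in the hash)
sed -i "s|root:[^']*|root:${NEWHASH}|" user.py.demo

grep -qF 'root:$1$' user.py.demo && echo replaced
```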
Misc
Serial port
There is probably a serial port on board giving access to the U-Boot console
New disks
In case we need to play with new disks, keep in mind their UUID must be injected into the EEPROM, see here
LaCieNetworkAssistant
These are the tools provided on the CD-ROM. A Linux version is included, in several packagings: self-extracting, tar.gz, rpm and deb
But they are only for i386 and won't work as such on an amd64 system.
Note that I think we can live without them, everything can be done via the web interface and the shared drive.
To try the deb, we have to force things a bit (it would probably be better to repackage it):
sudo dpkg -i --force-architecture --force-depends /media/cdrom/Linux/LaCieNetworkAssistant-1.4.1-Linux.deb
It provides a few binaries:
/usr/bin/LaCieNetworkAssistant
/usr/bin/zsudo
/usr/bin/tarTine
/usr/bin/LCtftpd
Installing the existing ia32 libraries is not enough, some are missing:
$ ldd LaCieNetworkAssistant | grep "not found"
        libsmbclient.so.0 => not found
To solve it, you can download the i386 version and copy libsmbclient.so.0 to /usr/lib32
But this one has its own dependencies:
$ ldd libsmbclient.so.0 | grep "not found"
        libtalloc.so.2 => not found
        libwbclient.so.0 => not found
So, same thing, download & copy libsmbclient.so.0 libtalloc.so.2 libtalloc.so.2.0.7 libwbclient.so.0 to /usr/lib32
I also got an error linked to libtdb1, which is in ia32-libs, so again, get it and copy libtdb.so.1 and libtdb.so.1.2.9 to /usr/lib32
And now:
export GTK_PATH='/usr/lib32/gtk-2.0'
LaCieNetworkAssistant
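This kind of dependency chasing can be semi-automated. A small helper (assumes glibc's `ldd` is available; `/bin/ls` is used only as an example of a healthy binary):

```shell
# Print the shared libraries a binary needs but the loader cannot find
missing_libs() {
    ldd "$1" 2>/dev/null | awk '/not found/ {print $1}'
}

# A healthy system binary should report nothing missing;
# run it against LaCieNetworkAssistant instead to see what to copy next.
missing_libs /bin/ls
```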
Reset
See the LaCie website; it's possible to reset to the default firmware, with or without data loss
Reset without data loss will move all data into /Share and will make it accessible only to the admin.
Admin password gets reset in the process.
Reset Without Data Loss
Caution: Following these steps will erase all Dashboard data such as users, groups, shares, and settings. It will also reset the machine name to the default and reset the network settings to DHCP.
Caution: After following these steps, all files will be moved to a folder called Recovery in Share and so by default will be available only to the administrator.
- Make sure the product is turned OFF.
- Press and hold down the front button. Without releasing the front button, turn the product on by pressing the power switch on the rear of the product.
- Keep the front button pressed until the front LED becomes solid red, then release it. (The LED should become solid red about 10 seconds after you turn on the product.)
- The front LED will blink blue. When it becomes static blue, press the front button once within 5 seconds to confirm reset.
If any of the steps are omitted, the product will boot normally without resetting.
Transmission
I restored a vanilla transmission via the custom capsule to get the web interface.
NEVER launch or stop the transmission daemon via the LaCie web interface: it would restore settings.json to its defaults. Note that this might be permanently changed by mangling /etc/initng/transmission.i and /usr/lib/python2.6/site-packages/unicorn/download/torrent.py
There should be a way to disable the LaCie interface; see /usr/lib/unicorn/webapp2/controller/download.py, /usr/lib/unicorn/updaterapp/modules/download.py, /usr/lib/unicorn/webapp/modules/neko/download.py and /usr/lib/unicorn/unicorn.conf
Once the web interface is active, you can also activate the remote control interface:
- Stop the daemon
ngc --stop transmission
- Edit /lacie/torrent_dir/transmission/settings.json
"rpc-enabled": true, "rpc-password": "your_password", # note that it will be encrypted next time automatically "rpc-port": 9091, "rpc-username": "your_name", "rpc-whitelist-enabled": "false", "rpc-authentication-required": "true",
Options are explained here
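The key change can also be applied non-interactively with sed. A sketch, run here against a stand-in file (on the NAS the real file is /lacie/torrent_dir/transmission/settings.json, and the daemon must be stopped first):

```shell
SETTINGS="${SETTINGS:-./settings.json.demo}"   # stand-in; real path: /lacie/torrent_dir/transmission/settings.json

# Create a minimal stand-in file for the demo if none exists
[ -f "$SETTINGS" ] || printf '{\n  "rpc-enabled": false,\n  "rpc-port": 9091\n}\n' > "$SETTINGS"

# Enable the RPC interface in place (stop transmission before editing the real file)
sed -i 's/"rpc-enabled": *false/"rpc-enabled": true/' "$SETTINGS"
```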
ngc --start transmission
Now you can use a remote client:
apt-get install transgui
And edit the other settings, amongst others the download-dir to some /shares/...
HTTP server
There is already an HTTP server running on ports 80 and 443.
To add one:
/etc/lighttpd/lighttpd-dune.conf
server.modules = (
  "mod_expire",
  "mod_compress",
  "mod_rewrite",
  "mod_setenv",
)

server.document-root = "/shares/Share/@dune/@yamj/Jukebox"
server.port = 8000
server.errorlog = "/var/log/lighttpd-dune-error.log"
server.pid-file = "/var/run/lighttpd-dune.pid"
server.upload-dirs = ( "/lacie/tmp" )

compress.allowed-encodings = ("gzip", "deflate")
compress.cache-dir = "/var/cache/lighttpd-dune"
compress.filetype = ("text/plain", "text/html", "text/javascript", "text/css", "text/xml")

index-file.names = (
  "index.html", "index.php"
)

$HTTP["url"] =~ "index\.html" {
  setenv.add-response-header = ( "Cache-Control" => "no-cache, no-store" )
}

$HTTP["url"] =~ "(gif|png|jpg|css)$" {
  expire.url = ( "" => "access 1 months" )
  setenv.add-response-header = ( "Cache-Control" => "public" )
}

mimetype.assign = (
  ".gif"  => "image/gif",
  ".jpg"  => "image/jpeg",
  ".jpeg" => "image/jpeg",
  ".png"  => "image/png",
  ".css"  => "text/css",
  ".html" => "text/html",
  ".htm"  => "text/html",
  ".js"   => "text/javascript",
  # default mime type
  ""      => "application/octet-stream",
)
/etc/initng/httpdune.i
#!/sbin/itype
# This is an .i file, used by initng, parsed by install_service

daemon httpdune {
  need = virtual/net;
  exec daemon = /usr/sbin/lighttpd -D -f /etc/lighttpd/lighttpd-dune.conf;
  respawn;
}
ngc --start httpdune
Install service: edit /etc/initng/runlevel/default.runlevel and add httpdune
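The runlevel edit can be made idempotent so repeated runs don't duplicate the entry. A sketch, run here against a stand-in file (the real path is the /etc/initng/runlevel/default.runlevel mentioned above):

```shell
RUNLEVEL="${RUNLEVEL:-./default.runlevel.demo}"  # real path: /etc/initng/runlevel/default.runlevel
touch "$RUNLEVEL"

# Append httpdune only if it is not already listed
grep -qx httpdune "$RUNLEVEL" || echo httpdune >> "$RUNLEVEL"
```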
Two faulty disks on a 5-disk RAID5
Situation
That's pretty shitty.
One drive failed and the box sent me an email to tell the array was degraded.
That's the second time this has happened, while the LaCie 5big is barely one year old.
So I wrote to support again and they sent me a new drive upfront, good.
But adding the new drive and rebuilding obviously implied a thorough read of all the other drives, and... yet another drive returned hardware read errors and the array collapsed completely.
I got a laconic email saying "array is inactive", and on the web interface all data seemed to have disappeared.
Fortunately I had rooted my box, so I could SSH in and look at the logs.
/var/log/messages looked like this (excerpts):
ata1.15: exception Emask 0x1 SAct 0x0 SErr 0x0 action 0x6
ata1.15: edma_err_cause=00000084 pp_flags=00000001, dev error, EDMA self-disable
ata1.01: status: { DRDY ERR }
ata1.01: error: { UNC }
end_request: I/O error, dev sdc, sector 2107042224
raid5:md4: read error not correctable (sector 2102993328 on sdc2).
raid5: Disk failure on sdc2, disabling device.
raid5: Operation continuing on 3 devices.
I/O error in filesystem ("md4") meta-data dev md4 block 0x0 ("xfs_unmountfs_writesb") error 5 buf count 4096
I/O error in filesystem ("md4") meta-data dev md4 block 0x1d171d2b8 ("xlog_iodone") error 5 buf count 4096
Filesystem "md4": Log I/O Error Detected.  Shutting down filesystem: md4
LaCie-5big hald: unmounted /dev/md4 from '/media/internal_1' on behalf of uid 0
Structure of the array is the following:
[root@LaCie-5big /]# cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md3 : active raid1 sdd5[3] sda5[0] sde5[4] sdc5[2] sdb5[1]
      255936 blocks [5/5] [UUUUU]
md2 : active raid1 sdd9[3] sda9[0] sde9[4] sdc9[2] sdb9[1]
      875456 blocks [5/5] [UUUUU]
md1 : active raid1 sdd8[3] sda8[0] sde8[4] sdc8[2] sdb8[1]
      843328 blocks [5/5] [UUUUU]
md0 : active raid1 sdd7[4] sde7[3] sdc7[2] sdb7[1] sda7[0]
      16000 blocks [5/5] [UUUUU]
And /dev/md4 is missing, normally constructed from /dev/sd[abcde]2
A page I found with some useful tips: https://raid.wiki.kernel.org/index.php/RAID_Recovery
Getting some more info:
mdadm --examine /dev/sda2 >> raid_sdx2.status
mdadm --examine /dev/sdb2 >> raid_sdx2.status
mdadm --examine /dev/sdc2 >> raid_sdx2.status
mdadm --examine /dev/sdd2 >> raid_sdx2.status
mdadm --examine /dev/sde2 >> raid_sdx2.status

$ cat raid_sdx2.status | egrep 'Event|/dev/sd'
/dev/sda2:
         Events : 1306184
/dev/sdb2:
         Events : 1306184
/dev/sdc2:
         Events : 1306177
/dev/sdd2:
         Events : 1306184
/dev/sde2:
         Events : 1306184

[root@LaCie-5big ~]# cat raid_sdx2.status | grep Role
   Device Role : Active device 0
   Device Role : Active device 1
   Device Role : Active device 2
   Device Role : spare
   Device Role : Active device 4

[root@LaCie-5big ~]# cat raid_sdx2.status | grep State
          State : clean
    Array State : AA..A ('A' == active, '.' == missing)
          State : clean
    Array State : AA..A ('A' == active, '.' == missing)
          State : clean
    Array State : AAAAA ('A' == active, '.' == missing)
          State : clean
    Array State : AA..A ('A' == active, '.' == missing)
          State : clean
    Array State : AA..A ('A' == active, '.' == missing)
So /dev/sdc2 had dropped and is out-of-sync.
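The stale member is the one whose Events counter lags behind the others. A small pipeline to extract device/counter pairs; sample `mdadm --examine` output is inlined here so it runs anywhere, but on the box you would pipe the real command instead:

```shell
# Inlined sample of `mdadm --examine /dev/sdX2` output, for demonstration only
examine_sample() {
cat <<'EOF'
/dev/sda2:
         Events : 1306184
/dev/sdc2:
         Events : 1306177
EOF
}

# Print "device: event-count" pairs; the lagging count marks the stale member
examine_sample | awk '/^\/dev\// {dev=$1} /Events/ {print dev, $NF}'
```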
Getting data back
Before messing with it, better to shut down the Transmission server:
ngc --stop transmission
As a first attempt I tried to force reassembling the array:
mdadm --assemble --force /dev/md4 /dev/sda2 /dev/sdb2 /dev/sdc2 /dev/sdd2 /dev/sde2
mdadm: forcing event count in /dev/sdc2(2) from 1306177 upto 1306184
mdadm: clearing FAULTY flag for device 2 in /dev/md4 for /dev/sdc2
mdadm: /dev/md4 has been started with 4 drives (out of 5) and 1 spare.
As soon as the array appears again, the box mounts the corresponding shares which become accessible again.
It also starts trying to resync the new drive (/dev/sdd2) and... it crashes again after a few hours, when it hits the hardware errors on /dev/sdc2:
[root@LaCie-5big ~]# cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md4 : active raid5 sda2[0] sdd2[6] sde2[5] sdc2[2] sdb2[1]
      7805956096 blocks super 1.0 level 5, 512k chunk, algorithm 2 [5/4] [UUU_U]
      [>....................]  recovery =  0.0% (306560/1951489024) finish=30321.4min speed=1072K/sec
So it's better to start the array without the new drive, so that at least we have a chance to save as much data as possible:
mdadm --stop /dev/md4
mdadm --assemble --force /dev/md4 /dev/sda2 /dev/sdb2 /dev/sdc2 /dev/sde2
If the array refuses to be stopped, that's because it's in use:
you need to unmount any share mounted over the network, including the one mounted by the media box
tango3[~]# umount /tmp/mnt/smb/0

umount /dev/md4
umount /dev/md4
umount /dev/md4
umount /dev/md4
umount /dev/md4
umount /dev/md4
mdadm --stop /dev/md4
mdadm --assemble --force /dev/md4 /dev/sda2 /dev/sdb2 /dev/sdc2 /dev/sde2
Fixing with ddrescue
At this point I could save some of the data, but as soon as I tried to access files mapped onto the faulty area, the array collapsed again.
So I tried a different approach:
Stop the array.
tango3[~]# umount /tmp/mnt/smb/0

umount /dev/md4
umount /dev/md4
umount /dev/md4
umount /dev/md4
umount /dev/md4
umount /dev/md4
mdadm --stop /dev/md4
Run ddrescue to copy /dev/sdc2 (the faulty drive) to /dev/sdd2 (the new one).
But ddrescue is not available on the box, nor is screen, which would be useful...
So I took them from Debian squeeze (oldstable).
Newer versions require a newer libc and libstdc++6.
Extract the binaries and drop them onto the box. I also copied screenrc to /etc/, not sure whether it's needed.
If the library is left in the current directory, screen has to be invoked as
LD_LIBRARY_PATH=. ./screen
And now we can call ddrescue:
./ddrescue -d /dev/sdc2 /dev/sdd2 /root/ddrescue.log
With the logfile, ddrescue can be interrupted and restarted from where it left off.
Current status
rescued:  1998 GB,  errsize: 13824 B,  current rate:      0 B/s
   ipos:  1082 GB,  errors:        8,  average rate: 39354 kB/s
   opos:  1082 GB,  time from last successful read:     3.5 m
Finished
Not that bad.
# mdadm --assemble /dev/md4 /dev/sda2 /dev/sdb2 /dev/sdd2 /dev/sde2
mdadm: /dev/md4 has been started with 4 drives (out of 5).
And now disk sdc can be removed and replaced.
Hot remove should be ok but let's do it cleanly:
# cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md4 : active raid5 sda2[0] sde2[5] sdd2[2] sdb2[1]
      7805956096 blocks super 1.0 level 5, 512k chunk, algorithm 2 [5/4] [UUU_U]
md3 : active raid1 sda5[0] sde5[4] sdd5[3] sdc5[2] sdb5[1]
      255936 blocks [5/5] [UUUUU]
md2 : active raid1 sda9[0] sde9[4] sdd9[3] sdc9[2] sdb9[1]
      875456 blocks [5/5] [UUUUU]
md1 : active raid1 sda8[0] sde8[4] sdd8[3] sdc8[2] sdb8[1]
      843328 blocks [5/5] [UUUUU]
md0 : active raid1 sde7[3] sdd7[4] sdc7[2] sdb7[1] sda7[0]
      16000 blocks [5/5] [UUUUU]
# mdadm /dev/md0 --fail /dev/sdc7
mdadm: set /dev/sdc7 faulty in /dev/md0
# mdadm /dev/md0 --remove /dev/sdc7
mdadm: hot removed /dev/sdc7 from /dev/md0
# mdadm /dev/md1 --fail /dev/sdc8
mdadm: set /dev/sdc8 faulty in /dev/md1
# mdadm /dev/md1 --remove /dev/sdc8
mdadm: hot removed /dev/sdc8 from /dev/md1
# mdadm /dev/md2 --fail /dev/sdc9
mdadm: set /dev/sdc9 faulty in /dev/md2
# mdadm /dev/md2 --remove /dev/sdc9
mdadm: hot removed /dev/sdc9 from /dev/md2
# mdadm /dev/md3 --fail /dev/sdc5
mdadm: set /dev/sdc5 faulty in /dev/md3
# mdadm /dev/md3 --remove /dev/sdc5
mdadm: hot removed /dev/sdc5 from /dev/md3
# cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md4 : active raid5 sda2[0] sde2[5] sdd2[2] sdb2[1]
      7805956096 blocks super 1.0 level 5, 512k chunk, algorithm 2 [5/4] [UUU_U]
md3 : active raid1 sda5[0] sde5[4] sdd5[3] sdb5[1]
      255936 blocks [5/4] [UU_UU]
md2 : active raid1 sda9[0] sde9[4] sdd9[3] sdb9[1]
      875456 blocks [5/4] [UU_UU]
md1 : active raid1 sda8[0] sde8[4] sdd8[3] sdb8[1]
      843328 blocks [5/4] [UU_UU]
md0 : active raid1 sde7[3] sdd7[4] sdb7[1] sda7[0]
      16000 blocks [5/4] [UU_UU]
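The same fail-then-remove sequence can be scripted as a loop over sdc's members. In this sketch `mdadm` is stubbed with a shell function so the loop logic runs anywhere; drop the stub when running it for real on the NAS:

```shell
# Stub so the sketch runs without real arrays; REMOVE this function on the NAS
mdadm() { echo "mdadm $*"; }

# Fail, then remove, each of sdc's members from its mirror
calls=$(
    for pair in md0:sdc7 md1:sdc8 md2:sdc9 md3:sdc5; do
        md="/dev/${pair%%:*}"
        part="/dev/${pair##*:}"
        mdadm "$md" --fail "$part"
        mdadm "$md" --remove "$part"
    done
)
echo "$calls"
```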
Then physically remove sdc.
Checking the remaining drives thoroughly:
# smartctl -t long /dev/sda
# smartctl -t long /dev/sdb
# smartctl -t long /dev/sdd
# smartctl -t long /dev/sde
Five hours later...
# smartctl -l xselftest /dev/sda
# smartctl -l xselftest /dev/sdb
# smartctl -l xselftest /dev/sdd
# smartctl -l xselftest /dev/sde
They all report something like:
smartctl 5.40 2011-04-07 r5807 [arm-unknown-linux-gnueabi] (local build)
Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF READ SMART DATA SECTION ===
General Purpose Logging (GPL) feature set supported
SMART Extended Self-test Log Version: 1 (1 sectors)
Num  Test_Description    Status                        Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%        20001           -
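Checking the self-test verdict across several drives can be scripted with a simple grep. Sample `smartctl -l xselftest` output is inlined here so the sketch runs anywhere; on the box, pipe the real command for each drive instead:

```shell
# Inlined sample of a smartctl self-test log line, for demonstration only
selftest_sample() {
cat <<'EOF'
Num  Test_Description    Status                        Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%        20001           -
EOF
}

# Report whether the latest extended self-test completed cleanly
if selftest_sample | grep -q 'Completed without error'; then
    echo 'OK'
else
    echo 'CHECK DRIVE'
fi
```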