Two years and one month after I set up the vinum RAID5 array, one disk failed:
Apr 5 15:51:23 heart /kernel: ad4s1h: hard error reading fsbn 483750673 of 241875305-241875336 (ad4s1 bn 483750673; cn 479911 tn 6 sn 7)
Apr 5 15:51:23 heart /kernel: ad4: timeout waiting for cmd=ef s=e0 e=04
{...repeating many times
Apr 5 15:51:23 heart /kernel: ad4: timeout sending command=c4 s=e0 e=04
Apr 5 15:51:23 heart /kernel: ad4: error executing command - resetting
Apr 5 15:51:23 heart /kernel: ad4: timeout sending command=ec s=80 e=04
Apr 5 15:51:24 heart /kernel: ad4: ATA identify failed
Apr 5 15:51:24 heart /kernel: ad4: timeout sending command=c6 s=80 e=04
}
Apr 5 15:51:24 heart /kernel: vinum: Can't write config to /dev/ad4s1h, error 5
Apr 5 15:51:24 heart /kernel: ad4: timeout sending command=e7 s=80 e=04
Apr 5 15:51:24 heart /kernel: ad4: flushing cache on close failed
A new 200GB WD Caviar IDE disk now costs $68 (was $107).
dmesg after replacing the disk:
ad2: 190782MB [387621/16/63] at ata1-master UDMA33
ad4: 190782MB [387621/16/63] at ata2-master UDMA100
ad6: 190782MB [387621/16/63] at ata3-master UDMA100
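The numbers in brackets are cylinders/heads/sectors per track, and they can be cross-checked with a little shell arithmetic. This is only a sketch: it assumes 512-byte sectors and that the slice starts right after the first track (63 sectors), which is how fdisk -I normally lays it out. The variable names are made up for the example.

```shell
# Geometry reported by dmesg: cylinders / heads / sectors per track.
cyl=387621; heads=16; spt=63

# Total sectors on the disk.
total=$((cyl * heads * spt))

# Assumption: 512-byte sectors, so 2048 sectors make one MB (2^20 bytes).
mb=$((total / 2048))

# Assumption: the slice skips the first track, so it is spt sectors shorter.
slice=$((total - spt))

echo "$total sectors, $mb MB, slice of $slice sectors"
```

The result matches the 190782MB from dmesg, and the slice size is the same 390721905 that disklabel reports for partition 'c' during the recovery further down.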
'vinum printconfig' shows that it has lost the failed disk:
...
drive b device [nothing here]
...
So the recovery procedure went like this:
# fdisk -BI ad4              (ignore the "invalid mbr" message)
# disklabel -wB ad4s1 auto
# disklabel -e ad4s1         (add a new partition 'h' equal to 'c', with fstype vinum:)
8 partitions:
#        size   offset    fstype   [fsize bsize bps/cpg]
  c: 390721905        0    unused        0     0         # (Cyl.    0 - 387620*)
  h: 390721905        0     vinum                        # (Cyl.    0 - 387620*)
# cat vinum.tmp.conf
drive b device /dev/ad4s1h
# vinum create vinum.tmp.conf
# rm vinum.tmp.conf
# vinum start raid5.p0.s1    (the subdisk that was "stale")
And now it is rebuilding (approx. 6 hours).
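For the curious, the rebuild works because RAID5 keeps one parity block per stripe, and XOR-ing the surviving blocks gives back the missing one. Below is a toy illustration of that general idea for a three-disk array, not vinum's actual code: one byte stands in for a whole block, and the values and variable names are made up.

```shell
# Toy stripe on a three-disk RAID5: two data blocks and one parity block.
d0=$((0xA5))          # data block on a surviving disk
d1=$((0x3C))          # data block that lived on the failed disk
parity=$((d0 ^ d1))   # parity block, stored on the other surviving disk

# Rebuild: XOR everything that survived to recover what was lost.
rebuilt=$((d0 ^ parity))

printf 'lost=0x%02X rebuilt=0x%02X\n' "$d1" "$rebuilt"
```

This prints lost=0x3C rebuilt=0x3C. A real rebuild has to do the same for every stripe on the disk, reading both surviving disks in full, which is why it takes hours.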
Update, two days later: a crash after more than a year of normal operation.
IdlePTD at physical address 0x00454000
initial pcb at physical address 0x00392600
panicstr: ffs_blkfree: freeing free block
panic messages:
---
panic: ffs_blkfree: freeing free block
syncing disks... 5 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
giving up on 2 buffers
with stack trace:
#0 0xc01747fa in dumpsys ()
#1 0xc01745c4 in boot ()
#2 0xc01749f8 in poweroff_wait ()
#3 0xc0266dff in ffs_blkfree ()
#4 0xc026b934 in indir_trunc ()
#5 0xc026b6e6 in handle_workitem_freeblocks ()
#6 0xc0269b2b in process_worklist_item ()
#7 0xc02699ba in softdep_process_worklist ()
#8 0xc01a3e6b in sched_sync ()