"So, I understand then that disk A is 'kaputt',"

At least that is mdadm's judgement. But reading carefully I think it's wrong. The raid header on each member contains the state of the array. I think that is done to prevent a scenario where you pull a disk, update the degraded array, and then add the (now obsolete) disk again, killing the filesystem.
But that status is written by the raid manager in the kernel. (Of course; the member itself is just a partition.) So how can sda2 contain a raid status where sdb2 is missing, while sdb2 contains the opposite? I think the raid manager writes this state as soon as it changes. So if you pull a disk from an assembled array, that disk is marked missing on all other members. The only way I can think of to get your situation is that the array was assembled with sdb2 missing, and later it was assembled with sda2 missing. The 'Update times' seem to agree:
- Code: Select all
Update Time : Sat May 23 13:42:53 2015
Update Time : Mon Jun 15 20:46:21 2015
Update Time : Mon Jun 15 20:46:21 2015
Update Time : Mon Jun 15 20:46:21 2015
(BTW, the box should have refused to assemble the array with sda2 missing. As sdb2 was already missing, the array was down. Maybe you forced something using the web interface on June 15?)
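If you want to verify this yourself, those update times and states come from the raid superblocks, which you can dump on each member. Assuming the usual /dev/sd[abcd]2 naming used above:
- Code: Select all
# Dump the raid superblock of every member; compare 'Update Time',
# 'State' and the device roles listed at the bottom of each report.
mdadm --examine /dev/sd[abcd]2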
"mounting /dev/md4 on /mountpoint failed: Input/output error"

AFAIK there are two ways you can get this error: a hardware error, causing a sector to be unreadable, or an error in the filesystem, pointing to a sector outside the array. A look at dmesg might tell you which. If the above story is true, both could apply: the filesystem is likely damaged, and possibly sdb2 is dying.
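A sketch of how to check, right after a failed mount attempt (the grep pattern is just a starting point; adjust it to what you actually see):
- Code: Select all
# Look at the tail of the kernel log directly after the failed mount.
dmesg | tail -n 50
# A dying disk typically shows 'I/O error' or ata exception lines for sdb:
dmesg | grep -iE 'i/o error|ata|sdb'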
If sdb2 is dying, you can try to reassemble the array with sdb2 missing. If the filesystem is damaged, you could do the same, but if you made significant changes to the filesystem after Jun 15, this might not help. It won't hurt to try, of course; as long as you mount the array read-only, it won't cause further damage. Maybe the array can be assembled this way:
- Code: Select all
mdadm --assemble /dev/md4 /dev/sd[acd]2 --force
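If that succeeds, mount it read-only first. /mountpoint here stands for whatever mount point you use:
- Code: Select all
# Read-only mount, so nothing is written to the (possibly damaged) filesystem.
mount -o ro /dev/md4 /mountpoint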
If you are tempted to repair the filesystem, it is desirable to make a low-level copy of the disks (or the filesystem) first, as a repair attempt can cause further damage. Unless you have good backups, of course.
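Such a copy can be made with GNU ddrescue (preferable on a dying disk, if it's available), or with plain dd. A sketch, assuming a large enough external disk is mounted on /mnt/backup (a hypothetical path):
- Code: Select all
# With ddrescue: retries bad sectors and keeps a map file, so it can resume.
ddrescue /dev/md4 /mnt/backup/md4.img /mnt/backup/md4.map
# Fallback with plain dd; conv=noerror,sync pads unreadable blocks with zeroes
# instead of aborting.
dd if=/dev/md4 of=/mnt/backup/md4.img bs=64K conv=noerror,sync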