How to resolve the "Degraded array event detected" error on a CentOS server?
Degraded array event detected.
The first thing to check in this case is the RAID status of the server.
In our case, the RAID array md1 has lost its second member: its status shows [2/1] [U_] instead of [2/2] [UU]. To check this, use the command below.
Code:
root@falcon947 [~]# cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 sda1[0] sdb1[1]
      524224 blocks super 1.0 [2/2] [UU]
md1 : active raid1 sda3[0]
      1951809344 blocks super 1.1 [2/1] [U_]
      bitmap: 12/15 pages [48KB], 65536KB chunk
unused devices: <none>
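The same check can be scripted. Below is a minimal sketch, assuming a shell with awk available; the function name degraded_arrays is hypothetical and not part of mdadm. It reads /proc/mdstat-formatted text on stdin and prints every array whose member map contains an underscore, i.e. a missing disk.

```shell
# Print the name of every md array whose member map (e.g. [U_]) shows a missing disk.
# Reads /proc/mdstat-formatted text on stdin, so it also works on saved output.
# NOTE: degraded_arrays is a hypothetical helper name used for illustration only.
degraded_arrays() {
    awk '
        /^md[0-9]+ :/ { name = $1 }                      # remember the array being described
        name != "" && /blocks/ && /\[[U_]*_[U_]*\]/ {    # status line with at least one "_"
            print name
            name = ""
        }
    '
}
```

On the live server it would be called as: degraded_arrays < /proc/mdstat. For the output shown above it prints md1.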
Check dmesg for more detail. Here you will find that the partition sdb3 is the one with the issue: it was kicked from the array as non-fresh.
Code:
root@falcon947 [~]# dmesg | grep sdb
sd 1:0:0:0: [sdb] 3907029168 512-byte logical blocks: (2.00 TB/1.81 TiB)
sd 1:0:0:0: [sdb] 4096-byte physical blocks
sd 1:0:0:0: [sdb] Write Protect is off
sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sdb: sda1 sda2 sda3 sdb1 sdb2 sdb3
sd 1:0:0:0: [sdb] Attached SCSI disk
md: bind<sdb3>
md: kicking non-fresh sdb3 from array!
md: unbind<sdb3>
md: export_rdev(sdb3)
md: bind<sdb1>
Adding 1048572k swap on /dev/sdb2. Priority:-2 extents:1 across:1048572k
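The line to look for is "kicking non-fresh", which md logs when a member's metadata is older than that of the other members at assembly time. A small sketch for pulling the affected device names out of the kernel log, assuming the message wording shown above; kicked_members is a hypothetical helper name.

```shell
# Print the device name from every "md: kicking non-fresh <dev> from array" message.
# Reads kernel log text on stdin, for example: dmesg | kicked_members
# NOTE: kicked_members is a hypothetical helper name used for illustration only.
kicked_members() {
    sed -n 's/.*md: kicking non-fresh \([^ ]*\) from array.*/\1/p'
}
```

For the dmesg output above, dmesg | kicked_members prints sdb3.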
Check the structure of the block devices using the command 'lsblk'. You can see that sdb3 is not part of any RAID array and has no mount point.
Code:
root@falcon947 [~]# lsblk
NAME    MAJ:MIN RM  SIZE RO TYPE  MOUNTPOINT
sda       8:0    0  1.8T  0 disk
├─sda1    8:1    0  512M  0 part
│ └─md0   9:0    0  512M  0 raid1 /boot
├─sda2    8:2    0    1G  0 part  [SWAP]
└─sda3    8:3    0  1.8T  0 part
  └─md1   9:1    0  1.8T  0 raid1 /
sdb       8:16   0  1.8T  0 disk
├─sdb1    8:17   0  512M  0 part
│ └─md0   9:0    0  512M  0 raid1 /boot
├─sdb2    8:18   0    1G  0 part  [SWAP]
└─sdb3    8:19   0  1.8T  0 part

Before we perform any operation, we have to analyze the drive 'sdb' and check it for errors using 'smartctl'. In our case the disk passed the overall SMART health self-assessment, which is a good sign.
Code:
root@falcon947 [~]# smartctl -H /dev/sdb
smartctl 5.43 2016-09-28 r4347 [x86_64-linux-2.6.32-696.13.2.el6.x86_64] (local build)
Copyright (C) 2002-12 by Bruce Allen, http://smartmontools.sourceforge.net
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

So, we will try to remove the device sdb3 from the RAID array.
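Code:
root@falcon947 [~]# /sbin/mdadm /dev/md1 --fail /dev/sdb3 --remove /dev/sdb3
mdadm: set device faulty failed for /dev/sdb3: No such device

The "No such device" error indicates that sdb3 is no longer a member of md1 (the kernel had already kicked it out, as the dmesg output showed), so there is nothing left to remove. Try to re-add the device sdb3 to the RAID array md1 using the command below.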
Code:
root@falcon947 [~]# /sbin/mdadm /dev/md1 --add /dev/sdb3
mdadm: added /dev/sdb3

Once done, you can check the block devices in the tree structure as we did earlier.
Code:
root@falcon947 [~]# lsblk
NAME    MAJ:MIN RM  SIZE RO TYPE  MOUNTPOINT
sda       8:0    0  1.8T  0 disk
├─sda1    8:1    0  512M  0 part
│ └─md0   9:0    0  512M  0 raid1 /boot
├─sda2    8:2    0    1G  0 part  [SWAP]
└─sda3    8:3    0  1.8T  0 part
  └─md1   9:1    0  1.8T  0 raid1 /
sdb       8:16   0  1.8T  0 disk
├─sdb1    8:17   0  512M  0 part
│ └─md0   9:0    0  512M  0 raid1 /boot
├─sdb2    8:18   0    1G  0 part  [SWAP]
└─sdb3    8:19   0  1.8T  0 part
  └─md1   9:1    0  1.8T  0 raid1 /

You will notice that the device that dropped out of the array is being recovered, and the RAID array md1 is being rebuilt with the devices sdb3 and sda3.
Code:
root@falcon947 [~]# cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 sda1[0] sdb1[1]
      524224 blocks super 1.0 [2/2] [UU]
md1 : active raid1 sdb3[2] sda3[0]
      1951809344 blocks super 1.1 [2/1] [U_]
      [>....................]  recovery =  0.0% (43136/1951809344) finish=15830.5min speed=2054K/sec
      bitmap: 12/15 pages [48KB], 65536KB chunk
unused devices: <none>

That's it! The array keeps showing [2/1] [U_] while the recovery runs and returns to [2/2] [UU] once it finishes.
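To keep an eye on the rebuild, the recovery line of /proc/mdstat can be reduced to a percentage and a finish estimate in hours. This is only a sketch, assuming the recovery line format shown in this thread; recovery_status is a hypothetical helper name, and the estimate will move as the rebuild speed changes.

```shell
# Summarise rebuild progress as "<array> <percent> <hours>h".
# Reads /proc/mdstat-formatted text on stdin, for example: recovery_status < /proc/mdstat
# NOTE: recovery_status is a hypothetical helper name used for illustration only.
recovery_status() {
    awk '
        /^md[0-9]+ :/ { name = $1 }                  # array currently being described
        /(recovery|resync) *=/ {
            pct = ""; mins = 0
            for (i = 1; i <= NF; i++) {
                if ($i ~ /%$/) pct = $i              # e.g. 0.0%
                if ($i ~ /^finish=/) {               # e.g. finish=15830.5min
                    mins = $i
                    sub(/^finish=/, "", mins)
                    sub(/min$/, "", mins)
                }
            }
            printf "%s %s %.1fh\n", name, pct, mins / 60
        }
    '
}
```

For the output above this prints md1 0.0% 263.8h, i.e. roughly 11 days at the reported 2054K/sec.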