Degraded array event detected.

    How to resolve the error "Degraded array event detected" on a CentOS server?

#2

    The first thing to check in this case is the RAID status of the server.

    In our case, you will notice that in the RAID array md1 the second disk has dropped out of sync, shown as [U_]. To check this, use the command below.

    Code:
    root@falcon947 [~]# cat /proc/mdstat
    Personalities : [raid1]
    md0 : active raid1 sda1[0] sdb1[1]
          524224 blocks super 1.0 [2/2] [UU]
    
    md1 : active raid1 sda3[0]
          1951809344 blocks super 1.1 [2/1] [U_]
          bitmap: 12/15 pages [48KB], 65536KB chunk
    
    unused devices: <none>
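
    For a fuller view of the degraded array, 'mdadm --detail' reports the state of each member device; it is a standard mdadm option, though its output is omitted here.

    Code:
    # Print detailed per-device state of the md1 array
    /sbin/mdadm --detail /dev/md1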

    Check dmesg for more detail; there you will find that the partition sdb3 is the one with the issue.

    Code:
    root@falcon947 [~]# dmesg | grep sdb
    sd 1:0:0:0: [sdb] 3907029168 512-byte logical blocks: (2.00 TB/1.81 TiB)
    sd 1:0:0:0: [sdb] 4096-byte physical blocks
    sd 1:0:0:0: [sdb] Write Protect is off
    sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
    sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
     sdb: sdb1 sdb2 sdb3
    sd 1:0:0:0: [sdb] Attached SCSI disk
    md: bind<sdb3>
    md: kicking non-fresh sdb3 from array!
    md: unbind<sdb3>
    md: export_rdev(sdb3)
    md: bind<sdb1>
    Adding 1048572k swap on /dev/sdb2.  Priority:-2 extents:1 across:1048572k
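
    Because dmesg is a ring buffer and older entries can rotate out, you can also search the persistent system log for md events; the path below assumes the default CentOS 6 syslog configuration.

    Code:
    # Search the persistent log for recent md/RAID messages
    grep -i 'md:' /var/log/messages | tail -20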

    Check the structure of the block devices using the command 'lsblk'. You can see that sdb3 is no longer part of the RAID array and doesn't hold any mount point.

    Code:
    root@falcon947 [~]# lsblk
    NAME    MAJ:MIN RM  SIZE RO TYPE  MOUNTPOINT
    sda       8:0    0  1.8T  0 disk
    ├─sda1    8:1    0  512M  0 part
    │ └─md0   9:0    0  512M  0 raid1 /boot
    ├─sda2    8:2    0    1G  0 part  [SWAP]
    └─sda3    8:3    0  1.8T  0 part
      └─md1   9:1    0  1.8T  0 raid1 /
    sdb       8:16   0  1.8T  0 disk
    ├─sdb1    8:17   0  512M  0 part
    │ └─md0   9:0    0  512M  0 raid1 /boot
    ├─sdb2    8:18   0    1G  0 part  [SWAP]
    └─sdb3    8:19   0  1.8T  0 part
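
    You can also confirm whether sdb3 still carries RAID metadata with 'mdadm --examine', which prints the md superblock on the partition if one is present (output omitted here).

    Code:
    # Print the md superblock on sdb3 to confirm it was a RAID member
    /sbin/mdadm --examine /dev/sdb3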

    Before performing any operation, analyze the drive 'sdb' and check for errors using 'smartctl'. In our case the disk passed the overall health check, which is a good sign.

    Code:
    root@falcon947 [~]# smartctl -H /dev/sdb
    smartctl 5.43 2016-09-28 r4347 [x86_64-linux-2.6.32-696.13.2.el6.x86_64] (local build)
    Copyright (C) 2002-12 by Bruce Allen, http://smartmontools.sourceforge.net
    
    === START OF READ SMART DATA SECTION ===
    SMART overall-health self-assessment test result: PASSED
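
    The '-H' flag reports only the overall health verdict; for a closer look you can run a short self-test and review the attributes that commonly precede disk failure. These are standard smartctl options.

    Code:
    # Start a short SMART self-test (usually completes within a few minutes)
    smartctl -t short /dev/sdb
    # Review the self-test log once it finishes
    smartctl -l selftest /dev/sdb
    # Check attributes that often signal a failing disk
    smartctl -A /dev/sdb | egrep 'Reallocated|Pending|Uncorrectable'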

    Since the disk itself looks healthy, we will try to fail and remove the device sdb3 from the RAID array.

    Code:
    root@falcon947 [~]# /sbin/mdadm /dev/md1 --fail /dev/sdb3 --remove /dev/sdb3
    mdadm: set device faulty failed for /dev/sdb3:  No such device

    The removal fails with 'No such device' because the kernel already kicked the non-fresh sdb3 out of the array at boot (see the dmesg output above), so there is nothing left to remove. Simply re-add the device sdb3 to the RAID array md1 using the command below.

    Code:
    root@falcon947 [~]# /sbin/mdadm /dev/md1 --add /dev/sdb3
    mdadm: added /dev/sdb3
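
    If the re-add had been rejected, for example because of stale metadata on the partition, one common approach is to clear the old md superblock first and then add the partition back as a fresh member. Use this with care: it destroys the RAID metadata on sdb3.

    Code:
    # Only if a plain --add fails: wipe the stale md superblock on sdb3,
    # then add the partition back to the array as a new member
    /sbin/mdadm --zero-superblock /dev/sdb3
    /sbin/mdadm /dev/md1 --add /dev/sdb3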

    Once done, you can check the block devices in the tree structure as we did earlier.

    Code:
    root@falcon947 [~]# lsblk
    NAME    MAJ:MIN RM  SIZE RO TYPE  MOUNTPOINT
    sda       8:0    0  1.8T  0 disk
    ├─sda1    8:1    0  512M  0 part
    │ └─md0   9:0    0  512M  0 raid1 /boot
    ├─sda2    8:2    0    1G  0 part  [SWAP]
    └─sda3    8:3    0  1.8T  0 part
      └─md1   9:1    0  1.8T  0 raid1 /
    sdb       8:16   0  1.8T  0 disk
    ├─sdb1    8:17   0  512M  0 part
    │ └─md0   9:0    0  512M  0 raid1 /boot
    ├─sdb2    8:18   0    1G  0 part  [SWAP]
    └─sdb3    8:19   0  1.8T  0 part
      └─md1   9:1    0  1.8T  0 raid1 /

    Checking /proc/mdstat again, you will notice that the out-of-sync device is being recovered: the RAID array md1 is rebuilding onto sdb3 from sda3.

    Code:
    root@falcon947 [~]# cat /proc/mdstat
    Personalities : [raid1]
    md0 : active raid1 sda1[0] sdb1[1]
          524224 blocks super 1.0 [2/2] [UU]
    
    md1 : active raid1 sdb3[2] sda3[0]
          1951809344 blocks super 1.1 [2/1] [U_]
          [>....................]  recovery =  0.0% (43136/1951809344) finish=15830.5min speed=2054K/sec
          bitmap: 12/15 pages [48KB], 65536KB chunk
    
    unused devices: <none>
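
    A rebuild of a 1.8 TB mirror can take many hours, so it is convenient to watch the progress refresh automatically; the interval below is arbitrary.

    Code:
    # Refresh the rebuild progress every 10 seconds (Ctrl-C to stop)
    watch -n 10 cat /proc/mdstat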

    That's it!
