Announcement

**Mandeep** · 04-23-2018, 11:16 AM

The first thing to check in this case is the RAID status of the server.

In our case, the RAID array md1 shows [U_], which means its second disk has dropped out of sync. To check this, use the command below.

Code:

root@falcon947 [~]# cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 sda1[0] sdb1[1]
      524224 blocks super 1.0 [2/2] [UU]

md1 : active raid1 sda3[0]
      1951809344 blocks super 1.1 [2/1] [U_]
      bitmap: 12/15 pages [48KB], 65536KB chunk

unused devices: <none>
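If you have several arrays, the same check can be scripted. Below is a minimal sketch, not part of the original procedure: the function name degraded_arrays is made up for illustration, and it assumes the status string looks like [UU] or [U_] as in the output above.

```shell
# List md arrays whose status string contains "_" (a missing member).
# Reads /proc/mdstat by default; pass a saved file, or "-" for stdin.
degraded_arrays() {
    awk '
        /^md[0-9]+ :/ { name = $1 }              # remember the current array
        match($0, /\[[U_]+\]/) {                 # e.g. [UU] or [U_]
            if (index(substr($0, RSTART, RLENGTH), "_")) print name
        }
    ' "${1:-/proc/mdstat}"
}
```

Run against the output above, degraded_arrays would print md1 only, since md0 is [UU].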

Check dmesg for more detail. Here you will find that sdb3 is the partition with the issue: the kernel kicked it out of the array as "non-fresh".

Code:

root@falcon947 [~]# dmesg | grep sdb
sd 1:0:0:0: [sdb] 3907029168 512-byte logical blocks: (2.00 TB/1.81 TiB)
sd 1:0:0:0: [sdb] 4096-byte physical blocks
sd 1:0:0:0: [sdb] Write Protect is off
sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
 sdb: sda1 sda2 sda3
 sdb1 sdb2 sdb3
sd 1:0:0:0: [sdb] Attached SCSI disk
md: bind<sdb3>
md: kicking non-fresh sdb3 from array!
md: unbind<sdb3>
md: export_rdev(sdb3)
md: bind<sdb1>
Adding 1048572k swap on /dev/sdb2.  Priority:-2 extents:1 across:1048572k
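To pull just the kicked partition out of a long kernel log, something like the following can help. This is a rough sketch: kicked_devices is a hypothetical helper name, and it assumes the message wording is exactly "md: kicking non-fresh ... from array" as shown above.

```shell
# Print each device the md driver kicked as "non-fresh".
# Reads kernel log text on stdin, e.g.: dmesg | kicked_devices
kicked_devices() {
    sed -n 's/.*md: kicking non-fresh \([^ ]*\) from array.*/\1/p'
}
```

A "non-fresh" member is one whose superblock event counter is behind the other members. If you want to confirm that, comparing the Events lines from 'mdadm --examine /dev/sda3 /dev/sdb3' should show the difference.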

Check the structure of the block devices using the command 'lsblk'. You can see that sdb3 is not part of any RAID array and has no mount point.

Code:

root@falcon947 [~]# lsblk
NAME    MAJ:MIN RM  SIZE RO TYPE  MOUNTPOINT
sda       8:0    0  1.8T  0 disk
├─sda1    8:1    0  512M  0 part
│ └─md0   9:0    0  512M  0 raid1 /boot
├─sda2    8:2    0    1G  0 part  [SWAP]
└─sda3    8:3    0  1.8T  0 part
  └─md1   9:1    0  1.8T  0 raid1 /
sdb       8:16   0  1.8T  0 disk
├─sdb1    8:17   0  512M  0 part
│ └─md0   9:0    0  512M  0 raid1 /boot
├─sdb2    8:18   0    1G  0 part  [SWAP]
└─sdb3    8:19   0  1.8T  0 part

Before performing any operation, analyze the drive 'sdb' for errors using 'smartctl'. In our case the disk 'PASSED' the overall health self-assessment, which is a good sign.

Code:

root@falcon947 [~]# smartctl -H /dev/sdb
smartctl 5.43 2016-09-28 r4347 [x86_64-linux-2.6.32-696.13.2.el6.x86_64] (local build)
Copyright (C) 2002-12 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
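The -H flag only reports the drive's overall verdict. For a closer look you can read the attribute table from 'smartctl -A' and pick out the sector-related counters. The helper below is a sketch only: smart_raw_values is a made-up name, and it assumes the usual ATA attribute table layout where the raw value is the tenth column.

```shell
# Print the raw value of selected SMART attributes from `smartctl -A` output.
# Reads stdin, e.g.: smartctl -A /dev/sdb | smart_raw_values
smart_raw_values() {
    awk '$2 ~ /^(Reallocated_Sector_Ct|Current_Pending_Sector|Offline_Uncorrectable)$/ {
        print $2 "=" $10                         # attribute name and raw value
    }'
}
```

Non-zero raw values here would be a reason to replace the disk rather than add it back to the array.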

Next, try to mark the device sdb3 as failed and remove it from the RAID array md1.

Code:

root@falcon947 [~]# /sbin/mdadm /dev/md1 --fail /dev/sdb3 --remove /dev/sdb3
mdadm: set device faulty failed for /dev/sdb3:  No such device
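You can check beforehand whether a partition is still listed in an array, which tells you whether a --fail/--remove is needed at all. A small sketch follows, with a hypothetical function name is_md_member, assuming the member list format shown by /proc/mdstat above.

```shell
# Succeed if partition $2 (e.g. sdb3) is listed as a member of array $1
# (e.g. md1). Optional $3: a saved mdstat file, or "-" for stdin.
is_md_member() {
    awk -v md="$1" -v part="$2" '
        $1 == md && $2 == ":" {
            for (i = 3; i <= NF; i++) {
                dev = $i
                sub(/\[.*/, "", dev)             # strip "[0]" and flags like "(F)"
                if (dev == part) found = 1
            }
        }
        END { exit !found }                      # exit 0 only when found
    ' "${3:-/proc/mdstat}"
}
```

For example, 'is_md_member md1 sdb3 || echo "sdb3 is not in md1"' would print the message on this server.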

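The removal fails with 'No such device' because the kernel already kicked sdb3 out of md1 (see the dmesg output above), so there is nothing left to remove. Add the device sdb3 back to the RAID array md1 using the command below.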

Code:

root@falcon947 [~]# /sbin/mdadm /dev/md1 --add /dev/sdb3
mdadm: added /dev/sdb3

Once done, check the block devices in the tree structure as we did earlier. sdb3 now appears as a member of md1.

Code:

root@falcon947 [~]# lsblk
NAME    MAJ:MIN RM  SIZE RO TYPE  MOUNTPOINT
sda       8:0    0  1.8T  0 disk
├─sda1    8:1    0  512M  0 part
│ └─md0   9:0    0  512M  0 raid1 /boot
├─sda2    8:2    0    1G  0 part  [SWAP]
└─sda3    8:3    0  1.8T  0 part
  └─md1   9:1    0  1.8T  0 raid1 /
sdb       8:16   0  1.8T  0 disk
├─sdb1    8:17   0  512M  0 part
│ └─md0   9:0    0  512M  0 raid1 /boot
├─sdb2    8:18   0    1G  0 part  [SWAP]
└─sdb3    8:19   0  1.8T  0 part
  └─md1   9:1    0  1.8T  0 raid1 /

In /proc/mdstat you will now see that the device that was out of sync is being recovered: the RAID array md1 is rebuilding onto sdb3 from sda3. The status stays [U_] until the recovery completes.

Code:

root@falcon947 [~]# cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 sda1[0] sdb1[1]
      524224 blocks super 1.0 [2/2] [UU]

md1 : active raid1 sdb3[2] sda3[0]
      1951809344 blocks super 1.1 [2/1] [U_]
      [>....................]  recovery =  0.0% (43136/1951809344) finish=15830.5min speed=2054K/sec
      bitmap: 12/15 pages [48KB], 65536KB chunk

unused devices: <none>
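A rebuild of this size takes a while, so it helps to poll the progress instead of reading the whole file each time. Here is a minimal sketch, with rebuild_progress as a made-up helper name, assuming the "recovery = N% ... finish=..." line format shown above.

```shell
# Print "<array> <percent> <eta>" for every array that is resyncing or recovering.
# Reads /proc/mdstat by default; pass a saved file, or "-" for stdin.
rebuild_progress() {
    awk '
        /^md[0-9]+ :/ { name = $1 }              # remember the current array
        /(recovery|resync) *=/ {
            pct = ""; eta = ""
            for (i = 1; i <= NF; i++) {
                if ($i ~ /%$/) pct = $i
                if ($i ~ /^finish=/) { eta = $i; sub(/^finish=/, "", eta) }
            }
            print name, pct, eta
        }
    ' "${1:-/proc/mdstat}"
}
```

You could run it under 'watch -n 60' or from cron. If the speed stays very low, as with the 2054K/sec here, the kernel throttles in /proc/sys/dev/raid/speed_limit_min and speed_limit_max are worth a look, though raising them will compete with normal disk I/O.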

That's it! Once the recovery finishes, md1 should show [2/2] [UU] again.

Degraded array event detected.
