System health bash script ?


  • System health bash script ?

    Can anybody provide a simple sample bash script that produces a report of important system health information, including hardware resources, and sends it via email?

  • #2

    Many such scripts exist, and the right one depends on what the report needs to cover. You can use the following script, which gives a detailed report on:

    - Mounted File Systems
    - Read-only File System If Any
    - File Systems Disk Usage
    - Inode Usage
    - Zombie Processes If Found
    - RAM Utilization
    - SWAP Utilization
    - CPU Utilization
    - Current Load Average
    - Top 5 memory- and CPU-consuming processes


    Create the file /root/system-health.sh, add the following code, then give the file 755 permissions (chmod 755 /root/system-health.sh).


    Code:
    #!/bin/bash
    ##----------Created By Swapnil at 24x7servermanagement.com -----------------------------##
    ##---------- Purpose : To quickly check and report health status in a linux system.----------##
    ##---------- Tested on : RHEL/Centos-7/6, Ubuntu16----##
    ##-----NOTE: This script requires root privileges, otherwise you could run the script -------##
    ##---- as a sudo user who got root privileges. ----------------------------------------------##
    ##----------- "sudo /bin/bash <ScriptName>" -------------------------------------------------##
    
    S="************************************"
    D="-------------------------------------"
    
    # grep -E replaces egrep, which is deprecated in newer GNU grep releases
    mount=$(mount|grep -Eiw "ext4|ext3|xfs|gfs|gfs2|btrfs"|sort -u -t' ' -k1,2)
    disk=$(df -PTh|grep -Eiw "ext4|ext3|xfs|gfs|gfs2|btrfs"|sort -k6n|awk '!seen[$1]++')
    inode=$(df -PThi|grep -Eiw "ext4|ext3|xfs|gfs|gfs2|btrfs"|sort -k6n|awk '!seen[$1]++')
    
    #--------Checking the availability of sysstat package..........#
    if [ ! -x /usr/bin/mpstat ]
    then
        printf "\nError : Either \"mpstat\" command not available OR \"sysstat\" package is not properly installed. Please make sure this package is installed and working properly!, then run this script.\n\n"
        exit 1
    fi
    
    echo -e "$S Health Status Report $S"
    echo -e "\nOperating System Details"
    echo -e "$D"
    # Use the FQDN when it resolves, otherwise fall back to the short hostname
    printf "Hostname : %s" "$(hostname -f 2>/dev/null || hostname -s)"
    
    [ -x /usr/bin/lsb_release ] &&  echo -e "\nOperating System :" $(lsb_release -d|awk -F: '{print $2}'|sed -e 's/^[ \t]*//')  || echo -e "\nOperating System :" $(cat /etc/system-release)
    echo -e "Kernel Version :" $(uname -r)
    # Test the result of grep itself; a leading printf always succeeds and would always report 64 Bit
    printf "OS Architecture :"; arch | grep -q x86_64 && printf " 64 Bit OS\n" || printf " 32 Bit OS\n"
    
    #--------Print system uptime-------#
    UPTIME=$(uptime)
    if ! echo "$UPTIME"|grep -q day
    then
      echo $UPTIME|grep -w min 2>&1 > /dev/null && echo -e "System Uptime : "$(echo $UPTIME|awk '{print $2" by "$3}'|sed -e 's/,.*//g')" minutes"  || echo -e "System Uptime : "$(echo $UPTIME|awk '{print $2" by "$3" "$4}'|sed -e 's/,.*//g')" hours"
    else
      echo -e "System Uptime :" $(echo $UPTIME|awk '{print $2" by "$3" "$4" "$5" hours"}'|sed -e 's/,//g')
    fi
    echo -e "Current System Date & Time : "$(date +%c)
    
    
    #--------Check read-only file systems--------#
    echo -e "\nChecking For Read-only File System[s]"
    echo -e "$D"
    # mount lists options as "(ro,relatime)", so match "ro" as one option inside the parentheses
    echo "$mount"|grep -E '[(,]ro[,)]' && echo -e "\n.....Read Only file system[s] found"|| echo -e ".....No read-only file system[s] found. "
    
    
    #--------Check currently mounted file systems--------#
    echo -e "\n\nChecking For Currently Mounted File System[s]"
    echo -e "$D$D"
    echo "$mount"|column -t
    
    
    #--------Check disk usage--------#
    echo -e "\n\nChecking For Disk Usage On Mounted File System[s]"
    echo -e "$D$D"
    echo -e "( 0-90% = OK/HEALTHY, 90-95% = WARNING, 95-100% = CRITICAL )"
    echo -e "$D$D"
    echo -e "Mounted File System[s] Utilization (Percentage Used):\n"
    
    echo "$disk"|awk '{print $1 " "$7}' > /tmp/s1.out
    echo "$disk"|awk '{print $6}'|sed -e 's/%//g' > /tmp/s2.out
    > /tmp/s3.out
    
    for i in $(cat /tmp/s2.out);
    do
    {
      if [ $i -ge 95 ];
       then
         echo -e $i"% ------------------Critical" >> /tmp/s3.out;
       elif [[ $i -ge 90 && $i -lt 95 ]];
       then
         echo -e $i"% ------------------Warning" >> /tmp/s3.out;
       else
         echo -e $i"% ------------------Good/Healthy" >> /tmp/s3.out;
      fi
    }
    done
    paste -d"\t" /tmp/s1.out /tmp/s3.out|column -t
    
    #--------Check zombie processes--------#
    echo -e "\n\nChecking For Zombie Processes"
    echo -e "$D"
    if ps -eo stat|grep -qw Z
    then
      echo -e "Number of zombie processes on the system :" $(ps -eo stat|grep -w Z|wc -l)
      echo -e "\n  Details of each zombie process found    "
      echo -e "  $D"
      ZPROC=$(ps -eo stat,pid|grep -w Z|awk '{print $2}')
      for i in $(echo "$ZPROC")
      do
          ps -o pid,ppid,user,stat,args -p $i
      done
    else
     echo -e "No zombie processes found on the system."
    fi
    
    #--------Check Inode usage--------#
    echo -e "\n\nChecking For INode Usage"
    echo -e "$D$D"
    echo -e "( 0-90% = OK/HEALTHY, 90-95% = WARNING, 95-100% = CRITICAL )"
    echo -e "$D$D"
    echo -e "INode Utilization (Percentage Used):\n"
    
    echo "$inode"|awk '{print $1" "$7}' > /tmp/s1.out
    echo "$inode"|awk '{print $6}'|sed -e 's/%//g' > /tmp/s2.out
    > /tmp/s3.out
    
    for i in $(cat /tmp/s2.out);
    do
      if [[ $i = *[[:digit:]]* ]];
      then
      {
      if [ $i -ge 95 ];
      then
        echo -e $i"% ------------------Critical" >> /tmp/s3.out;
      elif [[ $i -ge 90 && $i -lt 95 ]];
      then
        echo -e $i"% ------------------Warning" >> /tmp/s3.out;
      else
        echo -e $i"% ------------------Good/Healthy" >> /tmp/s3.out;
      fi
      }
      else
        echo -e $i"% (Inode Percentage details not available)" >> /tmp/s3.out
      fi
    done
    paste -d"\t" /tmp/s1.out /tmp/s3.out|column -t
    
    
    #--------Check RAM Utilization--------#
    memory=$(cat /proc/meminfo)
    echo -e "\n\nChecking Memory Usage Details"
    echo -e "$D"
    echo -e "Total RAM (/proc/meminfo) : "$(echo "$memory"|grep MemTotal|awk '{print $2/1024}') "MB OR" $(echo "$memory"|grep MemTotal|awk '{print $2/1024/1024}') "GB"
    echo -e "Used RAM in MB : "$(free -m|grep -w Mem:|awk '{print $3}')", in GB : "$(free -m|grep -w Mem:|awk '{print $3/1024}')
    echo -e "Free RAM in MB : "$(echo "$memory"|grep -w MemFree|awk '{print $2/1024}')" , in GB : "$(echo "$memory"|grep -w MemFree |awk '{print $2/1024/1024}')
    
    #--------Check SWAP Utilization--------#
    echo -e "\n\nChecking SWAP Details"
    echo -e "$D"
    echo -e "Total Swap Memory in MB : "$(echo "$memory"|grep -w SwapTotal|awk '{print $2/1024}')", in GB : "$(echo "$memory"|grep -w SwapTotal|awk '{print $2/1024/1024}')
    echo -e "Swap Free Memory in MB : "$(echo "$memory"|grep -w SwapFree|awk '{print $2/1024}')", in GB : "$(echo "$memory"|grep -w SwapFree|awk '{print $2/1024/1024}')
    
    #--------Check CPU Utilization (current data)--------#
    echo -e "\n\nChecking For CPU Utilization"
    echo -e "$D"
    echo -e "Manufacturer: "$(dmidecode -s processor-manufacturer|uniq)
    echo -e "Processor Model: "$(dmidecode -s processor-version|uniq)
    if [ -e /usr/bin/lscpu ]
    then
    {
        echo -e "No. Of Processor(s) :" $(lscpu|grep -w "Socket(s):"|awk -F: '{print $2}')
        echo -e "No. of Core(s) per processor :" $(lscpu|grep -w "Core(s) per socket:"|awk -F: '{print $2}')
    }
    else
    {
        echo -e "No. Of Processor(s) Found :" $(grep -c processor /proc/cpuinfo)
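        # Read the value of the first "cpu cores" line; counting unique lines would always print 1
        echo -e "No. of Core(s) per processor :" $(grep -m1 "cpu cores" /proc/cpuinfo|awk -F: '{print $2}')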
    }
    fi
    echo -e "\nCurrent Processor Utilization Summary :\n"
    mpstat|tail -2
    
    #--------Check for load average (current data)--------#
    echo -e "\n\nChecking For Load Average"
    echo -e "$D"
    echo -e "Current Load Average : $(uptime|grep -o "load average.*"|awk '{print $3" " $4" " $5}')"
    
    
    #--------Print top 5 most memory consuming resources---------#
    echo -e "\n\nTop 5 Memory Resource Processes"
    echo -e "$D$D"
    # Let ps sort numerically; a plain "sort -r" compares text and ranks 9.5 above 10.2
    ps -eo pmem,pcpu,pid,ppid,user,stat,args --sort=-pmem | head -6|sed 's/$/\n/'
    
    #--------Print top 5 most CPU consuming resources---------#
    echo -e "\n\nTop 5 CPU Resource Processes"
    echo -e "$D$D"
    # Let ps sort numerically; a plain "sort -r" compares text and ranks 9.5 above 10.2
    ps -eo pcpu,pmem,pid,ppid,user,stat,args --sort=-pcpu | head -6|sed 's/$/\n/'
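
The disk and inode sections of the script apply the same thresholds (below 90% healthy, 90-94% warning, 95% and above critical) and pass the values through fixed files under /tmp. As a rough sketch of an alternative, and not part of the original script, the classification could live in one function and the report could be built in a pipeline. The name classify_usage is made up for illustration, and the loop uses plain `df -P` on every file system instead of the script's file system type filter.

```shell
# Hypothetical helper: map a usage percentage to the report's status labels.
classify_usage() {
    local pct
    pct=${1%\%}                      # accept "87" as well as "87%"
    case "$pct" in
        ''|*[!0-9]*) echo "Unknown"; return 0 ;;   # e.g. "-" when inode data is unavailable
    esac
    if [ "$pct" -ge 95 ]; then
        echo "Critical"
    elif [ "$pct" -ge 90 ]; then
        echo "Warning"
    else
        echo "Good/Healthy"
    fi
}

# Print "filesystem mountpoint usage status" without temporary files.
# With df -P, column 5 is the capacity and column 6 the mount point.
# Assumption: mount points contain no spaces.
df -P 2>/dev/null | awk 'NR > 1 {print $1, $6, $5}' | while read -r fs mnt use; do
    printf '%s %s %s %s\n' "$fs" "$mnt" "$use" "$(classify_usage "$use")"
done
```

Keeping the thresholds in one place means the disk and inode checks cannot drift apart, and avoiding fixed names under /tmp matters for a script that runs as root.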



    After that, set up the following cron job, which runs the script once a day at 01:00 and mails the report.


    Code:
    0 1 * * * /root/system-health.sh 2>&1 | mail -s "Server Health Check" example@mail.example
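
Because the cron entry pipes everything into mail, a missing tool only shows up as an odd or absent email. A small pre-flight check, run once by hand before enabling the job, might look like the sketch below. The function name missing_commands is invented for this example; the command list is taken from what the script and the cron line call (mpstat, dmidecode, mail).

```shell
# Hypothetical helper: print every named command that is not found on PATH.
missing_commands() {
    local cmd
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
    done
}

# Commands the health script and the cron pipeline depend on.
missing=$(missing_commands mpstat dmidecode mail)
if [ -n "$missing" ]; then
    # $missing is left unquoted on purpose so the names print on one line.
    echo "Missing commands:" $missing >&2
fi
```

If nothing is printed, all three commands were found and the cron job should be able to produce and send the report.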
    Last edited by Swapnil; 11-16-2017, 08:59 PM.
