Script deployment location:
The scripts that check network health are deployed on 10.10.10.247.
Crontab configuration for the scripts:
#monitor servers
*/5 * * * * /usr/local/shell/sunweijing/program/pingmonitor.sh > /dev/null 2>&1
#servers recover
*/5 * * * * /usr/local/shell/sunweijing/program/recover.sh > /dev/null 2>&1
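How the scripts work:
(1) pingmonitor.sh
Every 5 minutes, pingmonitor.sh runs “ping -W 1 -c 2 <destination IP> | grep "received" | awk '{print $6}' | sed 's/\%//g'” to obtain the packet loss rate between 10.10.10.247 and the destination IP. If the packet loss rate equals 100, it sends an SMS alert with the content: Please check the host what its ipaddrss is <destination IP>,<current date>.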
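The `$6` field in the pipeline above holds the loss percentage only when the summary line has its usual shape; when ping also reports errors (for example `+2 errors`), the fields shift by one. The following is a minimal sketch of a more tolerant parser, assuming the Linux iputils summary format. `parse_loss` is a hypothetical helper and is not part of the original scripts.

```shell
# parse_loss (hypothetical helper): read ping output on stdin and print the
# integer part of the packet-loss percentage, wherever it sits on the line.
parse_loss() {
    sed -n 's/.* \([0-9][0-9]*\)\(\.[0-9]*\)\{0,1\}% packet loss.*/\1/p'
}

# Typical use, mirroring the original pipeline:
#   loss=$(ping -W 1 -c 2 "$ip" | parse_loss)
```

Because the helper only reads standard input, it can be checked against saved ping output without touching the network.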
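If the packet loss rate is not 100%, it runs “ping -W 1 -c 3 <destination IP> | grep "rtt" | awk '{print $4}' | awk -F "/" '{print $2}' | awk -F "." '{print $1}'” to obtain the average network latency between 10.10.10.247 and the destination IP, that is, the integer part of the average round-trip time in milliseconds. If the latency is greater than 30, it sends an SMS alert with the content (the message says "ttl", but the value is the average round-trip time):
Please check the host and the network what its ipaddrss is <IP address>,Because the avg ttl is <network latency>,<current date>.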
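The latency extraction and the threshold test can be sketched as two small functions. This assumes the Linux iputils `rtt min/avg/max/mdev = ...` summary line; `parse_avg_rtt` and `is_slow` are hypothetical names introduced here for illustration.

```shell
# parse_avg_rtt (hypothetical helper): read ping output on stdin and print the
# integer part of the average round-trip time in milliseconds.
parse_avg_rtt() {
    awk '/^rtt/ { split($4, t, "/"); print int(t[2]) }'
}

# is_slow VALUE LIMIT: succeed only if VALUE is non-empty and greater than LIMIT.
# An empty VALUE (no rtt line was printed) counts as "not slow" instead of
# producing a test error.
is_slow() {
    [ -n "$1" ] && [ "$1" -gt "$2" ]
}

# Typical use, mirroring the original check:
#   rtt=$(ping -W 1 -c 3 "$ip" | parse_avg_rtt)
#   is_slow "$rtt" 30 && echo "alert for $ip"
```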
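The script reads the destination IPs from an IP list. Whenever pingmonitor.sh finds that an IP has a packet loss rate of 100%, it logs the IP and the date to timeout.txt, removes the IP from the IP list, and appends it to recover.txt.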
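Moving an IP from a list file to the recovery file can be sketched as below. The sketch matches whole lines with `grep -xF`, so that removing 10.10.10.1 does not also remove 10.10.10.10; `move_ip` and its arguments are hypothetical and only illustrate the idea.

```shell
# move_ip IP SRC DST (hypothetical helper): append IP to DST, then remove the
# exact matching line from SRC. Returns non-zero if IP is not listed in SRC.
move_ip() {
    ip=$1
    src=$2
    dst=$3
    grep -qxF -- "$ip" "$src" || return 1     # nothing to move
    echo "$ip" >> "$dst" || return 1          # do not drop the IP if the append failed
    # grep exits 1 when no lines remain, which is fine here.
    grep -vxF -- "$ip" "$src" > "$src.tmp" || true
    mv "$src.tmp" "$src"
}

# Typical use:
#   move_ip 10.10.10.1 sourcefile/ip10_1.txt result/recover.txt
```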
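Highlight of this script:
The IP list is not kept in a single file but split across several files. pingmonitor.sh starts one background script per list file, so the lists are checked almost simultaneously and a full pass takes roughly as long as the slowest list rather than the sum of all lists. See the source below for details.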
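The fan-out pattern can be sketched with `&` and `wait`. In this sketch `run_all` and `count_entries` are hypothetical names; the worker here only counts list entries so that the example stays self-contained, whereas the real workers would ping each entry.

```shell
# count_entries FILE (hypothetical stand-in worker): write the number of
# lines in FILE to FILE.count.
count_entries() {
    wc -l < "$1" | tr -d ' ' > "$1.count"
}

# run_all WORKER FILE...: start WORKER on every FILE in the background,
# then block until all of them have finished.
run_all() {
    worker=$1
    shift
    for f in "$@"; do
        "$worker" "$f" &
    done
    wait
}

# Typical use:
#   run_all count_entries sourcefile/ip10_*.txt
```

Unlike a launcher that only backgrounds its children, `wait` keeps the parent alive until every worker is done, which makes it easier to tell when a pass has completed.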
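(2) recover.sh
Every 5 minutes, recover.sh runs “ping -W 1 -c 3 <destination IP> | grep "received" | awk '{print $6}' | sed 's/\%//g'” to obtain the packet loss rate to each IP in recover.txt. If the rate is below 10%, the script removes the IP from recover.txt and appends it back to an IP list file.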
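One recovery pass can be sketched as follows. To keep the sketch testable without a network, the probe is passed in as a command name that prints a loss percentage for the IP it receives; `recover_pass` and the probe convention are assumptions made for this example, not the original script's interface.

```shell
# recover_pass RECOVER_FILE LIST_FILE PROBE (hypothetical helper):
# for every IP in RECOVER_FILE, run "PROBE IP"; if the printed loss is below
# 10, append the IP to LIST_FILE and remove it from RECOVER_FILE.
recover_pass() {
    rfile=$1
    list=$2
    probe=$3
    for ip in $(cat "$rfile"); do
        loss=$("$probe" "$ip")
        if [ -n "$loss" ] && [ "$loss" -lt 10 ]; then
            # Append first; drop the IP from the recovery file only if that worked.
            if echo "$ip" >> "$list"; then
                grep -vxF -- "$ip" "$rfile" > "$rfile.tmp" || true
                mv "$rfile.tmp" "$rfile"
            fi
        fi
    done
}
```

An IP whose probe prints nothing, or a loss of 10 or more, simply stays in the recovery file for the next pass.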
Script source:
(1) pingmonitor.sh
#! /bin/sh
/usr/local/shell/sunweijing/program/pingmonitor_1.sh &
/usr/local/shell/sunweijing/program/pingmonitor_2.sh &
/usr/local/shell/sunweijing/program/pingmonitor_3.sh &
/usr/local/shell/sunweijing/program/pingmonitor_4.sh &
/usr/local/shell/sunweijing/program/pingmonitor_5.sh &
/usr/local/shell/sunweijing/program/pingmonitor_6.sh &
/usr/local/shell/sunweijing/program/pingmonitor_7.sh &
/usr/local/shell/sunweijing/program/pingmonitor_8.sh &
/usr/local/shell/sunweijing/program/pingmonitor_9.sh &
/usr/local/shell/sunweijing/program/pingmonitor_10.sh &
/usr/local/shell/sunweijing/program/pingmonitor_11.sh &
##################################################
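(2) The scripts invoked from pingmonitor.sh are essentially identical. They differ only in which IP list they check: each script handles its own list file.
The source of pingmonitor_1.sh is listed below as an example: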
#! /bin/sh
#
# writer sunweijing
#
# host ping monitor
#
pdate=`date +%D%t%T`
phome=/usr/local/shell/sunweijing
for pip_1 in `cat $phome/sourcefile/ip10_1.txt`
do
    # Packet loss percentage over 2 probes (the "%" sign is stripped).
    plost_1=`ping -W 1 -c 2 $pip_1 | grep "received" | awk '{print $6}' | sed 's/\%//g'`
    # String comparison, so an empty result does not raise a test error.
    if [ "$plost_1" = "100" ]
    then
        # Host unreachable: log it, queue it for recover.sh, drop it from this list.
        echo "$pip_1,$pdate" >> $phome/result/timeout.txt
        echo "$pip_1" >> $phome/result/recover.txt
        # Anchor the pattern so that 10.10.10.1 does not also delete 10.10.10.10.
        sed -i "/^${pip_1}[[:space:]]*\$/d" $phome/sourcefile/ip10_1.txt
        wget --output-document=/dev/null " check the host what its ipaddrss is $pip_1,$pdate"
    else
        # Integer part of the average round-trip time in milliseconds.
        pttl_1=`ping -W 1 -c 3 $pip_1 | grep "rtt" | awk '{print $4}' | awk -F "/" '{print $2}' | awk -F "." '{print $1}'`
        # Skip the comparison when no rtt line was printed.
        if [ -n "$pttl_1" ] && [ "$pttl_1" -gt 30 ]
        then
            wget --output-document=/dev/null " check the host and the network what its ipaddrss is $pip_1,Because the avg ttl is $pttl_1,$pdate"
        fi
    fi
done
############################################################################################
(3) Source of /usr/local/shell/sunweijing/program/recover.sh:
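#! /bin/sh
rdate=`date +%D%t%T`
rhome=/usr/local/shell/sunweijing
for rip in `cat $rhome/result/recover.txt`
do
    rlost=`ping -W 1 -c 3 $rip | grep "received" | awk '{print $6}' | sed 's/\%//g'`
    # Target list: the first ip10_* file whose "wc -l" line matches 1[56789].
    # The "total" summary line of wc is excluded so it is never picked as a file.
    rlist=`wc -l $rhome/sourcefile/ip10_* | grep -v ' total$' | grep '1[56789]' | sed -n '1p' | awk '{print $2}'`
    # Recover only when the loss is known and below 10% and a target list was
    # found; otherwise the IP stays in recover.txt for the next run.
    if [ -n "$rlost" ] && [ "$rlost" -lt 10 ] && [ -n "$rlist" ]
    then
        echo "$rip" >> "$rlist"
        # Anchor the pattern so that only the exact IP is removed.
        sed -i "/^${rip}[[:space:]]*\$/d" $rhome/result/recover.txt
        wget --output-document=/dev/null " $rip is recovered,$rdate"
    fi
done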
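The target list in recover.sh is chosen by matching the `wc -l` output against `1[56789]`, which finds nothing when no list happens to have such a line count. As a hedged alternative, and not the author's method, a recovered IP could go to whichever list is currently shortest. `pick_shortest_list` is a hypothetical helper:

```shell
# pick_shortest_list FILE... (hypothetical helper): print the name of the
# file with the fewest lines. File names are assumed to contain no newlines.
pick_shortest_list() {
    for f in "$@"; do
        printf '%s %s\n' "$(wc -l < "$f" | tr -d ' ')" "$f"
    done | sort -n | head -n 1 | cut -d' ' -f2-
}

# Typical use:
#   rlist=$(pick_shortest_list sourcefile/ip10_*.txt)
```

This keeps the lists roughly balanced, so the parallel workers finish at about the same time.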
###########################################################################################
The source of the auxiliary scripts follows.
(4) langood.sh
#! /bin/sh
#writer sunweijing
#
#monitor wangood
#
sendmessage ()
{
wget --output-document=/dev/null ""
}
################test lan##########################
lan=`ping -c 3 -W 3 10.10.10.239 | grep "received" | awk '{print $6}' | awk -F "%" '{print $1}'`
mess2="network to LAN gets right"
phonenu=13552250592
cudate=`date`
if [ -n "$lan" ] && [ "$lan" -lt 10 ]
then
sed -i '/langood.sh/d' /var/spool/cron/root
echo "*/2 * * * * /usr/local/shell/sunweijing/program/monitorwan.sh" >> /var/spool/cron/root
sendmessage $phonenu "$mess2 $cudate"
fi
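langood.sh swaps one cron entry for another by editing the spool file in place. The same swap can be sketched as a function that takes the file path as an argument and does not add the new line twice. `swap_cron_job` is a hypothetical helper; on a live system, installing the edited file with the `crontab` command is usually safer than writing to the spool file directly.

```shell
# swap_cron_job FILE OLD NEWLINE (hypothetical helper): remove every line of
# FILE that contains the fixed string OLD, then add NEWLINE unless an
# identical line is already present.
swap_cron_job() {
    cronfile=$1
    old=$2
    newline=$3
    # grep exits 1 when nothing is left, which is fine here.
    grep -vF -- "$old" "$cronfile" > "$cronfile.tmp" || true
    grep -qxF -- "$newline" "$cronfile.tmp" || echo "$newline" >> "$cronfile.tmp"
    mv "$cronfile.tmp" "$cronfile"
}

# Typical use on a working copy of the crontab:
#   swap_cron_job ./root.cron 'langood.sh' \
#       '*/2 * * * * /usr/local/shell/sunweijing/program/monitorwan.sh'
```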
###############################################################################################
(5) monitorwan.sh
#! /bin/sh
#writer sunweijing
#
#monitor outline
#
sendmessage ()
{
wget --output-document=/dev/null ""
}
# Change the monitorwan.sh cron entry from every 2 minutes to every 30 minutes.
cron1 ()
{
sed -i '/monitorwan\.sh/s|^\*/2 |*/30 |' /var/spool/cron/root
}
################test lan##########################
lan=`ping -c 3 -W 3 10.10.10.239 | grep "received" | awk '{print $6}' | awk -F "%" '{print $1}'`
mess2="network to LAN is connectless"
phonenu=13552250592
messa="network to WAN is connectless"
cudate=`date`
if [ "$lan" = "100" ]
then
sendmessage $phonenu "$mess2 $cudate"
else
#################test wan#########################
ns1=`nslookup 202.102.227.68 | grep "name" | awk '{print $2}' | awk '{print $1}'`
ns2=`nslookup 202.99.96.68 | grep "name" | awk '{print $2}' | awk '{print $1}'`
ns3=`nslookup 202.106.196.115 | grep "name" | awk '{print $2}' | awk '{print $1}'`
ns4=`nslookup 8.8.8.8 | grep "name" | awk '{print $2}' | awk '{print $1}'`
nsresult=`echo $ns1$ns2$ns3$ns4`
if [ -z "$nsresult" ]
then
sendmessage $phonenu "$messa $cudate"
fi
fi
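The WAN test treats the link as down only when none of the reverse lookups returns a name. A minimal sketch of that decision, with the lookup command passed in so the logic can be exercised offline; `wan_reachable` is a hypothetical name, and the `name = ` match assumes the usual nslookup reverse-lookup output.

```shell
# wan_reachable LOOKUP IP... (hypothetical helper): succeed as soon as
# "LOOKUP IP" prints a "name = " record for any of the given IPs.
wan_reachable() {
    lookup=$1
    shift
    for ip in "$@"; do
        if "$lookup" "$ip" 2>/dev/null | grep -q 'name = '; then
            return 0
        fi
    done
    return 1
}

# Typical use with the real resolver check:
#   wan_reachable nslookup 202.102.227.68 202.99.96.68 202.106.196.115 8.8.8.8 \
#       || echo "network to WAN is connectless"
```

Unlike the script above, which always runs all four lookups, this sketch stops at the first successful one.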
#################################################################################################
(6) wangood.sh
#! /bin/sh
#writer sunweijing
#
#monitor wangood
#
sendmessage ()
{
wget --output-document=/dev/null ""
}
################test lan##########################
lan=`ping -c 3 -W 3 10.10.10.239 | grep "received" | awk '{print $6}' | awk -F "%" '{print $1}'`
mess2="network to LAN is connectless"
phonenu=13552250592
messa="network to WAN is connectless"
cudate=`date`
if [ "$lan" = "100" ]
then
sendmessage $phonenu "$mess2 $cudate"
else
#################test wan#########################
ns1=`nslookup 202.102.227.68 | grep "name" | awk '{print $2}' | awk '{print $1}'`
ns2=`nslookup 202.99.96.68 | grep "name" | awk '{print $2}' | awk '{print $1}'`
ns3=`nslookup 202.106.196.115 | grep "name" | awk '{print $2}' | awk '{print $1}'`
ns4=`nslookup 8.8.8.8 | grep "name" | awk '{print $2}' | awk '{print $1}'`
nsresult=`echo $ns1$ns2$ns3$ns4`
if [ -z "$nsresult" ]
then
sendmessage $phonenu "$messa $cudate"
fi
fi
################################################################################
(7) Relevant directory listings
Top-level directory: # ls
program result sourcefile tmp
Second-level directories: # ls program/
langood.sh pingmonitor_11.sh pingmonitor_3.sh pingmonitor_6.sh pingmonitor_9.sh test.sh
monitorwan.sh pingmonitor_1.sh pingmonitor_4.sh pingmonitor_7.sh pingmonitor.sh t.txt
pingmonitor_10.sh pingmonitor_2.sh pingmonitor_5.sh pingmonitor_8.sh recover.sh wangood.sh
# ls result/*
result/recover.txt result/timeout.txt
# ls sourcefile/
allarp.txt allip.txt ip10_10.txt ip10_1.txt ip10_3.txt ip10_5.txt ip10_7.txt ip10_9.txt
allip10.txt allmac.txt ip10_11.txt ip10_2.txt ip10_4.txt ip10_6.txt ip10_8.txt ip10.txt
# ls tmp/
resuilt resuilt.txt result.txt