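I. Introduction
Keepalived monitors the state of the web servers. If a web server crashes or malfunctions, Keepalived detects this and removes the faulty server from the pool; once the server is working again, Keepalived automatically adds it back. All of this happens without manual intervention; the only manual work is repairing the faulty web server. Keepalived is a service used in cluster management to keep a cluster highly available. Its role is similar to heartbeat: it prevents a single point of failure.
II. Preparation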
Master: CentOS 5.8 (32-bit), IP: 192.168.18.88
Slave: CentOS 5.8 (32-bit), IP: 192.168.18.89
VIP: 192.168.18.90
keepalived-1.2.11.tar.gz
III. Compiling and installing
#tar zxvf keepalived-1.2.11.tar.gz
#cd keepalived-1.2.11
#./configure --prefix=/usr/local/keepalived
#make
#make install
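Note: because keepalived is installed under /usr/local/keepalived, all of its configuration files live under that directory. Without --prefix, ./configure defaults to /usr/local, so every program installed that way shares /usr/local/sbin, which does not suit everyone's working habits.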
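Every later step depends on where files land under the chosen prefix, so a small helper that prints the expected locations may be handy. This is only a sketch: keepalived_paths and report_missing are hypothetical helper names, and the layout is the one this article uses with keepalived 1.2.11; other versions may place files differently.

```shell
#!/bin/sh
# keepalived_paths [PREFIX]
# Hypothetical helper: prints the locations this article relies on for a
# build configured with --prefix=PREFIX (layout as used with keepalived 1.2.11).
keepalived_paths() {
    prefix=${1:-/usr/local/keepalived}
    echo "binary:    $prefix/sbin/keepalived"
    echo "config:    $prefix/etc/keepalived/keepalived.conf"
    echo "sysconfig: $prefix/etc/sysconfig/keepalived"
    echo "init:      $prefix/etc/rc.d/init.d/keepalived"
}

# report_missing [PREFIX]
# Lists any of those files that do not exist (yet) on this machine.
report_missing() {
    keepalived_paths "$1" | while read -r label path; do
        [ -e "$path" ] || echo "missing $label $path"
    done
}

# Example, after "make install":
#   report_missing /usr/local/keepalived
```

If report_missing prints nothing, all four files are where the following sections expect them.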
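IV. Setting up the keepalived service startup script
1. Create the service startup script so that keepalived can be controlled with the service command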
#cp /usr/local/keepalived/etc/rc.d/init.d/keepalived /etc/init.d/keepalived
#chmod +x /etc/init.d/keepalived
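Because keepalived was installed to a non-default path (/usr/local/keepalived), a few paths have to be changed so that keepalived starts correctly. The files to modify are:
(1) Edit /etc/init.d/keepalived.
Find the line ". /etc/sysconfig/keepalived", at around line 15,
and change it to ". /usr/local/keepalived/etc/sysconfig/keepalived" so that it points to the correct file.
Then add the following right below that line (this appends the directory containing the keepalived binary to the PATH environment variable):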
PATH="$PATH:/usr/local/keepalived/sbin"
export PATH
(2) Edit /usr/local/keepalived/etc/sysconfig/keepalived and set the correct startup options:
KEEPALIVED_OPTIONS="-D -f /usr/local/keepalived/etc/keepalived/keepalived.conf"
With these changes the basic keepalived installation is complete. Start it to test:
#service keepalived restart
(3) Do not forget to enable the service at boot:
#chkconfig keepalived on
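The two path edits in steps (1) and (2) can also be scripted, which helps when the same change has to be made on both nodes. The following is a minimal sketch, not part of keepalived: patch_init_script and write_sysconfig are hypothetical helpers. It assumes the init script contains the line ". /etc/sysconfig/keepalived" exactly as described above. Note that write_sysconfig overwrites the whole sysconfig file rather than editing it in place.

```shell
#!/bin/sh
# Both functions are illustrative helpers, not part of keepalived.

# patch_init_script FILE PREFIX
# Points the ". /etc/sysconfig/keepalived" line at PREFIX and adds
# PREFIX/sbin to PATH right below it. All other lines are left unchanged.
patch_init_script() {
    file=$1
    prefix=$2
    tmp=$(mktemp) || return 1
    awk -v p="$prefix" '
        /^[ \t]*\. \/etc\/sysconfig\/keepalived[ \t]*$/ {
            print ". " p "/etc/sysconfig/keepalived"
            print "PATH=\"$PATH:" p "/sbin\""
            print "export PATH"
            next
        }
        { print }
    ' "$file" > "$tmp" && cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return $status
}

# write_sysconfig FILE PREFIX
# Writes KEEPALIVED_OPTIONS so the daemon reads the config under PREFIX.
# This replaces the previous contents of FILE.
write_sysconfig() {
    printf 'KEEPALIVED_OPTIONS="-D -f %s/etc/keepalived/keepalived.conf"\n' "$2" > "$1"
}

# Example (run as root, after backing up the real files):
#   patch_init_script /etc/init.d/keepalived /usr/local/keepalived
#   write_sysconfig /usr/local/keepalived/etc/sysconfig/keepalived /usr/local/keepalived
```

Writing through "cat > file" instead of moving the temporary file keeps the executable bit that chmod +x set on the init script.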
V. Configuring keepalived
1. Master server:
#vim /usr/local/keepalived/etc/keepalived/keepalived.conf
Note: you can delete the existing configuration file or create a new one, then copy in the following settings.
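! Configuration File for keepalived
global_defs {
    router_id NodeA    # node A
}
vrrp_instance VI_1 {
    state MASTER    # this node starts as the master
    interface eth0    # network interface to monitor
    virtual_router_id 51    # must be identical on master and backup
    priority 100    # master and backup use different priorities; the higher value wins, so the master gets the larger one
    advert_int 1    # VRRP multicast advertisement interval, in seconds
    authentication {
        auth_type PASS    # VRRP authentication type; must match on master and backup
        auth_pass 1111    # password
    }
    virtual_ipaddress {
        192.168.18.90    # VRRP HA virtual address (more than one may be listed)
    }
}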
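The backup node's file differs from this one only in router_id, state and priority, so both files can be produced from one template. The sketch below assumes the same interface, virtual_router_id, password and VIP as in this article; vrrp_conf is a hypothetical helper name, not a keepalived tool.

```shell
#!/bin/sh
# vrrp_conf ROUTER_ID STATE PRIORITY
# Hypothetical generator: prints a keepalived.conf with the values used in
# this article, varying only the three per-node settings.
vrrp_conf() {
    cat <<EOF
! Configuration File for keepalived
global_defs {
    router_id $1
}
vrrp_instance VI_1 {
    state $2
    interface eth0
    virtual_router_id 51
    priority $3
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        192.168.18.90
    }
}
EOF
}

# Example:
#   vrrp_conf NodeA MASTER 100 > /usr/local/keepalived/etc/keepalived/keepalived.conf
#   vrrp_conf NodeB BACKUP 90    # on the backup node
```

Generating both files from one function makes it harder for the shared values to drift apart between the two nodes.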
2. Slave server:
#vim /usr/local/keepalived/etc/keepalived/keepalived.conf
Note: you can delete the existing configuration file or create a new one, then copy in the following settings.
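! Configuration File for keepalived
global_defs {
    router_id NodeB    # node B
}
vrrp_instance VI_1 {
    state BACKUP    # this node starts as the backup
    interface eth0    # network interface to monitor
    virtual_router_id 51    # must be identical on master and backup
    priority 90    # master and backup use different priorities; the higher value wins, so the backup gets the smaller one
    advert_int 1    # VRRP multicast advertisement interval, in seconds
    authentication {
        auth_type PASS    # VRRP authentication type; must match on master and backup
        auth_pass 1111    # password
    }
    virtual_ipaddress {
        192.168.18.90    # VRRP HA virtual address (more than one may be listed)
    }
}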
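Since several values must agree between the two files while the priorities must differ, a quick consistency check before starting the service can catch copy-and-paste mistakes. This is a rough sketch under stated assumptions: conf_value and check_vrrp_pair are hypothetical helpers, each file is assumed to contain a single vrrp_instance, and each keyword is assumed to appear as the first word on its line.

```shell
#!/bin/sh
# conf_value FILE KEY
# Prints the word that follows KEY on the first line whose first word is KEY.
conf_value() {
    awk -v key="$2" '$1 == key { print $2; exit }' "$1"
}

# check_vrrp_pair MASTER_CONF BACKUP_CONF
# Prints "ok" and returns 0 when the shared settings match and the master
# has the higher priority; otherwise prints the problem and returns 1.
check_vrrp_pair() {
    master=$1
    backup=$2
    for key in virtual_router_id auth_type auth_pass; do
        if [ "$(conf_value "$master" "$key")" != "$(conf_value "$backup" "$key")" ]; then
            echo "mismatch: $key"
            return 1
        fi
    done
    if [ "$(conf_value "$master" priority)" -le "$(conf_value "$backup" priority)" ]; then
        echo "master priority must be higher than backup priority"
        return 1
    fi
    echo "ok"
}

# Example, with both files copied to one machine (file names are made up):
#   check_vrrp_pair master.conf backup.conf
```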
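Note: to keep the service reliable, each node should run a shell script that checks whether its local services are healthy. When a service is found to be down, the script tries to start it; if it cannot be started, the script stops keepalived on that node so that the virtual IP moves to the standby machine automatically. For example, check the local services every 10 seconds and stop the keepalived instance after 3 consecutive failed checks. Conversely, if the local services are healthy but keepalived is not running (after a fault has been repaired), the script starts keepalived so that the node recovers.
The script is named check_service.sh. Add it to the boot script on each node:
#vim /etc/rc.local
or run it manually to test.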
#!/bin/sh
# date: 09/17/2015
# NOTE: the wget commands in the source have no URL argument. The three values
# below are placeholders (assumptions, not from the original); set them to the
# local addresses of your own services. The MySQL check is also done with
# wget, as in the source; the port used here is a guess.
http_url="http://127.0.0.1/"
tomcat_url="http://127.0.0.1:8080/"
mysql_url="http://127.0.0.1:3306/"
maxfails=3    # maximum number of failed rounds
fails=0       # failed rounds so far
success=0     # consecutive successful rounds
while true
do
    echo "Trying to connect to http"
    /usr/bin/wget --timeout=3 --tries=1 -q -O /dev/null "$http_url"    # probe the local http service
    if [ $? -ne 0 ] ; then
        service httpd status | grep stopped
        echo "http check failed, trying to start it!"
        logger -is "http service fails... try to start httpd."
        service httpd start 2>&1 | logger
        fails=$((fails+1))
        echo "fails:$fails"
        success=0
    else
        echo "http connection succeeded!"
        sleep 2
        echo "Trying to connect to tomcat"
        /usr/bin/wget --timeout=3 --tries=1 -q -O /dev/null "$tomcat_url"
        if [ $? -ne 0 ] ; then
            service tomcat status | grep stopped
            echo "tomcat check failed, trying to start it!"
            logger -is "tomcat service fails... try to start tomcat"
            echo "loading "
            service tomcat start 2>&1 | logger
            fails=$((fails+1))
            echo "fails:$fails"
            success=0
        else
            echo "tomcat connection succeeded!"
            sleep 2
            echo "Trying to connect to the mysql service"
            /usr/bin/wget --timeout=3 --tries=1 -q -O /dev/null "$mysql_url"
            if [ $? -ne 0 ] ; then
                service mysqld status | grep stopped
                echo "mysql check failed, trying to start it!"
                logger -is "mysql service fails... try to start mysql"
                service mysqld start 2>&1 | logger
                fails=$((fails+1))
                echo "fails:$fails"
                success=0
            else
                echo "mysql connection succeeded!"
                fails=0
                success=$((success+1))
                echo "success:$success"
            fi
        fi
    fi
    if [ "$fails" -ge "$maxfails" ] ; then
        echo "Failures reached $maxfails"
        fails=0
        success=0
        echo "One of httpd, tomcat, mysql is faulty; checking keepalived status"
        # Test the pipeline directly so that $? is not overwritten by an echo.
        if service keepalived status | grep running ; then
            logger -is "local service fails $maxfails times ... try to stop keepalived."
            service keepalived stop 2>&1 | logger
            echo "keepalived service stopped."
        fi
    fi
    if [ "$success" -gt "$maxfails" ] ; then
        echo "All services are healthy; checking keepalived status"
        # Start keepalived only if it is currently stopped.
        if service keepalived status | grep stopped ; then
            logger -is "service changes normal, try to start keepalived ."
            service keepalived start 2>&1 | logger
            echo "keepalived service is running!"
        fi
        success=0
    fi
    sleep 25
done
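Because check_service.sh loops forever, the line added to /etc/rc.local has to start it in the background; otherwise the boot script would never return. A minimal sketch of registering it follows. register_at_boot is a hypothetical helper, and the script location in the example is an assumption, since the article does not say where check_service.sh is stored.

```shell
#!/bin/sh
# register_at_boot RC_FILE SCRIPT [LOG]
# Hypothetical helper: appends a line to RC_FILE that starts SCRIPT in the
# background, unless RC_FILE already mentions SCRIPT.
register_at_boot() {
    rc_file=$1
    script=$2
    log=${3:-/dev/null}
    if grep -qF "$script" "$rc_file" 2>/dev/null; then
        return 0    # already registered, nothing to do
    fi
    echo "nohup /bin/sh $script >> $log 2>&1 &" >> "$rc_file"
}

# Example (run as root; the script path is an assumed location):
#   register_at_boot /etc/rc.local /usr/local/keepalived/check_service.sh
```

Running the helper twice leaves a single entry, so it is safe to repeat on both nodes.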