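After installing the pacemaker RPM package, the service failed to start. The cause turned out to be related to how dynamic libraries are loaded; details follow.
Problem
I built RPM packages for pacemaker 1.1.15 and installed them on another machine, where pacemaker then failed to start.
[root@srdsdevapp73 ~]# service pacemaker start
Starting Pacemaker Cluster Manager [FAILED]
Environment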
- CentOS 6.3 64bit
Cause
strace shows that pacemaker fails to start because the library libcoroipcc.so.4 cannot be loaded.
[root@srdsdevapp73 ~]# strace -f service pacemaker start
...
[pid 19960] writev(2, [{"pacemakerd", 10}, {": ", 2}, {"error while loading shared libra"..., 36}, {": ", 2}, {"libcoroipcc.so.4", 16}, {": ", 2}, {"cannot open shared object file", 30}, {": ...
Checking pacemakerd with ldd then shows that three libraries in total cannot be found (libcoroipcc.so.4, libcfg.so.4, and libconfdb.so.4).
[root@srdsdevapp73 ~]# ldd /usr/sbin/pacemakerd
    linux-vdso.so.1 => (0x00007fffc4c9f000)
    libcrmcluster.so.4 => /usr/lib/libcrmcluster.so.4 (0x0000003cbac00000)
    libstonithd.so.2 => /usr/lib/libstonithd.so.2 (0x0000003cba400000)
    libcrmcommon.so.3 => /usr/lib/libcrmcommon.so.3 (0x0000003cb4c00000)
    libm.so.6 => /lib64/libm.so.6 (0x0000003cb3c00000)
    libcpg.so.4 => /usr/lib64/libcpg.so.4 (0x00007f3f72199000)
    libcfg.so.6 => /usr/lib64/libcfg.so.6 (0x00007f3f71f95000)
    libcmap.so.4 => /usr/lib64/libcmap.so.4 (0x00007f3f71d8f000)
    libquorum.so.5 => /usr/lib64/libquorum.so.5 (0x00007f3f71b8b000)
    libgnutls.so.26 => /usr/lib64/libgnutls.so.26 (0x0000003cb8800000)
    libcorosync_common.so.4 => /usr/lib64/libcorosync_common.so.4 (0x00007f3f71988000)
    libplumb.so.2 => /usr/lib64/libplumb.so.2 (0x00007f3f71754000)
    libpils.so.2 => /usr/lib64/libpils.so.2 (0x00007f3f7154b000)
    libqb.so.0 => /usr/lib64/libqb.so.0 (0x00007f3f712e6000)
    libpthread.so.0 => /lib64/libpthread.so.0 (0x0000003cb2c00000)
    libbz2.so.1 => /lib64/libbz2.so.1 (0x0000003cb7000000)
    libxslt.so.1 => /usr/lib64/libxslt.so.1 (0x0000003cb4800000)
    libxml2.so.2 => /usr/lib64/libxml2.so.2 (0x0000003cb6000000)
    libc.so.6 => /lib64/libc.so.6 (0x0000003cb2400000)
    libuuid.so.1 => /lib64/libuuid.so.1 (0x0000003cb5000000)
    libpam.so.0 => /lib64/libpam.so.0 (0x0000003cb6c00000)
    librt.so.1 => /lib64/librt.so.1 (0x0000003cb3000000)
    libdl.so.2 => /lib64/libdl.so.2 (0x0000003cb2800000)
    libglib-2.0.so.0 => /lib64/libglib-2.0.so.0 (0x0000003cb3800000)
    libltdl.so.7 => /usr/lib64/libltdl.so.7 (0x0000003cb8400000)
    libcoroipcc.so.4 => not found
    libcfg.so.4 => not found
    libconfdb.so.4 => not found
    libtasn1.so.3 => /usr/lib64/libtasn1.so.3 (0x0000003cb7800000)
    libz.so.1 => /lib64/libz.so.1 (0x0000003cb3400000)
    libgcrypt.so.11 => /lib64/libgcrypt.so.11 (0x0000003cb7400000)
    /lib64/ld-linux-x86-64.so.2 (0x0000003cb2000000)
    libaudit.so.1 => /lib64/libaudit.so.1 (0x0000003cb6400000)
    libcrypt.so.1 => /lib64/libcrypt.so.1 (0x0000003cb5c00000)
    libgpg-error.so.0 => /lib64/libgpg-error.so.0 (0x0000003cb6800000)
    libfreebl3.so => /lib64/libfreebl3.so (0x0000003cb5800000)
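With an ldd listing this long, the interesting lines are easy to miss. The sketch below shows one way to filter them. The function names `missing_libs` and `suspicious_libs` are hypothetical helpers, not part of any tool, and the assumption that a 64-bit binary should not resolve anything from /lib or /usr/lib holds for CentOS-style layouts but not for every distribution.

```shell
#!/bin/sh
# Hypothetical helpers; both read ldd output on stdin.

# Print the names of libraries that ldd reports as "not found".
missing_libs() {
    awk '/=> not found/ { print $1 }'
}

# Print libraries resolved from /lib or /usr/lib (but not /lib64 or
# /usr/lib64). Assumption: on a CentOS x86_64 system, 64-bit packages
# install into the lib64 directories, so such entries deserve a second look.
suspicious_libs() {
    awk '$2 == "=>" && $3 ~ /^\/(usr\/)?lib\// { print $1, $3 }'
}

# Usage (binary path taken from the article):
#   ldd /usr/sbin/pacemakerd | missing_libs
#   ldd /usr/sbin/pacemakerd | suspicious_libs
```

Applied to the listing above, the first filter would print the three missing libraries and the second would print the three entries under /usr/lib.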
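The entry "/usr/lib/libcrmcluster.so.4" above looks odd. On inspection, that file turned out to be the wrong one: it was left over from a previously installed version (I no longer remember how it was installed). The correct library location is "/usr/lib64/libcrmcluster.so.4". After deleting the old pacemaker libraries, everything works. The three "not found" libraries also disappear from the ldd output after the removal, which suggests they were dependencies of the stale copies rather than of the new build.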
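Before deleting files by hand, it can help to check whether any installed package still owns them. The following is a minimal sketch: `list_stale_libs` and `report_owners` are hypothetical helper names, the glob patterns are taken from the rm commands in this article, and `rpm -qf` is the standard query for the package that owns a file.

```shell
#!/bin/sh
# Hypothetical helpers for inspecting leftover Pacemaker libraries.

# List candidate stale libraries in a directory (default: /usr/lib),
# using the same glob patterns as the rm commands in the article.
list_stale_libs() {
    dir=${1:-/usr/lib}
    for f in "$dir"/libcrm* "$dir"/libstonithd.*; do
        # An unmatched glob stays literal, so skip paths that do not exist.
        if [ -e "$f" ]; then
            printf '%s\n' "$f"
        fi
    done
    return 0
}

# For each candidate, ask rpm which package (if any) owns the file.
report_owners() {
    list_stale_libs "$1" | while IFS= read -r f; do
        if command -v rpm >/dev/null 2>&1; then
            rpm -qf "$f" 2>&1
        else
            printf '%s: rpm not available\n' "$f"
        fi
    done
}

# Usage:
#   report_owners /usr/lib
```

If rpm reports that a file is not owned by any package, it was probably installed outside the package manager (for example by a manual `make install`), and removing it by hand as done here is reasonable. If a package does own it, removing that package is the cleaner route.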
[root@srdsdevapp73 ~]# rm -f /usr/lib/libcrm*
[root@srdsdevapp73 ~]# rm -f /usr/lib/libstonithd.*
[root@srdsdevapp73 ~]# ldd /usr/sbin/pacemakerd
    linux-vdso.so.1 => (0x00007fff9a3ff000)
    libcrmcluster.so.4 => /usr/lib64/libcrmcluster.so.4 (0x00007f849a1fc000)
    libstonithd.so.2 => /usr/lib64/libstonithd.so.2 (0x00007f8499fea000)
    libcrmcommon.so.3 => /usr/lib64/libcrmcommon.so.3 (0x00007f8499d93000)
    libm.so.6 => /lib64/libm.so.6 (0x0000003cb3c00000)
    libcpg.so.4 => /usr/lib64/libcpg.so.4 (0x00007f8499b8c000)
    libcfg.so.6 => /usr/lib64/libcfg.so.6 (0x00007f8499988000)
    libcmap.so.4 => /usr/lib64/libcmap.so.4 (0x00007f8499782000)
    libquorum.so.5 => /usr/lib64/libquorum.so.5 (0x00007f849957e000)
    libgnutls.so.26 => /usr/lib64/libgnutls.so.26 (0x0000003cb8800000)
    libcorosync_common.so.4 => /usr/lib64/libcorosync_common.so.4 (0x00007f849937b000)
    libplumb.so.2 => /usr/lib64/libplumb.so.2 (0x00007f8499147000)
    libpils.so.2 => /usr/lib64/libpils.so.2 (0x00007f8498f3e000)
    libqb.so.0 => /usr/lib64/libqb.so.0 (0x00007f8498cd9000)
    libpthread.so.0 => /lib64/libpthread.so.0 (0x0000003cb2c00000)
    libbz2.so.1 => /lib64/libbz2.so.1 (0x0000003cb7000000)
    libxslt.so.1 => /usr/lib64/libxslt.so.1 (0x0000003cb4800000)
    libxml2.so.2 => /usr/lib64/libxml2.so.2 (0x0000003cb6000000)
    libc.so.6 => /lib64/libc.so.6 (0x0000003cb2400000)
    libuuid.so.1 => /lib64/libuuid.so.1 (0x0000003cb5000000)
    libpam.so.0 => /lib64/libpam.so.0 (0x0000003cb6c00000)
    librt.so.1 => /lib64/librt.so.1 (0x0000003cb3000000)
    libdl.so.2 => /lib64/libdl.so.2 (0x0000003cb2800000)
    libglib-2.0.so.0 => /lib64/libglib-2.0.so.0 (0x0000003cb3800000)
    libltdl.so.7 => /usr/lib64/libltdl.so.7 (0x0000003cb8400000)
    libtasn1.so.3 => /usr/lib64/libtasn1.so.3 (0x0000003cb7800000)
    libz.so.1 => /lib64/libz.so.1 (0x0000003cb3400000)
    libgcrypt.so.11 => /lib64/libgcrypt.so.11 (0x0000003cb7400000)
    /lib64/ld-linux-x86-64.so.2 (0x0000003cb2000000)
    libaudit.so.1 => /lib64/libaudit.so.1 (0x0000003cb6400000)
    libcrypt.so.1 => /lib64/libcrypt.so.1 (0x0000003cb5c00000)
    libgpg-error.so.0 => /lib64/libgpg-error.so.0 (0x0000003cb6800000)
    libfreebl3.so => /lib64/libfreebl3.so (0x0000003cb5800000)
[root@srdsdevapp73 ~]# service pacemaker start
Starting Pacemaker Cluster Manager [ OK ]
Summary
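On Linux, the default paths for locating shared libraries (that is, directories not configured in /etc/ld.so.conf; note that at load time the dynamic linker consults the entries in /etc/ld.so.cache first) are searched in the order below. If a library file with the same name sits earlier in that order, it can shadow the intended one and cause loading to fail.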
- /lib
- /usr/lib
- /lib64
- /usr/lib64
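One way to spot this kind of shadowing ahead of time is to look for library names that the linker cache lists in more than one directory. The sketch below parses the output of `ldconfig -p`, which prints the contents of /etc/ld.so.cache. The function name `dup_libs` is hypothetical, and the parsing assumes the usual "name (flags) => path" line format.

```shell
#!/bin/sh
# Hypothetical helper: reads `ldconfig -p` output on stdin and prints every
# "name (flags)" entry that is cached with more than one path.
dup_libs() {
    awk -F ' => ' 'NF == 2 {
        key = $1
        sub(/^[ \t]+/, "", key)      # drop the leading tab ldconfig prints
        count[key]++
        paths[key] = paths[key] " " $2
    }
    END {
        for (k in count)
            if (count[k] > 1) print k ":" paths[k]
    }'
}

# Usage:
#   ldconfig -p | dup_libs
#   ldconfig -p | dup_libs | grep libcrm
```

A duplicate is not always a problem, but an entry such as libcrmcluster.so.4 appearing under both /usr/lib and /usr/lib64 with the same flags would have pointed straight at the stale files in this case.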