[Pacemaker] Pacemaker installed to custom location
Andrew Beekhof
andrew at beekhof.net
Fri Apr 26 00:29:12 UTC 2013
On 26/04/2013, at 12:12 AM, James Masson <james.masson at opencredo.com> wrote:
>
> Hi list,
>
> I'm trying to build and run pacemaker from a custom location.
>
>
> #################
> # cluster-glue
> tar xf pacemaker/cluster-glue-1.0.11+.tar.gz
> (
> cd Reusable-Cluster-Components-glue--8347e8c9b94f
> ./autogen.sh
> ./configure --prefix=${BOSH_INSTALL_TARGET} --enable-fatal-warnings=no --with-daemon-group=vcap --with-daemon-user=vcap --with-ocf-root=${BOSH_INSTALL_TARGET}/usr/lib/ocf/resource.d/pacemaker
> make
> make install
> )
>
>
> # libqb
> tar xf pacemaker/libqb-0.14.4.tar.gz
> (
> cd libqb-0.14.4
> ./autogen.sh
> ./configure --prefix=${BOSH_INSTALL_TARGET}
> make
> make install
> )
>
> # corosync
> tar xzf pacemaker/corosync-2.3.0.tar.gz
> (
> cd corosync-2.3.0
> export PKG_CONFIG_PATH="${BOSH_INSTALL_TARGET}/lib/pkgconfig/"
> export LDFLAGS="-L${BOSH_INSTALL_TARGET}/lib -L${BOSH_INSTALL_TARGET}/lib/heartbeat -L${BOSH_INSTALL_TARGET}/lib/stonith -L${BOSH_INSTALL_TARGET}/lib/pkgconfig"
> export CFLAGS="-I${BOSH_INSTALL_TARGET}/include/heartbeat -I${BOSH_INSTALL_TARGET}/include "
> ./autogen.sh
> # ./configure --prefix=${BOSH_INSTALL_TARGET} --disable-nss --with-socket-dir=/var/vcap/sys/run/cluster-stack --sysconfdir=/var/vcap/jobs/cluster-stack/etc
> ./configure --prefix=${BOSH_INSTALL_TARGET} --disable-nss
> make
> make install
> )
>
> # pacemaker
> tar xf pacemaker/Pacemaker-1.1.9.tar.gz
> (
> cd pacemaker-Pacemaker-1.1.9
> export PKG_CONFIG_PATH="${BOSH_INSTALL_TARGET}/lib/pkgconfig/"
> export LDFLAGS="-L${BOSH_INSTALL_TARGET}/lib -L${BOSH_INSTALL_TARGET}/lib/heartbeat -L${BOSH_INSTALL_TARGET}/lib/stonith -L${BOSH_INSTALL_TARGET}/lib/pkgconfig"
> export CFLAGS="-I${BOSH_INSTALL_TARGET}/include/heartbeat -I${BOSH_INSTALL_TARGET}/include "
> ./autogen.sh
> ./configure --prefix=${BOSH_INSTALL_TARGET} --without-snmp --with-corosync --with-ais --with-cs-quorum --with-ais-prefix=${BOSH_INSTALL_TARGET}
> make
> make install
> )
> ######################
>
> I've tried this with the latest versions, and with recompiling the current Ubuntu versions of these packages - the result is the same.
>
> The packages compile correctly - in this case into the directory /var/vcap/packages/cluster-stack
>
> The contents of that directory are then shipped to other machines to run.
>
> ld.so.conf is updated with "/var/vcap/packages/cluster-stack/lib/"
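After editing ld.so.conf the cache has to be rebuilt with ldconfig, and it may be worth confirming on a target machine that the relocated binaries actually resolve their libraries. A minimal sketch, assuming the tree lives under /var/vcap/packages/cluster-stack as described in the post; the PREFIX variable and the unresolved_libs helper are illustrative names, not part of any package:

```shell
#!/bin/sh
# Assumed install prefix (taken from the post); override via the environment.
PREFIX="${PREFIX:-/var/vcap/packages/cluster-stack}"

# Print the "not found" lines ldd reports for one binary.
# Prints nothing when every shared library resolves (or ldd is unavailable).
unresolved_libs() {
    ldd "$1" 2>/dev/null | grep 'not found' || true
}

# Check pacemakerd and the cib daemon wherever they landed under the prefix.
if [ -d "$PREFIX" ]; then
    find "$PREFIX" -type f \( -name pacemakerd -o -name cib \) |
    while read -r bin; do
        echo "== $bin"
        unresolved_libs "$bin"
    done
fi
```

Any line printed under a binary's heading names a library the dynamic linker cannot find on that machine.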
>
> Corosync starts up fine.
>
> Pacemakerd does not; the result is below.
Try turning up the debug to see why the cib isn't happy:
> Apr 25 13:54:10 [10482] fcde02a2-cc41-4c58-b6d2-b7bb0bada436 pacemakerd: error: pcmk_child_exit: Child process cib exited (pid=10484, rc=100)
> Apr 25 13:54:10 [10482] fcde02a2-cc41-4c58-b6d2-b7bb0bada436 pacemakerd: warning: pcmk_child_exit: Pacemaker child process cib no longer
>
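One way to turn the debug up, sketched under assumptions: PCMK_debug is the environment variable Pacemaker normally picks up from its sysconfig/default file, and with a relocated tree it is probably simplest to export it by hand before starting pacemakerd. The PREFIX value and the sbin/pacemakerd location are guesses based on the --prefix used in the build script.

```shell
#!/bin/sh
# Assumed install prefix (from the post); sbin/pacemakerd is the usual
# autotools location and is an assumption here.
PREFIX="${PREFIX:-/var/vcap/packages/cluster-stack}"
PACEMAKERD="${PREFIX}/sbin/pacemakerd"

# Ask the Pacemaker daemons, including the cib, for debug-level logging.
PCMK_debug=yes
export PCMK_debug

if [ -x "$PACEMAKERD" ]; then
    # Runs in the foreground; stop it with Ctrl-C once the cib has failed.
    "$PACEMAKERD"
else
    echo "pacemakerd not found at $PACEMAKERD" >&2
fi
```

Setting "debug: on" in the logging section of corosync.conf raises the corosync side in the same way.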
> ###################
> Apr 25 13:53:52 [10461] fcde02a2-cc41-4c58-b6d2-b7bb0bada436 corosync notice [MAIN ] Corosync Cluster Engine ('UNKNOWN'): started and ready to provide service.
> Apr 25 13:53:52 [10461] fcde02a2-cc41-4c58-b6d2-b7bb0bada436 corosync info [MAIN ] Corosync built-in features: pie relro bindnow
> Apr 25 13:53:52 [10461] fcde02a2-cc41-4c58-b6d2-b7bb0bada436 corosync notice [TOTEM ] Initializing transport (UDP/IP Unicast).
> Apr 25 13:53:52 [10461] fcde02a2-cc41-4c58-b6d2-b7bb0bada436 corosync notice [TOTEM ] Initializing transmit/receive security (NSS) crypto: none hash: none
> Apr 25 13:53:52 [10461] fcde02a2-cc41-4c58-b6d2-b7bb0bada436 corosync notice [TOTEM ] The network interface [10.0.4.50] is now up.
> Apr 25 13:53:52 [10461] fcde02a2-cc41-4c58-b6d2-b7bb0bada436 corosync notice [SERV ] Service engine loaded: corosync configuration map access [0]
> Apr 25 13:53:52 [10461] fcde02a2-cc41-4c58-b6d2-b7bb0bada436 corosync info [QB ] server name: cmap
> Apr 25 13:53:52 [10461] fcde02a2-cc41-4c58-b6d2-b7bb0bada436 corosync notice [SERV ] Service engine loaded: corosync configuration service [1]
> Apr 25 13:53:52 [10461] fcde02a2-cc41-4c58-b6d2-b7bb0bada436 corosync info [QB ] server name: cfg
> Apr 25 13:53:52 [10461] fcde02a2-cc41-4c58-b6d2-b7bb0bada436 corosync notice [SERV ] Service engine loaded: corosync cluster closed process group service v1.01 [2]
> Apr 25 13:53:52 [10461] fcde02a2-cc41-4c58-b6d2-b7bb0bada436 corosync info [QB ] server name: cpg
> Apr 25 13:53:52 [10461] fcde02a2-cc41-4c58-b6d2-b7bb0bada436 corosync notice [SERV ] Service engine loaded: corosync profile loading service [4]
> Apr 25 13:53:52 [10461] fcde02a2-cc41-4c58-b6d2-b7bb0bada436 corosync notice [SERV ] Service engine loaded: corosync cluster quorum service v0.1 [3]
> Apr 25 13:53:52 [10461] fcde02a2-cc41-4c58-b6d2-b7bb0bada436 corosync info [QB ] server name: quorum
> Apr 25 13:53:52 [10461] fcde02a2-cc41-4c58-b6d2-b7bb0bada436 corosync notice [TOTEM ] adding new UDPU member {10.0.4.50}
> Apr 25 13:53:52 [10461] fcde02a2-cc41-4c58-b6d2-b7bb0bada436 corosync notice [TOTEM ] adding new UDPU member {10.0.4.51}
> Apr 25 13:53:52 [10461] fcde02a2-cc41-4c58-b6d2-b7bb0bada436 corosync notice [TOTEM ] A processor joined or left the membership and a new membership (10.0.4.50:44) was formed.
> Apr 25 13:53:52 [10461] fcde02a2-cc41-4c58-b6d2-b7bb0bada436 corosync notice [TOTEM ] A processor joined or left the membership and a new membership (10.0.4.50:48) was formed.
> Apr 25 13:53:52 [10461] fcde02a2-cc41-4c58-b6d2-b7bb0bada436 corosync notice [MAIN ] Completed service synchronization, ready to provide service.
> Apr 25 13:54:10 [10482] fcde02a2-cc41-4c58-b6d2-b7bb0bada436 pacemakerd: info: read_config: User configured file based logging and explicitly disabled syslog.
> Apr 25 13:54:10 [10482] fcde02a2-cc41-4c58-b6d2-b7bb0bada436 pacemakerd: notice: main: Starting Pacemaker 1.1.9 (Build: 2a917dd): ncurses libqb-logging libqb-ipc lha-fencing nagios corosync-native
> Apr 25 13:54:10 [10482] fcde02a2-cc41-4c58-b6d2-b7bb0bada436 pacemakerd: info: main: Maximum core file size is: 18446744073709551615
> Apr 25 13:54:10 [10482] fcde02a2-cc41-4c58-b6d2-b7bb0bada436 pacemakerd: info: qb_ipcs_us_publish: server name: pacemakerd
> Apr 25 13:54:10 [10482] fcde02a2-cc41-4c58-b6d2-b7bb0bada436 pacemakerd: notice: corosync_node_name: Unable to get node name for nodeid 0
> Apr 25 13:54:10 [10482] fcde02a2-cc41-4c58-b6d2-b7bb0bada436 pacemakerd: notice: get_local_node_name: Defaulting to uname -n for the local corosync node name
> Apr 25 13:54:10 [10482] fcde02a2-cc41-4c58-b6d2-b7bb0bada436 pacemakerd: notice: update_node_processes: 0x61cd50 Node 839122954 now known as fcde02a2-cc41-4c58-b6d2-b7bb0bada436, was:
> Apr 25 13:54:10 [10482] fcde02a2-cc41-4c58-b6d2-b7bb0bada436 pacemakerd: info: start_child: Forked child 10484 for process cib
> Apr 25 13:54:10 [10482] fcde02a2-cc41-4c58-b6d2-b7bb0bada436 pacemakerd: info: start_child: Forked child 10485 for process stonith-ng
> Apr 25 13:54:10 [10482] fcde02a2-cc41-4c58-b6d2-b7bb0bada436 pacemakerd: info: start_child: Forked child 10486 for process lrmd
> Apr 25 13:54:10 [10482] fcde02a2-cc41-4c58-b6d2-b7bb0bada436 pacemakerd: info: start_child: Forked child 10487 for process attrd
> Apr 25 13:54:10 [10482] fcde02a2-cc41-4c58-b6d2-b7bb0bada436 pacemakerd: info: start_child: Forked child 10488 for process pengine
> Apr 25 13:54:10 [10482] fcde02a2-cc41-4c58-b6d2-b7bb0bada436 pacemakerd: info: start_child: Forked child 10489 for process crmd
> Apr 25 13:54:10 [10482] fcde02a2-cc41-4c58-b6d2-b7bb0bada436 pacemakerd: info: main: Starting mainloop
> Apr 25 13:54:10 [10485] fcde02a2-cc41-4c58-b6d2-b7bb0bada436 stonith-ng: info: crm_log_init: Changed active directory to /var/vcap/data/packages/cluster-stack/0.12-dev.1/var/lib/heartbeat/cores/root
> Apr 25 13:54:10 [10485] fcde02a2-cc41-4c58-b6d2-b7bb0bada436 stonith-ng: info: get_cluster_type: Verifying cluster type: 'corosync'
> Apr 25 13:54:10 [10485] fcde02a2-cc41-4c58-b6d2-b7bb0bada436 stonith-ng: info: get_cluster_type: Assuming an active 'corosync' cluster
> Apr 25 13:54:10 [10485] fcde02a2-cc41-4c58-b6d2-b7bb0bada436 stonith-ng: notice: crm_cluster_connect: Connecting to cluster infrastructure: corosync
> Apr 25 13:54:10 [10486] fcde02a2-cc41-4c58-b6d2-b7bb0bada436 lrmd: info: crm_log_init: Changed active directory to /var/vcap/data/packages/cluster-stack/0.12-dev.1/var/lib/heartbeat/cores/root
> Apr 25 13:54:10 [10486] fcde02a2-cc41-4c58-b6d2-b7bb0bada436 lrmd: info: qb_ipcs_us_publish: server name: lrmd
> Apr 25 13:54:10 [10486] fcde02a2-cc41-4c58-b6d2-b7bb0bada436 lrmd: info: main: Starting
> Apr 25 13:54:10 [10482] fcde02a2-cc41-4c58-b6d2-b7bb0bada436 pacemakerd: error: pcmk_child_exit: Child process cib exited (pid=10484, rc=100)
> Apr 25 13:54:10 [10482] fcde02a2-cc41-4c58-b6d2-b7bb0bada436 pacemakerd: warning: pcmk_child_exit: Pacemaker child process cib no longer wishes to be respawned. Shutting ourselves down.
> Apr 25 13:54:10 [10482] fcde02a2-cc41-4c58-b6d2-b7bb0bada436 pacemakerd: notice: pcmk_shutdown_worker: Shuting down Pacemaker
> Apr 25 13:54:10 [10482] fcde02a2-cc41-4c58-b6d2-b7bb0bada436 pacemakerd: notice: stop_child: Stopping crmd: Sent -15 to process 10489
> Apr 25 13:54:10 [10485] fcde02a2-cc41-4c58-b6d2-b7bb0bada436 stonith-ng: info: crm_get_peer: Node <null> now has id: 839122954
> Apr 25 13:54:10 [10485] fcde02a2-cc41-4c58-b6d2-b7bb0bada436 stonith-ng: info: crm_update_peer_proc: init_cpg_connection: Node (null)[839122954] - corosync-cpg is now online
> Apr 25 13:54:10 [10482] fcde02a2-cc41-4c58-b6d2-b7bb0bada436 pacemakerd: notice: pcmk_child_exit: Child process crmd terminated with signal 15 (pid=10489, core=0)
> Apr 25 13:54:10 [10482] fcde02a2-cc41-4c58-b6d2-b7bb0bada436 pacemakerd: notice: stop_child: Stopping pengine: Sent -15 to process 10488
> Apr 25 13:54:10 [10482] fcde02a2-cc41-4c58-b6d2-b7bb0bada436 pacemakerd: info: pcmk_child_exit: Child process pengine exited (pid=10488, rc=0)
> Apr 25 13:54:10 [10482] fcde02a2-cc41-4c58-b6d2-b7bb0bada436 pacemakerd: notice: stop_child: Stopping attrd: Sent -15 to process 10487
> Apr 25 13:54:10 [10482] fcde02a2-cc41-4c58-b6d2-b7bb0bada436 pacemakerd: error: pcmk_child_exit: Child process attrd exited (pid=10487, rc=100)
> Apr 25 13:54:10 [10482] fcde02a2-cc41-4c58-b6d2-b7bb0bada436 pacemakerd: warning: pcmk_child_exit: Pacemaker child process attrd no longer wishes to be respawned. Shutting ourselves down.
> Apr 25 13:54:10 [10482] fcde02a2-cc41-4c58-b6d2-b7bb0bada436 pacemakerd: notice: stop_child: Stopping lrmd: Sent -15 to process 10486
> Apr 25 13:54:10 [10486] fcde02a2-cc41-4c58-b6d2-b7bb0bada436 lrmd: info: crm_signal_dispatch: Invoking handler for signal 15: Terminated
> Apr 25 13:54:10 [10486] fcde02a2-cc41-4c58-b6d2-b7bb0bada436 lrmd: info: lrmd_shutdown: Terminating with 0 clients
> Apr 25 13:54:10 [10486] fcde02a2-cc41-4c58-b6d2-b7bb0bada436 lrmd: info: qb_ipcs_us_withdraw: withdrawing server sockets
> Apr 25 13:54:10 [10486] fcde02a2-cc41-4c58-b6d2-b7bb0bada436 lrmd: info: crm_xml_cleanup: Cleaning up memory from libxml2
> Apr 25 13:54:10 [10482] fcde02a2-cc41-4c58-b6d2-b7bb0bada436 pacemakerd: info: pcmk_child_exit: Child process lrmd exited (pid=10486, rc=0)
> Apr 25 13:54:10 [10482] fcde02a2-cc41-4c58-b6d2-b7bb0bada436 pacemakerd: notice: stop_child: Stopping stonith-ng: Sent -15 to process 10485
> Apr 25 13:54:10 [10485] fcde02a2-cc41-4c58-b6d2-b7bb0bada436 stonith-ng: notice: corosync_node_name: Unable to get node name for nodeid 839122954
> Apr 25 13:54:10 [10485] fcde02a2-cc41-4c58-b6d2-b7bb0bada436 stonith-ng: notice: get_local_node_name: Defaulting to uname -n for the local corosync node name
> Apr 25 13:54:10 [10485] fcde02a2-cc41-4c58-b6d2-b7bb0bada436 stonith-ng: info: init_cs_connection_once: Connection to 'corosync': established
> Apr 25 13:54:10 [10485] fcde02a2-cc41-4c58-b6d2-b7bb0bada436 stonith-ng: info: crm_get_peer: Node 839122954 is now known as fcde02a2-cc41-4c58-b6d2-b7bb0bada436
> Apr 25 13:54:10 [10485] fcde02a2-cc41-4c58-b6d2-b7bb0bada436 stonith-ng: info: crm_get_peer: Node 839122954 has uuid 839122954
> Apr 25 13:54:10 [10485] fcde02a2-cc41-4c58-b6d2-b7bb0bada436 stonith-ng: info: crm_ipc_connect: Could not establish cib_rw connection: Connection refused (111)
> Apr 25 13:54:11 [10485] fcde02a2-cc41-4c58-b6d2-b7bb0bada436 stonith-ng: info: crm_ipc_connect: Could not establish cib_rw connection: Connection refused (111)
> Apr 25 13:54:13 [10485] fcde02a2-cc41-4c58-b6d2-b7bb0bada436 stonith-ng: info: crm_ipc_connect: Could not establish cib_rw connection: Connection refused (111)
> Apr 25 13:54:16 [10485] fcde02a2-cc41-4c58-b6d2-b7bb0bada436 stonith-ng: info: crm_ipc_connect: Could not establish cib_rw connection: Connection refused (111)
> Apr 25 13:54:20 [10485] fcde02a2-cc41-4c58-b6d2-b7bb0bada436 stonith-ng: info: crm_ipc_connect: Could not establish cib_rw connection: Connection refused (111)
> Apr 25 13:54:20 [10485] fcde02a2-cc41-4c58-b6d2-b7bb0bada436 stonith-ng: error: setup_cib: Could not connect to the CIB service: Transport endpoint is not connected (-107)
> Apr 25 13:54:20 [10485] fcde02a2-cc41-4c58-b6d2-b7bb0bada436 stonith-ng: info: qb_ipcs_us_publish: server name: stonith-ng
> Apr 25 13:54:20 [10485] fcde02a2-cc41-4c58-b6d2-b7bb0bada436 stonith-ng: info: main: Starting stonith-ng mainloop
> Apr 25 13:54:20 [10485] fcde02a2-cc41-4c58-b6d2-b7bb0bada436 stonith-ng: info: crm_signal_dispatch: Invoking handler for signal 15: Terminated
> Apr 25 13:54:20 [10485] fcde02a2-cc41-4c58-b6d2-b7bb0bada436 stonith-ng: info: stonith_shutdown: Terminating with 0 clients
> Apr 25 13:54:20 [10485] fcde02a2-cc41-4c58-b6d2-b7bb0bada436 stonith-ng: info: qb_ipcs_us_withdraw: withdrawing server sockets
> Apr 25 13:54:20 [10485] fcde02a2-cc41-4c58-b6d2-b7bb0bada436 stonith-ng: info: main: Done
> Apr 25 13:54:20 [10485] fcde02a2-cc41-4c58-b6d2-b7bb0bada436 stonith-ng: info: crm_xml_cleanup: Cleaning up memory from libxml2
> Apr 25 13:54:20 [10482] fcde02a2-cc41-4c58-b6d2-b7bb0bada436 pacemakerd: info: pcmk_child_exit: Child process stonith-ng exited (pid=10485, rc=0)
> Apr 25 13:54:20 [10482] fcde02a2-cc41-4c58-b6d2-b7bb0bada436 pacemakerd: notice: pcmk_shutdown_worker: Shutdown complete
> Apr 25 13:54:20 [10482] fcde02a2-cc41-4c58-b6d2-b7bb0bada436 pacemakerd: notice: pcmk_shutdown_worker: Attempting to inhibit respawning after fatal error
> Apr 25 13:54:20 [10482] fcde02a2-cc41-4c58-b6d2-b7bb0bada436 pacemakerd: info: crm_xml_cleanup: Cleaning up memory from libxml2
> #################################
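With this much output it can help to reduce the log to just the children that exited abnormally. A small sketch, assuming the log file path from the corosync.conf in the post; failed_children and LOGFILE are illustrative names:

```shell
#!/bin/sh
# Log path taken from the corosync.conf in the post; override as needed.
LOGFILE="${LOGFILE:-/var/vcap/sys/log/cluster-stack/corosync.log}"

# Print "<child> <rc>" for every Pacemaker child that exited non-zero.
# Children killed by a signal ("terminated with signal N") are not matched.
failed_children() {
    sed -n 's/.*pcmk_child_exit:[[:space:]]*Child process \([^ ]*\) exited (pid=[0-9]*, rc=\([0-9]*\)).*/\1 \2/p' "$1" |
        awk '$2 != 0'
}

if [ -r "$LOGFILE" ]; then
    failed_children "$LOGFILE"
fi
```

Run against the log quoted here, this would list cib and attrd, both with rc=100; pengine, lrmd and stonith-ng exited with rc=0 and are filtered out.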
>
> The Corosync config is this:
>
>
> ################
> # Please read the openais.conf.5 manual page
> #
> #
>
> totem {
> version: 2
>
> crypto_cipher: none
> crypto_hash: none
>
> interface {
> ringnumber: 0
> bindnetaddr: 10.0.4.0
> mcastport: 5405
> ttl: 1
> }
> transport: udpu
> }
>
>
>
> nodelist {
>
> node {
> ring0_addr: 10.0.4.50
> }
>
> node {
> ring0_addr: 10.0.4.51
> }
>
>
> }
>
>
> service {
> # Load the Pacemaker Cluster Resource Manager
> name: pacemaker
> ver: 1
> }
>
> logging {
> fileline: off
> to_stderr: yes
> to_logfile: on
> logfile: /var/vcap/sys/log/cluster-stack/corosync.log
> to_syslog: no
> syslog_facility: daemon
> debug: off
> timestamp: on
> logger_subsys {
> subsys: AMF
> debug: off
> tags: enter|leave|trace1|trace2|trace3|trace4|trace6
> }
> }
> ####################################
>
>
> I have a working corosync/pacemaker setup using a similar configuration with a local source install, so I'm pretty sure the problem is due to the relocation of the package tree.
>
> Any ideas what I've missed?
>
> thanks
>
> James M
>
> _______________________________________________
> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org