[Pacemaker] can't get pacemaker started
Dave Jiang
dave.jiang at intel.com
Thu Jul 26 23:31:34 UTC 2012
Hi. I'm following the Clusters from Scratch guide to create a simple
active/passive two-node cluster. I'm using the standard packages that come
with Fedora 17. I have corosync running and linked up. However, I cannot
seem to get Pacemaker to run correctly. Not all of the expected processes
are running:
17286 ? Ss 0:00 /usr/sbin/pacemakerd -f
17288 ? Ss 0:00 \_ /usr/libexec/pacemaker/stonithd
Looking at the log, these entries stand out:
Jul 26 16:26:02 leftnode cib[17378]: warning: retrieveCib: Cluster
configuration not found: /var/lib/heartbeat/crm/cib.xml
Jul 26 16:26:02 leftnode attrd[17381]: notice: crm_cluster_connect:
Connecting to cluster infrastructure: corosync
Jul 26 16:26:02 leftnode cib[17378]: warning: readCibXmlFile: Primary
configuration corrupt or unusable, trying backup...
Jul 26 16:26:02 leftnode crmd[17383]: info: crm_log_init_worker:
Changed active directory to /var/lib/heartbeat/cores/hacluster
Jul 26 16:26:02 leftnode cib[17378]: warning: readCibXmlFile:
Continuing with an empty configuration.
I'm not running heartbeat; should I be? It wasn't mentioned in the guide.
I also noticed that qb_rb_chmod failed, followed by a number of other failures.
Any idea what I'm not setting up correctly?
Jul 26 16:26:02 leftnode crmd[17383]: notice: main: CRM Git Version:
ee0730e13d124c3d58f00016c3376a1de5323cff
Jul 26 16:26:02 leftnode corosync[16373]: [QB ]
qb_rb_chmod:cpg-request-16373-17381-254: Operation not permitted (1)
Jul 26 16:26:02 leftnode cib[17378]: info: validate_with_relaxng:
Creating RNG parser context
Jul 26 16:26:02 leftnode corosync[16373]: [QB ] shm connection
FAILED: Operation not permitted (1)
Jul 26 16:26:02 leftnode corosync[16373]: [QB ] Error in connection
setup (16373-17381-254): Operation not permitted (1)
Jul 26 16:26:02 leftnode attrd[17381]: error: init_cpg_connection:
Could not connect to the Cluster Process Group API: 2
Jul 26 16:26:02 leftnode stonith-ng[17379]: info:
init_ais_connection_once: Connection to 'corosync': established
Jul 26 16:26:02 leftnode attrd[17381]: error: main: HA Signon failed
Jul 26 16:26:02 leftnode stonith-ng[17379]: info: crm_new_peer: Node
leftnode now has id: 16820416
Jul 26 16:26:02 leftnode attrd[17381]: error: main: Aborting startup
Jul 26 16:26:02 leftnode stonith-ng[17379]: info: crm_new_peer: Node
16820416 is now known as leftnode
Jul 26 16:26:02 leftnode pacemakerd[17377]: error: pcmk_child_exit:
Child process attrd exited (pid=17381, rc=100)
Jul 26 16:26:02 leftnode pacemakerd[17377]: warning: pcmk_child_exit:
Pacemaker child process attrd no longer wishes to be respawned. Shutting
ourselves down.
Jul 26 16:26:02 leftnode pacemakerd[17377]: notice:
pcmk_shutdown_worker: Shuting down Pacemaker
Jul 26 16:26:02 leftnode pacemakerd[17377]: notice: stop_child:
Stopping crmd: Sent -15 to process 17383
Jul 26 16:26:02 leftnode crmd[17383]: info: do_cib_control: Could
not connect to the CIB service: connection failed
Jul 26 16:26:02 leftnode cib[17378]: info: startCib: CIB
Initialization completed successfully
Jul 26 16:26:02 leftnode crmd[17383]: warning: do_cib_control: Couldn't
complete CIB registration 1 times... pause and retry
Jul 26 16:26:02 leftnode cib[17378]: info: get_cluster_type: Cluster
type is: 'corosync'
Jul 26 16:26:02 leftnode crmd[17383]: info: crm_signal_dispatch:
Invoking handler for signal 15: Terminated
Jul 26 16:26:02 leftnode cib[17378]: notice: crm_cluster_connect:
Connecting to cluster infrastructure: corosync
Jul 26 16:26:02 leftnode crmd[17383]: notice: crm_shutdown: Requesting
shutdown, upper limit is 1200000ms
Jul 26 16:26:02 leftnode crmd[17383]: warning: do_log: FSA: Input
I_SHUTDOWN from crm_shutdown() received in state S_STARTING
Jul 26 16:26:02 leftnode corosync[16373]: [QB ]
qb_rb_chmod:cpg-request-16373-17378-255: Operation not permitted (1)
Jul 26 16:26:02 leftnode crmd[17383]: notice: do_state_transition:
State transition S_STARTING -> S_STOPPING [ input=I_SHUTDOWN
cause=C_SHUTDOWN origin=crm_shutdown ]
Jul 26 16:26:02 leftnode crmd[17383]: info: get_cluster_type:
Cluster type is: 'corosync'
Jul 26 16:26:02 leftnode corosync[16373]: [QB ] shm connection
FAILED: Operation not permitted (1)
Jul 26 16:26:02 leftnode crmd[17383]: notice:
terminate_ais_connection: Disconnecting from Corosync
Jul 26 16:26:02 leftnode corosync[16373]: [QB ] Error in connection
setup (16373-17378-255): Operation not permitted (1)
Jul 26 16:26:02 leftnode cib[17378]: error: init_cpg_connection:
Could not connect to the Cluster Process Group API: 2
Jul 26 16:26:02 leftnode crmd[17383]: info:
terminate_ais_connection: No CPG connection
Jul 26 16:26:02 leftnode cib[17378]: crit: cib_init: Cannot sign in
to the cluster... terminating
Jul 26 16:26:02 leftnode crmd[17383]: info:
terminate_ais_connection: No Quorum connection
Jul 26 16:26:02 leftnode pacemakerd[17377]: error: pcmk_child_exit:
Child process cib exited (pid=17378, rc=100)
Jul 26 16:26:02 leftnode crmd[17383]: info: do_ha_control:
Disconnected from OpenAIS
Jul 26 16:26:02 leftnode pacemakerd[17377]: warning: pcmk_child_exit:
Pacemaker child process cib no longer wishes to be respawned. Shutting
ourselves down.
Jul 26 16:26:02 leftnode crmd[17383]: info: do_cib_control:
Disconnecting CIB
Jul 26 16:26:02 leftnode crmd[17383]: info: do_exit: Performing
A_EXIT_0 - gracefully exiting the CRMd
Jul 26 16:26:02 leftnode crmd[17383]: info: free_mem: Dropping
I_TERMINATE: [ state=S_STOPPING cause=C_FSA_INTERNAL origin=do_stop ]
Jul 26 16:26:02 leftnode crmd[17383]: info: crm_xml_cleanup:
Cleaning up memory from libxml2
Jul 26 16:26:02 leftnode crmd[17383]: info: do_exit: [crmd] stopped (0)
Jul 26 16:26:02 leftnode pacemakerd[17377]: info: pcmk_child_exit:
Child process crmd exited (pid=17383, rc=0)
Jul 26 16:26:02 leftnode pacemakerd[17377]: notice: stop_child:
Stopping pengine: Sent -15 to process 17382
Jul 26 16:26:02 leftnode pacemakerd[17377]: info: pcmk_child_exit:
Child process pengine exited (pid=17382, rc=0)
Jul 26 16:26:02 leftnode pacemakerd[17377]: notice: stop_child:
Stopping lrmd: Sent -15 to process 17380
Jul 26 16:26:02 leftnode lrmd: [17380]: info: lrmd is shutting down
Jul 26 16:26:02 leftnode pacemakerd[17377]: info: pcmk_child_exit:
Child process lrmd exited (pid=17380, rc=0)
Jul 26 16:26:02 leftnode pacemakerd[17377]: notice: stop_child:
Stopping stonith-ng: Sent -15 to process 17379
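
To pick the failures out of a log like the one above, a small filter can help. The following is only a minimal sketch: it assumes the syslog-style layout shown here ("date host daemon[pid]: level: message"), reads only the first line of each wrapped entry, and skips the corosync "[QB ]" lines because they carry no severity word. The names ENTRY and failures are hypothetical helpers, not part of Pacemaker.

```python
import re

# Matches Pacemaker-style entries such as
# "Jul 26 16:26:02 leftnode attrd[17381]: error: main: HA Signon failed".
# Assumption: every entry starts with "Mon DD HH:MM:SS host daemon[pid]: level:".
ENTRY = re.compile(
    r"^\w{3}\s+\d+\s+[\d:]+\s+\S+\s+(?P<daemon>[\w-]+)\[(?P<pid>\d+)\]:\s+"
    r"(?P<level>\w+):\s+(?P<msg>.*)$"
)


def failures(lines, levels=("error", "crit")):
    """Return (daemon, level, message) tuples for entries at the given levels."""
    hits = []
    for line in lines:
        m = ENTRY.match(line.strip())
        if m and m.group("level") in levels:
            hits.append((m.group("daemon"), m.group("level"), m.group("msg")))
    return hits


if __name__ == "__main__":
    sample = [
        "Jul 26 16:26:02 leftnode attrd[17381]: error: main: HA Signon failed",
        "Jul 26 16:26:02 leftnode cib[17378]: info: startCib: CIB",
        "Jul 26 16:26:02 leftnode attrd[17381]: error: main: Aborting startup",
    ]
    for daemon, level, msg in failures(sample):
        print(f"{daemon}: {level}: {msg}")
```

Run against the full log, this would list the attrd and cib sign-on failures and the pacemakerd child-exit errors in order, which makes it easier to see that both daemons fail at the same point: connecting to corosync's CPG API.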