[Pacemaker] FW: Some resources are restarted after a node joins back cluster after failover.
Andrew Beekhof
andrew at beekhof.net
Thu Oct 11 04:04:54 CEST 2012
On Tue, Oct 2, 2012 at 6:42 PM, Poonam Agarwal
<Poonam.Agarwal at ipaccess.com> wrote:
> Hi,
>
>
>
> I sent this message before, but I do not know how or why it was dropped.
Perhaps you weren't subscribed yet?
>
> I am facing the issue below. Can somebody please help?
There were some problems in this area in the past.
Would you consider upgrading to 1.1.8? It has many bugfixes.
http://www.clusterlabs.org/rpm-next/
>
> -Poonam.
>
>
>
> From: Poonam Agarwal
> Sent: Thursday, September 20, 2012 11:24 AM
> To: 'pacemaker at oss.clusterlabs.org'
> Subject: Some resources are restarted after a node joins back cluster after
> failover.
>
>
>
> Hi,
>
>
>
> I have a two-node HA cluster consisting of oamdev-vm2 and oamdev-vm3.
> oamdev-vm2 was master for the resources ms_drbd_resource_r0 and
> NOSServiceManager0.
>
> Then oamdev-vm2 was taken down using ‘service corosync stop’. Failover
> happened and oamdev-vm3 became master for all resources.
>
> When oamdev-vm2 was brought back using ‘service corosync start’ and
> rejoined the cluster, some resources on the master node oamdev-vm3 were
> restarted.
>
> This is not expected, as it takes my main resource NOSServiceManager0 down
> and increases application downtime.
>
> I am not able to figure out what is causing this resource restart. Could
> any of the ordering or colocation rules be causing this?
>
> Below are the pacemaker and corosync versions and the cluster
> configuration; the corosync logs at the bottom of this email show the
> restarted services.
>
>
>
> I am using Red Hat 5 with the following versions of pacemaker/corosync.
>
> [root@oamdev-vm2 ~]# rpm -qa | grep pacemaker
> pacemaker-libs-1.1.5-1.1.el5
> pacemaker-1.1.5-1.1.el5
> pacemaker-debuginfo-1.0.11-1.2.el5
> drbd-pacemaker-8.3.12-1
> [root@oamdev-vm2 ~]# rpm -qa | grep corosync
> corosync-debuginfo-1.2.7-1.1.el5
> corosync-1.2.7-1.1.el5
> corosynclib-devel-1.2.7-1.1.el5
> corosynclib-1.2.7-1.1.el5
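
One detail in the package list is easy to miss: pacemaker-debuginfo is at 1.0.11 while pacemaker itself is at 1.1.5. A minimal Python sketch for spotting that kind of mismatch, assuming plain name-version-release strings with no architecture suffix, as in the listing above. The helper names split_nvr and versions_by_family are hypothetical and are not part of rpm or Pacemaker.

```python
def split_nvr(package):
    """Split an rpm 'name-version-release' string at its last two hyphens."""
    name, version, release = package.rsplit("-", 2)
    return name, version, release


# Package names copied from the rpm -qa output quoted above.
PACKAGES = [
    "pacemaker-libs-1.1.5-1.1.el5",
    "pacemaker-1.1.5-1.1.el5",
    "pacemaker-debuginfo-1.0.11-1.2.el5",
    "drbd-pacemaker-8.3.12-1",
    "corosync-debuginfo-1.2.7-1.1.el5",
    "corosync-1.2.7-1.1.el5",
    "corosynclib-devel-1.2.7-1.1.el5",
    "corosynclib-1.2.7-1.1.el5",
]


def versions_by_family(packages, families=("pacemaker", "corosync")):
    """Map each family prefix to the set of versions its packages carry."""
    found = {family: set() for family in families}
    for package in packages:
        name, version, _release = split_nvr(package)
        for family in families:
            # drbd-pacemaker is a DRBD package, so it matches neither prefix.
            if name.startswith(family):
                found[family].add(version)
    return found


if __name__ == "__main__":
    for family, versions in versions_by_family(PACKAGES).items():
        status = "consistent" if len(versions) == 1 else "MIXED"
        print(family, sorted(versions), status)
```

A mismatched debuginfo package only affects debugging symbols, so this is a housekeeping note and not an explanation of the restarts.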
>
>
>
> My Pacemaker configuration (crm shell syntax) looks like this:
>
>
>
> node oamdev-vm2
> node oamdev-vm3 \
>     attributes standby="off"
> primitive Apache ocf:heartbeat:apache \
>     params configfile="/etc/httpd/conf/httpd.conf" statusurl="http://localhost/server-status" \
>     op monitor interval="30s" OCF_CHECK_LEVEL="0" \
>     op start interval="0" timeout="40s" \
>     op stop interval="0" timeout="60s"
> primitive Imq ocf:ipaccess:imq \
>     op monitor interval="15s"
> primitive NOSFileSystem ocf:heartbeat:Filesystem \
>     params device="10.255.239.26:/var/lib/ipaccess/export/data" directory="/var/lib/ipaccess/data" fstype="nfs" \
>     op start interval="0" timeout="60s" \
>     op stop interval="0" timeout="360s" \
>     op monitor interval="60s"
> primitive NOSIpAddress10_255_239_23 ocf:ipaccess:ipaddress \
>     params ip="10.255.239.23" cidr_netmask="24" networkType="Internal" \
>     op monitor interval="15s" \
>     meta target-role="Started"
> primitive NOSIpAddress10_255_239_25 ocf:ipaccess:ipaddress \
>     params ip="10.255.239.25" cidr_netmask="24" networkType="Internal" \
>     op monitor interval="30s" \
>     meta target-role="Started"
> primitive NOSServiceManager0 ocf:ipaccess:glassfish \
>     params objectInstanceId="0" databaseUrl="jdbc:mysql://10.255.239.24:3306/nos" \
>     op start interval="0" timeout="300" \
>     op stop interval="0" timeout="300" \
>     op monitor interval="10s" OCF_CHECK_LEVEL="0" \
>     meta target-role="Started" resource-stickiness="1000"
> primitive p_drbd_resource_fs1 ocf:linbit:drbd \
>     params drbd_resource="fs1" \
>     op monitor interval="29s" role="Master" timeout="120s" \
>     op monitor interval="31s" role="Slave" timeout="120s" \
>     op start interval="0" timeout="240s" \
>     op stop interval="0" timeout="100s"
> primitive p_drbd_resource_r0 ocf:linbit:drbd \
>     params drbd_resource="r0" \
>     op monitor interval="29s" role="Master" timeout="120s" \
>     op monitor interval="31s" role="Slave" timeout="120s" \
>     op start interval="0" timeout="240s" \
>     op stop interval="0" timeout="100s"
> primitive p_export_fs1 ocf:heartbeat:exportfs \
>     params clientspec="10.255.239.26/24" directory="/var/lib/ipaccess/export/data" fsid="3211" options="sync,rw,no_root_squash" \
>     op monitor interval="60s"
> primitive p_filesystem_drbd_fs1 ocf:heartbeat:Filesystem \
>     params device="/dev/drbd/by-res/fs1" options="user_xattr,rw,acl" directory="/var/lib/ipaccess/export" fstype="ext3"
> primitive p_filesystem_drbd_r0 ocf:heartbeat:Filesystem \
>     params device="/dev/drbd/by-res/r0" options="user_xattr,rw,acl" directory="/var/lib/mysql" fstype="ext3"
> primitive p_ip_fs1 ocf:ipaccess:ipaddress \
>     params ip="10.255.239.26" cidr_netmask="24" networkType="Internal" \
>     op monitor interval="30s"
> primitive p_ip_mysql ocf:ipaccess:ipaddress \
>     params ip="10.255.239.24" cidr_netmask="24" networkType="Internal" \
>     op monitor interval="30s"
> primitive p_mysql ocf:heartbeat:mysql \
>     params binary="/usr/bin/mysqld_safe" pid="/var/lib/mysql/mysqld.pid" datadir="/var/lib/mysql" \
>     op monitor interval="10" timeout="30" \
>     op start interval="0" timeout="120" \
>     op stop interval="0" timeout="120"
> primitive p_nfsserver_fs1 ocf:ipaccess:nfsserver \
>     params nfs_init_script="/usr/lib/ipaccess/tools/nfs-ha" nfs_shared_infodir="/var/lib/ipaccess/export/nfsinfo" nfs_notify_cmd="/usr/lib/ipaccess/tools/nfs-notify" nfs_ip="10.255.239.26" \
>     op start interval="0" timeout="60s" \
>     op stop interval="0" timeout="60s" \
>     op monitor interval="30s"
> primitive portmap lsb:portmap \
>     op monitor interval="120s"
> group fs1_group p_filesystem_drbd_fs1 p_ip_fs1 p_nfsserver_fs1 p_export_fs1 \
>     meta target-role="Started"
> group mysql_group p_filesystem_drbd_r0 p_ip_mysql p_mysql \
>     meta target-role="Started"
> ms ms_drbd_resource_fs1 p_drbd_resource_fs1 \
>     meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true" target-role="Started"
> ms ms_drbd_resource_r0 p_drbd_resource_r0 \
>     meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true" target-role="Started"
> clone ApacheCluster Apache \
>     meta globally-unique="false" ordered="false" target-role="Started"
> clone ImqCluster Imq \
>     meta globally-unique="false" ordered="true" notify="true" target-role="Started"
> clone NOSFileSystemCluster NOSFileSystem \
>     meta target-role="Started"
> clone portmapCluster portmap \
>     meta target-role="Started"
> location location_NOSServiceManager0_oamdev-vm2 NOSServiceManager0 100: oamdev-vm2
> location location_ms_drbd_resource_fs1_master ms_drbd_resource_fs1 100: oamdev-vm3
> location location_ms_drbd_resource_fs1_nodes ms_drbd_resource_fs1 \
>     rule $id="location_ms_drbd_resource_fs1_nodes-rule" -inf: #uname ne oamdev-vm3 and #uname ne oamdev-vm2
> location location_ms_drbd_resource_r0_master ms_drbd_resource_r0 100: oamdev-vm2
> location location_ms_drbd_resource_r0_nodes ms_drbd_resource_r0 \
>     rule $id="location_ms_drbd_resource_r0_nodes-rule" -inf: #uname ne oamdev-vm2 and #uname ne oamdev-vm3
> colocation colocation_NOSIpAddress10_255_239_23_NOSServiceManager0 inf: NOSIpAddress10_255_239_23 NOSServiceManager0
> colocation colocation_NOSIpAddress10_255_239_25_NOSServiceManager0 inf: NOSIpAddress10_255_239_25 NOSServiceManager0
> colocation colocation_filesystem_drbd_fs1 inf: fs1_group ms_drbd_resource_fs1:Master
> colocation colocation_filesystem_drbd_r0 inf: mysql_group ms_drbd_resource_r0:Master
> order order_NOSFileSystemCluster_after_portmapCluster inf: portmapCluster NOSFileSystemCluster
> order order_NOSServiceManager0_after_NOSFileSystemCluster inf: NOSFileSystemCluster NOSServiceManager0
> order order_NOSServiceManager0_after_p_mysql inf: p_mysql NOSServiceManager0
> order order_filesystem_after_drbd_fs1 inf: ms_drbd_resource_fs1:promote fs1_group:start
> order order_filesystem_after_drbd_r0 inf: ms_drbd_resource_r0:promote mysql_group:start
> order order_fs1_group_after_portmapCluster inf: portmapCluster fs1_group
> property $id="cib-bootstrap-options" \
>     dc-version="1.1.5-1.1.el5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f" \
>     cluster-infrastructure="openais" \
>     expected-quorum-votes="2" \
>     stonith-enabled="false" \
>     no-quorum-policy="ignore" \
>     default-action-timeout="240" \
>     start-failure-is-fatal="false"
> rsc_defaults $id="rsc-options" \
>     failure-timeout="30s" \
>     resource-stickiness="100"
> op_defaults $id="op-options" \
>     on-fail="restart"
>
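
To make the order constraints in this configuration easier to read, here is a minimal Python sketch that lists what is ordered, directly or transitively, after a given resource. The (first, then) pairs are copied from the "order" lines of the configuration; the helper name dependents is hypothetical, and group membership (for example p_mysql inside mysql_group) is deliberately not modelled.

```python
from collections import defaultdict, deque

# (first, then) pairs copied from the "order" constraints in the configuration.
ORDER_CONSTRAINTS = [
    ("portmapCluster", "NOSFileSystemCluster"),
    ("NOSFileSystemCluster", "NOSServiceManager0"),
    ("p_mysql", "NOSServiceManager0"),
    ("ms_drbd_resource_fs1", "fs1_group"),
    ("ms_drbd_resource_r0", "mysql_group"),
    ("portmapCluster", "fs1_group"),
]


def dependents(start, constraints=ORDER_CONSTRAINTS):
    """Return every resource ordered, directly or transitively, after `start`."""
    graph = defaultdict(list)
    for first, then in constraints:
        graph[first].append(then)
    seen = set()
    queue = deque([start])
    # Breadth-first walk along the "then" edges.
    while queue:
        for nxt in graph[queue.popleft()]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return sorted(seen)


if __name__ == "__main__":
    for rsc in ("portmapCluster", "ms_drbd_resource_fs1", "ms_drbd_resource_r0"):
        print(rsc, "->", dependents(rsc))
```

For portmapCluster this yields NOSFileSystemCluster, NOSServiceManager0 and fs1_group, which are the same resources the pengine LogActions entries in the logs report as Restart, while mysql_group (ordered only after ms_drbd_resource_r0) is left alone. That overlap is a hint about where to look, not a confirmed diagnosis.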
>
>
>
>
> Corosync logs from node oamdev-vm3, where the resources were restarted
> (time: Sep 19 12:52:35). The restarts appear in the pengine
> "LogActions: Restart" entries.
>
>
>
> Sep 19 12:52:30 oamdev-vm3.lab.ipaccess.com lrmd: [16766]: info: RA output:
> (NOSServiceManager0:monitor:stderr) 2012/09/19_12:52:30 INFO: monitor:
> running
>
> Sep 19 12:52:31 corosync [pcmk ] notice: pcmk_peer_update: Transitional
> membership event on ring 32: memb=1, new=0, lost=0
>
> Sep 19 12:52:31 corosync [pcmk ] info: pcmk_peer_update: memb:
> oamdev-vm3.lab.ipaccess.com 519044874
>
> Sep 19 12:52:31 corosync [pcmk ] notice: pcmk_peer_update: Stable
> membership event on ring 32: memb=2, new=1, lost=0
>
> Sep 19 12:52:31 corosync [pcmk ] info: update_member: Node
> 351272714/oamdev-vm2.lab.ipaccess.com is now: member
>
> Sep 19 12:52:31 corosync [pcmk ] info: pcmk_peer_update: NEW:
> oamdev-vm2.lab.ipaccess.com 351272714
>
> Sep 19 12:52:31 corosync [pcmk ] info: pcmk_peer_update: MEMB:
> oamdev-vm2.lab.ipaccess.com 351272714
>
> Sep 19 12:52:31 corosync [pcmk ] info: pcmk_peer_update: MEMB:
> oamdev-vm3.lab.ipaccess.com 519044874
>
> Sep 19 12:52:31 corosync [pcmk ] info: send_member_notification: Sending
> membership update 32 to 2 children
>
> Sep 19 12:52:31 oamdev-vm3.lab.ipaccess.com cib: [16765]: notice:
> ais_dispatch_message: Membership 32: quorum acquired
>
> Sep 19 12:52:31 oamdev-vm3.lab.ipaccess.com cib: [16765]: info:
> crm_update_peer: Node oamdev-vm2.lab.ipaccess.com: id=351272714 state=member
> (new) addr=r(0) ip(10.255.239.20) votes=1 born=24 seen=32
> proc=00000000000000000000000000000002
>
> Sep 19 12:52:31 oamdev-vm3.lab.ipaccess.com crmd: [16769]: notice:
> ais_dispatch_message: Membership 32: quorum acquired
>
> Sep 19 12:52:31 corosync [TOTEM ] A processor joined or left the membership
> and a new membership was formed.
>
> Sep 19 12:52:31 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info:
> ais_status_callback: status: oamdev-vm2.lab.ipaccess.com is now member (was
> lost)
>
> Sep 19 12:52:31 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info:
> crm_update_peer: Node oamdev-vm2.lab.ipaccess.com: id=351272714 state=member
> (new) addr=r(0) ip(10.255.239.20) votes=1 born=24 seen=32
> proc=00000000000000000000000000000002
>
> Sep 19 12:52:31 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info:
> crm_update_quorum: Updating quorum status to true (call=214)
>
> Sep 19 12:52:31 corosync [pcmk ] info: update_member: 0x4d29c40 Node
> 351272714 (oamdev-vm2.lab.ipaccess.com) born on: 32
>
> Sep 19 12:52:31 corosync [pcmk ] info: update_member: Node
> oamdev-vm2.lab.ipaccess.com now has process list:
> 00000000000000000000000000111312 (1118994)
>
> Sep 19 12:52:31 corosync [pcmk ] info: send_member_notification: Sending
> membership update 32 to 2 children
>
> Sep 19 12:52:31 oamdev-vm3.lab.ipaccess.com cib: [16765]: info:
> cib_process_request: Operation complete: op cib_delete for section
> //node_state[@uname='oamdev-vm2.lab.ipaccess.com']/lrm
> (origin=local/crmd/210, version=0.35.62): ok (rc=0)
>
> Sep 19 12:52:31 oamdev-vm3.lab.ipaccess.com cib: [16765]: info:
> cib_process_request: Operation complete: op cib_delete for section
> //node_state[@uname='oamdev-vm2.lab.ipaccess.com']/transient_attributes
> (origin=local/crmd/211, version=0.35.63): ok (rc=0)
>
> Sep 19 12:52:31 oamdev-vm3.lab.ipaccess.com cib: [16765]: info:
> cib_process_request: Operation complete: op cib_modify for section nodes
> (origin=local/crmd/212, version=0.35.64): ok (rc=0)
>
> Sep 19 12:52:31 corosync [MAIN ] Completed service synchronization, ready
> to provide service.
>
> Sep 19 12:52:31 oamdev-vm3.lab.ipaccess.com cib: [16765]: info:
> cib_process_request: Operation complete: op cib_modify for section cib
> (origin=local/crmd/214, version=0.35.66): ok (rc=0)
>
> Sep 19 12:52:31 oamdev-vm3.lab.ipaccess.com cib: [16765]: info:
> ais_dispatch_message: Membership 32: quorum retained
>
> Sep 19 12:52:31 oamdev-vm3.lab.ipaccess.com cib: [16765]: info:
> crm_update_peer: Node oamdev-vm2.lab.ipaccess.com: id=351272714 state=member
> addr=r(0) ip(10.255.239.20) votes=1 born=32 seen=32
> proc=00000000000000000000000000111312 (new)
>
> Sep 19 12:52:31 oamdev-vm3.lab.ipaccess.com cib: [16765]: info:
> cib_process_request: Operation complete: op cib_sync_one for section 'all'
> (origin=oamdev-vm2.lab.ipaccess.com/oamdev-vm2.lab.ipaccess.com/(null),
> version=0.35.66): ok (rc=0)
>
> Sep 19 12:52:31 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info:
> crmd_ais_dispatch: Setting expected votes to 2
>
> Sep 19 12:52:31 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info:
> abort_transition_graph: te_update_diff:276 - Triggered transition abort
> (complete=1, tag=lrm_rsc_op, id=p_mysql_monitor_0,
> magic=0:7;21:1:7:7a06b9e0-1d98-4a00-a287-cbc4178d65e4, cib=0.35.62) :
> Resource op removal
>
> Sep 19 12:52:31 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info:
> erase_xpath_callback: Deletion of
> "//node_state[@uname='oamdev-vm2.lab.ipaccess.com']/lrm": ok (rc=0)
>
> Sep 19 12:52:31 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info:
> abort_transition_graph: te_update_diff:163 - Triggered transition abort
> (complete=1, tag=transient_attributes, id=oamdev-vm2.lab.ipaccess.com,
> magic=NA, cib=0.35.63) : Transient attribute: removal
>
> Sep 19 12:52:31 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info:
> erase_xpath_callback: Deletion of
> "//node_state[@uname='oamdev-vm2.lab.ipaccess.com']/transient_attributes":
> ok (rc=0)
>
> Sep 19 12:52:31 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info:
> do_state_transition: State transition S_IDLE -> S_POLICY_ENGINE [
> input=I_PE_CALC cause=C_FSA_INTERNAL origin=abort_transition_graph ]
>
> Sep 19 12:52:31 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info:
> do_state_transition: All 1 cluster nodes are eligible to run resources.
>
> Sep 19 12:52:31 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info:
> do_pe_invoke: Query 217: Requesting the current CIB: S_POLICY_ENGINE
>
> Sep 19 12:52:31 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info:
> do_pe_invoke: Query 218: Requesting the current CIB: S_POLICY_ENGINE
>
> Sep 19 12:52:31 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info:
> ais_dispatch_message: Membership 32: quorum retained
>
> Sep 19 12:52:31 oamdev-vm3.lab.ipaccess.com crmd: [16769]: notice:
> crmd_peer_update: Status update: Client oamdev-vm2.lab.ipaccess.com/crmd now
> has status [online] (DC=true)
>
> Sep 19 12:52:31 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info:
> crm_update_peer: Node oamdev-vm2.lab.ipaccess.com: id=351272714 state=member
> addr=r(0) ip(10.255.239.20) votes=1 born=32 seen=32
> proc=00000000000000000000000000111312 (new)
>
> Sep 19 12:52:31 oamdev-vm3.lab.ipaccess.com cib: [16765]: info:
> cib_process_request: Operation complete: op cib_modify for section
> crm_config (origin=local/crmd/216, version=0.35.67): ok (rc=0)
>
> Sep 19 12:52:31 oamdev-vm3.lab.ipaccess.com cib: [16765]: info:
> cib_process_request: Operation complete: op cib_modify for section nodes
> (origin=local/crmd/220, version=0.35.69): ok (rc=0)
>
> Sep 19 12:52:31 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info:
> crmd_ais_dispatch: Setting expected votes to 2
>
> Sep 19 12:52:31 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info:
> do_state_transition: State transition S_POLICY_ENGINE -> S_INTEGRATION [
> input=I_NODE_JOIN cause=C_FSA_INTERNAL origin=crmd_peer_update ]
>
> Sep 19 12:52:31 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info: update_dc:
> Unset DC oamdev-vm3.lab.ipaccess.com
>
> Sep 19 12:52:31 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info:
> join_make_offer: Making join offers based on membership 32
>
> Sep 19 12:52:31 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info:
> do_dc_join_offer_all: join-5: Waiting on 2 outstanding join acks
>
> Sep 19 12:52:31 oamdev-vm3.lab.ipaccess.com cib: [16765]: info:
> cib_process_request: Operation complete: op cib_modify for section
> crm_config (origin=local/crmd/223, version=0.35.71): ok (rc=0)
>
> Sep 19 12:52:31 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info: update_dc:
> Set DC to oamdev-vm3.lab.ipaccess.com (3.0.5)
>
> Sep 19 12:52:31 oamdev-vm3.lab.ipaccess.com lrmd: [16766]: info: RA output:
> (p_mysql:monitor:stderr) 2012/09/19_12:52:31 INFO: MySQL monitor succeeded
>
>
>
> Sep 19 12:52:34 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info: update_dc:
> Unset DC oamdev-vm3.lab.ipaccess.com
>
> Sep 19 12:52:34 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info:
> do_dc_join_offer_all: A new node joined the cluster
>
> Sep 19 12:52:34 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info:
> do_dc_join_offer_all: join-6: Waiting on 2 outstanding join acks
>
> Sep 19 12:52:34 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info: update_dc:
> Set DC to oamdev-vm3.lab.ipaccess.com (3.0.5)
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info:
> do_state_transition: State transition S_INTEGRATION -> S_FINALIZE_JOIN [
> input=I_INTEGRATED cause=C_FSA_INTERNAL origin=check_join_state ]
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info:
> do_state_transition: All 2 cluster nodes responded to the join offer.
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info:
> do_dc_join_finalize: join-6: Syncing the CIB from
> oamdev-vm3.lab.ipaccess.com to the rest of the cluster
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com cib: [16765]: info:
> cib_process_request: Operation complete: op cib_sync for section 'all'
> (origin=local/crmd/226, version=0.35.71): ok (rc=0)
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com cib: [16765]: info:
> cib_process_request: Operation complete: op cib_modify for section nodes
> (origin=local/crmd/227, version=0.35.72): ok (rc=0)
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com cib: [16765]: info:
> cib_process_request: Operation complete: op cib_modify for section nodes
> (origin=local/crmd/228, version=0.35.73): ok (rc=0)
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com cib: [16765]: info:
> cib_process_request: Operation complete: op cib_delete for section
> //node_state[@uname='oamdev-vm2.lab.ipaccess.com']/transient_attributes
> (origin=oamdev-vm2.lab.ipaccess.com/crmd/6, version=0.35.74): ok (rc=0)
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info:
> do_dc_join_ack: join-6: Updating node state to member for
> oamdev-vm2.lab.ipaccess.com
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com cib: [16765]: info:
> cib_process_request: Operation complete: op cib_delete for section
> //node_state[@uname='oamdev-vm2.lab.ipaccess.com']/lrm
> (origin=local/crmd/229, version=0.35.75): ok (rc=0)
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info:
> erase_xpath_callback: Deletion of
> "//node_state[@uname='oamdev-vm2.lab.ipaccess.com']/lrm": ok (rc=0)
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info:
> do_dc_join_ack: join-6: Updating node state to member for
> oamdev-vm3.lab.ipaccess.com
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info:
> do_state_transition: State transition S_FINALIZE_JOIN -> S_POLICY_ENGINE [
> input=I_FINALIZED cause=C_FSA_INTERNAL origin=check_join_state ]
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info:
> do_state_transition: All 2 cluster nodes are eligible to run resources.
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info:
> do_dc_join_final: Ensuring DC, quorum and node attributes are up-to-date
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info:
> crm_update_quorum: Updating quorum status to true (call=235)
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info:
> abort_transition_graph: do_te_invoke:173 - Triggered transition abort
> (complete=1) : Peer Cancelled
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info:
> do_pe_invoke: Query 236: Requesting the current CIB: S_POLICY_ENGINE
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com attrd: [16767]: info:
> attrd_local_callback: Sending full refresh (origin=crmd)
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com attrd: [16767]: info:
> attrd_trigger_update: Sending flush op to all hosts for: probe_complete
> (true)
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com cib: [16765]: info:
> cib_process_request: Operation complete: op cib_delete for section
> //node_state[@uname='oamdev-vm3.lab.ipaccess.com']/lrm
> (origin=local/crmd/231, version=0.35.77): ok (rc=0)
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info:
> abort_transition_graph: te_update_diff:276 - Triggered transition abort
> (complete=1, tag=lrm_rsc_op, id=p_drbd_resource_r0:1_monitor_0,
> magic=0:7;19:42:7:ddd16f01-0ba8-4299-8998-1e292a1b5b4b, cib=0.35.77) :
> Resource op removal
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info:
> erase_xpath_callback: Deletion of
> "//node_state[@uname='oamdev-vm3.lab.ipaccess.com']/lrm": ok (rc=0)
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info:
> do_pe_invoke: Query 237: Requesting the current CIB: S_POLICY_ENGINE
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info:
> te_update_diff: Detected LRM refresh - 17 resources updated: Skipping all
> resource events
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info:
> abort_transition_graph: te_update_diff:236 - Triggered transition abort
> (complete=1, tag=diff, id=(null), magic=NA, cib=0.35.78) : LRM Refresh
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info:
> do_pe_invoke: Query 238: Requesting the current CIB: S_POLICY_ENGINE
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com cib: [16765]: info:
> cib_process_request: Operation complete: op cib_modify for section nodes
> (origin=local/crmd/233, version=0.35.79): ok (rc=0)
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com cib: [16765]: info:
> cib_process_request: Operation complete: op cib_modify for section cib
> (origin=local/crmd/235, version=0.35.81): ok (rc=0)
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info:
> do_pe_invoke_callback: Invoking the PE: query=238,
> ref=pe_calc-dc-1348055555-190, seq=32, quorate=1
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> unpack_config: On loss of CCM Quorum: Ignore
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> clone_print: Master/Slave Set: ms_drbd_resource_r0 [p_drbd_resource_r0]
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> short_print: Masters: [ oamdev-vm3.lab.ipaccess.com ]
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> short_print: Stopped: [ p_drbd_resource_r0:0 ]
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> group_print: Resource Group: mysql_group
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> native_print: p_filesystem_drbd_r0 (ocf::heartbeat:Filesystem):
> Started oamdev-vm3.lab.ipaccess.com
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> native_print: p_ip_mysql (ocf::ipaccess:ipaddress): Started
> oamdev-vm3.lab.ipaccess.com
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> native_print: p_mysql (ocf::heartbeat:mysql): Started
> oamdev-vm3.lab.ipaccess.com
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> clone_print: Master/Slave Set: ms_drbd_resource_fs1 [p_drbd_resource_fs1]
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> short_print: Masters: [ oamdev-vm3.lab.ipaccess.com ]
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> short_print: Stopped: [ p_drbd_resource_fs1:1 ]
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> clone_print: Clone Set: portmapCluster [portmap]
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> short_print: Started: [ oamdev-vm3.lab.ipaccess.com ]
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> short_print: Stopped: [ portmap:1 ]
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> group_print: Resource Group: fs1_group
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> native_print: p_filesystem_drbd_fs1 (ocf::heartbeat:Filesystem):
> Started oamdev-vm3.lab.ipaccess.com
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> native_print: p_ip_fs1 (ocf::ipaccess:ipaddress): Started
> oamdev-vm3.lab.ipaccess.com
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> native_print: p_nfsserver_fs1 (ocf::ipaccess:nfsserver):
> Started oamdev-vm3.lab.ipaccess.com
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> native_print: p_export_fs1 (ocf::heartbeat:exportfs): Started
> oamdev-vm3.lab.ipaccess.com
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> clone_print: Clone Set: NOSFileSystemCluster [NOSFileSystem]
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> short_print: Started: [ oamdev-vm3.lab.ipaccess.com ]
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> short_print: Stopped: [ NOSFileSystem:1 ]
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> native_print: NOSIpAddress10_255_239_23 (ocf::ipaccess:ipaddress):
> Started oamdev-vm3.lab.ipaccess.com
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> native_print: NOSServiceManager0 (ocf::ipaccess:glassfish): Started
> oamdev-vm3.lab.ipaccess.com
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> clone_print: Clone Set: ImqCluster [Imq]
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> short_print: Started: [ oamdev-vm3.lab.ipaccess.com ]
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> short_print: Stopped: [ Imq:0 ]
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> clone_print: Clone Set: ApacheCluster [Apache]
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> short_print: Started: [ oamdev-vm3.lab.ipaccess.com ]
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> short_print: Stopped: [ Apache:0 ]
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> native_print: NOSIpAddress10_255_239_25 (ocf::ipaccess:ipaddress):
> Started oamdev-vm3.lab.ipaccess.com
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> RecurringOp: Start recurring monitor (31s) for p_drbd_resource_r0:0 on
> oamdev-vm2.lab.ipaccess.com
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> RecurringOp: Start recurring monitor (31s) for p_drbd_resource_r0:0 on
> oamdev-vm2.lab.ipaccess.com
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> RecurringOp: Start recurring monitor (31s) for p_drbd_resource_fs1:1 on
> oamdev-vm2.lab.ipaccess.com
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> RecurringOp: Start recurring monitor (31s) for p_drbd_resource_fs1:1 on
> oamdev-vm2.lab.ipaccess.com
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> RecurringOp: Start recurring monitor (120s) for portmap:1 on
> oamdev-vm2.lab.ipaccess.com
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> RecurringOp: Start recurring monitor (60s) for NOSFileSystem:1 on
> oamdev-vm2.lab.ipaccess.com
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> RecurringOp: Start recurring monitor (15s) for Imq:0 on
> oamdev-vm2.lab.ipaccess.com
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> RecurringOp: Start recurring monitor (30s) for Apache:0 on
> oamdev-vm2.lab.ipaccess.com
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> LogActions: Start p_drbd_resource_r0:0 (oamdev-vm2.lab.ipaccess.com)
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> LogActions: Leave p_drbd_resource_r0:1 (Master
> oamdev-vm3.lab.ipaccess.com)
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> LogActions: Leave p_filesystem_drbd_r0 (Started
> oamdev-vm3.lab.ipaccess.com)
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> LogActions: Leave p_ip_mysql (Started oamdev-vm3.lab.ipaccess.com)
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> LogActions: Leave p_mysql (Started oamdev-vm3.lab.ipaccess.com)
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> LogActions: Leave p_drbd_resource_fs1:0 (Master
> oamdev-vm3.lab.ipaccess.com)
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> LogActions: Start p_drbd_resource_fs1:1 (oamdev-vm2.lab.ipaccess.com)
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> LogActions: Leave portmap:0 (Started oamdev-vm3.lab.ipaccess.com)
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> LogActions: Start portmap:1 (oamdev-vm2.lab.ipaccess.com)
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> LogActions: Restart p_filesystem_drbd_fs1 (Started
> oamdev-vm3.lab.ipaccess.com)
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> LogActions: Restart p_ip_fs1 (Started oamdev-vm3.lab.ipaccess.com)
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> LogActions: Restart p_nfsserver_fs1 (Started
> oamdev-vm3.lab.ipaccess.com)
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> LogActions: Restart p_export_fs1 (Started oamdev-vm3.lab.ipaccess.com)
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> LogActions: Restart NOSFileSystem:0 (Started
> oamdev-vm3.lab.ipaccess.com)
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> LogActions: Start NOSFileSystem:1 (oamdev-vm2.lab.ipaccess.com)
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> LogActions: Leave NOSIpAddress10_255_239_23 (Started
> oamdev-vm3.lab.ipaccess.com)
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> LogActions: Restart NOSServiceManager0 (Started
> oamdev-vm3.lab.ipaccess.com)
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> LogActions: Start Imq:0 (oamdev-vm2.lab.ipaccess.com)
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> LogActions: Leave Imq:1 (Started oamdev-vm3.lab.ipaccess.com)
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> LogActions: Start Apache:0 (oamdev-vm2.lab.ipaccess.com)
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> LogActions: Leave Apache:1 (Started oamdev-vm3.lab.ipaccess.com)
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> LogActions: Leave NOSIpAddress10_255_239_25 (Started
> oamdev-vm3.lab.ipaccess.com)
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com attrd: [16767]: info:
> attrd_trigger_update: Sending flush op to all hosts for:
> master-p_drbd_resource_r0:0 (<null>)
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info:
> do_state_transition: State transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE
> [ input=I_PE_SUCCESS cause=C_IPC_MESSAGE origin=handle_response ]
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info:
> unpack_graph: Unpacked transition 17: 87 actions in 87 synapses
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info:
> do_te_invoke: Processing graph 17 (ref=pe_calc-dc-1348055555-190) derived
> from /var/lib/pengine/pe-input-5945.bz2
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info:
> te_rsc_command: Initiating action 18: monitor p_drbd_resource_r0:0_monitor_0
> on oamdev-vm2.lab.ipaccess.com
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info:
> te_pseudo_action: Pseudo action 43 fired and confirmed
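
Because the log excerpt above is quoted and line-wrapped, the policy engine's decisions are hard to scan by eye. A minimal Python sketch, assuming the text is pasted in exactly as it appears in this mail (leading "> " markers, a blank quoted line between entries), that rejoins the wrapped entries and groups resources by the action logged in LogActions. The function names unwrap and actions are hypothetical, and SAMPLE holds only three entries copied from the excerpt.

```python
import re

# Matches pengine "LogActions: <Action> <resource>" once wrapped lines are rejoined.
LOG_ACTION = re.compile(r"LogActions:\s+(\w+)\s+(\S+)")


def unwrap(quoted_text):
    """Strip the leading '>' quote markers and join each wrapped entry onto one line."""
    entries, current = [], []
    for raw in quoted_text.splitlines():
        line = re.sub(r"^>\s?", "", raw).strip()
        if line:
            current.append(line)
        elif current:
            # A blank (quoted) line ends the current log entry.
            entries.append(" ".join(current))
            current = []
    if current:
        entries.append(" ".join(current))
    return entries


def actions(quoted_text):
    """Group resource names by the action the policy engine logged for them."""
    result = {}
    for entry in unwrap(quoted_text):
        match = LOG_ACTION.search(entry)
        if match:
            result.setdefault(match.group(1), []).append(match.group(2))
    return result


SAMPLE = """\
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> LogActions: Restart p_ip_fs1 (Started oamdev-vm3.lab.ipaccess.com)
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> LogActions: Restart NOSServiceManager0 (Started
> oamdev-vm3.lab.ipaccess.com)
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> LogActions: Leave p_mysql (Started oamdev-vm3.lab.ipaccess.com)
"""

if __name__ == "__main__":
    for action, resources in actions(SAMPLE).items():
        print(action, resources)
```

Run against the full excerpt, the "Restart" group is the list of resources that were bounced when oamdev-vm2 rejoined.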
>
>
>
>
>
> Any help, hints, or suggestions are appreciated.
>
>
>
> -Poonam.