[Pacemaker] OFFLINE node after cluster upgrade
ruslan usifov
ruslan.usifov at gmail.com
Fri Feb 24 21:28:54 UTC 2012
I solved this problem!
On one node I found the following error message in the log:
slv009 .... peer is not part of our cluster
So I stopped Pacemaker on that host (I use v1 for pacemaker):
/etc/pacemaker stop
/etc/corosync stop
Then I removed all CIB data from /var/lib/heartbeat/crm and cleaned up the
/var/lib/pengine directory, then restarted the cluster on that node. And voilà,
everything began working as expected.
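
For anyone repeating these steps, here is a minimal sketch of the sequence described above. It is not a definitive procedure. The /etc/init.d/ script paths are an assumption (Pacemaker started separately from corosync on Ubuntu), and the backup step, the backup directory, the DRY_RUN switch and the function names are additions for illustration only. By default it only prints what it would do.

```shell
#!/bin/sh
# Sketch only: paths and init scripts are assumptions, not taken from the
# original message. Run as root on the affected node only.
CIB_DIR="${CIB_DIR:-/var/lib/heartbeat/crm}"   # local CIB files
PE_DIR="${PE_DIR:-/var/lib/pengine}"           # policy engine files
DRY_RUN="${DRY_RUN:-1}"                        # 1 = print only, 0 = execute

run() {
    # Print the command; execute it only when DRY_RUN=0.
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

reset_local_cluster_state() {
    # Stop Pacemaker first, then corosync.
    run /etc/init.d/pacemaker stop
    run /etc/init.d/corosync stop
    # Keep a copy before wiping (hypothetical backup location).
    backup="/root/cib-backup-$(date +%Y%m%d%H%M%S)"
    run mkdir -p "$backup"
    run cp -a "$CIB_DIR" "$PE_DIR" "$backup"/
    # Empty both directories but keep the directories themselves.
    run find "$CIB_DIR" -mindepth 1 -delete
    run find "$PE_DIR" -mindepth 1 -delete
    # Start in the reverse order.
    run /etc/init.d/corosync start
    run /etc/init.d/pacemaker start
}

reset_local_cluster_state
```

Run it once as-is to review the printed commands, then with DRY_RUN=0 to actually execute them. The node should then fetch the current CIB from the rest of the cluster when it rejoins.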
But I still have a question: why does this happen? Why do nodes begin to think
that other nodes are not part of the cluster?
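
This does not answer the question, but a rough sketch like the following may help compare how each node sees the membership. It assumes the usual corosync and Pacemaker command-line tools (corosync-cfgtool, crm_node, crm_mon) are installed; the function name is made up for illustration, and tools that are missing on a host are simply skipped.

```shell
#!/bin/sh
# Sketch: print this node's view of the cluster membership.
# Run it on every node and compare the output.
show_membership_view() {
    for cmd in "corosync-cfgtool -s" "crm_node -p" "crm_mon -1"; do
        tool=${cmd%% *}          # first word of the command line
        echo "== $cmd"
        if command -v "$tool" >/dev/null 2>&1; then
            # $cmd is intentionally unquoted so it splits into words.
            $cmd 2>&1 || echo "($tool exited with status $?)"
        else
            echo "($tool not found, skipped)"
        fi
    done
}

show_membership_view
```

If one node lists different partition members than the others, that node is the one to look at first.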
2012/2/24 ruslan usifov <ruslan.usifov at gmail.com>
> Hello
>
> I have a 3-node cluster setup. After an OS upgrade, one node is
> permanently in the OFFLINE state.
>
>
> OS: ubuntu 10.0.4
> pacemaker: 1.1.6-9971ebba4494012a93c03b40a2c58ec0eb60f50c
>
>
>
> On the OFFLINE node I see the following in the log:
>
> Feb 24 20:27:45 slv009 crmd: [9125]: info: do_dc_release: DC role released
> Feb 24 20:27:45 slv009 crmd: [9125]: info: do_te_control: Transitioner is now inactive
> Feb 24 20:28:05 slv009 crmd: [9125]: info: crm_timer_popped: Election Trigger (I_DC_TIMEOUT) just popped (20000ms)
> Feb 24 20:28:05 slv009 crmd: [9125]: WARN: do_log: FSA: Input I_DC_TIMEOUT from crm_timer_popped() received in state S_PENDING
> Feb 24 20:28:05 slv009 crmd: [9125]: info: do_state_transition: State transition S_PENDING -> S_ELECTION [ input=I_DC_TIMEOUT cause=C_TIMER_POPPED origin=crm_timer_popped ]
> Feb 24 20:28:05 slv009 crmd: [9125]: info: do_state_transition: State transition S_ELECTION -> S_PENDING [ input=I_PENDING cause=C_FSA_INTERNAL origin=do_election_count_vote ]
> Feb 24 20:28:05 slv009 crmd: [9125]: info: do_dc_release: DC role released
> Feb 24 20:28:05 slv009 crmd: [9125]: info: do_te_control: Transitioner is now inactive
> Feb 24 20:28:25 slv009 crmd: [9125]: info: crm_timer_popped: Election Trigger (I_DC_TIMEOUT) just popped (20000ms)
> Feb 24 20:28:25 slv009 crmd: [9125]: WARN: do_log: FSA: Input I_DC_TIMEOUT from crm_timer_popped() received in state S_PENDING
> Feb 24 20:28:25 slv009 crmd: [9125]: info: do_state_transition: State transition S_PENDING -> S_ELECTION [ input=I_DC_TIMEOUT cause=C_TIMER_POPPED origin=crm_timer_popped ]
> Feb 24 20:28:25 slv009 crmd: [9125]: info: do_state_transition: State transition S_ELECTION -> S_PENDING [ input=I_PENDING cause=C_FSA_INTERNAL origin=do_election_count_vote ]
> Feb 24 20:28:25 slv009 crmd: [9125]: info: do_dc_release: DC role released
> Feb 24 20:28:25 slv009 crmd: [9125]: info: do_te_control: Transitioner is now inactive
> Feb 24 20:28:45 slv009 crmd: [9125]: info: crm_timer_popped: Election Trigger (I_DC_TIMEOUT) just popped (20000ms)
> Feb 24 20:28:45 slv009 crmd: [9125]: WARN: do_log: FSA: Input I_DC_TIMEOUT from crm_timer_popped() received in state S_PENDING
> Feb 24 20:28:45 slv009 crmd: [9125]: info: do_state_transition: State transition S_PENDING -> S_ELECTION [ input=I_DC_TIMEOUT cause=C_TIMER_POPPED origin=crm_timer_popped ]
> Feb 24 20:28:45 slv009 crmd: [9125]: info: do_state_transition: State transition S_ELECTION -> S_PENDING [ input=I_PENDING cause=C_FSA_INTERNAL origin=do_election_count_vote ]
> Feb 24 20:28:45 slv009 crmd: [9125]: info: do_dc_release: DC role released
> Feb 24 20:28:45 slv009 crmd: [9125]: info: do_te_control: Transitioner is now inactive
> Feb 24 20:29:05 slv009 crmd: [9125]: info: crm_timer_popped: Election Trigger (I_DC_TIMEOUT) just popped (20000ms)
> Feb 24 20:29:05 slv009 crmd: [9125]: WARN: do_log: FSA: Input I_DC_TIMEOUT from crm_timer_popped() received in state S_PENDING
> Feb 24 20:29:05 slv009 crmd: [9125]: info: do_state_transition: State transition S_PENDING -> S_ELECTION [ input=I_DC_TIMEOUT cause=C_TIMER_POPPED origin=crm_timer_popped ]
> Feb 24 20:29:05 slv009 crmd: [9125]: info: do_state_transition: State transition S_ELECTION -> S_PENDING [ input=I_PENDING cause=C_FSA_INTERNAL origin=do_election_count_vote ]
> Feb 24 20:29:05 slv009 crmd: [9125]: info: do_dc_release: DC role released
> Feb 24 20:29:05 slv009 crmd: [9125]: info: do_te_control: Transitioner is now inactive
> Feb 24 20:29:25 slv009 crmd: [9125]: info: crm_timer_popped: Election Trigger (I_DC_TIMEOUT) just popped (20000ms)
> Feb 24 20:29:25 slv009 crmd: [9125]: WARN: do_log: FSA: Input I_DC_TIMEOUT from crm_timer_popped() received in state S_PENDING
> Feb 24 20:29:25 slv009 crmd: [9125]: info: do_state_transition: State transition S_PENDING -> S_ELECTION [ input=I_DC_TIMEOUT cause=C_TIMER_POPPED origin=crm_timer_popped ]
> Feb 24 20:29:25 slv009 crmd: [9125]: info: do_state_transition: State transition S_ELECTION -> S_PENDING [ input=I_PENDING cause=C_FSA_INTERNAL origin=do_election_count_vote ]
> Feb 24 20:29:25 slv009 crmd: [9125]: info: do_dc_release: DC role released
> Feb 24 20:29:25 slv009 crmd: [9125]: info: do_te_control: Transitioner is now inactive
> Feb 24 20:29:45 slv009 crmd: [9125]: info: crm_timer_popped: Election Trigger (I_DC_TIMEOUT) just popped (20000ms)
> Feb 24 20:29:45 slv009 crmd: [9125]: WARN: do_log: FSA: Input I_DC_TIMEOUT from crm_timer_popped() received in state S_PENDING
> Feb 24 20:29:45 slv009 crmd: [9125]: info: do_state_transition: State transition S_PENDING -> S_ELECTION [ input=I_DC_TIMEOUT cause=C_TIMER_POPPED origin=crm_timer_popped ]
> Feb 24 20:29:45 slv009 crmd: [9125]: info: do_state_transition: State transition S_ELECTION -> S_PENDING [ input=I_PENDING cause=C_FSA_INTERNAL origin=do_election_count_vote ]
> Feb 24 20:29:45 slv009 crmd: [9125]: info: do_dc_release: DC role released
> Feb 24 20:29:45 slv009 crmd: [9125]: info: do_te_control: Transitioner is now inactive
> Feb 24 20:30:05 slv009 crmd: [9125]: info: crm_timer_popped: Election Trigger (I_DC_TIMEOUT) just popped (20000ms)
> Feb 24 20:30:05 slv009 crmd: [9125]: WARN: do_log: FSA: Input I_DC_TIMEOUT from crm_timer_popped() received in state S_PENDING
> Feb 24 20:30:05 slv009 crmd: [9125]: info: do_state_transition: State transition S_PENDING -> S_ELECTION [ input=I_DC_TIMEOUT cause=C_TIMER_POPPED origin=crm_timer_popped ]
> Feb 24 20:30:05 slv009 crmd: [9125]: info: do_state_transition: State transition S_ELECTION -> S_PENDING [ input=I_PENDING cause=C_FSA_INTERNAL origin=do_election_count_vote ]
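
The log above shows crmd on slv009 going from S_PENDING to S_ELECTION and back every 20 seconds. A small sketch for counting how often that happened in a log file follows. The function name is made up for illustration, and it expects each log entry on a single line, as syslog writes it (not wrapped as in a mail quote).

```shell
#!/bin/sh
# Sketch: count S_PENDING -> S_ELECTION transitions triggered by I_DC_TIMEOUT.
# Reads the files given as arguments, or stdin when no file is given.
count_dc_timeouts() {
    # -F: match the text literally, -c: print the number of matching lines.
    grep -cF 'State transition S_PENDING -> S_ELECTION [ input=I_DC_TIMEOUT' "$@"
}
```

Usage would be something like `count_dc_timeouts /var/log/syslog`, where the log path is an assumption and depends on the syslog configuration of the host.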
>
>
> I have the following crm configuration:
>
> node slv008
> node slv009
> node slv010
> primitive http_173.192.214.78_eth1 ocf:heartbeat:IPaddr2 \
>     params ip="173.192.214.78" nic="eth1:1" cidr_netmask="30" \
>     op monitor interval="10s"
> primitive http_nginx ocf:heartbeat:nginx \
>     op monitor interval="10s" timeout="120s"
> group http http_173.192.214.78_eth1 http_nginx \
>     meta target-role="Started" is-managed="true"
> property $id="cib-bootstrap-options" \
>     dc-version="1.1.6-9971ebba4494012a93c03b40a2c58ec0eb60f50c" \
>     cluster-infrastructure="openais" \
>     expected-quorum-votes="3" \
>     stonith-enabled="false"
> rsc_defaults $id="rsc-options" \
>     resource-stickiness="100"
>
>
>
>
>
> Also, I can't restart Pacemaker on that node cleanly, i.e. through the
> init.d script (it just hangs).
>
>
>
>