[Pacemaker] cluster got stuck on stopping resources
Andreas Kurz
andreas.kurz at linbit.com
Mon Jun 7 10:13:41 UTC 2010
Hi all,
I observed strange behaviour when trying to stop two resources with the latest
Pacemaker:
I updated two resources (ping) and changed some constraints. One of the
changed resources shows up in the logs with "strange" lrmd messages:
...
Jun 07 10:16:58 emahqwienfw1b crmd: [31354]: ERROR: do_lrm_rsc_op: Operation monitor on res_ping_ABC failed: -1
Jun 07 10:16:58 emahqwienfw1b lrmd: [31351]: notice: on_msg_perform_op: resource res_ping_ABC is frozen, no ops can run.
Jun 07 10:16:58 emahqwienfw1b lrmd: [31351]: debug: RA output [dummy status to fool heartbeat
] didn't match any pattern
Jun 07 10:16:58 emahqwienfw1b crmd: [31354]: WARN: do_log: FSA: Input I_FAIL from do_lrm_rsc_op() received in state S_TRANSITION_ENGINE
Jun 07 10:16:58 emahqwienfw1b crmd: [31354]: info: do_state_transition: State transition S_TRANSITION_ENGINE -> S_POLICY_ENGINE [ input=I_FAIL cause=C_FSA_INTERNAL origin=do_lrm_rsc_op ]
....
Then I tried to stop two other resources (part of a group) and nothing happened.
One of these resources is a dependency of res_ping_ABC, which the lrmd reports
as "frozen".
Running ptest -L shows that pengine knows what to do (stop the two resources
and all dependencies).
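To see which resources the lrmd reports as frozen, the log can be filtered
with a short script. This is only a minimal sketch: the function name
frozen_resources and the regular expression are my own assumptions, derived
from the "is frozen, no ops can run" message quoted in the log excerpt, and
are not part of any Pacemaker tool.

```python
import re

# Pattern assumed from the lrmd message quoted in the log excerpt:
# "resource <name> is frozen, no ops can run."
FROZEN_RE = re.compile(r"resource (\S+) is frozen")


def frozen_resources(log_text):
    """Return the sorted, de-duplicated names of resources reported as frozen."""
    # Log lines may have been wrapped by a mail client, so collapse all
    # whitespace (including newlines) into single spaces before matching.
    flat = " ".join(log_text.split())
    return sorted(set(FROZEN_RE.findall(flat)))


sample = """Jun 07 10:16:58 emahqwienfw1b lrmd: [31351]: notice: on_msg_perform_op:
resource res_ping_ABC is frozen, no ops can run."""

print(frozen_resources(sample))
```

Run against the full log from the hb_report, this would list every resource
for which the lrmd refused to run operations, not only res_ping_ABC.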
Any ideas? An hb_report is attached. I left the cluster in this state, so if
there is anything else I should provide for debugging, please tell me.
Regards,
Andreas
-------------- next part --------------
A non-text attachment was scrubbed...
Name: resource_stop-stucks.tar.bz2
Type: application/x-bzip-compressed-tar
Size: 52834 bytes
Desc: not available
URL: <https://lists.clusterlabs.org/pipermail/pacemaker/attachments/20100607/eb651c43/attachment-0003.bin>