[Pacemaker] master/slave resource does not stop (tries start repeatedly)
Kazunori INOUE
inouekazu at intellilink.co.jp
Fri Sep 7 09:49:04 UTC 2012
Hi,
I am using Pacemaker-1.1.
- ClusterLabs/pacemaker : 872a2f1af1 (Sep 07)
Even though the monitor of the master resource has failed and there is no
node left on which the master/slave resource can run, the master/slave
resource does not stop.
[test case]
1. Use the Stateful RA with on-fail="restart" set on the monitor operation
and migration-threshold set to 1.
# crm_mon
Online: [ vm5 vm6 ]
Master/Slave Set: msAP [prmAP]
Masters: [ vm5 ]
Slaves: [ vm6 ]
2. Make the master resource on vm5 fail, so that the master moves to vm6.
Online: [ vm5 vm6 ]
Master/Slave Set: msAP [prmAP]
Masters: [ vm6 ]
Stopped: [ prmAP:1 ]
Failed actions:
prmAP_monitor_10000 (node=vm5, call=14, rc=1, status=complete): unknown error
3. Make the master resource on vm6 fail as well. The master/slave resource
then tries to start repeatedly, and the cluster alternates between the
following states (a) and (b).
(a)
Online: [ vm5 vm6 ]
Failed actions:
prmAP_monitor_10000 (node=vm5, call=14, rc=1, status=complete): unknown error
prmAP_monitor_10000 (node=vm6, call=20, rc=1, status=complete): unknown error
(b)
Online: [ vm5 vm6 ]
Master/Slave Set: msAP [prmAP]
Slaves: [ vm5 vm6 ]
Failed actions:
prmAP_monitor_10000 (node=vm5, call=14, rc=1, status=complete): unknown error
prmAP_monitor_10000 (node=vm6, call=20, rc=1, status=complete): unknown error
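To make the expectation explicit: with migration-threshold set to 1, a node is excluded as soon as its fail-count reaches 1, so after step 3 no node should be left for msAP and I would expect it to stay stopped. The following Python sketch is only a simplified model of that rule, not Pacemaker's actual placement code; the function names and the fail-count dictionary are hypothetical and exist only for this illustration.

```python
def eligible_nodes(fail_counts, migration_threshold):
    """Return the nodes whose fail-count is still below migration-threshold.

    fail_counts: dict mapping node name -> number of recorded failures
    (a hypothetical stand-in for the per-node fail-count of the resource).
    """
    return sorted(
        node for node, fails in fail_counts.items()
        if fails < migration_threshold
    )


def expected_state(fail_counts, migration_threshold):
    """Describe the state expected under this simplified model."""
    nodes = eligible_nodes(fail_counts, migration_threshold)
    if not nodes:
        # No node may run the resource any more, so it should stay stopped.
        return "Stopped"
    return "Runnable on: " + ", ".join(nodes)


if __name__ == "__main__":
    # After step 2: only vm5 has failed.
    print(expected_state({"vm5": 1, "vm6": 0}, migration_threshold=1))
    # After step 3: both nodes have failed once.
    print(expected_state({"vm5": 1, "vm6": 1}, migration_threshold=1))
```

Under this model the state after step 3 is simply "Stopped", which is not what the cluster does in states (a) and (b).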
# grep -e run_graph: -e common_apply_stickiness: -e LogActions: ha-log
>> after the master resource on vm5 failed
Sep 7 16:06:03 vm5 pengine[23199]: notice: LogActions: Recover prmAP:0 (Master vm5)
Sep 7 16:06:03 vm5 crmd[23200]: notice: run_graph: Transition 4 (Complete=3, Pending=0, Fired=0, Skipped=8, Incomplete=3, Source=/var/lib/pacemaker/pengine/pe-input-4.bz2): Stopped
Sep 7 16:06:03 vm5 pengine[23199]: warning: common_apply_stickiness: Forcing msAP away from vm5 after 1 failures (max=1)
Sep 7 16:06:03 vm5 pengine[23199]: warning: common_apply_stickiness: Forcing msAP away from vm5 after 1 failures (max=1)
Sep 7 16:06:03 vm5 pengine[23199]: notice: LogActions: Stop prmAP:0 (vm5)
Sep 7 16:06:03 vm5 pengine[23199]: notice: LogActions: Promote prmAP:1 (Slave -> Master vm6)
Sep 7 16:06:03 vm5 crmd[23200]: notice: run_graph: Transition 5 (Complete=4, Pending=0, Fired=0, Skipped=4, Incomplete=1, Source=/var/lib/pacemaker/pengine/pe-input-5.bz2): Stopped
Sep 7 16:06:03 vm5 pengine[23199]: warning: common_apply_stickiness: Forcing msAP away from vm5 after 1 failures (max=1)
Sep 7 16:06:03 vm5 pengine[23199]: notice: LogActions: Promote prmAP:0 (Slave -> Master vm6)
Sep 7 16:06:03 vm5 crmd[23200]: notice: run_graph: Transition 6 (Complete=3, Pending=0, Fired=0, Skipped=1, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-6.bz2): Stopped
Sep 7 16:06:03 vm5 pengine[23199]: warning: common_apply_stickiness: Forcing msAP away from vm5 after 1 failures (max=1)
Sep 7 16:06:03 vm5 crmd[23200]: notice: run_graph: Transition 7 (Complete=1, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-7.bz2): Complete
>> after the master resource on vm6 failed
Sep 7 16:06:33 vm5 pengine[23199]: warning: common_apply_stickiness: Forcing msAP away from vm5 after 1 failures (max=1)
Sep 7 16:06:33 vm5 pengine[23199]: notice: LogActions: Recover prmAP:0 (Master vm6)
Sep 7 16:06:34 vm5 crmd[23200]: notice: run_graph: Transition 8 (Complete=3, Pending=0, Fired=0, Skipped=8, Incomplete=3, Source=/var/lib/pacemaker/pengine/pe-input-8.bz2): Stopped
Sep 7 16:06:34 vm5 pengine[23199]: warning: common_apply_stickiness: Forcing msAP away from vm5 after 1 failures (max=1)
Sep 7 16:06:34 vm5 pengine[23199]: warning: common_apply_stickiness: Forcing msAP away from vm6 after 1 failures (max=1)
Sep 7 16:06:34 vm5 pengine[23199]: notice: LogActions: Stop prmAP:0 (vm6)
Sep 7 16:06:34 vm5 crmd[23200]: notice: run_graph: Transition 9 (Complete=3, Pending=0, Fired=0, Skipped=1, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-9.bz2): Stopped
Sep 7 16:06:34 vm5 pengine[23199]: notice: LogActions: Start prmAP:0 (vm5)
Sep 7 16:06:34 vm5 pengine[23199]: notice: LogActions: Promote prmAP:0 (Stopped -> Master vm5)
Sep 7 16:06:34 vm5 pengine[23199]: notice: LogActions: Start prmAP:1 (vm6)
Sep 7 16:06:35 vm5 crmd[23200]: notice: run_graph: Transition 10 (Complete=4, Pending=0, Fired=0, Skipped=4, Incomplete=1, Source=/var/lib/pacemaker/pengine/pe-input-10.bz2): Stopped
Sep 7 16:06:35 vm5 pengine[23199]: warning: common_apply_stickiness: Forcing msAP away from vm5 after 1 failures (max=1)
Sep 7 16:06:35 vm5 pengine[23199]: warning: common_apply_stickiness: Forcing msAP away from vm5 after 1 failures (max=1)
Sep 7 16:06:35 vm5 pengine[23199]: warning: common_apply_stickiness: Forcing msAP away from vm6 after 1 failures (max=1)
Sep 7 16:06:35 vm5 pengine[23199]: warning: common_apply_stickiness: Forcing msAP away from vm6 after 1 failures (max=1)
Sep 7 16:06:35 vm5 pengine[23199]: notice: LogActions: Stop prmAP:0 (vm5)
Sep 7 16:06:35 vm5 pengine[23199]: notice: LogActions: Stop prmAP:1 (vm6)
Sep 7 16:06:35 vm5 crmd[23200]: notice: run_graph: Transition 11 (Complete=4, Pending=0, Fired=0, Skipped=1, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-11.bz2): Stopped
Sep 7 16:06:35 vm5 pengine[23199]: notice: LogActions: Start prmAP:0 (vm5)
Sep 7 16:06:35 vm5 pengine[23199]: notice: LogActions: Promote prmAP:0 (Stopped -> Master vm5)
Sep 7 16:06:35 vm5 pengine[23199]: notice: LogActions: Start prmAP:1 (vm6)
Sep 7 16:06:35 vm5 crmd[23200]: notice: run_graph: Transition 12 (Complete=4, Pending=0, Fired=0, Skipped=4, Incomplete=1, Source=/var/lib/pacemaker/pengine/pe-input-12.bz2): Stopped
:
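The repetition can also be confirmed mechanically from the LogActions lines. The following Python sketch is a rough helper and not part of Pacemaker; the regular expression, the function names and the threshold of 2 are assumptions based only on the log format shown above.

```python
import re
from collections import Counter

# Matches pengine lines such as:
#   "... pengine[23199]: notice: LogActions: Start prmAP:0 (vm5)"
ACTION_RE = re.compile(r"LogActions:\s+(\w+)\s+(\S+)\s+\((.*)\)")

# A few lines copied from the log excerpt above.
SAMPLE = [
    "Sep 7 16:06:34 vm5 pengine[23199]: notice: LogActions: Stop prmAP:0 (vm6)",
    "Sep 7 16:06:34 vm5 pengine[23199]: notice: LogActions: Start prmAP:0 (vm5)",
    "Sep 7 16:06:34 vm5 pengine[23199]: notice: LogActions: Start prmAP:1 (vm6)",
    "Sep 7 16:06:35 vm5 pengine[23199]: notice: LogActions: Stop prmAP:0 (vm5)",
    "Sep 7 16:06:35 vm5 pengine[23199]: notice: LogActions: Stop prmAP:1 (vm6)",
    "Sep 7 16:06:35 vm5 pengine[23199]: notice: LogActions: Start prmAP:0 (vm5)",
]


def count_actions(lines):
    """Count (action, resource) pairs found in LogActions lines."""
    counts = Counter()
    for line in lines:
        match = ACTION_RE.search(line)
        if match:
            action, resource, _detail = match.groups()
            counts[(action, resource)] += 1
    return counts


def restart_loop_candidates(counts, threshold=2):
    """Heuristic: instances that were both started and stopped
    at least `threshold` times in the given lines."""
    resources = {resource for _action, resource in counts}
    return sorted(
        resource for resource in resources
        if counts[("Start", resource)] >= threshold
        and counts[("Stop", resource)] >= threshold
    )


if __name__ == "__main__":
    print(restart_loop_candidates(count_actions(SAMPLE)))
```

To run it against the real log, pass the lines of ha-log (for example `open("ha-log")`) to `count_actions` instead of `SAMPLE`.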
Is this a known issue?
Best Regards,
Kazunori INOUE
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ms-resource-doesnot-stop.tar.bz2
Type: application/octet-stream
Size: 259981 bytes
Desc: not available
URL: <https://lists.clusterlabs.org/pipermail/pacemaker/attachments/20120907/487086a2/attachment-0003.obj>