[Pacemaker] Corosync starts after system reboot but fails to load/start any resources
Parshvi
parshvi.17 at gmail.com
Mon May 28 12:18:00 UTC 2012
Hi,
I have set up a two-node cluster (Node-1 and Node-2) with STONITH disabled and
OCFS2 as the file system (running in a separate cluster).
Use case:
1) One resource runs in Master/Slave mode with CIP1.
2) 5 resources run in Active/Passive mode with CIP2, preferred node being
Node-1. These resources are non-sticky (resource stickiness = 0).
3) 5 resources run in Active/Passive mode with CIP3, preferred node being
Node-2. These resources are non-sticky (resource stickiness = 0).
4) There are a few more resources running in Active/Passive mode with
stickiness = 1, preferred node being Node-1.
Test case:
Node-2 (Running as Primary) is rebooted.
Expected result (While Node-2 is offline):
All resources of Node-2 running in Active/Passive fail-over to Node-1
The slave instance of the M/S resource is promoted to Master on Node-1
When Node-2 is up after reboot:
The 5 non-sticky resources should fail back to Node-2
A slave resource must start on Node-2
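
To make the expected fail-over and fail-back concrete, here is a minimal Python sketch of how I understand placement to work for a non-sticky resource. It is a simplified model and not Pacemaker's actual policy engine: the function `place`, the score of 100, and the tie-breaking rule are assumptions chosen only for illustration.

```python
# Simplified placement model; NOT Pacemaker's actual policy engine.
def place(location_scores, stickiness, current_node, online_nodes):
    """Return the online node with the highest total score, or None."""
    best_node, best_score = None, None
    for node in sorted(online_nodes):
        score = location_scores.get(node, 0)
        if node == current_node:
            score += stickiness  # bonus for staying where the resource runs
        if best_score is None or score > best_score:
            best_node, best_score = node, score
    return best_node

# Hypothetical location preference for one of the CIP3 resources (use case 3).
prefers_node2 = {"Node-1": 0, "Node-2": 100}

# While Node-2 is offline, only Node-1 is a candidate.
during_outage = place(prefers_node2, 0, "Node-2", {"Node-1"})
# Once Node-2 is back, stickiness 0 gives Node-1 no bonus, so Node-2 wins.
after_reboot = place(prefers_node2, 0, during_outage, {"Node-1", "Node-2"})
print(during_outage, after_reboot)
```

Under these assumptions the resource lands on Node-1 during the outage and returns to Node-2 afterwards, which is the behaviour I expected but did not see.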
Observations:
-> When Node-2 is up after reboot, the following issues are observed:
1) NONE of the resources start on Node-2: the 5 non-sticky resources do not
fail back to Node-2.
2) The slave instance is not started on Node-2.
The system was rebooted at 8:38 a.m.
Corosync was restarted at 9:50 a.m., which failed to fix the issue.
At 10:24 a.m. the system was rebooted again. This time the resources started
normally.
crm_mon on Node-1: shows Node-2 as offline.
crm_mon on Node-2: shows Node-1 as online.
An hb_report could not be captured, but I have the logs (corosync logs and
syslogs) and the pe-input files. Where can I publish them? Pastebin seems to be
blocked.