[Pacemaker] crm resource doesn't move after hardware crash

Beo Banks beo.banks at googlemail.com
Tue Mar 18 11:05:32 UTC 2014


Hi,

I had a hardware crash in a two-node DRBD cluster.
The active node has a hardware failure and is currently down.

I am wondering why my second node doesn't migrate/move the resources.
The second node wants to fence the failed node, but that's not possible (it's down).


How can I enable the services on the last "good" node?
And how can I optimize my config to handle that kind of error?

crm status

Last updated: Tue Mar 18 12:01:07 2014
Last change: Tue Mar 18 11:28:22 2014 via crmd on linux02
Stack: classic openais (with plugin)
Current DC: linux02 - partition WITHOUT quorum
Version: 1.1.10-14.el6_5.2-368c726
2 Nodes configured, 2 expected votes
21 Resources configured


Node linux01: UNCLEAN (offline)
Online: [ linux02 ]

 Resource Group: mysql
     mysql_fs   (ocf::heartbeat:Filesystem):    Started linux01
     mysql_ip   (ocf::heartbeat:IPaddr2):       Started linux01

.... and so on



cluster.log


Mar 18 11:54:43 [2234] linux02       crmd:   notice: tengine_stonith_callback:      Stonith operation 17 for linux01 failed (Timer expired): aborting transition.
Mar 18 11:54:43 [2234] linux02       crmd:     info: abort_transition_graph:        tengine_stonith_callback:463 - Triggered transition abort (complete=0) : Stonith failed
Mar 18 11:54:43 [2234] linux02       crmd:   notice: run_graph: Transition 15 (Complete=9, Pending=0, Fired=0, Skipped=36, Incomplete=19, Source=/var/lib/pacemaker/pengine/pe-warn-63.bz2): Stopped
Mar 18 11:54:43 [2234] linux02       crmd:   notice: too_many_st_failures: Too many failures to fence linux01 (16), giving up
Mar 18 11:54:43 [2234] linux02       crmd:     info: do_log:        FSA: Input I_TE_SUCCESS from notify_crmd() received in state S_TRANSITION_ENGINE
Mar 18 11:54:43 [2234] linux02       crmd:   notice: do_state_transition: State transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS cause=C_FSA_INTERNAL origin=notify_crmd ]
Mar 18 11:54:43 [2230] linux02 stonith-ng:     info: stonith_command: Processed st_notify reply from linux02: OK (0)
Mar 18 11:54:43 [2234] linux02       crmd:   notice: tengine_stonith_notify:        Peer linux01 was not terminated (reboot) by linux02 for linux02: Timer expired (ref=7939b264-699c-4d00-a89c-07e7e0193a80) by client crmd.2234
Mar 18 11:54:44 [2229] linux02        cib:     info: crm_client_new: Connecting 0x155ac00 for uid=0 gid=0 pid=23360 id=b88b2690-0c3f-48ac-b8b4-3a47b7f9114a
Mar 18 11:54:44 [2229] linux02        cib:     info: cib_process_request: Completed cib_query operation for section 'all': OK (rc=0, origin=local/crm_mon/2, version=0.125.2)
Mar 18 11:54:44 [2229] linux02        cib:     info: crm_client_destroy: Destroying 0 events
Mar 18 11:55:03 [2229] linux02        cib:     info: crm_client_new: Connecting 0x155ac00 for uid=0 gid=0 pid=23415 id=62e7a9d8-588e-427f-8178-85febce00151
Mar 18 11:55:03 [2229] linux02        cib:     info: crm_client_new: Connecting 0x1585de0 for uid=0 gid=0 pid=23416 id=79795042-699b-4347-abcb-4c7c96ed2291
Mar 18 11:55:03 [2229] linux02        cib:     info: cib_process_request: Completed cib_query operation for section nodes: OK (rc=0, origin=local/crm_attribute/2, version=0.125.2)
Mar 18 11:55:03 [2229] linux02        cib:     info: cib_process_request: Completed cib_query operation for section nodes: OK (rc=0, origin=local/crm_attribute/2, version=0.125.2)
Mar 18 11:55:03 [2229] linux02        cib:     info: crm_client_destroy: Destroying 0 events
Mar 18 11:55:03 [2229] linux02        cib:     info: crm_client_destroy: Destroying 0 events
Mar 18 11:55:43 [2230] linux02 stonith-ng:    error: remote_op_done: Already sent notifications for 'reboot of linux01 by linux02' (for=crmd.2234@linux02.7939b264, state=4): Timer expired
Mar 18 11:55:59 [2229] linux02        cib:     info: crm_client_new: Connecting 0x155ac00 for uid=0 gid=0 pid=23468 id=8dea3cab-9103-42fc-9747-76018c4a0500
Mar 18 11:55:59 [2229] linux02        cib:     info: cib_process_request: Completed cib_query operation for section 'all': OK (rc=0, origin=local/crm_mon/2, version=0.125.2)
Mar 18 11:55:59 [2229] linux02        cib:     info: crm_client_destroy: Destroying 0 events
Mar 18 11:56:03 [2229] linux02        cib:     info: crm_client_new: Connecting 0x155ac00 for uid=0 gid=0 pid=23523 id=b681390a-51a3-4d68-abf1-514ee8ab9351
Mar 18 11:56:03 [2229] linux02        cib:     info: crm_client_new: Connecting 0x1585de0 for uid=0 gid=0 pid=23524 id=005421e4-b079-4a16-b4cc-0fc2c8c73246
Mar 18 11:56:03 [2229] linux02        cib:     info: cib_process_request: Completed cib_query operation for section nodes: OK (rc=0, origin=local/crm_attribute/2, version=0.125.2)
Mar 18 11:56:03 [2229] linux02        cib:     info: cib_process_request: Completed cib_query operation for section nodes: OK (rc=0, origin=local/crm_attribute/2, version=0.125.2)
Mar 18 11:56:03 [2229] linux02        cib:     info: crm_client_destroy: Destroying 0 events
Mar 18 11:56:03 [2229] linux02        cib:     info: crm_client_destroy: Destroying 0 events
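
Most of the entries above are routine cib queries; the fencing-related ones are what matter here. As a rough sketch only (the keyword pattern and the helper name fencing_lines are my own assumptions, not anything Pacemaker provides), they could be pulled out of the log like this:

```python
import re

# Hypothetical keyword pattern: substrings that mark fencing-related entries.
FENCE_PATTERN = re.compile(r"stonith|fence", re.IGNORECASE)


def fencing_lines(lines):
    """Return only the log lines that mention stonith or fencing."""
    return [line.rstrip("\n") for line in lines if FENCE_PATTERN.search(line)]


if __name__ == "__main__":
    # Two sample entries taken from the log above.
    sample = [
        "Mar 18 11:54:43 [2234] linux02       crmd:   notice: "
        "too_many_st_failures: Too many failures to fence linux01 (16), giving up\n",
        "Mar 18 11:54:44 [2229] linux02        cib:     info: "
        "crm_client_destroy: Destroying 0 events\n",
    ]
    for entry in fencing_lines(sample):
        print(entry)
```

Any iterable of lines works as input, so an open file object for the cluster log (wherever it lives on your system) can be passed in directly.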

Thanks,
Beo