[Pacemaker] Stonith external/sbd problem

Nicola Sabatelli n.sabatelli at ct.rupar.puglia.it
Thu Apr 29 08:47:04 EDT 2010


 

I have a problem with the STONITH plugin external/sbd.

I configured the system according to the instructions at
http://www.linux-ha.org/wiki/SBD_Fencing. The device I use is
configured with multipath software because the disk resides on a
storage system.

I created a resource on my cluster using the clone directive.

But when I try to start the resource, I get these errors:

 

from ha-log file:

 

Apr 29 14:37:51 clover-h stonithd: [16811]: info: external_run_cmd: Calling '/usr/lib64/stonith/plugins/external/sbd status' returned 256

Apr 29 14:37:51 clover-h stonithd: [16811]: CRIT: external_status: 'sbd status' failed with rc 256

Apr 29 14:37:51 clover-h stonithd: [10615]: WARN: start stonith_external_sbd_LOCK_LUN:0 failed, because its hostlist is empty
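
A note on the value 256 in the lines above: it looks like a raw POSIX wait status rather than a plain exit code. That is an assumption about how stonithd logs the result of the external plugin, not something the log itself states. Under that assumption, a minimal Python sketch (the helper name decode_wait_status is made up for illustration) decodes it as follows:

```python
import os


def decode_wait_status(status: int) -> str:
    """Describe a raw POSIX wait status such as the 256 seen in the log.

    Assumption: the logged number is the unmodified status returned by
    wait()/pclose(), not the exit code of the plugin script itself.
    """
    if os.WIFEXITED(status):
        # Normal termination: the exit code sits in the high byte.
        return f"exited with code {os.WEXITSTATUS(status)}"
    if os.WIFSIGNALED(status):
        # Terminated by a signal: the signal number sits in the low bits.
        return f"killed by signal {os.WTERMSIG(status)}"
    return f"unrecognized status {status}"


if __name__ == "__main__":
    print(decode_wait_status(256))
```

If this reading is right, 'sbd status' simply exited with code 1, which would agree with the rc=1 that crm_mon reports for the start operation.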

 

from crm_verify:

 

crm_verify[18607]: 2010/04/29_14:39:27 info: main: =#=#=#=#= Getting XML
=#=#=#=#=

crm_verify[18607]: 2010/04/29_14:39:27 info: main: Reading XML from: live
cluster

crm_verify[18607]: 2010/04/29_14:39:27 notice: unpack_config: On loss of CCM
Quorum: Ignore

crm_verify[18607]: 2010/04/29_14:39:27 info: unpack_config: Node scores:
'red' = -INFINITY, 'yellow' = 0, 'green' = 0

crm_verify[18607]: 2010/04/29_14:39:27 info: determine_online_status: Node
clover-a.rsr.rupar.puglia.it is online

crm_verify[18607]: 2010/04/29_14:39:27 WARN: unpack_rsc_op: Processing
failed op stonith_external_sbd_LOCK_LUN:1_start_0 on
clover-a.rsr.rupar.puglia.it: unknown error (1)

crm_verify[18607]: 2010/04/29_14:39:27 info: find_clone: Internally renamed
stonith_external_sbd_LOCK_LUN:0 on clover-a.rsr.rupar.puglia.it to
stonith_external_sbd_LOCK_LUN:2 (ORPHAN)

crm_verify[18607]: 2010/04/29_14:39:27 info: determine_online_status: Node
clover-h.rsr.rupar.puglia.it is online

crm_verify[18607]: 2010/04/29_14:39:27 WARN: unpack_rsc_op: Processing
failed op stonith_external_sbd_LOCK_LUN:0_start_0 on
clover-h.rsr.rupar.puglia.it: unknown error (1)

crm_verify[18607]: 2010/04/29_14:39:27 notice: clone_print:  Master/Slave
Set: ms_drbd_1

crm_verify[18607]: 2010/04/29_14:39:27 notice: short_print:      Stopped: [
res_drbd_1:0 res_drbd_1:1 ]

crm_verify[18607]: 2010/04/29_14:39:27 notice: native_print:
res_Filesystem_TEST        (ocf::heartbeat:Filesystem):    Stopped

crm_verify[18607]: 2010/04/29_14:39:27 notice: native_print:
res_IPaddr2_ip_clover      (ocf::heartbeat:IPaddr2):       Stopped

crm_verify[18607]: 2010/04/29_14:39:27 notice: clone_print:  Clone Set:
cl_external_sbd_1

crm_verify[18607]: 2010/04/29_14:39:27 notice: native_print:
stonith_external_sbd_LOCK_LUN:0       (stonith:external/sbd): Started
clover-h.rsr.rupar.puglia.it FAILED

crm_verify[18607]: 2010/04/29_14:39:27 notice: native_print:
stonith_external_sbd_LOCK_LUN:1       (stonith:external/sbd): Started
clover-a.rsr.rupar.puglia.it FAILED

crm_verify[18607]: 2010/04/29_14:39:27 info: get_failcount:
cl_external_sbd_1 has failed 1000000 times on clover-h.rsr.rupar.puglia.it

crm_verify[18607]: 2010/04/29_14:39:27 WARN: common_apply_stickiness:
Forcing cl_external_sbd_1 away from clover-h.rsr.rupar.puglia.it after
1000000 failures (max=1000000)

crm_verify[18607]: 2010/04/29_14:39:27 info: get_failcount:
cl_external_sbd_1 has failed 1000000 times on clover-a.rsr.rupar.puglia.it

crm_verify[18607]: 2010/04/29_14:39:27 WARN: common_apply_stickiness:
Forcing cl_external_sbd_1 away from clover-a.rsr.rupar.puglia.it after
1000000 failures (max=1000000)

crm_verify[18607]: 2010/04/29_14:39:27 info: native_merge_weights:
ms_drbd_1: Rolling back scores from res_Filesystem_TEST

crm_verify[18607]: 2010/04/29_14:39:27 WARN: native_color: Resource
res_drbd_1:0 cannot run anywhere

crm_verify[18607]: 2010/04/29_14:39:27 WARN: native_color: Resource
res_drbd_1:1 cannot run anywhere

crm_verify[18607]: 2010/04/29_14:39:27 info: native_merge_weights:
ms_drbd_1: Rolling back scores from res_Filesystem_TEST

crm_verify[18607]: 2010/04/29_14:39:27 info: master_color: ms_drbd_1:
Promoted 0 instances of a possible 1 to master

crm_verify[18607]: 2010/04/29_14:39:27 info: master_color: ms_drbd_1:
Promoted 0 instances of a possible 1 to master

crm_verify[18607]: 2010/04/29_14:39:27 info: native_merge_weights:
res_Filesystem_TEST: Rolling back scores from res_IPaddr2_ip_clover

crm_verify[18607]: 2010/04/29_14:39:27 WARN: native_color: Resource
res_Filesystem_TEST cannot run anywhere

crm_verify[18607]: 2010/04/29_14:39:27 WARN: native_color: Resource
res_IPaddr2_ip_clover cannot run anywhere

crm_verify[18607]: 2010/04/29_14:39:27 WARN: native_color: Resource
stonith_external_sbd_LOCK_LUN:0 cannot run anywhere

crm_verify[18607]: 2010/04/29_14:39:27 WARN: native_color: Resource
stonith_external_sbd_LOCK_LUN:1 cannot run anywhere

crm_verify[18607]: 2010/04/29_14:39:27 notice: LogActions: Leave resource
res_drbd_1:0  (Stopped)

crm_verify[18607]: 2010/04/29_14:39:27 notice: LogActions: Leave resource
res_drbd_1:1  (Stopped)

crm_verify[18607]: 2010/04/29_14:39:27 notice: LogActions: Leave resource
res_Filesystem_TEST   (Stopped)

crm_verify[18607]: 2010/04/29_14:39:27 notice: LogActions: Leave resource
res_IPaddr2_ip_clover (Stopped)

crm_verify[18607]: 2010/04/29_14:39:27 notice: LogActions: Stop resource
stonith_external_sbd_LOCK_LUN:0        (clover-h.rsr.rupar.puglia.it)

crm_verify[18607]: 2010/04/29_14:39:27 notice: LogActions: Stop resource
stonith_external_sbd_LOCK_LUN:1        (clover-a.rsr.rupar.puglia.it)

Warnings found during check: config may not be valid
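
The "Forcing ... away ... after 1000000 failures (max=1000000)" warnings above can be read with a simplified model: once the fail count on a node reaches the migration threshold, the resource is no longer allowed to run there. The sketch below is only an illustration of that comparison under this assumption, not Pacemaker's actual code; the constant and function names are hypothetical.

```python
# Value that appears both as fail-count and as "max" in the log above.
INFINITY = 1_000_000


def is_forced_away(fail_count: int, migration_threshold: int = INFINITY) -> bool:
    """Simplified model (an assumption, not the real policy engine logic):
    a node is excluded once its fail count reaches the threshold."""
    return fail_count > 0 and fail_count >= migration_threshold


if __name__ == "__main__":
    # Both nodes report a fail count of 1000000 for cl_external_sbd_1.
    for node in ("clover-h.rsr.rupar.puglia.it", "clover-a.rsr.rupar.puglia.it"):
        print(node, is_forced_away(1_000_000))
```

With both nodes at the threshold, the clone has nowhere left to run, which matches the "cannot run anywhere" warnings. Once the underlying sbd problem is fixed, the fail count would presumably have to be cleared (for example with a resource cleanup) before another start is attempted.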

 

and from crm_mon:

 

============

Last updated: Thu Apr 29 14:39:57 2010

Stack: Heartbeat

Current DC: clover-h.rsr.rupar.puglia.it
(e39bb201-2a6f-457a-a308-be6bfe71309c) - partition with quorum

Version: 1.0.8-9881a7350d6182bae9e8e557cf20a3cc5dac3ee7

2 Nodes configured, unknown expected votes

4 Resources configured.

============

 

Online: [ clover-h.rsr.rupar.puglia.it clover-a.rsr.rupar.puglia.it ]

 

 Clone Set: cl_external_sbd_1

     stonith_external_sbd_LOCK_LUN:0    (stonith:external/sbd): Started
clover-h.rsr.rupar.puglia.it FAILED

     stonith_external_sbd_LOCK_LUN:1    (stonith:external/sbd): Started
clover-a.rsr.rupar.puglia.it FAILED

 

Operations:

* Node clover-a.rsr.rupar.puglia.it:

   stonith_external_sbd_LOCK_LUN:1: migration-threshold=1000000
fail-count=1000000

    + (24) start: rc=1 (unknown error)

* Node clover-h.rsr.rupar.puglia.it:

   stonith_external_sbd_LOCK_LUN:0: migration-threshold=1000000
fail-count=1000000

    + (25) start: rc=1 (unknown error)

 

Failed actions:

    stonith_external_sbd_LOCK_LUN:1_start_0
(node=clover-a.rsr.rupar.puglia.it, call=24, rc=1, status=complete): unknown
error

    stonith_external_sbd_LOCK_LUN:0_start_0
(node=clover-h.rsr.rupar.puglia.it, call=25, rc=1, status=complete): unknown
error

Ciao, Nicola.
