[Pacemaker] Cluster Volume Group is stuck
Karl Rößmann
K.Roessmann at fkf.mpg.de
Thu May 12 07:51:21 UTC 2011
Hi David,
startup-fencing is true
stonith is enabled
stonith-timeout is 60s
stonith-action is reboot
We have a Fibre Channel SAN with the multipath driver as the shared device
for the Volume Groups.
I use SBD for STONITH.
--------------- This is the SBD Setting: --------------------------
multix244:~ # sbd -d /dev/disk/by-id/scsi-3600a0b8000420d5a00001cf14dc3a9a2-part1 dump
Header version : 2
Number of slots : 255
Sector size : 512
Timeout (watchdog) : 60
Timeout (allocate) : 2
Timeout (loop) : 1
Timeout (msgwait) : 120
On a similar cluster with an iSCSI device and no multipath driver,
there is no problem.
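
For comparing the SBD timeouts in the dump above with the stonith-timeout of
60s, a rough sanity check could be sketched in Python as follows. The two
rules it applies (msgwait at least twice the watchdog timeout, and
stonith-timeout larger than msgwait) are rules of thumb that I am assuming
here; they are not confirmed in this thread. The function names
parse_sbd_timeouts and check_timeouts are hypothetical helpers, not part of
sbd or Pacemaker.

```python
import re

# The timeout lines exactly as they appear in the "sbd ... dump" output above.
SBD_DUMP = """\
Header version : 2
Number of slots : 255
Sector size : 512
Timeout (watchdog) : 60
Timeout (allocate) : 2
Timeout (loop) : 1
Timeout (msgwait) : 120
"""


def parse_sbd_timeouts(dump):
    """Return {name: seconds} for every 'Timeout (name) : N' line."""
    pattern = re.compile(r"^Timeout \((\w+)\)\s*:\s*(\d+)\s*$", re.MULTILINE)
    return {name: int(value) for name, value in pattern.findall(dump)}


def check_timeouts(timeouts, stonith_timeout):
    """Return a list of warnings based on assumed rules of thumb."""
    warnings = []
    # Assumption: msgwait should be at least twice the watchdog timeout.
    if timeouts["msgwait"] < 2 * timeouts["watchdog"]:
        warnings.append("msgwait is less than twice the watchdog timeout")
    # Assumption: the cluster must wait longer than msgwait for fencing.
    if stonith_timeout <= timeouts["msgwait"]:
        warnings.append(
            "stonith-timeout (%ds) does not exceed msgwait (%ds)"
            % (stonith_timeout, timeouts["msgwait"])
        )
    return warnings


timeouts = parse_sbd_timeouts(SBD_DUMP)
for warning in check_timeouts(timeouts, stonith_timeout=60):
    print("WARNING:", warning)
```

If those assumed rules apply to this setup, the check would flag the
stonith-timeout of 60s against the msgwait of 120s.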
karl
Quoting David Coulson <david at davidcoulson.net>:
>
>
> On 5/11/11 8:07 AM, Karl Rößmann wrote:
>> we have a three node cluster with a Cluster Volume Group vgsmet.
>>
>>
>> After powering off one Node, the Volume Group is stuck.
>> One of the ERROR messages is:
>> May 11 10:50:32 multix244 crmd: [8086]: ERROR: process_lrm_event:
>> LRM operation vgsmet:0_monitor_60000 (38) Timed Out
>> (timeout=60000ms)
>>
>>
>> If we power on the Node again the cluster recovers.
>
> Usually this is a fencing problem - How does your cluster manager
> (openais) have fencing configured?
>
> David
>
--
Karl Rößmann Tel. +49-711-689-1657
Max-Planck-Institut FKF Fax. +49-711-689-1632
Postfach 800 665
70506 Stuttgart email K.Roessmann at fkf.mpg.de