[Pacemaker] SLES11+HAE: Resources on a single node with two configured?

Aleksey Zholdak aleksey at zholdak.com
Fri May 7 05:06:07 EDT 2010


So, the problem is solved.

The solution is very interesting, because it is not described anywhere.

Following the advice in the article 
http://www.linux-ha.org/wiki/SBD_Fencing for multipath setups, I increased 
the timeouts when creating the partition for sbd:

sles2:~ # sbd -d /dev/mapper/SBD dump
Header version     : 2
Number of slots    : 255
Sector size        : 512
Timeout (watchdog) : 120
Timeout (allocate) : 2
Timeout (loop)     : 10
Timeout (msgwait)  : 180

180 and 120 secs are suitable for me.

But... when I created the sbd fencing primitive, I FORGOT to increase its 
timeout (stonith-timeout)!!! The primitive should have been created like this:

crm configure primitive sbd_fense stonith:external/sbd params 
sbd_device="/dev/mapper/SBD" stonith-timeout="240s"

Without the increased timeout, openais waits for stonith only about 60 secs 
(the default stonith-timeout value for external/sbd) and then kills it:

sles2 stonithd: [5819]: WARN: external_sbd_fense:0_1 process (PID 8688) 
timed out (try 1).  Killing with signal SIGTERM (15).
sles2 stonithd: [8688]: info: external_run_cmd: Calling 
'/usr/lib64/stonith/plugins/external/sbd reset sles1' returned 15
sles2 stonithd: [8688]: CRIT: external_reset_req: 'sbd reset' for host 
sles1 failed with rc 15
sles2 stonithd: [5819]: debug: Child process external_sbd_fense:0_1 [8688] 
exited, its exit code: 5 when signo=0
...
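
As a rough sketch of the check that would have caught this: read msgwait 
from the `sbd dump` output and make sure stonith-timeout is comfortably 
larger. The helper names below are hypothetical, and the 20% default margin 
is an assumption, not something taken from the article (I simply used 240s 
for a msgwait of 180s). The sample text is the dump shown earlier; on a live 
system one would pipe `sbd -d /dev/mapper/SBD dump` into the first function.

```shell
#!/bin/sh
# Hypothetical helper: extract "Timeout (msgwait)" from `sbd dump` output on stdin.
msgwait_from_dump() {
    awk -F: '/^Timeout \(msgwait\)/ { gsub(/[ \t]/, "", $2); print $2 }'
}

# Hypothetical helper: msgwait plus a percentage margin.
# The default margin of 20 is an assumption; pass a second argument to change it.
suggest_stonith_timeout() {
    msgwait=$1
    margin_pct=${2:-20}
    echo $(( msgwait + msgwait * margin_pct / 100 ))
}

# Sample lines copied from the dump above.
sample='Timeout (watchdog) : 120
Timeout (msgwait)  : 180'

msgwait=$(printf '%s\n' "$sample" | msgwait_from_dump)
echo "msgwait=${msgwait}s, suggested stonith-timeout=$(suggest_stonith_timeout "$msgwait")s"
```

Any stonith-timeout at or below msgwait (such as the 60 secs default seen 
here) means the fencing call is killed before sbd can confirm the reset.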

Great fun, right?

I really hope that my experience will be useful to someone, and that the 
author of the article will add recommendations about timeouts for creating 
the sbd primitive.

P.S. The firewall also prevents resources from starting. Can someone explain 
to me how to run the resources with the firewall enabled?

-- 

Best regards,
Aleksey ZHOLDAK

ICQ   150074
MSN   aleksey at zholdak.com
Skype aleksey.zholdak
Voice +380442388043



