[Pacemaker] CMAN integration questions

Vladislav Bogdanov bubble at hoster-ok.com
Thu Mar 24 04:08:36 EDT 2011


24.03.2011 09:47, Andrew Beekhof wrote:
> On Wed, Mar 23, 2011 at 1:56 PM, Vladislav Bogdanov
> <bubble at hoster-ok.com> wrote:
>> Hi Andrew,
>>
>> 23.12.2010 14:14, Andrew Beekhof wrote:
>> ...
>>>> Especially I need to understand how pacemaker integrates with cman's
>>>> fencing/dlm subsystem:
>>>> *) Do I need to configure fencing in both cman and pacemaker?
>>>
>>> No.  Just in Pacemaker.
>>> fenced spins waiting for Pacemaker to make an API call that tells it
>>> that fencing completed, at which point the dlm can continue.
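>>> For illustration only, in crm shell syntax - a rough sketch of what
>>> "just in Pacemaker" can look like; the agent, address and
>>> credentials below are made-up placeholders for whatever fence
>>> hardware is actually present:
>>> ============
>>> primitive st-vd01-b stonith:fence_ipmilan \
>>>         params pcmk_host_list="vd01-b" ipaddr="192.0.2.165" \
>>>                login="admin" passwd="secret" lanplus="1" \
>>>         op monitor interval="60s"
>>> # keep the fencing device off the node it is meant to kill
>>> location l-st-vd01-b st-vd01-b -inf: vd01-b
>>> ============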
>>
>> It doesn't seem to be enough even with c6a01b02950b:
> 
> With just that patch or everything before it too?

Everything.

> 
>> When I killall -9 corosync on one node (vd01-b, cman id 2), which by
>> chance was the DC, I get the following in the log on the will-be-new
>> DC (vd01-d), which again by chance runs the stonith resource for
>> vd01-b (only relevant log lines):
>> ============
>> Mar 23 10:08:49 vd01-d corosync[1630]:   [TOTEM ] A processor failed,
>> forming new configuration.
>> Mar 23 10:09:01 vd01-d kernel: dlm: closing connection to node 2
>> Mar 23 10:09:01 vd01-d crmd: [1875]: info: cman_event_callback:
>> Membership 1582268: quorum retained
>> Mar 23 10:09:01 vd01-d crmd: [1875]: info: ais_status_callback: status:
>> vd01-b is now lost (was member)
>> Mar 23 10:09:01 vd01-d crmd: [1875]: info: crm_update_peer: Node vd01-b:
>> id=2 state=lost (new) addr=(null) votes=0 born=1582212 seen=1582264
>> proc=00000000000000000000000000111312
>> Mar 23 10:09:01 vd01-d corosync[1630]:   [CLM   ] Members Left:
>> Mar 23 10:09:01 vd01-d crmd: [1875]: WARN: check_dead_member: Our DC
>> node (vd01-b) left the cluster
>> Mar 23 10:09:01 vd01-d corosync[1630]:   [CLM   ] #011r(0) ip(10.5.4.65)
>> Mar 23 10:09:01 vd01-d crmd: [1875]: info: send_ais_text: Peer
>> overloaded or membership in flux: Re-sending message (Attempt 1 of 20)
>> Mar 23 10:09:01 vd01-d corosync[1630]:   [QUORUM] Members[15]: 1 3 4 5 6
>> 7 8 9 10 11 12 13 14 15 16
>> Mar 23 10:09:02 vd01-d corosync[1630]:   [MAIN  ] Completed service
>> synchronization, ready to provide service.
>> Mar 23 10:09:02 vd01-d fenced[1688]: fencing deferred to vd01-a
>> Mar 23 10:09:02 vd01-d crmd: [1875]: info: update_dc: Unset DC vd01-b
>> ============
>>
>> At this time fenced (on vd01-a, which has cman id 1 and is the
>> fencing domain master) tries to kill that node but fails:
>> ============
>> Mar 23 10:09:02 vd01-a fenced[1748]: fencing node vd01-b
>> Mar 23 10:09:02 vd01-a fenced[1748]: fence vd01-b dev 0.0 agent none
>> result: error no method
>> Mar 23 10:09:02 vd01-a fenced[1748]: fence vd01-b failed
>> Mar 23 10:09:05 vd01-a fenced[1748]: fencing node vd01-b
>> Mar 23 10:09:05 vd01-a fenced[1748]: fence vd01-b dev 0.0 agent none
>> result: error no method
>> Mar 23 10:09:05 vd01-a fenced[1748]: fence vd01-b failed
>> Mar 23 10:09:08 vd01-a fenced[1748]: fencing node vd01-b
>> Mar 23 10:09:08 vd01-a fenced[1748]: fence vd01-b dev 0.0 agent none
>> result: error no method
>> Mar 23 10:09:08 vd01-a fenced[1748]: fence vd01-b failed
>> ============
>> All DLM-related stuff is blocked.
>>
>> After one minute vd01-d takes over the DC role.
>> ============
>> Mar 23 10:10:03 vd01-d crmd: [1875]: info: update_dc: Set DC to vd01-d
>> (3.0.5)
>> ============
>> After that, all monitor operations on resources which depend on the
>> DLM (LVM, GFS) fail with timeouts, all dependent resources are then
>> stopped, and the cluster is no longer highly available.
>>
>> And it takes almost one more minute before pacemaker decides to
>> stonith vd01-b:
>> ============
>> Mar 23 10:10:54 vd01-d crmd: [1875]: WARN: match_down_event: No match
>> for shutdown action on vd01-b
>> Mar 23 10:10:54 vd01-d crmd: [1875]: info: te_update_diff:
>> Stonith/shutdown of vd01-b not matched
>> Mar 23 10:10:55 vd01-d pengine: [1874]: WARN: pe_fence_node: Node vd01-b
>> will be fenced because it is un-expectedly down
>> Mar 23 10:10:55 vd01-d pengine: [1874]: WARN: determine_online_status:
>> Node vd01-b is unclean
>> ============
>>
>> and one minute later vd01-b is finally fenced.
>> ============
>> Mar 23 10:12:17 vd01-a crmd: [1935]: info: tengine_stonith_notify: Peer
>> vd01-b was terminated (reboot) by vd01-d for vd01-d
>> (ref=05cd139e-585d-452e-a22d-0ef188a64d81): OK
>> Mar 23 10:12:17 vd01-a crmd: [1935]: notice: tengine_stonith_notify:
>> Notified CMAN that 'vd01-b' is now fenced
>> Mar 23 10:12:17 vd01-a crmd: [1935]: notice: tengine_stonith_notify:
>> Confirmed CMAN fencing event for 'vd01-b'
>> Mar 23 10:12:17 vd01-a fenced[1748]: fence vd01-b overridden by
>> administrator intervention
>> ============
>>
>> Overall it took (from 10:08:49 to 10:12:17) three and a half minutes
>> to fence the failed node.
>> So, for this kind of failure (a crash of corosync) it could be much
>> safer to duplicate fencing in both cman and pacemaker, because that
>> would take only 15-20 seconds to do the same. I'll check this a bit
>> later, as I need to configure fencing in cman first, and also check
>> the case where the fencing domain master fails.
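>> Roughly, I mean something like this in cluster.conf (an untested
>> sketch; the agent, address and credentials are made-up placeholders):
>> ============
>>   <clusternode name="vd01-b" nodeid="2">
>>     <fence>
>>       <method name="ipmi">
>>         <device name="ipmi-vd01-b"/>
>>       </method>
>>     </fence>
>>   </clusternode>
>>   ...
>>   <fencedevices>
>>     <fencedevice name="ipmi-vd01-b" agent="fence_ipmilan"
>>                  ipaddr="192.0.2.165" login="admin" passwd="secret"/>
>>   </fencedevices>
>> ============
>> That way fenced could act on its own instead of looping with "error
>> no method".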
>> An alternative could be for fenced to ask pacemaker to fence the
>> failed node (is it done that way? see the sketch below), but this
>> will not help much if the DC fails (my case), because the election of
>> a new DC takes some time too and (I assume) pacemaker will refuse to
>> do fencing without a DC. And this time is enough for monitor ops to
>> fail (yes, I can configure bigger timeouts, but I generally want the
>> cluster to be as smart as possible).
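>> If such a redirect exists, I imagine it as a dummy fence method in
>> cluster.conf that just hands the request over to pacemaker, something
>> like the following sketch (the agent name fence_pcmk is an assumption
>> on my part, I have not tried this):
>> ============
>>   <clusternode name="vd01-b" nodeid="2">
>>     <fence>
>>       <method name="pcmk-redirect">
>>         <device name="pcmk" port="vd01-b"/>
>>       </method>
>>     </fence>
>>   </clusternode>
>>   ...
>>   <fencedevices>
>>     <fencedevice name="pcmk" agent="fence_pcmk"/>
>>   </fencedevices>
>> ============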
>>
>> Would you please comment on this?
>>
>> Best,
>> Vladislav
>>




