[Pacemaker] [Partially SOLVED] pacemaker/dlm problems
Vladislav Bogdanov
bubble at hoster-ok.com
Mon Dec 19 12:11:50 UTC 2011
19.12.2011 14:39, Vladislav Bogdanov wrote:
> 09.12.2011 08:44, Andrew Beekhof wrote:
>> On Fri, Dec 9, 2011 at 3:16 PM, Vladislav Bogdanov <bubble at hoster-ok.com> wrote:
>>> 09.12.2011 03:11, Andrew Beekhof wrote:
>>>> On Fri, Dec 2, 2011 at 1:32 AM, Vladislav Bogdanov <bubble at hoster-ok.com> wrote:
>>>>> Hi Andrew,
>>>>>
>>>>> I investigated on my test cluster what actually happens with dlm and
>>>>> fencing.
>>>>>
>>>>> I added more debug messages to dlm dump, and also did a re-kick of nodes
>>>>> after some time.
>>>>>
>>>>> Results are that stonith history actually doesn't contain any
>>>>> information until pacemaker decides to fence node itself.
>>>>
>>>> ...
>>>>
>>>>> From my PoV that means that the call to
>>>>> crm_terminate_member_no_mainloop() does not actually schedule fencing
>>>>> operation.
>>>>
>>>> You're going to have to remind me... what does your copy of
>>>> crm_terminate_member_no_mainloop() look like?
>>>> This is with the non-cman editions of the controlds too right?
>>>
>>> Just the latest version from github. You changed some dlm_controld.pcmk
>>> functionality so that it asks stonithd for fencing results instead of
>>> the XML magic, but the call to crm_terminate_member_no_mainloop()
>>> remains the same there. But yes, that version communicates with
>>> stonithd directly too.
>>>
>>> So, the problem here is just with crm_terminate_member_no_mainloop(),
>>> which for some reason skips the actual fencing request.
>>
>> There should be some logs, either indicating that it tried, or that it failed.
>
> Nothing about fencing.
> Only messages about history requests:
>
> stonith-ng: [1905]: info: stonith_command: Processed st_fence_history
> from cluster-dlm: rc=0
>
> I even moved all the fencing code into dlm_controld to have better
> control over what it does (and to avoid rebuilding pacemaker to play
> with that code).
> dlm_tool dump prints the same line every second, and stonith-ng prints
> the history requests.
>
> A little bit odd, but I once saw a fencing request from cluster-dlm
> succeed, though only right after the node had been fenced by pacemaker.
> As a result, the node was switched off instead of rebooted.
>
> That raises one more question: is it correct to call st->cmds->fence()
> with the third parameter set to "off"?
> I think "reboot" is more consistent with the rest of the fencing subsystem.
>
> At the same time, stonith_admin -B succeeds.
> The main difference I see is st_opt_sync_call in the latter case.
> I will try to experiment with it.
Yeeeesssss!!!
Now I see the following:
Dec 19 11:53:34 vd01-a cluster-dlm: [2474]: info:
pacemaker_terminate_member: Requesting that node 1090782474/vd01-b be fenced
Dec 19 11:53:34 vd01-a stonith-ng: [1905]: info:
initiate_remote_stonith_op: Initiating remote operation reboot for
vd01-b: 21425fc0-4311-40fa-9647-525c3f258471
Dec 19 11:53:34 vd01-a stonith-ng: [1905]: info: crm_get_peer: Node
vd01-c now has id: 1107559690
Dec 19 11:53:34 vd01-a stonith-ng: [1905]: info: stonith_command:
Processed st_query from vd01-c: rc=0
Dec 19 11:53:34 vd01-a stonith-ng: [1905]: info: crm_get_peer: Node
vd01-d now has id: 1124336906
Dec 19 11:53:34 vd01-a stonith-ng: [1905]: info: stonith_command:
Processed st_query from vd01-d: rc=0
Dec 19 11:53:34 vd01-a stonith-ng: [1905]: info: stonith_command:
Processed st_query from vd01-a: rc=0
Dec 19 11:53:34 vd01-a stonith-ng: [1905]: info: call_remote_stonith:
Requesting that vd01-c perform op reboot vd01-b
Dec 19 11:53:34 vd01-a stonith-ng: [1905]: info: crm_get_peer: Node
vd01-b now has id: 1090782474
...
Dec 19 11:53:40 vd01-a stonith-ng: [1905]: info: stonith_command:
Processed st_fence_history from cluster-dlm: rc=0
Dec 19 11:53:40 vd01-a crmd: [1910]: info: tengine_stonith_notify: Peer
vd01-b was terminated (reboot) by vd01-c for vd01-a
(ref=21425fc0-4311-40fa-9647-525c3f258471): OK
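For reference, what cluster-dlm calls now is roughly the following, i.e. a
synchronous request with "reboot" instead of "off". This is only a sketch;
the exact fence() signature, constants and return codes may differ between
pacemaker versions:

    #include <crm/stonith-ng.h>

    /* Sketch: synchronous "reboot" request, similar to what
     * stonith_admin -B does. Signatures may vary between versions. */
    static int request_node_reboot(const char *target)
    {
        int rc;
        stonith_t *st = stonith_api_new();

        if (st == NULL) {
            return -1;
        }

        rc = st->cmds->connect(st, "cluster-dlm", NULL);
        if (rc == stonith_ok) {
            /* st_opt_sync_call makes the call block until stonithd
             * reports the result; "reboot" instead of "off";
             * 120s timeout (assumed value). */
            rc = st->cmds->fence(st, st_opt_sync_call, target, "reboot", 120);
            st->cmds->disconnect(st);
        }

        stonith_api_delete(st);
        return rc;
    }
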
But then I see a minor issue: the node is marked to be fenced again:
Dec 19 11:53:40 vd01-a pengine: [1909]: WARN: pe_fence_node: Node vd01-b
will be fenced because it is un-expectedly down
...
Dec 19 11:53:40 vd01-a pengine: [1909]: WARN: stage6: Scheduling Node
vd01-b for STONITH
...
Dec 19 11:53:40 vd01-a crmd: [1910]: info: te_fence_node: Executing
reboot fencing operation (249) on vd01-b (timeout=60000)
...
Dec 19 11:53:40 vd01-a stonith-ng: [1905]: info: call_remote_stonith:
Requesting that vd01-c perform op reboot vd01-b
And so on.
I can't investigate this one in more depth, because I use fence_xvm in
this testing cluster, and it has issues when running more than one
stonith resource on a node. Also, my RA (in the cluster where this testing
cluster runs) undefines the VM after a failure, so fence_xvm does not see
the fencing victim in qpid and is unable to fence it again.
Maybe it is possible to check whether the node was just fenced and skip the
redundant fencing?
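Something along these lines, perhaps. Again just a sketch: the
stonith_history_t field names and the st_done constant are taken from the
stonith-ng.h I have here and may differ in other versions:

    #include <time.h>
    #include <crm/stonith-ng.h>

    /* Sketch: return non-zero if stonithd's history shows a successful
     * fencing of 'target' that completed within the last 'window' seconds.
     * Field names (state, completed, next) are assumptions based on my
     * copy of stonith-ng.h. */
    static int recently_fenced(stonith_t *st, const char *target, time_t window)
    {
        int found = 0;
        stonith_history_t *history = NULL;
        stonith_history_t *hp = NULL;

        if (st->cmds->history(st, st_opt_sync_call, target,
                              &history, 120) != stonith_ok) {
            return 0;
        }

        for (hp = history; hp != NULL; hp = hp->next) {
            if (hp->state == st_done
                && hp->completed + window >= time(NULL)) {
                found = 1;
                break;
            }
        }

        return found;
    }
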
Vladislav