[ClusterLabs] dlm_controld 4.0.4 exits when crmd is fencing another node
Vladislav Bogdanov
bubble at hoster-ok.com
Fri Jan 22 16:57:52 UTC 2016
22.01.2016 19:28, David Teigland wrote:
> On Fri, Jan 22, 2016 at 06:59:25PM +0300, Vladislav Bogdanov wrote:
>> Hi David, list,
>>
>> recently I tried to upgrade dlm from 4.0.2 to 4.0.4 and found that it
>> no longer handles fencing of a remote node initiated by other cluster components.
>> I first noticed this during a valid fencing caused by a resource stop failure,
>> but it is easily reproduced with 'crm node fence XXX'.
>>
>> I took logs from both 4.0.2 and 4.0.4 and "normalized" (replaced timestamps)
>> the part of each that follows pacemaker initiating the fencing.
>
> There are very few commits there, and only two I could imagine being
> related. Could you try reverting them and see if that helps?
>
> 79e87eb5913f Make systemd stop dlm on corosync restart
There is no systemd on EL6, so this one is not a suspect.
> fb61984c9388 dlm_stonith: use kick_helper result
I tried reverting this one and a51b2bb ("If an error occurs unlink the
lock file and exit with status 1"), one at a time and both together,
with the same result.
So the problem seems to be somewhere deeper.
Best,
Vladislav