[ClusterLabs] FLoating IP failing over but not failing back with active/active LDAP (dirsrv)
Bernie Jones
bernie at securityconsulting.ltd.uk
Thu Mar 10 14:48:42 UTC 2016
A bit more info:
If, after I restart the failed dirsrv instance, I run "pcs resource cleanup
dirsrv-daemon" to clear the FAILED messages, then the failover works OK.
So it seems the cleanup is changing the resource status in some way.
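One way to see what the cleanup might be changing: after the restart, the status output lists the node as Started but still carries the old failed-action record for it. The sketch below pulls the node names out of those records so they can be compared with the Started lines. The helper name stale_failures is made up for illustration, and it assumes the "<resource>_<operation>_<interval> on <node>" layout seen in the status output quoted in this thread; it has not been checked against other Pacemaker versions.

```shell
#!/bin/sh
# Hypothetical helper (not part of pcs or Pacemaker): read crm_mon-style
# status text on stdin and print each node that still has a failed-action
# record for the given resource.
stale_failures() {
    resource="$1"
    awk -v res="$resource" '
        # Failed-action lines look like:
        #   <resource>_<operation>_<interval> on <node> ...
        # Plain resource status lines lack the "_" suffix and the "on" word.
        index($1, res "_") == 1 && $2 == "on" { print $3 }
    ' | sort -u
}
```

On a cluster node this could be run as "crm_mon -1 | stale_failures dirsrv-daemon". Any node it prints still has a failure recorded against it. With migration-threshold=1 a single recorded failure counts against a node, so this may be why that node is not considered again until "pcs resource cleanup dirsrv-daemon" is run, though that reading is an assumption rather than something confirmed here.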
From: Bernie Jones [mailto:bernie at securityconsulting.ltd.uk]
Sent: 10 March 2016 08:47
To: 'Cluster Labs - All topics related to open-source clustering welcomed'
Subject: [ClusterLabs] FLoating IP failing over but not failing back with
active/active LDAP (dirsrv)
Hi all, could you advise please?
I'm trying to configure a floating IP with an active/active deployment of
389 directory server. I don't want pacemaker to manage LDAP but just to
monitor and switch the IP as required to provide resilience. I've seen some
other similar threads and based my solution on those.
I've amended the OCF resource agent for slapd to work with 389 DS (as
dirsrv), and this tests out OK.
I've then created my resources as below:
pcs resource create dirsrv-ip ocf:heartbeat:IPaddr2 ip="192.168.26.100" \
    cidr_netmask="32" op monitor timeout="20s" interval="5s" \
    op start interval="0" timeout="20" op stop interval="0" timeout="20"
pcs resource create dirsrv-daemon ocf:heartbeat:dirsrv \
    op monitor interval="10" timeout="5" op start interval="0" timeout="5" \
    op stop interval="0" timeout="5" meta "is-managed=false"
pcs resource clone dirsrv-daemon meta globally-unique="false" \
    interleave="true" target-role="Started" "master-max=2"
pcs constraint colocation add dirsrv-daemon-clone with dirsrv-ip \
    score=INFINITY
pcs property set no-quorum-policy=ignore
pcs resource defaults migration-threshold=1
pcs property set stonith-enabled=false
On startup all looks well:
____________________________________________________________________________
Last updated: Thu Mar 10 08:28:03 2016
Last change: Thu Mar 10 08:26:14 2016
Stack: cman
Current DC: ga2.idam.com - partition with quorum
Version: 1.1.11-97629de
2 Nodes configured
3 Resources configured
Online: [ ga1.idam.com ga2.idam.com ]
dirsrv-ip (ocf::heartbeat:IPaddr2): Started ga1.idam.com
Clone Set: dirsrv-daemon-clone [dirsrv-daemon]
    dirsrv-daemon (ocf::heartbeat:dirsrv): Started ga2.idam.com (unmanaged)
    dirsrv-daemon (ocf::heartbeat:dirsrv): Started ga1.idam.com (unmanaged)
____________________________________________________________________________
Stop dirsrv on ga1:
Last updated: Thu Mar 10 08:28:43 2016
Last change: Thu Mar 10 08:26:14 2016
Stack: cman
Current DC: ga2.idam.com - partition with quorum
Version: 1.1.11-97629de
2 Nodes configured
3 Resources configured
Online: [ ga1.idam.com ga2.idam.com ]
dirsrv-ip (ocf::heartbeat:IPaddr2): Started ga2.idam.com
Clone Set: dirsrv-daemon-clone [dirsrv-daemon]
    dirsrv-daemon (ocf::heartbeat:dirsrv): Started ga2.idam.com (unmanaged)
    dirsrv-daemon (ocf::heartbeat:dirsrv): FAILED ga1.idam.com (unmanaged)
Failed actions:
    dirsrv-daemon_monitor_10000 on ga1.idam.com 'not running' (7): call=12, status=complete, last-rc-change='Thu Mar 10 08:28:41 2016', queued=0ms, exec=0ms
The IP fails over to ga2 OK.
____________________________________________________________________________
Restart dirsrv on ga1:
Last updated: Thu Mar 10 08:30:01 2016
Last change: Thu Mar 10 08:26:14 2016
Stack: cman
Current DC: ga2.idam.com - partition with quorum
Version: 1.1.11-97629de
2 Nodes configured
3 Resources configured
Online: [ ga1.idam.com ga2.idam.com ]
dirsrv-ip (ocf::heartbeat:IPaddr2): Started ga2.idam.com
Clone Set: dirsrv-daemon-clone [dirsrv-daemon]
    dirsrv-daemon (ocf::heartbeat:dirsrv): Started ga2.idam.com (unmanaged)
    dirsrv-daemon (ocf::heartbeat:dirsrv): Started ga1.idam.com (unmanaged)
Failed actions:
    dirsrv-daemon_monitor_10000 on ga1.idam.com 'not running' (7): call=12, status=complete, last-rc-change='Thu Mar 10 08:28:41 2016', queued=0ms, exec=0ms
____________________________________________________________________________
Stop dirsrv on ga2:
Last updated: Thu Mar 10 08:31:14 2016
Last change: Thu Mar 10 08:26:14 2016
Stack: cman
Current DC: ga2.idam.com - partition with quorum
Version: 1.1.11-97629de
2 Nodes configured
3 Resources configured
Online: [ ga1.idam.com ga2.idam.com ]
dirsrv-ip (ocf::heartbeat:IPaddr2): Started ga2.idam.com
Clone Set: dirsrv-daemon-clone [dirsrv-daemon]
    dirsrv-daemon (ocf::heartbeat:dirsrv): FAILED ga2.idam.com (unmanaged)
    dirsrv-daemon (ocf::heartbeat:dirsrv): Started ga1.idam.com (unmanaged)
Failed actions:
    dirsrv-daemon_monitor_10000 on ga2.idam.com 'not running' (7): call=11, status=complete, last-rc-change='Thu Mar 10 08:31:12 2016', queued=0ms, exec=0ms
    dirsrv-daemon_monitor_10000 on ga1.idam.com 'not running' (7): call=12, status=complete, last-rc-change='Thu Mar 10 08:28:41 2016', queued=0ms, exec=0ms
But the IP stays on the failed node (ga2).
Looking in the logs, it seems that the cluster is not aware that ga1 is
available, even though the status output shows it as Started.
If I repeat the tests with ga2 started up first, the behaviour is similar,
i.e. the IP fails over to ga1 but not back to ga2.
Many thanks,
Bernie