[ClusterLabs] How can I prevent multiple start of IPaddr 2 in an environment using fence_mpath?
飯田 雄介
iidayuus at intellilink.co.jp
Tue Apr 17 05:56:04 EDT 2018
Hi, Andrei
Thanks for your comment.
We are not assuming node-level fencing in the current environment.
I tried the power_timeout setting that you suggested.
However, fence_mpath reports the status as "off" immediately after the off action is executed:
https://github.com/ClusterLabs/fence-agents/blob/v4.0.25/fence/agents/lib/fencing.py.py#L744
Therefore, this option did not let us wait for IPaddr2 to stop.
Reading the code, I found the power_wait option.
With this option, the completion of STONITH can be delayed by a specified amount of time,
so it seems to meet our requirements.
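For reference, this is roughly what we plan to try, assuming pcs is used to manage the cluster (the delay value of 120 seconds is only a placeholder; it needs to be longer than the monitor interval plus timeout of the resources that must stop on the fenced node):

=======
# power_wait adds an unconditional delay after the fence agent issues the
# off action, so STONITH does not complete immediately.
# fenceMpath-x3650e/x3650f are the stonith resources from our configuration.
pcs stonith update fenceMpath-x3650e power_wait=120
pcs stonith update fenceMpath-x3650f power_wait=120
=======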
Thanks, Yusuke
> -----Original Message-----
> From: Users [mailto:users-bounces at clusterlabs.org] On Behalf Of Andrei
> Borzenkov
> Sent: Friday, April 06, 2018 2:04 PM
> To: users at clusterlabs.org
> Subject: Re: [ClusterLabs] How can I prevent multiple start of IPaddr 2 in an
> environment using fence_mpath?
>
> 06.04.2018 07:30, 飯田 雄介 wrote:
> > Hi, all
> > I am testing an environment that uses fence_mpath with the following settings.
> >
> > =======
> > Stack: corosync
> > Current DC: x3650f (version 1.1.17-1.el7-b36b869) - partition with quorum
> > Last updated: Fri Apr 6 13:16:20 2018
> > Last change: Thu Mar 1 18:38:02 2018 by root via cibadmin on x3650e
> >
> > 2 nodes configured
> > 13 resources configured
> >
> > Online: [ x3650e x3650f ]
> >
> > Full list of resources:
> >
> > fenceMpath-x3650e (stonith:fence_mpath): Started x3650e
> > fenceMpath-x3650f (stonith:fence_mpath): Started x3650f
> > Resource Group: grpPostgreSQLDB
> > prmFsPostgreSQLDB1 (ocf::heartbeat:Filesystem): Started x3650e
> > prmFsPostgreSQLDB2 (ocf::heartbeat:Filesystem): Started x3650e
> > prmFsPostgreSQLDB3 (ocf::heartbeat:Filesystem): Started x3650e
> > prmApPostgreSQLDB (ocf::heartbeat:pgsql): Started x3650e
> > Resource Group: grpPostgreSQLIP
> > prmIpPostgreSQLDB (ocf::heartbeat:IPaddr2): Started x3650e
> > Clone Set: clnDiskd1 [prmDiskd1]
> > Started: [ x3650e x3650f ]
> > Clone Set: clnDiskd2 [prmDiskd2]
> > Started: [ x3650e x3650f ]
> > Clone Set: clnPing [prmPing]
> > Started: [ x3650e x3650f ]
> > =======
> >
> > When split-brain occurs in this environment, x3650f executes fencing and the resources are started on x3650f.
> >
> > === view of x3650e ====
> > Stack: corosync
> > Current DC: x3650e (version 1.1.17-1.el7-b36b869) - partition WITHOUT quorum
> > Last updated: Fri Apr 6 13:16:36 2018
> > Last change: Thu Mar 1 18:38:02 2018 by root via cibadmin on x3650e
> >
> > 2 nodes configured
> > 13 resources configured
> >
> > Node x3650f: UNCLEAN (offline)
> > Online: [ x3650e ]
> >
> > Full list of resources:
> >
> > fenceMpath-x3650e (stonith:fence_mpath): Started x3650e
> > fenceMpath-x3650f (stonith:fence_mpath): Started [ x3650e x3650f ]
> > Resource Group: grpPostgreSQLDB
> > prmFsPostgreSQLDB1 (ocf::heartbeat:Filesystem): Started x3650e
> > prmFsPostgreSQLDB2 (ocf::heartbeat:Filesystem): Started x3650e
> > prmFsPostgreSQLDB3 (ocf::heartbeat:Filesystem): Started x3650e
> > prmApPostgreSQLDB (ocf::heartbeat:pgsql): Started x3650e
> > Resource Group: grpPostgreSQLIP
> > prmIpPostgreSQLDB (ocf::heartbeat:IPaddr2): Started x3650e
> > Clone Set: clnDiskd1 [prmDiskd1]
> > prmDiskd1 (ocf::pacemaker:diskd): Started x3650f (UNCLEAN)
> > Started: [ x3650e ]
> > Clone Set: clnDiskd2 [prmDiskd2]
> > prmDiskd2 (ocf::pacemaker:diskd): Started x3650f (UNCLEAN)
> > Started: [ x3650e ]
> > Clone Set: clnPing [prmPing]
> > prmPing (ocf::pacemaker:ping): Started x3650f (UNCLEAN)
> > Started: [ x3650e ]
> >
> > === view of x3650f ====
> > Stack: corosync
> > Current DC: x3650f (version 1.1.17-1.el7-b36b869) - partition WITHOUT quorum
> > Last updated: Fri Apr 6 13:16:36 2018
> > Last change: Thu Mar 1 18:38:02 2018 by root via cibadmin on x3650e
> >
> > 2 nodes configured
> > 13 resources configured
> >
> > Online: [ x3650f ]
> > OFFLINE: [ x3650e ]
> >
> > Full list of resources:
> >
> > fenceMpath-x3650e (stonith:fence_mpath): Started x3650f
> > fenceMpath-x3650f (stonith:fence_mpath): Started x3650f
> > Resource Group: grpPostgreSQLDB
> > prmFsPostgreSQLDB1 (ocf::heartbeat:Filesystem): Started x3650f
> > prmFsPostgreSQLDB2 (ocf::heartbeat:Filesystem): Started x3650f
> > prmFsPostgreSQLDB3 (ocf::heartbeat:Filesystem): Started x3650f
> > prmApPostgreSQLDB (ocf::heartbeat:pgsql): Started x3650f
> > Resource Group: grpPostgreSQLIP
> > prmIpPostgreSQLDB (ocf::heartbeat:IPaddr2): Started x3650f
> > Clone Set: clnDiskd1 [prmDiskd1]
> > Started: [ x3650f ]
> > Stopped: [ x3650e ]
> > Clone Set: clnDiskd2 [prmDiskd2]
> > Started: [ x3650f ]
> > Stopped: [ x3650e ]
> > Clone Set: clnPing [prmPing]
> > Started: [ x3650f ]
> > Stopped: [ x3650e ]
> > =======
> >
> > However, IPaddr2 on x3650e does not stop until a pgsql monitor error occurs.
> > During this time, IPaddr2 is temporarily active on both nodes.
> >
> > === view of after pgsql monitor error ===
> > Stack: corosync
> > Current DC: x3650e (version 1.1.17-1.el7-b36b869) - partition WITHOUT quorum
> > Last updated: Fri Apr 6 13:16:56 2018
> > Last change: Thu Mar 1 18:38:02 2018 by root via cibadmin on x3650e
> >
> > 2 nodes configured
> > 13 resources configured
> >
> > Node x3650f: UNCLEAN (offline)
> > Online: [ x3650e ]
> >
> > Full list of resources:
> >
> > fenceMpath-x3650e (stonith:fence_mpath): Started x3650e
> > fenceMpath-x3650f (stonith:fence_mpath): Started [ x3650e x3650f ]
> > Resource Group: grpPostgreSQLDB
> > prmFsPostgreSQLDB1 (ocf::heartbeat:Filesystem): Started x3650e
> > prmFsPostgreSQLDB2 (ocf::heartbeat:Filesystem): Started x3650e
> > prmFsPostgreSQLDB3 (ocf::heartbeat:Filesystem): Started x3650e
> > prmApPostgreSQLDB (ocf::heartbeat:pgsql): Stopped
> > Resource Group: grpPostgreSQLIP
> > prmIpPostgreSQLDB (ocf::heartbeat:IPaddr2): Stopped
> > Clone Set: clnDiskd1 [prmDiskd1]
> > prmDiskd1 (ocf::pacemaker:diskd): Started x3650f (UNCLEAN)
> > Started: [ x3650e ]
> > Clone Set: clnDiskd2 [prmDiskd2]
> > prmDiskd2 (ocf::pacemaker:diskd): Started x3650f (UNCLEAN)
> > Started: [ x3650e ]
> > Clone Set: clnPing [prmPing]
> > prmPing (ocf::pacemaker:ping): Started x3650f (UNCLEAN)
> > Started: [ x3650e ]
> >
> > Node Attributes:
> > * Node x3650e:
> > + default_ping_set : 100
> > + diskcheck_status : normal
> > + diskcheck_status_internal : normal
> >
> > Migration Summary:
> > * Node x3650e:
> > prmApPostgreSQLDB: migration-threshold=1 fail-count=1 last-failure='Fri Apr 6 13:16:39 2018'
> >
> > Failed Actions:
> > * prmApPostgreSQLDB_monitor_10000 on x3650e 'not running' (7): call=60, status=complete, exitreason='Configuration file /dbfp/pgdata/data/postgresql.conf doesn't exist',
> > last-rc-change='Fri Apr 6 13:16:39 2018', queued=0ms, exec=0ms
> > ======
> >
> > We regard this behavior as a problem.
> > Is there a way to avoid this behavior?
> >
>
>
> Use a node-level stonith agent instead of storage-based resource fencing? :)
>
> Seriously, storage fencing only ensures that the other node(s) cannot access
> the same storage and damage data through uncontrolled concurrent access.
> Otherwise, a node whose resources have been fenced off continues to run "normally".
>
> See also https://access.redhat.com/articles/3078811 for some statements
> regarding use of storage fencing.
>
> The only workaround for a two-node cluster I can think of is to artificially delay
> stonith agent completion so that it takes longer than the monitor timeout. That way,
> a node will not begin failing over resources until they have (hopefully) been stopped
> on the other node. You can probably do that with the power_timeout property.
>
> For three or more nodes, setting no-quorum-policy=stop may work, although it does not
> solve the problem of intentional fencing of a healthy node (e.g. due to a resource
> stop failure).
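> A minimal sketch, assuming pcs is used to manage the cluster:
>
> # nodes in a partition that has lost quorum stop all of their resources
> pcs property set no-quorum-policy=stop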
>
> _______________________________________________
> Users mailing list: Users at clusterlabs.org
> https://lists.clusterlabs.org/mailman/listinfo/users
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org