[ClusterLabs] Resources are stopped and started when one node rejoins

Vladislav Bogdanov bubble at hoster-ok.com
Thu Aug 31 08:34:03 EDT 2017


31.08.2017 14:53, Octavian Ciobanu wrote:
> I'm back with confirmation that DLM is what triggers the Mount
> resources to stop when the stopped/suspended node rejoins.
>
> When the DLM resource starts on the rejoining node, it tries to get a
> free journal but always grabs one that is already occupied by another
> node, which triggers a domino effect: each node jumps to the journal
> occupied by the next node, stopping and restarting the mount resources.
>
> I have plenty of DLM journals (10 allocated for a 3-node configuration),
> so there are certainly unused ones available.
>
> Is there a way to make DLM keep its journal while the node is stopped
> and reuse it when the node starts? Or a way to make it keep its active
> allocations, forcing the rejoining node to look for another unoccupied
> journal, without pushing away the nodes that currently occupy the
> journal it tries to allocate for the rejoining node?

Did you try my suggestion from the previous e-mail?

'Probably you also do not need 'ordered="true"' for your DLM clone?
Knowing what DLM is, it does not need ordering; its instances may be
safely started in parallel.'
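
If not, something like the following should drop the ordering (a rough
sketch; the exact pcs syntax may differ between versions, so verify the
result with 'pcs resource show DLM-clone' afterwards):

pcs resource update DLM-clone meta ordered=false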


>
> Best regards
>
> On Mon, Aug 28, 2017 at 6:04 PM, Octavian Ciobanu
> <coctavian1979 at gmail.com> wrote:
>
>     Thank you for the info.
>     Looking over the output of crm_simulate I've noticed the "notice"
>     messages, and with the help of the debug mode I've found this
>     sequence in the log:
>
>     Aug 28 16:23:19 [13802] node03 crm_simulate:    debug:
>     native_assign_node:      Assigning node01 to DLM:2
>     Aug 28 16:23:19 [13802] node03 crm_simulate:   notice:
>     color_instance:  Pre-allocation failed: got node01 instead of node02
>     Aug 28 16:23:19 [13802] node03 crm_simulate:     info:
>     native_deallocate:       Deallocating DLM:2 from core01
>     Aug 28 16:23:19 [13802] node03 crm_simulate:    debug:
>     native_assign_node:      Assigning core01 to DLM:3
>     Aug 28 16:23:19 [13802] node03 crm_simulate:   notice:
>     color_instance:  Pre-allocation failed: got node01 instead of node03
>     Aug 28 16:23:19 [13802] node03 crm_simulate:     info:
>     native_deallocate:       Deallocating DLM:3 from core01
>     Aug 28 16:23:19 [13802] node03 crm_simulate:    debug:
>     native_assign_node:      Assigning core01 to DLM:0
>     Aug 28 16:23:19 [13802] node03 crm_simulate:     info:
>     rsc_merge_weights:       DLM:2: Rolling back scores from iSCSI2-clone
>     Aug 28 16:23:19 [13802] node03 crm_simulate:     info:
>     rsc_merge_weights:       DLM:2: Rolling back scores from iSCSI2-clone
>     Aug 28 16:23:19 [13802] node03 crm_simulate:    debug:
>     native_assign_node:      Assigning node03 to DLM:2
>     Aug 28 16:23:19 [13802] node03 crm_simulate:     info:
>     rsc_merge_weights:       DLM:3: Rolling back scores from iSCSI2-clone
>     Aug 28 16:23:19 [13802] node03 crm_simulate:     info:
>     rsc_merge_weights:       DLM:3: Rolling back scores from iSCSI2-clone
>     Aug 28 16:23:19 [13802] node03 crm_simulate:    debug:
>     native_assign_node:      Assigning node02 to DLM:3
>
>     This suggests that the restarted node attempts to occupy a DLM
>     journal that is allocated to another node and, by doing so, triggers
>     a chain reaction leading to all resources being restarted on all
>     nodes.
>
>     I will try a different approach (based on your suggestions) to
>     starting the DLM, iSCSI and Mount resources and see if this changes
>     anything.
>
>     If you have any suggestions based on the log, they are welcome.
>
>     Thank you again for the help.
>
>     On Mon, Aug 28, 2017 at 3:53 PM, Vladislav Bogdanov
>     <bubble at hoster-ok.com> wrote:
>
>         28.08.2017 14:03, Octavian Ciobanu wrote:
>
>             Hey Vladislav,
>
>             Thank you for the info. I've tried your suggestions, but the
>             behavior is still the same. When an offline/standby node
>             rejoins the cluster, all the resources are first stopped and
>             then started. I've added the changes I've made next to your
>             suggestions; see below in the reply.
>
>
>         Logs on the DC (the node where you see logs from the pengine
>         process) should contain references to pe-input-XX.bz2 files,
>         something like "notice: Calculated transition XXXX, saving inputs
>         in /var/lib/pacemaker/pengine/pe-input-XX.bz2".
>         Locate one for which Stop actions occur.
>         You can replay them with 'crm_simulate -S -x
>         /var/lib/pacemaker/pengine/pe-input-XX.bz2' to see if it is the
>         correct one (look in the middle of the output).
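>
>         For example, a rough sketch like this (assuming the default
>         pengine directory; the grep pattern is only approximate, so
>         eyeball the matches) can list the inputs whose simulated
>         transition contains Stop actions:
>
>         for f in /var/lib/pacemaker/pengine/pe-input-*.bz2; do
>             crm_simulate -S -x "$f" | grep -q 'Stop' && echo "$f"
>         done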
>
>         After that you may add some debugging:
>         PCMK_debug=yes PCMK_logfile=./pcmk.log crm_simulate -S -x
>         /var/lib/pacemaker/pengine/pe-input-XX.bz2
>
>         That will produce a big file with all debugging messages enabled.
>
>         Try to locate a reason for restarts there.
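>
>         For example, as a starting point (just a sketch; adjust the
>         patterns to whatever messages you actually see in the file):
>
>         grep -nE 'Stop|restart|Pre-allocation' ./pcmk.log | less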
>
>         Best,
>         Vladislav
>
>         Also please look inline (maybe the info there will be enough, so
>         you won't need to debug).
>
>
>             Once again, thank you for the info.
>
>             Best regards.
>             Octavian Ciobanu
>
>             On Sat, Aug 26, 2017 at 8:17 PM, Vladislav Bogdanov
>             <bubble at hoster-ok.com> wrote:
>
>                 26.08.2017 19:36, Octavian Ciobanu wrote:
>
>                     Thank you for your reply.
>
>                     There is no reason to set locations for the resources,
>                     I think, because all the resources are configured as
>                     clones, so they are started on all nodes at the same
>                     time.
>
>
>                 You still need to colocate "upper" resources with their
>                 dependencies. Otherwise pacemaker will try to start them
>                 even if their dependencies fail. Order without colocation
>                 has very limited use (usually when resources may run on
>                 different nodes). For clones that is even more exotic.
>
>
>             I've added the colocation constraints:
>
>             pcs constraint colocation add iSCSI1-clone with DLM-clone
>             pcs constraint colocation add iSCSI2-clone with DLM-clone
>             pcs constraint colocation add iSCSI3-clone with DLM-clone
>             pcs constraint colocation add Mount1-clone with iSCSI1-clone
>             pcs constraint colocation add Mount2-clone with iSCSI2-clone
>             pcs constraint colocation add Mount4-clone with iSCSI3-clone
>
>             The result is the same ... all clones are first stopped and
>             then started, beginning with the DLM resource and ending with
>             the Mount ones.
>
>
>         Yep, that was not meant to fix your problem. Just to prevent
>         future issues.
>
>
>
>                 For your original question: ensure you have
>                 interleave=true set for all your clones. You seem to be
>                 missing it for the iSCSI ones. interleave=false (the
>                 default) is for different uses (when upper resources
>                 require all clone instances to be up).
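>
>                 For example, something like this (a sketch; the exact pcs
>                 syntax may vary between versions):
>
>                 pcs resource update iSCSI1-clone meta interleave=true
>
>                 and the same for iSCSI2-clone and iSCSI3-clone.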
>
>
>             I modified the iSCSI resources and added interleave="true",
>             and there is still no change in behavior.
>
>
>         Weird... Probably you also do not need 'ordered="true"' for your
>         DLM clone? Knowing what DLM is, it does not need ordering; its
>         instances may be safely started in parallel.
>
>
>
>                 Also, just a minor note: iSCSI resources do not actually
>                 depend on DLM; the mounts should depend on it.
>
>
>             I know, but the mount resource must know when the iSCSI
>             resource it is connected to has started, so the only solution
>             I saw was to order DLM before iSCSI and then Mount. If there
>             is another solution, a proper way to do it, can you please
>             give a reference or point me to where I can read about how to
>             do it?
>
>
>         You would want to colocate (and order) the mount with both DLM
>         and iSCSI. Multiple colocations/orders for the same resource are
>         allowed. For the mount you need DLM running and the iSCSI disk
>         connected, but you do not actually need DLM to connect the iSCSI
>         disk (so the DLM and iSCSI resources may start in parallel).
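>
>         For example, something along these lines (a sketch that reuses
>         the resource names from your configuration; adjust as needed):
>
>         pcs constraint colocation add Mount1-clone with DLM-clone
>         pcs constraint colocation add Mount1-clone with iSCSI1-clone
>         pcs constraint order DLM-clone then Mount1-clone
>         pcs constraint order iSCSI1-clone then Mount1-clone
>
>         and the same for Mount2/Mount3, dropping the 'DLM-clone then
>         iSCSIx-clone' orders so that DLM and iSCSI can start in parallel.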
>
>
>
>                     And when it comes to stickiness, I forgot to mention
>                     that it is set to 200. I also have stonith configured
>                     to use VMware ESXi.
>
>                     Best regards
>                     Octavian Ciobanu
>
>                     On Sat, Aug 26, 2017 at 6:16 PM, John Keates
>                     <john at keates.nl> wrote:
>
>                         While I am by no means a CRM/Pacemaker expert, I
>                         only see the resource primitives and the order
>                         constraints. Wouldn’t you need location and/or
>                         colocation as well as stickiness settings to
>                         prevent this from happening? What I think it might
>                         be doing is seeing the new node, then trying to
>                         move the resources (but not finding it a suitable
>                         target) and then moving them back where they came
>                         from, but fast enough for you to only see it as a
>                         restart.
>
>                         If you run crm_resource -P, it should also restart
>                         all resources, but put them in the preferred spot.
>                         If they end up in the same place, you probably
>                         didn’t put any weighting in the config or have
>                         stickiness set to INF.
>
>                         Kind regards,
>
>                         John Keates
>
>                             On 26 Aug 2017, at 14:23, Octavian Ciobanu
>                             <coctavian1979 at gmail.com> wrote:
>
>                             Hello all,
>
>                             While playing with the cluster configuration I
>                             noticed a strange behavior. If I stop/standby
>                             the cluster services on one node and reboot
>                             it, when it rejoins the cluster all the
>                             resources that were started and working on the
>                             active nodes get stopped and restarted.
>
>                             My testing configuration is based on 4 nodes.
>                             One node is a storage node that makes 3 iSCSI
>                             targets available for the other nodes to use;
>                             it is not configured to join the cluster. The
>                             other three nodes are configured in a cluster
>                             using the following commands:
>
>                             pcs resource create DLM ocf:pacemaker:controld op monitor interval="60" on-fail="fence" clone meta clone-max="3" clone-node-max="1" interleave="true" ordered="true"
>                             pcs resource create iSCSI1 ocf:heartbeat:iscsi portal="10.0.0.1:3260" target="iqn.2017-08.example.com:tgt1" op start interval="0" timeout="20" op stop interval="0" timeout="20" op monitor interval="120" timeout="30" clone meta clone-max="3" clone-node-max="1"
>                             pcs resource create iSCSI2 ocf:heartbeat:iscsi portal="10.0.0.1:3260" target="iqn.2017-08.example.com:tgt2" op start interval="0" timeout="20" op stop interval="0" timeout="20" op monitor interval="120" timeout="30" clone meta clone-max="3" clone-node-max="1"
>                             pcs resource create iSCSI3 ocf:heartbeat:iscsi portal="10.0.0.1:3260" target="iqn.2017-08.example.com:tgt3" op start interval="0" timeout="20" op stop interval="0" timeout="20" op monitor interval="120" timeout="30" clone meta clone-max="3" clone-node-max="1"
>                             pcs resource create Mount1 ocf:heartbeat:Filesystem device="/dev/disk/by-label/MyCluster:Data1" directory="/mnt/data1" fstype="gfs2" options="noatime,nodiratime,rw" op monitor interval="90" on-fail="fence" clone meta clone-max="3" clone-node-max="1" interleave="true"
>                             pcs resource create Mount2 ocf:heartbeat:Filesystem device="/dev/disk/by-label/MyCluster:Data2" directory="/mnt/data2" fstype="gfs2" options="noatime,nodiratime,rw" op monitor interval="90" on-fail="fence" clone meta clone-max="3" clone-node-max="1" interleave="true"
>                             pcs resource create Mount3 ocf:heartbeat:Filesystem device="/dev/disk/by-label/MyCluster:Data3" directory="/mnt/data3" fstype="gfs2" options="noatime,nodiratime,rw" op monitor interval="90" on-fail="fence" clone meta clone-max="3" clone-node-max="1" interleave="true"
>                             pcs constraint order DLM-clone then iSCSI1-clone
>                             pcs constraint order DLM-clone then iSCSI2-clone
>                             pcs constraint order DLM-clone then iSCSI3-clone
>                             pcs constraint order iSCSI1-clone then Mount1-clone
>                             pcs constraint order iSCSI2-clone then Mount2-clone
>                             pcs constraint order iSCSI3-clone then Mount3-clone
>
>                             If I issue "pcs cluster standby node1" or "pcs
>                             cluster stop" on node 1 and after that reboot
>                             the node, then when the node comes back online
>                             (unstandby if it was put in standby mode) all
>                             the "MountX" resources get stopped on nodes 3
>                             and 4 and started again.
>
>                             Can anyone help me figure out where the
>                             mistake in my configuration is? I would like
>                             to keep the started resources running on the
>                             active nodes (and avoid stopping and starting
>                             resources).
>
>                             Thank you in advance
>                             Octavian Ciobanu
>
> _______________________________________________
> Users mailing list: Users at clusterlabs.org
> http://lists.clusterlabs.org/mailman/listinfo/users
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
>




