[ClusterLabs] Help required for N+1 redundancy setup

Fri Jan 8 17:13:27 UTC 2016

> I think stickiness will do what you want here. Set a stickiness higher
> than the original node's preference, and the resource will want to stay
> where it is.

Not sure I understand this. Stickiness will ensure that resources don't
move back when the original node comes back up, won't it?
But in my case, I want the newly standby node to become the backup node for
all other nodes, i.e. it should now be able to run all my resource groups,
albeit with a lower score. How do I achieve that?
Also, how do I get the names of the node that goes active and the node that
goes down inside the OCF agent? Do I need to use notifications, or is some
simpler alternative available?
Thanks.
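
To make the stickiness suggestion quoted above concrete, here is a rough
Python model of how a location score and stickiness could combine. It is a
simplification and not Pacemaker's actual placement algorithm: it assumes a
symmetric (opt-out) cluster where a node without a constraint scores 0, it
treats stickiness as a single per-group value, and it ignores fencing,
colocation and load balancing. The function names node_scores and place are
made up for this sketch.

```python
from typing import Dict, List, Optional


def node_scores(location: Dict[str, int], stickiness: int,
                current_node: Optional[str],
                online_nodes: List[str]) -> Dict[str, int]:
    """Score each online node for one resource group (simplified model)."""
    scores = {}
    for node in online_nodes:
        # Assumption: nodes without a location constraint default to 0.
        score = location.get(node, 0)
        # Stickiness only counts on the node where the group runs right now.
        if node == current_node:
            score += stickiness
        scores[node] = score
    return scores


def place(location: Dict[str, int], stickiness: int,
          current_node: Optional[str],
          online_nodes: List[str]) -> Optional[str]:
    """Pick the node with the highest score, or None if no node is online."""
    scores = node_scores(location, stickiness, current_node, online_nodes)
    if not scores:
        return None
    # Ties go to the current node, then to name order (arbitrary in this model).
    return max(sorted(scores), key=lambda n: (scores[n], n == current_node))


if __name__ == "__main__":
    group1 = {"node1": 500, "node6": 0}
    # MyGroup1 failed over to node6 and node1 has come back online.
    print(place(group1, 1000, "node6", ["node1", "node6"]))  # stickiness > 500
    print(place(group1, 100, "node6", ["node1", "node6"]))   # stickiness < 500
```

With a stickiness of 1000 the model keeps MyGroup1 on node6 (0 + 1000 beats
500), while with 100 it moves back to node1. Under the same assumption, the
recovered node1 scores 0 for MyGroup2 through MyGroup5, the same as node6 in
the original table, which would make it an eligible backup without rewriting
the constraints.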

On Fri, Jan 8, 2016 at 9:30 PM, Ken Gaillot <kgaillot at redhat.com> wrote:

> On 01/08/2016 06:55 AM, Nikhil Utane wrote:
> > Would like to validate my final config.
> >
> > As I mentioned earlier, I will have up to 5 active servers and 1
> > standby server.
> > The standby server should take up the role of active that went down. Each
> > active has some unique configuration that needs to be preserved.
> >
> > 1) So I will create a total of 5 groups. Each group has an
> > "ocf:heartbeat:IPaddr2" resource (for the virtual IP) and my custom
> > resource.
> > 2) The virtual IP needs to be read inside my custom OCF agent, so I will
> > make use of an attribute reference and point to the value of IPaddr2
> > inside my custom resource to avoid duplication.
> > 3) I will then configure location constraints to run each group on its
> > active node with a higher score and on the standby with a lower score.
> > E.g.
> > Group              Node            Score
> > ---------------------------------------------
> > MyGroup1        node1           500
> > MyGroup1        node6           0
> >
> > MyGroup2        node2           500
> > MyGroup2        node6           0
> > ..
> > MyGroup5        node5           500
> > MyGroup5        node6           0
> >
> > 4) Now if, say, node1 were to go down, the stop action on node1 will
> > get called first. I haven't decided if I need to do anything specific here.
>
> To clarify, if node1 goes down intentionally (e.g. standby or stop),
> then all resources on it will be stopped first. But if node1 becomes
> unavailable (e.g. crash or communication outage), it will get fenced.
>
> > 5) But when the start action on node6 gets called, then using the crm
> > command line interface, I will modify the above config to swap node1 and
> > node6, i.e.
> > MyGroup1        node6           500
> > MyGroup1        node1           0
> >
> > MyGroup2        node2           500
> > MyGroup2        node1           0
> >
> > 6) To do the above, I need the newly active and newly standby node names
> > to be passed to my start action. What's the best way to get this
> > information inside my OCF agent?
>
> Modifying the configuration from within an agent is dangerous -- too
> much potential for feedback loops between pacemaker and the agent.
>
> I think stickiness will do what you want here. Set a stickiness higher
> than the original node's preference, and the resource will want to stay
> where it is.
>
> > 7) Apart from node name, there will be other information which I plan to
> > pass by making use of node attributes. What's the best way to get this
> > information inside my OCF agent? Use crm command to query?
>
> Any of the command-line interfaces for doing so should be fine, but I'd
> recommend using one of the lower-level tools (crm_attribute or
> attrd_updater) so you don't have a dependency on a higher-level tool
> that may not always be installed.
>
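As an illustration of the lower-level route recommended above, here is a
minimal Python sketch that builds and runs a crm_attribute query for a node
attribute. Real OCF agents are often shell scripts, so treat this as a sketch
of the command line only. The attribute name "site_config" and both function
names are hypothetical, and the code assumes crm_attribute is on the PATH of
the node running it.

```python
import subprocess
from typing import List


def crm_attribute_query_argv(node: str, name: str) -> List[str]:
    """Build the argv for querying one node attribute, value only."""
    return ["crm_attribute", "--node", node, "--name", name,
            "--query", "--quiet"]


def read_node_attribute(node: str, name: str) -> str:
    """Run the query and return the value with surrounding whitespace removed.

    Raises subprocess.CalledProcessError if the command exits non-zero.
    """
    result = subprocess.run(crm_attribute_query_argv(node, name),
                            check=True, capture_output=True, text=True)
    return result.stdout.strip()


if __name__ == "__main__":
    # "site_config" is a made-up attribute name for this example.
    print(" ".join(crm_attribute_query_argv("node1", "site_config")))
```

Keeping the argv construction separate from the subprocess call makes the
command easy to inspect or log without touching the cluster.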
> > Thank You.
> >
> > On Tue, Dec 22, 2015 at 9:44 PM, Nikhil Utane
> > <nikhil.subscribed at gmail.com> wrote:
> >
> >> Thanks to you Ken for giving all the pointers.
> >> Yes, I can use service start/stop which should be a lot simpler. Thanks
> >> again. :)
> >>
> >> On Tue, Dec 22, 2015 at 9:29 PM, Ken Gaillot <kgaillot at redhat.com>
> >> wrote:
> >>
> >>> On 12/22/2015 12:17 AM, Nikhil Utane wrote:
> >>>> I have prepared a write-up explaining my requirements and the current
> >>>> solution that I am proposing based on my understanding so far.
> >>>> Kindly let me know if what I am proposing is good or if there is a
> >>>> better way to achieve the same.
> >>>>
> >>>>
> >>>> https://drive.google.com/file/d/0B0zPvL-Tp-JSTEJpcUFTanhsNzQ/view?usp=sharing
> >>>>
> >>>> Let me know if you face any issue in accessing the above link. Thanks.
> >>>
> >>> This looks great. Very well thought-out.
> >>>
> >>> One comment:
> >>>
> >>> "8. In the event of any failover, the standby node will get notified
> >>> through an event and it will execute a script that will read the
> >>> configuration specific to the node that went down (again using
> >>> crm_attribute) and become active."
> >>>
> >>> It may not be necessary to use the notifications for this. Pacemaker
> >>> will call your resource agent with the "start" action on the standby
> >>> node, after ensuring it is stopped on the previous node. Hopefully the
> >>> resource agent's start action has (or can have, with configuration
> >>> options) all the information you need.
> >>>
> >>> If you do end up needing notifications, be aware that the feature will
> >>> be disabled by default in the 1.1.14 release, because changes in syntax
> >>> are expected in further development. You can define a compile-time
> >>> constant to enable them.
> >>>