[ClusterLabs] Help required for N+1 redundancy setup

Nikhil Utane nikhil.subscribed at gmail.com
Wed Mar 16 10:22:15 UTC 2016


I see the following info gets updated in the CIB. Can I use this, or is
there a better way?

<node_state id="node1" uname="node1" in_ccm="false" crmd="offline"
crm-debug-origin="peer_update_callback" join="down" expected="member">
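
For reference, here is how I am thinking of querying it from the newly
active node (untested sketch using cibadmin's XPath support; the
attribute names are taken from the snippet above):

    # List node_state entries whose crmd is offline:
    cibadmin --query --xpath "//node_state[@crmd='offline']"
    # Or extract just the node names:
    cibadmin --query --xpath "//node_state[@crmd='offline']" \
        | grep -o 'uname="[^"]*"'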

On Wed, Mar 16, 2016 at 12:40 PM, Nikhil Utane <nikhil.subscribed at gmail.com>
wrote:

> Hi Ken,
>
> Sorry about the long delay. This activity was deprioritized but is now
> back on track.
>
> One part of my question is still unanswered: on the newly active node,
> how do I find out which node went down?
> Is anything updated in the status section that can be read to figure
> this out?
>
> Thanks.
> Nikhil
>
> On Sat, Jan 9, 2016 at 3:31 AM, Ken Gaillot <kgaillot at redhat.com> wrote:
>
>> On 01/08/2016 11:13 AM, Nikhil Utane wrote:
>> >> I think stickiness will do what you want here. Set a stickiness higher
>> >> than the original node's preference, and the resource will want to stay
>> >> where it is.
>> >
>> > Not sure I understand this. Stickiness will ensure that resources
>> > don't move back when the original node comes back up, won't it?
>> > But in my case, I want the newly standby node to become the backup
>> > node for all other nodes, i.e. it should now be able to run all my
>> > resource groups, albeit with a lower score. How do I achieve that?
>>
>> Oh right. I forgot to ask whether you had an opt-out
>> (symmetric-cluster=true, the default) or opt-in
>> (symmetric-cluster=false) cluster. If you're opt-out, every node can run
>> every resource unless you give it a negative preference.
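>>
>> For example, with the crm shell (a sketch; the resource and node names
>> are from your config, the score is arbitrary):
>>
>>   # Opt-out is the default; stated explicitly:
>>   crm configure property symmetric-cluster=true
>>   # Negative preference: discourage MyGroup1 from node6
>>   # (use -inf: instead to ban it outright):
>>   crm configure location MyGroup1-avoid-node6 MyGroup1 -1000: node6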
>>
>> Partly it depends on whether there is a good reason to give each
>> instance a "home" node. Often, there's not. If you just want to balance
>> resources across nodes, the cluster will do that by default.
>>
>> If you prefer to put certain resources on certain nodes because the
>> resources require more physical resources (RAM/CPU/whatever), you can
>> set node attributes for that and use rules to set node preferences.
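>>
>> For instance (the attribute name hw-class is made up):
>>
>>   # Mark the bigger machines with a node attribute:
>>   crm_attribute --type nodes --node node1 --name hw-class --update big
>>   # Prefer them for a demanding resource via a rule:
>>   crm configure location prefer-big MyGroup1 rule 500: hw-class eq big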
>>
>> Either way, you can decide whether you want stickiness with it.
>>
>> > Also, can you answer: how do I get the names of the node that goes
>> > active and of the node that goes down inside the OCF agent? Do I
>> > need to use notifications, or is a simpler alternative available?
>> > Thanks.
>> >
>> >
>> > On Fri, Jan 8, 2016 at 9:30 PM, Ken Gaillot <kgaillot at redhat.com>
>> > wrote:
>> >
>> >> On 01/08/2016 06:55 AM, Nikhil Utane wrote:
>> >>> Would like to validate my final config.
>> >>>
>> >>> As I mentioned earlier, I will have (up to) 5 active servers and 1
>> >>> standby server.
>> >>> The standby server should take over the role of an active that went
>> >>> down. Each active has some unique configuration that needs to be
>> >>> preserved.
>> >>>
>> >>> 1) So I will create 5 groups in total. Each group has an
>> >>> ocf:heartbeat:IPaddr2 resource (for the virtual IP) and my custom
>> >>> resource.
>> >>> 2) The virtual IP needs to be read inside my custom OCF agent, so I
>> >>> will make use of an attribute reference and point to the value of
>> >>> IPaddr2 inside my custom resource to avoid duplication.
>> >>> 3) I will then configure location constraints to run each group
>> >>> resource on its active node with a higher score and on the standby
>> >>> node with a lower score. For example (a crm sketch follows the
>> >>> table):
>> >>> Group              Node            Score
>> >>> ---------------------------------------------
>> >>> MyGroup1        node1           500
>> >>> MyGroup1        node6           0
>> >>>
>> >>> MyGroup2        node2           500
>> >>> MyGroup2        node6           0
>> >>> ..
>> >>> MyGroup5        node5           500
>> >>> MyGroup5        node6           0
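>> >>>
>> >>> In crm shell syntax I expect the above to look roughly like this
>> >>> (untested sketch; the IP address and the custom agent name
>> >>> ocf:mycompany:myagent are placeholders):
>> >>>
>> >>>   crm configure primitive vip1 ocf:heartbeat:IPaddr2 \
>> >>>       params ip=192.168.1.101 cidr_netmask=24
>> >>>   crm configure primitive app1 ocf:mycompany:myagent
>> >>>   crm configure group MyGroup1 vip1 app1
>> >>>   crm configure location MyGroup1-active MyGroup1 500: node1
>> >>>   crm configure location MyGroup1-standby MyGroup1 0: node6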
>> >>>
>> >>> 4) Now if, say, node1 were to go down, then the stop action on
>> >>> node1 will get called first. I haven't decided if I need to do
>> >>> anything specific here.
>> >>
>> >> To clarify, if node1 goes down intentionally (e.g. standby or stop),
>> >> then all resources on it will be stopped first. But if node1 becomes
>> >> unavailable (e.g. crash or communication outage), it will get fenced.
>> >>
>> >>> 5) But when the start action on node6 gets called, then using the
>> >>> crm command-line interface, I will modify the above config to swap
>> >>> node1 and node6, i.e.:
>> >>> MyGroup1        node6           500
>> >>> MyGroup1        node1           0
>> >>>
>> >>> MyGroup2        node2           500
>> >>> MyGroup2        node1           0
>> >>>
>> >>> 6) To do the above, I need the newly active and newly standby node
>> >>> names to be passed to my start action. What's the best way to get
>> >>> this information inside my OCF agent?
>> >>
>> >> Modifying the configuration from within an agent is dangerous -- too
>> >> much potential for feedback loops between pacemaker and the agent.
>> >>
>> >> I think stickiness will do what you want here. Set a stickiness higher
>> >> than the original node's preference, and the resource will want to stay
>> >> where it is.
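>> >>
>> >> For example, a default stickiness above your 500 preference
>> >> (sketch):
>> >>
>> >>   # A resource that failed over to node6 then stays there, because
>> >>   # 1000 (stickiness) outweighs 500 (the original node's preference):
>> >>   crm configure rsc_defaults resource-stickiness=1000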
>> >>
>> >>> 7) Apart from the node name, there will be other information which
>> >>> I plan to pass by making use of node attributes. What's the best
>> >>> way to get this information inside my OCF agent? Use a crm command
>> >>> to query?
>> >>
>> >> Any of the command-line interfaces for doing so should be fine, but I'd
>> >> recommend using one of the lower-level tools (crm_attribute or
>> >> attrd_updater) so you don't have a dependency on a higher-level tool
>> >> that may not always be installed.
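>> >>
>> >> For example (the attribute name my-config is hypothetical):
>> >>
>> >>   # Permanent node attribute, queried from within the agent:
>> >>   crm_attribute --type nodes --node "$(crm_node --name)" \
>> >>       --name my-config --query --quiet
>> >>   # Or a transient attribute via the attribute daemon:
>> >>   attrd_updater --name my-config --query --node node1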
>> >>
>> >>> Thank You.
>> >>>
>> >>> On Tue, Dec 22, 2015 at 9:44 PM, Nikhil Utane
>> >>> <nikhil.subscribed at gmail.com> wrote:
>> >>>
>> >>>> Thanks, Ken, for giving all the pointers.
>> >>>> Yes, I can use service start/stop, which should be a lot simpler.
>> >>>> Thanks again. :)
>> >>>>
>> >>>> On Tue, Dec 22, 2015 at 9:29 PM, Ken Gaillot <kgaillot at redhat.com>
>> >>>> wrote:
>> >>>>
>> >>>>> On 12/22/2015 12:17 AM, Nikhil Utane wrote:
>> >>>>>> I have prepared a write-up explaining my requirements and the
>> >>>>>> current solution that I am proposing based on my understanding
>> >>>>>> so far. Kindly let me know if what I am proposing is good or
>> >>>>>> there is a better way to achieve the same.
>> >>>>>>
>> >>>>>> https://drive.google.com/file/d/0B0zPvL-Tp-JSTEJpcUFTanhsNzQ/view?usp=sharing
>> >>>>>>
>> >>>>>> Let me know if you face any issue accessing the above link.
>> >>>>>> Thanks.
>> >>>>>
>> >>>>> This looks great. Very well thought-out.
>> >>>>>
>> >>>>> One comment:
>> >>>>>
>> >>>>> "8. In the event of any failover, the standby node will get notified
>> >>>>> through an event and it will execute a script that will read the
>> >>>>> configuration specific to the node that went down (again using
>> >>>>> crm_attribute) and become active."
>> >>>>>
>> >>>>> It may not be necessary to use the notifications for this.
>> >>>>> Pacemaker will call your resource agent with the "start" action
>> >>>>> on the standby node, after ensuring it is stopped on the previous
>> >>>>> node. Hopefully the resource agent's start action has (or can
>> >>>>> have, with configuration options) all the information you need.
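>> >>>>>
>> >>>>> For example, every configured parameter is exported to the agent
>> >>>>> as an OCF_RESKEY_* environment variable, so something like this
>> >>>>> sketch (the parameter name config_id is made up) may be all you
>> >>>>> need:
>> >>>>>
>> >>>>>   my_start() {
>> >>>>>       # Pacemaker sets OCF_RESKEY_config_id from the resource's
>> >>>>>       # "config_id" parameter before calling start:
>> >>>>>       local cfg="${OCF_RESKEY_config_id}"
>> >>>>>       # ... apply the node-specific configuration here ...
>> >>>>>   }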
>> >>>>>
>> >>>>> If you do end up needing notifications, be aware that the feature
>> >>>>> will be disabled by default in the 1.1.14 release, because changes
>> >>>>> in syntax are expected in further development. You can define a
>> >>>>> compile-time constant to enable them.
>>
>>
>