[Pacemaker] Enable remote monitoring

Thu Dec 6 23:09:09 EST 2012

On Fri, Dec 7, 2012 at 3:00 PM, Gao,Yan <ygao at suse.com> wrote:
> On 12/07/12 07:38, Andrew Beekhof wrote:
>>
>> On 06/12/2012, at 10:42 PM, Lars Marowsky-Bree <lmb at suse.com> wrote:
>>
>>> On 2012-12-06T22:25:40, Andrew Beekhof <andrew at beekhof.net> wrote:
>>>
>>>> But any failures of the nagios agents would count against the VM's
>>>> migration-threshold.
>>>> So if moving were the right thing to do, it would have done it already.
>>>
>>> OK. I think this was due to me still being stuck on the workings of an
>>> order constraint, but of course if the failures are instead attributed
>>> to the container, this would happen automatically already. True.
>>>
>>> (Incidentally, I like "attribute", "ascribe" better than "delegate"
>>> because to me, they better fit what's going on, if we stuck with
>>> "delegate-failures". Just saying. ;-)
>>
>> My use of "delegate" comes from my time with Objective-C, where it's common practice to use delegates for "I'm not going to handle X but here is something that does" style functionality.
>> Which fits nicely with what we're doing here.
>>
>> container="vm" also works though.
>>
>>>
>>>>> We already have on-fail settings. How would these play together?
>>>> Good question. My initial thought was that it would be up to on-fail
>>>> settings in the VM.
>>>
>>> I'd prefer to keep that separate (as proposed below). Because if an
>>> action of the *VM* really fails, I may want an admin to look into it
>>> (why could the bloody hypervisor not start/stop it?), which is different
>>> from restarting the VM if one of the resources within it needs that.
>>>
>>>>> Would it even make sense to have on-fail="restart-container"? (Or a
>>>>> nicer wording.)
>>>>>
>>>>> Hmmm. That might work. We allow a "container" to be specified as a meta
>>>>> attribute.
>>>>>
>>>>> If set, on-fail would default to restart container for most actions. But
>>>>> admins could actually modify it - say, they might want to set
>>>>> monitor on-fail="ignore" to just get notified. And when we move forward
>>>>> to whiteboxes, we could have start/monitor/promote/demote
>>>>> on-fail="restart" (like now) and stop on-fail="restart-container".
>>>>>
>>>>> That appears reasonably neat?
>>>> It does actually.
>>>> I wasn't originally thinking it was necessary but it makes sense now
>>>> that you point it out.
>>>
>>> Yes, I think I like this too now.
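A rough sketch of the on-fail defaults proposed above, just to make the idea concrete. This is not Pacemaker code: the function names and the "whitebox" flag are hypothetical, and treating every action as defaulting to "restart-container" simplifies the "most actions" wording.

```python
# Hypothetical model of the proposed defaults; illustrative names only.

def default_on_fail(action, has_container, whitebox=False):
    """Return the proposed default on-fail value for one action."""
    if not has_container:
        # No "container" meta attribute: the proposal changes nothing here.
        return None
    if not whitebox:
        # Simplification of "restart container for most actions".
        return "restart-container"
    # Whitebox case: only a failed stop escalates to the container.
    return "restart-container" if action == "stop" else "restart"


def effective_on_fail(action, has_container, overrides=None, whitebox=False):
    """A value set explicitly by the admin wins over the proposed default."""
    overrides = overrides or {}
    if action in overrides:
        return overrides[action]
    return default_on_fail(action, has_container, whitebox)
```

For example, effective_on_fail("monitor", True, {"monitor": "ignore"}) gives "ignore", which matches the "just get notified" case.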
>>>
>>> Uhm. Would "container" imply ordering + colocation, or would we still
>>> need them grouped (resource_set'ed, whatever)?
>>
>> Ordering: absolutely
> Would any user dislike the implied order, and instead want an
> asymmetrical or otherwise unusual one?

Conceptually it doesn't make any sense IMHO.
By definition things can't be in/on the container if the container
doesn't exist yet.
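
To illustrate the implied ordering, here is a minimal sketch that derives a "container starts first" pair for every resource carrying a "container" meta attribute. The dictionary layout and the function name are assumptions made for this example, not how the policy engine represents things.

```python
# Hypothetical illustration only; not the policy engine's data model.

def implied_orderings(resources):
    """Yield (first, then) start pairs implied by a "container" meta attribute.

    `resources` maps a resource name to its meta attributes (a dict).
    """
    for name, meta in resources.items():
        container = meta.get("container")
        if container is not None:
            # The container must be started before anything inside it.
            yield (f"{container}:start", f"{name}:start")


config = {
    "vm": {},
    "nagios-check": {"container": "vm"},
}
orderings = list(implied_orderings(config))
```

With this made-up configuration, the only pair produced is vm:start before nagios-check:start.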

The one thing we've not addressed yet is probing. That's going to be fun :)

> Although it seems that just adding a mandatory
> "container:start -> resource:start" internally should work for most
> cases, and it would simplify the configuration of the "white" container.
>
> Regards,
>   Gao,Yan
>
>
>> Colocation is less clear. I think the default is no, but David has suggested an additional meta attribute to turn it on.
>>
>>
>
> --
> Gao,Yan <ygao at suse.com>
> Software Engineer
> China Server Team, SUSE.