[Pacemaker] Enable remote monitoring

Lars Marowsky-Bree lmb at suse.com
Thu Dec 6 06:42:26 EST 2012


On 2012-12-06T22:25:40, Andrew Beekhof <andrew at beekhof.net> wrote:

> But any failures of the nagios agents would count against the VM's
> migration-threshold.
> So if moving were the right thing to do, it would have done it already.

OK. I think this was due to me still being stuck on the workings of an
order constraint, but of course if the failures are instead attributed
to the container, this would happen automatically already. True.
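
To make the attribution idea concrete, here is a rough sketch in Python.
It is only a toy model, not Pacemaker code: the names attribute_failure
and should_move, and the resource names "nagios-check" and "vm1", are
made up for illustration. The assumption is that a failure is counted
against the container when one is set, and that a move is considered
once the count reaches migration-threshold.

```python
from collections import Counter


def attribute_failure(failcounts, resource, container=None):
    """Count one failure against the container if set, else the resource.

    Hypothetical helper: models "failures are attributed to the container".
    """
    target = container if container is not None else resource
    failcounts[target] += 1
    return target


def should_move(failcounts, resource, migration_threshold):
    """Simplified check: True once the failures reach the threshold."""
    return failcounts[resource] >= migration_threshold


# Three failed checks inside the VM all land on the VM's fail count.
failcounts = Counter()
for _ in range(3):
    attribute_failure(failcounts, "nagios-check", container="vm1")
```

With this model the nagios resource itself never accumulates failures;
only the VM does, so its migration-threshold is what decides the move.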

(Incidentally, I like "attribute" or "ascribe" better than "delegate",
because to me they better fit what's going on, if we stuck with
"delegate-failures". Just saying. ;-)

> > We already have on-fail settings. How would these play together?
> Good question. My initial thought was that it would be up to on-fail
> settings in the VM.

I'd prefer to keep that separate (as proposed below). Because if an
action of the *VM* really fails, I may want an admin to look into it
(why could the bloody hypervisor not start/stop it?), which is different
from restarting the VM if one of the resources within it needs that.

> > Would it even make sense to have on-fail="restart-container"? (Or a
> > nicer wording.)
> >
> > Hmmm. That might work. We allow a "container" to be specified as a meta
> > attribute.
> >
> > If set, on-fail would default to restart container for most actions. But
> > admins could actually modify it - say, they might want to set
> > monitor on-fail="ignore" to just get notified. And when we move forward
> > to whiteboxes, we could have start/monitor/promote/demote
> > on-fail="restart" (like now) and stop on-fail="restart-container".
> >
> > That appears reasonably neat?
> It does actually.
> I wasn't originally thinking it was necessary but it makes sense now
> that you point it out.

Yes, I think I like this too now.
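
As a rough sketch of how the defaults discussed above could resolve (in
Python, purely illustrative): effective_on_fail is a made-up helper, and
"restart-container" is only the value proposed in this thread, not an
existing setting. The non-container default is simplified to "restart"
for every action.

```python
def effective_on_fail(action, container=None, overrides=None):
    """Resolve the on-fail policy for one action (hypothetical model).

    - An explicit per-action setting from the admin always wins.
    - Otherwise, if a container is set, default to "restart-container".
    - Otherwise fall back to "restart" (simplified default).
    """
    overrides = overrides or {}
    if action in overrides:
        return overrides[action]
    if container is not None:
        return "restart-container"
    return "restart"


# Default for a resource living inside a container.
default_policy = effective_on_fail("monitor", container="vm1")

# Admin only wants to be notified about monitor failures.
notify_only = effective_on_fail(
    "monitor", container="vm1", overrides={"monitor": "ignore"}
)
```

The whitebox case would then just be a different set of explicit
settings: start/monitor/promote/demote set to "restart", with stop left
at the container default.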

Uhm. Would "container" imply ordering + colocation, or would we still
need them grouped (resource_set'ed, whatever)?

My, design is hard. ;-)


Regards,
    Lars

-- 
Architect Storage/HA
SUSE LINUX Products GmbH, GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer, HRB 21284 (AG Nürnberg)
"Experience is the name everyone gives to their mistakes." -- Oscar Wilde




