[Pacemaker] Managing Virtual Machine's resource

Lon Hohberger lhh at redhat.com
Fri May 16 14:52:16 EDT 2008


On Fri, 2008-05-16 at 20:15 +0200, Lars Marowsky-Bree wrote:

> That's all just meta-data, right?

All of those were metadata differences.


> monitor is _not_ 1:1 the LSB status. That's exactly why we're not using
> status.  ;-)
> http://www.linux-foundation.org/spec/refspecs/LSB_3.1.0/LSB-Core-generic/LSB-Core-generic/iniscrptact.html

Ah, ok.


> > pacemaker:
> >  * promote / demote resource operations
> >  * UUIDs used to identify resource instances (I like this better than
> > what we do with type:primary_attr in rgmanager)
> 
> Yeah, well, the UUIDs are not the grandest idea we ever had - nowadays
> at least the GUI tries to generate a shorter unique id w/o the full
> cumbersomeness of UUIDs.

It's not too bad, though.  For what it's worth, in rgmanager we used an
XML-ish pairing:

   OCF_RESOURCE_INSTANCE="resource_class:primary_attribute"

In most resource agents, we have a 'name' attribute, and we use that as
the 'primary' attribute.  Generally speaking, the 'primary' attribute is
defined to be unique across a given resource type; that is, the same
name may appear again in a different type's "resource-specific"
namespace. Ex:

   OCF_RESOURCE_INSTANCE="filesystem:Apache Service"
   OCF_RESOURCE_INSTANCE="script:Apache Service"

It is a bit more readable than UUIDs, but requires hints from the
metadata, whereas UUIDs do not.
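To make the pairing concrete, here is a minimal sketch (not rgmanager's
actual code) of how an agent could split OCF_RESOURCE_INSTANCE into its
resource-type and primary-attribute halves using plain shell parameter
expansion:

```shell
# Sketch only: split "class:primary_attribute" on the first ':'.
OCF_RESOURCE_INSTANCE="filesystem:Apache Service"

rsc_type="${OCF_RESOURCE_INSTANCE%%:*}"   # text before the first ':'
rsc_name="${OCF_RESOURCE_INSTANCE#*:}"    # text after the first ':'

echo "type=$rsc_type name=$rsc_name"
# prints: type=filesystem name=Apache Service
```

Note the split is on the *first* colon, so the primary attribute itself
may safely contain spaces (as in the examples above).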

> > General:
> >  * resource migrate is likely done differently; not sure though (maybe
> > you can tell me?):
> >     <resource-agent> migrate <target_host_name>


> The migrate_from also is our way of checking whether the migration
> succeeded; I guess in your case you then run a monitor/status on the
> target?

Correct.  Basically, we tell the target to start looking for the VM and
then start migrating.  Not the best thing ever.  Like UUIDs - easy, but
ugly ;)
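The flow described above can be sketched roughly as follows. This is
not rgmanager's actual code; the helper functions are hypothetical
stand-ins (here just echoes) for the real cluster calls:

```shell
# Sketch of the rgmanager-style migration flow, with stub helpers
# standing in for the real cluster/hypervisor operations.
notify_target()    { echo "target $1: watch for $2"; }   # stub
start_migration()  { echo "migrating $2 to $1"; }        # stub
status_on_target() { echo "target $1: $2 running"; }     # stub

migrate_vm() {
    target="$1"; vm="$2"
    notify_target "$target" "$vm"      # 1. tell the target to look for the VM
    start_migration "$target" "$vm"    # 2. begin the live migration
    status_on_target "$target" "$vm"   # 3. verify via monitor/status on target
}

migrate_vm node2 webvm
```

Step 3 is the monitor/status check on the target that takes the place of
a dedicated migrate_from operation.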


> > There will be more that I will come across, no doubt.  Those are just
> > the ones on the surface.  I do not believe any of them are hard to deal
> > with.
> 
> Right. I was in particular interested in understanding those differences
> which affect the RA API, as that could possibly affect the usability of
> RAs written for RHCS vs those written for ours. I think it's probably a
> good idea to find some time to sit down and chat how to resolve these.

> I've got a presentation from last year's BrainShare on what our scripts
> do, that should be a usable starting point. Not much has changed since.

Linky! :)


> A compatible divergence can't possibly be a diverge ;-)

I meant diverged from the original spec itself - not specifically from
each other.

> > Certainly, pacemaker knows all of the above!
> 
> Right, of course. The external/xen STONITH script which we already have
> could likely use crm_resource to find out and/or control the state of
> the resource representing the DomU in the Dom0 cluster.

Possibly.  Is this article accurate?

  http://etbe.coker.com.au/2007/06/24/xen-and-heartbeat/

If so, that requires *way* too much configuration, and is too
Xen-specific.


> > I doubt it would be difficult to make the existing fence agent/host
> > preferentially use pacemaker to locate & kill VMs when possible (as
> > opposed to simply talking to libvirt + AIS Checkpoint APIs as it does
> > now).
> 
> I think at least some interaction here would be needed, because
> otherwise, pacemaker/LRM would eventually run the monitor action, find
> out that it's gone and restart it, which might not be what is desired
> ;-)

Right, exactly.  If I want a "power off" (virtually speaking) of a
virtual machine, the CRM would turn it into a "reboot"... yucky!
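One way to avoid that (a sketch, not a tested fence agent - the resource
id "domU-web" is hypothetical, and the crm_resource options are per the
Pacemaker documentation) would be for the fence agent to first locate
the DomU's resource and then set its target-role to Stopped, so the
CRM intends the "off" rather than undoing it:

```shell
# Sketch: ask Pacemaker which Dom0 currently runs the DomU resource.
crm_resource --resource domU-web --locate

# Sketch: tell the CRM the desired state is "Stopped", so the monitor
# action finding it gone does not trigger a restart (i.e., a "reboot").
crm_resource --resource domU-web --set-parameter target-role \
             --meta --parameter-value Stopped
```

This is untested against a live cluster; the point is only that the
fence path and the CRM need to agree on the intended state.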

-- Lon
