[Pacemaker] Managing Virtual Machine's resource

Fri May 16 12:15:03 EDT 2008

On Fri, 2008-05-16 at 15:58 +0200, Lars Marowsky-Bree wrote:
> On 2008-05-16T09:28:07, Lon Hohberger <lhh at redhat.com> wrote:
> 
> > Because rgmanager and pacemaker diverged slightly on the OCF RA API
> > metadata bits, there might be some small tweaks required.
> 
> Which reminds me to ask - this sounds as if you already did an analysis
> of these differences, can you share your thoughts on what they are and
> how we might overcome them?

The analysis isn't complete, but here are some of the differences in how
each side uses the RA API:

rgmanager:
 * parent/child relationships for implicit start-after/stop-before
   * attribute inheritance (we have talked about this in the past;
     it isn't hard, and may be beneficial)
   * specification of child resource type ordering to prevent major
     "gotchas" when defining resource groups (e.g. putting a
     script on a file system but putting them in the wrong order,
     causing errors)
 * 'primary' attribute specification (not OCF compliant) is used to
   identify resource instances
 * use of LSB 'status' to implement the OCF 'monitor' function (status
   isn't specified in the RA API, but the monitor function as specified
   appears to map to the LSB status function, so most of our agents do
   monitor->status, though depth is still supported; maybe yours are
   the same; I haven't fully investigated)
 * multiple references to the same resource instance; reference counts
   are used to prevent starting the same resource on the same node
   multiple times
 * rgmanager allows reconfiguration of resource parameters without
   restarting the resource, enabled by
   <parameter name="xxx" reconfig="1" .../> in the meta-data (maybe
   pacemaker does too; I haven't checked)
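
To make the reference-count point concrete, here is a rough sketch in
Python. The class and method names are made up for illustration and are
not rgmanager code; it only shows the idea that the real start and stop
happen on the first and last reference respectively.

```python
class RefCountedResource:
    """Hypothetical wrapper: run the real start once per node, however
    many resource groups reference the same instance."""

    def __init__(self, name):
        self.name = name
        self.refs = 0
        self.log = []  # records the "real" start/stop calls

    def start(self):
        self.refs += 1
        if self.refs == 1:
            # First reference on this node: actually start the resource.
            self.log.append("start " + self.name)

    def stop(self):
        if self.refs == 0:
            return  # nothing is holding the resource; ignore
        self.refs -= 1
        if self.refs == 0:
            # Last reference dropped: actually stop the resource.
            self.log.append("stop " + self.name)


fs = RefCountedResource("fs:shared")
fs.start()   # first group references the file system
fs.start()   # second group references the same instance
fs.stop()
fs.stop()
print(fs.log)  # ['start fs:shared', 'stop fs:shared']
```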

pacemaker:
 * promote / demote resource operations
 * UUIDs used to identify resource instances (I like this better than
   what we do with type:primary_attr in rgmanager)
 * clone resources and operations used to start (more or less) the same
   resource on multiple nodes

General:
 * resource migrate is likely done differently; not sure though (maybe
   you can tell me?):
    <resource-agent> migrate <target_host_name>

There will be more that I will come across, no doubt.  Those are just
the ones on the surface.  I do not believe any of them are hard to deal
with.

I think we both diverged in a compatible way here:
 * <parameter ... required="1" .../> means this parameter must be
   specified for a given resource instance.
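
As an illustration of how the required="1" and reconfig="1" flags could
be consumed, here is a small Python sketch. The sample meta-data, the
parameter names in it, and the helper functions are all assumptions made
up for the example; neither rgmanager nor pacemaker is implemented this
way as far as this thread says.

```python
import xml.etree.ElementTree as ET

# Made-up meta-data fragment; only the two flags come from the discussion.
META = """
<resource-agent name="example">
  <parameters>
    <parameter name="name" required="1"/>
    <parameter name="options" reconfig="1"/>
    <parameter name="device"/>
  </parameters>
</resource-agent>
"""


def flagged(meta_xml, flag):
    """Names of parameters whose attribute `flag` is set to "1"."""
    root = ET.fromstring(meta_xml)
    return [p.get("name") for p in root.iter("parameter")
            if p.get(flag) == "1"]


def missing_required(meta_xml, instance):
    """Required parameters that a resource instance does not specify."""
    return [n for n in flagged(meta_xml, "required") if n not in instance]


def needs_restart(meta_xml, old, new):
    """True if any changed parameter is not marked reconfig="1"."""
    reconfig = set(flagged(meta_xml, "reconfig"))
    changed = {k for k in set(old) | set(new) if old.get(k) != new.get(k)}
    return bool(changed - reconfig)


print(missing_required(META, {"device": "/dev/sdb1"}))
print(needs_restart(META, {"name": "a", "options": "ro"},
                    {"name": "a", "options": "rw"}))
```

The first call reports the instance as incomplete because "name" is
missing; the second reports that no restart is needed because only a
reconfig="1" parameter changed.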

> > For the fencing agent to work with domUs (that are moving around), dom0s
> > need to be in a CMAN+openais cluster right now, but we could drop this
> > and make this an openais-only requirement if you'd like (it shouldn't be
> > hard, but we'd lose physical node fencing information in the
> > short-term).
> 
> How so? At least in case of pacemaker/crm, the CRM orchestrates the
> migration, and thus always knows where it's running. (And this is DomUs
> as a resource, so I'm not sure how fencing fits into this picture.)

I believe the idea was to use virtual machine resources, with those
virtual machines forming a cluster of their own.

To clarify the requirements as stated: they were in the context of an
existing implementation.

Generally, with clustered virtual machines that can run on more than one
physical node, at a bare minimum, you need to know only a few things on
the physical hosts in order to implement fencing:

 * where a particular vm is and its current state, or
 * where that vm "was", and
   * the state of the host running the vm, and
   * if "bad" or "dead", whether fencing has completed

Certainly, pacemaker knows all of the above!
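
The decision implied by that list could look roughly like the Python
sketch below. Everything in it is an assumption for illustration: the
class names, the host state labels ("ok", "bad", "dead"), and the
returned action strings are invented here and do not describe the
existing fence agent.

```python
class HostInfo:
    def __init__(self, state, fenced=False):
        self.state = state    # "ok", "bad" or "dead" (assumed labels)
        self.fenced = fenced  # has fencing of this host completed?


class VmInfo:
    def __init__(self, current_host, last_host):
        self.current_host = current_host  # None if location is unknown
        self.last_host = last_host        # where the vm "was"


def fence_action(vm, hosts):
    """Decide how to fence a vm from what the physical hosts know."""
    current = hosts.get(vm.current_host)
    if current is not None and current.state == "ok":
        # We know where the vm is and can reach it: kill it there.
        return "kill-on:" + vm.current_host
    last = hosts.get(vm.last_host)
    if last is not None and last.state in ("bad", "dead"):
        # The vm died with its host only once that host is fenced.
        return "done" if last.fenced else "wait-for-host-fencing"
    # Not enough information to declare the vm fenced.
    return "unknown"


hosts = {"dom0-a": HostInfo("ok"), "dom0-b": HostInfo("dead")}
print(fence_action(VmInfo("dom0-a", "dom0-a"), hosts))
print(fence_action(VmInfo(None, "dom0-b"), hosts))
```

The second call has to wait: the vm's last known host is dead, but its
fencing has not completed yet, so the vm cannot be assumed gone.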

I doubt it would be difficult to make the existing fence agent/host
preferentially use pacemaker to locate & kill VMs when possible (as
opposed to simply talking to libvirt + AIS Checkpoint APIs as it does
now).

-- Lon