[Pacemaker] Pacemaker remote nodes, naming, and attributes

Lindsay Todd rltodd.ml1 at gmail.com
Wed Jul 10 12:42:48 EDT 2013


Thanks!  But there is still a problem.

I am now working from the master branch and building RPMs (well, I also have
to rebuild from the srpm to change the build number, since the RPMs built
directly are always 1.1.10-1).  The patch is in the git log, and indeed
things are better ...  But I still see VMs shutting down spuriously.
What is much improved is that they do get restarted, and basically I end
up in the state I want to be in.  I can almost live with this, and I was going
to start changing my cluster config to be asymmetric when I noticed that, in
the midst of the spurious transitions, crmd is dumping core.

So I'll append another crm_report to bug 5164, as well as a gdb traceback.


On Fri, Jul 5, 2013 at 5:06 PM, David Vossel <dvossel at redhat.com> wrote:

> ----- Original Message -----
> > From: "David Vossel" <dvossel at redhat.com>
> > To: "The Pacemaker cluster resource manager"
> > <pacemaker at oss.clusterlabs.org>
> > Sent: Wednesday, July 3, 2013 4:20:37 PM
> > Subject: Re: [Pacemaker] Pacemaker remote nodes, naming, and attributes
> >
> > ----- Original Message -----
> > > From: "Lindsay Todd" <rltodd.ml1 at gmail.com>
> > > To: "The Pacemaker cluster resource manager"
> > > <pacemaker at oss.clusterlabs.org>
> > > Sent: Wednesday, July 3, 2013 2:12:05 PM
> > > Subject: Re: [Pacemaker] Pacemaker remote nodes, naming, and attributes
> > >
> > > Well, I'm not getting failures right now simply with attributes, but I can
> > > induce a failure by stopping the vm-db02 (it puts db02 into an unclean
> > > state, and attempts to migrate the unrelated vm-compute-test). I've
> > > collected the commands from my latest interactions, a crm_report, and a gdb
> > > traceback from the core file that crmd dumped, into bug 5164.
> >
> >
> > Thanks, hopefully I can start investigating this Friday
> >
> > -- Vossel
>
> Yeah, this is a bad one.  Adding the node attributes using crm_attribute
> for the remote-node did some unexpected things to the crmd component.
> Somehow the remote-node was getting entered into the cluster node cache...
> which made it look like we had both a cluster-node and remote-node named
> the same thing... not good.
>
> I think I got that part worked out.  Try this patch.
>
>
> https://github.com/ClusterLabs/pacemaker/commit/67dfff76d632f1796c9ded8fd367aa49258c8c32
>
> Rather than trying to patch RCs, it might be worth trying out the master
> branch on github (which already has this patch).  If you aren't already,
> use rpms to make your life easier.  Running 'make rpm' in the source
> directory will generate them for you.
>
> There was another bug fixed recently in pacemaker_remote involving the
> directory created for resource agents to store their temporary data (stuff
> like pid files).  I believe the fix was not introduced until 1.1.10rc6.
>
> -- Vossel