[Pacemaker] Pacemaker remote nodes, naming, and attributes
Lindsay Todd
rltodd.ml1 at gmail.com
Wed Jul 10 17:11:00 UTC 2013
Hmm, I'll still submit the bug report, but it seems like crmd is dumping
core while attempting to fence a node. If I use fence_node to fence a real
cluster node, that also causes crmd to dump core. But apart from that, I
don't really see why pacemaker is trying to fence anything.
On Wed, Jul 10, 2013 at 12:42 PM, Lindsay Todd <rltodd.ml1 at gmail.com> wrote:
> Thanks! But there is still a problem.
>
> I am now working from the master branch and building RPMs (well, I have to
> also rebuild from the srpm to change the build number, since the RPMs built
> directly are always 1.1.10-1). The patch is in the git log, and indeed
> things are better ... But I still see VMs shutting down spuriously.
> What is much improved is that they do get restarted, and basically I end
> up in the state I want to be in. I can almost live with this, and I was
> going to start changing my cluster config to be asymmetric when I noticed
> that, in the midst of the spurious transitions, crmd is dumping core.
>
> So I'll append another crm_report to bug 5164, as well as a gdb traceback.
>
>
> On Fri, Jul 5, 2013 at 5:06 PM, David Vossel <dvossel at redhat.com> wrote:
>
>> ----- Original Message -----
>> > From: "David Vossel" <dvossel at redhat.com>
>> > To: "The Pacemaker cluster resource manager" <pacemaker at oss.clusterlabs.org>
>> > Sent: Wednesday, July 3, 2013 4:20:37 PM
>> > Subject: Re: [Pacemaker] Pacemaker remote nodes, naming, and attributes
>> >
>> > ----- Original Message -----
>> > > From: "Lindsay Todd" <rltodd.ml1 at gmail.com>
>> > > To: "The Pacemaker cluster resource manager"
>> > > <pacemaker at oss.clusterlabs.org>
>> > > Sent: Wednesday, July 3, 2013 2:12:05 PM
>> > > Subject: Re: [Pacemaker] Pacemaker remote nodes, naming, and attributes
>> > >
>> > > Well, I'm not getting failures right now simply with attributes, but
>> > > I can induce a failure by stopping the vm-db02 (it puts db02 into an
>> > > unclean state, and attempts to migrate the unrelated vm-compute-test).
>> > > I've collected the commands from my latest interactions, a crm_report,
>> > > and a gdb traceback from the core file that crmd dumped, into bug 5164.
>> >
>> >
>> > Thanks, hopefully I can start investigating this Friday
>> >
>> > -- Vossel
>>
>> Yeah, this is a bad one. Adding the node attributes using crm_attribute
>> for the remote-node did some unexpected things to the crmd component.
>> Somehow the remote-node was getting entered into the cluster node cache...
>> which made it look like we had both a cluster-node and remote-node named
>> the same thing... not good.
>>
>> I think I got that part worked out. Try this patch.
>>
>>
>> https://github.com/ClusterLabs/pacemaker/commit/67dfff76d632f1796c9ded8fd367aa49258c8c32
>>
>> Rather than trying to patch RCs, it might be worth trying out the master
>> branch on github (which already has this patch). If you aren't already,
>> use rpms to make your life easier. Running 'make rpm' in the source
>> directory will generate them for you.
>>
>> There was another bug fixed recently in pacemaker_remote involving the
>> directory created for resource agents to store their temporary data (stuff
>> like pid files). I believe the fix was not introduced until 1.1.10rc6.
>>
>> -- Vossel