[Pacemaker] Orphan problem when creating a clone of a group
Uwe Grawert
grawert at b1-systems.de
Mon Nov 29 13:42:42 UTC 2010
Was: Re: [Pacemaker] crm resource restart doesn't restart the correct resource
Quoting Dejan Muhamedagic <dejanmm at fastmail.fm>:
>> This is happening because, when the clone is created,
>> pacemaker stops the primitive but does not wait for the stop action
>> to return, and just starts the primitive again. And that of course
>> causes problems.
>
> Hmm, I don't quite understand what is going on. Is that primitive
> part of the group? Can you describe in more detail what is going
> on?
I have a group (grp_fs) consisting of an LVM resource and several
Filesystem resources, in that order. The group is started and all of
its resources are running. I then clone the group by issuing:
crm configure clone clo_fs grp_fs
This stops all resources and starts them again as a clone. But
Pacemaker does not seem to wait until the stop action has finished. I
have modified the LVM RA to log each action issued to the agent and
the value the agent returns:
14:24:11 [ 14495 ] Action: start
14:24:11 [ 14494 ] Action: stop
14:24:13 [ 14494 ] RC: 1
14:24:14 [ 14495 ] RC: 0
14:24:14 [ 14599 ] Action: monitor
14:24:14 [ 14599 ] RC: 0
The number in brackets is the PID of the agent process. As can be
seen, Pacemaker issues the start and the stop within the same second
(the start is logged first) and does not wait for one action to return
before issuing the other. This produces an orphan resource and leaves
the state of the LVM resource (which is now cloned) uncertain: it may
start, but it may also fail.
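
For anyone who wants to reproduce this kind of trace, here is a rough
sketch of how it could be done. It is not the actual change made to the
LVM RA. LOGFILE, log_line, traced_action and find_overlaps are
hypothetical names chosen for illustration. The first two functions
write lines in the format shown above; the third scans such a log for
actions that began while another action had not yet returned.

```shell
#!/bin/sh
# Hypothetical tracing helpers; not part of the real LVM resource agent.
# LOGFILE is an assumed location, override it as needed.
LOGFILE="${LOGFILE:-/tmp/ra-trace.log}"

log_line() {
    # Format: HH:MM:SS [ PID ] message
    printf '%s [ %s ] %s\n' "$(date +%H:%M:%S)" "$$" "$1" >> "$LOGFILE"
}

traced_action() {
    # $1 is the action name; the remaining arguments are the command to run.
    action="$1"; shift
    log_line "Action: $action"
    rc=0
    "$@" || rc=$?
    log_line "RC: $rc"
    return "$rc"
}

find_overlaps() {
    # $1 is a log file in the format written by log_line.
    # Fields: $3 = PID, $5 = "Action:" or "RC:", $6 = action name or code.
    awk '
        $5 == "Action:" {
            # Any PID still in "open" has logged an action but no RC yet.
            for (pid in open)
                printf "%s (PID %s) began before %s (PID %s) returned\n", $6, $3, open[pid], pid
            open[$3] = $6
        }
        $5 == "RC:" { delete open[$3] }
    ' "$1"
}
```

Inside an agent, the existing dispatch for an action would be wrapped,
for example `traced_action stop some_stop_function`, where
some_stop_function stands for whatever the agent already calls. Run
against the six log lines above, find_overlaps reports that the stop
(PID 14494) began before the start (PID 14495) returned.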
--
Uwe Grawert
Linux / Unix Consultant & Trainer
Tel.: +49 151 12051100
Mail: grawert at b1-systems.de
B1 Systems GmbH
Osterfeldstraße 7 / 85088 Vohburg / http://www.b1-systems.de
Managing Director: Ralph Dehner / Registered office: Vohburg / District court: Ingolstadt, HRB 3537