[Pacemaker] Orphan problem when creating a clone of a group

Uwe Grawert grawert at b1-systems.de
Mon Nov 29 13:42:42 UTC 2010


Was: Re: [Pacemaker] crm resource restart doesn't restart the correct resource

Quoting Dejan Muhamedagic <dejanmm at fastmail.fm>:

>> This is happening, because, when the clone is created,
>> pacemaker stops the primitive but does not wait for the stop action
>> to return, and just starts the primitive over. And that of course
>> causes problems.
>
> Hmm, don't quite understand what is going on. Is that primitive
> part of the group? Can you describe in more detail what is going
> on?

I have a group (grp_fs) consisting of an LVM resource and several
Filesystem resources, in that order. The group is started and all of its
resources are running. I then clone the group by issuing:

crm configure clone clo_fs grp_fs

This stops all resources and starts them again as clone instances. But
Pacemaker does not seem to wait until the stop action has finished. I
modified the LVM RA to log each action passed to the agent and the
value the agent returned:

14:24:11 [ 14495 ] Action: start
14:24:11 [ 14494 ] Action: stop
14:24:13 [ 14494 ] RC: 1
14:24:14 [ 14495 ] RC: 0
14:24:14 [ 14599 ] Action: monitor
14:24:14 [ 14599 ] RC: 0
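
A minimal sketch of that kind of logging, for anyone who wants to
reproduce it. This is not the actual change made to the LVM RA: the
wrapper name (run_logged), the helper (log_line) and the log path
(RA_LOG) are hypothetical and only illustrate how lines in the format
above could be produced.

```shell
#!/bin/sh
# Hypothetical logging helper; not taken from the real LVM resource agent.
RA_LOG="${RA_LOG:-/tmp/ra-actions.log}"   # assumed log location

log_line() {
    # Write: time, PID of this agent invocation, then the message.
    printf '%s [ %s ] %s\n' "$(date +%H:%M:%S)" "$$" "$*" >> "$RA_LOG"
}

run_logged() {
    # $1 is the action name (start, stop, monitor, ...);
    # the remaining arguments are the command that implements it.
    action="$1"
    shift
    log_line "Action: $action"
    "$@"
    rc=$?
    log_line "RC: $rc"
    return "$rc"
}
```

Inside an agent this would be called as, for example,
"run_logged stop <function implementing stop>". Because each invocation
of the agent is a separate process, the PID is what ties an "Action:"
line to its "RC:" line.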

The PID is shown in brackets. As can be seen, Pacemaker issues a start
command and then immediately a stop, without waiting for the first
command to return. That produces an orphan resource. As a result, the
state of the LVM resource (which is now cloned) is uncertain: it may
start, but it may also fail.
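
The overlap can also be picked out of such a log mechanically. The
following is only a sketch, assuming the log format shown above; the
function name find_overlaps is made up for illustration. It remembers
every action that has been logged but has not yet logged its RC, and
reports any new action that begins in the meantime.

```shell
#!/bin/sh
# Hypothetical helper: reads log lines of the form
#   HH:MM:SS [ PID ] Action: NAME
#   HH:MM:SS [ PID ] RC: N
# on stdin and reports actions that begin while another is unfinished.
find_overlaps() {
    awk '
        $5 == "Action:" {
            # Any PID still in "running" has not logged its RC yet.
            for (pid in running)
                printf "overlap: %s (PID %s) began while %s (PID %s) was still running\n", $6, $3, running[pid], pid
            running[$3] = $6
        }
        $5 == "RC:" { delete running[$3] }
    '
}

# Usage with the log lines from this mail:
find_overlaps <<'EOF'
14:24:11 [ 14495 ] Action: start
14:24:11 [ 14494 ] Action: stop
14:24:13 [ 14494 ] RC: 1
14:24:14 [ 14495 ] RC: 0
14:24:14 [ 14599 ] Action: monitor
14:24:14 [ 14599 ] RC: 0
EOF
```

For the log above this reports one overlap, the stop (PID 14494)
beginning while the start (PID 14495) has not yet returned. The monitor
action is not reported because both earlier actions had returned by
then.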

-- 
Uwe Grawert
Linux / Unix Consultant & Trainer
Tel.: +49 151 12051100
Mail: grawert at b1-systems.de

B1 Systems GmbH
Osterfeldstraße 7 / 85088 Vohburg / http://www.b1-systems.de
GF: Ralph Dehner / Unternehmenssitz: Vohburg / AG: Ingolstadt,HRB 3537






