[Pacemaker] crm resource restart doesn't restart the correct resource
Dejan Muhamedagic
dejanmm at fastmail.fm
Fri Nov 26 09:15:34 UTC 2010
Hi,
On Thu, Nov 25, 2010 at 11:03:30PM +0100, Pavlos Parissis wrote:
> On Thu, 25 Nov 2010 07:09:28 -0500
> Vadym Chepkov <vchepkov at gmail.com> wrote:
>
> >
> > On Nov 25, 2010, at 7:01 AM, Pavlos Parissis wrote:
> >
> > > On 25 November 2010 12:44, Vadym Chepkov <vchepkov at gmail.com> wrote:
> > >>
> > >> On Nov 25, 2010, at 6:31 AM, Pavlos Parissis wrote:
> > >>
> > >>> Hi,
> > >>> When I issue crm resource restart pbx_01, the PE restarts the wrong resource.
> > >>> The pbx_01 belongs to a resource group and the last resource of that
> > >>> group is restarted.
crm resource restart is broken. It just won't work for groups.
The restart is actually a stop followed by a start. The start
precludes the stop of all but the first resource. It will have to
be synchronous, that is, it has to wait for all resources to stop
first and only then start them. We'll fix it, we just haven't
gotten around to it yet.
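
To make "synchronous" concrete, here is a minimal sketch in Python.
It is not the crm shell code and not the eventual fix. The function
name and the stop, start and is_running callables are hypothetical
placeholders for whatever actually talks to the cluster. The only
point is the ordering: issue the stop, wait until the resource and
everything that depends on it are down, and only then issue the
start.

```python
import time


def synchronous_restart(target, dependents, stop, start, is_running,
                        timeout=60.0, poll=1.0,
                        clock=time.monotonic, sleep=time.sleep):
    """Restart `target` only after it and all `dependents` have stopped.

    stop, start and is_running are caller-supplied callables (assumed
    here, not part of any real crm API) that take a resource name.
    """
    stop(target)

    # Everything that has to be down before the start may be issued.
    waiting = [target] + list(dependents)
    deadline = clock() + timeout

    # Wait for *all* of them, not just the first one that goes down.
    while True:
        still_up = [rsc for rsc in waiting if is_running(rsc)]
        if not still_up:
            break
        if clock() >= deadline:
            raise TimeoutError(
                "still running after %ss: %s" % (timeout, ", ".join(still_up)))
        sleep(poll)

    start(target)
```

With the stop and start issued back to back, as happens now, the
wait loop is effectively missing, which is why only one resource
ends up being restarted.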
Thanks,
Dejan
> > >> This is why cluster has groups. groups define collocation/ordering, so if you
> > >> stop a resource everything depending on it has to be stopped, and group
> > >> describes this dependency.
> > > If that was the case then sshd_01 should have been restarted as well.
> >
> > Well it tried, but failed, I see it in the log
> Is this the log which you are referring to?
>
> 12:04:43 pbxsrv3 pengine: [6396]: notice: unpack_rsc_op: Hard error -
> sshd_01_monitor_0 failed with rc=5: Preventing sshd_01 from
> re-starting on pbxsrv2
>
>
> this is normal, because sshd_01 is not supposed to run on pbxsrv2 node, it runs only on pbxsrv1 and pbxsrv3. This error is harmless according to this post http://www.gossamer-threads.com/lists/linuxha/pacemaker/67208#67208
>
> If you read the log on the DC, you would think that pbx_01 and sshd_01 were actually restarted:
> 12:04:43 pbxsrv3 pengine: [6396]: notice: LogActions: Stop resource
> pbx_01 (pbxsrv1)
> 12:04:43 pbxsrv3 pengine: [6396]: notice: LogActions: Stop resource
> sshd_01 (pbxsrv1)
> 12:04:43 pbxsrv3 pengine: [6396]: notice: LogActions: Stop resource
> mailAlert_01 (pbxsrv1)
>
> but they weren't.
>
> Cheers,
> Pavlos