[Pacemaker] [Openais] Problem with cluster linux HA

Thu Feb 4 11:57:19 UTC 2010

On Wed, Jan 20, 2010 at 11:42 AM, Galera, Daniel <daniel.galera at hp.com> wrote:
> Andrew
>
> I just ran the report after a Stop Resource that failed. The resource is now
> unclean/down and node esesslx0003b appears as offline, probably because the
> stop failed.

Unlikely.
What exactly did you stop?  Whatever it did, it seems to have caused a
network outage.

> That was done around 11:30.
> Here is the report.
>
> Is there a way to bring esesslx0003b back online without rebooting
> both nodes?
>
> dani
>
> -----Original Message-----
> From: Galera, Daniel
> Sent: Wednesday, January 20, 2010 11:27
> To: Andrew Beekhof
> Cc: pacemaker at oss.clusterlabs.org; openais at lists.linux-foundation.org;
> linux-ha at lists.linux-ha.org
> Subject: RE: [Openais] Problem with cluster linux HA
>
> Hi Andrew,
> Here is the hb_report.
> The resource ("HPOS" group) is running on esesslx0003a. I tried to move it to
> esesslx0003b. I just used the GUI to move the resource: I selected
> "Move resource" and then node esesslx0003b. This should be similar
> to crm_resource -M -r HPOS -H esesslx0003b
>
> -----Original Message-----
> From: Andrew Beekhof [mailto:andrew at beekhof.net]
> Sent: Tuesday, January 19, 2010 14:37
> To: Galera, Daniel
> Cc: pacemaker at oss.clusterlabs.org; openais at lists.linux-foundation.org;
> linux-ha at lists.linux-ha.org
> Subject: Re: [Openais] Problem with cluster linux HA
>
> On Mon, Jan 18, 2010 at 2:46 PM, Galera, Daniel <daniel.galera at hp.com>
> wrote:
>> Hello all,
>>
>> I have 2 SUSE Linux Enterprise Server 11 machines with the High Availability
>> Extension. I'm configuring a 2-node cluster with only 1 group to run.
>> I use SBD as STONITH. I set up the cluster correctly without problems.
>> Now I want to cluster an application named HPOS. For that I need to have
>> in the group:
>>   SFEX --> to lock the drive
>>   LVM --> to activate the VG
>>   Filesystem --> to mount the 3 filesystems needed
>>   IP --> to bring the cluster IP online
>>   and then two LSB resources to run the 2 processes of application HPOS.
>> Anyway, the application is not the problem. The problem is that when I want
>> to test the cluster and, for example, MOVE the resource to the other node
>> (server1)... the group goes down and server2 appears as offline with
>> Stonith UNCLEAN.
>
> Usually it's when a resource fails to stop.
> Please use hb_report to generate a tarball and indicate which node you tried
> to move the resource to (and how).
>
>> That info is from checking on server1; if at that moment I check crm_mon
>> from server2, I see server2 as online but server1 down. No idea what the
>> problem is.
>>
>> Attached is the cluster XML config file.
>>
>> Also attached are the log files of the 2 nodes from when I executed the
>> MOVE RESOURCE that failed.
>>
>> Am I missing any resource location or any other expected thing?
>>
>> Do you have any cluster example so I can configure mine correctly?
>>
>> regards
>>
>> Dani
>
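The move discussed in the thread can be sketched as a small dry-run script. This is only an illustration, not the poster's actual procedure: the resource name HPOS, the node esesslx0003b, and the `crm_resource -M -r ... -H ...` form come from the messages above, while the helper names (`run`, `move_group`, `clear_move`, `show_status`), the `DRY_RUN` variable, and the use of `crm_resource -U` and `crm_mon -1` are assumptions added here. By default the script only prints the commands, so it can be read or run without touching a cluster.

```shell
#!/bin/sh
# Sketch only. With DRY_RUN=1 (the default here) commands are printed, not run.
DRY_RUN="${DRY_RUN:-1}"
RSC="HPOS"              # group name taken from the thread
TARGET="esesslx0003b"   # destination node taken from the thread

# Hypothetical helper: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Command-line form of the GUI "Move resource" action described in the thread.
move_group() {
    run crm_resource -M -r "$1" -H "$2"
}

# Removes the location constraint that a previous -M left behind.
clear_move() {
    run crm_resource -U -r "$1"
}

# One-shot cluster status, to see where the group ended up.
show_status() {
    run crm_mon -1
}

move_group "$RSC" "$TARGET"
show_status
```

Once the group is running where it should, `clear_move "$RSC"` drops the constraint created by the move, so the group is no longer pinned to that node. Setting `DRY_RUN=0` executes the commands for real and should only be done on a cluster node.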