[Pacemaker] will online node shoots the standby node when no cluster services?

Andrew Beekhof andrew at beekhof.net
Thu Oct 18 19:25:32 EDT 2012


On Fri, Oct 19, 2012 at 3:29 AM, Arnold Krille <arnold at arnoldarts.de> wrote:
> On Thursday 18 October 2012 11:24:25 Andrew Beekhof wrote:
>> On Thu, Oct 18, 2012 at 9:58 AM, Arnold Krille <arnold at arnoldarts.de> wrote:
>> > On Wed, 17 Oct 2012 14:21:24 -0400 Digimer <lists at alteeve.ca> wrote:
>> >> On 10/17/2012 02:10 PM, Jean-Francois Malouin wrote:
>> >> > Hi,
>> >> >
>> >> > A simple question for a simple 2-nodes cluster running
>> >> > pacemaker-1.0.9, corosync-1.2.1 (Debian/Squeeze):
>> >> >
>> >> > will the online node stonith the other standby node if I stop the
>> >> > cluster services on it? (I need to open the chassis)
>> >> >
>> >> > thanks!
>> >> > jf
>> >>
>> >> No.
>> >>
>> >> The idea behind fencing is to restore a node to a known state. If you
>> >> gracefully shutdown the cluster stack, then it is able to inform the
>> >> peer node that it is leaving and will not be offering any clustered
>> >> services. Thus, it is in a known state and all is fine.
>> >
>> > If my understanding is correct, the same applies when you "only" put
>> > the node in standby?
>> > At least I couldn't manually fence the node long after I had put it
>> > into standby...
>>
>> That doesn't sound right.  How did you try and fence it?
>
> "crm node fence nebel2", where nebel2 is the host concerned and currently the
> only one with a working ipmi-implementation on the mobo.

Hmmm, I'm not familiar with that command. Do you know how it is
supposed to work?
Depending on your version you might have more luck with:
    stonith_admin --fence nebel2
This bypasses the CIB+PE+CRMD and goes straight to the fencing subsystem.
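
A rough sketch of that direct route, for illustration only. The wrapper
prints the command instead of running it unless DRY_RUN=0 is set, so it is
safe to read or run on a machine without a cluster. The `run` helper, the
`fence_direct` function and the `DRY_RUN` variable are made up here;
`stonith_admin --fence` is the call named above, and whether it exists at
all depends on the Pacemaker version.

```shell
#!/bin/sh
# Hypothetical helper: print the command by default, execute it only when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    else
        echo "would run: $*"
    fi
}

# Ask the fencing subsystem directly to fence a node, bypassing CIB+PE+CRMD.
fence_direct() {
    node="$1"
    run stonith_admin --fence "$node"
}

# Node name taken from this thread; dry run unless DRY_RUN=0 is exported.
fence_direct nebel2
```

Keeping the dry run as the default is deliberate: a fencing request powers
off or resets a machine, so the real call should be an explicit opt-in.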

>
> Just some minutes ago I had another try at fencing. At first the given command
> (with an active nebel2) did nothing. Then I thought maybe I should activate
> stonith in the cluster config.

Ah, yes.  Having fencing disabled would definitely prevent the
CIB+PE+CRMD method from working.
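
One way to check this before trying to fence anything is to look for the
stonith-enabled property in the cluster configuration. The sketch below
assumes the configuration text is in the crm shell's `name="value"` style;
the function name and the sample input are invented for illustration and
are not taken from the cluster discussed here.

```shell
#!/bin/sh
# Read crm-shell style configuration text on stdin and print the value of
# stonith-enabled, or "unset" when the property does not appear at all
# (Pacemaker's default for stonith-enabled is true).
stonith_enabled_value() {
    value=$(sed -n 's/.*stonith-enabled="\{0,1\}\([A-Za-z0-9]*\)"\{0,1\}.*/\1/p' | head -n 1)
    echo "${value:-unset}"
}

# Illustrative input only.
printf '%s\n' 'property stonith-enabled="false"' | stonith_enabled_value

# On a real node the input would come from the crm shell, e.g.:
#   crm configure show | stonith_enabled_value
```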

> After committing the change in crm, two of the
> three nodes fenced themselves and all were marked as UNCLEAN.

I think you really need to file a bug for this.  There are some very
strange issues going on in your cluster.

> Only deactivating
> stonith and rebooting the last remaining node gave back a valid cluster with
> quorum...
>
> Very strange. But as this is our production/development cluster (testing at
> our office so we don't test at the clients, but production because our office
> people work on the cluster to get a realistic workload), we also experiment
> and test the IPMI implementations of the various vendors:
>  - the IPMI on the Supermicro board seems to work; testing with the
> external/ipmi agent says it's okay, only fencing itself seems to fail. And of
> course the mobo says that there is a CPU present but it can't get its
> temperature, and it finds the power distribution board but neither of the
> two power supplies.
>  - one of the Intel boards works fine, as long as you have the IPMI either in
> the untagged network or in a tagged VLAN >1. Tagged VLAN 1 doesn't work for
> IPMI :-(
>
> Anyway, have a nice evening,
>
> Arnold
> _______________________________________________
> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
>



