[Pacemaker] Fenced node cannot come online after rejoining the cluster

Andrew Beekhof andrew at beekhof.net
Sun Dec 2 22:06:58 EST 2012


On Wed, Nov 28, 2012 at 5:43 PM, bin chen <free2coder at gmail.com> wrote:
> Hi, all.
> I use corosync 2.0 + pacemaker 1.1.7. I have three nodes in the
> cluster (h183, h185, h186), and there is a stonith resource in the cluster.
> Then I disabled h185's network so that it could not communicate with the
> other nodes. I see that my cluster calls the fence script to fence h185, and
> the status of h185 is UNCLEAN.
>
> Node h185 (956999872): UNCLEAN (offline)
> Online: [ h183 h186 ]
>
>  Clone Set: fence-clone [fence]
>      Started: [ h183 h185 h186 ]
>      Stopped: [ fence:3 ]
>
> After fencing finished, h185 became OFFLINE:
>
> Online: [ h183 h186 ]
> OFFLINE: [ h185]
>
>  Clone Set: fence-clone [fence]
>      Started: [ h183 h186 ]
>      Stopped: [ fence:1 fence:3 ]
>
> Then I re-enabled networking on h185, but h185 is still OFFLINE:
> Online: [ h183 h186 ]
> OFFLINE: [ h185 ]
>
>  Clone Set: fence-clone [fence]
>      Started: [ h183 h186 ]
>      Stopped: [ fence:1 fence:3 ]
>
> When I kill pacemakerd and restart it on h185, h185's status becomes
> online.
>
> Log:
> 2012-11-05T01:27:26.639034+08:00 h183 crmd[2441]:     info: pcmk_quorum_notification: Membership 492880: quorum retained (3)
> 2012-11-05T01:27:26.639060+08:00 h183 crmd[2441]:     info: ais_status_callback: status: h185 is now member (was lost)
> 2012-11-05T01:27:26.639065+08:00 h183 crmd[2441]:     info: crm_update_peer: Node h185: id=956999872 state=member (new) addr=(null) votes=0 born=0 seen=492880 proc=00000000000000000000000000111312
> 2012-11-05T01:27:26.639084+08:00 h183 crmd[2441]:     info: send_ais_text: Peer overloaded or membership in flux: Re-sending message (Attempt 1 of 20)
>
> Who can help me?

Not many people, based on what you've provided so far.
We'd need the logs and configuration from h185 at a minimum.

At a guess, corosync is binding to a loopback address because the
network is down when the node starts.
That assumes your fencing is working at all.
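
One way to test the loopback guess on h185 is to look at the address
corosync reports for its ring. The Python sketch below is only an
illustration. It assumes the output of `corosync-cfgtool -s` contains
lines of the form `id = <address>`, which may differ between corosync
versions, and the function name `loopback_ring_addresses` is made up for
this example.

```python
import ipaddress
import re

# Assumed line shape in `corosync-cfgtool -s` output: "id = <address>".
# This pattern is an assumption; adjust it to match your corosync version.
_ID_LINE = re.compile(r"^\s*id\s*=\s*(\S+)\s*$")


def loopback_ring_addresses(status_text):
    """Return the ring addresses in status_text that are loopback addresses."""
    found = []
    for line in status_text.splitlines():
        match = _ID_LINE.match(line)
        if not match:
            continue
        try:
            addr = ipaddress.ip_address(match.group(1))
        except ValueError:
            # Not an IP literal; skip it rather than guess.
            continue
        if addr.is_loopback:
            found.append(str(addr))
    return found
```

To use it, capture the tool's output on h185 (for example with
`subprocess.run(["corosync-cfgtool", "-s"], capture_output=True, text=True).stdout`)
and pass the text to the function. A non-empty result would be consistent
with the guess above; an empty one would point elsewhere.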



