[Pacemaker] Question on ILO stonith resource config and restarting
Andreas Mock
andreas.mock at web.de
Thu Oct 30 01:39:14 UTC 2008
Aaron Bush wrote:
> I am mostly concerned that I ended up with a node that had no associated
> stonith resource available to shoot it if it was truly down since the
> resource did not restart like I thought it should once the network cable
> was reconnected.
>
Hi Aaron,
Without knowing the details: is the stonith plugin implemented so that
it times out and returns FALSE?
If so, the failure count for that stonith resource should be
incremented, which in turn changes the resource's score.
A list member once contributed a script, showscore.sh, which shows the
current score of a resource in the cluster. You should watch your
stonith resource with it in that failure case.
Probably the score gets so bad that the resource can't be started
anywhere. But that is just a guess.
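
To make that guess concrete, here is a toy model in Python. It is not
Pacemaker's actual placement algorithm: the function names, the fixed
per-failure penalty, the node names and the rule that a negative score
makes a node ineligible are all simplifying assumptions for
illustration only.

```python
def effective_score(base_score, fail_count, failure_penalty):
    """Toy model (assumption): each recorded failure subtracts a fixed penalty."""
    return base_score - fail_count * failure_penalty


def can_run_somewhere(scores_by_node):
    """Assumption: a node is eligible only while its score is not negative."""
    return any(score >= 0 for score in scores_by_node.values())


# Hypothetical numbers: base score 100, penalty 30 per failure.
history = [effective_score(100, n, 30) for n in range(5)]

# Hypothetical two-node cluster after repeated monitor failures.
scores = {
    "node1": effective_score(100, 4, 30),
    "node2": effective_score(50, 4, 30),
}
placeable = can_run_somewhere(scores)
```

With these made-up numbers the score on node1 crosses zero at the
fourth failure, after which no node is left for the resource.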
The best you can do, IMHO, is to ignore the failures for score
calculation but react to them externally (e.g. with Nagios
monitoring). The failure count would rise with each try, but the
score should be kept constant.
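
As a rough sketch of the external reaction, the following Python
function maps a failure count to the usual Nagios plugin exit codes
(0 = OK, 1 = WARNING, 2 = CRITICAL). The function name and the
thresholds are hypothetical, and how the failure count is read from
the cluster is deliberately left out.

```python
# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL = 0, 1, 2


def check_fail_count(fail_count, warn_at=1, crit_at=3):
    """Return (exit_code, message) for a given stonith failure count.

    warn_at and crit_at are hypothetical thresholds; tune them locally.
    """
    if fail_count >= crit_at:
        return CRITICAL, "CRITICAL - stonith fail count is %d" % fail_count
    if fail_count >= warn_at:
        return WARNING, "WARNING - stonith fail count is %d" % fail_count
    return OK, "OK - stonith fail count is %d" % fail_count


# Example: two failed attempts so far.
status, message = check_fail_count(2)
```

A real plugin would print the message and exit with the status code,
so that Nagios raises the alert while the cluster score stays
untouched.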
But Dejan can probably shed additional light on this. :-)
Best regards
Andreas Mock