[Pacemaker] CLUSTERIP/iptables interaction
Dejan Muhamedagic
dejanmm at fastmail.fm
Tue Dec 15 10:41:25 UTC 2009
Hi,
On Tue, Dec 15, 2009 at 09:45:39AM +0100, Michael Schwartzkopff wrote:
> Am Dienstag, 15. Dezember 2009 09:37:01 schrieb Chris Picton:
> > On Tue, 15 Dec 2009 07:13:29 +0000, Chris Picton wrote:
> > >>> > The monitor op shouldn't make any changes. If the rule has gone
> > >>> > away, the monitor op should return failure to indicate the resource
> > >>> > is broken, which will result in Pacemaker telling the failed
> > >>> > resource to stop, and start again. Actually, from the logs it looks
> > >>> > like a restart was attempted, and the stop op reported success, but
> > >>> > the subsequent start failed for some reason.
> > >>> >
> > >>> > Regards,
> > >>> >
> > >>> > Tim
> > >>>
> > >>> Exactly. So the RA seems to have a problem handling this error
> > >>> scenario correctly.
> > >>
> > >> OK. Does anybody know how it should work and where the problem is?
> > >> It seems like it can't find some proc file.
> > >
> > > I will have a go at fixing the RA today to do the following:
> > > 1. Detect the error in monitor and return the correct value
> > > 2. Stop the resource cleanly
> > > 3. Start it up again.
> > >
> > >
> > > Will let you know how it goes.
> >
> > The patch below seems to detect this specific failure and stop the
> > resource cleanly.
> >
> > The start operation is then able to start it up again without errors.
> >
> > Chris
> >
> > -------------------
> > --- IPaddr2.orig 2009-12-15 10:07:58.000000000 +0200
> > +++ IPaddr2.new 2009-12-15 10:22:03.000000000 +0200
> > @@ -548,6 +548,7 @@
> > # returns:
> > # ok = served (for CIP: + hash bucket)
> > # partial = served and no hash bucket (CIP only)
> > +# partial2 = served and no CIP iptables rule
> > # no = nothing
> > #
> > ip_served() {
> > @@ -577,6 +578,10 @@
> > fi
> >
> > # Special handling for the CIP:
> > + if [ ! -e $IP_CIP_FILE ]; then
> > + echo "partial2"
> > + return 0
> > + fi
> > if egrep -q "(^|,)${IP_INC_NO}(,|$)" $IP_CIP_FILE ; then
> > echo "ok"
> > return 0
> > @@ -620,7 +625,7 @@
> > exit $OCF_SUCCESS
> > fi
> >
> > - if [ -n "$IP_CIP" ] && [ $ip_status = "no" ]; then
> > + if [ -n "$IP_CIP" ] && [ $ip_status = "no" ] || [ $ip_status = "partial2" ]; then
> > $MODPROBE ip_conntrack
> > $IPTABLES -I INPUT -d $BASEIP -i $NIC -j CLUSTERIP \
> > --new \
> > @@ -691,13 +696,14 @@
> > fi
> > fi
> > local ip_status=`ip_served`
> > + ocf_log info "IP status = $ip_status, IP_CIP=$IP_CIP"
> >
> > if [ $ip_status = "no" ]; then
> >
> > : Requested interface not in use
> >
> > exit $OCF_SUCCESS
> > fi
> >
> > - if [ -n "$IP_CIP" ]; then
> > + if [ -n "$IP_CIP" ] && [ $ip_status != "partial2" ]; then
> > if [ $ip_status = "partial" ]; then
> > exit $OCF_SUCCESS
> > fi
> > @@ -743,7 +749,7 @@
> > ok)
> > return $OCF_SUCCESS
> > ;;
> > - partial|no)
> > + partial|no|partial2)
> > exit $OCF_NOT_RUNNING
> > ;;
> > *)
>
> Thank you very much for your help!
The patch looks good to me, though I don't really understand
what's going on :) At any rate, we need a bugzilla entry for this
fix. Who can describe the problem?
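
For whoever writes up the bug, here is a minimal stand-alone sketch of
the status logic as I read it from the patch. It is not the IPaddr2
code itself: classify_cip_status and monitor_rc are hypothetical names,
the file path and node number are passed in as arguments, and the
fallback to a generic error for unknown states is my assumption, since
that part of the RA is not shown in the diff.

```shell
#!/bin/sh
# Hypothetical sketch, not the actual IPaddr2 resource agent.
# Standard OCF return codes.
OCF_SUCCESS=0
OCF_ERR_GENERIC=1
OCF_NOT_RUNNING=7

# classify_cip_status FILE NODE_NO
# Mirrors the CIP branch of ip_served() in the patch:
#   partial2 = the CLUSTERIP proc file is missing (iptables rule gone)
#   ok       = the file lists our node number (hash bucket present)
#   partial  = the file exists but our node number is not in it
classify_cip_status() {
    cip_file=$1
    node_no=$2
    if [ ! -e "$cip_file" ]; then
        echo "partial2"
        return 0
    fi
    if grep -Eq "(^|,)${node_no}(,|\$)" "$cip_file"; then
        echo "ok"
    else
        echo "partial"
    fi
}

# monitor_rc STATUS
# Maps a status string to a monitor result without changing any state.
monitor_rc() {
    case "$1" in
        ok)                  return $OCF_SUCCESS ;;
        partial|no|partial2) return $OCF_NOT_RUNNING ;;
        *)                   return $OCF_ERR_GENERIC ;;  # assumption
    esac
}
```

The point is that monitor only reports: a missing proc file becomes
"not running", and it is then up to Pacemaker to run stop and start.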
Cheers,
Dejan