[Pacemaker] CLUSTERIP/iptables interaction
Michael Schwartzkopff
misch at multinet.de
Tue Dec 15 06:23:07 EST 2009
On Tuesday, 15 December 2009 11:41:25, Dejan Muhamedagic wrote:
> Hi,
>
> On Tue, Dec 15, 2009 at 09:45:39AM +0100, Michael Schwartzkopff wrote:
> > On Tuesday, 15 December 2009 09:37:01, Chris Picton wrote:
> > > On Tue, 15 Dec 2009 07:13:29 +0000, Chris Picton wrote:
> > > >>> > The monitor op shouldn't make any changes. If the rule has gone
> > > >>> > away, the monitor op should return failure to indicate the
> > > >>> > resource is broken, which will result in Pacemaker telling
> > > >>> > the failed resource to stop, and start again. Actually, from the
> > > >>> > logs it looks like a restart was attempted, and the stop op
> > > >>> > reported success, but the subsequent start failed for some
> > > >>> > reason.
> > > >>> >
> > > >>> > Regards,
> > > >>> >
> > > >>> > Tim
> > > >>>
> > > >>> Exactly. So the RA seems to have a problem handling this error
> > > >>> scenario correctly.
> > > >>
> > > >> OK. Does anybody know how it should work and where the problem
> > > >> is? It seems like it can't find some proc file.
> > > >
> > > > I will have a go at fixing the RA today to do the following:
> > > > 1. Detect the error in monitor and return the correct value
> > > > 2. Stop the resource cleanly
> > > > 3. Start it up again.
> > > >
> > > >
> > > > Will let you know how it goes.
> > >
> > > The below patch seems to detect this specific failure, and stop the
> > > resource cleanly.
> > >
> > > The start operation is able to start it up again without errors
> > >
> > > Chris
> > >
> > > -------------------
> > > --- IPaddr2.orig 2009-12-15 10:07:58.000000000 +0200
> > > +++ IPaddr2.new 2009-12-15 10:22:03.000000000 +0200
> > > @@ -548,6 +548,7 @@
> > > # returns:
> > > # ok = served (for CIP: + hash bucket)
> > > # partial = served and no hash bucket (CIP only)
> > > +# partial2 = served and no CIP iptables rule
> > > # no = nothing
> > > #
> > > ip_served() {
> > > @@ -577,6 +578,10 @@
> > > fi
> > >
> > > # Special handling for the CIP:
> > > + if [ ! -e $IP_CIP_FILE ]; then
> > > + echo "partial2"
> > > + return 0
> > > + fi
> > > if egrep -q "(^|,)${IP_INC_NO}(,|$)" $IP_CIP_FILE ; then
> > > echo "ok"
> > > return 0
> > > @@ -620,7 +625,7 @@
> > > exit $OCF_SUCCESS
> > > fi
> > >
> > > - if [ -n "$IP_CIP" ] && [ $ip_status = "no" ]; then
> > > + if [ -n "$IP_CIP" ] && [ $ip_status = "no" ] || [ $ip_status = "partial2" ]; then
> > > $MODPROBE ip_conntrack
> > > $IPTABLES -I INPUT -d $BASEIP -i $NIC -j CLUSTERIP \
> > > --new \
> > > @@ -691,13 +696,14 @@
> > > fi
> > > fi
> > > local ip_status=`ip_served`
> > > + ocf_log info "IP status = $ip_status, IP_CIP=$IP_CIP"
> > >
> > > if [ $ip_status = "no" ]; then
> > >
> > > : Requested interface not in use
> > >
> > > exit $OCF_SUCCESS
> > > fi
> > >
> > > - if [ -n "$IP_CIP" ]; then
> > > + if [ -n "$IP_CIP" ] && [ $ip_status != "partial2" ]; then
> > > if [ $ip_status = "partial" ]; then
> > > exit $OCF_SUCCESS
> > > fi
> > > @@ -743,7 +749,7 @@
> > > ok)
> > > return $OCF_SUCCESS
> > > ;;
> > > - partial|no)
> > > + partial|no|partial2)
> > > exit $OCF_NOT_RUNNING
> > > ;;
> > > *)
> >
> > Thank you very much for your help!
>
> The patch looks good to me, though I don't really understand
> what's going on :) At any rate, we need a bugzilla entry for this
> fix. Who can describe the problem?
>
> Cheers,
>
> Dejan
See: http://developerbugs.linux-foundation.org/show_bug.cgi?id=2281
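As a short description of the intended behaviour: monitor only reports state and changes nothing, and a missing CLUSTERIP rule is reported as "not running" so that Pacemaker triggers the stop/start itself. A minimal sketch of that mapping follows. The function name monitor_rc is hypothetical, the return code for an unknown status is my assumption (that branch is cut off in the quoted patch), and the numeric values are the standard OCF return codes.

```shell
#!/bin/sh
# Standard OCF return codes.
OCF_SUCCESS=0
OCF_ERR_GENERIC=1
OCF_NOT_RUNNING=7

# Hypothetical helper: translate an ip_served result into the code
# the monitor op should return. It only reads state and never
# touches iptables or the interface.
monitor_rc() {
    case "$1" in
        ok)
            echo "$OCF_SUCCESS" ;;
        partial|no|partial2)
            # "partial2" = address served but CLUSTERIP rule gone
            echo "$OCF_NOT_RUNNING" ;;
        *)
            # Assumption: treat any unknown status as a generic error.
            echo "$OCF_ERR_GENERIC" ;;
    esac
}
```

For example, `monitor_rc partial2` prints 7, which Pacemaker treats as a failed resource and answers with a stop followed by a start.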
Greetings,
--
Dr. Michael Schwartzkopff
MultiNET Services GmbH
Address: Bretonischer Ring 7; 85630 Grasbrunn; Germany
Tel: +49 - 89 - 45 69 11 0
Fax: +49 - 89 - 45 69 11 21
mob: +49 - 174 - 343 28 75
mail: misch at multinet.de
web: www.multinet.de
Registered office: 85630 Grasbrunn
Commercial register: Amtsgericht München, HRB 114375
Managing directors: Günter Jurgeneit, Hubert Martens
---
PGP Fingerprint: F919 3919 FF12 ED5A 2801 DEA6 AA77 57A4 EDD8 979B
Skype: misch42