[Pacemaker] [PATCH] pingd calls "goto retry" if it gets EAGAIN or EINTR
David Vossel
dvossel at redhat.com
Wed Apr 11 19:09:11 CEST 2012
----- Original Message -----
> From: "Junko IKEDA" <tsukishima.ha at gmail.com>
> To: "The Pacemaker cluster resource manager" <pacemaker at oss.clusterlabs.org>
> Sent: Wednesday, April 11, 2012 3:58:02 AM
> Subject: Re: [Pacemaker] [PATCH] pingd calls "goto retry" if it gets EAGAIN or EINTR
>
> > I'm not sure if this is correct. I believe EAGAIN is the return
> > code we get when the read timeout occurs. With this logic would
> > we not get stuck in a retry loop if we never received
> > anything?
>
> We can set a timeout on pingd, so the retry loop would be prevented.
>
> >
> > It might be safe to do this for the EINTR return code though. I
> > don't know enough off the top of my head to understand why this
> > would occur in your situation though.
> >
> > Do you know what return code you are getting that causes this?
>
> In the original code, ping_read() returns FALSE under the
> following conditions:
> - bytes from recvmsg() < 0
> - errno is EAGAIN or EINTR
>
> https://github.com/ClusterLabs/pacemaker-1.0/blob/master/tools/pingd.c#L863
> https://github.com/ClusterLabs/pacemaker-1.0/blob/master/tools/pingd.c#L907
>
> If ping_read() returns FALSE, stand_alone_ping() reports the node as
> "unreachable", like this:
> ex.) info: stand_alone_ping: Node 192.168.201.254 is unreachable (read)
> https://github.com/ClusterLabs/pacemaker-1.0/blob/master/tools/pingd.c#L1149
>
> I think the above condition is only a temporary one, so pingd should
> retry.
> I'm trying to get the return code with the attached test patch.
> Please see ha-log, too.
> There are no messages of the form (info: ping_read: 1: bytes=XX,
> errno=XX, rc=XX), so my concern might be unfounded...
>
> By the way, we can see a lot of "info: ping_read: 4: bytes=56,
> errno=11, rc=0" in ha-log,
> so the following part would be reasonable, wouldn't it?
>
> > @@ -898,6 +901,9 @@ ping_read(ping_node *node, int *lenp)
> > } else if(rc > 0) {
> > crm_free(packet);
> > return TRUE;
> > + } else {
> > + crm_info("Retrying...");
> > + goto retry;
> > }
> >
> > } else {
Does that else statement ever get hit?
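
To make the EAGAIN concern concrete, here is a minimal sketch of the
alternative I have in mind: retry only on EINTR, and only a bounded
number of times, while treating EAGAIN/EWOULDBLOCK as the receive
timeout and giving up. None of these names exist in pingd.c;
read_with_bounded_retry(), read_fn, struct script and scripted_read()
are hypothetical and only illustrate the control flow. The stub stands
in for recvmsg() so the logic can be exercised without a socket.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical read callback standing in for the recvmsg() call. */
typedef ssize_t (*read_fn)(void *ctx);

enum read_result { READ_OK, READ_TIMEOUT, READ_ERROR };

/*
 * Retry only when the call was interrupted (EINTR), at most max_retries
 * times. EAGAIN/EWOULDBLOCK is reported as a timeout instead of being
 * retried, so a silent peer cannot keep us looping.
 */
static enum read_result
read_with_bounded_retry(read_fn fn, void *ctx, int max_retries,
                        ssize_t *bytes_out)
{
    int attempts = 0;

    for (;;) {
        errno = 0;
        ssize_t n = fn(ctx);

        if (n >= 0) {
            *bytes_out = n;
            return READ_OK;
        }
        if (errno == EINTR && attempts < max_retries) {
            attempts++;
            continue;               /* interrupted: safe to try again */
        }
        if (errno == EAGAIN || errno == EWOULDBLOCK) {
            return READ_TIMEOUT;    /* nothing arrived within the timeout */
        }
        return READ_ERROR;          /* retries exhausted or a real error */
    }
}

/* Stub for experiments: fails with each errno in errs, then "reads" 56 bytes. */
struct script {
    const int *errs;
    size_t len;
    size_t pos;
};

static ssize_t
scripted_read(void *ctx)
{
    struct script *s = ctx;

    if (s->pos < s->len) {
        errno = s->errs[s->pos++];
        return -1;
    }
    return 56;
}
```

With this shape, a read that is interrupted twice and then succeeds
returns READ_OK, whereas a single EAGAIN returns READ_TIMEOUT straight
away, which the caller can still report as "unreachable".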
>
> Thanks,
> Junko
>