[Pacemaker] [PATCH] pingd calls "goto retry" if it gets EAGAIN or EINTR

Junko IKEDA tsukishima.ha at gmail.com
Wed Apr 11 10:58:02 CEST 2012


> I'm not sure if this is correct.  I believe EAGAIN is the return code we get when the read timeout occurs.  With this logic would we not get stuck in a retry loop if we never received anything?

We can set a timeout for pingd, so an endless "retry loop" would be prevented.

>
> It might be safe to do this for the EINTR return code though. I don't know enough off the top of my head to understand why this would occur in your situation though.
>
> Do you know what return code you are getting that causes this?

In the original code, the ping_read() function returns FALSE under the
following conditions:
- bytes from recvmsg() < 0
- errno is EAGAIN or EINTR

https://github.com/ClusterLabs/pacemaker-1.0/blob/master/tools/pingd.c#L863
https://github.com/ClusterLabs/pacemaker-1.0/blob/master/tools/pingd.c#L907

If ping_read() returns FALSE, stand_alone_ping() reports the node as
"unreachable", like this:
ex.) info: stand_alone_ping: Node 192.168.201.254 is unreachable (read)

https://github.com/ClusterLabs/pacemaker-1.0/blob/master/tools/pingd.c#L1149

I think the above condition is only temporary, so pingd should retry.
I'm trying to capture the return code with the attached test patch.
Please see ha-log, too.
There are no messages of the form (info: ping_read: 1: bytes=XX, errno=XX, rc=XX),
so my concern might be imaginary...
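
To make the retry idea concrete without the endless-loop risk raised above,
here is a minimal sketch. It is not the pingd code: set_recv_timeout() and
recv_with_bounded_retry() are hypothetical helpers, and it uses plain recv()
instead of recvmsg() to stay short. It assumes the socket has a receive
timeout (SO_RCVTIMEO), so a blocking read fails with EAGAIN when nothing
arrives, and it caps the number of retries so the loop always ends.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>

/* Hypothetical helper, not part of pingd: give the socket a receive
 * timeout so that a blocking recv() fails with EAGAIN/EWOULDBLOCK
 * instead of waiting forever. */
static int set_recv_timeout(int fd, long timeout_ms)
{
    struct timeval tv;

    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;
    return setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv));
}

/* Hypothetical helper, not part of pingd: retry recv() on EINTR or
 * EAGAIN, but at most max_retries times, so the loop always terminates.
 * Returns the byte count, or -1 with errno left as set by the last recv(). */
static ssize_t recv_with_bounded_retry(int fd, void *buf, size_t len,
                                       int max_retries)
{
    int attempt;
    ssize_t bytes;

    assert(buf != NULL && max_retries >= 0);

    for (attempt = 0; attempt <= max_retries; attempt++) {
        bytes = recv(fd, buf, len, 0);
        if (bytes >= 0) {
            return bytes;       /* got data (or a zero-length datagram) */
        }
        if (errno != EINTR && errno != EAGAIN && errno != EWOULDBLOCK) {
            return -1;          /* hard error: retrying will not help */
        }
        /* temporary condition: fall through and try again */
    }
    return -1;                  /* retries exhausted, still nothing to read */
}
```

With a timeout of T and max_retries of N, the worst case is roughly
(N + 1) * T before the caller is told the node did not answer, so the
"unreachable" decision is only delayed, not lost.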

By the way, we can see a lot of "info: ping_read: 4: bytes=56,
errno=11, rc=0" in ha-log,
so the following part would be reasonable, wouldn't it?

> @@ -898,6 +901,9 @@ ping_read(ping_node *node, int *lenp)
 >       } else if(rc > 0) {
 >           crm_free(packet);
 >           return TRUE;
 > +     } else {
 > +         crm_info("Retrying...");
 > +         goto retry;
 >       }
 >
 >      } else {

Thanks,
Junko
-------------- next part --------------
A non-text attachment was scrubbed...
Name: pingd-test.diff
Type: application/octet-stream
Size: 1721 bytes
Desc: not available
URL: <http://oss.clusterlabs.org/pipermail/pacemaker/attachments/20120411/6ac4cb13/attachment-0002.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ha-log
Type: application/octet-stream
Size: 174263 bytes
Desc: not available
URL: <http://oss.clusterlabs.org/pipermail/pacemaker/attachments/20120411/6ac4cb13/attachment-0003.obj>

