[Pacemaker] CoroSync's UDPu transport for public IP addresses?

Dmitry Koterov dmitry.koterov at gmail.com
Wed Jan 14 11:07:27 EST 2015


Yes, now I have a clean experiment. Sorry, I misinformed you about
"adding new UDPU member": when I use DNS names in ringX_addr, I don't see
such messages (for now). In any case, DNS names in ringX_addr do not seem to
work, and the default logs contain no relevant messages. Maybe corosync
could validate ringX_addr?

The DNS names are resolvable:

root@node1:/etc/corosync# ping -c1 -W100 node1 | grep from
64 bytes from node1 (127.0.1.1): icmp_seq=1 ttl=64 time=0.039 ms

root@node1:/etc/corosync# ping -c1 -W100 node2 | grep from
64 bytes from node2 (188.166.54.190): icmp_seq=1 ttl=55 time=88.3 ms

root@node1:/etc/corosync# ping -c1 -W100 node3 | grep from
64 bytes from node3 (128.199.116.218): icmp_seq=1 ttl=51 time=252 ms
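
One thing worth noting in the output above: node1 resolves to 127.0.1.1, a
loopback address. Whether that is what breaks the hostname-based nodelist is
an assumption; the logs neither confirm nor rule it out. Below is a minimal
sketch of the kind of ringX_addr validation suggested above, written in
Python. The helper names (resolve_ipv4, check_ring_addrs) are hypothetical
and are not part of corosync:

```python
import ipaddress
import socket


def resolve_ipv4(name):
    """Return the sorted IPv4 addresses that `name` resolves to on this host."""
    infos = socket.getaddrinfo(name, None, socket.AF_INET, socket.SOCK_DGRAM)
    return sorted({info[4][0] for info in infos})


def check_ring_addrs(names, resolver=resolve_ipv4):
    """Return (name, problem) tuples for ring addresses that look suspicious.

    A name is flagged if it does not resolve at all, or if it resolves to a
    loopback address (127.0.0.0/8), which other nodes cannot reach.
    """
    problems = []
    for name in names:
        try:
            addrs = resolver(name)
        except OSError as exc:  # socket.gaierror is a subclass of OSError
            problems.append((name, f"does not resolve: {exc}"))
            continue
        for addr in addrs:
            if ipaddress.ip_address(addr).is_loopback:
                problems.append((name, f"resolves to loopback address {addr}"))
    return problems
```

Calling check_ring_addrs(["node1", "node2", "node3"]) on each node before
starting corosync would report any name that the local resolver maps to a
loopback address. The resolver argument exists only so the check can be
exercised without real DNS.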


With the corosync.conf below, nothing works:
...
nodelist {
  node {
    ring0_addr: node1
  }
  node {
    ring0_addr: node2
  }
  node {
    ring0_addr: node3
  }
}
...
Jan 14 10:47:44 node1 corosync[15061]:  [MAIN  ] Corosync Cluster Engine ('2.3.3'): started and ready to provide service.
Jan 14 10:47:44 node1 corosync[15061]:  [MAIN  ] Corosync built-in features: dbus testagents rdma watchdog augeas pie relro bindnow
Jan 14 10:47:44 node1 corosync[15062]:  [TOTEM ] Initializing transport (UDP/IP Unicast).
Jan 14 10:47:44 node1 corosync[15062]:  [TOTEM ] Initializing transmit/receive security (NSS) crypto: aes256 hash: sha1
Jan 14 10:47:44 node1 corosync[15062]:  [TOTEM ] The network interface [a.b.c.d] is now up.
Jan 14 10:47:44 node1 corosync[15062]:  [SERV  ] Service engine loaded: corosync configuration map access [0]
Jan 14 10:47:44 node1 corosync[15062]:  [QB    ] server name: cmap
Jan 14 10:47:44 node1 corosync[15062]:  [SERV  ] Service engine loaded: corosync configuration service [1]
Jan 14 10:47:44 node1 corosync[15062]:  [QB    ] server name: cfg
Jan 14 10:47:44 node1 corosync[15062]:  [SERV  ] Service engine loaded: corosync cluster closed process group service v1.01 [2]
Jan 14 10:47:44 node1 corosync[15062]:  [QB    ] server name: cpg
Jan 14 10:47:44 node1 corosync[15062]:  [SERV  ] Service engine loaded: corosync profile loading service [4]
Jan 14 10:47:44 node1 corosync[15062]:  [WD    ] No Watchdog, try modprobe <a watchdog>
Jan 14 10:47:44 node1 corosync[15062]:  [WD    ] no resources configured.
Jan 14 10:47:44 node1 corosync[15062]:  [SERV  ] Service engine loaded: corosync watchdog service [7]
Jan 14 10:47:44 node1 corosync[15062]:  [QUORUM] Using quorum provider corosync_votequorum
Jan 14 10:47:44 node1 corosync[15062]:  [QUORUM] Quorum provider: corosync_votequorum failed to initialize.
Jan 14 10:47:44 node1 corosync[15062]:  [SERV  ] Service engine 'corosync_quorum' failed to load for reason 'configuration error: nodelist or quorum.expected_votes must be configured!'
Jan 14 10:47:44 node1 corosync[15062]:  [MAIN  ] Corosync Cluster Engine exiting with status 20 at service.c:356.


But with IP addresses specified in ringX_addr, everything works:
...
nodelist {
  node {
    ring0_addr: 104.236.71.79
  }
  node {
    ring0_addr: 188.166.54.190
  }
  node {
    ring0_addr: 128.199.116.218
  }
}
...
Jan 14 10:48:28 node1 corosync[15155]:  [MAIN  ] Corosync Cluster Engine ('2.3.3'): started and ready to provide service.
Jan 14 10:48:28 node1 corosync[15155]:  [MAIN  ] Corosync built-in features: dbus testagents rdma watchdog augeas pie relro bindnow
Jan 14 10:48:28 node1 corosync[15156]:  [TOTEM ] Initializing transport (UDP/IP Unicast).
Jan 14 10:48:28 node1 corosync[15156]:  [TOTEM ] Initializing transmit/receive security (NSS) crypto: aes256 hash: sha1
Jan 14 10:48:28 node1 corosync[15156]:  [TOTEM ] The network interface [a.b.c.d] is now up.
Jan 14 10:48:28 node1 corosync[15156]:  [SERV  ] Service engine loaded: corosync configuration map access [0]
Jan 14 10:48:28 node1 corosync[15156]:  [QB    ] server name: cmap
Jan 14 10:48:28 node1 corosync[15156]:  [SERV  ] Service engine loaded: corosync configuration service [1]
Jan 14 10:48:28 node1 corosync[15156]:  [QB    ] server name: cfg
Jan 14 10:48:28 node1 corosync[15156]:  [SERV  ] Service engine loaded: corosync cluster closed process group service v1.01 [2]
Jan 14 10:48:28 node1 corosync[15156]:  [QB    ] server name: cpg
Jan 14 10:48:28 node1 corosync[15156]:  [SERV  ] Service engine loaded: corosync profile loading service [4]
Jan 14 10:48:28 node1 corosync[15156]:  [WD    ] No Watchdog, try modprobe <a watchdog>
Jan 14 10:48:28 node1 corosync[15156]:  [WD    ] no resources configured.
Jan 14 10:48:28 node1 corosync[15156]:  [SERV  ] Service engine loaded: corosync watchdog service [7]
Jan 14 10:48:28 node1 corosync[15156]:  [QUORUM] Using quorum provider corosync_votequorum
Jan 14 10:48:28 node1 corosync[15156]:  [SERV  ] Service engine loaded: corosync vote quorum service v1.0 [5]
Jan 14 10:48:28 node1 corosync[15156]:  [QB    ] server name: votequorum
Jan 14 10:48:28 node1 corosync[15156]:  [SERV  ] Service engine loaded: corosync cluster quorum service v0.1 [3]
Jan 14 10:48:28 node1 corosync[15156]:  [QB    ] server name: quorum
Jan 14 10:48:28 node1 corosync[15156]:  [TOTEM ] adding new UDPU member {a.b.c.d}
Jan 14 10:48:28 node1 corosync[15156]:  [TOTEM ] adding new UDPU member {e.f.g.h}
Jan 14 10:48:28 node1 corosync[15156]:  [TOTEM ] adding new UDPU member {i.j.k.l}
Jan 14 10:48:28 node1 corosync[15156]:  [TOTEM ] A new membership (m.n.o.p:80) was formed. Members joined: 1760315215
Jan 14 10:48:28 node1 corosync[15156]:  [QUORUM] Members[1]: 1760315215
Jan 14 10:48:28 node1 corosync[15156]:  [MAIN  ] Completed service synchronization, ready to provide service.
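
To compare the two runs, the "adding new UDPU member" lines can be pulled out
of a saved log. A small Python sketch, assuming the log text is available as
a string; the function name udpu_members is hypothetical:

```python
import re

# Matches corosync log entries such as:
#   [TOTEM ] adding new UDPU member {a.b.c.d}
UDPU_MEMBER = re.compile(r"\[TOTEM\s*\] adding new UDPU member \{([^}]+)\}")


def udpu_members(log_text):
    """Return the addresses from 'adding new UDPU member {...}' entries.

    Whitespace runs (including line breaks) are collapsed first, so entries
    that were wrapped across two lines by a mail client still match.
    """
    return UDPU_MEMBER.findall(" ".join(log_text.split()))
```

An empty result for the hostname-based run and three addresses for the
IP-based run would match what the logs above show.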


On Mon, Jan 5, 2015 at 6:45 PM, Jan Friesse <jfriesse at redhat.com> wrote:

> Dmitry,
>
>
> > Sure, in logs I see "adding new UDPU member {IP_ADDRESS}" (so DNS names
> > are definitely resolved), but in practice the cluster does not work, as I
> > said above. So validations of ringX_addr in corosync.conf would be very
> > helpful in corosync.
>
> That's weird, because once the DNS name is resolved, corosync works only
> with the IP address. This means the code path is exactly the same with an
> IP address or a DNS name. Do you have logs from corosync?
>
> Honza
>
>
> >
> > On Fri, Jan 2, 2015 at 2:49 PM, Jan Friesse <jfriesse at redhat.com> wrote:
> >
> >> Dmitry,
> >>
> >>
> >>> No, I meant that if you pass a domain name in ring0_addr, there are no
> >>> errors in logs, corosync even seems to find nodes (based on its logs), and
> >>> crm_node -l shows them, but in practice nothing really works. A verbose
> >>> error message would be very helpful in such a case.
> >>>
> >>
> >> This sounds weird. Are you sure that the DNS names really map to the
> >> correct IP addresses? In the logs there should be something like
> >> "adding new UDPU member {IP_ADDRESS}".
> >>
> >> Regards,
> >>   Honza
> >>
> >>
> >>> On Tuesday, December 30, 2014, Daniel Dehennin
> >>> <daniel.dehennin at baby-gnu.org> wrote:
> >>>
> >>>> Dmitry Koterov <dmitry.koterov at gmail.com> writes:
> >>>>
> >>>>> Oh, seems I've found the solution! There were at least two mistakes in my
> >>>>> corosync.conf (BTW the logs did not report any errors, so my conclusion
> >>>>> is based on my experiments only).
> >>>>>
> >>>>> 1. nodelist.node MUST contain only IP addresses. No hostnames! They simply
> >>>>> do not work, "crm status" shows no nodes. And no warnings are in logs
> >>>>> regarding this.
> >>>>>
> >>>>
> >>>> You can add name like this:
> >>>>
> >>>>      nodelist {
> >>>>        node {
> >>>>          ring0_addr: <public-ip-address-of-the-first-machine>
> >>>>          name: node1
> >>>>        }
> >>>>        node {
> >>>>          ring0_addr: <public-ip-address-of-the-second-machine>
> >>>>          name: node2
> >>>>        }
> >>>>      }
> >>>>
> >>>> I used it on Ubuntu Trusty with udpu.
> >>>>
> >>>> Regards.
> >>>>
> >>>> --
> >>>> Daniel Dehennin
> >>>> Get my GPG key: gpg --recv-keys 0xCC1E9E5B7A6FE2DF
> >>>> Fingerprint: 3E69 014E 5C23 50E8 9ED6  2AAD CC1E 9E5B 7A6F E2DF
> >>>>
> >>>>
> >>>
> >>>
> >>> _______________________________________________
> >>> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
> >>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> >>>
> >>> Project Home: http://www.clusterlabs.org
> >>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> >>> Bugs: http://bugs.clusterlabs.org
> >>>