[Pacemaker] CoroSync's UDPu transport for public IP addresses?
Dmitry Koterov
dmitry.koterov at gmail.com
Tue Dec 30 12:21:15 UTC 2014
Oh, it seems I've found the solution! There were at least two mistakes in my
corosync.conf (BTW, the logs did not report any errors, so my conclusion is
based on my experiments only).
1. nodelist.node MUST contain only IP addresses. No hostnames! They simply
do not work: "crm status" shows no nodes, and the logs contain no warnings
about this.
2. quorum {} MUST NOT be empty (in the sample config it IS empty). In my
case, the following fixed the problem together with (1):
quorum {
    provider: corosync_votequorum
    two_node: 1
}
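Since the logs said nothing about either mistake, a small pre-flight check
can catch both before corosync is started. Below is a minimal sketch in
Python. It is not part of corosync or pacemaker: the function name
check_corosync_conf is made up for illustration, it encodes only the two
findings from my experiments, and it assumes the flat single-ring layout
used in this thread (ring0_addr lines, and a quorum section without nested
braces).

```python
import ipaddress
import re


def check_corosync_conf(text):
    """Return a list of warnings for the two mistakes described above.

    Hypothetical helper, not a corosync API. `text` is the content of
    corosync.conf as a string.
    """
    warnings = []

    # Mistake 1: ring0_addr values that are not literal IP addresses.
    for addr in re.findall(r"^\s*ring0_addr:\s*(\S+)", text, flags=re.MULTILINE):
        try:
            ipaddress.ip_address(addr)
        except ValueError:
            warnings.append("ring0_addr '%s' is not an IP address" % addr)

    # Mistake 2: a quorum section that is missing or names no provider.
    # Assumption: the quorum section contains no nested braces.
    quorum = re.search(r"^\s*quorum\s*\{([^}]*)\}", text, flags=re.MULTILINE)
    if quorum is None or "provider:" not in quorum.group(1):
        warnings.append("quorum section is missing or has no provider")

    return warnings
```

Passing it the text of /etc/corosync/corosync.conf should give an empty
list for a configuration with literal IP addresses and a quorum provider,
and one message per problem otherwise.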
So, below is my final corosync.conf. Now "crm status" shows
"Online: [ node1 node2 ]", the UDPu transport is used, and no virtual network
exists at all (only public IP addresses are specified in corosync.conf).
========================
# This seems to be a really WORKING configuration.
# Ubuntu 14.04, corosync 2.3.3, pacemaker 1.1.10
totem {
    version: 2
    cluster_name: cluster
    crypto_cipher: none
    crypto_hash: none
    clear_node_high_bit: yes
    interface {
        ringnumber: 0
        bindnetaddr: <public-ip-address-of-the-current-machine>
        mcastport: 5405
        ttl: 1
    }
    transport: udpu
    heartbeat_failures_allowed: 3
}
logging {
    fileline: off
    to_logfile: no
    to_syslog: yes
    debug: on
    timestamp: off
    logger_subsys {
        subsys: QUORUM
        debug: off
    }
}
nodelist {
    node {
        ring0_addr: <public-ip-address-of-the-first-machine>
    }
    node {
        ring0_addr: <public-ip-address-of-the-second-machine>
    }
}
quorum {
    provider: corosync_votequorum
    two_node: 1
}
=========================
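Note that bindnetaddr differs on each machine while nodelist is the same
everywhere, so the file has to be produced once per node. Here is a rough
sketch in Python of how that could be templated. The names TEMPLATE and
render_conf are invented for illustration, the addresses in the usage note
are documentation-range placeholders, and the template is trimmed to the
per-node setting plus the two fixes; the logging section and the remaining
totem options from the config above would be added unchanged.

```python
# Trimmed template: only the per-node bindnetaddr, the nodelist and the
# quorum section. Literal braces are doubled for str.format().
TEMPLATE = """\
totem {{
    version: 2
    cluster_name: cluster
    transport: udpu
    interface {{
        ringnumber: 0
        bindnetaddr: {local_ip}
        mcastport: 5405
    }}
}}
nodelist {{
{nodes}}}
quorum {{
    provider: corosync_votequorum
    two_node: 1
}}
"""


def render_conf(local_ip, all_ips):
    """Render a corosync.conf for the machine whose public IP is local_ip.

    Hypothetical helper. two_node: 1 is meant for two-node clusters, so
    exactly two addresses are required here.
    """
    if len(all_ips) != 2:
        raise ValueError("this sketch only covers two-node clusters")
    if local_ip not in all_ips:
        raise ValueError("local_ip must be one of the node addresses")
    nodes = "".join(
        "    node {\n        ring0_addr: %s\n    }\n" % ip for ip in all_ips
    )
    return TEMPLATE.format(local_ip=local_ip, nodes=nodes)
```

For example, render_conf("192.0.2.10", ["192.0.2.10", "198.51.100.20"])
gives the text for the first machine, and swapping the first argument
gives the text for the second.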
On Tue, Dec 30, 2014 at 12:34 PM, Dmitry Koterov <dmitry.koterov at gmail.com>
wrote:
> On Mon, Dec 29, 2014 at 1:50 PM, Dejan Muhamedagic <dejanmm at fastmail.fm>
> wrote:
>> On Mon, Dec 29, 2014 at 06:11:49AM +0300, Dmitry Koterov wrote:
>>> Hello.
>>>
>>> I have a geographically distributed cluster; all machines have public IP
>>> addresses. No virtual IP subnet exists, so no multicast is available.
>>>
>>> I thought the UDPu transport could work in such an environment, couldn't it?
>>>
>>> To test everything in advance, I've set up corosync+pacemaker on Ubuntu
>>> 14.04 with the following corosync.conf:
>>>
>>> totem {
>>>     transport: udpu
>>>     interface {
>>>         ringnumber: 0
>>>         bindnetaddr: ip-address-of-the-current-machine
>>>         mcastport: 5405
>>>     }
>>>     ...
>>> }
>>>
>>> nodelist {
>>>     node {
>>>         ring0_addr: node1
>>>     }
>>>     node {
>>>         ring0_addr: node2
>>>     }
>>> }
>>>
>>> root@node1:/etc/corosync# crm status | grep node
>>> OFFLINE: [ node1 node2 ]
>>>
>>> and "crm node online" (like all other attempts to make crm do
>>> something) times out with "communication error".
>
>
>> Dmitry, which version do you have?
>
>
> root@node1:~# corosync -v
> Corosync Cluster Engine, version '2.3.3'
> Copyright (c) 2006-2009 Red Hat, Inc.
>
> - so nodelist is definitely enough, and totem->interface->member is
> deprecated.
>
> So, am I at least right that the configuration with UDPu SHOULD work with
> geo-distributed nodes with only public IP addresses and no private/virtual
> subnetwork? If yes, how could I debug it?
>
> Here's some more info (x.x.x.x is a public IP associated with node1):
>
> root@node1:~# netstat -nap|grep coro
> udp   0  0  x.x.x.x:41083  0.0.0.0:*  7037/corosync
> udp   0  0  x.x.x.x:49299  0.0.0.0:*  7037/corosync
> udp   0  0  x.x.x.x:5405   0.0.0.0:*  7037/corosync
> unix  2  [ ACC ]  STREAM  LISTENING  52458  7037/corosync  @quorum
> unix  2  [ ACC ]  STREAM  LISTENING  52455  7037/corosync  @cmap
> unix  2  [ ACC ]  STREAM  LISTENING  52456  7037/corosync  @cfg
> unix  2  [ ACC ]  STREAM  LISTENING  52457  7037/corosync  @cpg
> unix  3  [ ]      STREAM  CONNECTED  52512  7037/corosync  @cpg
> unix  3  [ ]      STREAM  CONNECTED  52625  7037/corosync  @cpg
> unix  3  [ ]      STREAM  CONNECTED  52504  7037/corosync  @cfg
> unix  3  [ ]      STREAM  CONNECTED  52520  7037/corosync  @quorum
> unix  2  [ ]      DGRAM              52420  7037/corosync
> unix  3  [ ]      STREAM  CONNECTED  52643  7037/corosync  @quorum
> unix  3  [ ]      STREAM  CONNECTED  52568  7037/corosync  @cpg
> unix  3  [ ]      STREAM  CONNECTED  52588  7037/corosync  @cpg
> unix  3  [ ]      STREAM  CONNECTED  52554  7037/corosync  @cpg
>
> root@node1:~# crm status
> Last updated: Tue Dec 30 04:33:40 2014
> Last change: Sun Dec 28 21:40:41 2014 via crmd on node2
> Stack: corosync
> Current DC: NONE
> 2 Nodes configured
> 0 Resources configured
> OFFLINE: [ node1 node2 ]
>
> root@node1:~# crm node online
> Error setting standby=off (section=nodes, set=nodes-1084751873):
> Communication error on send
> Error performing operation: Communication error on send
>