[ClusterLabs] Corosync on a home network

Mon Sep 11 18:14:18 EDT 2017

Is the firewalld service running?  I just ran a quick test on my CentOS 7 installation: by default SSH is allowed through the firewall, but corosync cannot connect to the other nodes.

Try: systemctl stop firewalld.service
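
If stopping firewalld lets the nodes see each other, a less drastic option is to leave it running and allow the cluster traffic through.  The following is only a sketch, not something I have checked against your setup.  It assumes that the predefined "high-availability" service shipped with firewalld on CentOS 7 is installed and that it covers the corosync UDP ports (5404-5405 by default).  The function name open_corosync_ports is made up for illustration.

```shell
# Sketch: allow cluster traffic through firewalld instead of stopping it.
# Assumption: firewalld's predefined "high-availability" service is available
# and includes the corosync UDP ports; open_corosync_ports is a hypothetical name.
open_corosync_ports() {
    # Add the service to the permanent configuration of the default zone,
    # then reload so the running firewall picks up the change.
    firewall-cmd --permanent --add-service=high-availability &&
    firewall-cmd --reload
}
```

Run open_corosync_ports as root on every node, then restart corosync and check corosync-quorumtool again.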

> On 12 Sep 2017, at 8:04 am, J Martin Rushton <martinrushton56 at btinternet.com> wrote:
> 
> Hi,
> 
> I posted the message below on the CentOS mailing list and was advised to
> repost here.  Since I posted I have also been advised to issue "echo 1 >
> /sys/class/net/br3/bridge/multicast_querier" on the main node and have
> tried it, but to no avail.
> 
> As it says in the original, any help will be gratefully received.
> 
> Regards,
> Martin
> 
> -----------------original message--------------------------
> 
> I've been trying to build a model cluster using three virtual machines
> on my home server.  Each VM boots off its own dedicated partition
> (CentOS 7.3).  One partition is designated to be the common /home
> partition for the VMs, (on the real machine it will mount as /cluster).
> I'm intending to run GFS2 on the shared partition, so I need to
> configure DLM and corosync.  That's where I'm getting bogged down.
> 
> The VMs and the real machine are bridged onto one ethernet.  There is
> another ethernet in the main machine on a different network, but that is
> not used for clustering.  The ethernet port is connected to a switch
> which in turn connects to a BT Home Hub 6.  All four addresses are
> static, Network Manager is off, ssh works across the nodes without a
> password and ping gives sensible times.
> 
> --------------%<-------------------
> # brctl show
> bridge name	bridge id	STP enabled	interfaces
> br3		XXXXXXXXX	no		enp3s0
> 						vnet0
> 						vnet1
> 						vnet2
> virbr0		XXXXXXXXX	yes		virbr0-nic
> --------------%<-------------------
> 
> When I start corosync each node starts up but does not see the others.
> For instance I see:
> 
> --------------%<----------------------
> # corosync-quorumtool
> Quorum information
> ------------------
> Date:             Sun Sep 10 12:56:56 2017
> Quorum provider:  corosync_votequorum
> Nodes:            1
> Node ID:          3
> Ring ID:          3/28648
> Quorate:          No
> 
> Votequorum information
> ----------------------
> Expected votes:   4
> Highest expected: 4
> Total votes:      1
> Quorum:           3 Activity blocked
> Flags:
> 
> Membership information
> ----------------------
>    Nodeid      Votes Name
>         3          1 192.168.1.52 (local)
> ----------------%<-------------------
> 
> All four nodes are similar, but with different node IDs, IP addresses
> and Ring IDs.
> 
> The documentation warns that not all routers will handle multicast
> datagrams correctly.  I therefore attempted to force unicast
> communication by making the following changes from the distributed
> corosync.conf:
> 
> 	transport: updu
> 	cluster_name: <set to the same as the domain>
> #	crypto_cipher: none
> #	crypto_hash: none
> #		mcastaddr: 239.255.1.1
> #		mcastport: 5405
> #		ttl: 1
> 
> The following are unchanged:
> 
> 	version: 2
> 	secauth: off
> 		ringnumber: 0
> 		bindnetaddr: 192.168.1.0
> 
> The nodelist is:
> 
> ---------%<----------------
> nodelist {
> 	node {
> 		ring0_addr: 192.168.1.2
> 		nodeid: 1
> 	}
> 	node {
> 		ring0_addr: 192.168.1.51
> 		nodeid: 2
> 	}
> 	node {
> 		ring0_addr: 192.168.1.52
> 		nodeid: 3
> 	}
> 	node {
> 		ring0_addr: 192.168.1.53
> 		nodeid: 4
> 	}
> }
> --------%<------------------
> 
> logging and quorum are as supplied.
> 
> Any help will be gratefully received.
> 
> Regards,
> Martin