[Pacemaker] Remote Access not Working
Colin
colin.hch at gmail.com
Mon Nov 9 16:24:21 UTC 2009
Hi All,
I just tried to get remote access to the cluster up and running, but
with more errors than success.
The starting point was a working cluster installation. I then ran
# cibadmin --modify -X '<cib remote-clear-port="6900"/>'
# /etc/init.d/corosync stop
# /etc/init.d/corosync start
to get the listener, erm, listening:
# netstat -ant | grep 6900
tcp 0 0 0.0.0.0:6900 0.0.0.0:* LISTEN
For a first test I also changed the password of the "hacluster" user.
Then, on another machine, I set up the environment variables as follows:
# env | grep CIB
CIB_server=192.168.80.10
CIB_user=hacluster
CIB_port=6900
I then issued a simple command, crm_resource --list. The crm_resource
command asks for a password and then hangs. On the cluster machine I
find the following in /var/log/daemon.log:
Nov 9 17:15:10 mz-dom0-001-4000 cib: [15698]: debug: cib_remote_listen: New clear-text connection
Nov 9 17:15:10 mz-dom0-001-4000 cib: [15698]: ERROR: crm_xml_err: XML Error: Entity: line 1: parsererror : Start tag expected, '<' not found
Nov 9 17:15:10 mz-dom0-001-4000 cib: [15698]: ERROR: crm_xml_err: XML Error: #026#003#002
Nov 9 17:15:10 mz-dom0-001-4000 cib: [15698]: ERROR: crm_xml_err: XML Error: ^
Nov 9 17:15:10 mz-dom0-001-4000 cib: [15698]: WARN: string2xml: Parsing failed (domain=1, level=3, code=4): Start tag expected, '<' not found
Nov 9 17:15:10 mz-dom0-001-4000 cib: [15698]: ERROR: string2xml: Couldn't parse 3 chars: #026#003#002
Nov 9 17:15:10 mz-dom0-001-4000 cib: [15698]: ERROR: cib_recv_remote_msg: Couldn't parse: '#026#003#002'
Nov 9 17:15:26 mz-dom0-001-4000 cib: [15698]: ERROR: cib_recv_remote_msg: Empty reply
Nov 9 17:15:27 mz-dom0-001-4000 cib: [15698]: ERROR: cib_recv_remote_msg: Empty reply
Nov 9 17:15:28 mz-dom0-001-4000 cib: [15698]: ERROR: cib_recv_remote_msg: Empty reply
Nov 9 17:15:29 mz-dom0-001-4000 cib: [15698]: ERROR: cib_recv_remote_msg: Empty reply
Nov 9 17:15:30 mz-dom0-001-4000 cib: [15698]: ERROR: cib_recv_remote_msg: Empty reply
.........
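As a side note on the three bytes the cib could not parse: below is a minimal sketch for decoding them. It assumes the #NNN sequences are octal escapes of control characters, as rsyslog produces by default. The helper name decode_syslog_escapes is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: turn "#NNN" escapes (assumed octal) into hex bytes.
decode_syslog_escapes() {
    out=""
    # Replace each '#' with a space so the octal fields split into words.
    for oct in $(printf '%s' "$1" | tr '#' ' '); do
        # A leading 0 makes the shell read the digits as an octal constant.
        out="$out$(printf '0x%02x' "$((0$oct))") "
    done
    printf '%s\n' "${out% }"
}

decode_syslog_escapes '#026#003#002'   # prints: 0x16 0x03 0x02
```

If the octal assumption holds, 0x16 0x03 0x02 is how a TLS handshake record starts (content type 22, protocol version 3.2), not XML. That would suggest the client is attempting an encrypted connection to the clear-text port. The Pacemaker documentation lists a CIB_encrypted variable (default true) next to CIB_server, CIB_user and CIB_port, which may be relevant here, but that has not been tested.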
These "Empty reply" errors continue indefinitely, one per second, and
the cib process does not stop the normal way:
# /etc/init.d/corosync stop
Stopping corosync daemon: corosync.
# ps aux | grep cib
105 15698 0.3 0.7 13844 4588 ? S 17:12 0:01 /usr/lib/heartbeat/cib
This seems to prevent other processes from cleanly shutting down, too.
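A leftover process like this can be checked for and stopped by hand. The following is a rough workaround sketch, not a fix for the underlying problem. It assumes pgrep and pkill from procps are available, and stop_leftover is a made-up helper name.

```shell
#!/bin/sh
# Hypothetical helper: ask a named process to exit, wait up to a grace
# period, then force-kill whatever is still running.
stop_leftover() {
    name=$1
    grace=${2:-10}                         # seconds to wait after SIGTERM
    pids=$(pgrep -x "$name") || return 0   # nothing running, nothing to do
    kill -TERM $pids 2>/dev/null
    while [ "$grace" -gt 0 ] && pgrep -x "$name" >/dev/null; do
        sleep 1
        grace=$((grace - 1))
    done
    # Last resort for a process that ignores SIGTERM.
    if pgrep -x "$name" >/dev/null; then
        pkill -KILL -x "$name"
    fi
    return 0
}

# Example (as root on the cluster node, after stopping corosync):
# stop_leftover cib 10
```

Force-killing the cib is only reasonable once the rest of the cluster stack on that node is already down.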
Am I doing something obviously wrong?
Thanks, Colin
PS: AFAICS remote access does not support anything like failover or
connections to multiple cluster hosts, so will I have to roll my own
wrapper to take care of that?