[Pacemaker] Node name problems after upgrading to 1.1.9

Bernardo Cabezas Serra bcabezas at apsl.net
Fri Jun 28 04:42:01 EDT 2013


Hello Andrew,

On 27/06/13 14:44, Andrew Beekhof wrote:
> You should see additional logs sent to /var/log/pacemaker.log

The issue finally happened again yesterday. This time, node "selavi" was
DC and node "turifel" joined the cluster. The cluster was in unmanaged
status.

Unfortunately, I have no pacemaker trace logs on selavi, but I do have
them for turifel.

So here are the related logs: the corosync log for node selavi, and the
pacemaker log for the same time period on node turifel.

selavi corosync log:
http://pastebin.com/QfUSpv0z

turifel pacemaker log:
http://pastebin.com/r8umCk43


I hope this helps. I have downgraded again to the "old" stable versions
(corosync 2.3.0 and pacemaker 1.1.8), so this issue is not a problem
for me right now. But if I can help by testing something, please tell me.

There are things I don't understand in these logs, like this excerpt
from the selavi corosync log (lines 53 to 55):

warning: crm_get_peer: Node 'turifel' and 'turifel' share the same
cluster nodeid: 168385835
error: do_dc_join_filter_offer: Node turifel is not a member
error: do_dc_join_filter_offer: join-3: NACK'ing node turifel (ref
join_request-crmd-1372347834-39)
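
In case anyone wants to check their own logs for the same pattern, here
is a minimal sketch in Python. The regular expression is only my guess
based on the warning text quoted above, and the function names
(find_nodeid_conflicts, self_conflicts) are made up for illustration;
they are not part of pacemaker or corosync.

```python
import re

# Pattern derived only from the warning quoted above (an assumption).
# \s+ tolerates the line wrap introduced by the mail client.
DUPLICATE_ID = re.compile(
    r"crm_get_peer:\s+Node '(?P<a>[^']+)' and '(?P<b>[^']+)' "
    r"share the same\s+cluster nodeid:\s+(?P<nodeid>\d+)"
)


def find_nodeid_conflicts(text):
    """Return (name_a, name_b, nodeid) for every matching warning in text."""
    return [
        (m.group("a"), m.group("b"), int(m.group("nodeid")))
        for m in DUPLICATE_ID.finditer(text)
    ]


def self_conflicts(text):
    """Keep only the odd case where a node 'conflicts' with its own name."""
    return [c for c in find_nodeid_conflicts(text) if c[0] == c[1]]
```

Running self_conflicts() over the text of a log file would list each
warning in which both node names are identical, as in the excerpt above.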


Also, in the pacemaker log, turifel tries to take the DC role away from
selavi (lines 297, 316...).

Best regards,
Bernardo

-- 
APSL
Bernardo Cabezas Serra




