[ClusterLabs] Re: Resources won't start on new node unless it is the only active node

Ulrich Windl Ulrich.Windl at rz.uni-regensburg.de
Wed Nov 9 03:33:02 EST 2016


>>> Ryan Anstey <ryan at treasuremart.net> wrote on 08.11.2016 at 19:54 in message
<CAPj0oHxtumVgCbGO7ff6xTtd+=3FZbWiSBVmrHOCFuBRLMR-pg at mail.gmail.com>:
> I've been running a ceph cluster with pacemaker for a few months now.
> Everything has been working normally, but when I added a fourth node it
> won't work like the others, even though their OS is the same and the

Welcome to the club ;-)

> configs are all synced via salt. I also don't understand pacemaker that
> well since I followed a guide for it. If anyone could steer me in the right
> direction I would greatly appreciate it. Thank you!

I would start by examining and posting the cluster status (I use "crm_mon -1Arfj"). Is everything online? Do you get the same status from each node?
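
A rough sketch of that comparison, in case it helps. It assumes key-based ssh access to every node, and the names NODES, REMOTE and status_differs are made up for illustration; only h3 and h4 are known from the logs, so the node list has to be completed by hand. The "Last updated" line is filtered out because it carries a timestamp that changes with every call.

```shell
#!/bin/sh
# Hypothetical helper: compare the one-shot status as seen by each node.
# NODES and REMOTE are assumptions; only h3 and h4 appear in the logs.
NODES="${NODES:-h3 h4}"
REMOTE="${REMOTE:-ssh}"   # assumes key-based ssh login to every node

status_differs() {
    tmpdir=$(mktemp -d) || return 2
    first=""
    rc=1                                  # 1 = all views identical
    for node in $NODES; do
        out="$tmpdir/$node"
        # Drop the "Last updated" timestamp line, which changes per call.
        $REMOTE "$node" crm_mon -1Arfj 2>&1 | grep -v '^Last updated:' > "$out"
        if [ -z "$first" ]; then
            first="$out"                  # first node is the reference view
        elif ! diff "$first" "$out"; then
            echo "status on $node differs from $(basename "$first")"
            rc=0                          # 0 = at least one view differs
        fi
    done
    rm -rf "$tmpdir"
    return $rc
}
```

Calling "status_differs" prints a diff for every node whose view deviates from the first node's view and returns 0 in that case. Nodes that disagree about who is online or which node is the DC are the ones to look at first.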

> 
> - My resources only start if the new node is the only active node.
> - Once started on new node, if they are moved back to one of the original
> nodes, it won't go back to the new one.
> - My resources work 100% if I start them manually (without pacemaker).
> - (In the logs/configs below, my resources are named "unifi", "rbd_unifi"
> being the main one that's not working.)
> 
> Log when cleaning up the resource on the NEW node:
> 
> Nov 08 09:25:20 h4 Filesystem(fs_unifi)[18044]: WARNING: Couldn't find
> device [/dev/rbd/rbd/unifi]. Expected /dev/??? to exist
> Nov 08 09:25:20 h4 lrmd[3564]: notice: lxc_unifi_monitor_0:18018:stderr [
> unifi doesn't exist ]
> Nov 08 09:25:20 h4 crmd[3567]: notice: Operation lxc_unifi_monitor_0: not
> running (node=h4, call=484, rc=7, cib-update=390, confirmed=true)
> Nov 08 09:25:20 h4 crmd[3567]: notice: h4-lxc_unifi_monitor_0:484 [ unifi
> doesn't exist\n ]
> Nov 08 09:25:20 h4 crmd[3567]: notice: Operation fs_unifi_monitor_0: not
> running (node=h4, call=480, rc=7, cib-update=391, confirmed=true)
> Nov 08 09:25:20 h4 crmd[3567]: notice: Operation rbd_unifi_monitor_0: not
> running (node=h4, call=476, rc=7, cib-update=392, confirmed=true)
> 
> Log when cleaning up the resource on the OLD node:
> 
> Nov 08 09:21:18 h3 crmd[11394]: warning: No match for shutdown action on
> 167838209

This indicates a node communication problem!
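
To find out which node the number 167838209 refers to, it may help to read it as an IPv4 address. This is only a sketch and rests on an assumption: that no nodeid is set in corosync.conf, in which case corosync derives the ID from the ring0 IPv4 address. The function name nodeid_to_ip is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: interpret a numeric node ID as a 32-bit IPv4 address.
# Only meaningful if the ID was auto-generated from the ring0 address.
nodeid_to_ip() {
    id=$1
    # Split the 32-bit value into its four octets, highest byte first.
    echo "$(( (id >> 24) & 255 )).$(( (id >> 16) & 255 )).$(( (id >> 8) & 255 )).$(( id & 255 ))"
}

nodeid_to_ip 167838209   # prints 10.1.2.1
```

If that address belongs to one of your nodes, compare the output of "crm_node -l" and "corosync-cfgtool -s" on that node and on h3 to see whether both sides know about each other.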

[...]

Regards,
Ulrich





