[Pacemaker] reboot of non-vm host results in VM restart -- of chickens and eggs and VMs

Thu Dec 19 13:48:07 EST 2013

Maybe the problem is this: the cluster tries to start the VM while
libvirtd isn't started yet.
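
One way to test this on the rebooted node is to check for the libvirt
socket named in the log before Pacemaker runs its first monitor. This is
only a minimal sketch: the function name check_libvirt_sock is made up for
illustration, and the default path is simply the one from the error
message quoted below.

```shell
# Hypothetical helper (not part of Pacemaker or libvirt): report whether
# the libvirtd UNIX socket exists at the given path.
check_libvirt_sock() {
    # Default path is the one from the lrmd error message in this thread.
    sock="${1:-/var/run/libvirt/libvirt-sock}"
    if [ -S "$sock" ]; then
        echo "present: $sock"
        return 0
    fi
    echo "missing: $sock"
    return 1
}
```

If check_libvirt_sock prints "missing: ..." right after boot, before
libvirtd has been started, that would match the "No such file or
directory" error in the log.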


2013/12/19 emmanuel segura <emi2fast at gmail.com>

> If you don't set your VM to start at boot time, you don't need to put
> libvirtd in the cluster. Maybe that isn't the problem, but why put OS
> services in the cluster at all, for example crond ...... :)
>
>
> 2013/12/19 Bob Haxo <bhaxo at sgi.com>
>
>>  Hello,
>>
>> Earlier emails related to this topic:
>> [pacemaker] chicken-egg-problem with libvirtd and a VM within cluster
>> [pacemaker] VirtualDomain problem after reboot of one node
>>
>>
>> My configuration:
>>
>> RHEL6.5/CMAN/gfs2/Pacemaker/crmsh
>>
>> pacemaker-libs-1.1.10-14.el6_5.1.x86_64
>> pacemaker-cli-1.1.10-14.el6_5.1.x86_64
>> pacemaker-1.1.10-14.el6_5.1.x86_64
>> pacemaker-cluster-libs-1.1.10-14.el6_5.1.x86_64
>>
>> Two-node HA VM cluster using a real shared drive, not DRBD.
>>
>> Resources (relevant to this discussion):
>> primitive p_fs_images ocf:heartbeat:Filesystem \
>> primitive p_libvirtd lsb:libvirtd \
>> primitive virt ocf:heartbeat:VirtualDomain \
>>
>> services chkconfig on: cman, clvmd, pacemaker
>> services chkconfig off: corosync, gfs2, libvirtd
>>
>> Observation:
>>
>> Rebooting the NON-host system results in the restart of the VM merrily
>> running on the host system.
>>
>> Apparent cause:
>>
>> Upon startup, Pacemaker apparently checks the status of configured
>> resources. However, the status request for the virt
>> (ocf:heartbeat:VirtualDomain) resource fails with:
>>
>> Dec 18 12:19:30 [4147] mici-admin2       lrmd:  warning: child_timeout_callback:        virt_monitor_0 process (PID 4158) timed out
>> Dec 18 12:19:30 [4147] mici-admin2       lrmd:  warning: operation_finished:    virt_monitor_0:4158 - timed out after 200000ms
>> Dec 18 12:19:30 [4147] mici-admin2       lrmd:   notice: operation_finished:    virt_monitor_0:4158:stderr [ error: Failed to reconnect to the hypervisor ]
>> Dec 18 12:19:30 [4147] mici-admin2       lrmd:   notice: operation_finished:    virt_monitor_0:4158:stderr [ error: no valid connection ]
>> Dec 18 12:19:30 [4147] mici-admin2       lrmd:   notice: operation_finished:    virt_monitor_0:4158:stderr [ error: Failed to connect socket to '/var/run/libvirt/libvirt-sock': No such file or directory ]
>>
>>
>> This failure then snowballs into an "orphan" situation in which the
>> running VM is restarted.
>>
>> There was the suggestion to chkconfig libvirtd on (and presumably
>> delete the resource) so that /var/run/libvirt/libvirt-sock is created
>> by the libvirtd service at boot. With libvirtd started by the system,
>> there is no unneeded reboot of the VM.
>>
>> However, it may be that removing libvirtd from Pacemaker control leaves
>> the VM vdisk filesystem susceptible to corruption during a
>> reboot-induced failover.
>>
>> Question:
>>
>> Is there an accepted Pacemaker configuration such that the unneeded
>> restart of the VM does not occur when the non-host system is rebooted?
>>
>> Regards,
>> Bob Haxo
>>
>>
>> _______________________________________________
>> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>
>> Project Home: http://www.clusterlabs.org
>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>> Bugs: http://bugs.clusterlabs.org
>>
>>
>
>
> --
> esta es mi vida e me la vivo hasta que dios quiera
>

