[Pacemaker] Pacemaker unnecessarily (?) restarts a vm on active node when other node brought out of standby
Ian
cl-3627@jusme.com
Wed May 14 14:34:35 UTC 2014
Andrew Beekhof wrote:
> On 14 May 2014, at 5:23 am, Ian <cl-3627@jusme.com> wrote:
>
> Hmmm, master-max=2... I'd bet that is something the code might not be
> handling optimally.
> Can you attach a crm_report tarball for the period covered by your
> test?
Attached. Sequence was:
[root@sv07 ~]# date; ssh sv06 date
Wed May 14 15:06:23 BST 2014
Wed May 14 15:06:23 BST 2014
[root@sv07 ~]# pcs status
Cluster name: jusme
Last updated: Wed May 14 15:06:35 2014
Last change: Wed May 14 15:02:05 2014 via crm_attribute on sv07
Stack: cman
Current DC: sv07 - partition with quorum
Version: 1.1.10-14.el6_5.2-368c726
2 Nodes configured
7 Resources configured
Node sv06: standby
Online: [ sv07 ]
Full list of resources:
 Master/Slave Set: vm_storage_core_dev-master [vm_storage_core_dev]
     Masters: [ sv07 ]
     Stopped: [ sv06 ]
 Clone Set: vm_storage_core-clone [vm_storage_core]
     Started: [ sv07 ]
     Stopped: [ sv06 ]
 Master/Slave Set: nfs_server_dev-master [nfs_server_dev]
     Masters: [ sv07 ]
     Stopped: [ sv06 ]
 res_vm_nfs_server	(ocf::heartbeat:VirtualDomain):	Started sv07
[root@sv07 ~]# pcs cluster unstandby sv06
[root@sv07 ~]# date; ssh sv06 date
Wed May 14 15:07:18 BST 2014
Wed May 14 15:07:18 BST 2014
[root@sv07 ~]# pcs status
Cluster name: jusme
Last updated: Wed May 14 15:07:29 2014
Last change: Wed May 14 15:06:52 2014 via crm_attribute on sv07
Stack: cman
Current DC: sv07 - partition with quorum
Version: 1.1.10-14.el6_5.2-368c726
2 Nodes configured
7 Resources configured
Online: [ sv06 sv07 ]
Full list of resources:
 Master/Slave Set: vm_storage_core_dev-master [vm_storage_core_dev]
     Masters: [ sv07 ]
     Slaves: [ sv06 ]
 Clone Set: vm_storage_core-clone [vm_storage_core]
     Started: [ sv07 ]
     Stopped: [ sv06 ]
 Master/Slave Set: nfs_server_dev-master [nfs_server_dev]
     Masters: [ sv07 ]
     Slaves: [ sv06 ]
 res_vm_nfs_server	(ocf::heartbeat:VirtualDomain):	Started sv07
## About 1 minute later vm_storage_core_dev gets automatically promoted to
## master/master, provoking the unwanted gfs/vm restart...
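For reference, the dual promotion is driven by the clone's meta attributes (master-max=2 here). A quick way to inspect them, assuming the resource id shown in the status output above (exact sub-commands vary between pcs versions):

```shell
# Show the master/slave resource's configuration, including meta
# attributes such as master-max; with master-max=2, Pacemaker promotes
# on both nodes as soon as both become eligible.
pcs resource show vm_storage_core_dev-master

# Equivalent low-level query with crm_resource:
crm_resource --resource vm_storage_core_dev-master \
    --get-parameter master-max --meta
```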
[root@sv07 ~]# date; ssh sv06 date
Wed May 14 15:08:27 BST 2014
Wed May 14 15:08:27 BST 2014
[root@sv07 ~]# pcs status
Cluster name: jusme
Last updated: Wed May 14 15:08:28 2014
Last change: Wed May 14 15:06:52 2014 via crm_attribute on sv07
Stack: cman
Current DC: sv07 - partition with quorum
Version: 1.1.10-14.el6_5.2-368c726
2 Nodes configured
7 Resources configured
Online: [ sv06 sv07 ]
Full list of resources:
 Master/Slave Set: vm_storage_core_dev-master [vm_storage_core_dev]
     Masters: [ sv06 sv07 ]
 Clone Set: vm_storage_core-clone [vm_storage_core]
     Started: [ sv06 sv07 ]
 Master/Slave Set: nfs_server_dev-master [nfs_server_dev]
     Masters: [ sv07 ]
     Slaves: [ sv06 ]
 res_vm_nfs_server	(ocf::heartbeat:VirtualDomain):	Started sv07
[root@sv07 ~]# crm_report -f "2014-05-14 15:05:00" report-20140514-1
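In case it helps whoever digs into the tarball: the policy-engine inputs collected by crm_report can be replayed to see why the promotion (and the resulting restarts) were scheduled. A sketch, assuming the usual per-node layout inside the report and a hypothetical pe-input file name (XX is a placeholder):

```shell
# Unpack the report generated above
tar xjf report-20140514-1.tar.bz2

# Each pe-input-*.bz2 under a node's pengine/ directory is one scheduler
# transition; pick the one from around 15:08 when the promotion happened.
ls report-20140514-1/sv07/pengine/

# Replay that transition to see the actions Pacemaker decided on
crm_simulate --simulate \
    --xml-file report-20140514-1/sv07/pengine/pe-input-XX.bz2
```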
Ian.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: report-20140514-1.tar.bz2
Type: application/x-bzip2
Size: 227398 bytes
Desc: not available
URL: <https://lists.clusterlabs.org/pipermail/pacemaker/attachments/20140514/792131fa/attachment-0004.bz2>