[Pacemaker] [Question and Problem] In a vSphere 5.1 environment, pengine blocks on I/O for a long time when the shared disk fails.
renayama19661014 at ybb.ne.jp
Wed May 15 05:31:29 UTC 2013
Hi Andrew,
> > Thank you for your comments.
> >
> >>> The guests are placed on the shared disk.
> >>
> >> What is on the shared disk? The whole OS or app-specific data (i.e. nothing pacemaker needs directly)?
> >
> > The shared disk holds the whole OS and all the data.
>
> Oh. I can imagine that being problematic.
> Pacemaker really isn't designed to function without disk access.
I think so, too.
That is why I made the following suggestion:
> >>> For example...
> >>> 1. crmd monitors its request to pengine with a timer...
> >>> 2. pengine performs its writes under a timer and monitors their progress....
> >>> ..etc...
But there may be a better method.
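As a rough illustration of the suggestion above, the write could run in a worker thread that is watched with a timer, so a hung fsync() does not block the caller indefinitely. This is only a hypothetical sketch; the names (write_pe_file, TIMEOUT_S) are made up and this is not how pengine is actually implemented:

```python
import os
import threading

TIMEOUT_S = 5.0  # illustrative timeout, not a real Pacemaker setting

def write_pe_file(path, data):
    # Plain blocking write + fsync; fsync() is the call that blocked
    # in the strace output below.
    with open(path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())

def write_with_timeout(path, data, timeout=TIMEOUT_S):
    # Run the write in a worker thread and wait on it with a timer.
    t = threading.Thread(target=write_pe_file, args=(path, data), daemon=True)
    t.start()
    t.join(timeout)
    if t.is_alive():
        # Still blocked: give up on saving this file so the caller
        # can continue; the worker is left to finish on its own.
        return False
    return True
```

On a healthy disk the write finishes quickly and the function returns True; when the disk has failed, the caller gets False after the timeout instead of hanging.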
>
> You might be able to get away with it if you turn off saving PE files to disk though.
>
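For reference, turning off the saving of PE series files might look like the following. This is only a sketch: please verify these cluster options and their exact semantics against the documentation for your Pacemaker version before relying on them.

```shell
# Sketch: stop keeping Policy Engine series files on disk.
# Option names exist in Pacemaker's cluster options; whether 0
# fully disables saving should be checked for your version.
crm_attribute --type crm_config --name pe-input-series-max --update 0
crm_attribute --type crm_config --name pe-error-series-max --update 0
crm_attribute --type crm_config --name pe-warn-series-max --update 0
```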
> > The placement of this shared disk is similar in KVM where the problem does not occur.
>
> That it works in KVM in this situation is kind of surprising.
> Or perhaps I misunderstand.
I will check the details of the behavior on KVM once again.
However, the behavior on KVM is clearly different from the behavior on vSphere 5.1.
Best Regards,
Hideo Yamauchi.
>
> >
> > * We understand that the behavior differs depending on the hypervisor.
> > * However, it seems necessary to work around this problem in order to use Pacemaker in a vSphere 5.1 environment.
> >
> > Best Regards,
> > Hideo Yamauchi.
> >
> >
> > --- On Wed, 2013/5/15, Andrew Beekhof <andrew at beekhof.net> wrote:
> >
> >>
> >> On 13/05/2013, at 4:14 PM, renayama19661014 at ybb.ne.jp wrote:
> >>
> >>> Hi All,
> >>>
> >>> We built a simple cluster in a vSphere 5.1 environment.
> >>>
> >>> It consists of two ESXi servers and a shared disk.
> >>>
> >>> The guests are placed on the shared disk.
> >>
> >> What is on the shared disk? The whole OS or app-specific data (i.e. nothing pacemaker needs directly)?
> >>
> >>>
> >>>
> >>> Step 1) Build the cluster. (The DC node is the active node.)
> >>>
> >>> ============
> >>> Last updated: Mon May 13 14:16:09 2013
> >>> Stack: Heartbeat
> >>> Current DC: pgsr01 (85a81130-4fed-4932-ab4c-21ac2320186f) - partition with quorum
> >>> Version: 1.0.13-30bb726
> >>> 2 Nodes configured, unknown expected votes
> >>> 2 Resources configured.
> >>> ============
> >>>
> >>> Online: [ pgsr01 pgsr02 ]
> >>>
> >>> Resource Group: test-group
> >>> Dummy1 (ocf::pacemaker:Dummy): Started pgsr01
> >>> Dummy2 (ocf::pacemaker:Dummy): Started pgsr01
> >>> Clone Set: clnPingd
> >>> Started: [ pgsr01 pgsr02 ]
> >>>
> >>> Node Attributes:
> >>> * Node pgsr01:
> >>> + default_ping_set : 100
> >>> * Node pgsr02:
> >>> + default_ping_set : 100
> >>>
> >>> Migration summary:
> >>> * Node pgsr01:
> >>> * Node pgsr02:
> >>>
> >>>
> >>> Step 2) Attach strace to the pengine process on the DC node.
> >>>
> >>> [root at pgsr01 ~]# ps -ef |grep heartbeat
> >>> root 2072 1 0 13:56 ? 00:00:00 heartbeat: master control process
> >>> root 2075 2072 0 13:56 ? 00:00:00 heartbeat: FIFO reader
> >>> root 2076 2072 0 13:56 ? 00:00:00 heartbeat: write: bcast eth1
> >>> root 2077 2072 0 13:56 ? 00:00:00 heartbeat: read: bcast eth1
> >>> root 2078 2072 0 13:56 ? 00:00:00 heartbeat: write: bcast eth2
> >>> root 2079 2072 0 13:56 ? 00:00:00 heartbeat: read: bcast eth2
> >>> 496 2082 2072 0 13:57 ? 00:00:00 /usr/lib64/heartbeat/ccm
> >>> 496 2083 2072 0 13:57 ? 00:00:00 /usr/lib64/heartbeat/cib
> >>> root 2084 2072 0 13:57 ? 00:00:00 /usr/lib64/heartbeat/lrmd -r
> >>> root 2085 2072 0 13:57 ? 00:00:00 /usr/lib64/heartbeat/stonithd
> >>> 496 2086 2072 0 13:57 ? 00:00:00 /usr/lib64/heartbeat/attrd
> >>> 496 2087 2072 0 13:57 ? 00:00:00 /usr/lib64/heartbeat/crmd
> >>> 496 2089 2087 0 13:57 ? 00:00:00 /usr/lib64/heartbeat/pengine
> >>> root 2182 1 0 14:15 ? 00:00:00 /usr/lib64/heartbeat/pingd -D -p /var/run//pingd-default_ping_set -a default_ping_set -d 5s -m 100 -i 1 -h 192.168.101.254
> >>> root 2287 1973 0 14:16 pts/0 00:00:00 grep heartbea
> >>>
> >>> [root at pgsr01 ~]# strace -p 2089
> >>> Process 2089 attached - interrupt to quit
> >>> restart_syscall(<... resuming interrupted call ...>) = 0
> >>> times({tms_utime=5, tms_stime=6, tms_cutime=0, tms_cstime=0}) = 429527557
> >>> recvfrom(5, 0xa93ff7, 953, 64, 0, 0) = -1 EAGAIN (Resource temporarily unavailable)
> >>> poll([{fd=5, events=0}], 1, 0) = 0 (Timeout)
> >>> recvfrom(5, 0xa93ff7, 953, 64, 0, 0) = -1 EAGAIN (Resource temporarily unavailable)
> >>> poll([{fd=5, events=0}], 1, 0) = 0 (Timeout)
> >>> (snip)
> >>>
> >>>
> >>> Step 3) Disconnect the shared disk on which the active node is placed.
> >>>
> >>> Step 4) Cut off pingd communication on the standby node.
> >>> The pingd score is updated correctly, but pengine's processing is blocked.
> >>>
> >>> ~ # esxcfg-vswitch -N vmnic1 -p "ap-db" vSwitch1
> >>> ~ # esxcfg-vswitch -N vmnic2 -p "ap-db" vSwitch1
> >>>
> >>>
> >>> (snip)
> >>> brk(0xd05000) = 0xd05000
> >>> brk(0xeed000) = 0xeed000
> >>> brk(0xf2d000) = 0xf2d000
> >>> fstat(6, {st_mode=S_IFREG|0600, st_size=0, ...}) = 0
> >>> mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f86a255a000
> >>> write(6, "BZh51AY&SY\327\373\370\203\0\t(_\200UPX\3\377\377%cT \277\377\377"..., 2243) = 2243
> >>> brk(0xb1d000) = 0xb1d000
> >>> fsync(6 ------------------------------> BLOCKED
> >>> (snip)
> >>>
> >>>
> >>> ============
> >>> Last updated: Mon May 13 14:19:15 2013
> >>> Stack: Heartbeat
> >>> Current DC: pgsr01 (85a81130-4fed-4932-ab4c-21ac2320186f) - partition with quorum
> >>> Version: 1.0.13-30bb726
> >>> 2 Nodes configured, unknown expected votes
> >>> 2 Resources configured.
> >>> ============
> >>>
> >>> Online: [ pgsr01 pgsr02 ]
> >>>
> >>> Resource Group: test-group
> >>> Dummy1 (ocf::pacemaker:Dummy): Started pgsr01
> >>> Dummy2 (ocf::pacemaker:Dummy): Started pgsr01
> >>> Clone Set: clnPingd
> >>> Started: [ pgsr01 pgsr02 ]
> >>>
> >>> Node Attributes:
> >>> * Node pgsr01:
> >>> + default_ping_set : 100
> >>> * Node pgsr02:
> >>> + default_ping_set : 0 : Connectivity is lost
> >>>
> >>> Migration summary:
> >>> * Node pgsr01:
> >>> * Node pgsr02:
> >>>
> >>>
> >>> Step 5) Reconnect pingd communication on the standby node.
> >>> The pingd score is updated correctly, but pengine's processing remains blocked.
> >>>
> >>>
> >>> ~ # esxcfg-vswitch -M vmnic1 -p "ap-db" vSwitch1
> >>> ~ # esxcfg-vswitch -M vmnic2 -p "ap-db" vSwitch1
> >>>
> >>> ============
> >>> Last updated: Mon May 13 14:19:40 2013
> >>> Stack: Heartbeat
> >>> Current DC: pgsr01 (85a81130-4fed-4932-ab4c-21ac2320186f) - partition with quorum
> >>> Version: 1.0.13-30bb726
> >>> 2 Nodes configured, unknown expected votes
> >>> 2 Resources configured.
> >>> ============
> >>>
> >>> Online: [ pgsr01 pgsr02 ]
> >>>
> >>> Resource Group: test-group
> >>> Dummy1 (ocf::pacemaker:Dummy): Started pgsr01
> >>> Dummy2 (ocf::pacemaker:Dummy): Started pgsr01
> >>> Clone Set: clnPingd
> >>> Started: [ pgsr01 pgsr02 ]
> >>>
> >>> Node Attributes:
> >>> * Node pgsr01:
> >>> + default_ping_set : 100
> >>> * Node pgsr02:
> >>> + default_ping_set : 100
> >>>
> >>> Migration summary:
> >>> * Node pgsr01:
> >>> * Node pgsr02:
> >>>
> >>>
> >>> --------- The blocked state of pengine continues -----
> >>>
> >>> Step 6) Cut off pingd communication on the active node.
> >>> The pingd score is updated correctly, but pengine's processing remains blocked.
> >>>
> >>>
> >>> ~ # esxcfg-vswitch -N vmnic1 -p "ap-db" vSwitch1
> >>> ~ # esxcfg-vswitch -N vmnic2 -p "ap-db" vSwitch1
> >>>
> >>>
> >>> ============
> >>> Last updated: Mon May 13 14:20:32 2013
> >>> Stack: Heartbeat
> >>> Current DC: pgsr01 (85a81130-4fed-4932-ab4c-21ac2320186f) - partition with quorum
> >>> Version: 1.0.13-30bb726
> >>> 2 Nodes configured, unknown expected votes
> >>> 2 Resources configured.
> >>> ============
> >>>
> >>> Online: [ pgsr01 pgsr02 ]
> >>>
> >>> Resource Group: test-group
> >>> Dummy1 (ocf::pacemaker:Dummy): Started pgsr01
> >>> Dummy2 (ocf::pacemaker:Dummy): Started pgsr01
> >>> Clone Set: clnPingd
> >>> Started: [ pgsr01 pgsr02 ]
> >>>
> >>> Node Attributes:
> >>> * Node pgsr01:
> >>> + default_ping_set : 0 : Connectivity is lost
> >>> * Node pgsr02:
> >>> + default_ping_set : 100
> >>>
> >>> Migration summary:
> >>> * Node pgsr01:
> >>> * Node pgsr02:
> >>>
> >>> --------- The blocked state of pengine continues -----
> >>>
> >>>
> >>> After that, the resources do not fail over to the standby node, because no transition can be computed while pengine remains blocked.
> >>> In the vSphere environment, the block is released only after a considerable time, and a transition is finally generated.
> >>> * The I/O blocking of pengine seems to occur repeatedly.
> >>> * Other processes may be blocked, too.
> >>> * It took more than one hour from the failure to failover completion.
> >>>
> >>> This problem shows that resource failover may not occur after a disk failure in a vSphere environment.
> >>>
> >>> Because our users want to run Pacemaker in a vSphere environment, a solution to this problem is necessary.
> >>>
> >>> Do you know of any case where a similar problem was solved on vSphere?
> >>>
> >>> If there is no known solution, we think it is necessary to avoid blocking pengine.
> >>>
> >>> For example...
> >>> 1. crmd monitors its request to pengine with a timer...
> >>> 2. pengine performs its writes under a timer and monitors their progress....
> >>> ..etc...
> >>>
> >>> * This problem does not seem to occur on KVM.
> >>> * The difference may come from the hypervisor.
> >>> * In addition, the problem did not occur on a physical Linux machine.
> >>>
> >>>
> >>> Best Regards,
> >>> Hideo Yamauchi.
> >>>
> >>>
> >>>
> >>> _______________________________________________
> >>> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
> >>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> >>>
> >>> Project Home: http://www.clusterlabs.org
> >>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> >>> Bugs: http://bugs.clusterlabs.org
> >>
> >>
> >
>
>