[Pacemaker] [Question] About "quorum-policy=freeze" and "promote".

renayama19661014 at ybb.ne.jp renayama19661014 at ybb.ne.jp
Fri May 9 01:54:32 EDT 2014


Hi Andrew,

> > Okay.
> > I hope this problem will be fixed in the next release.
> 
> crm_report?

I confirmed the problem again with PM1.2-rc1 and filed it in Bugzilla.
 * http://bugs.clusterlabs.org/show_bug.cgi?id=5212

I attached the crm_report output to the Bugzilla entry.
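
For reference, the configuration under discussion can be sketched in crm shell syntax roughly as follows. This is only a minimal sketch: the resource names and the Stateful agent come from the status output quoted below, while the operation intervals and meta attributes are assumptions. (The cluster property is actually spelled no-quorum-policy.)

```
property no-quorum-policy="freeze" \
         stonith-enabled="true"
primitive pgsql ocf:pacemaker:Stateful \
        op monitor interval="10s" role="Master" \
        op monitor interval="11s" role="Slave"
ms msPostgresql pgsql \
        meta master-max="1" clone-max="3" notify="true"
```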

Best Regards,
Hideo Yamauchi.

--- On Fri, 2014/5/9, Andrew Beekhof <andrew at beekhof.net> wrote:

> 
> On 9 May 2014, at 2:05 pm, renayama19661014 at ybb.ne.jp wrote:
> 
> > Hi Andrew,
> > 
> > Thank you for comment.
> > 
> >>> Is it the resource agent's responsibility to prevent this multiple-Master state?
> >> 
> >> No.
> >> 
> >> In this scenario, no nodes have quorum and therefore no additional instances should have been promoted.  That's the definition of "freeze" :)
> >> Even if one partition DID have quorum, no instances should have been promoted without fencing occurring first.
> > 
> > Okay.
> > I hope this problem will be fixed in the next release.
> 
> crm_report?
> 
> > 
> > Many Thanks!
> > Hideo Yamauchi.
> > 
> > --- On Fri, 2014/5/9, Andrew Beekhof <andrew at beekhof.net> wrote:
> > 
> >> 
> >> On 8 May 2014, at 1:37 pm, renayama19661014 at ybb.ne.jp wrote:
> >> 
> >>> Hi All,
> >>> 
> >>> I built a three-node cluster with a Master/Slave resource and quorum-policy="freeze".
> >>> (I use the Stateful RA for the Master/Slave resource.)
> >>> 
> >>> ---------------------------------
> >>> Current DC: srv01 (3232238280) - partition with quorum
> >>> Version: 1.1.11-830af67
> >>> 3 Nodes configured
> >>> 9 Resources configured
> >>> 
> >>> 
> >>> Online: [ srv01 srv02 srv03 ]
> >>> 
> >>> Resource Group: grpStonith1
> >>>      prmStonith1-1      (stonith:external/ssh): Started srv02 
> >>> Resource Group: grpStonith2
> >>>      prmStonith2-1      (stonith:external/ssh): Started srv01 
> >>> Resource Group: grpStonith3
> >>>      prmStonith3-1      (stonith:external/ssh): Started srv01 
> >>> Master/Slave Set: msPostgresql [pgsql]
> >>>      Masters: [ srv01 ]
> >>>      Slaves: [ srv02 srv03 ]
> >>> Clone Set: clnPingd [prmPingd]
> >>>      Started: [ srv01 srv02 srv03 ]
> >>> ---------------------------------
> >>> 
> >>> 
> >>> When I interrupt the interconnect between all of the nodes, the resource is promoted to Master on every node.
> >>> 
> >>> ---------------------------------
> >>> Node srv02 (3232238290): UNCLEAN (offline)
> >>> Node srv03 (3232238300): UNCLEAN (offline)
> >>> Online: [ srv01 ]
> >>> 
> >>> Resource Group: grpStonith1
> >>>      prmStonith1-1      (stonith:external/ssh): Started srv02 
> >>> Resource Group: grpStonith2
> >>>      prmStonith2-1      (stonith:external/ssh): Started srv01 
> >>> Resource Group: grpStonith3
> >>>      prmStonith3-1      (stonith:external/ssh): Started srv01 
> >>> Master/Slave Set: msPostgresql [pgsql]
> >>>      Masters: [ srv01 ]
> >>>      Slaves: [ srv02 srv03 ]
> >>> Clone Set: clnPingd [prmPingd]
> >>>      Started: [ srv01 srv02 srv03 ]
> >>> (snip)
> >>> Node srv01 (3232238280): UNCLEAN (offline)
> >>> Node srv03 (3232238300): UNCLEAN (offline)
> >>> Online: [ srv02 ]
> >>> 
> >>> Resource Group: grpStonith1
> >>>      prmStonith1-1      (stonith:external/ssh): Started srv02 
> >>> Resource Group: grpStonith2
> >>>      prmStonith2-1      (stonith:external/ssh): Started srv01 
> >>> Resource Group: grpStonith3
> >>>      prmStonith3-1      (stonith:external/ssh): Started srv01 
> >>> Master/Slave Set: msPostgresql [pgsql]
> >>>      Masters: [ srv01 srv02 ]
> >>>      Slaves: [ srv03 ]
> >>> Clone Set: clnPingd [prmPingd]
> >>>      Started: [ srv01 srv02 srv03 ]
> >>> (snip)
> >>> Node srv01 (3232238280): UNCLEAN (offline)
> >>> Node srv02 (3232238290): UNCLEAN (offline)
> >>> Online: [ srv03 ]
> >>> 
> >>> Resource Group: grpStonith1
> >>>      prmStonith1-1      (stonith:external/ssh): Started srv02 
> >>> Resource Group: grpStonith2
> >>>      prmStonith2-1      (stonith:external/ssh): Started srv01 
> >>> Resource Group: grpStonith3
> >>>      prmStonith3-1      (stonith:external/ssh): Started srv01 
> >>> Master/Slave Set: msPostgresql [pgsql]
> >>>      Masters: [ srv01 srv03 ]
> >>>      Slaves: [ srv02 ]
> >>> Clone Set: clnPingd [prmPingd]
> >>>      Started: [ srv01 srv02 srv03 ]
> >>> ---------------------------------
> >>> 
> >>> My understanding is that promoting the Master/Slave resource even after the cluster loses quorum is Pacemaker's specified behavior.
> >>> 
> >>> Is it the resource agent's responsibility to prevent this multiple-Master state?
> >> 
> >> No.
> >> 
> >> In this scenario, no nodes have quorum and therefore no additional instances should have been promoted.  That's the definition of "freeze" :)
> >> Even if one partition DID have quorum, no instances should have been promoted without fencing occurring first.
> >> 
> >>> * I think the drbd RA has such safeguards.
> >>> * But the Stateful RA has no such function.
> >>> * So I think a mechanism like drbd's is always necessary when writing a new Master/Slave resource agent.
> >>> 
> >>> Is my understanding wrong?
> >>> 
> >>> Best Regards,
> >>> Hideo Yamauchi.
> >>> 
> >>> 
> >>> _______________________________________________
> >>> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
> >>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> >>> 
> >>> Project Home: http://www.clusterlabs.org
> >>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> >>> Bugs: http://bugs.clusterlabs.org
> >> 
> >> 
> 
> 
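
As an illustration of the kind of safeguard mentioned for the drbd RA above, a resource agent could in principle refuse to promote without quorum. This is only a hypothetical sketch, not drbd's actual mechanism: `ocf_log` and the `OCF_*` return codes come from the OCF shell function library, and `crm_node -q` prints 1 when the local partition has quorum.

```shell
#!/bin/sh
# Hypothetical promote-guard for a Stateful-style OCF resource agent.
# crm_node -q prints "1" if the local node's partition has quorum,
# "0" otherwise.
stateful_promote() {
    if [ "$(crm_node -q 2>/dev/null)" != "1" ]; then
        ocf_log err "refusing promote: partition has no quorum"
        return "$OCF_ERR_GENERIC"
    fi
    # ...perform the real promotion work here...
    return "$OCF_SUCCESS"
}
```

The conclusion in the thread, though, is that Pacemaker itself must prevent promotion in this situation (freeze plus fencing first), so such a guard would only be a defensive extra.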

