No subject

Tue Oct 23 06:10:09 UTC 2012

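On Mar 28, 2013, at 8:13 AM, Rainer Brestan <rainer.brestan at gmx.net> wrote:

Hi Steve,
I think you have misunderstood how IP addresses are used with this setup; PGVIP should start after promotion.
Take a look at Takatoshi's wiki.
https://github.com/t-matsuo/resource-agents/wiki/Resource-Agent-for-PostgreSQL-9.1-streaming-replication

I see that he has the master/replication VIPs with a resource order to force promotion before moving the VIPs to the new master.

I don't get how the postgres service is going to listen on those interfaces if they have not already migrated to the new master, even with listen_addresses = "*" set.

The promotion sequence is very simple.
When no master exists, all slaves write their current replay xlog location into the node attribute PGSQL-xlog-loc during the monitor call.

Does this also hold true if a master fails?

From the looks of it, if there was a master before the failure, the master score is set from the function that grabs the data_status from the master (STREAMING|SYNC, STREAMING|ASYNC, STREAMING|POTENTIAL, etc.).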

The reason I ask is that if the master fails and the slaves don't then compare their xlog locations, there is a potential for data loss if the wrong slave is promoted.
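
To make the concern above concrete, here is a minimal Python sketch of comparing replay locations between two surviving slaves. It assumes the locations are in the X/Y hexadecimal form that PostgreSQL 9.2 reports (for example from pg_last_xlog_replay_location()). The function names and the location values are made up for illustration, and this is not the code the pgsql resource agent actually runs.

```python
from typing import Dict, Tuple


def parse_xlog_location(location: str) -> Tuple[int, int]:
    """Turn an 'X/Y' location string into a pair of integers (both parts are hex)."""
    high, low = location.split("/")
    return int(high, 16), int(low, 16)


def most_advanced(replayed: Dict[str, str]) -> str:
    """Return the node whose replay location is furthest ahead."""
    return max(replayed, key=lambda node: parse_xlog_location(replayed[node]))


# Hypothetical replay locations for the two surviving slaves.
replayed = {
    "p2.example.com": "0/A000000",
    "p3.example.com": "0/10000000",
}

print(most_advanced(replayed))
```

With these values p3.example.com is ahead, so promoting p2.example.com would discard the records that only p3 has replayed. Comparing the raw strings instead of the parsed numbers would pick p2 here, because "A" sorts after "1".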

You can see all of them with crm_mon -A1f.
Each slave gets these attributes from all nodes configured in the parameter node_list (hopefully your node names in Pacemaker are the same as in node_list) and compares them to find the highest.
If the highest in this list is its own, it sets the master-score to 1000; on other nodes it is set to 100.
Pacemaker then selects the node with the highest master score and promotes it.
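
The selection rule described above can be sketched roughly as follows. This is a simplified Python model, not the resource agent's actual code. It assumes each node's PGSQL-xlog-loc value has already been converted to a comparable integer, it treats a tie with the highest value as "highest" (an assumption), and the function names and values are hypothetical.

```python
from typing import Dict

PROMOTE_SCORE = 1000  # score quoted in the thread for the most advanced slave
STANDBY_SCORE = 100   # score quoted in the thread for the other slaves


def own_master_score(own_node: str, xlog_by_node: Dict[str, int]) -> int:
    """Score a slave would give itself after reading every node's attribute."""
    highest = max(xlog_by_node.values())
    # Assumption: a tie with the highest value still counts as "highest".
    if xlog_by_node[own_node] >= highest:
        return PROMOTE_SCORE
    return STANDBY_SCORE


def promotion_candidate(xlog_by_node: Dict[str, int]) -> str:
    """Node Pacemaker would promote: the one with the highest master score."""
    scores = {node: own_master_score(node, xlog_by_node) for node in xlog_by_node}
    return max(scores, key=scores.get)


# Hypothetical attribute values for the three nodes in node_list.
xlog = {
    "p1.example.com": 50331792,
    "p2.example.com": 50331920,
    "p3.example.com": 50331680,
}

print(promotion_candidate(xlog))
```

Here p2.example.com holds the highest value, so it scores itself 1000 while the other two score 100. The model also shows why the node names in node_list need to match the Pacemaker node names: a node that cannot be looked up cannot be compared.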

Rainer
Sent: Wednesday, 27 March 2013 at 14:37
From: "Steven Bambling" <smbambling at arin.net>
To: "The Pacemaker cluster resource manager" <pacemaker at oss.clusterlabs.org>
Subject: Re: [Pacemaker] PGSQL resource promotion issue
In talking with andreask on IRC, I learned that I had misunderstood the need to include the op monitor. I figured it was pulled from the resource script by default.

I used pcs to add the monitor operations, and one node was then promoted to master:

pcs resource add_operation PGSQL monitor interval=5s role=Master
pcs resource add_operation PGSQL monitor interval=7s

v/r

STEVE

On Mar 27, 2013, at 7:08 AM, Steven Bambling <smbambling at arin.net> wrote:

I've built and installed the latest resource-agents from GitHub on CentOS 6 and configured two resources:

1 primitive PGVIP:
pcs resource create PGVIP ocf:heartbeat:IPaddr2 ip=10.1.22.48 cidr_netmask=25 op monitor interval=1

Before setting up the PGSQL resource, I manually configured sync/streaming replication on the three nodes with p1.example.com as the master and verified that replication was working. I then removed synchronous_standby_names from my postgresql.conf and stopped all postgres services on all nodes.

1 master/slave PGSQL: -- I've set the resource to use sync replication. Also, I am using PostgreSQL 9.2.3

pcs resource create PGSQL ocf:heartbeat:pgsql params \
    pgctl="/usr/pgsql-9.2/bin/pg_ctl" \
    pgdata="/var/lib/pgsql/9.2/data" \
    config="/var/lib/pgsql/9.2/data/postgresql.conf" \
    stop_escalate="5" rep_mode="sync" \
    node_list="p1.example.com p2.example.com p3.example.com" \
    restore_command='cp /var/lib/pgsql/9.2/archive/%f "%p"' \
    master_ip="10.1.22.48" repuser="postgres" \
    restart_on_promote="true" tmpdir="/var/lib/pgsql/9.2/tmpdir" \
    xlog_check_count="3" crm_attr_timeout="5" \
    check_wal_receiver="true" --master

I'm able to successfully start all the nodes in the cluster; the PGVIP resource starts on the first node and the PGSQL:[012] resources start on each node in the cluster. The one thing I don't understand is why none of the slaves is taking over the master role.

Also, how would I go about force-promoting one of the slaves into the master role via the pcs command-line utility?

v/r

STEVE
_______________________________________________
Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org