[Pacemaker] Better way to change master in 3 node pgsql cluster

Mon Jan 13 09:32:12 UTC 2014

Hi

I have a 3-node PostgreSQL cluster.
It works well, but I have some trouble changing the master.

For now, if I need to change the master, I must:
1) Stop PGSQL and the cluster service on each node
2) Set up new PGSQL replication manually
3) Change the attributes on each node to point to the new master
4) Stop PGSQL on each node
5) Clean up the resource and start the cluster service

This takes a lot of time. Is there a better way to change the master?

This is my cluster service status:
Node Attributes:
* Node a.geocluster.e-autopay.com:
    + master-pgsql:0                   : 1000
    + pgsql-data-status               : LATEST
    + pgsql-master-baseline           : 000000002F000090
    + pgsql-status                     : PRI
* Node c.geocluster.e-autopay.com:
    + master-pgsql:0                   : 1000
    + pgsql-data-status               : SYNC
    + pgsql-status                     : STOP
* Node b.geocluster.e-autopay.com:
    + master-pgsql:0                   : 1000
    + pgsql-data-status               : SYNC
    + pgsql-status                     : STOP
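
To make the listing above easier to check at a glance, here is a minimal sketch that reads it and reports which node currently holds the PRI role. The function names (parse_node_attributes, primaries) and the parsing rules are my own assumptions about this text layout; they are not part of Pacemaker or the pgsql resource agent.

```python
import re

# Sample input: the "Node Attributes" listing exactly as shown above.
STATUS = """\
Node Attributes:
* Node a.geocluster.e-autopay.com:
    + master-pgsql:0                   : 1000
    + pgsql-data-status               : LATEST
    + pgsql-master-baseline           : 000000002F000090
    + pgsql-status                     : PRI
* Node c.geocluster.e-autopay.com:
    + master-pgsql:0                   : 1000
    + pgsql-data-status               : SYNC
    + pgsql-status                     : STOP
* Node b.geocluster.e-autopay.com:
    + master-pgsql:0                   : 1000
    + pgsql-data-status               : SYNC
    + pgsql-status                     : STOP
"""


def parse_node_attributes(text):
    """Return {node_name: {attribute: value}} for a listing like STATUS."""
    nodes = {}
    current = None
    for line in text.splitlines():
        # A node header looks like "* Node <name>:".
        node_match = re.match(r"\s*\*\s*Node\s+(\S+?):\s*$", line)
        if node_match:
            current = node_match.group(1)
            nodes[current] = {}
            continue
        # An attribute line looks like "+ <name>   : <value>".
        attr_match = re.match(r"\s*\+\s*(\S+)\s*:\s*(\S+)\s*$", line)
        if attr_match and current is not None:
            nodes[current][attr_match.group(1)] = attr_match.group(2)
    return nodes


def primaries(nodes):
    """List the nodes whose pgsql-status attribute is PRI."""
    return [name for name, attrs in nodes.items()
            if attrs.get("pgsql-status") == "PRI"]


if __name__ == "__main__":
    print(primaries(parse_node_attributes(STATUS)))
```

An empty result from primaries() would correspond to the situation where no node has been promoted.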

I used http://clusterlabs.org/wiki/PgSQL_Replicated_Cluster to set up my
3-node cluster without hard stik.
Now I have a strange situation where all nodes stay slaves:
============
Last updated: Sat Dec  7 04:33:47 2013
Last change: Sat Dec  7 12:56:23 2013 via crmd on a
Stack: openais
Current DC: c - partition with quorum
Version: 1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff
5 Nodes configured, 3 expected votes
4 Resources configured.
============

Online: [ a c b ]

 Master/Slave Set: msPostgresql [pgsql]
     Slaves: [ a c b ]

My config is:
node a \
attributes pgsql-data-status="DISCONNECT"
node b \
attributes pgsql-data-status="DISCONNECT"
node c \
attributes pgsql-data-status="DISCONNECT"
primitive pgsql ocf:heartbeat:pgsql \
params pgctl="/usr/lib/postgresql/9.3/bin/pg_ctl" psql="/usr/bin/psql" \
pgdata="/var/lib/postgresql/9.3/main" start_opt="-p 5432" rep_mode="sync" \
node_list="a b c" \
restore_command="cp /var/lib/postgresql/9.3/pg_archive/%f %p" \
master_ip="192.168.10.200" restart_on_promote="true" \
config="/etc/postgresql/9.3/main/postgresql.conf" \
op start interval="0s" timeout="60s" on-fail="restart" \
op monitor interval="4s" timeout="60s" on-fail="restart" \
op monitor interval="3s" role="Master" timeout="60s" on-fail="restart" \
op promote interval="0s" timeout="60s" on-fail="restart" \
op demote interval="0s" timeout="60s" on-fail="stop" \
op stop interval="0s" timeout="60s" on-fail="block" \
op notify interval="0s" timeout="60s"
primitive pgsql-master-ip ocf:heartbeat:IPaddr2 \
params ip="192.168.10.200" nic="peervpn0" \
op start interval="0s" timeout="60s" on-fail="restart" \
op monitor interval="10s" timeout="60s" on-fail="restart" \
op stop interval="0s" timeout="60s" on-fail="block"
group master pgsql-master-ip
ms msPostgresql pgsql \
meta master-max="1" master-node-max="1" clone-max="3" clone-node-max="1" \
notify="true"
colocation set_ip inf: master msPostgresql:Master
order ip_down 0: msPostgresql:demote master:stop symmetrical=false
order ip_up 0: msPostgresql:promote master:start symmetrical=false
property $id="cib-bootstrap-options" \
dc-version="1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff" \
cluster-infrastructure="openais" \
expected-quorum-votes="3" \
no-quorum-policy="ignore" \
stonith-enabled="false" \
crmd-transition-delay="0" \
last-lrm-refresh="1386404222"
rsc_defaults $id="rsc-options" \
resource-stickiness="100" \
migration-threshold="1"