[Pacemaker] pgsql troubles.
Andrew Beekhof
andrew at beekhof.net
Thu Jan 8 04:13:43 UTC 2015
> On 5 Dec 2014, at 4:16 am, steve <steve at unliketea.com> wrote:
>
> Good Afternoon,
>
>
> I am having a lot of trouble with pacemaker/corosync/postgres. The symptoms are hard to pin down, but the main one is that postgres starts as a slave on both nodes. I have tested the pgsql RA's start/stop/status/monitor actions, and they work from the command line once I set up the environment. I have not been able to get promote/demote to work; there are issues with NODENAME not being defined.
You're trying to follow http://clusterlabs.org/wiki/PgSQL_Replicated_Cluster ?
It's not being promoted because of:
> + master-pgsql:0 : -INFINITY
which is set as part of the start action.
Typically I've seen this as a result of 'node_list' being "wrong" in some way.
I'm no expert though.
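
One thing that may be worth checking is whether the name the cluster uses for each node appears, spelled exactly the same way, in 'node_list'. Below is a minimal sketch of that check, not something taken from the RA itself. The helper node_in_list and the variable names are made up for illustration, the node_list value is copied from your config, and it assumes that `crm_node -n` (or `uname -n` as a fallback) reports the name Pacemaker knows the node by.

```shell
#!/bin/sh
# Hypothetical helper (not part of the pgsql RA): succeeds if NAME matches
# one of the space-separated entries in LIST exactly (case-sensitive).
node_in_list() {
    name="$1"
    list="$2"
    for entry in $list; do
        if [ "$entry" = "$name" ]; then
            return 0
        fi
    done
    return 1
}

# Value copied from the pgsql primitive in the posted configuration.
NODE_LIST="tstdb03 tstdb04"

# Assumption: crm_node -n prints the local node name as the cluster sees it;
# fall back to uname -n when crm_node is not installed.
if command -v crm_node >/dev/null 2>&1; then
    LOCAL_NAME="$(crm_node -n)"
else
    LOCAL_NAME="$(uname -n)"
fi

if node_in_list "$LOCAL_NAME" "$NODE_LIST"; then
    echo "$LOCAL_NAME is listed in node_list"
else
    echo "$LOCAL_NAME is NOT listed in node_list ($NODE_LIST)"
fi
```

Run it on both tstdb03 and tstdb04. If either node reports that it is not listed (for example because of a fully qualified name or a difference in case), that mismatch would be the first thing I'd fix. After changing anything, `crm_mon -A -1` shows the node attributes again so you can see whether the master-pgsql score has moved off -INFINITY.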
>
> I am able to run postgres in master/slave mode outside of pacemaker.
>
> I can provide additional logs but here is a start.
>
> Distributor ID: Ubuntu
> Description: Ubuntu 12.04.3 LTS
> Release: 12.04
> Codename: precise
>
> latest version of pgsql RA (as of yesterday)
> pacemaker 1.1.6-2ubuntu3.1 HA cluster resource manager
> corosync 1.4.2-2 Standards-based cluster framework (daemon and module
> resource-agents 1:3.9.2-5ubuntu4.1 Cluster Resource Agents
> I have upgraded the pgsql RA to the latest from git.
>
>
> ============
> Last updated: Wed Nov 26 13:55:59 2014
> Last change: Wed Nov 26 13:55:58 2014 via crm_attribute on tstdb04
> Stack: openais
> Current DC: tstdb04 - partition with quorum
> Version: 1.1.6-9971ebba4494012a93c03b40a2c58ec0eb60f50c
> 2 Nodes configured, 2 expected votes
> 4 Resources configured.
> ============
>
> Online: [ tstdb03 tstdb04 ]
>
> Full list of resources:
>
> Resource Group: master-group
> vip-master (ocf::heartbeat:IPaddr2): Stopped
> vip-rep (ocf::heartbeat:IPaddr2): Stopped
> Master/Slave Set: msPostgresql [pgsql]
> Slaves: [ tstdb04 ]
> Stopped: [ pgsql:0 ]
>
> Node Attributes:
> * Node tstdb03:
> + master-pgsql:0 : -INFINITY
> + pgsql-data-status : DISCONNECT
> * Node tstdb04:
> + master-pgsql:1 : -INFINITY
> + pgsql-data-status : DISCONNECT
>
> Migration summary:
> * Node tstdb04:
> * Node tstdb03:
> pgsql:0: migration-threshold=1 fail-count=1000000
>
> Failed actions:
> pgsql:0_start_0 (node=tstdb03, call=5, rc=1, status=complete): unknown error
>
>
> config:
> property \
> no-quorum-policy="ignore" \
> stonith-enabled="false" \
> crmd-transition-delay="0"
>
> rsc_defaults \
> resource-stickiness="INFINITY" \
> migration-threshold="1"
>
> group master-group \
> vip-master \
> vip-rep
>
> primitive vip-master ocf:heartbeat:IPaddr2 \
> params \
> ip="10.132.101.95" \
> nic="eth0" \
> cidr_netmask="24" \
> op start timeout="60s" interval="0" on-fail="restart" \
> op monitor timeout="60s" interval="10s" on-fail="restart" \
> op stop timeout="60s" interval="0" on-fail="block"
>
> primitive vip-rep ocf:heartbeat:IPaddr2 \
> params \
> ip="10.132.101.96" \
> nic="eth0" \
> cidr_netmask="24" \
> meta \
> migration-threshold="0" \
> op start timeout="60s" interval="0" on-fail="stop" \
> op monitor timeout="60s" interval="10s" on-fail="restart" \
> op stop timeout="60s" interval="0" on-fail="ignore"
>
> master msPostgresql pgsql \
> meta \
> master-max="1" \
> master-node-max="1" \
> clone-max="2" \
> clone-node-max="1" \
> notify="true"
>
> primitive pgsql ocf:heartbeat:pgsql \
> params \
> pgctl="/usr/bin/pg_ctl" \
> psql="/usr/bin/psql" \
> pgdata="/database/9.3" \
> config="/etc/postgresql/9.3/main/postgresql.conf" \
> socketdir="/var/run/postgresql" \
> rep_mode="sync" \
> node_list="tstdb03 tstdb04" \
> restore_command="cp /database/archive/%f %p" \
> primary_conninfo_opt="keepalives_idle=60 keepalives_interval=5 keepalives_count=5" \
> master_ip="10.132.101.95" \
> restart_on_promote="true" \
> logfile="/var/log/postgresql/postgresql-9.3-main.log" \
> op start timeout="60s" interval="0" on-fail="restart" \
> op monitor timeout="60s" interval="4s" on-fail="restart" \
> op monitor timeout="60s" interval="3s" on-fail="restart" role="Master" \
> op promote timeout="60s" interval="0" on-fail="restart" \
> op demote timeout="60s" interval="0" on-fail="stop" \
> op stop timeout="60s" interval="0" on-fail="block" \
> op notify timeout="60s" interval="0"
>
> #colocation rsc_colocation-1 inf: vip-master msPostgresql:Master
> #order rsc_order-1 0: msPostgresql:promote vip-master:start symmetrical=false
> #order rsc_order-2 0: msPostgresql:demote vip-rep:stop symmetrical=false
>
> colocation rsc_colocation-1 inf: master-group msPostgresql:Master
> order rsc_order-1 0: msPostgresql:promote master-group:start symmetrical=false
> order rsc_order-2 0: msPostgresql:demote master-group:stop symmetrical=false
>