[Pacemaker] Postgresql Replication
Takatoshi MATSUO
matsuo.tak at gmail.com
Thu Sep 12 06:00:31 EDT 2013
Hi Eloy
2013/9/12 Eloy Coto Pereiro <eloy.coto at gmail.com>:
> Hi,
>
> I have issues with this config, for example when master is running corosync
> service use pg_ctl. But in the slave pg_ctl doesn't start and replication
> doesn't work.
>
> This is my data:
>
>
> Online: [ master slave ]
> OFFLINE: [ ]
>
> Full list of resources:
>
> ClusterIP (ocf::heartbeat:IPaddr2): Started master
> KAMAILIO (lsb:kamailio): Started master
> Master/Slave Set: msPostgresql [pgsql]
> Masters: [ master ]
> Stopped: [ pgsql:1 ]
>
> Node Attributes:
> * Node master:
> + maintenance : off
> + master-pgsql : 1000
> + pgsql-data-status : LATEST
> + pgsql-master-baseline : 0000000019000080
> + pgsql-status : PRI
> * Node slave:
> + pgsql-data-status : DISCONNECT
> + pgsql-status : HS:sync
>
>
> In my crm configure show is this:
> node master \
> attributes maintenance="off" pgsql-data-status="LATEST"
> node slave \
> attributes pgsql-data-status="DISCONNECT"
> primitive ClusterIP ocf:heartbeat:IPaddr2 \
> params ip="10.1.1.1" cidr_netmask="24" \
> op monitor interval="15s" \
> op start timeout="60s" interval="0s" on-fail="stop" \
> op monitor timeout="60s" interval="10s" on-fail="restart" \
> op stop timeout="60s" interval="0s" on-fail="block"
> primitive KAMAILIO lsb:kamailio \
> op monitor interval="10s" \
> op start interval="0" timeout="120s" \
> op stop interval="0" timeout="120s" \
> meta target-role="Started"
> primitive pgsql ocf:heartbeat:pgsql \
> params pgctl="/usr/pgsql-9.2/bin/pg_ctl" psql="/usr/pgsql-9.2/bin/psql"
> pgdata="/var/lib/pgsql/9.2/data/" rep_mode="sync" node_list="master slave"
> restore_command="cp /var/lib/pgsql/9.2/pg_archive/%f %p"
> primary_conninfo_opt="keepalives_idle=60 keepalives_interval=5
> keepalives_count=5" master_ip="10.1.1.1" restart_on_promote="true" \
> op start timeout="60s" interval="0s" on-fail="restart" \
> op monitor timeout="60s" interval="4s" on-fail="restart" \
> op monitor timeout="60s" interval="3s" on-fail="restart" role="Master" \
> op promote timeout="60s" interval="0s" on-fail="restart" \
> op demote timeout="60s" interval="0s" on-fail="stop" \
> op stop timeout="60s" interval="0s" on-fail="block" \
> op notify timeout="60s" interval="0s"
> ms msPostgresql pgsql \
> meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1"
> notify="true" target-role="Started"
> location cli-prefer-KAMAILIO KAMAILIO \
> rule $id="cli-prefer-rule-KAMAILIO" inf: #uname eq master
> location cli-prefer-pgsql msPostgresql \
> rule $id="cli-prefer-rule-pgsql" inf: #uname eq master
> location cli-standby-ClusterIP ClusterIP \
> rule $id="cli-standby-rule-ClusterIP" -inf: #uname eq slave
This location constraint looks wrong.
With a score of -inf on slave, it means that ClusterIP can never run on slave.
> colocation colocation-1 inf: ClusterIP msPostgresql KAMAILIO
Does PostgreSQL need KAMAILIO to start?
This colocation forces ClusterIP, msPostgresql and KAMAILIO onto the same node,
which means that Pacemaker can't start PostgreSQL on slave.
The sample setting is:
colocation rsc_colocation-1 inf: master-group msPostgresql:Master
To begin with, you might want to start from the sample settings and customize them:
http://clusterlabs.org/wiki/PgSQL_Replicated_Cluster#sample_settings_for_crm_command
Also, please check the logs, because the pgsql RA writes useful messages there.
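As a rough sketch only, the constraints could be reworked along the lines of the sample settings as follows. The group name master-group is taken from the sample line above; putting ClusterIP and KAMAILIO into that group, the constraint ids, and the score of 0 on the order constraints are assumptions for illustration, not tested against this cluster. The existing cli-standby-ClusterIP, colocation-1 and order-1 entries would need to be removed first.

```
# Assumption: ClusterIP and KAMAILIO should always run together on the master node.
group master-group ClusterIP KAMAILIO

# Tie the group to the Master role only, so the slave instance of pgsql
# is free to start on the other node.
colocation rsc_colocation-1 inf: master-group msPostgresql:Master

# Start the group only after a promote, and stop it on demote.
order rsc_order-1 0: msPostgresql:promote master-group:start symmetrical=false
order rsc_order-2 0: msPostgresql:demote master-group:stop symmetrical=false
```

With this layout only the Master role is colocated with the IP and Kamailio, so Pacemaker is no longer prevented from starting PostgreSQL on slave.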
> order order-1 inf: ClusterIP msPostgresql KAMAILIO
> property $id="cib-bootstrap-options" \
> dc-version="1.1.8-7.el6-394e906" \
> cluster-infrastructure="classic openais (with plugin)" \
> expected-quorum-votes="2" \
> stonith-enabled="false"
>
> Any idea why doesn't start on the second slave?
>
> More info:
>
> Master:
>
> [root at master ~]# netstat -putan | grep 5432 | grep LISTEN
> tcp 0 0 0.0.0.0:5432 0.0.0.0:*
> LISTEN 3241/postgres
> tcp 0 0 :::5432 :::*
> LISTEN 3241/postgres
> [root at master ~]# ps axu | grep postgres
> postgres 3241 0.0 0.0 97072 7692 ? S 11:41 0:00
> /usr/pgsql-9.2/bin/postgres -D /var/lib/pgsql/9.2/data -c
> config_file=/var/lib/pgsql/9.2/data//postgresql.conf
> postgres 3293 0.0 0.0 97072 1556 ? Ss 11:41 0:00 postgres:
> checkpointer process
> postgres 3294 0.0 0.0 97072 1600 ? Ss 11:41 0:00 postgres:
> writer process
> postgres 3295 0.0 0.0 97072 1516 ? Ss 11:41 0:00 postgres:
> wal writer process
> postgres 3296 0.0 0.0 97920 2760 ? Ss 11:41 0:00 postgres:
> autovacuum launcher process
> postgres 3297 0.0 0.0 82712 1500 ? Ss 11:41 0:00 postgres:
> archiver process failed on 000000010000000000000001
> postgres 3298 0.0 0.0 82872 1568 ? Ss 11:41 0:00 postgres:
> stats collector process
> root 10901 0.0 0.0 103232 852 pts/0 S+ 11:44 0:00 grep
> postgres
>
>
> On slave:
>
> [root at slave ~]# ps axu | grep postgre
> root 3332 0.0 0.0 103232 856 pts/0 S+ 11:45 0:00 grep
> postgre
> [root at slave ~]# netstat -putan | grep 5432
> [root at slave ~]#
>
>
> If I make pg_ctl /var/lib/pgsql/9.2/data/ start work ok
>
> Any idea?
>
>
> 2013/9/11 Takatoshi MATSUO <matsuo.tak at gmail.com>
>>
>> Hi Eloy
>>
>> Please see http://clusterlabs.org/wiki/PgSQL_Replicated_Cluster .
>> In the document, it uses virtual IP to receive connection,
>> so it doesn't need to change recovery.conf.
>>
>> Thanks,
>> Takatoshi MATSUO
>>
>>
>> 2013/9/11 Eloy Coto Pereiro <eloy.coto at gmail.com>:
>> > Hi,
>> >
>> > In Postgresql if you use wal replication
>> > <http://wiki.postgresql.org/wiki/Streaming_Replication> when the master
>> > servers fails need to change the recovery.conf on the slave server.
>> >
>> > In this case any tool, when the master is down, execute a command and
>> > get
>> > this info?
>> > Is this the right tool for postgresql's replication?
>> >
>> > Cheers