[Pacemaker] Postgresql Replication
Takatoshi MATSUO
matsuo.tak at gmail.com
Thu Sep 12 09:58:38 EDT 2013
Hi
2013/9/12 Eloy Coto Pereiro <eloy.coto at gmail.com>:
> Hi,
>
> Thanks for your help, I used the same example. In this case Kamailio needs to
> start after PostgreSQL, but I don't think that is a problem: replication
> works fine without corosync. I stopped all processes and started again with
> corosync.
>
> When I start corosync I see this log in my slave:
>
> Sep 12 16:12:50 slave pgsql(pgsql)[26092]: INFO: Master does not exist.
> Sep 12 16:12:50 slave pgsql(pgsql)[26092]: WARNING: My data is out-of-date. status=DISCONNECT
Did you start PostgreSQL on the node named "master", and did it become Master?
These logs mean that the slave can't see a Master and that the slave's data is old.
(Using "master" and "slave" as hostnames is confusing.)
Please stop Pacemaker, erase the whole configuration, and reload the
original configuration without the following node attributes:
----
node master \
attributes maintenance="off" pgsql-data-status="LATEST"
node slave \
attributes pgsql-data-status="DISCONNECT"
----
Do this because Pacemaker records each node's last data status in these attributes.
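
As a rough illustration of the stored status, the following sketch queries and then deletes the pgsql-data-status attribute of one node with crm_attribute. It is not a substitute for erasing and reloading the configuration; whether deleting the attribute alone is enough depends on the state of the cluster, which is an assumption here. The node name "slave" and the attribute name come from this thread; the DRY_RUN variable and the helper function names are made up for the example, and by default the script only prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch only: inspect and clear the data status Pacemaker stored for a node.
# DRY_RUN and the function names below are illustrative, not part of Pacemaker.

DRY_RUN="${DRY_RUN:-1}"   # 1 = print the commands only, 0 = execute them

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Query the permanently stored (lifetime "forever") value for one node.
show_data_status() {
    run crm_attribute -l forever -N "$1" -n pgsql-data-status -G
}

# Delete the stored value for one node.
clear_data_status() {
    run crm_attribute -l forever -N "$1" -n pgsql-data-status -D
}

show_data_status slave
clear_data_status slave
```

Run it once with the default DRY_RUN=1 to review the commands, then with DRY_RUN=0 on a cluster node if they look right for your setup.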
> But all the data is the same, and if I run the slave server with plain PostgreSQL
> the replication is OK. Any idea?
>
> Cheers
>
>
> 2013/9/12 Takatoshi MATSUO <matsuo.tak at gmail.com>
>>
>> Hi Eloy
>>
>>
>> 2013/9/12 Eloy Coto Pereiro <eloy.coto at gmail.com>:
>> > Hi,
>> >
>> > I have issues with this config. For example, on the master the corosync
>> > service starts PostgreSQL through pg_ctl, but on the slave pg_ctl is not
>> > started and replication doesn't work.
>> >
>> > This is my data:
>> >
>> >
>> > Online: [ master slave ]
>> > OFFLINE: [ ]
>> >
>> > Full list of resources:
>> >
>> > ClusterIP (ocf::heartbeat:IPaddr2): Started master
>> > KAMAILIO (lsb:kamailio): Started master
>> > Master/Slave Set: msPostgresql [pgsql]
>> > Masters: [ master ]
>> > Stopped: [ pgsql:1 ]
>> >
>> > Node Attributes:
>> > * Node master:
>> > + maintenance : off
>> > + master-pgsql : 1000
>> > + pgsql-data-status : LATEST
>> > + pgsql-master-baseline : 0000000019000080
>> > + pgsql-status : PRI
>> > * Node slave:
>> > + pgsql-data-status : DISCONNECT
>> > + pgsql-status : HS:sync
>> >
>> >
>> > In my crm configure show is this:
>> > node master \
>> > attributes maintenance="off" pgsql-data-status="LATEST"
>> > node slave \
>> > attributes pgsql-data-status="DISCONNECT"
>> > primitive ClusterIP ocf:heartbeat:IPaddr2 \
>> > params ip="10.1.1.1" cidr_netmask="24" \
>> > op monitor interval="15s" \
>> > op start timeout="60s" interval="0s" on-fail="stop" \
>> > op monitor timeout="60s" interval="10s" on-fail="restart" \
>> > op stop timeout="60s" interval="0s" on-fail="block"
>> > primitive KAMAILIO lsb:kamailio \
>> > op monitor interval="10s" \
>> > op start interval="0" timeout="120s" \
>> > op stop interval="0" timeout="120s" \
>> > meta target-role="Started"
>> > primitive pgsql ocf:heartbeat:pgsql \
>> > params pgctl="/usr/pgsql-9.2/bin/pg_ctl" psql="/usr/pgsql-9.2/bin/psql" \
>> > pgdata="/var/lib/pgsql/9.2/data/" rep_mode="sync" node_list="master slave" \
>> > restore_command="cp /var/lib/pgsql/9.2/pg_archive/%f %p" \
>> > primary_conninfo_opt="keepalives_idle=60 keepalives_interval=5 keepalives_count=5" \
>> > master_ip="10.1.1.1" restart_on_promote="true" \
>> > op start timeout="60s" interval="0s" on-fail="restart" \
>> > op monitor timeout="60s" interval="4s" on-fail="restart" \
>> > op monitor timeout="60s" interval="3s" on-fail="restart" role="Master" \
>> > op promote timeout="60s" interval="0s" on-fail="restart" \
>> > op demote timeout="60s" interval="0s" on-fail="stop" \
>> > op stop timeout="60s" interval="0s" on-fail="block" \
>> > op notify timeout="60s" interval="0s"
>> > ms msPostgresql pgsql \
>> > meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true" target-role="Started"
>> > location cli-prefer-KAMAILIO KAMAILIO \
>> > rule $id="cli-prefer-rule-KAMAILIO" inf: #uname eq master
>> > location cli-prefer-pgsql msPostgresql \
>> > rule $id="cli-prefer-rule-pgsql" inf: #uname eq master
>> > location cli-standby-ClusterIP ClusterIP \
>> > rule $id="cli-standby-rule-ClusterIP" -inf: #uname eq slave
>>
>> This location is invalid.
>> It means that ClusterIP can't run on slave.
>>
>> > colocation colocation-1 inf: ClusterIP msPostgresql KAMAILIO
>>
>> Does PostgreSQL need KAMAILIO in order to start?
>> This colocation means that Pacemaker can't start PostgreSQL on slave.
>>
>> The sample setting is
>> colocation rsc_colocation-1 inf: master-group msPostgresql:Master
>>
>> To begin with, you might want to start from the sample settings and customize them.
>>
>> http://clusterlabs.org/wiki/PgSQL_Replicated_Cluster#sample_settings_for_crm_command
>>
>> Also, please check the logs, because the pgsql RA writes useful messages there.
>>
>> > order order-1 inf: ClusterIP msPostgresql KAMAILIO
>> > property $id="cib-bootstrap-options" \
>> > dc-version="1.1.8-7.el6-394e906" \
>> > cluster-infrastructure="classic openais (with plugin)" \
>> > expected-quorum-votes="2" \
>> > stonith-enabled="false"
>> >
>> > Any idea why it doesn't start on the slave?
>> >
>> > More info:
>> >
>> > Master:
>> >
>> > [root at master ~]# netstat -putan | grep 5432 | grep LISTEN
>> > tcp 0 0 0.0.0.0:5432 0.0.0.0:* LISTEN 3241/postgres
>> > tcp 0 0 :::5432 :::* LISTEN 3241/postgres
>> > [root at master ~]# ps axu | grep postgres
>> > postgres 3241 0.0 0.0 97072 7692 ? S 11:41 0:00 /usr/pgsql-9.2/bin/postgres -D /var/lib/pgsql/9.2/data -c config_file=/var/lib/pgsql/9.2/data//postgresql.conf
>> > postgres 3293 0.0 0.0 97072 1556 ? Ss 11:41 0:00 postgres: checkpointer process
>> > postgres 3294 0.0 0.0 97072 1600 ? Ss 11:41 0:00 postgres: writer process
>> > postgres 3295 0.0 0.0 97072 1516 ? Ss 11:41 0:00 postgres: wal writer process
>> > postgres 3296 0.0 0.0 97920 2760 ? Ss 11:41 0:00 postgres: autovacuum launcher process
>> > postgres 3297 0.0 0.0 82712 1500 ? Ss 11:41 0:00 postgres: archiver process failed on 000000010000000000000001
>> > postgres 3298 0.0 0.0 82872 1568 ? Ss 11:41 0:00 postgres: stats collector process
>> > root 10901 0.0 0.0 103232 852 pts/0 S+ 11:44 0:00 grep postgres
>> >
>> >
>> > On slave:
>> >
>> > [root at slave ~]# ps axu | grep postgre
>> > root 3332 0.0 0.0 103232 856 pts/0 S+ 11:45 0:00 grep postgre
>> > [root at slave ~]# netstat -putan | grep 5432
>> > [root at slave ~]#
>> >
>> >
>> > If I run pg_ctl /var/lib/pgsql/9.2/data/ start, it works OK.
>> >
>> > Any idea?
>> >
>> >
>> > 2013/9/11 Takatoshi MATSUO <matsuo.tak at gmail.com>
>> >>
>> >> Hi Eloy
>> >>
>> >> Please see http://clusterlabs.org/wiki/PgSQL_Replicated_Cluster .
>> >> The document uses a virtual IP to receive connections,
>> >> so recovery.conf doesn't need to be changed.
>> >>
>> >> Thanks,
>> >> Takatoshi MATSUO
>> >>
>> >>
>> >> 2013/9/11 Eloy Coto Pereiro <eloy.coto at gmail.com>:
>> >> > Hi,
>> >> >
>> >> > In PostgreSQL, if you use WAL replication
>> >> > <http://wiki.postgresql.org/wiki/Streaming_Replication>, you need to
>> >> > change recovery.conf on the slave server when the master server fails.
>> >> >
>> >> > In this case, is there any tool that executes a command and gets this
>> >> > information when the master is down?
>> >> > Is this the right tool for PostgreSQL replication?
>> >> >
>> >> > Cheers
>>
>> _______________________________________________
>> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>
>> Project Home: http://www.clusterlabs.org
>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>> Bugs: http://bugs.clusterlabs.org
>
>
>