[Pacemaker] Postgresql Replication
Andrew Beekhof
andrew at beekhof.net
Wed Sep 18 21:55:10 EDT 2013
On 12/09/2013, at 11:58 PM, Takatoshi MATSUO <matsuo.tak at gmail.com> wrote:
> Hi
>
> 2013/9/12 Eloy Coto Pereiro <eloy.coto at gmail.com>:
>> Hi,
>>
>> Thanks for your help. I used the same example. In this case Kamailio needs to
>> start after PostgreSQL, but I don't think that is the problem: replication
>> works fine without corosync. I stopped all the processes and started working
>> with corosync.
>>
>> When I start corosync I see these log messages on my slave:
>>
>> Sep 12 16:12:50 slave pgsql(pgsql)[26092]: INFO: Master does not exist.
>> Sep 12 16:12:50 slave pgsql(pgsql)[26092]: WARNING: My data is out-of-date. status=DISCONNECT
>
> Did you start PostgreSQL on the node named "master", and did it become Master?
> These logs mean that the slave can't see a Master and that the slave's data is old.
> (Using "master" and "slave" as hostnames is confusing.)
>
> Please stop Pacemaker, erase the whole configuration, and reload the
> original configuration
Seems a bit extreme. There are easier ways to flush out the operation history than that.
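For example, a minimal sketch, assuming the resource and attribute names from the configuration quoted below (msPostgresql, and pgsql-data-status on node slave). The run wrapper is hypothetical and only prints the commands unless RUN=1 is set. crm_resource --cleanup clears the recorded operation history for the resource; crm_attribute --delete drops the stored data status, on the assumption that this stale attribute is what the RA is tripping over:

```shell
#!/bin/sh
# Dry run by default: print each command instead of executing it.
# Set RUN=1 on a real cluster node to actually run them.
NODE="${NODE:-slave}"        # assumed node name, from the quoted config
RSC="${RSC:-msPostgresql}"   # assumed resource name, from the quoted config

run() {
    if [ "${RUN:-0}" = "1" ]; then "$@"; else echo "$*"; fi
}

# Clear the operation history and failures recorded for the resource.
run crm_resource --cleanup --resource "$RSC"

# Remove the persistent data-status attribute recorded for the node.
run crm_attribute --lifetime forever --node "$NODE" --name pgsql-data-status --delete
```

Neither step touches the rest of the configuration, so the constraints and primitives stay as they are.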
> which doesn't have
> ----
> node master \
>     attributes maintenance="off" pgsql-data-status="LATEST"
> node slave \
>     attributes pgsql-data-status="DISCONNECT"
> ---
> because Pacemaker records the last data status.
>
>> But all the data is the same, and if I run the slave server with plain
>> PostgreSQL, replication works fine. Any idea?
>>
>> Cheers
>>
>>
>> 2013/9/12 Takatoshi MATSUO <matsuo.tak at gmail.com>
>>>
>>> Hi Eloy
>>>
>>>
>>> 2013/9/12 Eloy Coto Pereiro <eloy.coto at gmail.com>:
>>>> Hi,
>>>>
>>>> I have issues with this config. For example, on the master the corosync
>>>> service starts PostgreSQL with pg_ctl, but on the slave pg_ctl is never
>>>> started and replication doesn't work.
>>>>
>>>> This is my data:
>>>>
>>>>
>>>> Online: [ master slave ]
>>>> OFFLINE: [ ]
>>>>
>>>> Full list of resources:
>>>>
>>>> ClusterIP (ocf::heartbeat:IPaddr2): Started master
>>>> KAMAILIO (lsb:kamailio): Started master
>>>> Master/Slave Set: msPostgresql [pgsql]
>>>> Masters: [ master ]
>>>> Stopped: [ pgsql:1 ]
>>>>
>>>> Node Attributes:
>>>> * Node master:
>>>> + maintenance : off
>>>> + master-pgsql : 1000
>>>> + pgsql-data-status : LATEST
>>>> + pgsql-master-baseline : 0000000019000080
>>>> + pgsql-status : PRI
>>>> * Node slave:
>>>> + pgsql-data-status : DISCONNECT
>>>> + pgsql-status : HS:sync
>>>>
>>>>
>>>> In my crm configure show is this:
>>>> node master \
>>>> attributes maintenance="off" pgsql-data-status="LATEST"
>>>> node slave \
>>>> attributes pgsql-data-status="DISCONNECT"
>>>> primitive ClusterIP ocf:heartbeat:IPaddr2 \
>>>> params ip="10.1.1.1" cidr_netmask="24" \
>>>> op monitor interval="15s" \
>>>> op start timeout="60s" interval="0s" on-fail="stop" \
>>>> op monitor timeout="60s" interval="10s" on-fail="restart" \
>>>> op stop timeout="60s" interval="0s" on-fail="block"
>>>> primitive KAMAILIO lsb:kamailio \
>>>> op monitor interval="10s" \
>>>> op start interval="0" timeout="120s" \
>>>> op stop interval="0" timeout="120s" \
>>>> meta target-role="Started"
>>>> primitive pgsql ocf:heartbeat:pgsql \
>>>> params pgctl="/usr/pgsql-9.2/bin/pg_ctl" psql="/usr/pgsql-9.2/bin/psql" \
>>>> pgdata="/var/lib/pgsql/9.2/data/" rep_mode="sync" node_list="master slave" \
>>>> restore_command="cp /var/lib/pgsql/9.2/pg_archive/%f %p" \
>>>> primary_conninfo_opt="keepalives_idle=60 keepalives_interval=5 keepalives_count=5" \
>>>> master_ip="10.1.1.1" restart_on_promote="true" \
>>>> op start timeout="60s" interval="0s" on-fail="restart" \
>>>> op monitor timeout="60s" interval="4s" on-fail="restart" \
>>>> op monitor timeout="60s" interval="3s" on-fail="restart" role="Master" \
>>>> op promote timeout="60s" interval="0s" on-fail="restart" \
>>>> op demote timeout="60s" interval="0s" on-fail="stop" \
>>>> op stop timeout="60s" interval="0s" on-fail="block" \
>>>> op notify timeout="60s" interval="0s"
>>>> ms msPostgresql pgsql \
>>>> meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" \
>>>> notify="true" target-role="Started"
>>>> location cli-prefer-KAMAILIO KAMAILIO \
>>>> rule $id="cli-prefer-rule-KAMAILIO" inf: #uname eq master
>>>> location cli-prefer-pgsql msPostgresql \
>>>> rule $id="cli-prefer-rule-pgsql" inf: #uname eq master
>>>> location cli-standby-ClusterIP ClusterIP \
>>>> rule $id="cli-standby-rule-ClusterIP" -inf: #uname eq slave
>>>
>>> This location constraint is wrong.
>>> It means that ClusterIP can never run on slave.
>>>
>>>> colocation colocation-1 inf: ClusterIP msPostgresql KAMAILIO
>>>
>>> Does PostgreSQL need KAMAILIO in order to start?
>>> This colocation means that Pacemaker can't start PostgreSQL on slave.
>>>
>>> The sample setting is:
>>> colocation rsc_colocation-1 inf: master-group msPostgresql:Master
>>>
>>> To begin with, you might want to start from the sample settings and customize them.
>>>
>>> http://clusterlabs.org/wiki/PgSQL_Replicated_Cluster#sample_settings_for_crm_command
>>>
>>> And please check the logs, because the pgsql RA writes some useful messages there.
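
Applied to the configuration quoted here, the sample's pattern might look roughly like the fragment below. This is an untested sketch in crm configure syntax, not a verified configuration. The name master-group and the colocation line come from the sample setting quoted above. Putting ClusterIP and KAMAILIO into that group, and the two order constraints with their IDs, are assumptions about what is intended. The existing colocation-1, order-1 and cli-standby-ClusterIP entries would have to be removed first.

```
group master-group ClusterIP KAMAILIO
colocation rsc_colocation-1 inf: master-group msPostgresql:Master
order rsc_order-2 0: msPostgresql:promote master-group:start symmetrical=false
order rsc_order-3 0: msPostgresql:demote master-group:stop symmetrical=false
```

The point of this shape is that the IP and Kamailio follow whichever node holds the Master role, while the Slave instance of pgsql is free to start on the other node.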
>>>
>>>> order order-1 inf: ClusterIP msPostgresql KAMAILIO
>>>> property $id="cib-bootstrap-options" \
>>>> dc-version="1.1.8-7.el6-394e906" \
>>>> cluster-infrastructure="classic openais (with plugin)" \
>>>> expected-quorum-votes="2" \
>>>> stonith-enabled="false"
>>>>
>>>> Any idea why it doesn't start on the second node (slave)?
>>>>
>>>> More info:
>>>>
>>>> Master:
>>>>
>>>> [root@master ~]# netstat -putan | grep 5432 | grep LISTEN
>>>> tcp 0 0 0.0.0.0:5432 0.0.0.0:* LISTEN 3241/postgres
>>>> tcp 0 0 :::5432 :::* LISTEN 3241/postgres
>>>> [root@master ~]# ps axu | grep postgres
>>>> postgres 3241 0.0 0.0 97072 7692 ? S 11:41 0:00 /usr/pgsql-9.2/bin/postgres -D /var/lib/pgsql/9.2/data -c config_file=/var/lib/pgsql/9.2/data//postgresql.conf
>>>> postgres 3293 0.0 0.0 97072 1556 ? Ss 11:41 0:00 postgres: checkpointer process
>>>> postgres 3294 0.0 0.0 97072 1600 ? Ss 11:41 0:00 postgres: writer process
>>>> postgres 3295 0.0 0.0 97072 1516 ? Ss 11:41 0:00 postgres: wal writer process
>>>> postgres 3296 0.0 0.0 97920 2760 ? Ss 11:41 0:00 postgres: autovacuum launcher process
>>>> postgres 3297 0.0 0.0 82712 1500 ? Ss 11:41 0:00 postgres: archiver process failed on 000000010000000000000001
>>>> postgres 3298 0.0 0.0 82872 1568 ? Ss 11:41 0:00 postgres: stats collector process
>>>> root 10901 0.0 0.0 103232 852 pts/0 S+ 11:44 0:00 grep postgres
>>>>
>>>>
>>>> On slave:
>>>>
>>>> [root@slave ~]# ps axu | grep postgre
>>>> root 3332 0.0 0.0 103232 856 pts/0 S+ 11:45 0:00 grep postgre
>>>> [root@slave ~]# netstat -putan | grep 5432
>>>> [root@slave ~]#
>>>>
>>>>
>>>> If I run pg_ctl start on /var/lib/pgsql/9.2/data/ by hand, it works fine.
>>>>
>>>> Any idea?
>>>>
>>>>
>>>> 2013/9/11 Takatoshi MATSUO <matsuo.tak at gmail.com>
>>>>>
>>>>> Hi Eloy
>>>>>
>>>>> Please see http://clusterlabs.org/wiki/PgSQL_Replicated_Cluster .
>>>>> That document uses a virtual IP to receive connections,
>>>>> so recovery.conf doesn't need to be changed.
>>>>>
>>>>> Thanks,
>>>>> Takatoshi MATSUO
>>>>>
>>>>>
>>>>> 2013/9/11 Eloy Coto Pereiro <eloy.coto at gmail.com>:
>>>>>> Hi,
>>>>>>
>>>>>> In PostgreSQL, if you use WAL streaming replication
>>>>>> <http://wiki.postgresql.org/wiki/Streaming_Replication>, then when the
>>>>>> master server fails you need to change recovery.conf on the slave server.
>>>>>>
>>>>>> In that case, is there any tool that runs a command when the master goes
>>>>>> down and picks up this information?
>>>>>> Is this the right tool for PostgreSQL replication?
>>>>>>
>>>>>> Cheers
>>>
>>> _______________________________________________
>>> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
>>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>>
>>> Project Home: http://www.clusterlabs.org
>>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>>> Bugs: http://bugs.clusterlabs.org