[Pacemaker] setup advice

Tue Jul 2 22:31:57 EDT 2013

Hi Stefano

2013/7/2 Stefano Sasso <stesasso at gmail.com>:
> Hello folks,
>   I have the following setup in mind, but I need some advice and one hint on
> how to realize a particular function.
>
> I have a N (>= 2) nodes cluster, with data storage on postgresql.
> I would like to manage postgres master-slave replication in this way: one
> node is the "master", one is the "slave", and the others are "standby"
> nodes.
> If the master fails, the slave becomes the master, and one of the standby
> becomes the slave.
> If the slave fails, one of the standby becomes the new slave.

Does "standby" mean that PostgreSQL is stopped?
If the master no longer has the WAL files that the new slave needs,
the new slave can't connect to the master.

How do you solve this?
Do you copy the data or the WAL archive automatically on start?
That may cause a timeout if PostgreSQL has a large database.

> If one of the "standby" fails, no problem :)
> I can correctly manage this configuration with ms and a custom script (using
> ocf:pacemaker:Stateful as example). If the cluster is already operational,
> the failover works fine.
>
> My problem is about cluster start-up: in fact, only the previous running
> master and slave own the most updated data; so I would like that the new
> master should be the "old master" (or, even, the old slave), and the new
> slave should be the "old slave" (but this one is not mandatory). The
> important thing is that the new master should have up-to-date data.
> This should happen even if the servers are booted up with some minutes of
> delay between them. (users are very stupid sometimes).

The latest pgsql RA implements these ideas to manage replication.

1. First boot
The RA compares the data on each node and promotes the PostgreSQL
instance that has the latest data.
The number of comparisons can be changed using the xlog_check_count
parameter. If the monitor interval is 10 sec and xlog_check_count is
360, the RA can wait up to 1 hour before promoting :)
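
As a rough illustration only (this is not the RA's actual code), the
sketch below works out the maximum wait from the monitor interval and
xlog_check_count, and then picks the node whose xlog location is
furthest ahead. The node names, the sample locations, the "hi/lo" hex
location format, and all function names are assumptions made for the
example.

```python
def max_wait_seconds(monitor_interval_sec, xlog_check_count):
    """Upper bound on how long promotion can be delayed."""
    return monitor_interval_sec * xlog_check_count


def parse_xlog_location(location):
    """Convert an 'hi/lo' hex xlog location into a comparable integer.

    Assumes the PostgreSQL-style format such as '0/3000090'.
    """
    hi, lo = location.split("/")
    return (int(hi, 16) << 32) + int(lo, 16)


def node_with_latest_data(locations):
    """Return the node name whose xlog location is the most advanced."""
    return max(locations, key=lambda node: parse_xlog_location(locations[node]))


# 10 sec monitor interval * 360 checks = 3600 sec = 1 hour
print(max_wait_seconds(10, 360))

# Hypothetical locations reported by three nodes
locations = {
    "node1": "0/3000090",
    "node2": "0/30000F8",
    "node3": "0/2FFFF00",
}
print(node_with_latest_data(locations))
```

With these sample values the first call gives 3600 seconds and node2 is
chosen, because 0x30000F8 is the largest location.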

2. Second boot
The master records the state of each slave's data in a node attribute
set with the "-l forever" option.
So the RA won't start PostgreSQL if the node does not have the latest
data.
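
The second-boot rule could be modelled roughly like this. It is only a
sketch of the idea, not the RA's implementation: the status values
"LATEST" and "OLD", the dictionary standing in for the persistent
attributes, and the function names are all hypothetical.

```python
# Stand-in for node attributes stored with "-l forever", i.e. values
# that survive a cluster restart. The values are hypothetical.
LATEST = "LATEST"
OLD = "OLD"


def record_data_status(attributes, node, is_up_to_date):
    """Called on the master side: remember whether a node's data is current."""
    attributes[node] = LATEST if is_up_to_date else OLD


def may_start(attributes, node):
    """On the next boot, allow PostgreSQL to start only with the latest data.

    A node with no recorded status is treated as not having the latest data.
    """
    return attributes.get(node) == LATEST


attributes = {}
record_data_status(attributes, "node1", True)   # old master
record_data_status(attributes, "node2", True)   # slave that kept up
record_data_status(attributes, "node3", False)  # node that fell behind

for node in ("node1", "node2", "node3", "node4"):
    print(node, may_start(attributes, node))
```

Here node3 (stale) and node4 (never recorded) would be refused, no
matter which node happens to boot first.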

> My idea is the following:
> the MS resource is not started when the cluster comes up, but on startup
> there will only be one "arbitrator" resource (started on only one node).
> This resource reads from somewhere which was the previous master and the
> previous slave, and it wait up to 5 minutes to see if one of them comes up.
> In positive case, it forces the MS master resource to be run on that node
> (and start it); in negative case, if the wait timer expired, it start the
> master resource on a random node.
>
> Is that possible? How can avoid a single resource to start on cluster boot?
> Or, could you advise another way to do this setup?
>
> I hope I was clear, my english is not so good :)
> thank you so much,
>    stefano
>
> --
> Stefano Sasso
> http://stefano.dscnet.org/

Regards,
Takatoshi MATSUO