[Pacemaker] Postgresql streaming replication failover - RA needed
Takatoshi MATSUO
matsuo.tak at gmail.com
Fri Dec 9 05:34:49 UTC 2011
Hi Attila
2011/12/8 Attila Megyeri <amegyeri at minerva-soft.com>:
> Hi Takatoshi,
>
> One strange thing I noticed and could probably be improved.
> When there is data inconsistency, I have the following node properties:
>
> * Node psql2:
> + default_ping_set : 100
> + master-postgresql:1 : -INFINITY
> + pgsql-data-status : DISCONNECT
> + pgsql-status : HS:alone
> * Node psql1:
> + default_ping_set : 100
> + master-postgresql:0 : 1000
> + master-postgresql:1 : -INFINITY
> + pgsql-data-status : LATEST
> + pgsql-master-baseline : 58:000000004B000020
> + pgsql-status : PRI
>
> This is fine, and understandable - but I can see this only if I do a crm_mon -A.
>
> My problem is, that CRM shows the following:
>
> Master/Slave Set: db-ms-psql [postgresql]
> Masters: [ psql1 ]
> Slaves: [ psql2 ]
>
> So if I monitor the system from crm_mon, HAWK or other tools - I have no indication at all that the slave is running in an inconsistent mode.
>
> I would expect the RA to stop the psql2 node in such cases, because:
> - It is running, but has non-up-to-date data, therefore no one will use it (the slave IP points to the master as well, which is good)
> - In CRM status everything looks perfect, even though it is NOT perfect and admin intervention is required.
>
>
> Shouldn't the disconnected PSQL server be stopped instead?
Hmm.
I don't think stopping the PostgreSQL server is the better choice.
The RA cannot tell whether PostgreSQL is disconnected because of
inconsistent data, a network failure, or because it is still
starting up, and so on.
How about using a Dummy RA, configured like vip-slave?
-------------------------------------------
primitive runningSlaveOK ocf:heartbeat:Dummy
.....(snip)
location rsc_location-dummy runningSlaveOK \
    rule 200: pgsql-status eq "HS:sync"
-------------------------------------------
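
The node attributes printed by crm_mon -A can also be checked from a
script, which gives monitoring tools an explicit signal in addition to
the Dummy resource. The following is only a minimal Python sketch: the
function names parse_node_attributes and degraded_nodes are hypothetical,
and the "degraded" values are limited to the ones quoted in this thread
(HS:alone and DISCONNECT). The RA may report other values that would
need to be added.

```python
import re

# Values taken from the crm_mon -A output quoted in this thread.
# Assumption: these are treated as "needs attention"; extend as required.
DEGRADED_STATUS = {"HS:alone"}
DEGRADED_DATA_STATUS = {"DISCONNECT"}


def parse_node_attributes(text):
    """Parse '* Node X:' / '+ name : value' lines into {node: {name: value}}."""
    nodes = {}
    current = None
    for raw in text.splitlines():
        # Drop mail quoting ("> ") and surrounding whitespace, if present.
        line = raw.lstrip("> ").strip()
        node = re.match(r"^\* Node (\S+):$", line)
        if node:
            current = nodes.setdefault(node.group(1), {})
            continue
        attr = re.match(r"^\+ (\S+)\s*: (.*)$", line)
        if attr and current is not None:
            current[attr.group(1)] = attr.group(2).strip()
    return nodes


def degraded_nodes(nodes):
    """Return sorted names of nodes whose pgsql attributes look degraded."""
    bad = []
    for name, attrs in nodes.items():
        if (attrs.get("pgsql-status") in DEGRADED_STATUS
                or attrs.get("pgsql-data-status") in DEGRADED_DATA_STATUS):
            bad.append(name)
    return sorted(bad)


SAMPLE = """\
* Node psql2:
+ default_ping_set : 100
+ master-postgresql:1 : -INFINITY
+ pgsql-data-status : DISCONNECT
+ pgsql-status : HS:alone
* Node psql1:
+ default_ping_set : 100
+ master-postgresql:0 : 1000
+ master-postgresql:1 : -INFINITY
+ pgsql-data-status : LATEST
+ pgsql-master-baseline : 58:000000004B000020
+ pgsql-status : PRI
"""

if __name__ == "__main__":
    for name in degraded_nodes(parse_node_attributes(SAMPLE)):
        print("WARNING: %s is not replicating" % name)
```

In practice the text would come from the output of crm_mon -1 -A instead
of the SAMPLE string, and the script's result could be fed to whatever
alerting is already in place.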
Regards,
Takatoshi MATSUO