[Pacemaker] Pacemaker CoroSync + PGPool-II

Tue Apr 24 14:59:12 UTC 2012

After some searching on setting up "PGPool-HA" so that pgpool is not a single point of failure, it looks like development on the heartbeat project has slowed considerably and has shifted to corosync (backed by RedHat and Suse), which is recommended by pacemaker.

I found an article at http://masteinhauser.github.com/blog/2011/09/24/pacemaker-pgpool2/ that explains using pacemaker with pgpool-II. The post provides a resource agent. I had to make one quick tweak for the PGPool-II path created by the RPMs installed from http://yum.postgresql.org/9.1/redhat/rhel-$releasever-$basearch: in the lines below I changed the pid file directory from /var/run/pgpool/ to /var/run/pgpool-II-91

pgpool2_status() {
    if [ ! -r "/var/run/pgpool-II-91/pgpool.pid" ]; then
        return $OCF_NOT_RUNNING
    fi
    ps_info=$(ps ax | grep "pgpool" | grep $(cat /var/run/pgpool-II-91/pgpool.pid))
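
A possible alternative to hardcoding the path and grepping ps output, as a minimal sketch only: the helper name pidfile_alive and the PGPOOL_PIDFILE variable are hypothetical and not part of the published agent. It reads the pid file and uses kill -0 to test whether that process exists.

```shell
# Hypothetical helper, not part of the published resource agent.
# PGPOOL_PIDFILE is an assumed variable; the default is the path from my RPM install.
PGPOOL_PIDFILE="${PGPOOL_PIDFILE:-/var/run/pgpool-II-91/pgpool.pid}"

pidfile_alive() {
    # $1: pid file to check (defaults to PGPOOL_PIDFILE)
    pidfile="${1:-$PGPOOL_PIDFILE}"
    [ -r "$pidfile" ] || return 1
    pid=$(cat "$pidfile")
    # Reject empty or non-numeric content before signalling
    case "$pid" in
        ''|*[!0-9]*) return 1 ;;
    esac
    # kill -0 sends no signal; it only tests that the process exists
    # and can be signalled by the caller (so run it as root or the pgpool user)
    kill -0 "$pid" 2>/dev/null
}
```

Calling pidfile_alive with no argument checks the default path; a non-zero return means the pid file is missing, unreadable, malformed, or stale.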

I used the following parameters to create the resource:

crm configure primitive pgPool ocf:heartbeat:pgpool2 \
params pcp_admin_username=postgres \
params pcp_admin_password=password \
params pcp_admin_port=9898 \
params pcp_admin_host=localhost \
params pgpool_bin=/usr/bin/pgpool \
params pcp_attach_node_bin=/usr/bin/pcp_attach_node \
params pcp_detach_node_bin=/usr/bin/pcp_detach_node \
params pcp_node_count_bin=/usr/bin/pcp_node_count \
params pcp_node_info_bin=/usr/bin/pcp_node_info \
params stop_mode=f \
params auto_reconnect=t \
params fail_on_detached=true \
op monitor interval=1min

The resource looks to be created correctly, but when I (re)start the corosync service and look at crm_mon I see some failed actions:

============
Last updated: Tue Apr 24 08:31:08 2012
Last change: Tue Apr 24 08:02:31 2012 via cibadmin on pg1.stage.arin.net
Stack: openais
Current DC: pg2.stage.arin.net - partition with quorum
Version: 1.1.6-3.el6-a02c0f19a00c1eb2527ad38f146ebc0834814558
2 Nodes configured, 2 expected votes
2 Resources configured.
============

Online: [ pg1.stage.net pg2.stage.net ]

ClusterIP (ocf::heartbeat:IPaddr2): Started pg2.stage.net

Failed actions:
    pgPool_monitor_0 (node=pg1.stage.net, call=3, rc=2, status=complete): invalid parameter
    pgPool_monitor_0 (node=pg2.stage.net, call=3, rc=2, status=complete): invalid parameter
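
For reference, rc=2 in the output above is the OCF return code OCF_ERR_ARGS, which is what crm_mon prints as "invalid parameter". A small lookup sketch for the common codes follows; the function name ocf_rc_name is hypothetical, while the code values are the standard OCF ones.

```shell
# Hypothetical helper: translate a numeric OCF return code into its name.
ocf_rc_name() {
    case "$1" in
        0) echo OCF_SUCCESS ;;
        1) echo OCF_ERR_GENERIC ;;
        2) echo OCF_ERR_ARGS ;;
        3) echo OCF_ERR_UNIMPLEMENTED ;;
        4) echo OCF_ERR_PERM ;;
        5) echo OCF_ERR_INSTALLED ;;
        6) echo OCF_ERR_CONFIGURED ;;
        7) echo OCF_NOT_RUNNING ;;
        *) echo "unknown ($1)" ;;
    esac
}
```

For example, ocf_rc_name 2 prints OCF_ERR_ARGS, so the agent itself is rejecting something about the arguments or parameters it was given.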

When I look in /var/log/cluster/corosync.log I see this error:

Apr 24 08:23:48 pg1.stage.net lrmd: [28471]: WARN: Managed pgPool:monitor process 28484 exited with return code 2
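
One way to narrow this down might be to run the agent's monitor action by hand, outside the cluster. OCF agents read each resource parameter from an OCF_RESKEY_<name> environment variable, so the sketch below sets a few of them and calls the agent directly. The function name run_monitor is hypothetical, and the AGENT path is an assumption based on where ocf:heartbeat agents are normally installed.

```shell
# Assumed locations: ocf:heartbeat:pgpool2 normally maps to
# $OCF_ROOT/resource.d/heartbeat/pgpool2; adjust if yours differs.
OCF_ROOT="${OCF_ROOT:-/usr/lib/ocf}"
AGENT="${AGENT:-$OCF_ROOT/resource.d/heartbeat/pgpool2}"

# Hypothetical helper: run the agent's monitor action with parameters
# passed the same way the cluster passes them.
run_monitor() {
    # Only some of the parameters are shown; add the rest the same way.
    env OCF_ROOT="$OCF_ROOT" \
        OCF_RESKEY_pcp_admin_username=postgres \
        OCF_RESKEY_pcp_admin_port=9898 \
        OCF_RESKEY_pcp_admin_host=localhost \
        OCF_RESKEY_pgpool_bin=/usr/bin/pgpool \
        "$AGENT" monitor
    # $? here is the exit status of the agent itself
    echo "monitor exited with rc=$?"
}
```

Running run_monitor as root on one node should print whatever the agent complains about, followed by its return code, which can be compared with the rc=2 reported by lrmd.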

Has anyone run into a similar problem, or does anyone have suggestions for a cluster solution with pgpool-II?

v/r

STEVE