[Pacemaker] pgsql RA
Евгений Селявка
evg.selyavka at gmail.com
Mon May 19 16:25:04 UTC 2014
Dear users,
I use the Pacemaker pgsql RA script, which supports replication, in a
setup with 2 identical servers without shared storage. This is my
environment:
Linux billing-db1 2.6.32-358.23.2.el6.x86_64 #1 SMP Wed Oct 19
18:37:12 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
[root@billing-db1 ~]# cat /etc/issue
CentOS release 6.4 (Final)
Kernel \r on an \m
[root@billing-db1 ~]# rpm -qa | grep pacemaker
pacemaker-libs-1.1.10-14.el6_5.1.x86_64
pacemaker-cli-1.1.10-14.el6_5.1.x86_64
pacemaker-1.1.10-14.el6_5.1.x86_64
pacemaker-cluster-libs-1.1.10-14.el6_5.1.x86_64
[root@billing-db1 ~]# rpm -qa | grep corosync
corosynclib-1.4.1-15.el6_4.1.x86_64
corosync-1.4.1-15.el6_4.1.x86_64
[root@billing-db1 ~]# rpm -qa | grep clus
clusterlib-3.0.12.1-49.el6_4.2.x86_64
PostgreSQL 9.1.11 on x86_64-unknown-linux-gnu, compiled by gcc (GCC)
4.4.7 20120313 (Red Hat 4.4.7-3), 64-bit
The pgsql RA is taken from https://github.com/t-matsuo/resource-agents
This is the cluster log output for the command
pcs resource cleanup msPostgresql:
May 19 18:59:51 [4136] billing-db1 crmd: info:
delete_resource: Removing resource pgsql for
95f5d2f4-ca62-4966-8cc2-89c2b9898a50 (internal) on billing-db2
May 19 18:59:51 [4133] billing-db1 lrmd: info:
cancel_recurring_action: Cancelling operation
pgsql_monitor_10000
May 19 18:59:51 [4134] billing-db1 attrd: notice:
attrd_cs_dispatch: Update relayed from billing-db2
May 19 18:59:51 [4136] billing-db1 crmd: info:
lrm_remove_deleted_op: Removing op pgsql_monitor_10000:317 for
deleted resource pgsql
May 19 18:59:51 [4136] billing-db1 crmd: info:
notify_deleted: Notifying 95f5d2f4-ca62-4966-8cc2-89c2b9898a50
on billing-db2 that pgsql was deleted
May 19 18:59:51 [4134] billing-db1 attrd: notice:
attrd_cs_dispatch: Update relayed from billing-db2
May 19 18:59:51 [4131] billing-db1 cib: info:
cib_process_request: Forwarding cib_delete operation for section
//node_state[@uname='billing-db1']//lrm_resource[@id='pgsql'] to
master (origin=local/crmd/183)
May 19 18:59:51 [4131] billing-db1 cib: info:
cib_process_request: Completed cib_query operation for section
//cib/configuration/crm_config//cluster_property_set//nvpair[@name='last-lrm-refresh']:
OK (rc=0, origin=local/crmd/184, version=0.72.1)
May 19 18:59:51 [4131] billing-db1 cib: info:
cib_process_request: Forwarding cib_modify operation for section
crm_config to master (origin=local/crmd/185)
May 19 18:59:51 [4133] billing-db1 lrmd: info:
process_lrmd_get_rsc_info: Resource 'pgsql' not found (3 active
resources)
May 19 18:59:51 [4136] billing-db1 crmd: info:
process_lrm_event: LRM operation pgsql_monitor_10000 (call=317,
status=1, cib-update=0, confirmed=true) Cancelled
May 19 18:59:51 [4136] billing-db1 crmd: info:
update_history_cache: Resource pgsql no longer exists, not updating
cache
May 19 18:59:51 [4131] billing-db1 cib: info:
cib_process_request: Completed cib_apply_diff operation for section
//node_state[@uname='billing-db2']//lrm_resource[@id='pgsql']: OK
(rc=0, origin=billing-db2/crmd/2058, version=0.72.2)
May 19 18:59:51 [4131] billing-db1 cib: info:
cib_process_request: Completed cib_apply_diff operation for section
//node_state[@uname='billing-db1']//lrm_resource[@id='pgsql']: OK
(rc=0, origin=billing-db2/crmd/183, version=0.72.3)
May 19 18:59:51 [4131] billing-db1 cib: info:
cib_process_request: Completed cib_apply_diff operation for section
crm_config: OK (rc=0, origin=billing-db2/crmd/185, version=0.73.1)
May 19 18:59:51 [4131] billing-db1 cib: info:
cib_process_request: Completed cib_query operation for section
crm_config: OK (rc=0, origin=local/crmd/186, version=0.73.1)
May 19 18:59:51 [4136] billing-db1 crmd: info:
plugin_handle_membership: Membership 1192: quorum retained
May 19 18:59:51 [4131] billing-db1 cib: info:
write_cib_contents: Archived previous version as
/var/lib/pacemaker/cib/cib-27.raw
May 19 18:59:51 [4131] billing-db1 cib: info:
write_cib_contents: Wrote version 0.73.0 of the CIB to disk
(digest: 76e018df11f3d006d9ba3e0d6fc28225)
May 19 18:59:51 [4133] billing-db1 lrmd: info:
process_lrmd_get_rsc_info: Resource 'pgsql' not found (3 active
resources)
May 19 18:59:51 [4133] billing-db1 lrmd: info:
process_lrmd_get_rsc_info: Resource 'pgsql:0' not found (3 active
resources)
May 19 18:59:51 [4133] billing-db1 lrmd: info:
process_lrmd_rsc_register: Added 'pgsql' to the rsc list (4 active
resources)
May 19 18:59:51 [4136] billing-db1 crmd: info:
do_lrm_rsc_op: Performing
key=8:444:7:c52110df-29fb-41e4-9f4b-fa87787771ba op=pgsql_monitor_0
May 19 18:59:51 [4131] billing-db1 cib: info: retrieveCib:
Reading cluster configuration from: /var/lib/pacemaker/cib/cib.MjStjO
(digest: /var/lib/pacemaker/cib/cib.z5hNP8)
May 19 18:59:51 [4131] billing-db1 cib: info:
crm_client_new: Connecting 0x1b88300 for uid=0 gid=0 pid=32084
id=2c3b9838-ac73-4bc0-b8b0-e7c896a55a3f
May 19 18:59:51 [4131] billing-db1 cib: info:
cib_process_request: Completed cib_query operation for section
'all': OK (rc=0, origin=local/crm_mon/2, version=0.73.1)
May 19 18:59:51 [4131] billing-db1 cib: info:
crm_client_destroy: Destroying 0 events
May 19 18:59:51 [4131] billing-db1 cib: info:
crm_client_new: Connecting 0x1b88300 for uid=0 gid=0 pid=32095
id=78412cee-994e-4efc-a268-f361c5039681
May 19 18:59:51 [4131] billing-db1 cib: info:
cib_process_request: Completed cib_query operation for section
nodes: OK (rc=0, origin=local/crm_attribute/2, version=0.73.1)
May 19 18:59:51 [4131] billing-db1 cib: info:
cib_process_request: Completed cib_query operation for section
//cib/configuration/nodes//node[@id='billing-db1']//instance_attributes//nvpair[@name='pgsql-data-status']:
OK (rc=0, origin=local/crm_attribute/3, version=0.73.1)
May 19 18:59:51 [4131] billing-db1 cib: info:
crm_client_destroy: Destroying 0 events
May 19 18:59:51 [4133] billing-db1 lrmd: notice:
operation_finished: pgsql_monitor_0:32032:stderr [
2014/05/19_18:59:51 INFO: Don't check /var/lib/pgsql/9.1/data/ during
probe ]
May 19 18:59:51 [4136] billing-db1 crmd: notice:
process_lrm_event: LRM operation pgsql_monitor_0 (call=326, rc=0,
cib-update=187, confirmed=true) ok
May 19 18:59:51 [4131] billing-db1 cib: info:
cib_process_request: Forwarding cib_modify operation for section
status to master (origin=local/crmd/187)
May 19 18:59:51 [4131] billing-db1 cib: info:
cib_process_request: Completed cib_apply_diff operation for section
status: OK (rc=0, origin=billing-db2/crmd/187, version=0.73.2)
May 19 18:59:51 [4131] billing-db1 cib: info:
cib_process_request: Completed cib_apply_diff operation for section
status: OK (rc=0, origin=billing-db2/crmd/2067, version=0.73.3)
May 19 18:59:51 [4136] billing-db1 crmd: info:
do_lrm_rsc_op: Performing
key=15:445:0:c52110df-29fb-41e4-9f4b-fa87787771ba
op=pgsql_monitor_10000
May 19 18:59:51 [4131] billing-db1 cib: info:
crm_client_new: Connecting 0x1beca10 for uid=0 gid=0 pid=32155
id=057f384e-4142-4cfd-8df3-91d1262ecf01
May 19 18:59:51 [4131] billing-db1 cib: info:
cib_process_request: Completed cib_query operation for section
'all': OK (rc=0, origin=local/crm_mon/2, version=0.73.3)
May 19 18:59:51 [4131] billing-db1 cib: info:
crm_client_destroy: Destroying 0 events
May 19 18:59:51 [4131] billing-db1 cib: info:
crm_client_new: Connecting 0x1beca10 for uid=0 gid=0 pid=32166
id=5f86fae5-b028-4c63-94a2-461d8adfd1b0
May 19 18:59:51 [4131] billing-db1 cib: info:
cib_process_request: Completed cib_query operation for section
nodes: OK (rc=0, origin=local/crm_attribute/2, version=0.73.3)
May 19 18:59:51 [4131] billing-db1 cib: info:
cib_process_request: Completed cib_query operation for section
//cib/configuration/nodes//node[@id='billing-db1']//instance_attributes//nvpair[@name='pgsql-data-status']:
OK (rc=0, origin=local/crm_attribute/3, version=0.73.3)
May 19 18:59:51 [4131] billing-db1 cib: info:
crm_client_destroy: Destroying 0 events
May 19 18:59:51 [4136] billing-db1 crmd: notice:
process_lrm_event: LRM operation pgsql_monitor_10000 (call=329,
rc=0, cib-update=188, confirmed=false) ok
May 19 18:59:51 [4131] billing-db1 cib: info:
cib_process_request: Forwarding cib_modify operation for section
status to master (origin=local/crmd/188)
May 19 18:59:51 [4131] billing-db1 cib: info:
cib_process_request: Completed cib_apply_diff operation for section
status: OK (rc=0, origin=billing-db2/crmd/188, version=0.73.4)
May 19 18:59:51 [4131] billing-db1 cib: info:
cib_process_request: Completed cib_apply_diff operation for section
status: OK (rc=0, origin=billing-db2/crmd/2069, version=0.73.5)
The problem is that, after a failover occurs, one of the servers stays
in the HS:alone status permanently. I tried to set the attributes by
hand with the commands
crm_attribute -l forever -N billing-db1 -n "pgsql-data-status" -v "STREAMING|SYNC"
and
crm_attribute -l forever -N billing-db1 -n "pgsql-status" -v "HS:sync"
I also manually added the string 'billing-db1' to the file
/var/lib/pgsql/tmp/rep_mode.conf on the master server, billing-db2, and
replication really is running:
user=postgres,db=postgres@[local]:5433=# SELECT
application_name,state,sync_state from pg_stat_replication ;
┌──────────────────┬───────────┬────────────┐
│ application_name │   state   │ sync_state │
├──────────────────┼───────────┼────────────┤
│ billing-db1      │ streaming │ sync       │
└──────────────────┴───────────┴────────────┘
[root@billing-db2 ~]# cat ~postgres/tmp/rep_mode.conf
synchronous_standby_names = 'billing-db1'
So the two servers are now in synchronous replication, but the RA script
does not detect this.
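The manual pg_stat_replication check above could be scripted roughly as
follows. This is only a sketch: it assumes the same query has been run
with psql in unaligned, tuples-only mode (-A -t), so that each row comes
back as application_name|state|sync_state. The function names are
hypothetical and are not part of the pgsql RA.

```python
def parse_replication_rows(psql_output):
    """Parse 'application_name|state|sync_state' lines (psql -A -t output)."""
    rows = []
    for line in psql_output.splitlines():
        line = line.strip()
        if not line:
            continue
        fields = line.split("|")
        if len(fields) != 3:
            continue  # skip anything that is not a three-column row
        name, state, sync_state = fields
        rows.append(
            {"application_name": name, "state": state, "sync_state": sync_state}
        )
    return rows


def is_sync_standby(rows, standby):
    """True if the named standby is streaming with sync_state 'sync'."""
    return any(
        r["application_name"] == standby
        and r["state"] == "streaming"
        and r["sync_state"] == "sync"
        for r in rows
    )


if __name__ == "__main__":
    # Row taken from the query result shown above, in unaligned form.
    sample = "billing-db1|streaming|sync\n"
    print(is_sync_standby(parse_replication_rows(sample), "billing-db1"))
```

This only confirms what PostgreSQL itself reports on the master; it says
nothing about what the cluster attributes contain.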
[root@billing-db2 ~]# crm_mon -Af1
Last updated: Mon May 19 19:20:00 2014
Last change: Mon May 19 19:11:56 2014 via crmd on billing-db1
Stack: classic openais (with plugin)
Current DC: billing-db2 - partition with quorum
Version: 1.1.10-14.el6_5.1-368c726
2 Nodes configured, 2 expected votes
6 Resources configured
Online: [ billing-db1 billing-db2 ]
vip-master (ocf::heartbeat:IPaddr2): Started billing-db2
vip-slave (ocf::heartbeat:IPaddr2): Started billing-db2
Master/Slave Set: msPostgresql [pgsql]
Masters: [ billing-db2 ]
Slaves: [ billing-db1 ]
Clone Set: clnPingCheck [pingCheck]
Started: [ billing-db1 billing-db2 ]
Node Attributes:
* Node billing-db1:
+ default_ping_set : 100
+ master-pgsql : 1000
+ pgsql-data-status : STREAMING|SYNC
+ pgsql-status : HS:alone
+ pgsql-xlog-loc : 0000002616E52B88
* Node billing-db2:
+ default_ping_set : 100
+ master-pgsql : 1000
+ pgsql-data-status : LATEST
+ pgsql-master-baseline : 0000002606B0B2C8
+ pgsql-status : PRI
Migration summary:
* Node billing-db1:
* Node billing-db2:
As you can see, pgsql-data-status on billing-db1 is STREAMING|SYNC, so I
expect pgsql-status to be HS:sync, but it is still HS:alone. On both
servers I don't see anything except rep_mode.conf in ~postgres/tmp/
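The mismatch described above can also be spotted mechanically from the
"Node Attributes" section of the crm_mon -Af1 output. The following is a
minimal sketch, assuming the output layout shown in this message; the
function names are hypothetical, and the rule "STREAMING|SYNC should go
together with HS:sync" is the expectation stated in this message, not a
documented guarantee of the RA.

```python
import re

NODE_RE = re.compile(r"^\*\s+Node\s+(\S+?):\s*$")
ATTR_RE = re.compile(r"^\+\s+(\S+)\s*:\s*(\S+)\s*$")

# Excerpt of the crm_mon -Af1 output quoted above.
SAMPLE = """\
Node Attributes:
* Node billing-db1:
+ pgsql-data-status : STREAMING|SYNC
+ pgsql-status : HS:alone
* Node billing-db2:
+ pgsql-data-status : LATEST
+ pgsql-status : PRI
Migration summary:
* Node billing-db1:
* Node billing-db2:
"""


def parse_node_attributes(crm_mon_text):
    """Return {node: {attribute: value}} from the 'Node Attributes' section."""
    attrs = {}
    node = None
    in_section = False
    for raw in crm_mon_text.splitlines():
        line = raw.strip()
        if line == "Node Attributes:":
            in_section = True
            continue
        if not in_section:
            continue
        match = NODE_RE.match(line)
        if match:
            node = match.group(1)
            attrs[node] = {}
            continue
        match = ATTR_RE.match(line)
        if match and node is not None:
            attrs[node][match.group(1)] = match.group(2)
            continue
        if line and not line.startswith(("*", "+")):
            break  # reached the next section, e.g. "Migration summary:"
    return attrs


def find_status_mismatches(attrs):
    """Nodes with data status STREAMING|SYNC whose pgsql-status is not HS:sync."""
    return [
        node
        for node, values in sorted(attrs.items())
        if values.get("pgsql-data-status") == "STREAMING|SYNC"
        and values.get("pgsql-status") != "HS:sync"
    ]


if __name__ == "__main__":
    print(find_status_mismatches(parse_node_attributes(SAMPLE)))
```

In practice the text would come from running crm_mon -Af1 on one of the
nodes rather than from the embedded excerpt.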
The only option I can see is to erase the configuration on both nodes
and apply a new one, but that would cause downtime.
Where might the problem be? Could you help with some advice?
--
Best Regards,
Seliavka Evgenii