[Pacemaker] Postgresql streaming replication failover - RA needed

Thu Dec 8 12:55:23 UTC 2011

Hi Takatoshi,

I noticed one strange thing that could probably be improved.
When there is data inconsistency, I have the following node properties:

* Node psql2:
    + default_ping_set                  : 100
    + master-postgresql:1               : -INFINITY
    + pgsql-data-status                 : DISCONNECT
    + pgsql-status                      : HS:alone
* Node psql1:
    + default_ping_set                  : 100
    + master-postgresql:0               : 1000
    + master-postgresql:1               : -INFINITY
    + pgsql-data-status                 : LATEST
    + pgsql-master-baseline             : 58:000000004B000020
    + pgsql-status                      : PRI

This is fine, and understandable - but I can see this only if I do a crm_mon -A.

My problem is, that CRM shows the following:

Master/Slave Set: db-ms-psql [postgresql]
     Masters: [ psql1 ]
     Slaves: [ psql2 ]

So if I monitor the system from crm_mon, HAWK or other tools, I have no indication at all that the slave is running in an inconsistent mode.

I would expect the RA to stop the psql2 node in such cases, because:
- It is running, but its data is out of date, so no one will use it (the slave IP points to the master as well, which is good)
- In the CRM status everything looks perfect, even though it is NOT perfect and admin intervention is required.

Shouldn't the disconnected PSQL server be stopped instead?
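
In the meantime I could watch the attribute myself. Below is a minimal sketch, assuming the node attribute layout shown above and treating only LATEST and STREAMING|SYNC as healthy values. The function name check_pgsql_data_status is my own invention; it is not part of the RA or of Pacemaker.

```sh
#!/bin/sh
# Sketch: read "crm_mon -A" style text on stdin and list every node whose
# pgsql-data-status is neither LATEST nor STREAMING|SYNC.
# Assumption: node headers look like "* Node psql1:" as in the output above.
# Exit status is 1 if at least one such node was found, 0 otherwise.
check_pgsql_data_status() {
    awk '
        /^\* Node / { node = $3; sub(/:$/, "", node) }
        /pgsql-data-status/ {
            status = $NF
            if (status != "LATEST" && status != "STREAMING|SYNC") {
                printf "%s: pgsql-data-status=%s\n", node, status
                bad = 1
            }
        }
        END { exit bad + 0 }
    '
}

# Hypothetical usage on a cluster node (one-shot output with node attributes):
#   crm_mon -A -1 | check_pgsql_data_status || echo "replication needs attention"
```

With the attributes listed above, this would report psql2 and return a non-zero status, which a monitoring system could turn into an alert.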

Regards,
Attila

-----Original Message-----
From: Takatoshi MATSUO [mailto:matsuo.tak at gmail.com]
Sent: November 28, 2011 11:10
To: The Pacemaker cluster resource manager
Subject: Re: [Pacemaker] Postgresql streaming replication failover - RA needed

Hi Attila

2011/11/28 Attila Megyeri <amegyeri at minerva-soft.com>:
> Hi Takatoshi,
>
> I understand your point and I agree that the correct behavior is not to start replication when data inconsistency exists.
> The only thing I do not really understand is how it could have happened:
>
> 1) nodes were in sync (psql1=PRI, psql2=STREAMING|SYNC)
> 2) I shut down node psql1 (by placing it into standby)
> 3) At this moment psql1's baseline became higher by 20?  What could cause this? Probably the demote operation itself? There were no clients connected, and there was definitely no write operation to the db (unless it came from the system side).

Yes, PostgreSQL executes a CHECKPOINT when it is shut down normally on demote.
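
The size of that gap can be checked from the baselines the RA writes to the log. Here is a small sketch, assuming the part after the colon can be read as a single hexadecimal number. The name baseline_gap_hex is made up for this example and is not an RA function, and the sketch does not compare the timeline IDs.

```sh
#!/bin/sh
# Print, in hex, how far baseline $1 is ahead of baseline $2.
# Both arguments use the "timeline:location" form that the RA logs.
# Assumptions: the location is one hexadecimal number and $1 is not behind $2.
baseline_gap_hex() {
    printf '%X\n' $(( 0x${1#*:} - 0x${2#*:} ))
}

# Values from the Nov 27 log: psql1 when demoted, psql2 when promoted.
baseline_gap_hex 14:0000000023000020 14:0000000023000000   # prints 20
```

A non-zero result means the old primary has WAL that the promoted node never received.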

> On the other hand - thank you very much for your contribution, the RA works very well and I really appreciate your work and help!

Not at all. Don't mention it.

Regards,
Takatoshi MATSUO

> Bests,
>
> Attila
>
> -----Original Message-----
> From: Takatoshi MATSUO [mailto:matsuo.tak at gmail.com]
> Sent: November 28, 2011 2:10
> To: The Pacemaker cluster resource manager
> Subject: Re: [Pacemaker] Postgresql streaming replication failover -
> RA needed
>
> Hi Attila
>
> The primary may not send all WAL to the hot standby, even if the primary is shut down normally.
> These logs validate it.
>
>> Nov 27 16:03:27 psql1 pgsql[12204]: INFO: My Timeline ID and Checkpoint : 14:0000000023000020
>> Nov 27 16:03:27 psql1 pgsql[12204]: INFO: psql2 master baseline : 14:0000000023000000
>
> psql1's location was  "0000000023000020" when it was demoted.
> OTOH psql2's location was "0000000023000000"  when it was promoted.
>
> It means that psql1's data was newer than psql2's one at that time.
> The gap is 20.
>
> As you said, you can start PostgreSQL on psql1 manually, but PostgreSQL cannot detect this situation.
> If you start a hot standby on psql1, data is replicated only from 0000000023000020 onward.
> That is an inconsistency.
>
> Thanks,
> Takatoshi MATSUO
>
>
> 2011/11/28 Attila Megyeri <amegyeri at minerva-soft.com>:
>> Hi Takatoshi,
>>
>> I don't think it is an inconsistency problem; to me it looks like an RA bug.
>> I think so, because postgres starts properly outside pacemaker.
>>
>> When pacemaker starts node psql1 I see only:
>>
>> postgresql:0_start_0 (node=psql1, call=9, rc=1, status=complete): unknown error
>>
>> and the postgres log is empty - so I suppose that it does not even try to start it.
>>
>> What I tested was:
>> - I had a stable cluster, where psql1 was the master and psql2 was the slave
>> - I put psql1 into standby mode ("node psql1 standby") to test failover
>> - After a while psql2 became the PRI, which is very good
>> - When I put psql1 back online, postgres wouldn't start anymore from pacemaker (unknown error).
>>
>>
>> When I started postgres manually from the shell, it worked fine. Even the monitor was able to see that it went into SYNC (obviously the master/slave group was showing an improper state, as psql was started outside pacemaker).
>>
>> I don't think data inconsistency is the case, partially because there are no clients connected, partially because psql starts properly outside pacemaker.
>>
>> Here is what is relevant from the log:
>>
>> Nov 27 16:02:50 psql1 pgsql[11021]: DEBUG: PostgreSQL is running as a primary.
>> Nov 27 16:02:51 psql1 pgsql[11021]: DEBUG: node=psql2, state=STREAMING, sync_state=SYNC
>> Nov 27 16:02:53 psql1 pgsql[11142]: DEBUG: PostgreSQL is running as a primary.
>> Nov 27 16:02:53 psql1 pgsql[11142]: DEBUG: node=psql2, state=STREAMING, sync_state=SYNC
>> Nov 27 16:02:55 psql1 pgsql[11272]: DEBUG: PostgreSQL is running as a primary.
>> Nov 27 16:02:55 psql1 pgsql[11272]: DEBUG: node=psql2, state=STREAMING, sync_state=SYNC
>> Nov 27 16:02:57 psql1 pgsql[11368]: DEBUG: PostgreSQL is running as a primary.
>> Nov 27 16:02:57 psql1 pgsql[11368]: DEBUG: node=psql2, state=STREAMING, sync_state=SYNC
>> Nov 27 16:03:00 psql1 pgsql[11463]: DEBUG: PostgreSQL is running as a primary.
>> Nov 27 16:03:00 psql1 pgsql[11463]: DEBUG: node=psql2, state=STREAMING, sync_state=SYNC
>> Nov 27 16:03:00 psql1 pgsql[11556]: DEBUG: notify: pre for demote
>> Nov 27 16:03:00 psql1 pgsql[11590]: INFO: Stopping PostgreSQL on demote.
>> Nov 27 16:03:02 psql1 pgsql[11590]: INFO: waiting for server to shut down..... done server stopped
>> Nov 27 16:03:02 psql1 pgsql[11590]: INFO: Removing /var/lib/pgsql/PGSQL.lock.
>> Nov 27 16:03:02 psql1 pgsql[11590]: INFO: PostgreSQL is down
>> Nov 27 16:03:02 psql1 pgsql[11590]: INFO: Changing pgsql-status on psql1 : PRI->STOP.
>> Nov 27 16:03:02 psql1 pgsql[11590]: DEBUG: Created recovery.conf. host=10.12.1.28, user=postgres
>> Nov 27 16:03:02 psql1 pgsql[11590]: INFO: Setup all nodes as an async.
>> Nov 27 16:03:02 psql1 pgsql[11732]: DEBUG: notify: post for demote
>> Nov 27 16:03:02 psql1 pgsql[11732]: DEBUG: post-demote called. Demote uname is psql1
>> Nov 27 16:03:02 psql1 pgsql[11732]: INFO: My Timeline ID and Checkpoint : 14:0000000023000020
>> Nov 27 16:03:02 psql1 pgsql[11732]: WARNING: Can't get psql2 master baseline. Waiting...
>> Nov 27 16:03:03 psql1 pgsql[11732]: INFO: psql2 master baseline : 14:0000000023000000
>> Nov 27 16:03:03 psql1 pgsql[11732]: ERROR: My data is inconsistent.
>> Nov 27 16:03:03 psql1 pgsql[11867]: DEBUG: notify: pre for stop
>> Nov 27 16:03:03 psql1 pgsql[11969]: INFO: PostgreSQL is already stopped.
>> Nov 27 16:03:12 psql1 pgsql[12053]: INFO: Don't check /var/lib/postgresql/9.1/main during probe
>> Nov 27 16:03:12 psql1 pgsql[12053]: INFO: PostgreSQL is down
>> Nov 27 16:03:27 psql1 pgsql[12204]: INFO: Changing pgsql-status on psql1 : ->STOP.
>> Nov 27 16:03:27 psql1 pgsql[12204]: DEBUG: Created recovery.conf. host=10.12.1.28, user=postgres
>> Nov 27 16:03:27 psql1 pgsql[12204]: INFO: Setup all nodes as an async.
>> Nov 27 16:03:27 psql1 pgsql[12204]: INFO: My Timeline ID and Checkpoint : 14:0000000023000020
>> Nov 27 16:03:27 psql1 pgsql[12204]: INFO: psql2 master baseline : 14:0000000023000000
>> Nov 27 16:03:27 psql1 pgsql[12204]: ERROR: My data is inconsistent.
>> Nov 27 16:03:27 psql1 pgsql[12339]: DEBUG: notify: post for start
>> Nov 27 16:03:27 psql1 pgsql[12373]: DEBUG: notify: pre for stop
>> Nov 27 16:03:27 psql1 pgsql[12407]: INFO: PostgreSQL is already stopped.
>>
>>
>> Thanks,
>>
>> Attila
>>
>>
>> -----Original Message-----
>> From: Takatoshi MATSUO [mailto:matsuo.tak at gmail.com]
>> Sent: November 27, 2011 11:07
>> To: The Pacemaker cluster resource manager
>> Subject: Re: [Pacemaker] Postgresql streaming replication failover -
>> RA needed
>>
>> Hi Attila
>>
>> 2011/11/27 Attila Megyeri <amegyeri at minerva-soft.com>:
>>> Hi Takatoshi,
>>>
>>> You were right, changing the shell to bash resolved the problem.
>>> The cluster now started in sync mode - thank you very much.
>>
>> You're very welcome.
>>
>>> I will be testing it in the next couple of days. I did just a very
>>> quick test - it seems that the psql master failed over to psql2
>>> properly, but when I tried to move it back to psql1 there were some problems starting psql on node 1.
>>
>> If the master (psql1) has failed, its data may be inconsistent.
>> A PostgreSQL developer says that this is a feature.
>> Therefore my RA prevents it from starting automatically if the data is inconsistent.
>> Please back up psql2's data, restore it to psql1, and remove the
>> /var/lib/pgsql/PGSQL.lock file before clearing the failcount.
>>
>> I use rsync to back up and restore in the following way.
>> -----
>> # psql -h 192.168.2.114 -U postgres -c "SELECT pg_start_backup('label')"
>> # rsync -avr --delete --exclude=postmaster.pid 192.168.2.114:/var/lib/pgsql/9.1/data/ /var/lib/pgsql/9.1/data/
>> # psql -h 192.168.2.114 -U postgres -c "SELECT pg_stop_backup()"
>> -----
>>
>>
>> BTW I fixed some bugs 2 days ago.
>> Please use the newest version.
>>
>> Thanks,
>> Takatoshi MATSUO
>>
>>
>>>
>>> Does it work fine for you in  both directions?
>>>
>>> Thank you very much.
>>>
>>> Have a nice weekend,
>>>
>>> Attila
>>>
>>>
>>>
>>> -----Original Message-----
>>> From: Takatoshi MATSUO [mailto:matsuo.tak at gmail.com]
>>> Sent: November 27, 2011 6:12
>>> To: The Pacemaker cluster resource manager
>>> Subject: Re: [Pacemaker] Postgresql streaming replication failover -
>>> RA needed
>>>
>>> Hi Attila
>>>
>>> 2011/11/27 Attila Megyeri <amegyeri at minerva-soft.com>:
>>>> Hi Takatoshi,
>>>>
>>>> Thank you for coming back to me so quickly.
>>>>
>>>> In the /var/lib/pgsql there are the following files:
>>>>
>>>> PSQL1:
>>>> =====
>>>> root at psql1:/var/lib/pgsql# ls -la
>>>> total 16
>>>> drwxr-xr-x  2 postgres postgres 4096 Nov 26 18:04 .
>>>> drwxr-xr-x 35 root     root     4096 Nov 25 22:21 ..
>>>> -rw-r--r--  1 postgres postgres    1 Nov 26 00:17 rep_mode.conf
>>>> -rw-r--r--  1 root     root       49 Nov 26 18:04 xlog_note.0
>>>>
>>>> root at psql1:/var/lib/pgsql# cat xlog_note.0
>>>> -e psql1 0000000019000000
>>>> psql2 0000000019000000
>>>> root at psql1:/var/lib/pgsql#
>>>>
>>>> PSQL2:
>>>> =======
>>>> root at psql2:/var/lib/pgsql# ls -la
>>>> total 16
>>>> drwxr-xr-x  2 postgres postgres 4096 Nov 26 18:05 .
>>>> drwxr-xr-x 33 root     root     4096 Nov 26 00:10 ..
>>>> -rw-r--r--  1 postgres postgres    1 Nov 26 00:24 rep_mode.conf
>>>> -rw-r--r--  1 root     root       49 Nov 26 18:05 xlog_note.0
>>>> root at psql2:/var/lib/pgsql# cat xlog_note.0
>>>> -e psql1 0000000019000000
>>>> psql2 0000000019000000
>>>> root at psql2:/var/lib/pgsql#
>>>
>>> It seems that dash's builtin echo command is being used, because echo with the "-e" option does not work as expected.
>>>
>>> Perhaps my RA also depends on bash.
>>> Can you use bash instead of dash?
>>>
>>>> BTW, postgres is installed under /var/lib/postgresql, but I noticed that some parts of the RA refer to the /var/lib/pgsql directory, so I created that directory and I keep some of the files there.
>>>
>>> That's no problem.
>>> If you want to change this path, please specify it using the "tmpdir" parameter.
>>>
>>> Regards,
>>> Takatoshi MATSUO
>>>
>>>>
>>>> Thanks,
>>>> Attila
>>>>
>>>>
>>>>
>>>> -----Original Message-----
>>>> From: Takatoshi MATSUO [mailto:matsuo.tak at gmail.com]
>>>> Sent: November 26, 2011 18:27
>>>> To: The Pacemaker cluster resource manager
>>>> Subject: Re: [Pacemaker] Postgresql streaming replication failover
>>>> - RA needed
>>>>
>>>> Hi Attila
>>>>
>>>> 1. Are there /var/lib/pgsql/xlog_note.0, xlog_note.1, xlog_note.2 ... files?
>>>>   These files are created while checking the xlog location on monitor.
>>>>
>>>> 2. Do these files include lines as below?
>>>> ---------
>>>> pgsql1  0000000019000000
>>>> pgsql2  0000000019000000
>>>> ---------
>>>>
>>>> Regards.
>>>> Takatoshi MATSUO
>>>>
>>>>
>>>> 2011/11/26 22:44 Attila Megyeri <amegyeri at minerva-soft.com>:
>>>>> Hi Yoshiharu, Takatoshi,
>>>>>
>>>>> Spent another day, without success. :(
>>>>>
>>>>> I started from scratch, and synchronous replication works nicely when the nodes are started outside pacemaker.
>>>>> My PostgreSQL version is 9.1.1.
>>>>>
>>>>> When I start from pacemaker, after a while it gets into the following state:
>>>>>
>>>>> Online: [ psql1 psql2 ]
>>>>>
>>>>>  Master/Slave Set: msPostgresql [postgresql]
>>>>>     Slaves: [ psql1 psql2 ]
>>>>>  Clone Set: clnPingCheck [pingCheck]
>>>>>     Started: [ psql1 psql2 ]
>>>>>
>>>>> Node Attributes:
>>>>> * Node psql1:
>>>>>    + default_ping_set                  : 100
>>>>>    + master-postgresql:0               : -INFINITY
>>>>>    + pgsql-status                      : HS:alone
>>>>>    + pgsql-xlog-loc                    : 0000000019000000
>>>>> * Node psql2:
>>>>>    + default_ping_set                  : 100
>>>>>    + master-postgresql:1               : -INFINITY
>>>>>    + pgsql-status                      : HS:alone
>>>>>    + pgsql-xlog-loc                    : 0000000019000000
>>>>>
>>>>>
>>>>> The psql status queries return the following:
>>>>>
>>>>> PSQL1
>>>>> ======
>>>>> postgres at psql1:/root$ psql  -c "select application_name,upper(state),upper(sync_state) from pg_stat_replication"
>>>>> application_name | upper | upper
>>>>> ------------------+-------+-------
>>>>> (0 rows)
>>>>>
>>>>> postgres at psql1:/root$ psql  -Atc "select pg_last_xlog_replay_location(),pg_last_xlog_receive_location()"
>>>>> 0/19000020|0/19000000
>>>>>
>>>>> PSQL2
>>>>> ======
>>>>> postgres at psql2:~$  psql  -c "select application_name,upper(state),upper(sync_state) from pg_stat_replication"
>>>>>  application_name | upper | upper
>>>>> ------------------+-------+-------
>>>>> (0 rows)
>>>>>
>>>>> postgres at psql2:~$ psql  -Atc "select pg_last_xlog_replay_location(),pg_last_xlog_receive_location()"
>>>>> 0/19000000|0/19000000
>>>>>
>>>>>
>>>>> Neither server can connect to the master (obviously), as the vip_repl is not brought up.
>>>>>
>>>>>
>>>>> Could you help me understand WHAT action/state/event should promote one of the nodes? I see that pacemaker monitors the servers every X seconds, but nothing else happens.
>>>>>
>>>>> In the log (limited to pgsql) the following sequence is repeated forever:
>>>>>
>>>>> Nov 26 13:36:19 psql1 pgsql[19829]: INFO: Master is not exist.
>>>>> Nov 26 13:36:19 psql1 pgsql[19829]: DEBUG: Checking right of master.
>>>>> Nov 26 13:36:19 psql1 pgsql[19829]: INFO: My data status=.
>>>>> Nov 26 13:36:19 psql1 pgsql[19829]: INFO: psql1 xlog location : 0000000019000000
>>>>> Nov 26 13:36:19 psql1 pgsql[19829]: INFO: psql2 xlog location : 0000000019000000
>>>>> Nov 26 13:36:26 psql1 pgsql[19993]: DEBUG: PostgreSQL is running as a hot standby.
>>>>> Nov 26 13:36:26 psql1 pgsql[19993]: INFO: Master is not exist.
>>>>> Nov 26 13:36:26 psql1 pgsql[19993]: DEBUG: Checking right of master.
>>>>> Nov 26 13:36:26 psql1 pgsql[19993]: INFO: My data status=.
>>>>> Nov 26 13:36:26 psql1 pgsql[19993]: INFO: psql1 xlog location : 0000000019000000
>>>>> Nov 26 13:36:26 psql1 pgsql[19993]: INFO: psql2 xlog location : 0000000019000000
>>>>> Nov 26 13:36:33 psql1 pgsql[20176]: DEBUG: PostgreSQL is running as a hot standby.
>>>>> Nov 26 13:36:33 psql1 pgsql[20176]: INFO: Master is not exist.
>>>>> Nov 26 13:36:33 psql1 pgsql[20176]: DEBUG: Checking right of master.
>>>>> Nov 26 13:36:33 psql1 pgsql[20176]: INFO: My data status=.
>>>>> Nov 26 13:36:33 psql1 pgsql[20176]: INFO: psql1 xlog location : 0000000019000000
>>>>> Nov 26 13:36:33 psql1 pgsql[20176]: INFO: psql2 xlog location : 0000000019000000
>>>>> Nov 26 13:36:41 psql1 pgsql[20343]: DEBUG: PostgreSQL is running as a hot standby.
>>>>>
>>>>>
>>>>> Any help is appreciated!
>>>>>
>>>>> Regards,
>>>>> Attila
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> -----Original Message-----
>>>>> From: Yoshiharu Mori [mailto:y-mori at sraoss.co.jp]
>>>>> Sent: November 25, 2011 14:17
>>>>> To: The Pacemaker cluster resource manager
>>>>> Cc: Attila Megyeri
>>>>> Subject: Re: [Pacemaker] Postgresql streaming replication failover
>>>>> - RA needed
>>>>>
>>>>> Hi Attila
>>>>>
>>>>>> A quick snippet from the corosync.log
>>>>>>
>>>>>> Nov 23 05:43:05 psql1 pgsql[2845]: DEBUG: Checking right of master.
>>>>>> Nov 23 05:43:05 psql1 pgsql[2845]: INFO: My data status=.
>>>>>> Nov 23 05:43:05 psql1 pgsql[2845]: INFO: psql1 xlog location : 000000000D000000
>>>>>> Nov 23 05:43:05 psql1 pgsql[2845]: INFO: psql2 xlog location : 0000000008000000
>>>>>>
>>>>>> As you see, the "my data status" returns an empty string.
>>>>>
>>>>> My log looks the same, but it works.
>>>>>
>>>>> Nov 18 19:28:26 osspc24-1 pgsql[17350]: INFO: Master is not exist.
>>>>> Nov 18 19:28:26 osspc24-1 pgsql[17350]: INFO: Checking right of master.
>>>>> Nov 18 19:28:19 osspc24-1 pgsql[17138]: INFO: My data status=.
>>>>> Nov 18 19:28:19 osspc24-1 pgsql[17138]: INFO: pm01 xlog location : 0000000005000020
>>>>> Nov 18 19:28:19 osspc24-1 pgsql[17138]: INFO: pm02 xlog location : 0000000005000000
>>>>>
>>>>> In my log, the following message is output, and PostgreSQL is started, after the xlog location has been checked three times.
>>>>>
>>>>> Nov 18 19:29:39 osspc24-1 pgsql[18720]: INFO: I have a master right.
>>>>>
>>>>> Please show us more of your corosync.log.
>>>>>
>>>>>
>>>>>>
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Attila Megyeri [mailto:amegyeri at minerva-soft.com]
>>>>>> Sent: November 25, 2011 9:28
>>>>>> To: The Pacemaker cluster resource manager
>>>>>> Subject: Re: [Pacemaker] Postgresql streaming replication
>>>>>> failover
>>>>>> - RA needed
>>>>>>
>>>>>> Hi Takatoshi,
>>>>>>
>>>>>> I have restored the PSQL to run without corosync so I cannot send you the crm_mon output now.
>>>>>>
>>>>>> What I can tell for sure:
>>>>>> - The RA never promoted any of the nodes, no matter what the status was. It did not promote the node even when it was the only one.
>>>>>> - I believe the issue is in the comparison of the xlogs. How could I troubleshoot that? I see from the logs that crm NEVER tried to invoke pgsql with "promote".
>>>>>> - I previously tried the crm_mon -A option, but there was never a "pgsql-data-status" attribute. The other attributes were there, including HS:alone.
>>>>>> - In the corosync log the only relevant RA message I see is "Master is not exist." I never saw a message like "My data is out-of-date".
>>>>>>
>>>>>> Thank you!
>>>>>>
>>>>>> Attila
>>>>>>
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Takatoshi MATSUO [mailto:matsuo.tak at gmail.com]
>>>>>> Sent: November 25, 2011 8:56
>>>>>> To: The Pacemaker cluster resource manager
>>>>>> Subject: Re: [Pacemaker] Postgresql streaming replication
>>>>>> failover
>>>>>> - RA needed
>>>>>>
>>>>>> Hi Attila
>>>>>>
>>>>>> 2011/11/24 Attila Megyeri <amegyeri at minerva-soft.com>:
>>>>>> > Hi Takatoshi, All,
>>>>>> >
>>>>>> > Thanks for your reply.
>>>>>> > I see that you have invested significant effort in the development of the RA. I spent the last day trying to set up the RA, but without much success.
>>>>>> >
>>>>>> > My infrastructure is very similar to yours, except for the fact that currently I am testing with a single network adapter.
>>>>>> >
>>>>>> > Replication works nicely when I start the databases manually, not using corosync.
>>>>>> >
>>>>>> > When I try to start using corosync, I see that the ping resources start normally, but msPostgresql starts on both nodes in slave mode, and I see "HS:alone".
>>>>>>
>>>>>> Seeing "HS:alone" is normal.
>>>>>> The RA compares xlog locations and promotes the PostgreSQL instance that has the newest data.
>>>>>>
>>>>>> > In the Wiki you state that if I start on a single node only, PSQL should start in Master mode (PRI), but this is not the case.
>>>>>>
>>>>>> If the data is old, the node can't be master.
>>>>>> To become master, a node needs pgsql-data-status="LATEST" or "STREAMING|SYNC".
>>>>>> Please check it using "crm_mon -A".
>>>>>>
>>>>>> Also, becoming master from the stopped state takes a few minutes, because the RA compares xlog locations on monitor.
>>>>>>
>>>>>>
>>>>>> > The recovery.conf file is created immediately, and from the logs I see no attempt at all to promote the node.
>>>>>> > In the postgres logs I see that node1, which is supposed to be a master, tries to connect to the vip-rep IP address, which is NOT brought up, because it depends on the Master role...
>>>>>> >
>>>>>> > Do you have any idea?
>>>>>>
>>>>>> Please check HA log.
>>>>>> My RA outputs "My data is out-of-date. status=********" to log if the data is old.
>>>>>>
>>>>>> Regards,
>>>>>> Takatoshi MATSUO
>>>>>>
>>>>>> _______________________________________________
>>>>>> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
>>>>>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>>>>>
>>>>>> Project Home: http://www.clusterlabs.org Getting started:
>>>>>> http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>>>>>> Bugs: http://bugs.clusterlabs.org
>>>>>
>>>>>
>>>>> --
>>>>> Yoshiharu Mori <y-mori at sraoss.co.jp> SRA OSS, Inc Japan
>>>>> http://www.sraoss.co.jp
>>>>> TEL: 03-5979-2701
>>>>> FAX: 03-5979-2702
