[ClusterLabs] Re: How to setup a simple master/slave cluster in two nodes without stonith resource

Klaus Wenninger kwenning at redhat.com
Tue Apr 3 01:58:09 EDT 2018


On 04/03/2018 06:07 AM, 范国腾 wrote:
> Yes, my resources are started and they are in slave status. So I ran the "pcs resource cleanup pgsql-ha" command. The log shows the error when I run this command.
>
> -----Original Message-----
> From: Users [mailto:users-bounces at clusterlabs.org] on behalf of Andrei Borzenkov
> Sent: April 3, 2018 12:00
> To: users at clusterlabs.org
> Subject: Re: [ClusterLabs] How to setup a simple master/slave cluster in two nodes without stonith resource
>
> On 03.04.2018 05:07, 范国腾 wrote:
>> Hello,
>>
>> I want to set up a cluster on two nodes: one master and one slave. I don’t need a fencing device because my internal network is stable. I use the following commands to create the resource, but both nodes stay slaves and the cluster doesn’t promote either one to master. Could you please help check whether there is anything wrong with my configuration?
What is the reason you are using a cluster? Someone ripping out a network
cable isn't the only reason why one node might stop seeing the other. In
addition, some kind of fencing is useful if a node isn't able to get its
resources under control.
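
For example, a minimal sketch of what fencing could look like with
IPMI-capable hardware (the device names, addresses and credentials below are
placeholders, not taken from your setup):

    # hypothetical example - adjust the fence agent and its parameters to your hardware
    pcs stonith create fence-node1 fence_ipmilan ipaddr=10.0.0.11 login=admin passwd=secret pcmk_host_list=node1-1
    pcs stonith create fence-node2 fence_ipmilan ipaddr=10.0.0.12 login=admin passwd=secret pcmk_host_list=node2-1
    # keep each fence device away from the node it is supposed to fence
    pcs constraint location fence-node1 avoids node1-1
    pcs constraint location fence-node2 avoids node2-1
    pcs property set stonith-enabled=true

With something like that in place the surviving node can safely take over the
master role even when it cannot tell whether the peer is dead or merely
unreachable.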

>>
>> pcs property set stonith-enabled=false
>> pcs resource create pgsqld ocf:heartbeat:pgsqlms \
>>     bindir=/usr/local/pgsql/bin pgdata=/home/postgres/data \
>>     op start timeout=600s op stop timeout=60s \
>>     op promote timeout=300s op demote timeout=120s \
>>     op monitor interval=15s timeout=100s role="Master" \
>>     op monitor interval=16s timeout=100s role="Slave" \
>>     op notify timeout=60s
>> pcs resource master pgsql-ha pgsqld notify=true interleave=true
>>
>> The status is as below:
>>
>> [root at node1 ~]# pcs status
>> Cluster name: cluster_pgsql
>> Stack: corosync
>> Current DC: node2-1 (version 1.1.15-11.el7-e174ec8) - partition with quorum
>> Last updated: Mon Apr  2 21:51:57 2018          Last change: Mon Apr  2 21:32:22 2018 by hacluster via crmd on node2-1
>>
>> 2 nodes and 3 resources configured
>>
>> Online: [ node1-1 node2-1 ]
>>
>> Full list of resources:
>>
>> Master/Slave Set: pgsql-ha [pgsqld]
>>      Slaves: [ node1-1 node2-1 ]
>> pgsql-master-ip        (ocf::heartbeat:IPaddr2):       Stopped
>>
>> Daemon Status:
>>   corosync: active/disabled
>>   pacemaker: active/disabled
>>   pcsd: active/enabled
>>
>> When I execute pcs resource cleanup on one node, there is always one node that prints the following warning messages in /var/log/messages, but the other node’s log shows no error. The resource agent log (pgsqlms) shows that the monitor action returned 0, so why does the crmd log show a failure?
>>
>> Apr  2 21:53:09 node2 crmd[2425]: warning: No reason to expect node 1 to be down
>> Apr  2 21:53:09 node2 crmd[2425]:  notice: State transition S_IDLE -> S_POLICY_ENGINE | input=I_PE_CALC cause=C_FSA_INTERNAL origin=abort_transition_graph
>> Apr  2 21:53:09 node2 crmd[2425]: warning: No reason to expect node 2 to be down
>> Apr  2 21:53:09 node2 pengine[2424]:  notice: Start   pgsqld:0#011(node1-1)
>> Apr  2 21:53:09 node2 pengine[2424]:  notice: Start   pgsqld:1#011(node2-1)
>> Apr  2 21:53:09 node2 pengine[2424]:  notice: Calculated transition 4, saving inputs in /var/lib/pacemaker/pengine/pe-input-6.bz2
>> Apr  2 21:53:09 node2 crmd[2425]:  notice: Initiating monitor operation pgsqld:0_monitor_0 on node1-1 | action 2
>> Apr  2 21:53:09 node2 crmd[2425]:  notice: Initiating monitor operation pgsqld:1_monitor_0 locally on node2-1 | action 3
>> Apr  2 21:53:09 node2 pgsqlms(pgsqld)[3644]: INFO: Action is monitor
>> Apr  2 21:53:09 node2 pgsqlms(pgsqld)[3644]: INFO: pgsql_monitor: monitor is a probe
>> Apr  2 21:53:09 node2 pgsqlms(pgsqld)[3644]: INFO: pgsql_monitor: instance "pgsqld" is listening
>> Apr  2 21:53:09 node2 pgsqlms(pgsqld)[3644]: INFO: Action result is 0
>> Apr  2 21:53:09 node2 crmd[2425]:  notice: Result of probe operation for pgsqld on node2-1: 0 (ok) | call=33 key=pgsqld_monitor_0 confirmed=true cib-update=62
>> Apr  2 21:53:09 node2 crmd[2425]: warning: Action 3 (pgsqld:1_monitor_0) on node2-1 failed (target: 7 vs. rc: 0): Error
>> Apr  2 21:53:09 node2 crmd[2425]:  notice: Transition aborted by operation pgsqld_monitor_0 'create' on node2-1: Event failed | magic=0:0;3:4:7:3a132f28-d8b9-4948-bb6b-736edc221664 cib=0.28.2 source=match_graph_event:310 complete=false
>> Apr  2 21:53:09 node2 crmd[2425]: warning: Action 3 (pgsqld:1_monitor_0) on node2-1 failed (target: 7 vs. rc: 0): Error
>> Apr  2 21:53:09 node2 crmd[2425]: warning: Action 2 (pgsqld:0_monitor_0) on node1-1 failed (target: 7 vs. rc: 0): Error
>> Apr  2 21:53:09 node2 crmd[2425]: warning: Action 2 (pgsqld:0_monitor_0) on node1-1 failed (target: 7 vs. rc: 0): Error
> Apparently your applications are already started on both nodes at the time you start pacemaker. Pacemaker expects resources to be inactive initially.

Not necessarily, I would say. Isn't that why they are probed on startup?
Though the probing somehow seems to fail here.
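
For reference, "target: 7 vs. rc: 0" in the warnings means the probe was
expected to find the resource stopped (OCF_NOT_RUNNING, 7) but the agent
reported it as running (OCF_SUCCESS, 0). If you want to see what the agent
reports outside of a transition, a rough sketch (run on the node in question;
only the resource name is taken from your config):

    # force a one-off check (probe) of the resource on the local node;
    # the exit status reflects what the agent reported (0 = running, 7 = not running)
    crm_resource --resource pgsqld --force-check
    echo $?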

Regards,
Klaus

>
>> Apr  2 21:53:09 node2 crmd[2425]:  notice: Transition 4 (Complete=4, Pending=0, Fired=0, Skipped=0, Incomplete=10, Source=/var/lib/pacemaker/pengine/pe-input-6.bz2): Complete
>> Apr  2 21:53:09 node2 pengine[2424]:  notice: Calculated transition 5, saving inputs in /var/lib/pacemaker/pengine/pe-input-7.bz2
>> Apr  2 21:53:09 node2 crmd[2425]:  notice: Initiating monitor operation pgsqld_monitor_16000 locally on node2-1 | action 4
>> Apr  2 21:53:09 node2 crmd[2425]:  notice: Initiating monitor operation pgsqld_monitor_16000 on node1-1 | action 7
>> Apr  2 21:53:09 node2 pgsqlms(pgsqld)[3663]: INFO: Action is monitor
>>
>>
>>




More information about the Users mailing list