[ClusterLabs] Antw: Re: [Question] About movement of pacemaker_remote.

Wed May 20 06:38:23 CEST 2015

> On 11 May 2015, at 2:22 pm, renayama19661014 at ybb.ne.jp wrote:
> 
> Hi All,
> 
> I matched the OS version of the remote node to that of the host and tested again with Pacemaker 1.1.13-rc2.

I think the work David is doing in this area is targeted for master (i.e. 1.1.14) due to the risk involved.
You can follow along at https://github.com/ClusterLabs/pacemaker/pull/708

> 
> The result was the same even with the host on RHEL 7.1 (bl460g8n1).
> The remote node is also on RHEL 7.1 (snmp1).
> 
> The first crm_resource -C fails.
> --------------------------------
> [root@bl460g8n1 ~]# crm_resource -C -r snmp1
> Cleaning up snmp1 on bl460g8n1
> Waiting for 1 replies from the CRMd. OK
> 
> [root@bl460g8n1 ~]# crm_mon -1 -Af
> Last updated: Mon May 11 12:44:31 2015
> Last change: Mon May 11 12:43:30 2015
> Stack: corosync
> Current DC: bl460g8n1 - partition WITHOUT quorum
> Version: 1.1.12-7a2e3ae
> 2 Nodes configured
> 3 Resources configured
> 
> 
> Online: [ bl460g8n1 ]
> RemoteOFFLINE: [ snmp1 ]
> 
>  Host-rsc1      (ocf::heartbeat:Dummy): Started bl460g8n1
>  Remote-rsc1    (ocf::heartbeat:Dummy): Started bl460g8n1 (failure ignored)
> 
> Node Attributes:
> * Node bl460g8n1:
>     + ringnumber_0                      : 192.168.101.21 is UP
>     + ringnumber_1                      : 192.168.102.21 is UP
> 
> Migration summary:
> * Node bl460g8n1:
>    snmp1: migration-threshold=1 fail-count=1000000 last-failure='Mon May 11 12:44:28 2015'
> 
> Failed actions:
>     snmp1_start_0 on bl460g8n1 'unknown error' (1): call=5, status=Timed Out, exit-reason='none', last-rc-change='Mon May 11 12:43:31 2015', queued=0ms, exec=0ms
> --------------------------------
> 
> 
> The second crm_resource -C succeeded, and the cluster connected to the remote node.
> --------------------------------
> [root@bl460g8n1 ~]# crm_mon -1 -Af
> Last updated: Mon May 11 12:44:54 2015
> Last change: Mon May 11 12:44:48 2015
> Stack: corosync
> Current DC: bl460g8n1 - partition WITHOUT quorum
> Version: 1.1.12-7a2e3ae
> 2 Nodes configured
> 3 Resources configured
> 
> 
> Online: [ bl460g8n1 ]
> RemoteOnline: [ snmp1 ]
> 
>  Host-rsc1      (ocf::heartbeat:Dummy): Started bl460g8n1
>  Remote-rsc1    (ocf::heartbeat:Dummy): Started snmp1
>  snmp1  (ocf::pacemaker:remote):        Started bl460g8n1
> 
> Node Attributes:
> * Node bl460g8n1:
>     + ringnumber_0                      : 192.168.101.21 is UP
>     + ringnumber_1                      : 192.168.102.21 is UP
> * Node snmp1:
> 
> Migration summary:
> * Node bl460g8n1:
> * Node snmp1:
> --------------------------------
> 
> The host and the remote node both have the following gnutls packages installed:
> 
> gnutls-devel-3.3.8-12.el7.x86_64
> gnutls-dane-3.3.8-12.el7.x86_64
> gnutls-c++-3.3.8-12.el7.x86_64
> gnutls-3.3.8-12.el7.x86_64
> gnutls-utils-3.3.8-12.el7.x86_64
> 
> 
> Best Regards,
> Hideo Yamauchi.
> 
> 
> 
> 
> ----- Original Message -----
>> From: "renayama19661014 at ybb.ne.jp" <renayama19661014 at ybb.ne.jp>
>> To: Cluster Labs - All topics related to open-source clustering welcomed <users at clusterlabs.org>
>> Cc: 
>> Date: 2015/4/28, Tue 14:06
>> Subject: Re: [ClusterLabs] Antw: Re: [Question] About movement of pacemaker_remote.
>> 
>> Hi David,
>> 
>> The result was the same even after I changed the remote node to RHEL 7.1.
>> 
>> 
>> Next, I will try with the Pacemaker host node on RHEL 7.1 as well.
>> 
>> 
>> I noticed something interesting.
>> Reconnecting to the remote node fails on the first crm_resource run.
>> However, it succeeds on the second crm_resource run.
>> 
>> I think there is a problem at the point where the connection to the remote node is first cut.
>> 
>> Best Regards,
>> Hideo Yamauchi.
>> 
>> 
>> ----- Original Message -----
>>> From: "renayama19661014 at ybb.ne.jp" <renayama19661014 at ybb.ne.jp>
>>> To: Cluster Labs - All topics related to open-source clustering welcomed <users at clusterlabs.org>
>>> Cc: 
>>> Date: 2015/4/28, Tue 11:52
>>> Subject: Re: [ClusterLabs] Antw: Re: [Question] About movement of pacemaker_remote.
>>> 
>>> Hi David,
>>> Thank you for your comments.
>>>> At first glance this looks gnutls related.  GNUTLS is returning -50 during receive
>>>> on the client side (pacemaker's side). -50 maps to 'invalid request'.
>>>> debug: crm_remote_recv_once:     TLS receive failed: The request is invalid.
>>>> We treat this error as fatal and destroy the connection. I've never encountered
>>>> this error and I don't know what causes it. It's possible there's a bug in
>>>> our gnutls usage... it's also possible there's a bug in the version of gnutls
>>>> that is in use as well.
>>> We built the remote node on RHEL 6.5.
>>> Because this may be a gnutls problem, I will check it on RHEL 7.1.
>>> 
>>> Best Regards,
>>> Hideo Yamauchi.
>>> 
>>> _______________________________________________
>>> Users mailing list: Users at clusterlabs.org
>>> http://clusterlabs.org/mailman/listinfo/users
>>> 
>>> Project Home: http://www.clusterlabs.org
>>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>>> Bugs: http://bugs.clusterlabs.org
>>> 
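
The behaviour reported in this thread (the first "crm_resource -C -r snmp1" leaves snmp1 in RemoteOFFLINE, the second brings it to RemoteOnline) can be checked in a script. The following is only a rough Python sketch of that check, not a fix for the underlying reconnection problem. It assumes the connection resource and the remote node share one name, as snmp1 does above. The function names, the retry count and the 30-second delay are made up for illustration. The only commands it runs are the two shown in the thread.

```python
import re
import subprocess
import time

# Matches crm_mon lines such as "RemoteOnline: [ snmp1 ]" or
# "RemoteOFFLINE: [ snmp1 ]" (a leading "> " quote marker is tolerated).
_REMOTE_LINE = re.compile(r"Remote(Online|OFFLINE):\s*\[\s*(.*?)\s*\]")


def remote_node_states(crm_mon_text):
    """Return {node_name: "online" | "offline"} parsed from crm_mon -1 output."""
    states = {}
    for match in _REMOTE_LINE.finditer(crm_mon_text):
        state = "online" if match.group(1) == "Online" else "offline"
        for name in match.group(2).split():
            states[name] = state
    return states


def run_command(argv):
    """Run a command and return its standard output as text."""
    result = subprocess.run(argv, capture_output=True, text=True, check=False)
    return result.stdout


def cleanup_until_online(resource, attempts=2, delay=30.0,
                         run=run_command, sleep=time.sleep):
    """Repeat the cleanup until crm_mon reports the remote node online.

    Returns the number of cleanups issued, or None if the node never
    came online. The attempts and delay defaults are arbitrary assumptions.
    """
    for attempt in range(1, attempts + 1):
        run(["crm_resource", "-C", "-r", resource])
        sleep(delay)  # give the connection resource time to start or time out
        states = remote_node_states(run(["crm_mon", "-1", "-Af"]))
        if states.get(resource) == "online":
            return attempt
    return None
```

Calling cleanup_until_online("snmp1") on the host node would report how many cleanups were needed. The run and sleep parameters exist only so the logic can be exercised against saved crm_mon output without a live cluster.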