[Pacemaker] cibadmin -Q: Call cib_query failed (-62): Timer expired

Radoslaw Garbacz radoslaw.garbacz at xtremedatainc.com
Fri Sep 27 19:37:29 UTC 2013


The problem turned out to be of a different nature, with nothing to do
with cib_shm. Later log entries showed that the connection to cib was
established; the corosync configuration file just didn't have a proper
quorum section, and that caused the problems I was seeing.

After fixing the "quorum" section in "corosync.conf" everything works.
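
For anyone hitting the same symptom, here is a rough way to check whether
a corosync.conf carries a quorum section that names a provider. The exact
stanza used for the fix is not shown in this thread, so the sample text
below (provider: corosync_votequorum, expected_votes: 3 for the 3-node
cluster) is an assumption, and quorum_provider is a hypothetical helper
with a simplified parser, not corosync's own.

```python
import re

# Hypothetical sample; the real file from this thread is not shown in full.
SAMPLE_CONF = """\
totem {
    version: 2
    transport: udpu
}

quorum {
    provider: corosync_votequorum
    expected_votes: 3
}
"""


def quorum_provider(conf_text):
    """Return the provider named in the quorum section, or None if absent."""
    # Drop comment lines so a commented-out section is not counted.
    lines = [ln for ln in conf_text.splitlines() if not ln.lstrip().startswith("#")]
    text = "\n".join(lines)
    # Simplification: the quorum section is assumed to hold no nested
    # blocks, so everything up to the first closing brace is its body.
    section = re.search(r"^\s*quorum\s*\{([^}]*)\}", text, re.MULTILINE)
    if section is None:
        return None
    provider = re.search(r"^\s*provider\s*:\s*(\S+)", section.group(1), re.MULTILINE)
    return provider.group(1) if provider else None


if __name__ == "__main__":
    print(quorum_provider(SAMPLE_CONF))
```

A None result means either the section or its provider line is missing,
which is the kind of gap described above.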

many thanks,


On Fri, Sep 27, 2013 at 2:16 PM, Radoslaw Garbacz
<radoslaw.garbacz at xtremedatainc.com> wrote:
> cibadmin -Ql works, the problem is persistent after the upgrade, and the
> logs for "crmd" revealed the problem:
>
> Sep 27 16:19:22 [5074] ip-10-82-197-219       crmd:     info: crm_ipc_connect:  Could not establish cib_shm connection: Connection refused (111)
> Sep 27 16:19:22 [5074] ip-10-82-197-219       crmd:    debug: cib_native_signon_raw:    Connection unsuccessful (0 (nil))
> Sep 27 16:19:22 [5074] ip-10-82-197-219       crmd:    debug: cib_native_signon_raw:    Connection to CIB failed: Transport endpoint is not connected
>
> I will keep searching for a solution, but in the meantime, if you have a
> moment, any hint would be welcome.
>
> many thanks,
>
>
> On Thu, Sep 26, 2013 at 9:25 PM, Andrew Beekhof <andrew at beekhof.net> wrote:
>>
>> On 27/09/2013, at 8:45 AM, Radoslaw Garbacz <radoslaw.garbacz at xtremedatainc.com> wrote:
>>
>>> Hi,
>>>
>>> I have a problem starting up a cluster after upgrading corosync from
>>> 1.4 to 2.3.2 and pacemaker from 1.8 to 1.9.
>>>
>>> All "crm_node" calls report well, but any CIB manipulation fails, i.e.:
>>> * crm_node -q: 1
>>> * crm_node -l: OK
>>> * crm_node -p: OK
>>> * cibadmin -Q: Call cib_query failed (-62): Timer expired
>>
>> Does cibadmin -Ql work?
>> If so, there might be a DC election going on (look in the logs for "crmd").
>> Is the error transient or persistent?
>>
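The "look in the logs for crmd" step can be sketched as a small filter.
The line layout in the regular expression is inferred only from the log
excerpts quoted in this thread (timestamp, [pid], host, daemon, level,
message), and daemon_messages is a hypothetical helper; it expects one
entry per line, as in the log file itself rather than the mail-wrapped
quotes.

```python
import re

# Layout inferred from the log lines quoted in this thread:
#   "Sep 27 16:19:22 [5074] ip-10-82-197-219       crmd:     info: message"
LOG_LINE = re.compile(
    r"^(?P<time>\w{3}\s+\d+\s[\d:]{8})\s+\[(?P<pid>\d+)\]\s+(?P<host>\S+)\s+"
    r"(?P<daemon>[\w-]+):\s+(?P<level>\w+):\s+(?P<message>.*)$"
)


def daemon_messages(lines, daemon="crmd"):
    """Yield (time, level, message) for entries logged by one daemon."""
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match and match.group("daemon") == daemon:
            yield match.group("time"), match.group("level"), match.group("message")


if __name__ == "__main__":
    sample = [
        "Sep 27 16:19:22 [5074] ip-10-82-197-219       crmd:     info: "
        "crm_ipc_connect:  Could not establish cib_shm connection: "
        "Connection refused (111)",
        "Sep 26 22:24:30 [2836] ip-10-114-210-162        cib:     info: "
        "crm_client_destroy:      Destroying 0 events",
    ]
    for entry in daemon_messages(sample):
        print(entry)
```

Passing daemon="cib" instead selects the cib entries, which is handy for
comparing what both daemons logged around the same second.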
>>>
>>> No iptables, no SELinux, 3-node cluster, corosync.conf:
>>> ...
>>>        ringnumber: 0
>>>        bindnetaddr: ...
>>>        mcastport: 7800
>>>    }
>>>
>>>    transport: udpu
>>>
>>>
>>>
>>> Any help greatly appreciated.
>>>
>>>
>>> Below is some more information:
>>>
>>> * pacemaker logs:
>>>
>>> Sep 26 22:24:00 [2836] ip-10-114-210-162        cib:     info: crm_client_new:  Connecting 0x111b780 for uid=0 gid=0 pid=2883 id=977d6f23-963b-41a4-8fe0-a63024080d41
>>> Sep 26 22:24:00 [2836] ip-10-114-210-162        cib:     info: cib_process_request:     Forwarding cib_query operation for section 'all' to master (origin=local/cibadmin/2)
>>> Sep 26 22:24:30 [2836] ip-10-114-210-162        cib:     info: crm_client_destroy:      Destroying 0 events
>>>
>>>
>>> * ps axf | grep -E 'pacemaker|corosync':
>>>
>>> 2806 ?        Ssl    0:10 corosync
>>> 2834 pts/1    S      0:00 pacemakerd
>>> 2836 ?        Ss     0:01  \_ /usr/libexec/pacemaker/cib
>>> 2837 ?        Ss     0:00  \_ /usr/libexec/pacemaker/stonithd
>>> 2838 ?        Ss     0:00  \_ /usr/libexec/pacemaker/lrmd
>>> 2839 ?        Ss     0:00  \_ /usr/libexec/pacemaker/attrd
>>> 2840 ?        Ss     0:00  \_ /usr/libexec/pacemaker/pengine
>>> 2841 ?        Ss     0:00  \_ /usr/libexec/pacemaker/crmd
>>>
>>>
>>> * strace cibadmin -Q:
>>>
>>> open("/dev/shm/qb-cib_rw-event-2836-2897-12-data", O_RDWR) = 6
>>> ftruncate(6, 20480000)                  = 0
>>> mmap(NULL, 40960000, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) =
>>> 0x7fa221692000
>>> mmap(0x7fa221692000, 20480000, PROT_READ|PROT_WRITE,
>>> MAP_SHARED|MAP_FIXED, 6, 0) = 0x7fa221692000
>>> mmap(0x7fa222a1a000, 20480000, PROT_READ|PROT_WRITE,
>>> MAP_SHARED|MAP_FIXED, 6, 0) = 0x7fa222a1a000
>>> close(6)                                = 0
>>> close(5)                                = 0
>>> close(6)                                = -1 EBADF (Bad file descriptor)
>>> fstat(4, {st_mode=S_IFSOCK|0777, st_size=0, ...}) = 0
>>> fcntl(4, F_GETFL)                       = 0x802 (flags O_RDWR|O_NONBLOCK)
>>> poll([{fd=4, events=POLLIN}], 1, 0)     = 0 (Timeout)
>>> poll([{fd=4, events=POLLIN}], 1, 0)     = 0 (Timeout)
>>> sendto(4, "~", 1, MSG_NOSIGNAL, NULL, 0) = 1
>>> futex(0x7fa22df4cb60, FUTEX_WAKE_PRIVATE, 2147483647) = 0
>>> gettimeofday({1380234692, 68879}, NULL) = 0
>>> poll([{fd=4, events=POLLIN}], 1, 0)     = 0 (Timeout)
>>> poll([{fd=4, events=POLLIN}], 1, 0)     = 0 (Timeout)
>>> gettimeofday({1380234692, 69522}, NULL) = 0
>>> sendto(4, "\274", 1, MSG_NOSIGNAL, NULL, 0) = 1
>>> poll([{fd=4, events=POLLIN}], 1, 0)     = 0 (Timeout)
>>> gettimeofday({1380234692, 70085}, NULL) = 0
>>> gettimeofday({1380234692, 70197}, NULL) = 0
>>> poll([{fd=4, events=POLLIN}], 1, 30000) = 0 (Timeout)
>>> gettimeofday({1380234722, 91625}, NULL) = 0
>>> write(2, "Call cib_query failed (-62): Tim"..., 43Call cib_query
>>> failed (-62): Timer expired
>>> ) = 43
>>> poll([{fd=4, events=POLLIN}], 1, 0)     = 0 (Timeout)
>>>
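One way to read the trace above: the last blocking poll() waits up to
30000 ms, returns 0 (timeout), and cibadmin then reports -62. On Linux,
errno 62 is ETIME, whose message is "Timer expired"; treating -62 as a
negated errno is an assumption here, though it matches the error text.
A small sketch that decodes the code and works out the wait from the two
gettimeofday() values around that poll():

```python
import errno
import os

# Values copied from the strace output in this thread.
RC = -62
BEFORE = (1380234692, 70197)   # gettimeofday() just before the 30000 ms poll()
AFTER = (1380234722, 91625)    # gettimeofday() right after it returned 0


def elapsed_seconds(before, after):
    """Difference between two (seconds, microseconds) timestamps."""
    return (after[0] - before[0]) + (after[1] - before[1]) / 1_000_000


if __name__ == "__main__":
    # Assumes the return code is a negated Linux errno value.
    print(errno.errorcode.get(-RC), os.strerror(-RC))
    print(round(elapsed_seconds(BEFORE, AFTER), 3))
```

The elapsed time comes out just over 30 seconds, so the client gave up on
its own timer while waiting for a reply, rather than being refused.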
>>>
>>> * netstat -lxp:
>>>
>>> Active UNIX domain sockets (only servers)
>>> Proto RefCnt Flags       Type       State         I-Node PID/Program name    Path
>>> unix  2      [ ACC ]     STREAM     LISTENING     20021  2836/cib            @cib_rw
>>> unix  2      [ ACC ]     STREAM     LISTENING     19958  2838/lrmd           @lrmd
>>> unix  2      [ ACC ]     STREAM     LISTENING     19789  2806/corosync       @quorum
>>> unix  2      [ ACC ]     STREAM     LISTENING     19786  2806/corosync       @cmap
>>> unix  2      [ ACC ]     STREAM     LISTENING     20020  2836/cib            @cib_ro
>>> unix  2      [ ACC ]     STREAM     LISTENING     20057  2837/stonithd       @stonith-ng
>>> unix  2      [ ACC ]     STREAM     LISTENING     19787  2806/corosync       @cfg
>>> unix  2      [ ACC ]     STREAM     LISTENING     19906  2834/pacemakerd     @pacemakerd
>>> unix  2      [ ACC ]     STREAM     LISTENING     19788  2806/corosync       @cpg
>>> unix  2      [ ACC ]     STREAM     LISTENING     20022  2836/cib            @cib_shm
>>> unix  2      [ ACC ]     STREAM     LISTENING     19985  2840/pengine        @pengine
>>>
>>>
>>>
>>> Thanks in advance,
>>>
>>> --
>>> Best Regards,
>>>
>>> Radoslaw Garbacz
>>>
>>> _______________________________________________
>>> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
>>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>>
>>> Project Home: http://www.clusterlabs.org
>>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>>> Bugs: http://bugs.clusterlabs.org
>>
>>
>
>
>
> --
> Best Regards,
>
> Radoslaw Garbacz
> XtremeData Incorporation



-- 
Best Regards,

Radoslaw Garbacz
XtremeData Incorporation