[Pacemaker] cib

Shravan Mishra shravan.mishra at gmail.com
Wed Sep 29 10:29:01 EDT 2010


Hi,



I ran bt on the core file in gdb; this is what I found:


==========
Core was generated by `/usr/lib64/heartbeat/cib'.
Program terminated with signal 11, Segmentation fault.
[New process 12340]
#0  0x00007f23acc553fa in strncmp () from /lib64/libc.so.6
(gdb) bt
#0  0x00007f23acc553fa in strncmp () from /lib64/libc.so.6
#1  0x00007f23acf87c39 in __xmlParserInputBufferCreateFilename () from /usr/lib64/libxml2.so.2
#2  0x00007f23acf6147b in xmlNewInputFromFile () from /usr/lib64/libxml2.so.2
#3  0x00007f23acf641d4 in xmlCreateURLParserCtxt () from /usr/lib64/libxml2.so.2
#4  0x00007f23acf78f3a in xmlReadFile () from /usr/lib64/libxml2.so.2
#5  0x00007f23ad0167b1 in xmlRelaxNGParse () from /usr/lib64/libxml2.so.2
#6  0x00007f23ae967321 in validate_with_relaxng (doc=0x626020, to_logs=1, relaxng_file=0x7f23ae97ba10 "/usr/share/pacemaker/pacemaker-1.2.rng") at xml.c:2222
#7  0x00007f23ae967769 in validate_with (xml=0x6260d0, method=6, to_logs=1) at xml.c:2287
#8  0x00007f23ae967b9f in validate_xml (xml_blob=0x6260d0, validation=0x626910 "pacemaker-1.2", to_logs=1) at xml.c:2373
#9  0x0000000000405b23 in readCibXmlFile (dir=0x41b580 "/var/lib/heartbeat/crm", file=0x41c40a "cib.xml", discard_status=1) at io.c:396
#10 0x0000000000412285 in startCib (filename=0x41c40a "cib.xml") at main.c:613
#11 0x0000000000411309 in cib_init () at main.c:408
#12 0x000000000041064a in main (argc=1, argv=0x7fff942e0f58) at main.c:218


==========
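
Frames #1 to #6 are inside libxml2 while it opens the schema path passed in
frame #6, so one quick thing to rule out is whether that file is present,
readable and well-formed. Below is a minimal sketch in Python, assuming a
Python interpreter is available on the node. check_schema_file is a
hypothetical helper, not part of Pacemaker, and it only tests XML
well-formedness, not RELAX NG validity:

```python
import os
import xml.etree.ElementTree as ET


def check_schema_file(path):
    """Classify a schema file as 'missing', 'unreadable', 'malformed' or 'ok'.

    Hypothetical helper for illustration; not part of Pacemaker or libxml2.
    """
    if not os.path.isfile(path):
        return "missing"
    if not os.access(path, os.R_OK):
        return "unreadable"
    try:
        # Checks XML well-formedness only, not RELAX NG validity.
        ET.parse(path)
    except ET.ParseError:
        return "malformed"
    return "ok"


if __name__ == "__main__":
    # Path copied from frame #6 of the backtrace above.
    print(check_schema_file("/usr/share/pacemaker/pacemaker-1.2.rng"))
```

Since cib runs as hacluster, the result is more telling if the script is run
as that user rather than as root. An "ok" result would not explain the
segfault by itself; it would only rule out the simplest causes.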



On a fresh install, cib.xml will not exist yet.
So why is cib looking for this file on startup?


Sincerely
Shravan


On Tue, Sep 28, 2010 at 10:24 AM, Shravan Mishra
<shravan.mishra at gmail.com> wrote:
> Sorry, I forgot to attach my corosync.conf.
>
>
> =========
> totem {
>        version: 2
> #       token: 3000
> #       token_retransmits_before_loss_const: 10
> #       join: 60
> #       consensus: 1500
> #       vsftype: none
> #       max_messages: 20
> #       clear_node_high_bit: yes
>        secauth: off
>        threads: 0
> #       rrp_mode: passive
>
>        interface {
>                ringnumber: 0
>                bindnetaddr: 192.168.2.0
>                #mcastaddr: 226.94.1.1
>                broadcast: yes
>                mcastport: 5405
>        }
> #       interface {
> #               ringnumber: 1
> #               bindnetaddr: 172.20.20.0
>                #mcastaddr: 226.94.1.1
> #               broadcast: yes
> #               mcastport: 5405
> #       }
> }
>
> logging {
>        fileline: off
>        to_stderr: yes
>        to_logfile: yes
>        to_syslog: yes
>        logfile: /tmp/corosync.log
>        debug: off
>        timestamp: on
>        logger_subsys {
>                subsys: AMF
>                debug: off
>        }
> }
>
> service {
>        name: pacemaker
>        ver: 0
> }
>
> aisexec {
>        user:root
>        group: root
> }
>
> amf {
>        mode: disabled
> }
>
>
>
>
> =========
>
> On Tue, Sep 28, 2010 at 10:10 AM, Shravan Mishra
> <shravan.mishra at gmail.com> wrote:
>> Hi Andrew,
>>
>> I'm attaching another log file, as I reflashed my machine and started
>> everything from scratch.
>> It looks like my old system got a little messed up while I was trying to
>> install the old HA libraries (corosync/pacemaker) that were initially
>> working for me.
>>
>>
>> Here are the details:
>>
>> For now I just want to see cib/attrd come up, so I am using only one
>> machine and want to get it into a sane state.
>>
>> [root at ha2 ~]# /usr/sbin/corosync -v
>> Corosync Cluster Engine, version '1.2.8' SVN revision '3035'
>> Copyright (c) 2006-2009 Red Hat, Inc.
>>
>> [root at ha2 ~]# /usr/lib64/heartbeat/crmd version
>> CRM Version: 1.1.2 (e0d731c2b1be446b27a73327a53067bf6230fb6a)
>>
>>
>>
>> The Pacemaker version is 1.1; based on the output above, the release is
>> 1.1.2, if I understand correctly.
>>
>> This run is showing:
>>
>> Sep 27 12:30:45 corosync [pcmk  ] ERROR: pcmk_wait_dispatch: Child
>> process cib terminated with signal 11 (pid=9216, core=false)
>>
>>
>> Please find corosync logs attached.
>>
>> Thanks
>> Shravan
>>
>>
>> On Tue, Sep 28, 2010 at 5:47 AM, Andrew Beekhof <andrew at beekhof.net> wrote:
>>> On Mon, Sep 27, 2010 at 6:26 AM, Shravan Mishra
>>> <shravan.mishra at gmail.com> wrote:
>>>> Thanks Raoul for the response.
>>>>
>>>> Changing the permission to hacluster:haclient did stop that error.
>>>>
>>>> Now I'm hitting another problem: cib is failing to start.
>>>
>>> Very strange logs.
>>> Which distribution is this?
>>> What does your corosync.conf look like?
>>>
>>>
>>>> =====
>>>> Sep 27 00:16:29 corosync [pcmk  ] info: update_member: Node
>>>> ha2.itactics.com now has process list:
>>>> 00000000000000000000000000110012 (1114130)
>>>> Sep 27 00:16:29 corosync [pcmk  ] info: update_member: Node
>>>> ha2.itactics.com now has 1 quorum votes (was 0)
>>>> Sep 27 00:16:29 corosync [pcmk  ] info: send_member_notification:
>>>> Sending membership update 100 to 0 children
>>>> Sep 27 00:16:29 corosync [MAIN  ] Completed service synchronization,
>>>> ready to provide service.
>>>> Sep 27 00:16:30 corosync [pcmk  ] ERROR: pcmk_wait_dispatch: Child
>>>> process cib exited (pid=14889, rc=127)
>>>> Sep 27 00:16:30 corosync [pcmk  ] notice: pcmk_wait_dispatch:
>>>> Respawning failed child process: cib
>>>> Sep 27 00:16:30 corosync [pcmk  ] info: spawn_child: Forked child
>>>> 14896 for process cib
>>>> crmd[14893]: 2010/09/27_00:16:30 WARN: do_cib_control: Couldn't
>>>> complete CIB registration 1 times... pause and retry
>>>> Sep 27 00:16:31 corosync [pcmk  ] ERROR: pcmk_wait_dispatch: Child
>>>> process cib exited (pid=14896, rc=127)
>>>> Sep 27 00:16:31 corosync [pcmk  ] notice: pcmk_wait_dispatch:
>>>> Respawning failed child process: cib
>>>> Sep 27 00:16:31 corosync [pcmk  ] info: spawn_child: Forked child
>>>> 14901 for process cib
>>>> Sep 27 00:16:32 corosync [pcmk  ] ERROR: pcmk_wait_dispatch: Child
>>>> process cib exited (pid=14901, rc=1
>>>> ======
>>>>
>>>>
>>>> I have attached the full logs.
>>>>
>>>> We are using corosync 1.2.8 and pacemaker 1.1.3.
>>>>
>>>>
>>>>  Thanks.
>>>> Shravan
>>>>
>>>>
>>>>
>>>> On Sat, Sep 25, 2010 at 4:36 AM, Raoul Bhatia [IPAX] <r.bhatia at ipax.at> wrote:
>>>>> On 24.09.2010 21:41, Shravan Mishra wrote:
>>>>>>
>>>>>> crmd[20612]: 2010/09/24_15:29:57 ERROR: crm_log_init_worker: Cannot
>>>>>> change active directory to /var/lib/heartbeat/cores/hacluster:
>>>>>> Permission denied (13)
>>>>>
>>>>> ls -ald /var/lib/heartbeat/cores/hacluster /var/lib/heartbeat/cores/
>>>>> /var/lib/heartbeat/ /var/lib/ /var/
>>>>>
>>>>> is haclient allowed to cd all the way into
>>>>> /var/lib/heartbeat/cores/hacluster ?
>>>>>
>>>>> cheers,
>>>>>
>>>>
>>>> _______________________________________________
>>>> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
>>>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>>>
>>>> Project Home: http://www.clusterlabs.org
>>>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>>>> Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker
>>>>
>>>>
>>>
>>
>
