[Pacemaker] cib

Shravan Mishra shravan.mishra at gmail.com
Wed Sep 29 12:25:00 EDT 2010


Some more info:


root     14170 14166  0 12:23 ?        00:00:00 /usr/lib64/heartbeat/stonithd
nobody   14172 14166  0 12:23 ?        00:00:00 /usr/lib64/heartbeat/lrmd
82       14173 14166  0 12:23 ?        00:00:00 /usr/lib64/heartbeat/attrd
82       14174 14166  0 12:23 ?        00:00:00 /usr/lib64/heartbeat/pengine
82       14175 14166  0 12:23 ?        00:00:00 /usr/lib64/heartbeat/crmd




Note that lrmd is running as nobody when it should be running as root.

I'm not sure why that would happen.
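
For anyone who wants to repeat this check on their own node, here is a minimal sketch that scans `ps -ef` style output and reports daemons whose owner differs from what is expected. The function name `unexpected_owners` and the `EXPECTED` mapping are hypothetical; the mapping only encodes what this message states (lrmd should be root) plus stonithd as root as listed above, so adjust it for your own setup.

```python
# Sample taken from the process listing above (ps -ef column layout:
# UID PID PPID C STIME TTY TIME CMD).
PS_SAMPLE = """\
root     14170 14166  0 12:23 ?        00:00:00 /usr/lib64/heartbeat/stonithd
nobody   14172 14166  0 12:23 ?        00:00:00 /usr/lib64/heartbeat/lrmd
82       14173 14166  0 12:23 ?        00:00:00 /usr/lib64/heartbeat/attrd
82       14174 14166  0 12:23 ?        00:00:00 /usr/lib64/heartbeat/pengine
82       14175 14166  0 12:23 ?        00:00:00 /usr/lib64/heartbeat/crmd
"""

# Assumption: expected owners per daemon name. Only the lrmd entry is
# stated in this thread; extend or change the mapping as needed.
EXPECTED = {"stonithd": "root", "lrmd": "root"}


def unexpected_owners(ps_output, expected):
    """Return {daemon: (actual_user, expected_user)} for mismatched daemons."""
    mismatches = {}
    for line in ps_output.splitlines():
        fields = line.split()
        if len(fields) < 8:
            continue  # skip blank or malformed lines
        user, command = fields[0], fields[7]
        daemon = command.rsplit("/", 1)[-1]  # basename of the executable
        wanted = expected.get(daemon)
        if wanted is not None and user != wanted:
            mismatches[daemon] = (user, wanted)
    return mismatches


if __name__ == "__main__":
    for daemon, (actual, wanted) in unexpected_owners(PS_SAMPLE, EXPECTED).items():
        print(f"{daemon}: running as {actual}, expected {wanted}")
```

Daemons that are not in the mapping (attrd, pengine and crmd here) are simply ignored, so the check never guesses at owners it has not been told about.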


Thanks
Shravan

On Wed, Sep 29, 2010 at 10:29 AM, Shravan Mishra
<shravan.mishra at gmail.com> wrote:
> Hi,
>
>
>
> I ran a backtrace (bt) on the core file; this is what I found:
>
>
> ==========
> Core was generated by `/usr/lib64/heartbeat/cib'.
> Program terminated with signal 11, Segmentation fault.
> [New process 12340]
> #0  0x00007f23acc553fa in strncmp () from /lib64/libc.so.6
> (gdb) bt
> #0  0x00007f23acc553fa in strncmp () from /lib64/libc.so.6
> #1  0x00007f23acf87c39 in __xmlParserInputBufferCreateFilename () from
> /usr/lib64/libxml2.so.2
> #2  0x00007f23acf6147b in xmlNewInputFromFile () from /usr/lib64/libxml2.so.2
> #3  0x00007f23acf641d4 in xmlCreateURLParserCtxt () from /usr/lib64/libxml2.so.2
> #4  0x00007f23acf78f3a in xmlReadFile () from /usr/lib64/libxml2.so.2
> #5  0x00007f23ad0167b1 in xmlRelaxNGParse () from /usr/lib64/libxml2.so.2
> #6  0x00007f23ae967321 in validate_with_relaxng (doc=0x626020, to_logs=1,
>    relaxng_file=0x7f23ae97ba10
> "/usr/share/pacemaker/pacemaker-1.2.rng") at xml.c:2222
> #7  0x00007f23ae967769 in validate_with (xml=0x6260d0, method=6,
> to_logs=1) at xml.c:2287
> #8  0x00007f23ae967b9f in validate_xml (xml_blob=0x6260d0,
> validation=0x626910 "pacemaker-1.2",
>    to_logs=1) at xml.c:2373
> #9  0x0000000000405b23 in readCibXmlFile (dir=0x41b580
> "/var/lib/heartbeat/crm",
>    file=0x41c40a "cib.xml", discard_status=1) at io.c:396
> #10 0x0000000000412285 in startCib (filename=0x41c40a "cib.xml") at main.c:613
> #11 0x0000000000411309 in cib_init () at main.c:408
> #12 0x000000000041064a in main (argc=1, argv=0x7fff942e0f58) at main.c:218
>
>
> ==========
>
>
>
> If this is a fresh install, cib.xml will not exist yet.
> Why, then, is it looking for this file on startup?
>
>
> Sincerely
> Shravan
>
>
> On Tue, Sep 28, 2010 at 10:24 AM, Shravan Mishra
> <shravan.mishra at gmail.com> wrote:
>> Sorry, I forgot to attach my corosync.conf.
>>
>>
>> =========
>> totem {
>>        version: 2
>> #       token: 3000
>> #       token_retransmits_before_loss_const: 10
>> #       join: 60
>> #       consensus: 1500
>> #       vsftype: none
>> #       max_messages: 20
>> #       clear_node_high_bit: yes
>>        secauth: off
>>        threads: 0
>> #       rrp_mode: passive
>>
>>        interface {
>>                ringnumber: 0
>>                bindnetaddr: 192.168.2.0
>>                #mcastaddr: 226.94.1.1
>>                broadcast: yes
>>                mcastport: 5405
>>        }
>> #       interface {
>> #               ringnumber: 1
>> #               bindnetaddr: 172.20.20.0
>>                #mcastaddr: 226.94.1.1
>> #               broadcast: yes
>> #               mcastport: 5405
>> #       }
>> }
>>
>> logging {
>>        fileline: off
>>        to_stderr: yes
>>        to_logfile: yes
>>        to_syslog: yes
>>        logfile: /tmp/corosync.log
>>        debug: off
>>        timestamp: on
>>        logger_subsys {
>>                subsys: AMF
>>                debug: off
>>        }
>> }
>>
>> service {
>>        name: pacemaker
>>        ver: 0
>> }
>>
>> aisexec {
>>        user: root
>>        group: root
>> }
>>
>> amf {
>>        mode: disabled
>> }
>>
>>
>>
>>
>> =========
>>
>> On Tue, Sep 28, 2010 at 10:10 AM, Shravan Mishra
>> <shravan.mishra at gmail.com> wrote:
>>> Hi Andrew,
>>>
>>> I'm attaching another log file, as I reflashed my machine and started
>>> everything from scratch.
>>> It looks like my old system got a little messed up while I was trying to
>>> install the old HA libraries (corosync/pacemaker) that were initially
>>> working for me.
>>>
>>>
>>> Here are the details:
>>>
>>> For now I just want to see cib/attrd come up, so I have only one machine,
>>> on which I want to get things into a sane state.
>>>
>>> [root at ha2 ~]# /usr/sbin/corosync -v
>>> Corosync Cluster Engine, version '1.2.8' SVN revision '3035'
>>> Copyright (c) 2006-2009 Red Hat, Inc.
>>>
>>> [root at ha2 ~]# /usr/lib64/heartbeat/crmd version
>>> CRM Version: 1.1.2 (e0d731c2b1be446b27a73327a53067bf6230fb6a)
>>>
>>>
>>>
>>> The Pacemaker version is 1.1; based on the above output, the release is
>>> 1.1.2, if I understand correctly.
>>>
>>> This run is showing:
>>>
>>> Sep 27 12:30:45 corosync [pcmk  ] ERROR: pcmk_wait_dispatch: Child
>>> process cib terminated with signal 11 (pid=9216, core=false)
>>>
>>>
>>> Please find corosync logs attached.
>>>
>>> Thanks
>>> Shravan
>>>
>>>
>>> On Tue, Sep 28, 2010 at 5:47 AM, Andrew Beekhof <andrew at beekhof.net> wrote:
>>>> On Mon, Sep 27, 2010 at 6:26 AM, Shravan Mishra
>>>> <shravan.mishra at gmail.com> wrote:
>>>>> Thanks Raoul for the response.
>>>>>
>>>>> Changing the permission to hacluster:haclient did stop that error.
>>>>>
>>>>> Now I'm hitting another problem: cib is failing to start.
>>>>
>>>> Very strange logs.
>>>> Which distribution is this?
>>>> What does your corosync.conf look like?
>>>>
>>>>
>>>>> =====
>>>>> Sep 27 00:16:29 corosync [pcmk  ] info: update_member: Node
>>>>> ha2.itactics.com now has process list:
>>>>> 00000000000000000000000000110012 (1114130)
>>>>> Sep 27 00:16:29 corosync [pcmk  ] info: update_member: Node
>>>>> ha2.itactics.com now has 1 quorum votes (was 0)
>>>>> Sep 27 00:16:29 corosync [pcmk  ] info: send_member_notification:
>>>>> Sending membership update 100 to 0 children
>>>>> Sep 27 00:16:29 corosync [MAIN  ] Completed service synchronization,
>>>>> ready to provide service.
>>>>> Sep 27 00:16:30 corosync [pcmk  ] ERROR: pcmk_wait_dispatch: Child
>>>>> process cib exited (pid=14889, rc=127)
>>>>> Sep 27 00:16:30 corosync [pcmk  ] notice: pcmk_wait_dispatch:
>>>>> Respawning failed child process: cib
>>>>> Sep 27 00:16:30 corosync [pcmk  ] info: spawn_child: Forked child
>>>>> 14896 for process cib
>>>>> crmd[14893]: 2010/09/27_00:16:30 WARN: do_cib_control: Couldn't
>>>>> complete CIB registration 1 times... pause and retry
>>>>> Sep 27 00:16:31 corosync [pcmk  ] ERROR: pcmk_wait_dispatch: Child
>>>>> process cib exited (pid=14896, rc=127)
>>>>> Sep 27 00:16:31 corosync [pcmk  ] notice: pcmk_wait_dispatch:
>>>>> Respawning failed child process: cib
>>>>> Sep 27 00:16:31 corosync [pcmk  ] info: spawn_child: Forked child
>>>>> 14901 for process cib
>>>>> Sep 27 00:16:32 corosync [pcmk  ] ERROR: pcmk_wait_dispatch: Child
>>>>> process cib exited (pid=14901, rc=1
>>>>> ======
>>>>>
>>>>>
>>>>> I have attached the full logs.
>>>>>
>>>>> We are using corosync 1.2.8 and pacemaker 1.1.3.
>>>>>
>>>>>
>>>>>  Thanks.
>>>>> Shravan
>>>>>
>>>>>
>>>>>
>>>>> On Sat, Sep 25, 2010 at 4:36 AM, Raoul Bhatia [IPAX] <r.bhatia at ipax.at> wrote:
>>>>>> On 24.09.2010 21:41, Shravan Mishra wrote:
>>>>>>>
>>>>>>> crmd[20612]: 2010/09/24_15:29:57 ERROR: crm_log_init_worker: Cannot
>>>>>>> change active directory to /var/lib/heartbeat/cores/hacluster:
>>>>>>> Permission denied (13)
>>>>>>
>>>>>> ls -ald /var/lib/heartbeat/cores/hacluster /var/lib/heartbeat/cores/
>>>>>> /var/lib/heartbeat/ /var/lib/ /var/
>>>>>>
>>>>>> is haclient allowed to cd all the way into
>>>>>> /var/lib/heartbeat/cores/hacluster ?
>>>>>>
>>>>>> cheers,
>>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
>>>>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>>>>
>>>>> Project Home: http://www.clusterlabs.org
>>>>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>>>>> Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker
>>>>>
>>>>>
>>>>
>>>
>>
>



