[Pacemaker] cib
Shravan Mishra
shravan.mishra at gmail.com
Wed Sep 29 16:25:00 UTC 2010
Some more info:
root 14170 14166 0 12:23 ? 00:00:00 /usr/lib64/heartbeat/stonithd
nobody 14172 14166 0 12:23 ? 00:00:00 /usr/lib64/heartbeat/lrmd
82 14173 14166 0 12:23 ? 00:00:00 /usr/lib64/heartbeat/attrd
82 14174 14166 0 12:23 ? 00:00:00 /usr/lib64/heartbeat/pengine
82 14175 14166 0 12:23 ? 00:00:00 /usr/lib64/heartbeat/crmd
Note that lrmd is running as nobody when it should be running as root.
I'm not sure why that would happen.
Thanks
Shravan
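
A minimal sketch of how the check described above could be automated: scan `ps -ef` style lines and flag daemons whose owner differs from what is expected. The helper name `find_wrong_owners` and the `EXPECTED_OWNER` table are hypothetical. The only expectation taken from this message is that lrmd should run as root; the sample input is a subset of the process list above.

```python
# Hypothetical helper, not part of Pacemaker or Heartbeat.
# Assumption: the only expected owner stated in the thread is root for lrmd.
EXPECTED_OWNER = {"lrmd": "root"}

# Sample input copied from the process list in the message above.
PS_OUTPUT = """\
root 14170 14166 0 12:23 ? 00:00:00 /usr/lib64/heartbeat/stonithd
nobody 14172 14166 0 12:23 ? 00:00:00 /usr/lib64/heartbeat/lrmd
82 14173 14166 0 12:23 ? 00:00:00 /usr/lib64/heartbeat/attrd
"""


def find_wrong_owners(ps_text, expected):
    """Return (daemon, actual_user, expected_user) for each mismatch."""
    mismatches = []
    for line in ps_text.splitlines():
        # ps -ef columns: UID PID PPID C STIME TTY TIME CMD
        fields = line.split(None, 7)
        if len(fields) < 8:
            continue  # skip blank or malformed lines
        user, command = fields[0], fields[7]
        # Use the basename of the executable as the daemon name.
        name = command.split()[0].rsplit("/", 1)[-1]
        wanted = expected.get(name)
        if wanted is not None and user != wanted:
            mismatches.append((name, user, wanted))
    return mismatches


for name, actual, wanted in find_wrong_owners(PS_OUTPUT, EXPECTED_OWNER):
    print(f"{name} is running as {actual}, expected {wanted}")
```

This only reports the mismatch; it does not explain why lrmd was started under the wrong user.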
On Wed, Sep 29, 2010 at 10:29 AM, Shravan Mishra
<shravan.mishra at gmail.com> wrote:
> Hi,
>
>
>
> I did a bt on the core, this is what I found:
>
>
> ==========
> Core was generated by `/usr/lib64/heartbeat/cib'.
> Program terminated with signal 11, Segmentation fault.
> [New process 12340]
> #0 0x00007f23acc553fa in strncmp () from /lib64/libc.so.6
> (gdb) bt
> #0 0x00007f23acc553fa in strncmp () from /lib64/libc.so.6
> #1 0x00007f23acf87c39 in __xmlParserInputBufferCreateFilename () from
> /usr/lib64/libxml2.so.2
> #2 0x00007f23acf6147b in xmlNewInputFromFile () from /usr/lib64/libxml2.so.2
> #3 0x00007f23acf641d4 in xmlCreateURLParserCtxt () from /usr/lib64/libxml2.so.2
> #4 0x00007f23acf78f3a in xmlReadFile () from /usr/lib64/libxml2.so.2
> #5 0x00007f23ad0167b1 in xmlRelaxNGParse () from /usr/lib64/libxml2.so.2
> #6 0x00007f23ae967321 in validate_with_relaxng (doc=0x626020, to_logs=1,
> relaxng_file=0x7f23ae97ba10
> "/usr/share/pacemaker/pacemaker-1.2.rng") at xml.c:2222
> #7 0x00007f23ae967769 in validate_with (xml=0x6260d0, method=6,
> to_logs=1) at xml.c:2287
> #8 0x00007f23ae967b9f in validate_xml (xml_blob=0x6260d0,
> validation=0x626910 "pacemaker-1.2",
> to_logs=1) at xml.c:2373
> #9 0x0000000000405b23 in readCibXmlFile (dir=0x41b580
> "/var/lib/heartbeat/crm",
> file=0x41c40a "cib.xml", discard_status=1) at io.c:396
> #10 0x0000000000412285 in startCib (filename=0x41c40a "cib.xml") at main.c:613
> #11 0x0000000000411309 in cib_init () at main.c:408
> #12 0x000000000041064a in main (argc=1, argv=0x7fff942e0f58) at main.c:218
>
>
> ==========
>
>
>
> If this is a fresh install, say, then cib.xml will not exist.
> So why is it looking for this file on startup?
>
>
> Sincerely
> Shravan
>
>
> On Tue, Sep 28, 2010 at 10:24 AM, Shravan Mishra
> <shravan.mishra at gmail.com> wrote:
>> Sorry, I forgot to attach my corosync.conf.
>>
>>
>> =========
>> totem {
>> version: 2
>> # token: 3000
>> # token_retransmits_before_loss_const: 10
>> # join: 60
>> # consensus: 1500
>> # vsftype: none
>> # max_messages: 20
>> # clear_node_high_bit: yes
>> secauth: off
>> threads: 0
>> # rrp_mode: passive
>>
>> interface {
>> ringnumber: 0
>> bindnetaddr: 192.168.2.0
>> #mcastaddr: 226.94.1.1
>> broadcast: yes
>> mcastport: 5405
>> }
>> # interface {
>> # ringnumber: 1
>> # bindnetaddr: 172.20.20.0
>> #mcastaddr: 226.94.1.1
>> # broadcast: yes
>> # mcastport: 5405
>> # }
>> }
>>
>> logging {
>> fileline: off
>> to_stderr: yes
>> to_logfile: yes
>> to_syslog: yes
>> logfile: /tmp/corosync.log
>> debug: off
>> timestamp: on
>> logger_subsys {
>> subsys: AMF
>> debug: off
>> }
>> }
>>
>> service {
>> name: pacemaker
>> ver: 0
>> }
>>
>> aisexec {
>> user: root
>> group: root
>> }
>>
>> amf {
>> mode: disabled
>> }
>>
>>
>>
>>
>> =========
>>
>> On Tue, Sep 28, 2010 at 10:10 AM, Shravan Mishra
>> <shravan.mishra at gmail.com> wrote:
>>> Hi Andrew,
>>>
>>> I'm attaching another log file, as I reflashed my machine and started
>>> everything from scratch.
>>> It looks like my old system got a little messed up while I was trying to
>>> install the old HA libraries (corosync/pacemaker) that were initially
>>> working for me.
>>>
>>>
>>> Here are the details:
>>>
>>> As of now I just want to see cib/attrd up so I have only one machine
>>> where I want to see things in a sane state.
>>>
>>> [root at ha2 ~]# /usr/sbin/corosync -v
>>> Corosync Cluster Engine, version '1.2.8' SVN revision '3035'
>>> Copyright (c) 2006-2009 Red Hat, Inc.
>>>
>>> [root at ha2 ~]# /usr/lib64/heartbeat/crmd version
>>> CRM Version: 1.1.2 (e0d731c2b1be446b27a73327a53067bf6230fb6a)
>>>
>>>
>>>
>>> The Pacemaker version is 1.1; based on the above output, the release is
>>> 1.1.2, if I understand correctly.
>>>
>>> This one is showing --
>>>
>>> Sep 27 12:30:45 corosync [pcmk ] ERROR: pcmk_wait_dispatch: Child
>>> process cib terminated with signal 11 (pid=9216, core=false)
>>>
>>>
>>> Please find corosync logs attached.
>>>
>>> Thanks
>>> Shravan
>>>
>>>
>>> On Tue, Sep 28, 2010 at 5:47 AM, Andrew Beekhof <andrew at beekhof.net> wrote:
>>>> On Mon, Sep 27, 2010 at 6:26 AM, Shravan Mishra
>>>> <shravan.mishra at gmail.com> wrote:
>>>>> Thanks Raoul for the response.
>>>>>
>>>>> Changing the permission to hacluster:haclient did stop that error.
>>>>>
>>>>> Now I'm hitting another problem whereby cib is failing to start:
>>>>
>>>> Very strange logs.
>>>> Which distribution is this?
>>>> What does your corosync.conf look like?
>>>>
>>>>
>>>>> =====
>>>>> Sep 27 00:16:29 corosync [pcmk ] info: update_member: Node
>>>>> ha2.itactics.com now has process list:
>>>>> 00000000000000000000000000110012 (1114130)
>>>>> Sep 27 00:16:29 corosync [pcmk ] info: update_member: Node
>>>>> ha2.itactics.com now has 1 quorum votes (was 0)
>>>>> Sep 27 00:16:29 corosync [pcmk ] info: send_member_notification:
>>>>> Sending membership update 100 to 0 children
>>>>> Sep 27 00:16:29 corosync [MAIN ] Completed service synchronization,
>>>>> ready to provide service.
>>>>> Sep 27 00:16:30 corosync [pcmk ] ERROR: pcmk_wait_dispatch: Child
>>>>> process cib exited (pid=14889, rc=127)
>>>>> Sep 27 00:16:30 corosync [pcmk ] notice: pcmk_wait_dispatch:
>>>>> Respawning failed child process: cib
>>>>> Sep 27 00:16:30 corosync [pcmk ] info: spawn_child: Forked child
>>>>> 14896 for process cib
>>>>> crmd[14893]: 2010/09/27_00:16:30 WARN: do_cib_control: Couldn't
>>>>> complete CIB registration 1 times... pause and retry
>>>>> Sep 27 00:16:31 corosync [pcmk ] ERROR: pcmk_wait_dispatch: Child
>>>>> process cib exited (pid=14896, rc=127)
>>>>> Sep 27 00:16:31 corosync [pcmk ] notice: pcmk_wait_dispatch:
>>>>> Respawning failed child process: cib
>>>>> Sep 27 00:16:31 corosync [pcmk ] info: spawn_child: Forked child
>>>>> 14901 for process cib
>>>>> Sep 27 00:16:32 corosync [pcmk ] ERROR: pcmk_wait_dispatch: Child
>>>>> process cib exited (pid=14901, rc=1
>>>>> ======
>>>>>
>>>>>
>>>>> I have attached the full logs.
>>>>>
>>>>> We are using corosync 1.2.8 and pacemaker 1.1.3.
>>>>>
>>>>>
>>>>> Thanks.
>>>>> Shravan
>>>>>
>>>>>
>>>>>
>>>>> On Sat, Sep 25, 2010 at 4:36 AM, Raoul Bhatia [IPAX] <r.bhatia at ipax.at> wrote:
>>>>>> On 24.09.2010 21:41, Shravan Mishra wrote:
>>>>>>>
>>>>>>> crmd[20612]: 2010/09/24_15:29:57 ERROR: crm_log_init_worker: Cannot
>>>>>>> change active directory to /var/lib/heartbeat/cores/hacluster:
>>>>>>> Permission denied (13)
>>>>>>
>>>>>> ls -ald /var/lib/heartbeat/cores/hacluster /var/lib/heartbeat/cores/
>>>>>> /var/lib/heartbeat/ /var/lib/ /var/
>>>>>>
>>>>>> is haclient allowed to cd all the way into
>>>>>> /var/lib/heartbeat/cores/hacluster ?
>>>>>>
>>>>>> cheers,
>>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
>>>>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>>>>
>>>>> Project Home: http://www.clusterlabs.org
>>>>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>>>>> Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker
>>>>>
>>>>>
>>>>
>>>
>>
>
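
Raoul's question earlier in the thread (whether the user can cd all the way into /var/lib/heartbeat/cores/hacluster) can be checked one directory at a time. The following is a rough sketch under stated assumptions: it looks only at classic Unix mode bits and ignores ACLs and security modules. The helper names (`ancestors`, `can_search`, `first_blocked`, `ids_for_user`) are hypothetical, and the `hacluster` account name in the usage note is taken from the thread.

```python
import grp
import os
import pwd
import stat


def ancestors(path):
    """Return the path and all of its parent directories, outermost first."""
    path = os.path.abspath(path)
    parts = []
    while True:
        parts.append(path)
        parent = os.path.dirname(path)
        if parent == path:  # reached the filesystem root
            break
        path = parent
    return list(reversed(parts))


def can_search(mode, owner_uid, owner_gid, uid, gids):
    """True if the mode bits give uid search (x) permission on a directory."""
    if uid == 0:
        return True  # root is not restricted by directory mode bits
    if owner_uid == uid:
        return bool(mode & stat.S_IXUSR)
    if owner_gid in gids:
        return bool(mode & stat.S_IXGRP)
    return bool(mode & stat.S_IXOTH)


def first_blocked(path, uid, gids):
    """Return the first directory on the way to path that uid cannot enter."""
    for directory in ancestors(path):
        st = os.stat(directory)
        if not can_search(st.st_mode, st.st_uid, st.st_gid, uid, gids):
            return directory
    return None


def ids_for_user(username):
    """Look up the uid and the set of group ids for a local account."""
    pw = pwd.getpwnam(username)
    gids = {pw.pw_gid}
    gids.update(g.gr_gid for g in grp.getgrall() if username in g.gr_mem)
    return pw.pw_uid, gids
```

On the affected node, something like `first_blocked("/var/lib/heartbeat/cores/hacluster", *ids_for_user("hacluster"))` would return the first directory that blocks the change of directory, or None if every component is searchable. It needs to run as a user that can stat each directory, for example root.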