[Pacemaker] CRM and openAIS

Andrew Beekhof beekhof at gmail.com
Thu Jan 17 13:48:34 EST 2008


On Jan 17, 2008, at 6:55 PM, Serge Dubrouski wrote:

> On Jan 17, 2008 10:40 AM, Serge Dubrouski <sergeyfd at gmail.com> wrote:
>>
>> On Jan 17, 2008 10:14 AM, Andrew Beekhof <beekhof at gmail.com> wrote:
>>>
>>> On Jan 17, 2008, at 5:07 PM, Serge Dubrouski wrote:
>>>
>>>> I've got it starting all right, but now it complains about permissions:
>>>>
>>>> Jan 17 11:00:16 fc-node1 crmd: [32197]: ERROR: socket_wait_conn_new:
>>>> unlink failure(/var/run/heartbeat/crm/crmd): Permission denied
>>>> Jan 17 11:00:16 fc-node1 cib: [32196]: ERROR: Could not open config
>>>> file /var/lib/heartbeat/crm/cib.xml.last for reading: Permission
>>>> denied
>>>> Jan 17 11:00:16 fc-node1 crmd: [32197]: ERROR: socket_wait_conn_new:
>>>> trying to create in /var/run/heartbeat/crm/crmd bind:: Address already
>>>> in use
>>>
>>> already in use?
>>> you don't have heartbeat running too, do you?
>>>
>>>>
>>>>
>>>> ...................
>>>>
>>>> All those files belong to hacluser:hacluster. Do they need to belong to
>>>> the other user?
>>>
>>> assuming you're using the packages from the build service (and that
>>> hacluser is missing a 't'), that should be right.
>>>
>>> maybe delete /var/run/heartbeat/crm/crmd and see what perms it gets
>>> recreated with?
>>>
>>
>> Looks like you built the Fedora packages for a particular UID or so:
>>
>> Jan 17 12:27:21 fc-node1 crmd: [323]: info: crmd_init: Starting crmd
>> Jan 17 12:27:21 fc-node1 attrd: [324]: ERROR: Cannot get name for uid
>> [24]: Success
>> Jan 17 12:27:21 fc-node1 cib: [322]: ERROR: Cannot get name for uid
>> [24]: Success
>>
>> Then:
>>
>> Jan 17 12:30:32 fc-node1 cib: [773]: info: retrieveCib: Reading
>> cluster configuration from: /var/lib/heartbeat/crm/cib.xml (digest:
>> /var/lib/heartbeat/crm/cib.xml.sig)
>> Jan 17 12:30:32 fc-node1 cib: [773]: ERROR: Could not open config file
>> /var/lib/heartbeat/crm/cib.xml for reading: Permission denied
>> Jan 17 12:30:32 fc-node1 cib: [773]: ERROR: retrieveCib:
>> /var/lib/heartbeat/crm/cib.xml exists but does NOT contain valid XML.
>>
>>
>> But:
>>
>> # ls -l /var/lib/heartbeat/crm/cib.xml
>> -rw------- 1 hacluster hacluster 3158 Jan 10 16:08
>> /var/lib/heartbeat/crm/cib.xml
>>
>> And crmd doesn't get created, failing with the same error: permission denied.
>>
>> Changing the uid for hacluster from 501 to 24 fixed the problem.
>>
>> BTW: Stopping openais service leaves lrmd up:
>>
>> [root@fc-node1 log]# service openais stop
>> Stopping OpenAIS daemon (aisexec):                         [  OK  ]
>> [root@fc-node1 log]# ps -ef | grep heart
>> root      3444     1  0 12:37 pts/0    00:00:00 /usr/lib/heartbeat/lrmd
>> root      3483 32732  0 12:39 pts/0    00:00:00 grep heart
>>
>> Is it supposed to be like that?
>> --
>> Serge Dubrouski.
>>
>
> And some more problems:
>
> Jan 17 12:43:27 fc-node2 lrmd: [10530]: ERROR: on_msg_add_rsc: RA
> class [stonith] does not exist.
> Jan 17 12:43:27 fc-node2 crmd: [10532]: ERROR: lrm_add_rsc(726): got a
> return code HA_FAIL from a reply message of addrsc with function
> get_ret_from_msg.
> Jan 17 12:43:27 fc-node2 crmd: [10532]: ERROR: get_lrm_resource: Could
> not add resource child_DoFencing:0 to LRM
> Jan 17 12:43:27 fc-node2 crmd: [10532]: ERROR: do_lrm_invoke: Invalid
> resource definition

Not so much a problem as something that's not implemented yet.

stonithd relies on the heartbeat comms layer and thus won't work with
OpenAIS.
plus the configuration is hell and there are a number of
design/implementation issues.

we're going to get together with the Red Hat guys to figure out what  
we're going to do for stonith in the new stack.
there will likely be a short-term solution next month.
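
For the uid mismatch quoted above (the daemons logging "Cannot get name
for uid [24]" while hacluster was uid 501), a quick check along these
lines may help confirm it before changing anything. This is only a rough
sketch and not part of Pacemaker or heartbeat; the function name
diagnose_uid is made up, and the uid and path in the usage note are just
the ones from the logs in this thread.

```python
import os
import pwd


def diagnose_uid(path, daemon_uid):
    """Compare the uid a daemon runs as with the owner of a file.

    Hypothetical helper: returns a dict describing whether daemon_uid
    has a passwd entry and whether it owns the file at path.
    """
    st = os.stat(path)
    try:
        # Fails with KeyError when the uid has no passwd entry, which
        # is the situation behind "Cannot get name for uid".
        daemon_user = pwd.getpwuid(daemon_uid).pw_name
    except KeyError:
        daemon_user = None
    return {
        "file_owner_uid": st.st_uid,
        "daemon_uid": daemon_uid,
        "daemon_user": daemon_user,
        # With mode 0600 only the owning uid can read the file.
        "owner_matches": st.st_uid == daemon_uid,
        "owner_only": (st.st_mode & 0o077) == 0,
    }
```

Something like diagnose_uid("/var/lib/heartbeat/crm/cib.xml", 24) could be
run as root on the affected node; if daemon_user comes back as None and
owner_matches is False, the packaged uid and the local hacluster account
disagree.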



