[Pacemaker] Pacemaker installed to custom location

James Masson james.masson at opencredo.com
Fri Apr 26 07:12:13 EDT 2013



On 26/04/13 01:29, Andrew Beekhof wrote:
>
> On 26/04/2013, at 12:12 AM, James Masson <james.masson at opencredo.com> wrote:
>
>>
>> Hi list,
>>
>> I'm trying to build and run pacemaker from a custom location.
>>
>> Corosync starts up fine.
>>
>> Pacemakerd does not - the result is.
>
> Try turning up the debug to see why the cib isn't happy:
>
>> Apr 25 13:54:10 [10482] fcde02a2-cc41-4c58-b6d2-b7bb0bada436 pacemakerd:    error: pcmk_child_exit: 	Child process cib exited (pid=10484, rc=100)
>> Apr 25 13:54:10 [10482] fcde02a2-cc41-4c58-b6d2-b7bb0bada436 pacemakerd:  warning: pcmk_child_exit: 	Pacemaker child process cib no longer
>
>
>
Hi Andrew,

The debug log and strace are attached. The strace shows something interesting:


5195  open("/dev/shm/qb-cpg-request-5173-5195-19-header", O_RDWR) = -1 EACCES (Permission denied)


I know pacemaker uses shm to communicate. Permissions on /dev/shm are (I
think) correct.

root@5627a5e1-9e30-4fe2-9178-6445e26a8ccc:~# ls -al /dev/shm/
total 8224
drwxrwx---  2 root vcap      80 2013-04-26 10:30 .
drwxr-xr-x 12 root root    3900 2013-04-26 08:23 ..
-rw-------  1 root root 8388608 2013-04-26 10:30 qb-corosync-blackbox-data
-rw-------  1 root root    8248 2013-04-26 10:28 qb-corosync-blackbox-header
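
One way to sanity-check the directory permissions is to model the
classic POSIX owner/group/other check. The sketch below is only an
illustration: the helper name can_access_dir is made up, and the process
credentials (uid=1000, gid=0, no supplementary groups) are an assumption
taken from the lrmd log lines further down, which may not match what the
daemons really run with. It ignores root's override and ACLs.

```python
import stat

def can_access_dir(mode, owner_uid, owner_gid, uid, gid, groups=()):
    """Model the POSIX owner/group/other check for rwx on a directory.

    Only the first matching class (owner, then group, then other) applies.
    Root's override and ACLs are deliberately ignored.
    """
    if uid == owner_uid:
        needed = stat.S_IRWXU
    elif gid == owner_gid or owner_gid in groups:
        needed = stat.S_IRWXG
    else:
        needed = stat.S_IRWXO
    return (mode & needed) == needed

# /dev/shm as listed above: drwxrwx--- root:vcap (uid 0, gid 1000).
# Assumed caller: uid=1000 (vcap), gid=0 (root).
print(can_access_dir(0o770, 0, 1000, uid=1000, gid=0))                  # no vcap group
print(can_access_dir(0o770, 0, 1000, uid=1000, gid=0, groups=(1000,)))  # vcap as supplementary group
print(can_access_dir(0o777, 0, 1000, uid=1000, gid=0))                  # after chmod 777
```

Under those assumptions, a process running as user vcap with primary
group root falls into the "other" class for a root:vcap 0770 directory,
which would be consistent with the EACCES above.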

When I changed the permissions on /dev/shm to 777, things got a little
further: the CIB stays up, but crmd keeps respawning, and I get this over
and over again in the logs.

##################################
Apr 26 10:55:52 [5758] 5627a5e1-9e30-4fe2-9178-6445e26a8ccc       lrmd:     info: crm_client_destroy:   Destroying 0 events
Apr 26 10:55:54 [5758] 5627a5e1-9e30-4fe2-9178-6445e26a8ccc       lrmd:     info: crm_client_new:       Connecting 0x1a498e0 for uid=1000 gid=0 pid=5775 id=95b6eca5-a34e-49e5-b0f8-74b84857d690
Apr 26 10:55:54 [5758] 5627a5e1-9e30-4fe2-9178-6445e26a8ccc       lrmd:     info: crm_client_destroy:   Destroying 0 events
Apr 26 10:55:56 [5758] 5627a5e1-9e30-4fe2-9178-6445e26a8ccc       lrmd:     info: crm_client_new:       Connecting 0x1a498e0 for uid=1000 gid=0 pid=5775 id=117e515b-da4d-4842-9414-7b7d004e5c92
Apr 26 10:55:56 [5758] 5627a5e1-9e30-4fe2-9178-6445e26a8ccc       lrmd:     info: crm_client_destroy:   Destroying 0 events
Apr 26 10:55:58 [5758] 5627a5e1-9e30-4fe2-9178-6445e26a8ccc       lrmd:     info: crm_client_new:       Connecting 0x1a498e0 for uid=1000 gid=0 pid=5775 id=cf7c10b1-14a1-47d1-9e2e-30707254256f
Apr 26 10:55:58 [5758] 5627a5e1-9e30-4fe2-9178-6445e26a8ccc       lrmd:     info: crm_client_destroy:   Destroying 0 events
Apr 26 10:55:58 [5754] 5627a5e1-9e30-4fe2-9178-6445e26a8ccc pacemakerd:    error: pcmk_child_exit:      Child process crmd exited (pid=5775, rc=2)
Apr 26 10:55:58 [5754] 5627a5e1-9e30-4fe2-9178-6445e26a8ccc pacemakerd:    trace: update_node_processes:        Empty uname for node 839122954
Apr 26 10:55:58 [5754] 5627a5e1-9e30-4fe2-9178-6445e26a8ccc pacemakerd:    debug: update_node_processes:        Node 5627a5e1-9e30-4fe2-9178-6445e26a8ccc now has process list: 00000000000000000000000000111112 (was 00000000000000000000000000111312)
Apr 26 10:55:58 [5754] 5627a5e1-9e30-4fe2-9178-6445e26a8ccc pacemakerd:    trace: update_process_clients:       Sending process list to 0 children
Apr 26 10:55:58 [5754] 5627a5e1-9e30-4fe2-9178-6445e26a8ccc pacemakerd:    trace: update_process_peers:         Sending <node uname="5627a5e1-9e30-4fe2-9178-6445e26a8ccc" proclist="1118482"/>
Apr 26 10:55:58 [5754] 5627a5e1-9e30-4fe2-9178-6445e26a8ccc pacemakerd:   notice: pcmk_process_exit:    Respawning failed child process: crmd
Apr 26 10:55:58 [5754] 5627a5e1-9e30-4fe2-9178-6445e26a8ccc pacemakerd:     info: start_child:  Forked child 5789 for process crmd
Apr 26 10:55:58 [5754] 5627a5e1-9e30-4fe2-9178-6445e26a8ccc pacemakerd:    trace: update_node_processes:        Empty uname for node 839122954
Apr 26 10:55:58 [5754] 5627a5e1-9e30-4fe2-9178-6445e26a8ccc pacemakerd:    debug: update_node_processes:        Node 5627a5e1-9e30-4fe2-9178-6445e26a8ccc now has process list: 00000000000000000000000000111312 (was 00000000000000000000000000111112)
Apr 26 10:55:58 [5754] 5627a5e1-9e30-4fe2-9178-6445e26a8ccc pacemakerd:    trace: update_process_clients:       Sending process list to 0 children
Apr 26 10:55:58 [5754] 5627a5e1-9e30-4fe2-9178-6445e26a8ccc pacemakerd:    trace: update_process_peers:         Sending <node uname="5627a5e1-9e30-4fe2-9178-6445e26a8ccc" proclist="1118994"/>
Apr 26 10:55:58 [5754] 5627a5e1-9e30-4fe2-9178-6445e26a8ccc pacemakerd:    trace: crm_user_lookup:      Cluster user vcap has uid=1000 gid=1000
Apr 26 10:55:58 [5754] 5627a5e1-9e30-4fe2-9178-6445e26a8ccc pacemakerd:    trace: mainloop_gio_callback:        New message from corosync-cpg[0x21b1c60]
Apr 26 10:55:58 [5758] 5627a5e1-9e30-4fe2-9178-6445e26a8ccc       lrmd:     info: crm_client_new:       Connecting 0x1a498e0 for uid=1000 gid=0 pid=5789 id=5dfb6f5a-8b53-42f6-b5f5-61e49efa93dd
Apr 26 10:55:58 [5758] 5627a5e1-9e30-4fe2-9178-6445e26a8ccc       lrmd:     info: crm_client_new:       Connecting 0x1a636f0 for uid=1000 gid=0 pid=5789 id=3198d49f-8ff9-4799-9496-1b9aed0de807
Apr 26 10:55:58 [5758] 5627a5e1-9e30-4fe2-9178-6445e26a8ccc       lrmd:     info: crm_client_destroy:   Destroying 0 events
Apr 26 10:55:58 [5758] 5627a5e1-9e30-4fe2-9178-6445e26a8ccc       lrmd:     info: crm_client_new:       Connecting 0x1a56cb0 for uid=1000 gid=0 pid=5789 id=2713f990-2533-4fb8-82e0-31e40b1ef577
Apr 26 10:55:58 [5758] 5627a5e1-9e30-4fe2-9178-6445e26a8ccc       lrmd:     info: crm_client_destroy:   Destroying 0 events
Apr 26 10:55:58 [5758] 5627a5e1-9e30-4fe2-9178-6445e26a8ccc       lrmd:     info: crm_client_new:       Connecting 0x1a571f0 for uid=1000 gid=0 pid=5789 id=2bf401a2-3bd5-43af-9328-0a53bb61d9f7
Apr 26 10:55:58 [5758] 5627a5e1-9e30-4fe2-9178-6445e26a8ccc       lrmd:     info: crm_client_destroy:   Destroying 0 events
Apr 26 10:55:58 [5758] 5627a5e1-9e30-4fe2-9178-6445e26a8ccc       lrmd:     info: crm_client_destroy:   Destroying 0 events
Apr 26 10:56:00 [5758] 5627a5e1-9e30-4fe2-9178-6445e26a8ccc       lrmd:     info: crm_client_new:       Connecting 0x1a498e0 for uid=1000 gid=0 pid=5789 id=7233fbec-3633-4a48-8fe7-3028bfa58029
Apr 26 10:56:00 [5758] 5627a5e1-9e30-4fe2-9178-6445e26a8ccc       lrmd:     info: crm_client_destroy:   Destroying 0 events
Apr 26 10:56:02 [5758] 5627a5e1-9e30-4fe2-9178-6445e26a8ccc       lrmd:     info: crm_client_new:       Connecting 0x1a498e0 for uid=1000 gid=0 pid=5789 id=a7b76888-7137-4eb1-888d-d7a3ea273a4f
Apr 26 10:56:02 [5758] 5627a5e1-9e30-4fe2-9178-6445e26a8ccc       lrmd:     info: crm_client_destroy:   Destroying 0 events
Apr 26 10:56:04 [5758] 5627a5e1-9e30-4fe2-9178-6445e26a8ccc       lrmd:     info: crm_client_new:       Connecting 0x1a498e0 for uid=1000 gid=0 pid=5789 id=4fbd695d-902b-4a29-957f-8d36fd072178
Apr 26 10:56:04 [5758] 5627a5e1-9e30-4fe2-9178-6445e26a8ccc       lrmd:     info: crm_client_destroy:   Destroying 0 events
Apr 26 10:56:06 [5758] 5627a5e1-9e30-4fe2-9178-6445e26a8ccc       lrmd:     info: crm_client_new:       Connecting 0x1a498e0 for uid=1000 gid=0 pid=5789 id=a3e00689-d842-456d-957a-22e2e4e7eedf
##################
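
The "process list" strings in the pacemakerd debug lines look like
hexadecimal bitmasks: read as hex, they equal the decimal proclist="..."
values sent to peers in the same log. A small sketch to check the
arithmetic (decode_proclist is a made-up helper name, and the reading of
the differing bit as "crmd" is only an inference from the exit/respawn
messages around it, not something checked against the Pacemaker source):

```python
def decode_proclist(text):
    """Interpret the 32-character process list from the log as hexadecimal."""
    return int(text, 16)

# Values copied from the update_node_processes lines above.
with_crmd = decode_proclist("00000000000000000000000000111312")
without_crmd = decode_proclist("00000000000000000000000000111112")

# Decimal forms, to compare with the proclist="..." attributes in the log.
print(without_crmd, with_crmd)

# The single bit that differs between the two lists.
print(hex(with_crmd ^ without_crmd))
```

The bit clears when crmd exits and is set again right after the respawn,
so the list flipping back and forth is just another view of the crmd
restart loop.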

Contents of /dev/shm while running:

#####################
root@5627a5e1-9e30-4fe2-9178-6445e26a8ccc:~# ls -al /dev/shm/
total 34936
drwxrwxrwx  2 root vcap    1280 2013-04-26 10:57 .
drwxr-xr-x 12 root root    3900 2013-04-26 08:23 ..
-rw-------  1 root root 1048576 2013-04-26 10:54 qb-cfg-event-5598-5754-16-data
-rw-------  1 root root    8248 2013-04-26 10:54 qb-cfg-event-5598-5754-16-header
-rw-------  1 root root 1048576 2013-04-26 10:54 qb-cfg-request-5598-5754-16-data
-rw-------  1 root root    8252 2013-04-26 10:54 qb-cfg-request-5598-5754-16-header
-rw-------  1 root root 1048576 2013-04-26 10:54 qb-cfg-response-5598-5754-16-data
-rw-------  1 root root    8248 2013-04-26 10:54 qb-cfg-response-5598-5754-16-header
-rw-rw----  1 vcap root  524288 2013-04-26 10:54 qb-cib_rw-event-5756-5757-9-data
-rw-rw----  1 vcap root    8248 2013-04-26 10:54 qb-cib_rw-event-5756-5757-9-header
-rw-rw----  1 vcap root  524288 2013-04-26 10:54 qb-cib_rw-event-5756-5759-10-data
-rw-rw----  1 vcap root    8248 2013-04-26 10:54 qb-cib_rw-event-5756-5759-10-header
-rw-rw----  1 vcap root  524288 2013-04-26 10:54 qb-cib_rw-request-5756-5757-9-data
-rw-rw----  1 vcap root    8252 2013-04-26 10:54 qb-cib_rw-request-5756-5757-9-header
-rw-rw----  1 vcap root  524288 2013-04-26 10:54 qb-cib_rw-request-5756-5759-10-data
-rw-rw----  1 vcap root    8252 2013-04-26 10:54 qb-cib_rw-request-5756-5759-10-header
-rw-rw----  1 vcap root  524288 2013-04-26 10:54 qb-cib_rw-response-5756-5757-9-data
-rw-rw----  1 vcap root    8248 2013-04-26 10:54 qb-cib_rw-response-5756-5757-9-header
-rw-rw----  1 vcap root  524288 2013-04-26 10:54 qb-cib_rw-response-5756-5759-10-data
-rw-rw----  1 vcap root    8248 2013-04-26 10:54 qb-cib_rw-response-5756-5759-10-header
-rw-rw----  1 vcap root  524288 2013-04-26 10:56 qb-cib_shm-event-5756-5808-7-data
-rw-rw----  1 vcap root    8248 2013-04-26 10:56 qb-cib_shm-event-5756-5808-7-header
-rw-rw----  1 vcap root  524288 2013-04-26 10:56 qb-cib_shm-request-5756-5808-7-data
-rw-rw----  1 vcap root    8252 2013-04-26 10:56 qb-cib_shm-request-5756-5808-7-header
-rw-rw----  1 vcap root  524288 2013-04-26 10:56 qb-cib_shm-response-5756-5808-7-data
-rw-rw----  1 vcap root    8248 2013-04-26 10:56 qb-cib_shm-response-5756-5808-7-header
-rw-------  1 root root 8388608 2013-04-26 10:56 qb-corosync-blackbox-data
-rw-------  1 root root    8248 2013-04-26 10:47 qb-corosync-blackbox-header
-rw-------  1 root root 1048576 2013-04-26 10:54 qb-cpg-event-5598-5754-17-data
-rw-------  1 root root    8248 2013-04-26 10:54 qb-cpg-event-5598-5754-17-header
-rw-------  1 vcap root 1048576 2013-04-26 10:54 qb-cpg-event-5598-5756-19-data
-rw-------  1 vcap root    8248 2013-04-26 10:54 qb-cpg-event-5598-5756-19-header
-rw-------  1 root root 1048576 2013-04-26 10:54 qb-cpg-event-5598-5757-18-data
-rw-------  1 root root    8248 2013-04-26 10:54 qb-cpg-event-5598-5757-18-header
-rw-------  1 vcap root 1048576 2013-04-26 10:54 qb-cpg-event-5598-5759-20-data
-rw-------  1 vcap root    8248 2013-04-26 10:54 qb-cpg-event-5598-5759-20-header
-rw-------  1 vcap root 1048576 2013-04-26 10:56 qb-cpg-event-5598-5808-21-data
-rw-------  1 vcap root    8248 2013-04-26 10:56 qb-cpg-event-5598-5808-21-header
-rw-------  1 root root 1048576 2013-04-26 10:54 qb-cpg-request-5598-5754-17-data
-rw-------  1 root root    8252 2013-04-26 10:54 qb-cpg-request-5598-5754-17-header
-rw-------  1 vcap root 1048576 2013-04-26 10:54 qb-cpg-request-5598-5756-19-data
-rw-------  1 vcap root    8252 2013-04-26 10:54 qb-cpg-request-5598-5756-19-header
-rw-------  1 root root 1048576 2013-04-26 10:54 qb-cpg-request-5598-5757-18-data
-rw-------  1 root root    8252 2013-04-26 10:54 qb-cpg-request-5598-5757-18-header
-rw-------  1 vcap root 1048576 2013-04-26 10:54 qb-cpg-request-5598-5759-20-data
-rw-------  1 vcap root    8252 2013-04-26 10:54 qb-cpg-request-5598-5759-20-header
-rw-------  1 vcap root 1048576 2013-04-26 10:56 qb-cpg-request-5598-5808-21-data
-rw-------  1 vcap root    8252 2013-04-26 10:56 qb-cpg-request-5598-5808-21-header
-rw-------  1 root root 1048576 2013-04-26 10:54 qb-cpg-response-5598-5754-17-data
-rw-------  1 root root    8248 2013-04-26 10:54 qb-cpg-response-5598-5754-17-header
-rw-------  1 vcap root 1048576 2013-04-26 10:54 qb-cpg-response-5598-5756-19-data
-rw-------  1 vcap root    8248 2013-04-26 10:54 qb-cpg-response-5598-5756-19-header
-rw-------  1 root root 1048576 2013-04-26 10:54 qb-cpg-response-5598-5757-18-data
-rw-------  1 root root    8248 2013-04-26 10:54 qb-cpg-response-5598-5757-18-header
-rw-------  1 vcap root 1048576 2013-04-26 10:54 qb-cpg-response-5598-5759-20-data
-rw-------  1 vcap root    8248 2013-04-26 10:54 qb-cpg-response-5598-5759-20-header
-rw-------  1 vcap root 1048576 2013-04-26 10:56 qb-cpg-response-5598-5808-21-data
-rw-------  1 vcap root    8248 2013-04-26 10:56 qb-cpg-response-5598-5808-21-header
-rw-------  1 vcap root 1048576 2013-04-26 10:56 qb-quorum-event-5598-5808-22-data
-rw-------  1 vcap root    8248 2013-04-26 10:56 qb-quorum-event-5598-5808-22-header
-rw-------  1 vcap root 1048576 2013-04-26 10:56 qb-quorum-request-5598-5808-22-data
-rw-------  1 vcap root    8252 2013-04-26 10:56 qb-quorum-request-5598-5808-22-header
-rw-------  1 vcap root 1048576 2013-04-26 10:56 qb-quorum-response-5598-5808-22-data
-rw-------  1 vcap root    8248 2013-04-26 10:56 qb-quorum-response-5598-5808-22-header
#####################################
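
To see at a glance which user owns which ring buffers, the listing can
be summarised by service and owner. The sketch below is a rough helper,
not part of any tool: the regular expression and the field names
(service, kind, pid_a, pid_b, fd, part) are guesses inferred from the
file names in this listing. The only hint about the numeric fields is
that in the strace line near the top, the second number (5195) matches
the pid of the process making the open() call.

```python
import re
from collections import Counter

# Pattern inferred from the names in the listing; the field names are guesses.
NAME_RE = re.compile(
    r"^qb-(?P<service>[a-z_]+)-(?P<kind>event|request|response)"
    r"-(?P<pid_a>\d+)-(?P<pid_b>\d+)-(?P<fd>\d+)-(?P<part>data|header)$"
)

def parse_ls_line(line):
    """Return (owner, group, fields) for one `ls -al` line, or None.

    Expects the file name on the same line as the other columns.
    Lines that are not per-connection qb-* files are skipped.
    """
    cols = line.split()
    if len(cols) < 8:
        return None
    match = NAME_RE.match(cols[-1])
    if match is None:
        return None
    return cols[2], cols[3], match.groupdict()

# A few lines from the listing, with the file name on the same line.
sample = [
    "-rw-------  1 root root 1048576 2013-04-26 10:54 qb-cpg-event-5598-5754-17-data",
    "-rw-------  1 vcap root 1048576 2013-04-26 10:54 qb-cpg-event-5598-5756-19-data",
    "-rw-rw----  1 vcap root  524288 2013-04-26 10:54 qb-cib_rw-event-5756-5757-9-data",
    "-rw-------  1 root root 8388608 2013-04-26 10:56 qb-corosync-blackbox-data",
]

owners = Counter()
for line in sample:
    parsed = parse_ls_line(line)
    if parsed:
        owner, group, fields = parsed
        owners[(fields["service"], owner, group)] += 1

for key, count in sorted(owners.items()):
    print(key, count)
```

Every vcap-owned file in the listing has group root rather than group
vcap, which fits the uid=1000 gid=0 credentials reported by lrmd.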

Snippets from the pacemaker strace after chmod 777 /dev/shm:

###################
CIB
5833  chown("/dev/shm/qb-cib_shm-event-5833-5858-7-data", 4294967295, 1000) = -1 EPERM (Operation not permitted)
5833  chown("/dev/shm/qb-cib_shm-event-5833-5858-7-header", 4294967295, 1000) = -1 EPERM (Operation not permitted)
5833  chmod("/dev/shm/qb-cib_shm-event-5833-5858-7-data", 0660) = 0
5833  chmod("/dev/shm/qb-cib_shm-event-5833-5858-7-header", 0660) = 0
####################
CRMD
5838  connect(3, {sa_family=AF_FILE, path=@"cib_shm"}, 110) = -1 ECONNREFUSED (Connection refused)
5838  close(3)                          = 0
5838  shutdown(4294967295, 2 /* send and receive */) = -1 EBADF (Bad file descriptor)
5838  close(4294967295)                 = -1 EBADF (Bad file descriptor)
5838  write(2, "Could not establish cib_shm conn"..., 65) = 65
5838  clock_gettime(CLOCK_REALTIME, {1366973927, 255600506}) = 0
5838  munmap(0x7f6c1bcc3000, 528384)    = 0
#########################
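
A note on reading those numbers: 4294967295 is 2^32 - 1, which is how
strace shows -1 for an unsigned 32-bit argument. For chown(), an owner
of -1 means "leave the owner unchanged", so the CIB is only trying to
set the group to 1000 (vcap). The sketch below models the usual rule for
an unprivileged group change. The helper names are made up, and the
caller's credentials (uid=1000, gid=0, no supplementary groups) are an
assumption inferred from the "vcap root" ownership in the listing and
the lrmd log lines; the strace does not show them directly.

```python
def as_signed32(value):
    """Reinterpret an unsigned 32-bit value as a signed one."""
    return value - 2**32 if value >= 2**31 else value

def may_change_group(uid, gid, groups, file_owner, new_gid):
    """Model the usual rule for chown(path, -1, new_gid).

    Root may always do it. Anyone else must own the file and be a
    member of the target group (primary or supplementary).
    """
    if uid == 0:
        return True
    return uid == file_owner and (new_gid == gid or new_gid in groups)

# The owner argument (and the fd passed to shutdown/close) as a signed value.
print(as_signed32(4294967295))

# Assumed caller: uid=1000 (vcap), gid=0 (root), changing group to 1000 (vcap).
print(may_change_group(1000, 0, (), file_owner=1000, new_gid=1000))
print(may_change_group(1000, 0, (1000,), file_owner=1000, new_gid=1000))
```

Under those assumptions the first check fails, matching the EPERM, and
it would only pass if the running process actually carried group 1000.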

This is looking more and more like a permissions problem on the files
read and written in /dev/shm.

I read http://www.ultrabug.fr/pacemaker-vulnerability-and-v1-1-9-release/
and added root to group vcap, and vcap to group root (vcap is my
equivalent of the hacluster user and haclient group), with no change in
behavior. I did add "--with-acls" at compile time, but I'm not planning
on using ACLs.

regards

James M


-------------- next part --------------
A non-text attachment was scrubbed...
Name: corosync.log.gz
Type: application/x-gzip
Size: 5019 bytes
Desc: not available
URL: <https://lists.clusterlabs.org/pipermail/pacemaker/attachments/20130426/cdabe191/attachment-0002.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: pacemaker.strace.gz
Type: application/x-gzip
Size: 67528 bytes
Desc: not available
URL: <https://lists.clusterlabs.org/pipermail/pacemaker/attachments/20130426/cdabe191/attachment-0003.bin>

