[Pacemaker] Pacemaker installed to custom location
James Masson
james.masson at opencredo.com
Fri Apr 26 11:12:13 UTC 2013
On 26/04/13 01:29, Andrew Beekhof wrote:
>
> On 26/04/2013, at 12:12 AM, James Masson <james.masson at opencredo.com> wrote:
>
>>
>> Hi list,
>>
>> I'm trying to build and run pacemaker from a custom location.
>>
>> Corosync starts up fine.
>>
>> Pacemakerd does not - the result is.
>
> Try turning up the debug to see why the cib isn't happy:
>
>> Apr 25 13:54:10 [10482] fcde02a2-cc41-4c58-b6d2-b7bb0bada436 pacemakerd: error: pcmk_child_exit: Child process cib exited (pid=10484, rc=100)
>> Apr 25 13:54:10 [10482] fcde02a2-cc41-4c58-b6d2-b7bb0bada436 pacemakerd: warning: pcmk_child_exit: Pacemaker child process cib no longer
>
>
>
Hi Andrew,
The debug log and strace are attached. The strace shows something
interesting:
5195 open("/dev/shm/qb-cpg-request-5173-5195-19-header", O_RDWR) = -1 EACCES (Permission denied)
I know Pacemaker uses shared memory to communicate. The permissions on
/dev/shm are (I think) correct:
root at 5627a5e1-9e30-4fe2-9178-6445e26a8ccc:~# ls -al /dev/shm/
total 8224
drwxrwx--- 2 root vcap 80 2013-04-26 10:30 .
drwxr-xr-x 12 root root 3900 2013-04-26 08:23 ..
-rw------- 1 root root 8388608 2013-04-26 10:30 qb-corosync-blackbox-data
-rw------- 1 root root 8248 2013-04-26 10:28 qb-corosync-blackbox-header
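A minimal sketch of the classic Unix permission-class check, to show how a
directory mode of drwxrwx--- can produce EACCES. It assumes the failing
process runs with the credentials the lrmd log in this message reports
(uid=1000 gid=0), that vcap has gid 1000 as the crm_user_lookup trace says,
and that the process has no supplementary groups. That last point is an
assumption, not something the logs confirm. The helper names are
hypothetical, and the model ignores ACLs and capabilities.

```python
import stat

# (search bit, write bit) for each permission class of a directory.
BITS = {
    "owner": (stat.S_IXUSR, stat.S_IWUSR),
    "group": (stat.S_IXGRP, stat.S_IWGRP),
    "other": (stat.S_IXOTH, stat.S_IWOTH),
}

def permission_class(uid, groups, owner_uid, owner_gid):
    """Return which permission triplet applies: exactly one is used."""
    if uid == owner_uid:
        return "owner"
    if owner_gid in groups:
        return "group"
    return "other"

def dir_access(uid, groups, owner_uid, owner_gid, mode):
    """Return (can_search, can_write) for a directory; uid 0 bypasses checks."""
    if uid == 0:
        return (True, True)
    cls = permission_class(uid, groups, owner_uid, owner_gid)
    search_bit, write_bit = BITS[cls]
    return (bool(mode & search_bit), bool(mode & write_bit))

# /dev/shm is owned by root (uid 0), group vcap (assumed gid 1000).
# Assumed process credentials: uid=1000, groups={0}.
print(dir_access(1000, {0}, 0, 1000, 0o770))  # falls into the "other" class
print(dir_access(1000, {0}, 0, 1000, 0o777))
```

Under these assumptions the process is neither the owner nor in the owning
group, so only the "other" bits apply, and with mode 0770 those grant
nothing.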
When I changed the permissions on /dev/shm to 777, things got a little
further: the CIB stays up, crmd respawns, and I get this over and over
again in the logs.
##################################
Apr 26 10:55:52 [5758] 5627a5e1-9e30-4fe2-9178-6445e26a8ccc lrmd:
info: crm_client_destroy: Destroying 0 events
Apr 26 10:55:54 [5758] 5627a5e1-9e30-4fe2-9178-6445e26a8ccc lrmd:
info: crm_client_new: Connecting 0x1a498e0 for uid=1000 gid=0
pid=5775 id=95b6eca5-a34e-49e5-b0f8-74b84857d690
Apr 26 10:55:54 [5758] 5627a5e1-9e30-4fe2-9178-6445e26a8ccc lrmd:
info: crm_client_destroy: Destroying 0 events
Apr 26 10:55:56 [5758] 5627a5e1-9e30-4fe2-9178-6445e26a8ccc lrmd:
info: crm_client_new: Connecting 0x1a498e0 for uid=1000 gid=0
pid=5775 id=117e515b-da4d-4842-9414-7b7d004e5c92
Apr 26 10:55:56 [5758] 5627a5e1-9e30-4fe2-9178-6445e26a8ccc lrmd:
info: crm_client_destroy: Destroying 0 events
Apr 26 10:55:58 [5758] 5627a5e1-9e30-4fe2-9178-6445e26a8ccc lrmd:
info: crm_client_new: Connecting 0x1a498e0 for uid=1000 gid=0
pid=5775 id=cf7c10b1-14a1-47d1-9e2e-30707254256f
Apr 26 10:55:58 [5758] 5627a5e1-9e30-4fe2-9178-6445e26a8ccc lrmd:
info: crm_client_destroy: Destroying 0 events
Apr 26 10:55:58 [5754] 5627a5e1-9e30-4fe2-9178-6445e26a8ccc pacemakerd:
error: pcmk_child_exit: Child process crmd exited (pid=5775, rc=2)
Apr 26 10:55:58 [5754] 5627a5e1-9e30-4fe2-9178-6445e26a8ccc pacemakerd:
trace: update_node_processes: Empty uname for node 839122954
Apr 26 10:55:58 [5754] 5627a5e1-9e30-4fe2-9178-6445e26a8ccc pacemakerd:
debug: update_node_processes: Node
5627a5e1-9e30-4fe2-9178-6445e26a8ccc now has process list:
00000000000000000000000000111112 (was 00000000000000000000000000111312)
Apr 26 10:55:58 [5754] 5627a5e1-9e30-4fe2-9178-6445e26a8ccc pacemakerd:
trace: update_process_clients: Sending process list to 0 children
Apr 26 10:55:58 [5754] 5627a5e1-9e30-4fe2-9178-6445e26a8ccc pacemakerd:
trace: update_process_peers: Sending <node
uname="5627a5e1-9e30-4fe2-9178-6445e26a8ccc" proclist="1118482"/>
Apr 26 10:55:58 [5754] 5627a5e1-9e30-4fe2-9178-6445e26a8ccc pacemakerd:
notice: pcmk_process_exit: Respawning failed child process: crmd
Apr 26 10:55:58 [5754] 5627a5e1-9e30-4fe2-9178-6445e26a8ccc pacemakerd:
info: start_child: Forked child 5789 for process crmd
Apr 26 10:55:58 [5754] 5627a5e1-9e30-4fe2-9178-6445e26a8ccc pacemakerd:
trace: update_node_processes: Empty uname for node 839122954
Apr 26 10:55:58 [5754] 5627a5e1-9e30-4fe2-9178-6445e26a8ccc pacemakerd:
debug: update_node_processes: Node
5627a5e1-9e30-4fe2-9178-6445e26a8ccc now has process list:
00000000000000000000000000111312 (was 00000000000000000000000000111112)
Apr 26 10:55:58 [5754] 5627a5e1-9e30-4fe2-9178-6445e26a8ccc pacemakerd:
trace: update_process_clients: Sending process list to 0 children
Apr 26 10:55:58 [5754] 5627a5e1-9e30-4fe2-9178-6445e26a8ccc pacemakerd:
trace: update_process_peers: Sending <node
uname="5627a5e1-9e30-4fe2-9178-6445e26a8ccc" proclist="1118994"/>
Apr 26 10:55:58 [5754] 5627a5e1-9e30-4fe2-9178-6445e26a8ccc pacemakerd:
trace: crm_user_lookup: Cluster user vcap has uid=1000 gid=1000
Apr 26 10:55:58 [5754] 5627a5e1-9e30-4fe2-9178-6445e26a8ccc pacemakerd:
trace: mainloop_gio_callback: New message from
corosync-cpg[0x21b1c60]
Apr 26 10:55:58 [5758] 5627a5e1-9e30-4fe2-9178-6445e26a8ccc lrmd:
info: crm_client_new: Connecting 0x1a498e0 for uid=1000 gid=0
pid=5789 id=5dfb6f5a-8b53-42f6-b5f5-61e49efa93dd
Apr 26 10:55:58 [5758] 5627a5e1-9e30-4fe2-9178-6445e26a8ccc lrmd:
info: crm_client_new: Connecting 0x1a636f0 for uid=1000 gid=0
pid=5789 id=3198d49f-8ff9-4799-9496-1b9aed0de807
Apr 26 10:55:58 [5758] 5627a5e1-9e30-4fe2-9178-6445e26a8ccc lrmd:
info: crm_client_destroy: Destroying 0 events
Apr 26 10:55:58 [5758] 5627a5e1-9e30-4fe2-9178-6445e26a8ccc lrmd:
info: crm_client_new: Connecting 0x1a56cb0 for uid=1000 gid=0
pid=5789 id=2713f990-2533-4fb8-82e0-31e40b1ef577
Apr 26 10:55:58 [5758] 5627a5e1-9e30-4fe2-9178-6445e26a8ccc lrmd:
info: crm_client_destroy: Destroying 0 events
Apr 26 10:55:58 [5758] 5627a5e1-9e30-4fe2-9178-6445e26a8ccc lrmd:
info: crm_client_new: Connecting 0x1a571f0 for uid=1000 gid=0
pid=5789 id=2bf401a2-3bd5-43af-9328-0a53bb61d9f7
Apr 26 10:55:58 [5758] 5627a5e1-9e30-4fe2-9178-6445e26a8ccc lrmd:
info: crm_client_destroy: Destroying 0 events
Apr 26 10:55:58 [5758] 5627a5e1-9e30-4fe2-9178-6445e26a8ccc lrmd:
info: crm_client_destroy: Destroying 0 events
Apr 26 10:56:00 [5758] 5627a5e1-9e30-4fe2-9178-6445e26a8ccc lrmd:
info: crm_client_new: Connecting 0x1a498e0 for uid=1000 gid=0
pid=5789 id=7233fbec-3633-4a48-8fe7-3028bfa58029
Apr 26 10:56:00 [5758] 5627a5e1-9e30-4fe2-9178-6445e26a8ccc lrmd:
info: crm_client_destroy: Destroying 0 events
Apr 26 10:56:02 [5758] 5627a5e1-9e30-4fe2-9178-6445e26a8ccc lrmd:
info: crm_client_new: Connecting 0x1a498e0 for uid=1000 gid=0
pid=5789 id=a7b76888-7137-4eb1-888d-d7a3ea273a4f
Apr 26 10:56:02 [5758] 5627a5e1-9e30-4fe2-9178-6445e26a8ccc lrmd:
info: crm_client_destroy: Destroying 0 events
Apr 26 10:56:04 [5758] 5627a5e1-9e30-4fe2-9178-6445e26a8ccc lrmd:
info: crm_client_new: Connecting 0x1a498e0 for uid=1000 gid=0
pid=5789 id=4fbd695d-902b-4a29-957f-8d36fd072178
Apr 26 10:56:04 [5758] 5627a5e1-9e30-4fe2-9178-6445e26a8ccc lrmd:
info: crm_client_destroy: Destroying 0 events
Apr 26 10:56:06 [5758] 5627a5e1-9e30-4fe2-9178-6445e26a8ccc lrmd:
info: crm_client_new: Connecting 0x1a498e0 for uid=1000 gid=0
pid=5789 id=a3e00689-d842-456d-957a-22e2e4e7eedf
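To make the reconnect loop easier to see, here is a rough sketch that counts
crm_client_new entries per (uid, gid, pid). It assumes the log text is pasted
as it appears in this mail, with entries wrapped across lines; the function
name and the regular expression are illustrative, not part of Pacemaker.

```python
import re
from collections import Counter

# Matches the fields of a crm_client_new entry once whitespace is collapsed.
CLIENT_NEW = re.compile(
    r"crm_client_new:\s+Connecting \S+ for "
    r"uid=(?P<uid>\d+) gid=(?P<gid>\d+)\s+pid=(?P<pid>\d+)"
)

def count_connections(log_text):
    """Count lrmd client connections per (uid, gid, pid)."""
    # Mail wrapping splits each entry over several lines; collapsing all
    # whitespace rejoins them before matching.
    flat = " ".join(log_text.split())
    return Counter(
        (int(m["uid"]), int(m["gid"]), int(m["pid"]))
        for m in CLIENT_NEW.finditer(flat)
    )

sample = """
Apr 26 10:55:54 [5758] 5627a5e1-9e30-4fe2-9178-6445e26a8ccc lrmd:
info: crm_client_new: Connecting 0x1a498e0 for uid=1000 gid=0
pid=5775 id=95b6eca5-a34e-49e5-b0f8-74b84857d690
Apr 26 10:55:56 [5758] 5627a5e1-9e30-4fe2-9178-6445e26a8ccc lrmd:
info: crm_client_new: Connecting 0x1a498e0 for uid=1000 gid=0
pid=5775 id=117e515b-da4d-4842-9414-7b7d004e5c92
"""
print(count_connections(sample))  # Counter({(1000, 0, 5775): 2})
```

Run against the full log, a high count for a single crmd pid would confirm
that the same process keeps connecting and being dropped before it exits.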
##################
SHM while running...
#####################
root at 5627a5e1-9e30-4fe2-9178-6445e26a8ccc:~# ls -al /dev/shm/
total 34936
drwxrwxrwx 2 root vcap 1280 2013-04-26 10:57 .
drwxr-xr-x 12 root root 3900 2013-04-26 08:23 ..
-rw------- 1 root root 1048576 2013-04-26 10:54
qb-cfg-event-5598-5754-16-data
-rw------- 1 root root 8248 2013-04-26 10:54
qb-cfg-event-5598-5754-16-header
-rw------- 1 root root 1048576 2013-04-26 10:54
qb-cfg-request-5598-5754-16-data
-rw------- 1 root root 8252 2013-04-26 10:54
qb-cfg-request-5598-5754-16-header
-rw------- 1 root root 1048576 2013-04-26 10:54
qb-cfg-response-5598-5754-16-data
-rw------- 1 root root 8248 2013-04-26 10:54
qb-cfg-response-5598-5754-16-header
-rw-rw---- 1 vcap root 524288 2013-04-26 10:54
qb-cib_rw-event-5756-5757-9-data
-rw-rw---- 1 vcap root 8248 2013-04-26 10:54
qb-cib_rw-event-5756-5757-9-header
-rw-rw---- 1 vcap root 524288 2013-04-26 10:54
qb-cib_rw-event-5756-5759-10-data
-rw-rw---- 1 vcap root 8248 2013-04-26 10:54
qb-cib_rw-event-5756-5759-10-header
-rw-rw---- 1 vcap root 524288 2013-04-26 10:54
qb-cib_rw-request-5756-5757-9-data
-rw-rw---- 1 vcap root 8252 2013-04-26 10:54
qb-cib_rw-request-5756-5757-9-header
-rw-rw---- 1 vcap root 524288 2013-04-26 10:54
qb-cib_rw-request-5756-5759-10-data
-rw-rw---- 1 vcap root 8252 2013-04-26 10:54
qb-cib_rw-request-5756-5759-10-header
-rw-rw---- 1 vcap root 524288 2013-04-26 10:54
qb-cib_rw-response-5756-5757-9-data
-rw-rw---- 1 vcap root 8248 2013-04-26 10:54
qb-cib_rw-response-5756-5757-9-header
-rw-rw---- 1 vcap root 524288 2013-04-26 10:54
qb-cib_rw-response-5756-5759-10-data
-rw-rw---- 1 vcap root 8248 2013-04-26 10:54
qb-cib_rw-response-5756-5759-10-header
-rw-rw---- 1 vcap root 524288 2013-04-26 10:56
qb-cib_shm-event-5756-5808-7-data
-rw-rw---- 1 vcap root 8248 2013-04-26 10:56
qb-cib_shm-event-5756-5808-7-header
-rw-rw---- 1 vcap root 524288 2013-04-26 10:56
qb-cib_shm-request-5756-5808-7-data
-rw-rw---- 1 vcap root 8252 2013-04-26 10:56
qb-cib_shm-request-5756-5808-7-header
-rw-rw---- 1 vcap root 524288 2013-04-26 10:56
qb-cib_shm-response-5756-5808-7-data
-rw-rw---- 1 vcap root 8248 2013-04-26 10:56
qb-cib_shm-response-5756-5808-7-header
-rw------- 1 root root 8388608 2013-04-26 10:56 qb-corosync-blackbox-data
-rw------- 1 root root 8248 2013-04-26 10:47 qb-corosync-blackbox-header
-rw------- 1 root root 1048576 2013-04-26 10:54
qb-cpg-event-5598-5754-17-data
-rw------- 1 root root 8248 2013-04-26 10:54
qb-cpg-event-5598-5754-17-header
-rw------- 1 vcap root 1048576 2013-04-26 10:54
qb-cpg-event-5598-5756-19-data
-rw------- 1 vcap root 8248 2013-04-26 10:54
qb-cpg-event-5598-5756-19-header
-rw------- 1 root root 1048576 2013-04-26 10:54
qb-cpg-event-5598-5757-18-data
-rw------- 1 root root 8248 2013-04-26 10:54
qb-cpg-event-5598-5757-18-header
-rw------- 1 vcap root 1048576 2013-04-26 10:54
qb-cpg-event-5598-5759-20-data
-rw------- 1 vcap root 8248 2013-04-26 10:54
qb-cpg-event-5598-5759-20-header
-rw------- 1 vcap root 1048576 2013-04-26 10:56
qb-cpg-event-5598-5808-21-data
-rw------- 1 vcap root 8248 2013-04-26 10:56
qb-cpg-event-5598-5808-21-header
-rw------- 1 root root 1048576 2013-04-26 10:54
qb-cpg-request-5598-5754-17-data
-rw------- 1 root root 8252 2013-04-26 10:54
qb-cpg-request-5598-5754-17-header
-rw------- 1 vcap root 1048576 2013-04-26 10:54
qb-cpg-request-5598-5756-19-data
-rw------- 1 vcap root 8252 2013-04-26 10:54
qb-cpg-request-5598-5756-19-header
-rw------- 1 root root 1048576 2013-04-26 10:54
qb-cpg-request-5598-5757-18-data
-rw------- 1 root root 8252 2013-04-26 10:54
qb-cpg-request-5598-5757-18-header
-rw------- 1 vcap root 1048576 2013-04-26 10:54
qb-cpg-request-5598-5759-20-data
-rw------- 1 vcap root 8252 2013-04-26 10:54
qb-cpg-request-5598-5759-20-header
-rw------- 1 vcap root 1048576 2013-04-26 10:56
qb-cpg-request-5598-5808-21-data
-rw------- 1 vcap root 8252 2013-04-26 10:56
qb-cpg-request-5598-5808-21-header
-rw------- 1 root root 1048576 2013-04-26 10:54
qb-cpg-response-5598-5754-17-data
-rw------- 1 root root 8248 2013-04-26 10:54
qb-cpg-response-5598-5754-17-header
-rw------- 1 vcap root 1048576 2013-04-26 10:54
qb-cpg-response-5598-5756-19-data
-rw------- 1 vcap root 8248 2013-04-26 10:54
qb-cpg-response-5598-5756-19-header
-rw------- 1 root root 1048576 2013-04-26 10:54
qb-cpg-response-5598-5757-18-data
-rw------- 1 root root 8248 2013-04-26 10:54
qb-cpg-response-5598-5757-18-header
-rw------- 1 vcap root 1048576 2013-04-26 10:54
qb-cpg-response-5598-5759-20-data
-rw------- 1 vcap root 8248 2013-04-26 10:54
qb-cpg-response-5598-5759-20-header
-rw------- 1 vcap root 1048576 2013-04-26 10:56
qb-cpg-response-5598-5808-21-data
-rw------- 1 vcap root 8248 2013-04-26 10:56
qb-cpg-response-5598-5808-21-header
-rw------- 1 vcap root 1048576 2013-04-26 10:56
qb-quorum-event-5598-5808-22-data
-rw------- 1 vcap root 8248 2013-04-26 10:56
qb-quorum-event-5598-5808-22-header
-rw------- 1 vcap root 1048576 2013-04-26 10:56
qb-quorum-request-5598-5808-22-data
-rw------- 1 vcap root 8252 2013-04-26 10:56
qb-quorum-request-5598-5808-22-header
-rw------- 1 vcap root 1048576 2013-04-26 10:56
qb-quorum-response-5598-5808-22-data
-rw------- 1 vcap root 8248 2013-04-26 10:56
qb-quorum-response-5598-5808-22-header
#####################################
snippets from pacemaker-strace after chmod 777 /dev/shm
###################
CIB
5833 chown("/dev/shm/qb-cib_shm-event-5833-5858-7-data", 4294967295, 1000) = -1 EPERM (Operation not permitted)
5833 chown("/dev/shm/qb-cib_shm-event-5833-5858-7-header", 4294967295, 1000) = -1 EPERM (Operation not permitted)
5833 chmod("/dev/shm/qb-cib_shm-event-5833-5858-7-data", 0660) = 0
5833 chmod("/dev/shm/qb-cib_shm-event-5833-5858-7-header", 0660) = 0
####################
CRMD
5838 connect(3, {sa_family=AF_FILE, path=@"cib_shm"}, 110) = -1 ECONNREFUSED (Connection refused)
5838 close(3) = 0
5838 shutdown(4294967295, 2 /* send and receive */) = -1 EBADF (Bad file descriptor)
5838 close(4294967295) = -1 EBADF (Bad file descriptor)
5838 write(2, "Could not establish cib_shm conn"..., 65) = 65
5838 clock_gettime(CLOCK_REALTIME, {1366973927, 255600506}) = 0
5838 munmap(0x7f6c1bcc3000, 528384) = 0
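A short sketch for reading the strace values above. strace prints -1 as the
unsigned 32-bit value 4294967295; for chown(2) an owner of -1 means "leave
the owner unchanged", so those calls only try to set the group to 1000. The
second helper is a simplified model of the Linux rule for changing a file's
group. It assumes the caller has uid=1000 and only group 0, which is inferred
from the lrmd log lines rather than confirmed. Both function names are
hypothetical.

```python
def as_signed32(value):
    """Reinterpret an unsigned 32-bit value, as strace prints it, as signed."""
    return value - (1 << 32) if value >= (1 << 31) else value

def may_change_group(euid, groups, file_owner, new_gid):
    """Simplified rule: uid 0 may set any group; anyone else must own the
    file and already be a member of the target group."""
    if euid == 0:
        return True
    return euid == file_owner and new_gid in groups

print(as_signed32(4294967295))                   # -1
print(may_change_group(1000, {0}, 1000, 1000))   # False, consistent with EPERM
print(may_change_group(1000, {0, 1000}, 1000, 1000))  # True
```

Read this way, the shutdown() and close() calls in the CRMD excerpt are
operating on a descriptor of -1, which would explain the EBADF results.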
#########################
This is looking more and more like a permissions problem on the files
read and written in shared memory.
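One way to test that hypothesis is to look at the credentials the running
daemons actually hold, since group membership is fixed when a process starts.
The sketch below parses the Uid, Gid and Groups lines of a Linux
/proc/<pid>/status file. The sample text is made up for illustration and is
not taken from the attached logs; the function names are hypothetical.

```python
def parse_credentials(status_text):
    """Extract the Uid, Gid and Groups fields from /proc/<pid>/status text."""
    creds = {}
    for line in status_text.splitlines():
        key, _, value = line.partition(":")
        if key in ("Uid", "Gid", "Groups"):
            creds[key] = [int(v) for v in value.split()]
    return creds

def read_credentials(pid):
    """Read and parse the credentials of a live process (Linux only)."""
    with open(f"/proc/{pid}/status") as fh:
        return parse_credentials(fh.read())

# Hypothetical status excerpt; Uid and Gid list the real, effective,
# saved and filesystem IDs in that order.
sample_status = "Name:\tcib\nUid:\t1000\t1000\t1000\t1000\nGid:\t0\t0\t0\t0\nGroups:\t0\n"
print(parse_credentials(sample_status))
```

Calling read_credentials() with the pid of cib or crmd would show whether
gid 1000 appears anywhere in the process's group list.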
I read http://www.ultrabug.fr/pacemaker-vulnerability-and-v1-1-9-release/
and added root to group vcap, and vcap to group root (vcap is my
equivalent of the haclient user/group). There was no change in behavior.
I did add "--with-acls" at compile time, but I'm not planning on using
ACLs.
regards
James M
-------------- next part --------------
A non-text attachment was scrubbed...
Name: corosync.log.gz
Type: application/x-gzip
Size: 5019 bytes
Desc: not available
URL: <https://lists.clusterlabs.org/pipermail/pacemaker/attachments/20130426/cdabe191/attachment-0004.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: pacemaker.strace.gz
Type: application/x-gzip
Size: 67528 bytes
Desc: not available
URL: <https://lists.clusterlabs.org/pipermail/pacemaker/attachments/20130426/cdabe191/attachment-0005.bin>