[Pacemaker] solaris problem

Andrei Belov defanator at gmail.com
Mon Mar 25 07:30:41 EDT 2013


Andreas,

I just tried "PCMK_ipc_type=socket pacemaker -fV", and a bunch of additional "event_send" errors appeared:

Mar 25 11:15:55 [33641] ha1 corosync error   [MAIN  ] event_send retuned -32, expected 256!
Mar 25 11:15:55 [33641] ha1 corosync error   [SERV  ] event_send retuned -32, expected 217!
Mar 25 11:15:55 [33641] ha1 corosync error   [SERV  ] event_send retuned -32, expected 219!
Mar 25 11:15:55 [33641] ha1 corosync error   [SERV  ] event_send retuned -32, expected 223!
Mar 25 11:15:55 [33641] ha1 corosync error   [SERV  ] event_send retuned -32, expected 223!
Mar 25 11:15:55 [33641] ha1 corosync error   [SERV  ] event_send retuned -32, expected 223!
Mar 25 11:15:55 [33641] ha1 corosync error   [SERV  ] event_send retuned -32, expected 223!
Mar 25 11:15:55 [33641] ha1 corosync error   [SERV  ] event_send retuned -32, expected 223!
Mar 25 11:15:55 [33641] ha1 corosync error   [SERV  ] event_send retuned -32, expected 223!
Mar 25 11:15:55 [33641] ha1 corosync error   [SERV  ] event_send retuned -32, expected 256!
Mar 25 11:15:55 [53980]    pengine:    error: qb_ipcs_us_publish:       Could not bind AF_UNIX (/var/run/pengine): Permission denied (13)
Mar 25 11:15:55 [53980]    pengine:    error: mainloop_add_ipc_server:  Could not start pengine IPC server: Unknown error (-13)
Mar 25 11:15:55 [53980]    pengine:    error: main:     Couldn't start IPC server
Mar 25 11:15:55 [53975] pacemakerd:    error: pcmk_child_exit:  Child process pengine exited (pid=53980, rc=1)
Mar 25 11:15:55 [33641] ha1 corosync error   [SERV  ] event_send retuned -32, expected 256!
Mar 25 11:15:55 [33641] ha1 corosync error   [SERV  ] event_send retuned -32, expected 223!
Mar 25 11:15:55 [53979]      attrd:    error: qb_ipcs_us_publish:       Could not bind AF_UNIX (/var/run/attrd): Permission denied (13)
Mar 25 11:15:55 [53979]      attrd:    error: mainloop_add_ipc_server:  Could not start attrd IPC server: Unknown error (-13)
Mar 25 11:15:55 [53979]      attrd:    error: main:     Could not start IPC server
Mar 25 11:15:55 [53979]      attrd:    error: main:     Aborting startup
Mar 25 11:15:55 [33641] ha1 corosync error   [SERV  ] event_send retuned -32, expected 223!
Mar 25 11:15:55 [53975] pacemakerd:    error: pcmk_child_exit:  Child process attrd exited (pid=53979, rc=100)
Mar 25 11:15:55 [33641] ha1 corosync error   [SERV  ] event_send retuned -32, expected 223!
Mar 25 11:15:55 [33641] ha1 corosync error   [SERV  ] event_send retuned -32, expected 223!
Mar 25 11:15:55 [33641] ha1 corosync error   [SERV  ] event_send retuned -32, expected 223!
Mar 25 11:15:55 [33641] ha1 corosync error   [SERV  ] event_send retuned -32, expected 223!
Mar 25 11:15:55 [33641] ha1 corosync error   [SERV  ] event_send retuned -32, expected 256!
Mar 25 11:15:55 [53976]        cib:    error: qb_ipcs_us_publish:       Could not bind AF_UNIX (/var/run/cib_ro): Permission denied (13)
Mar 25 11:15:55 [53976]        cib:    error: mainloop_add_ipc_server:  Could not start cib_ro IPC server: Unknown error (-13)
Mar 25 11:15:55 [53976]        cib:    error: qb_ipcs_us_publish:       Could not bind AF_UNIX (/var/run/cib_rw): Permission denied (13)
Mar 25 11:15:55 [53976]        cib:    error: mainloop_add_ipc_server:  Could not start cib_rw IPC server: Unknown error (-13)
Mar 25 11:15:55 [53976]        cib:    error: qb_ipcs_us_publish:       Could not bind AF_UNIX (/var/run/cib_shm): Permission denied (13)
Mar 25 11:15:55 [53976]        cib:    error: mainloop_add_ipc_server:  Could not start cib_shm IPC server: Unknown error (-13)
Mar 25 11:15:55 [53976]        cib:    error: cib_init:         Couldnt start all IPC channels, exiting.
Mar 25 11:15:55 [53975] pacemakerd:    error: pcmk_child_exit:  Child process cib exited (pid=53976, rc=255)
Mar 25 11:15:55 [33641] ha1 corosync error   [SERV  ] event_send retuned -32, expected 223!
Mar 25 11:16:04 [53977] stonith-ng:    error: setup_cib:        Could not connect to the CIB service: -134 fffffd7fc421a0b0
Mar 25 11:16:04 [33641] ha1 corosync error   [SERV  ] event_send retuned -32, expected 217!
Mar 25 11:16:04 [53975] pacemakerd:   notice: pcmk_shutdown_worker:     Attempting to inhibit respawning after fatal error
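
The "Could not bind AF_UNIX (/var/run/...): Permission denied (13)" errors above could be reproduced outside pacemaker with a small probe. This is only a minimal sketch, assuming Python is available on the node; probe_unix_bind is a hypothetical helper and not part of libqb or pacemaker. It would need to run as the same user as the failing daemon to be meaningful.

```python
import os
import socket


def probe_unix_bind(path):
    """Try to bind an AF_UNIX stream socket at `path`.

    Returns 0 on success, or the errno value on failure
    (for example errno.EACCES for "Permission denied").
    The socket file is removed again after a successful bind.
    """
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        sock.bind(path)
    except OSError as exc:
        # bind() failed: report the raw errno so it can be compared with the log.
        return exc.errno
    finally:
        sock.close()
    # bind() succeeded and left a socket file behind; clean it up.
    os.unlink(path)
    return 0
```

Calling probe_unix_bind("/var/run/ipc_probe_test") (a made-up file name chosen so that it does not clash with a real socket) and getting 13 back would suggest the directory permissions, rather than libqb itself, are what blocks the bind.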


# fgrep 32 /usr/include/sys/errno.h 
#define EPIPE   32      /* Broken pipe                          */
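
To see which error codes dominate a log like the one above, the negative return values can be tallied and mapped to names. The following is a rough sketch: the regular expression is an assumption about the line format, tally_error_codes is a hypothetical helper, and the name table holds only values mentioned in this thread (13 from the "Permission denied (13)" messages, 32 from the errno.h lookup above, 48 as ENOTSUP per the original report). The numbers apply to this Solaris system; errno values differ between platforms.

```python
import re
from collections import Counter

# errno names for this Solaris system, limited to values named in this thread.
# The table is platform-specific and should not be reused elsewhere.
SOLARIS_ERRNO = {13: "EACCES", 32: "EPIPE", 48: "ENOTSUP"}

# Matches "retuned -32" (corosync, spelled as logged) and "error (-13)" (pacemaker).
CODE_RE = re.compile(r"(?:retuned |error \()-(\d+)")


def tally_error_codes(lines):
    """Count negative return codes found in log lines, keyed by errno name."""
    counts = Counter()
    for line in lines:
        match = CODE_RE.search(line)
        if match:
            code = int(match.group(1))
            # Unknown codes are kept under a generic "errno N" label.
            counts[SOLARIS_ERRNO.get(code, "errno %d" % code)] += 1
    return counts


# Three lines copied from the logs in this thread.
sample = [
    "Mar 25 11:15:55 [33641] ha1 corosync error   [SERV  ] event_send retuned -32, expected 223!",
    "Mar 25 11:15:55 [53979]      attrd:    error: mainloop_add_ipc_server:  Could not start attrd IPC server: Unknown error (-13)",
    "Mar 25 09:21:26 [33720]       lrmd:    error: mainloop_add_ipc_server:  Could not start lrmd IPC server: Unknown error (-48)",
]
print(tally_error_codes(sample))
```

Lines that carry only a positive code, such as "Permission denied (13)", are deliberately not matched, so each failed bind is counted once through its paired "Unknown error (-13)" line.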



On Mar 25, 2013, at 13:55 , "Grüninger, Andreas (LGL Extern)" <Andreas.Grueninger at lgl.bwl.de> wrote:

> With Solaris/OpenIndiana you should use this setting:
> export PCMK_ipc_type=socket
> 
> Andreas
> 
> -----Original Message-----
> From: Andrei Belov [mailto:defanator at gmail.com]
> Sent: Monday, 25 March 2013 10:43
> To: pacemaker at oss.clusterlabs.org
> Subject: [Pacemaker] solaris problem
> 
> Hi folks,
> 
> I'm trying to build a test HA cluster on Solaris 5.11 using libqb 0.14.4, corosync 2.3.0 and pacemaker 1.1.8, and I'm facing a strange problem when starting pacemaker.
> 
> The log shows the following errors:
> 
> Mar 25 09:21:26 [33720]       lrmd:    error: mainloop_add_ipc_server:  Could not start lrmd IPC server: Unknown error (-48)
> Mar 25 09:21:26 [33720]       lrmd:    error: try_server_create:        New IPC server could not be created because another lrmd process exists, sending shutdown command to old lrmd process.
> Mar 25 09:21:26 [33720]       lrmd:    error: mainloop_add_ipc_server:  Could not start lrmd IPC server: Unknown error (-48)
> Mar 25 09:21:26 [33720]       lrmd:    error: try_server_create:        New IPC server could not be created because another lrmd process exists, sending shutdown command to old lrmd process.
> Mar 25 09:21:26 [33720]       lrmd:    error: mainloop_add_ipc_server:  Could not start lrmd IPC server: Unknown error (-48)
> Mar 25 09:21:26 [33720]       lrmd:    error: try_server_create:        New IPC server could not be created because another lrmd process exists, sending shutdown command to old lrmd process.
> Mar 25 09:21:26 [33720]       lrmd:    error: mainloop_add_ipc_server:  Could not start lrmd IPC server: Unknown error (-48)
> Mar 25 09:21:26 [33720]       lrmd:    error: try_server_create:        New IPC server could not be created because another lrmd process exists, sending shutdown command to old lrmd process.
> Mar 25 09:21:26 [33720]       lrmd:    error: mainloop_add_ipc_server:  Could not start lrmd IPC server: Unknown error (-48)
> Mar 25 09:21:26 [33720]       lrmd:    error: try_server_create:        New IPC server could not be created because another lrmd process exists, sending shutdown command to old lrmd process.
> Mar 25 09:21:26 [33720]       lrmd:    error: mainloop_add_ipc_server:  Could not start lrmd IPC server: Unknown error (-48)
> Mar 25 09:21:26 [33720]       lrmd:    error: try_server_create:        New IPC server could not be created because another lrmd process exists, sending shutdown command to old lrmd process.
> Mar 25 09:21:26 [33720]       lrmd:    error: mainloop_add_ipc_server:  Could not start lrmd IPC server: Unknown error (-48)
> Mar 25 09:21:26 [33720]       lrmd:    error: try_server_create:        New IPC server could not be created because another lrmd process exists, sending shutdown command to old lrmd process.
> Mar 25 09:21:26 [33720]       lrmd:    error: mainloop_add_ipc_server:  Could not start lrmd IPC server: Unknown error (-48)
> Mar 25 09:21:26 [33720]       lrmd:    error: try_server_create:        New IPC server could not be created because another lrmd process exists, sending shutdown command to old lrmd process.
> Mar 25 09:21:26 [33720]       lrmd:    error: mainloop_add_ipc_server:  Could not start lrmd IPC server: Unknown error (-48)
> Mar 25 09:21:26 [33720]       lrmd:    error: try_server_create:        New IPC server could not be created because another lrmd process exists, sending shutdown command to old lrmd process.
> Mar 25 09:21:26 [33720]       lrmd:    error: mainloop_add_ipc_server:  Could not start lrmd IPC server: Unknown error (-48)
> Mar 25 09:21:26 [33720]       lrmd:    error: try_server_create:        New IPC server could not be created because another lrmd process exists, sending shutdown command to old lrmd process.
> Mar 25 09:21:26 [33720]       lrmd:    error: main:     Failed to allocate lrmd server.  shutting down
> Mar 25 09:21:26 [33722]    pengine:    error: mainloop_add_ipc_server:  Could not start pengine IPC server: Unknown error (-48)
> Mar 25 09:21:26 [33722]    pengine:    error: main:     Couldn't start IPC server
> Mar 25 09:21:26 [33717] pacemakerd:    error: pcmk_child_exit:  Child process lrmd exited (pid=33720, rc=255)
> Mar 25 09:21:26 [33721]      attrd:    error: qb_ipcs_us_publish:       Could not bind AF_UNIX (/var/run/attrd): Permission denied (13)
> Mar 25 09:21:26 [33721]      attrd:    error: mainloop_add_ipc_server:  Could not start attrd IPC server: Unknown error (-13)
> Mar 25 09:21:26 [33721]      attrd:    error: main:     Could not start IPC server
> Mar 25 09:21:26 [33721]      attrd:    error: main:     Aborting startup
> Mar 25 09:21:26 [33717] pacemakerd:    error: pcmk_child_exit:  Child process pengine exited (pid=33722, rc=1)
> Mar 25 09:21:26 [33717] pacemakerd:    error: pcmk_child_exit:  Child process attrd exited (pid=33721, rc=100)
> Mar 25 09:21:26 [33718]        cib:    error: qb_ipcs_us_publish:       Could not bind AF_UNIX (/var/run/cib_ro): Permission denied (13)
> Mar 25 09:21:26 [33718]        cib:    error: mainloop_add_ipc_server:  Could not start cib_ro IPC server: Unknown error (-13)
> Mar 25 09:21:26 [33718]        cib:    error: qb_ipcs_us_publish:       Could not bind AF_UNIX (/var/run/cib_rw): Permission denied (13)
> Mar 25 09:21:26 [33718]        cib:    error: mainloop_add_ipc_server:  Could not start cib_rw IPC server: Unknown error (-13)
> Mar 25 09:21:26 [33718]        cib:    error: mainloop_add_ipc_server:  Could not start cib_shm IPC server: Unknown error (-48)
> Mar 25 09:21:26 [33718]        cib:    error: cib_init:         Couldnt start all IPC channels, exiting.
> Mar 25 09:21:26 [33717] pacemakerd:    error: pcmk_child_exit:  Child process cib exited (pid=33718, rc=255)
> Mar 25 09:21:35 [33719] stonith-ng:    error: setup_cib:        Could not connect to the CIB service: -134 fffffd7fc421a0b0
> Mar 25 09:21:35 [33717] pacemakerd:   notice: pcmk_shutdown_worker:     Attempting to inhibit respawning after fatal error
> 
> The full log is attached in case I have missed anything.
> 
> I'd like to know the reason for "Unknown error (-48)". On this system, 48 in errno.h is "ENOTSUP", but I haven't found the exact place in the code where it could be returned, so I'm not sure about that.
> 
> Just for the record: I'm able to run corosync on two nodes and see them connected without any visible problems, so I suppose something may be wrong with either pacemaker or libqb.
> 
> Any help will be greatly appreciated!
> 
> Thanks,
> Andrei.
> 
> 
> 




