[Pacemaker] [Linux-HA] Revenge of the cluster-glue clplumbing ABI change (a public service announcement)
Simon Horman
horms at verge.net.au
Tue Aug 17 07:50:27 UTC 2010
On Wed, Jul 21, 2010 at 01:41:09AM -0600, Tim Serong wrote:
> Hi All,
>
> A while ago (April, from memory), there was an ABI change in
> clplumbing in cluster-glue. Presumably this went mostly unnoticed
> in general usage, however I have twice seen systems where the cluster
> could not run because of a missing (or incorrect) libglue2 package.
> One was my development system, with a dodgy build, the other was
> mentioned on #linux-ha yesterday, and was the result of ignoring a
> conflict error when installing the pacemaker RPM on openSUSE. So,
> let me be clear, this is not something anyone should need to worry
> about... But I thought I'd mention it here, because the error
> messages you get are, IMO, not very obvious.
>
> Symptoms of a mismatched pacemaker/libglue build are errors like:
>
> lrmd: [3004]: ERROR: main: can not create wait connection for command.
> lrmd: [3004]: ERROR: Startup aborted (can't create comm channel). Shutting down.
> ...
> pengine: [4011]: ERROR: init_client_ipc_comms_nodispatch: Could not access channel on: /var/run/crm/pengine
> corosync[4000]: [pcmk ] ERROR: pcmk_wait_dispatch: Child process pengine exited (pid=4011, rc=1)
> corosync[4000]: [pcmk ] notice: pcmk_wait_dispatch: Respawning failed child process: pengine
>
> If your cluster won't start and you see this in /var/log/messages,
> make sure libglue2 is up to date. And now that I've mentioned this
> here and it's made it to the mailing list archive, Google will know,
> and nobody else will ever have this problem again.
>
> This has been a public service announcement. Thank you for reading.

Could we get the .so version bumped accordingly in the next release of
cluster-glue? That would at least help in managing the problem
once the new release has been made.
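
Until a version bump makes the mismatch show up at install time, the symptoms Tim quoted can be looked for mechanically. What follows is only a rough sketch in Python: the message fragments are copied from the log excerpt above, while the function name find_symptoms and the sample lines are made up for illustration. A match is a hint to check the libglue2 package, not proof of an ABI mismatch, since these errors may have other causes.

```python
# Message fragments copied from the errors quoted in this thread.
# Matching on fragments (not whole lines) tolerates the timestamp and
# hostname prefixes that syslog normally adds.
SYMPTOMS = (
    "can not create wait connection for command",
    "Startup aborted (can't create comm channel)",
    "init_client_ipc_comms_nodispatch: Could not access channel on",
)


def find_symptoms(lines):
    """Return (line_number, text) for each line containing a known symptom.

    `lines` is any iterable of strings, e.g. an open log file.
    The function name is hypothetical, chosen for this sketch.
    """
    hits = []
    for number, line in enumerate(lines, start=1):
        if any(symptom in line for symptom in SYMPTOMS):
            hits.append((number, line.rstrip("\n")))
    return hits


# Illustrative input only: lines modelled on the excerpt above.
sample = [
    "lrmd: [3004]: ERROR: main: can not create wait connection for command.",
    "corosync[4000]: [pcmk ] notice: pcmk_wait_dispatch: Respawning failed child process: pengine",
    "pengine: [4011]: ERROR: init_client_ipc_comms_nodispatch: Could not access channel on: /var/run/crm/pengine",
]

for number, text in find_symptoms(sample):
    print(number, text)
```

To scan a real log, pass an open file instead of the sample list, for example find_symptoms(open("/var/log/messages", errors="replace")), assuming the log lives at the path Tim mentioned and is readable by the user running the script.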