[Pacemaker] unexpected quorum votes in new debian packages

Andrew Beekhof andrew at beekhof.net
Fri Feb 5 06:03:44 EST 2010


On Thu, Feb 4, 2010 at 9:16 PM, Michael Schwartzkopff <misch at multinet.de> wrote:
> On Thursday, 4 February 2010 13:58:45, Andrew Beekhof wrote:
>> On Wed, Feb 3, 2010 at 4:00 PM, Michael Schwartzkopff <misch at multinet.de> wrote:
>> > Hi,
>> >
>> > I just installed the new packages from madkiss on a freshly
>> > installed lenny.
>> >
>> > crm_mon shows two nodes online but "unexpected quorum votes" and the
>> > partition has no quorum.
>> >
>> > Logs under http://www.pastie.org/807580
>>
>> Debug would be useful.
>> This is readily reproducible, right?
>>
>> What happens if you try to set a value for expected-quorum-votes?
>
>
> If I use the old package (1.0.6) from the repository the error does NOT
> occur. So it seems to be a new bug.

I find that very hard to imagine.
Here's the complete list of possibly relevant changes:

[11:35 AM] beekhof at mobile ~/Development/pacemaker/stable-1.0 # hg log -M --template "  + {desc|firstline|strip} CS: {node|short}\n" -r Pacemaker-1.0.6:tip crmd cib lib | sort
  + Dev: cib: Fix minor build issue CS: e7357bc6ea31
  + Dev: cib: Repair the TLS send/recv logic for remote connections CS: 1e08614f9e46
  + Dev: pengine: crmd: Ensure help text includes correct binary name CS: 4f90f1765ad7
  + High: PE: Bug lf#2216 - Correctly identify the state of anonymous clones when deciding when to probe CS: b10262d6a873
  + High: cib: Ensure the loop for login message terminates CS: 2cfec71918a0
  + High: cib: Finally fix reliability of receiving large messages over remote plaintext connections CS: f44e9e3774ab
  + High: cib: Fix remote notifications CS: 611080fd3b6d
  + High: cib: For remote connections, default to CRM_DAEMON_USER since thats the only one that the cib can validate the password for using PAM CS: 0d5ffd67b5f4
  + High: cib: Remote plaintext - Retry sending parts of the message that didn't fit the first time CS: 971d8989e9f0
  + High: crmd: Ensure batch-limit is correctly enforced CS: a133c5148932
  + High: crmd: Ensure we have the latest status after a transition abort CS: 4362e591a273
  + Low: PE: Bug lf#2251 - Don't log uninstalled resource agents as errors CS: cd2aaf7e35cf
  + Low: cib: Allow get_channel_token() to return errors CS: faf0008c9c34
  + Low: cib: Check also whether an user's primary group is matched CS: 3be61bc8f7f6
  + Low: cib: Downgrade development logging CS: f7a8250d23fc
  + Low: cib: Re-calling cib_recv_remote_msg() when there is no TLS message causes a deadlock CS: d22a6373bcf3
  + Low: crmd: C_TIMER_POPPED is now quite normal thanks to the recheck timer, downgrade log message CS: 5c842bbe55a1
  + Low: pengine/crmd: move crm_log_init after version and metadata calls (LF 2272) CS: 3d439c32d910
  + Medium: PE: Only complain about target-role=master for non m/s resources CS: 723208edd3db
  + Medium: PE: Prevent non-multistate resources from being promoted through target-role CS: 60b7d46f6cf0
  + Medium: PE: Silently fix requires=fencing for stonith resources so that it can be set in op_defaults CS: b07b1f50d798
  + Medium: ais: Some clients such as gfs_controld want a cluster name, allow one to be specified in corosync.conf CS: 66b7bfd467f3
  + Medium: cib: Clean up logic for receiving remote messages CS: 5acf9f2e9c9e
  + Medium: cib: Create valid notification control messages CS: a6d70b1b479d
  + Medium: cib: Indicate where the remote connection came from CS: 01674250decc
  + Medium: cib: Send password prompt to stderr so that stdout can be redirected CS: 51a9c7382955

Nothing there looks even remotely relevant.
It's also highly unlikely that I managed to make a change that
resurrected the exact same symptom from 1.0.5 (which I also did
nothing to actually fix, except add logging that has not been
triggered here).

What does the cluster do if you manually set a value for
expected-quorum-votes? Does the value stay set in the CIB or get
erased?
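
A rough sketch of that test, in case it helps. It assumes the short
options -t, -n, -v and -G of crm_attribute behave as they do in the
1.0 series; the function name check_expected_votes and the delay
argument are made up for illustration:

```shell
# Hypothetical helper: set expected-quorum-votes, wait, then read it back.
# Usage: check_expected_votes <votes> [delay-seconds]
check_expected_votes() {
    if ! command -v crm_attribute >/dev/null 2>&1; then
        echo "crm_attribute not found" >&2
        return 127
    fi
    # Write the value into the crm_config section of the CIB.
    crm_attribute -t crm_config -n expected-quorum-votes -v "$1" || return 1
    # Give the cluster a moment to overwrite or erase the value (assumed delay).
    sleep "${2:-5}"
    # Query the value again; compare it with what was just set.
    crm_attribute -t crm_config -n expected-quorum-votes -G
}
```

Running "check_expected_votes 2" on one node and comparing the value
printed by the final query with the one that was set should show
whether it stays put or gets erased.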

I can't imagine that it's influencing quorum either.
This is the only line indicating a value change, and it's set to the
correct value:

  Feb  4 21:22:22 debian2 corosync[6221]:   [pcmk  ] info: update_expected_votes: Expected quorum votes 1024 -> 2
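
To check that there really is only one such change, something like the
following could be run over the full log. This is only a sketch: the
function name expected_vote_changes is made up, and it assumes each
log entry sits on a single line in the actual log file (the wrapping
in mail is not handled):

```shell
# Hypothetical helper: print "old new" for every expected-votes change
# found on stdin, one pair per line.
expected_vote_changes() {
    sed -n 's/.*update_expected_votes: Expected quorum votes \([0-9][0-9]*\) -> \([0-9][0-9]*\).*/\1 \2/p'
}

# Example with the log entry quoted above; prints: 1024 2
printf '%s\n' 'Feb  4 21:22:22 debian2 corosync[6221]:   [pcmk  ] info: update_expected_votes: Expected quorum votes 1024 -> 2' | expected_vote_changes
```

More than one line of output, or a final value other than 2, would
point at something changing the vote count later on.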

Strange.



