[Pacemaker] Issues with constraints - working for start/stop, being ignored on "failures"
Cnut Jansen
work at cnutjansen.eu
Wed Jun 9 23:12:36 UTC 2010
On 07.06.2010 03:07, Tim Serong wrote:
> On 6/2/2010 at 11:10 AM, Cnut Jansen <work at cnutjansen.eu> wrote:
>
>> About those ":start" specifiers on the mount resources' order
>> constraints you're of course right, and I also already knew about that.
>> They're just remains from some tests (probably a search for (other?)
>> workarounds or something) I did, which I only - due to their (to my
>> knowledge) harmless redundancy - so far always forgot to remove again
>> when doing other, more relevant/important changes. You know, since the
>> crm-shell (which I currently use for editing my configuration)
>> cancels all resource monitor operations on the node the crm-shell is
>> started on, I prefer to avoid starting it as much as possible, because
>> I always have to make sure afterwards that all monitor operations run
>> again (i.e. switch the cluster's maintenance-mode on/off or switch the
>> node to standby and back online).
>>
> Say what? The CRM shell shouldn't be canceling ops...
>
That's what I had expected too. But the more I got used to it without
finding it mentioned anywhere, the more I started making assumptions
about it. I already considered it possible that it was simply intended
behaviour (maybe because one shouldn't call the crm-shell on a live CIB
anyway, but only on shadow CIBs, or something) and so obvious to
everyone else that no one thought it worth a warning note in any of the
step-by-step tutorials, for dumbheads like me. d-#
But it's 100% reproducible in our office's current testing-cluster
(SLES 11 SP0, kept up-to-date).
Meanwhile I got to "enjoy" some unexpected "holidays" (sick at home) and
used some of the time productively to start setting up a cluster with
somewhat more recent software (i.e. Pacemaker 1.0.8, as shipped with
Debian Squeeze/testing), and there I so far couldn't find any unexpected
cancels of monitor ops. So I guess it might really just be due to a bug
in older Pacemaker versions or something.
We'll see once I'm back in the office and have upgraded our
testing-cluster to SLES 11 SP1.
>> About those 0-scores, unfortunately they're necessary, since they're
>> - afaik - the official workaround to prevent instances of clone
>> resources from also being restarted on nodes where it's unnecessary to
>> do so. So with scores set to "inf" instead, when I for example put one
>> node into standby and/or back online, most clone resources would also
>> be restarted on the other node. That's not acceptable for production.
>> According to what I remember having read, this behaviour only changed
>> in Pacemaker 1.0.7, which isn't shipped with SLES 11 yet. I'm hoping for
>> SLES 11 SP1 to change that, but haven't found any reliable information
>> about its version of Pacemaker yet.
>>
> SLES 11 SP1 and the SLE High Availability Extension 11 SP1 are now
> available for download from http://download.novell.com/ - this includes
> Pacemaker 1.1.2.
>
Yeah, I know. And it's what we finally decided to wait for with all our
so far unresolved problems, hoping that many of them would get solved by
more recent cluster software. (-;
For example, regarding the order constraints, I expect to be able to
change the scores back to inf then without clones being restarted
unnecessarily (changed in Pacemaker 1.0.7). Then my order-constraint
issues might(!) already be solved too, since (as far as I remember my
testing) they were already ok in SP0 with inf-scores.
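As a rough sketch of the two variants being compared here, in crm shell
"configure" syntax (the constraint id and the clone names cl_storage and
cl_mounts are made up for illustration and are not taken from the actual
configuration):

```
# Variant 1: advisory ordering (score 0), the workaround used so far
order o_mounts_after_storage 0: cl_storage cl_mounts

# Variant 2: mandatory ordering (score inf), the form to switch back to
order o_mounts_after_storage inf: cl_storage cl_mounts
```

Only one of the two would be configured at a time. With score 0 the
ordering is only advisory, while with inf it is mandatory.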
P.S.: Even though this is the wrong newsgroup, since there are Novell
guys here and upgrading to SP1 was just mentioned: d-;
Why does the SLES 11 Upgrade-HowTo (
http://www.novell.com/support/documentLink.do?externalID=7005410 ; I
tried the zypper way) work correctly for SLES itself, and even show
SLE-HAE SP1 stuff during that "<product>"-grep and install something
for it (on the 2nd try I got output saying that the HAE SP1 product
stuff, whose exact name I don't remember right now, was already
installed), but afterwards I only see SP1 repositories for SLES, not for
SLE-HAE (still only SP0)... while our company's Novell account already
shows 5-6 HAE installations on that one machine?! o_O