[Pacemaker] Lots of Issues with Live Pacemaker Cluster

Dejan Muhamedagic dejanmm at fastmail.fm
Tue Mar 15 11:36:39 EDT 2011


On Tue, Mar 15, 2011 at 02:27:16PM -0000, Darren.Mansell at opengi.co.uk wrote:
> On Mon, 2011-03-14 at 17:35 +0100, Dejan Muhamedagic wrote:
> > Hi,
> > 
> > On Mon, Mar 14, 2011 at 10:57:27AM -0000, Darren.Mansell at opengi.co.uk wrote:
> > > Hello everyone.
> > >
> > > I built and put into production without adequate testing a 2 node
> > > cluster running Ubuntu 10.04 LTS with Pacemaker and associated packages
> > > from the Ubuntu-HA-maintainers repo
> > > (https://launchpad.net/~ubuntu-ha-maintainers/+archive/ppa). 
> > 
> > Not good to go live without sufficient testing. Testing is as
> > important as anything else. Or even more important. If there
> > isn't enough time for testing, then better to go without
> > clustering.
> 
> I've very quickly realised this fact. Even if under pressure to put a
> cluster live, don't give in until you're 100% happy with it. It WILL
> bite you, and it won't be anyone else's fault but yours.
> 
> > 
> > > 2.       Crm shell won't load from a text file. When I use crm configure
> > > < crm.txt, it will run through the file, complaining about the default
> > > timeout being less than 240, but doesn't load anything. So I go into the
> > > crm shell and set default-action-timeout to 240, commit and exit and do
> > > the same. This time it just exits silently, without loading the config.
> > 
> > Strange. I assume that you run version 1.0.x which I don't use
> > very often, but I cannot recall seeing this problem.
> 
> I'm not sure if I need to put a commit at the end of the input file? I
> always assumed it had an implicit commit. I'll test this next time I get
> chance.

No, commit is not necessary. crm should notice that the input
comes from a file or here document and that implies commit.
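
To illustrate the idea, here is a rough sketch in Python. It is not
the actual crm code: the names implies_commit and apply_batch are made
up for this example, and the only assumption is that a tool can tell
redirected input from a terminal by asking the stream whether it is a
tty.

```python
import io


def implies_commit(stream):
    """Return True when input is not a terminal (file, pipe, here document)."""
    isatty = getattr(stream, "isatty", None)
    return not (callable(isatty) and isatty())


def apply_batch(stream, commit):
    """Collect non-empty lines and commit them if the input is non-interactive.

    Returns whatever is still pending (uncommitted) afterwards.
    """
    pending = [line.strip() for line in stream if line.strip()]
    if implies_commit(stream):
        commit(pending)
        pending = []
    return pending


# io.StringIO stands in for "crm configure < crm.txt"; it is never a tty.
applied = []
leftover = apply_batch(
    io.StringIO("primitive p1 ocf:heartbeat:Dummy\n"), applied.extend
)
```

With redirected input nothing is left pending, so an explicit commit
line at the end of the file would be redundant.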

> > > If I go into the crm shell and use load replace crm.txt it will work.
> > 
> > Loading from a file was really meant to be done with "configure
> > load". Now, if there are errors/warnings in the configuration,
> > what happens depends on check-* options for semantic checks.
> 
> I'll try that armed with this information next time.
> 
> > 
> > > 3.       Crm shell tab completes don't work unless you put an incorrect
> > > entry in first. I'm sure this is a python readline problem, as it also
> > > happens in SLE 11 HAE SP1 (but not in pre-SP1). I assume everyone
> > > associated (Dejan?) is aware of the problem, but highlighting it just in
> > > case.
> > 
> > No, I'm not aware of it. Tab completion works here, though a bit
> > differently from 1.0 due to lazy creation of the completion
> > tables. You need to enter another level at least once before the
> > tab completion is going to work for that level. For instance,
> > it won't work in this case:
> > 
> > crm(live)# resource <TAB><TAB>
> > 
> > But it would once the user enters the resource level:
> > 
> > crm(live)resource# <TAB><TAB>
> > bye           failcount     move          restart       unmigrate
> > cd            help          param         show          unmove
> > cleanup       list          promote       start         up
> > demote        manage        quit          status        utilization
> > end           meta          refresh       stop
> > exit          migrate       reprobe       unmanage
> > 
> > Can you elaborate on "put incorrect entry first"?
> 
> I think this is more down to my lack of understanding of how it's
> changed then. I'm used to < 1.0 clusters and the crm shell would always
> tab complete *almost* everything. IIRC only location score rules etc
> wouldn't.
> 
> However, I think my confusion has arisen due to this behaviour:
> 
> crm(live)# resource mi<TAB><TAB>
> nothing
> crm(live)# resource mi<enter>
> ERROR: syntax: mi
> crm(live)# resource mi<TAB>
> crm(live)# resource migrate
> 
> It will tab-complete the first and second level, if you've already
> entered an incorrect parameter.

Right, that's exactly what I described above. I can see now that
the lazy table loading went into 1.0 too. The syntax error has
nothing to do with it, but on entering "resource whatever" crm
goes briefly to the resource level and the tables are loaded. I
know that it looks a bit confusing, but I reckoned that the benefits
of faster startup outweigh this issue.
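
A small model of that behaviour, in Python. This is only a sketch and
not the shell's real implementation: the LazyCompleter class, its
method names and the four-entry command list are all invented for the
example.

```python
class LazyCompleter:
    """Completion tables that are built only when a level is first entered."""

    def __init__(self, loaders):
        self._loaders = loaders  # level name -> function returning commands
        self._tables = {}        # filled lazily, one level at a time

    def enter(self, level):
        """Build the table for a level the first time it is visited."""
        if level not in self._tables:
            self._tables[level] = sorted(self._loaders[level]())

    def complete(self, level, prefix):
        """Return matches, or nothing if the level was never entered."""
        table = self._tables.get(level)
        if table is None:
            return []
        return [word for word in table if word.startswith(prefix)]

    def execute(self, level, command):
        """Running 'level command' enters the level briefly, valid or not."""
        self.enter(level)
        if command not in self._tables[level]:
            return "ERROR: syntax: " + command
        return "ok"


completer = LazyCompleter(
    {"resource": lambda: ["migrate", "move", "meta", "manage"]}
)
before = completer.complete("resource", "mi")  # table not loaded yet
result = completer.execute("resource", "mi")   # syntax error, but level entered
after = completer.complete("resource", "mi")   # table is now available
```

The failed command is what loads the table as a side effect, which is
why completion appears to start working only after an incorrect entry.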

Thanks,

Dejan

> Regards,
> Darren Mansell
> 



