[Pacemaker] crm_master triggering assert section != NULL

Thu Oct 13 00:04:11 UTC 2011

Hi Lars
   indeed... this is much more interesting.  It should be pretty easy to 
port my stuff to that RA and build a configuration that fits my 
requirements.

Regards,

Yves

On 11-10-12 07:21 PM, Lars Ellenberg wrote:
> On Wed, Oct 12, 2011 at 05:09:45PM -0400, Yves Trudeau wrote:
>> Hi Florian,
>>
>> On 11-10-12 04:09 PM, Florian Haas wrote:
>>> On 2011-10-12 21:46, Yves Trudeau wrote:
>>>> Hi Florian,
>>>>    sure, let me state the requirements.  If those requirements can be
>>>> met, pacemaker will be much more widely used to manage MySQL replication.
>>>> Right now, although at Percona I deal with many large MySQL deployments,
>>>> none are using the current agent.   Another tool, MMM, is currently used
>>>> but it is orphaned and suffers from many pretty fundamental
>>>> flaws (while implementing about the same logic as below).
>>>>
>>>> Consider a pool of N identical MySQL servers.  In that case we need:
>>>> - N replication resources (it could be the MySQL RA)
>>>> - N Reader_vip
>>>> - 1 Writer_vip
>>>>
>>>> Reader vips are used by the application to run queries that do not
>>>> modify data, usually accessed in round-robin fashion.  When the
>>>> application needs to write something, it uses the writer_vip.  That's
>>>> how read/write splitting is implemented in many, many places.
>>>>
>>>> So, for the agent, here are the requirements:
>>>>
>>>> - No need to manage MySQL itself
>>>>
>>>> The resource we are interested in is replication, MySQL itself is at
>>>> another level.  If the RA is to manage MySQL, it must not interfere.
>>>>
>>>> - the writer_vip must be assigned only to the master, after it is promoted
>>>>
>>>> This is easy with colocation.
>>> Agreed.
>>>
>>>> - After the promotion of a new master, all slaves should be allowed to
>>>> complete the application of their relay logs prior to any change master
>>>>
>>>> The current RA does not do that but it should be fairly easy to implement.
>>> That's a use case for a pre-promote and post-promote notification. Like
>>> the mysql RA currently does.
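A minimal sketch of the "relay logs fully applied" check, assuming the slave status is read with SHOW SLAVE STATUS\G. The function name relay_log_applied is hypothetical and is not taken from the mysql RA; it only compares the fetched and executed coordinates.

```shell
#!/bin/sh
# Hypothetical helper (not from the mysql RA): reads "SHOW SLAVE STATUS\G"
# output on stdin and exits 0 only when every event fetched from the
# master has also been executed by the SQL thread.
relay_log_applied() {
    awk -F': ' '
        { gsub(/^[ \t]+/, "", $1) }                      # strip the \G indentation
        $1 == "Master_Log_File"       { read_file = $2 } # file the I/O thread is reading
        $1 == "Read_Master_Log_Pos"   { read_pos  = $2 } # position fetched so far
        $1 == "Relay_Master_Log_File" { exec_file = $2 } # file the SQL thread is executing
        $1 == "Exec_Master_Log_Pos"   { exec_pos  = $2 } # position executed so far
        END { exit !(read_file != "" && read_file == exec_file && read_pos == exec_pos) }
    '
}
```

In a notification handler this could be polled, for example "until mysql -e 'SHOW SLAVE STATUS\G' | relay_log_applied; do sleep 1; done", presumably after the I/O thread has been stopped so the fetched position no longer moves.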
>>>
>>>> - After its promotion and before allowing writes to it, a master should
>>>> publish its current master file and position.   I am using resource
>>>> parameters in the CIB for these (I am wondering if transient attributes
>>>> could be used instead)
>>> They could, and you should. Like the mysql RA currently does.
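A sketch of publishing the coordinates as transient node attributes with crm_attribute, assuming it is on the PATH. The attribute names master_log_file and master_log_pos and both function names are made up for illustration; "-l reboot" is what makes the attribute transient.

```shell
#!/bin/sh
# Assumption: crm_attribute is installed; override CRM_ATTRIBUTE for a dry run.
CRM_ATTRIBUTE=${CRM_ATTRIBUTE:-crm_attribute}

# Hypothetical helper: store the new master's binlog coordinates.
# $1 = node name, $2 = binlog file, $3 = binlog position
publish_master_coords() {
    $CRM_ATTRIBUTE -N "$1" -l reboot -n master_log_file -v "$2" &&
    $CRM_ATTRIBUTE -N "$1" -l reboot -n master_log_pos  -v "$3"
}

# Hypothetical helper: read one coordinate back on a slave.
# $1 = node name, $2 = attribute name; -G queries, -q prints only the value
read_master_coord() {
    $CRM_ATTRIBUTE -N "$1" -l reboot -n "$2" -G -q
}
```

The master would call publish_master_coords after promotion, and each slave would call read_master_coord from its post-promote notification before issuing CHANGE MASTER.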
>>>
>> The RA I downloaded following the instructions on the wiki, which states it is
>> the latest source:
>>
>> wget -O resource-agents.tar.bz2 \
>>     http://hg.linux-ha.org/agents/archive/tip.tar.bz2
> Has moved to github.
> I'll try to make that more obvious at the website,
> but that won't help for "direct download" hg archive links.
>
> http://github.com/ClusterLabs/resource-agents/blob/master/heartbeat/mysql
>
> raw download:
> http://raw.github.com/ClusterLabs/resource-agents/master/heartbeat/mysql
>
> Also see this pull request:
> https://github.com/ClusterLabs/resource-agents/pull/28
>
>> has the following code to change the master:
>>
>>      ocf_run $MYSQL $MYSQL_OPTIONS_LOCAL $MYSQL_OPTIONS_REPL \
>>          -e "CHANGE MASTER TO MASTER_HOST='$master_host', \
>>                               MASTER_USER='$OCF_RESKEY_replication_user', \
>>                               MASTER_PASSWORD='$OCF_RESKEY_replication_passwd'"
>>
>> which does not include file and position.
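For comparison, a sketch of the same statement with the coordinates added through MASTER_LOG_FILE and MASTER_LOG_POS. The function build_change_master_sql is hypothetical; it only builds the SQL text and does no escaping of quotes inside the values.

```shell
#!/bin/sh
# Hypothetical helper: print a CHANGE MASTER statement that includes the
# binlog file and position published by the new master.
# $1 = master host, $2 = replication user, $3 = replication password,
# $4 = binlog file, $5 = binlog position (integer)
build_change_master_sql() {
    printf "CHANGE MASTER TO MASTER_HOST='%s', MASTER_USER='%s', MASTER_PASSWORD='%s', MASTER_LOG_FILE='%s', MASTER_LOG_POS=%d" \
        "$1" "$2" "$3" "$4" "$5"
}
```

The result could be passed to the existing call, for example: ocf_run $MYSQL $MYSQL_OPTIONS_LOCAL $MYSQL_OPTIONS_REPL -e "$(build_change_master_sql "$master_host" "$OCF_RESKEY_replication_user" "$OCF_RESKEY_replication_passwd" "$file" "$pos")", where $file and $pos are placeholders for the published coordinates.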
>>
>>
>>>> - After the promotion of a new master, all slaves should be reconfigured
>>>> to point to the new master host with correct file and position as
>>>> published by the master when it was promoted
>>>>
>>>> The current RA does not set file and position.
>>> "The current RA" being ocf:heartbeat:mysql?
>>>
>>> A cursory grep for "CRM_ATTR" in ocf:heartbeat:mysql indicates that it
>>> does set those.
>> grep CRM_ATTR returned nothing.
>>
>> yves@yves-desktop:/opt/pacemaker/Cluster-Resource-Agents-7a11934b142d/heartbeat$ grep -i CRM_ATTR mysql
>> yves@yves-desktop:/opt/pacemaker/Cluster-Resource-Agents-7a11934b142d/heartbeat$
>>
>> and that is the latest from Mercurial...
>>
>>>> Under any non-trivial
>>>> load this will fail.  The current RA is not designed to store the
>>>> information.  The new RA uses the information stored in the cib along
>>>> with post-promote notification.
>>> Is this point moot considering my previous statement?
>>>
>>>> - each slave and the master may have one or more reader_vip provided
>>>> that they are replicating correctly (no lag beyond a threshold,
>>>> replication of course working).  If all slaves fail, all reader_vip
>>>> should be located on the master.
>>> Use a cloned IPaddr2 as a non-anonymous clone, thereby managing an IP
>>> range. Add a location constraint restricting the clone instance to run
>>> on only those nodes where a specific node attribute is set. Or
>>> conversely, forbid them from running on nodes where said attribute is
>>> not set. Manage that attribute from your RA.
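A sketch of what that configuration might look like in crm shell syntax, generated from a small shell function. The resource ids, the attribute name passed in, and the function reader_vip_config are all hypothetical; globally-unique="true" is what makes the clone non-anonymous, and unique_clone_address asks IPaddr2 to add the clone id to the base address.

```shell
#!/bin/sh
# Hypothetical generator for a crm shell configuration fragment.
# $1 = first IP of the reader range, $2 = number of reader VIPs,
# $3 = node attribute the RA maintains (defined and non-zero = may host reader VIPs)
reader_vip_config() {
    cat <<EOF
primitive p_reader_vip ocf:heartbeat:IPaddr2 \\
    params ip="$1" unique_clone_address="true" \\
    op monitor interval="10s"
clone cl_reader_vip p_reader_vip \\
    meta globally-unique="true" clone-max="$2" clone-node-max="$2"
location loc_reader_vip cl_reader_vip \\
    rule -inf: not_defined $3 or $3 eq 0
EOF
}
```

For example, "reader_vip_config 192.168.1.100 3 readable > reader_vip.crm" followed by "crm configure load update reader_vip.crm". Setting clone-node-max equal to clone-max lets all reader VIPs land on the master if every slave loses the attribute.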
>> That's clever, never thought about it.
>>
>>>> The current RA either kills MySQL or does nothing, it doesn't care about
>>>> reader_vips.  Killing MySQL on a busy server with 256GB of buffer pool
>>>> is enough for someone to lose his job...  The new RA adjusts location
>>>> scores for the reader_vip resources dynamically.
>>> Like I said, that's managing one resource from another, which is a total
>>> nightmare. It's also not necessary, I dare say, given the approach I
>>> outlined above.
>>>
>> I'll explore the node attribute approach, I like it.
>>
>> Is it possible to create an attribute that does not belong to a node
>> but is cluster wide?
>>>> - the RA should implement a protection against flapping in case a slave
>>>> hovers around the replication lag threshold
>>> You should get plenty of inspiration there from how the dampen parameter
>>> is used in ocf:pacemaker:ping.
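A rough sketch of the dampening idea, assuming attrd_updater is available and that the update option is spelled -U (older releases spelled it -v, so check attrd_updater --help for your version). The attribute name readable and the function update_readable are hypothetical.

```shell
#!/bin/sh
# Assumption: attrd_updater is installed; override ATTRD_UPDATER for a dry run.
ATTRD_UPDATER=${ATTRD_UPDATER:-attrd_updater}

# Hypothetical helper: translate replication lag into a 0/1 node attribute
# and let attrd delay the write by the dampen interval.
# $1 = current lag in seconds, $2 = maximum tolerated lag, $3 = dampen delay (e.g. 30s)
update_readable() {
    if [ "$1" -le "$2" ]; then
        value=1   # replication is close enough: node may host reader VIPs
    else
        value=0   # lagging: reader VIPs should move away
    fi
    $ATTRD_UPDATER -n readable -U "$value" -d "$3"
}
```

With a delay set, a value that flips back within the dampen interval should not reach the CIB, so a slave hovering around the threshold does not move the VIPs on every monitor.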
>>>
>> ok, I'll check
>>>> The current RA does implement that but it is not required given the
>>>> context.  The new RA does implement flapping protection.
>>>>
>>>> - upon demote of a master, the RA _must_ attempt to kill all user
>>>> (non-system) connections
>>>>
>>>> The current RA does not do that but it is easy to implement
>>> Yeah, as I assume it would be in the other one.
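A sketch of the demote-time cleanup, assuming the connection list comes from information_schema.processlist as "id user" pairs. The function build_kill_statements is hypothetical; it skips the replication threads ("system user") and the replication account and emits one KILL per remaining connection.

```shell
#!/bin/sh
# Hypothetical helper: turn a process list into KILL statements.
# stdin: one "<id><whitespace><user>" pair per line
# $1   : replication user whose connections must be kept
build_kill_statements() {
    while read -r id user; do
        case "$user" in
            "system user"|"$1"|"") continue ;;  # replication threads, repl account, blanks
        esac
        printf 'KILL %s;\n' "$id"
    done
}
```

Usage could look like: mysql -N -B -e "SELECT id, user FROM information_schema.processlist" | build_kill_statements "$OCF_RESKEY_replication_user" | mysql. Other administrative accounts would probably need to be added to the skip list.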
>>>
>>>> - Slaves must be read-only
>>>>
>>>> That's fine, handled by the current RA.
>>> Correct.
>>>
>>>> - Monitor should test MySQL and replication.  If either is bad, vips
>>>> should be moved away.  Common errors should not trigger actions.
>>> Like I said, should be feasible with the node attribute approach
>>> outlined above. No reason to muck around with the resources directly.
>>>
>>>> That's handled by the current RA for most of it.  The error handling
>>>> could be added.
>>>>
>>>> - Slaves should update their master score according to the state of
>>>> their replication.
>>>>
>>>> Handled by both RAs.
>>> Right.
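A sketch of how a monitor action might adjust the master score with crm_master, which has to run inside a resource agent so that OCF_RESOURCE_INSTANCE is set. The scores 100 and 1, the "ok" status string, and the function update_master_score are arbitrary choices for illustration.

```shell
#!/bin/sh
# Assumption: crm_master is installed; override CRM_MASTER for a dry run.
CRM_MASTER=${CRM_MASTER:-crm_master}

# Hypothetical helper: set this node's promotion preference from its
# replication state.
# $1 = "ok" if replication is running, anything else if it is broken
# $2 = current lag in seconds, $3 = maximum tolerated lag
update_master_score() {
    if [ "$1" != "ok" ]; then
        $CRM_MASTER -l reboot -D        # broken replication: drop the score
    elif [ "$2" -le "$3" ]; then
        $CRM_MASTER -l reboot -v 100    # healthy and current: good candidate
    else
        $CRM_MASTER -l reboot -v 1      # healthy but lagging: last resort
    fi
}
```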
>>>
>>>> So, at the minimum, the RA needs to be able to store the master
>>>> coordinate information, either in the resource parameters or in
>>>> transient attributes and must be able to modify resources location
>>>> scores.  The script _was_ working before I got the cib issue, maybe it
>>>> was purely accidental but it proves the concept.  I was actually
>>>> implementing/testing the relay_log completion stuff.  I chose not to use
>>>> the current agent because I didn't want to manage MySQL itself, just
>>>> replication.
>>>>
>>>> I am wide open to argue any Pacemaker or RA architecture/design part but
>>>> I don't want to argue the replication requirements, they are fundamental
>>>> in my mind.
>>> Yup, and I still believe that ocf:heartbeat:mysql either already
>>> addresses those, or they could be addressed in a much cleaner fashion
>>> than writing a new RA.
>>>
>>> Now, if the only remaining point is "but I want to write an agent that
>>> can do _less_ than an existing one" (namely, manage only replication,
>>> not the underlying daemon), then I guess I can't argue with that, but
>>> I'd still believe that would be a suboptimal approach.
>> Ohh...  don't get me wrong, I am not the kind of guy that takes
>> pride in having re-invented the flat tire.  I want an opensource
>> _solution_ I can offer to my customers.  I think part of the problem
>> here is that we are not talking about the same ocf:heartbeat:mysql
>> RA.  What is mainstream is what you can get with "apt-get install
>> pacemaker" on 10.04 LTS for example.  This is 1.0.8.  I also tried
>> 1.0.11 and still it is obviously not the same version.  I got my
>> "latest" agent version as explained in the clusterlabs FAQ page
>> from:
>>
>> wget -O resource-agents.tar.bz2 \
>>     http://hg.linux-ha.org/agents/archive/tip.tar.bz2
>>
>> Where can I get the version you are using? :)
>>
>> Regards,
>>
>> Yves
>>
>>> Cheers,
>>> Florian
>>>
>>