[Pacemaker] Pacemaker failover delays (followup)
Andrew Beekhof
andrew at beekhof.net
Wed Mar 13 01:35:18 UTC 2013
On Sat, Mar 9, 2013 at 9:50 AM, Michael Powell <
Michael.Powell at harmonicinc.com> wrote:
> Andrew,
>
> Thanks for the feedback on my earlier questions from March 6th. I’ve done
> some further investigation into the timing of what I’d call the “simple”
> failover case: an SSID that is master on the DC node is killed, and it
> takes 10-12 seconds before the slave SSID on the other node transitions to
> master. (Recall that “SSID” is a SliceServer app instance, each of which is
> abstracted as a Pacemaker resource.)
>
> Before going into my findings, I want to clear up a couple of
> misstatements on my part.
>
> - WRT my mention of “notifications” in my earlier e-mail, I misused the
>   term. I was simply referring to the “notify” events passed from the DC to
>   the other node.
>
> - I also misspoke when I said that the failed SSID was subsequently
>   restarted as a result of a monitor event. In fact, the SSID process is
>   restarted by the “ss” resource agent script in response to a “start”
>   event from lrmd; a minimal sketch of that dispatch follows.
>
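> Roughly, the ss script just dispatches on the action name that lrmd passes
> as its first argument; the sketch below is purely illustrative (the ssid_*
> helpers are placeholders, not the real script):
>
>     case "$1" in
>         start)          ssid_start ;;        # (re)launch the SSID process
>         stop)           ssid_stop ;;
>         monitor)        ssid_monitor ;;      # OCF_SUCCESS / OCF_NOT_RUNNING / OCF_RUNNING_MASTER
>         promote|demote) ssid_set_role "$1" ;;
>         notify)         exit 0 ;;            # the "notify" events mentioned above
>         *)              exit 3 ;;            # OCF_ERR_UNIMPLEMENTED
>     esac
>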
> The key issue, however, is the time required (10 to 12 seconds) from the
> moment the master SSID is killed until the slave fails over to become
> master. You opined that the time required would largely depend upon the
> behavior of the resource agent, which in our case is a script called “ss”.
> To determine how much of that time is spent in the ss script itself, I
> modified it to log the current monotonic system clock value each time it
> starts, and again just before it exits. The log messages specify the clock
> value in ms.
>
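> The timing hooks themselves amount to something like the sketch below
> (log_ms and the log path here are placeholders, not the actual change to ss):
>
>     # Append a tag plus the monotonic clock (taken from /proc/uptime,
>     # converted to ms) to a scratch log.
>     log_ms() {
>         awk -v tag="$1" '{ printf "ss %s: %d ms\n", tag, $1 * 1000 }' \
>             /proc/uptime >> /var/log/ss-timing.log
>     }
>
>     log_ms "enter $1"      # first thing in the script ($1 = action)
>     # ... the rest of the agent runs unchanged ...
>     log_ms "exit $1"       # just before each exit point
>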
> From this, I found several instances where the ss script took just over a
> second to complete execution. In each such case, the “culprit” is an exec
> of “crm_node -p”, which is called to determine how many nodes are presently
> in the cluster. (I’ve verified this timing independently by executing
> “crm_node -p” from a command line while the cluster is quiescent.) This
> seems like a rather long time for such a simple objective. What would
> “crm_node -p” do that would take so long?
>
If support for corosync is compiled in, it's probably trying to connect to
that first; it's not as smart about this as 1.1.x.
Otherwise it's just waiting for heartbeat to answer.
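
A quick way to confirm where that second goes is to time the call on a quiet
node and watch what it tries to connect to, e.g.:

    time crm_node -p
    strace -f -e trace=connect crm_node -p

If the first connect attempt is to a corosync socket that isn't there, that
would account for the delay.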
>
> That notwithstanding, from the slave’s point of view during the failover,
> there are delays of several hundred to about 1400 ms between the completion
> of the ss script and its invocation for the next event. To explain, I’ve
> attached an Excel spreadsheet (which I’ve verified is virus-free) that
> documents two experiments. In each case, an SSID instance that is master on
> node-0 (the DC) is killed. The spreadsheet includes a synopsis of the log
> messages that follow on both nodes, interleaved into a timeline.
>
> By way of explanation, columns B-D contain timestamp information for node-0
> and columns E-G for node-1:
>
>   B/E - current time of day
>   C/F - monotonic clock value when the ss script begins execution
>         (in ms, truncated to the least-significant 5 digits)
>   D/G - duration of the ss script execution for the relevant event
>   H   - key text extracted from the log (where the log contained a large
>         amount of pengine-related detail, I omitted it from the spreadsheet)
>   I   - explanatory comments
>
> Realizing that we will eventually need to upgrade our Pacemaker version
> (currently 1.0.9), I wonder if you can clear up a couple of questions. We
> are presently using Heartbeat, which I believe restricts our upgrade to the
> 1.0 branch, correct?
>
Nope. http://blog.clusterlabs.org/blog/2012/can-pacemaker-118-be-used-with/
> In other words, if we want to upgrade to the 1.1 branch, are we required
> to replace Heartbeat with Corosync? Secondly, when upgrading, are there
> kernel dependencies to worry about?
>
Not unless you're using OCFS2/GFS2.
> We are presently running on the open source kernel version 2.6.18. We
> plan to migrate to the most current 2.8 or 3.0 version later this year, at
> which time it would probably make sense to bring Pacemaker up to date.
>
> I apologize for the length of this posting, and again appreciate any
> assistance you can offer.
>
> Regards,
>
> Michael Powell
> Staff Engineer
>
> 15220 NW Greenbrier Pkwy
> Suite 290
> Beaverton, OR 97006
> T 503-372-7327  M 503-789-3019  H 503-625-5332
>
> www.harmonicinc.com