[Pacemaker] Simulating that a node is down.

Mon Jul 15 16:19:16 UTC 2013

Hi Andreas/Jacobo,

For one of my network failure tests, I drop all incoming and outgoing packets. I have also tested shutting down the eth interface, as you mention below, but ran into DRBD split-brain issues, which I reported in a separate mail to this list. I’d be interested to hear whether you had any issues in your tests involving shutting down the interface.
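
As a rough illustration of the "drop all packets" test, the sketch below only builds the iptables commands and prints them unless told otherwise. The function names, the dry-run switch and the example interface name are assumptions made for this sketch and are not taken from anyone's actual test setup.

```python
import subprocess


def drop_all_rules(action="-I", iface=None):
    """Build iptables commands that drop all traffic in both directions.

    action: "-I" inserts the rule at the top of each chain, "-D" deletes it.
    iface:  optional interface name (for example "eth0", an assumption here);
            None means every interface, loopback included.
    """
    cmds = []
    for chain, flag in (("INPUT", "-i"), ("OUTPUT", "-o")):
        cmd = ["iptables", action, chain]
        if iface:
            cmd += [flag, iface]
        cmd += ["-j", "DROP"]
        cmds.append(cmd)
    return cmds


def apply_rules(cmds, dry_run=True):
    """Print each command; execute it only when dry_run is False (needs root)."""
    for cmd in cmds:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Dry run: show what would be executed to isolate the node ...
    apply_rules(drop_all_rules("-I", "eth0"))
    # ... and what would undo it afterwards.
    apply_rules(drop_all_rules("-D", "eth0"))
```

Building the delete commands from the same function keeps the "undo" step identical to the rules that were inserted, which matters when the test has to be repeated.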

For the node shutdown tests, I would include testing a node crash.

Thanks,
Tom

From: Andreas Mock [mailto:andreas.mock at web.de]
Sent: 12 July 2013 10:31
To: 'The Pacemaker cluster resource manager'
Subject: Re: [Pacemaker] Simulating that a node is down.

Hi Jacobo,

1) corosync communicates through 2 ports, don't forget the second one.
2) IMHO, when you block both ports, it's like a classical split brain.
I've done it to test split brain and hopefully fencing behaviour.
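
A minimal sketch of blocking both ports, assuming the usual corosync convention that the second port is mcastport - 1 and assuming mcastport is 5405 (check the value in your own corosync.conf). The helper names are made up for this example, and the script only prints the iptables commands rather than running them.

```python
def corosync_ports(mcastport=5405):
    """Return the two UDP ports: mcastport - 1 and mcastport.

    5405 is only an assumed default; use the mcastport from corosync.conf.
    """
    return [mcastport - 1, mcastport]


def block_corosync_rules(mcastport=5405, action="-I"):
    """Build iptables commands dropping corosync UDP traffic in both directions.

    action: "-I" to insert the rules, "-D" to remove them again.
    """
    cmds = []
    for port in corosync_ports(mcastport):
        for chain in ("INPUT", "OUTPUT"):
            cmds.append(
                ["iptables", action, chain,
                 "-p", "udp", "--dport", str(port), "-j", "DROP"]
            )
    return cmds


if __name__ == "__main__":
    # Print the commands so they can be reviewed before running them as root.
    for cmd in block_corosync_rules():
        print(" ".join(cmd))
```

Blocking only one of the two ports leaves the nodes half-connected, which may explain odd behaviour; blocking both should look like a clean split brain to the cluster.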
Best regards
Andreas Mock

From: Jacobo García [mailto:jacobo.garcia at gmail.com]
Sent: Friday, 12 July 2013 11:04
To: The Pacemaker cluster resource manager
Subject: Re: [Pacemaker] Simulating that a node is down.

Thanks Andreas for your kind answer, I'll add this to my test battery.

Also, my other question: is it a good idea to close the corosync port? Should corosync behave in an expected way? I am getting odd behavior with this one, but I am not sure where to put the blame.

Thanks in advance.

Jacobo García López de Araujo
http://thebourbaki.com | http://twitter.com/clapkent

On Thu, Jul 11, 2013 at 8:39 PM, Andreas Mock <andreas.mock at web.de> wrote:
Hi Jacobo,

One very interesting thing is missing.
Overload the node. Write a program/script which generates
many I/O operations and many flushes while requesting
more and more memory from the OS until swapping begins.
Ohhh, yes, swapping and IO is nice…

…then you can put your monitor and stop action timeouts to the test…  ;-)
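
One possible shape for such an overload script, as a sketch only: it writes and fsyncs blocks to a temporary file while holding on to more and more memory. The function name, parameters and limits are assumptions for illustration; with max_mem_mb left at None it keeps allocating for the whole duration, which is what pushes the node into swap.

```python
import os
import tempfile
import time


def overload(duration_s=60.0, chunk_mb=64, max_mem_mb=None, workdir=None):
    """Generate write+fsync load while allocating ever more memory.

    duration_s: how long to keep the load running.
    chunk_mb:   memory allocated per loop iteration.
    max_mem_mb: optional cap on allocated memory; None means no cap.
    workdir:    directory for the scratch file (default: system temp dir).
    Returns (bytes_written, megabytes_allocated).
    """
    hoard = []                        # keeps allocations alive
    written = 0
    block = os.urandom(1024 * 1024)   # 1 MiB of data per write
    deadline = time.monotonic() + duration_s
    with tempfile.NamedTemporaryFile(dir=workdir) as f:
        while time.monotonic() < deadline:
            # I/O: write one block and force it out to disk.
            f.write(block)
            f.flush()
            os.fsync(f.fileno())
            written += len(block)
            # Rewind after 256 MiB so the scratch file stays bounded.
            if f.tell() >= 256 * 1024 * 1024:
                f.seek(0)
            # Memory: allocate a filled buffer so the pages are really touched.
            if max_mem_mb is None or len(hoard) * chunk_mb < max_mem_mb:
                hoard.append(b"\x01" * (chunk_mb * 1024 * 1024))
    return written, len(hoard) * chunk_mb
```

Calling for example overload(duration_s=300) on one node while the cluster monitors its resources should show whether the configured monitor and stop timeouts hold up under load. Start with a cap on max_mem_mb on machines you cannot afford to lose.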

Best regards
Andreas Mock

From: Jacobo García [mailto:jacobo.garcia at gmail.com]
Sent: Thursday, 11 July 2013 19:14
To: pacemaker at oss.clusterlabs.org
Subject: [Pacemaker] Simulating that a node is down.

Hello,

I am looking for different ways of simulating that a node is down. I am seeing strange behavior with one of them (closing the UDP communication port with iptables). I would like to know if closing the port is a recommended way of achieving my testing purposes.

Also I would like to know other ways of testing apart from the ones compiled in the list below:

1. Stopping corosync.
2. Shutting down the node.
3. Shutting down the eth0 interface.
4. Killing the corosync process.
5. Closing the corosync communication port.

Thanks,

Jacobo García López de Araujo

_______________________________________________
Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
