[Pacemaker] 2 Node Clustering, when primary server goes down(shutdown) the secondary server restarts
Arjun Pandey
apandepublic at gmail.com
Wed Oct 29 08:53:44 UTC 2014
Searching for "IPMI fencing agent" helps :)
This link might be useful:
https://fedorahosted.org/cluster/wiki/IPMI_FencingConfig
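
In case the wiki page is hard to follow, here is a rough sketch of the usual sequence: check the BMC, test it from the other node, then register it as a stonith resource. Treat it as a starting point only. The BMC address, user, password and node name are placeholders, the `run` helper is made up for this mail so that nothing executes unless APPLY=1 is set, LAN channel 1 is a guess, and the stonith:external/ipmi agent assumes cluster-glue is installed. On the R320 the BMC is the iDRAC, and IPMI over LAN may have to be enabled in its settings first.

```shell
#!/bin/sh
# Sketch only: every address, user name and password below is a placeholder.
BMC_IP="${BMC_IP:-192.0.2.11}"      # assumed BMC (iDRAC) address of the peer node
BMC_USER="${BMC_USER:-root}"        # assumed IPMI user
BMC_PASS="${BMC_PASS:-changeme}"    # assumed IPMI password
PEER="${PEER:-server2}"             # peer node name as the cluster knows it

# Hypothetical helper: print each command by default; APPLY=1 executes it.
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "+ $*"
    fi
}

# 1. On the peer itself: show the BMC LAN settings (channel 1 is common, not guaranteed).
run ipmitool lan print 1

# 2. From the other node: confirm the BMC answers over the network.
run ipmitool -I lanplus -H "$BMC_IP" -U "$BMC_USER" -P "$BMC_PASS" chassis power status

# 3. The same check through the fence agent.
run fence_ipmilan -P -a "$BMC_IP" -l "$BMC_USER" -p "$BMC_PASS" -o status

# 4. Register a stonith resource for the peer, keep it off the node it fences,
#    and turn stonith on.
run crm configure primitive "st-$PEER" stonith:external/ipmi \
    params hostname="$PEER" ipaddr="$BMC_IP" userid="$BMC_USER" \
    passwd="$BMC_PASS" interface=lanplus \
    op monitor interval=60s
run crm configure location "l-st-$PEER" "st-$PEER" -inf: "$PEER"
run crm configure property stonith-enabled=true
```

Run it once as is to review the printed commands, then repeat with APPLY=1 and the real values. Each node needs its own stonith resource pointing at its own BMC, so step 4 is done once per node.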
Regards
Arjun
On Wed, Oct 29, 2014 at 2:11 PM, kamal kishi <kamal.kishi at gmail.com> wrote:
> Thanks for the info. I was trying to configure IPMI on the servers.
> Can you please suggest a procedure for enabling and configuring IPMI
> (perhaps one you have referred to yourself)?
> The sites I came across were hard to follow.
> The servers I'm using are Dell PowerEdge R320s.
>
> On Tue, Oct 28, 2014 at 7:55 PM, Digimer <lists at alteeve.ca> wrote:
>>
>> On 28/10/14 02:24 AM, kamal kishi wrote:
>>>
>>> Hi,
>>>
>>> I know that having no fencing configured creates issues,
>>> but is the current problem really due to the lack of fencing?
>>
>>
>> Maybe, maybe not. I can say that *not* having it will make solving the
>> problem much more difficult. Please get it working, it's pretty easy and it
>> will make your life a lot easier.
>>
>>> The syslog isn't revealing much about it.
>>> I would love to configure fencing, but right now I need a solution to
>>> the current problem. If you say fencing is the only solution, then I
>>> will have to set it up remotely.
>>
>>
>> It is critical, yes. Please add it, test it and then hook DRBD into it.
>>
>>> OS -> UBUNTU 12.04 (64 bits)
>>> DRBD -> 8.3.11
>>
>>
>> That is quite old. Can you update to 8.3.16? Also, which versions of
>> pacemaker and corosync are you running?
>>
>>> Thanks for the quick reply
>>>
>>> On Tue, Oct 28, 2014 at 11:19 AM, Digimer <lists at alteeve.ca> wrote:
>>>
>>> On 28/10/14 01:39 AM, kamal kishi wrote:
>>>
>>> Hi all,
>>>
>>> I'm facing a strange issue that I can't resolve. I'm not sure what is
>>> going wrong, and as far as I can tell the logs don't reveal much.
>>>
>>> Issue -
>>> I have configured a 2-node cluster and have attached the configuration
>>> file (New CRM conf of BIC.txt).
>>>
>>> If Server2, which is primary, is shut down (forcefully, by turning off
>>> the switch), Server1 restarts within a few seconds and then starts the
>>> resources. Even though Server1 does restart and start the resources,
>>> the recovery takes too long to satisfy the clients, and I feel the
>>> current behaviour is wrong.
>>>
>>> I have attached the syslog to this mail (syslog).
>>>
>>> Please go through it and suggest a solution, as the setup is at the
>>> client's site.
>>>
>>> --
>>> Regards,
>>> Kamal Kishore B V
>>>
>>>
>>> You really need fencing, first and foremost. This will cause the
>>> survivor to put the lost node into a known state and then safely
>>> begin taking over lost services. Do your nodes have IPMI (or iRMC,
>>> iLO, DRAC, etc)? If so, setting up stonith is easy.
>>>
>>> Once it is set up, configure DRBD to use the fence-handler
>>> 'crm-fence-peer.sh' and change the fencing policy to
>>> 'resource-and-stonith'. Without this, you will get split-brains and
>>> fail-over will be unpredictable.
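
For reference, a minimal sketch of the DRBD side described above, in DRBD 8.3 syntax. The resource name r0 is a placeholder, and the handler paths are where the DRBD utilities package usually installs the scripts, so check both against your setup. The unfence handler is not mentioned above; it is the usual companion, as it removes the constraint again once the resync has finished.

```
resource r0 {
  disk {
    # Freeze I/O and call the fence-peer handler when the peer is lost.
    fencing resource-and-stonith;
  }
  handlers {
    # Adds a constraint in pacemaker so the outdated peer cannot be promoted.
    fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
    # Removes that constraint once the peer has resynced.
    after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
  }
}
```

These two sections go into the existing resource definition on both nodes; 'drbdadm adjust r0' then applies the change.
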
>>>
>>> Once stonith is configured and tested in pacemaker and you've hooked
>>> DRBD's fencing into pacemaker, see if your problem remains. If it
>>> does, on both nodes, run: 'tail -f -n 0 /var/log/messages', kill a
>>> node and wait for things to settle down. Share the log output here.
>>>
>>> Please also tell us your OS, pacemaker, drbd and corosync versions.
>>>
>>> --
>>> Digimer
>>> Papers and Projects: https://alteeve.ca/w/
>>> What if the cure for cancer is trapped in the mind of a person
>>> without access to education?
>>>
>>> _______________________________________________
>>> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
>>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>>
>>> Project Home: http://www.clusterlabs.org
>>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>>> Bugs: http://bugs.clusterlabs.org
>>>
>>>
>>>
>>>
>>> --
>>> Regards,
>>> Kamal Kishore B V
>>>
>>>
>>
>>
>> --
>> Digimer
>> Papers and Projects: https://alteeve.ca/w/
>> What if the cure for cancer is trapped in the mind of a person without
>> access to education?
>
>
>
>
> --
> Regards,
> Kamal Kishore B V