[ClusterLabs] Delayed first monitoring

Miloš Kozák milos.kozak at lejmr.com
Wed Aug 12 15:45:32 UTC 2015


Thank you for your answer, but:

1) This sounds OK, but in other words it means that delaying the first 
monitor check is not possible.

2) The start function of the init script? I use the LSB scripts shipped 
with the distribution, so there is no way to change them (I can change 
them, but the changes will be lost with package upgrades). This is quite 
a typical situation; how can I do HA for Atlassian, for example? Jira 
takes 5 minutes to load.
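
One possible way to get a blocking start without editing the packaged script is a small wrapper init script that the cluster uses instead. The following is only a sketch under assumptions: the variable names INIT_SCRIPT, START_TIMEOUT and POLL_INTERVAL and the function start_and_wait are made up for illustration, and it assumes the packaged script's "status" action returns 0 only once the service is really up.

```shell
#!/bin/sh
# Hypothetical wrapper around an unmodified distribution init script.
# INIT_SCRIPT, START_TIMEOUT and POLL_INTERVAL are illustrative names,
# not settings known to Pacemaker or to any package.
INIT_SCRIPT="${INIT_SCRIPT:-/etc/init.d/httpd}"
START_TIMEOUT="${START_TIMEOUT:-300}"   # max seconds to wait for "status" to succeed
POLL_INTERVAL="${POLL_INTERVAL:-2}"     # whole seconds between status checks

# Start the service, then block until "status" returns 0 or the timeout expires.
start_and_wait() {
    "$INIT_SCRIPT" start || return 1
    waited=0
    until "$INIT_SCRIPT" status >/dev/null 2>&1; do
        if [ "$waited" -ge "$START_TIMEOUT" ]; then
            return 1    # generic error: the service never reported healthy
        fi
        sleep "$POLL_INTERVAL"
        waited=$((waited + POLL_INTERVAL))
    done
    return 0
}

# Only "start" is changed; every other action goes to the packaged script as-is.
case "${1:-}" in
    start) start_and_wait ;;
    "")    : ;;                      # no action given: do nothing
    *)     "$INIT_SCRIPT" "$@" ;;
esac
```

If such a wrapper were installed under a name of its own in /etc/init.d and used as the lsb: resource, package upgrades would leave it alone. The start operation timeout configured in the cluster (for example via "op start timeout=..." in pcs resource create) would have to be longer than START_TIMEOUT, otherwise Pacemaker gives up before the wrapper does.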



On 12.8.2015 at 16:14, Nekrasov, Alexander wrote:
> 1. Pacemaker will/may call a monitor before starting a resource, in which case it expects a NOT_RUNNING response. It's just checking assumptions at that point.
>
> 2. A resource::start must only return once resource::monitor succeeds. Basically the logic of a start() must follow this:
>
> start() {
>    start_daemon                # placeholder: launch the service
>    while ! monitor ; do        # poll until monitor reports success
>        sleep 1                 # any short delay between checks will do
>    done
>    return $OCF_SUCCESS
> }
>
>> -----Original Message-----
>> From: Miloš Kozák [mailto:milos.kozak at lejmr.com]
>> Sent: Wednesday, August 12, 2015 10:03 AM
>> To: users at clusterlabs.org
>> Subject: [ClusterLabs] Delayed first monitoring
>>
>> Hi,
>>
>> I have set up Corosync+CMAN+Pacemaker on CentOS 6.5 in order to
>> provide high availability for OpenNebula. However, I am facing a
>> strange problem which arises from my lack of knowledge.
>>
>> In the log I can see that when I create a resource based on an init
>> script, typically:
>>
>> pcs resource create httpd lsb:httpd
>>
>> The httpd daemon gets started, but monitor is initiated at the same time
>> and the resource is identified as not running. This behaviour makes
>> sense once we realize that starting the daemon takes some time. In this
>> particular case, I get error code 2, which means that the process is
>> running but the environment is not locked. The effect of this is that
>> the httpd resource gets restarted.
>>
>> My workaround is an extra sleep in the status function of the init script,
>> but I don't like this solution at all! Do you have any idea how to tackle
>> this problem in a proper way? I expected an op attribute which would
>> specify a delay between service start and the first monitor, but I could
>> not find it.
>>
>> Thank you, Milos
>>
>>
>> _______________________________________________
>> Users mailing list: Users at clusterlabs.org
>> http://clusterlabs.org/mailman/listinfo/users
>>
>> Project Home: http://www.clusterlabs.org
>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>> Bugs: http://bugs.clusterlabs.org




