[Pacemaker] hangs pending

Fri Feb 21 06:12:42 UTC 2014

Btw, what's with all these entries:

Feb 19 10:49:27 [1641] dev-cluster2-node2.unix.tensor.ru pacemakerd:     info: crm_log_init: 	Changed active directory to /var/lib/heartbeat/cores/root
Feb 19 10:49:27 [1641] dev-cluster2-node2.unix.tensor.ru pacemakerd:     info: crm_xml_cleanup: 	Cleaning up memory from libxml2
Feb 19 10:49:27 [1772] dev-cluster2-node2.unix.tensor.ru pacemakerd:     info: crm_log_init: 	Changed active directory to /var/lib/heartbeat/cores/hacluster
Feb 19 10:49:27 [1772] dev-cluster2-node2.unix.tensor.ru pacemakerd:     info: crm_xml_cleanup: 	Cleaning up memory from libxml2
Feb 19 10:49:29 [1851] dev-cluster2-node2.unix.tensor.ru pacemakerd:     info: crm_log_init: 	Changed active directory to /var/lib/heartbeat/cores/root
Feb 19 10:49:29 [1851] dev-cluster2-node2.unix.tensor.ru pacemakerd:     info: crm_xml_cleanup: 	Cleaning up memory from libxml2
Feb 19 10:49:35 [2130] dev-cluster2-node2.unix.tensor.ru pacemakerd:     info: crm_log_init: 	Changed active directory to /var/lib/heartbeat/cores/root
Feb 19 10:49:35 [2130] dev-cluster2-node2.unix.tensor.ru pacemakerd:     info: crm_xml_cleanup: 	Cleaning up memory from libxml2
Feb 19 10:49:35 [2191] dev-cluster2-node2.unix.tensor.ru pacemakerd:     info: crm_log_init: 	Changed active directory to /var/lib/heartbeat/cores/root
Feb 19 10:49:35 [2191] dev-cluster2-node2.unix.tensor.ru pacemakerd:     info: crm_xml_cleanup: 	Cleaning up memory from libxml2
Feb 19 10:49:40 [2288] dev-cluster2-node2.unix.tensor.ru pacemakerd:     info: crm_log_init: 	Changed active directory to /var/lib/heartbeat/cores/root
Feb 19 10:49:40 [2288] dev-cluster2-node2.unix.tensor.ru pacemakerd:     info: crm_xml_cleanup: 	Cleaning up memory from libxml2
Feb 19 10:49:45 [2388] dev-cluster2-node2.unix.tensor.ru pacemakerd:     info: crm_log_init: 	Changed active directory to /var/lib/heartbeat/cores/root
Feb 19 10:49:45 [2388] dev-cluster2-node2.unix.tensor.ru pacemakerd:     info: crm_xml_cleanup: 	Cleaning up memory from libxml2
Feb 19 10:49:51 [2468] dev-cluster2-node2.unix.tensor.ru pacemakerd:     info: crm_log_init: 	Changed active directory to /var/lib/heartbeat/cores/root
Feb 19 10:49:51 [2468] dev-cluster2-node2.unix.tensor.ru pacemakerd:     info: crm_xml_cleanup: 	Cleaning up memory from libxml2

Are you calling pacemakerd for some reason?
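
A rough way to check whether these are separate short-lived invocations is to group the lines by the PID in square brackets. The following is only a minimal Python sketch: the regular expression is an assumption based on the lines quoted above, and the names LINE_RE, group_by_pid and SAMPLE are made up for illustration and are not part of Pacemaker.

```python
import re

# Assumed layout, inferred from the excerpt above:
#   "<Mon DD HH:MM:SS> [<pid>] <host> <daemon>: <level>: <function>: <message>"
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+"
    r"\[(?P<pid>\d+)\]\s+\S+\s+"
    r"(?P<daemon>\w+):\s+\w+:\s+(?P<func>\w+):"
)


def group_by_pid(lines, daemon="pacemakerd"):
    """Map each PID to the (timestamp, function) pairs it logged for one daemon."""
    groups = {}
    for line in lines:
        match = LINE_RE.match(line)
        if match and match.group("daemon") == daemon:
            groups.setdefault(match.group("pid"), []).append(
                (match.group("ts"), match.group("func"))
            )
    return groups


# Four lines copied from the excerpt above.
SAMPLE = [
    "Feb 19 10:49:27 [1641] dev-cluster2-node2.unix.tensor.ru pacemakerd:     info: crm_log_init: \tChanged active directory to /var/lib/heartbeat/cores/root",
    "Feb 19 10:49:27 [1641] dev-cluster2-node2.unix.tensor.ru pacemakerd:     info: crm_xml_cleanup: \tCleaning up memory from libxml2",
    "Feb 19 10:49:27 [1772] dev-cluster2-node2.unix.tensor.ru pacemakerd:     info: crm_log_init: \tChanged active directory to /var/lib/heartbeat/cores/hacluster",
    "Feb 19 10:49:27 [1772] dev-cluster2-node2.unix.tensor.ru pacemakerd:     info: crm_xml_cleanup: \tCleaning up memory from libxml2",
]

if __name__ == "__main__":
    # Print one line per PID with the functions it logged.
    for pid, events in group_by_pid(SAMPLE).items():
        print(pid, events)
```

If every PID shows just one crm_log_init / crm_xml_cleanup pair, as in the excerpt, that would suggest the binary is being started and exiting repeatedly rather than one long-running daemon writing these lines.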

On 19 Feb 2014, at 7:53 pm, Andrey Groshev <greenx at yandex.ru> wrote:

> 
> 
> 19.02.2014, 09:49, "Andrew Beekhof" <andrew at beekhof.net>:
>> On 19 Feb 2014, at 4:18 pm, Andrey Groshev <greenx at yandex.ru> wrote:
>> 
>>>  19.02.2014, 09:08, "Andrew Beekhof" <andrew at beekhof.net>:
>>>>  On 19 Feb 2014, at 4:00 pm, Andrey Groshev <greenx at yandex.ru> wrote:
>>>>>   19.02.2014, 06:48, "Andrew Beekhof" <andrew at beekhof.net>:
>>>>>>   On 18 Feb 2014, at 11:05 pm, Andrey Groshev <greenx at yandex.ru> wrote:
>>>>>>>    Hi, ALL and Andrew!
>>>>>>> 
>>>>>>>    Today is a good day - I killed a lot, and got shot at a lot.
>>>>>>>    In general - I am happy (almost like an elephant)   :)
>>>>>>>    Besides the resources, eight processes on the node are important to me: corosync,pacemakerd,cib,stonithd,lrmd,attrd,pengine,crmd.
>>>>>>>    I killed them with different signals (4,6,11 and even 9).
>>>>>>>    The behavior does not depend on the signal number - that's good.
>>>>>>>    If STONITH sends a reboot to the node, it reboots and rejoins the cluster - that's good too.
>>>>>>>    But the behavior differs depending on which daemon is killed.
>>>>>>> 
>>>>>>>    They fall into four groups:
>>>>>>>    1. corosync,cib - STONITH works 100% of the time.
>>>>>>>    Killing them with any signal triggers STONITH and a reboot.
>>>>>>> 
>>>>>>>    2. lrmd,crmd - STONITH behaves strangely.
>>>>>>>    Sometimes STONITH is called, with the corresponding reaction.
>>>>>>>    Sometimes the daemon restarts and the resources restart, with a long delay for MS:pgsql.
>>>>>>>    Once, after crmd restarted, pgsql did not restart.
>>>>>>> 
>>>>>>>    3. stonithd,attrd,pengine - no STONITH needed.
>>>>>>>    These daemons simply restart; the resources stay running.
>>>>>>> 
>>>>>>>    4. pacemakerd - nothing happens.
>>>>>>>    After that I can kill any process of the third group, and it does not restart.
>>>>>>>    In general, don't touch corosync,cib and maybe lrmd,crmd.
>>>>>>> 
>>>>>>>    What do you think about this?
>>>>>>>    We have resolved the main question of this thread.
>>>>>>>    But this inconsistent behavior is another big problem.
>>>>>>> 
>>>>>>>    I forgot the logs: http://send2me.ru/pcmk-Tue-18-Feb-2014.tar.bz2
>>>>>>   Which of the various conditions above do the logs cover?
>>>>>   All of them, over the course of the day.
>>>>  Are you trying to torture me?
>>>>  Can you give me a rough idea what happened when?
>>>  No, it is 8 processes times 4 signals, plus repeated experiments with unknown outcomes :)
>>>  It is easier to run new experiments and collect separate new logs.
>>>  Which variant is more interesting?
>> 
>> The long delay in restarting pgsql.
>> Everything else seems correct.
>> 
> 
> It did not even try to start pgsql.
> The logs contain three tests.
> kill -s4 lrmd pid.
> 1. STONITH
> 2. STONITH
> 3. hangs
> http://send2me.ru/pcmk-Wed-19-Feb-2014.tar.bz2
> 
>> _______________________________________________
>> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>> 
>> Project Home: http://www.clusterlabs.org
>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>> Bugs: http://bugs.clusterlabs.org