[Pacemaker] Why is a node on which no failure has occurred reported as "lost"?

Thu Feb 20 04:39:14 EST 2014

Hi, Andrew

2014-02-20 17:28 GMT+09:00 Andrew Beekhof <andrew at beekhof.net>:
> Who was pid 16243?
> Doesn't look like a pacemaker daemon.
pid 16243 is crm_mon.
I started crm_mon on vm01 to check the cluster state.

If you need any other information for the analysis, I will collect it.

Regards,
Yusuke
>
>>
>> On vm09, the event queue between cib and stonithd overflowed.
>> Feb 20 14:20:22 [15519] vm09        cib: (       ipc.c:506   )
>> trace: crm_ipcs_flush_events:  Sent 36 events (530 remaining) for
>> 0x105ec10[15520]: Resource temporarily unavailable (-11)
>> Feb 20 14:20:22 [15519] vm09        cib: (       ipc.c:515   )
>> error: crm_ipcs_flush_events:  Evicting slow client 0x105ec10[15520]:
>> event queue reached 530 entries
>>
>> I looked at the relevant part of the code, but I could not work out
>> how the problem should be fixed.
>>
>> Is sending at most 100 messages at a time too few?
>> Is there a problem with how the wait time after sending messages is calculated?
>> Or is the threshold of 500 too low?
>
> Being 500 behind is really quite a long way.
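
As a rough illustration of the behaviour in the quoted log, the sketch below models a per-client event queue that is flushed in batches and evicts the client once the backlog grows too large. It is a toy model in Python, not Pacemaker's actual code: the batch limit of 100 and the threshold of 500 are taken from the questions above, while the class name, method names, and the `client_capacity` parameter are hypothetical.

```python
from collections import deque

# Assumed limits, taken from the discussion above rather than from the source code.
MAX_EVENTS_PER_FLUSH = 100   # "100 at a time"
EVICTION_THRESHOLD = 500     # "threshold of 500"


class ClientEventQueue:
    """Toy model of a server-side event queue for one IPC client (hypothetical)."""

    def __init__(self):
        self.events = deque()
        self.evicted = False

    def enqueue(self, event):
        # Events for an evicted client are simply dropped.
        if not self.evicted:
            self.events.append(event)

    def flush(self, client_capacity):
        """Send as many events as the client can accept, up to the batch limit.

        Returns (sent, remaining). The client is evicted when the backlog
        left after the flush exceeds EVICTION_THRESHOLD.
        """
        if self.evicted:
            return 0, 0
        sent = max(0, min(MAX_EVENTS_PER_FLUSH, client_capacity, len(self.events)))
        for _ in range(sent):
            self.events.popleft()
        remaining = len(self.events)
        if remaining > EVICTION_THRESHOLD:
            self.evicted = True
            self.events.clear()
        return sent, remaining


# A slow client: 566 events queued, but only 36 can be delivered this round.
slow = ClientEventQueue()
for i in range(566):
    slow.enqueue(i)
print(slow.flush(client_capacity=36), slow.evicted)   # (36, 530) True
```

In this model, a client is evicted only if it falls more than 500 events behind, so raising the threshold would delay eviction without helping a client that never catches up.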

-- 
----------------------------------------
METRO SYSTEMS CO., LTD

Yusuke Iida
Mail: yusk.iida at gmail.com
----------------------------------------