[ClusterLabs] Antw: Re: Discovering corosync-blackbox

Ulrich Windl Ulrich.Windl at rz.uni-regensburg.de
Wed May 13 12:28:28 UTC 2015


>>> Jan Friesse <jfriesse at redhat.com> wrote on 13.05.2015 at 14:11 in message
<55533F54.6090008 at redhat.com>:
> Ulrich Windl wrote:
[...]
>> Unfortunately the time stamps with seconds resolution seem to be a bit like
> a joke, and supplying the data seems very redundant to me. Consider this 

Sorry, mis-spelled: "the data" should read "the date".

> example:
> 
> blackbox is really more for developers than average administrators (this 
> doesn't mean they cannot use it/profit from using it). That's why it's 
> so redundant.

I meant the "date". However, I found out later that on some less busy systems the blackbox really does seem to hold data going back several days.

> 
>> # corosync-blackbox | grep ringid
>> ...
>> rec=[867124327] time=[2015-05-13 08:47:19] Tracing(1) Messsage=Received 
> ringid(172.20.16.1:2512) seq a0dc116
[...]
>> ...
>>
>> More questions:
>> "2512" is not the port configured; what is it? And should the "seq" be 
> somewhat "gapless"? Or is the whole blackbox lossy?
> 
> 2512 is a monotonically increasing number (not a port at all) and it
> increases on every membership change. Basically, ringid is
> leader_ip:increasing_number.

OK, so it's just notation (usually you write address:port).
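To make the notation concrete, here is a small sketch of how such a line could be split into its parts. The regular expression and the name parse_ringid are my own assumptions, derived only from the single line quoted above; they are not part of corosync, and other blackbox records may look different. The seq value looks hexadecimal to me, but since I am not sure, it is kept as a string.

```python
import re

# Pattern guessed from the one blackbox line quoted in this thread.
RINGID_RE = re.compile(
    r"ringid\((?P<leader>[^:()]+):(?P<counter>\d+)\)\s+seq\s+(?P<seq>[0-9a-fA-F]+)"
)

def parse_ringid(line):
    """Return (leader_ip, membership_counter, seq) or None if no ringid is found."""
    m = RINGID_RE.search(line)
    if m is None:
        return None
    # The part after the colon is the increasing membership number, not a port.
    return m.group("leader"), int(m.group("counter")), m.group("seq")

# The example line from above, joined back into one line.
line = ("rec=[867124327] time=[2015-05-13 08:47:19] Tracing(1) "
        "Messsage=Received ringid(172.20.16.1:2512) seq a0dc116")
print(parse_ringid(line))
```

For the quoted line this yields the leader address 172.20.16.1, the membership counter 2512, and the seq string a0dc116.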

> 
> Blackbox both is and isn't lossy. It's a ring buffer, so the number of
> messages is limited and the oldest messages are overwritten, but messages
> should be in order.

So if I see gaps in the seq numbering, what does it mean? Are there multiple seq numbers in one message, and only the first number is logged, or do I have a massive loss of packets somewhere?
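For reference, this is how I understand the ring-buffer behaviour described above, as a minimal sketch. Python's collections.deque is only a stand-in here; I have not checked how the real blackbox buffer is implemented, and ring_log is a hypothetical helper name.

```python
from collections import deque

def ring_log(capacity, messages):
    """Keep only the newest `capacity` messages, in arrival order."""
    buf = deque(maxlen=capacity)  # the oldest entry is dropped once the buffer is full
    for msg in messages:
        buf.append(msg)
    return list(buf)

# Ten messages into a buffer of four: the first six are overwritten,
# the remaining ones are still in order.
print(ring_log(4, range(10)))
```

With such a buffer, messages can only go missing at the old end, so it would not by itself explain gaps in the middle of the seq numbering.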

> 
> 
>>
>> What's the meaning of "Log Message=Can't store blackbox file: Success (0)"?
> 
> Corosync 1.4.x or 2.3.x?

corosync-1.4.7-0.21.3 of SLES11 SP3

Regards,
Ulrich






