[Pacemaker] Pacemaker still may include memory leaks

Tue Jun 4 02:30:51 UTC 2013

On 03/06/2013, at 8:55 PM, Yuichi SEINO <seino.cluster2 at gmail.com> wrote:

> Hi,
> 
> I ran the test again after we updated pacemaker.
> 
> I used the same procedure as in the previous test. However, I think that a
> memory leak may still be occurring.
> 
> I attached the results (smaps, crm_mon and env). I also made a chart of
> the totals across all address ranges for RSS, SHR (Shared_Clean+Shared_Dirty)
> and PRI (Private_Clean+Private_Dirty).
> 
> The change in PRI comes from [heap]: Private_Dirty differs only for [heap],
> and Private_Clean does not differ at all.
> 
>>> --- smaps.5     2013-05-29 02:39:25.032940230 -0400
>>> +++ smaps.6     2013-05-29 03:48:51.278940819 -0400
> 
> I think that your test ran for about 1h. However, in my test there are
> intervals during which the memory size does not change, and some of those
> intervals are longer than 1h.
> 
> The change of PRI
> ...
> Time:2013/5/30 12:28 PRI:3740
> ...
> Time:2013/5/30 14:16 PRI:3740
> ...
> 
> There are also periods in which the memory size fluctuates a little.
> However, as a whole,
> the memory size continues to increase.
> 
> The change of PRI
> ...
> Time:2013/5/30 17:51 PRI:3792

Ok, so what happened at this time?  Logs?

There is no timer in pacemaker that runs this long (and the 1 hour of my test was equivalent to a few months in real life). 

> ...
> Time:2013/5/30 17:53 PRI:3844
> ...
> Time:2013/5/30 17:55 PRI:3792
> ...
> 
> Perhaps differences in the resource configuration and the test procedure
> affect the result.
> I would like to run the same test as you. Could you tell me the details of your test?

I ran cts with:

  cts clean run --stack cman --stonith rhevm --ip 11.0.0.1 --choose Standby 500

Your stonith would be different though.
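
For comparing numbers between runs, the RSS/SHR/PRI totals used in this thread can be recomputed from any saved smaps snapshot. The following is only a minimal sketch, not the script that produced the attached charts: the function name smaps_totals is made up for illustration, and it assumes the usual "Field:  N kB" layout shown in the smaps diff quoted further down.

```python
import re
from collections import defaultdict

# Matches smaps field lines such as "Private_Dirty:      2268 kB".
# Mapping header lines (address ranges) and non-kB fields do not match.
FIELD_RE = re.compile(r"^(\w+):\s+(\d+) kB$")


def smaps_totals(text):
    """Sum smaps fields over all mappings (hypothetical helper).

    Returns totals in kB using the definitions from this thread:
      RSS = sum of Rss
      SHR = Shared_Clean + Shared_Dirty
      PRI = Private_Clean + Private_Dirty
    """
    sums = defaultdict(int)
    for line in text.splitlines():
        match = FIELD_RE.match(line.strip())
        if match:
            sums[match.group(1)] += int(match.group(2))
    return {
        "RSS": sums["Rss"],
        "SHR": sums["Shared_Clean"] + sums["Shared_Dirty"],
        "PRI": sums["Private_Clean"] + sums["Private_Dirty"],
    }
```

Feeding it the contents of a file saved with "cat /proc/`pidof crmd`/smaps > smaps.N" gives one data point; doing that for each snapshot gives the same kind of series as the "Time: ... PRI: ..." lines above.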

> 
> Sincerely,
> Yuichi
> 
> 2013/5/29 Yuichi SEINO <seino.cluster2 at gmail.com>:
>> 2013/5/29 Andrew Beekhof <andrew at beekhof.net>:
>>> 
>>> On 28/05/2013, at 4:30 PM, Andrew Beekhof <andrew at beekhof.net> wrote:
>>> 
>>>> 
>>>> On 28/05/2013, at 10:12 AM, Andrew Beekhof <andrew at beekhof.net> wrote:
>>>> 
>>>>> 
>>>>> On 27/05/2013, at 5:08 PM, Vladislav Bogdanov <bubble at hoster-ok.com> wrote:
>>>>> 
>>>>>> 27.05.2013 04:20, Yuichi SEINO wrote:
>>>>>>> Hi,
>>>>>>> 
>>>>>>> 2013/5/24 Vladislav Bogdanov <bubble at hoster-ok.com>:
>>>>>>>> 24.05.2013 06:34, Andrew Beekhof wrote:
>>>>>>>>> Any help figuring out where the leaks might be would be very much appreciated :)
>>>>>>>> 
>>>>>>>> One (and the only) suspect is unfortunately crmd itself.
>>>>>>>> Its private heap has grown from 2708 to 3680 kB.
>>>>>>>> 
>>>>>>>> All other relevant differences are in qb shm buffers, which are
>>>>>>>> controlled and may grow until they reach configured size.
>>>>>>>> 
>>>>>>>> @Yuichi
>>>>>>>> I would recommend running under valgrind on a test cluster to
>>>>>>>> figure out whether that is a memleak (lost memory) or some history data
>>>>>>>> (referenced memory). The latter may be a logical memleak though. You may
>>>>>>>> look in /etc/sysconfig/pacemaker for details.
>>>>>>> 
>>>>>>> I ran under valgrind for about 2 days, and I attached the valgrind
>>>>>>> output from the ACT node and the SBY node.
>>>>>> 
>>>>>> 
>>>>>> I do not see any "direct" memory leaks (repeating 'definitely-lost'
>>>>>> allocations) there.
>>>>>> 
>>>>>> So what we see is probably one of:
>>>>>> * Cache/history/etc, which grows up to some limit (or expires at
>>>>>> some point in time).
>>>>>> * Unlimited/not-expirable lists/hashes of data structures, which are
>>>>>> correctly freed at exit
>>>>> 
>>>>> There are still plenty of memory chunks not free'd at exit, I'm slowly working through those.
>>>> 
>>>> I've pushed the following to my repo:
>>>> 
>>>> + Andrew Beekhof (2 hours ago) d070092: Test: More glib suppressions
>>>> + Andrew Beekhof (2 hours ago) ec74bf0: Fix: Fencing: Ensure API object is consistently free'd
>>>> + Andrew Beekhof (2 hours ago) 6130d23: Fix: Free additional memory at exit
>>>> + Andrew Beekhof (2 hours ago) b76d6be: Refactor: crmd: Allocate a mainloop before doing anything to help valgrind
>>>> + Andrew Beekhof (3 hours ago) d4041de: Log: init: Remove unnecessary detail from shutdown message
>>>> + Andrew Beekhof (3 hours ago) 282032b: Fix: Clean up internal mainloop structures at exit
>>>> + Andrew Beekhof (4 hours ago) 0947721: Fix: Core: Correctly unreference GSource inputs
>>>> + Andrew Beekhof (25 hours ago) d94140d: Fix: crmd: Clean up more memory before exit
>>>> + Andrew Beekhof (25 hours ago) b44257c: Test: cman: Ignore additional valgrind errors
>>>> 
>>>> If someone would like to run the cluster (no valgrind needed) for a while with
>>>> 
>>>> export PCMK_trace_functions=mainloop_gio_destroy,mainloop_add_fd,mainloop_del_fd,crmd_exit,crm_peer_destroy,empty_uuid_cache,lrm_state_destroy_all,internal_lrm_state_destroy,do_stop,mainloop_destroy_trigger,mainloop_setup_trigger,do_startup,stonith_api_delete
>>>> 
>>>> and then (after grabbing smaps) shut it down, we should have some information about any lists/hashes that are growing too large.
>>>> 
>>>> Also, be sure to run with:
>>>> 
>>>> export G_SLICE=always-malloc
>>>> 
>>>> which will prevent glib from accumulating pools of memory and distorting any results.
>>> 
>>> 
>>> I did this today with 2747e25 and it looks to me like there is no leak (anymore?)
>>> For context, between smaps.5 and smaps.6, the 4 node cluster ran over 120 "standby" tests (lots of PE runs and resource activity).
>>> So unless someone can show me otherwise, I'm going to move on :)
>> 
>> I see. I also want to test for a leak. I will report the result after the test.
>> 
>>> 
>>> Note that the [heap] changes are actually the memory usage going _backwards_.
>>> 
>>> Raw results below.
>>> 
>>> [root at corosync-host-1 ~]# cat /proc/`pidof crmd`/smaps  > smaps.6 ; diff -u smaps.5 smaps.6;
>>> --- smaps.5     2013-05-29 02:39:25.032940230 -0400
>>> +++ smaps.6     2013-05-29 03:48:51.278940819 -0400
>>> @@ -40,16 +40,16 @@
>>> Swap:                  0 kB
>>> KernelPageSize:        4 kB
>>> MMUPageSize:           4 kB
>>> -0226b000-02517000 rw-p 00000000 00:00 0                                  [heap]
>>> -Size:               2736 kB
>>> -Rss:                2268 kB
>>> -Pss:                2268 kB
>>> +0226b000-02509000 rw-p 00000000 00:00 0                                  [heap]
>>> +Size:               2680 kB
>>> +Rss:                2212 kB
>>> +Pss:                2212 kB
>>> Shared_Clean:          0 kB
>>> Shared_Dirty:          0 kB
>>> Private_Clean:         0 kB
>>> -Private_Dirty:      2268 kB
>>> -Referenced:         2268 kB
>>> -Anonymous:          2268 kB
>>> +Private_Dirty:      2212 kB
>>> +Referenced:         2212 kB
>>> +Anonymous:          2212 kB
>>> AnonHugePages:         0 kB
>>> Swap:                  0 kB
>>> KernelPageSize:        4 kB
>>> @@ -112,13 +112,13 @@
>>> MMUPageSize:           4 kB
>>> 7f0c6e918000-7f0c6ee18000 rw-s 00000000 00:10 522579                     /dev/shm/qb-pengine-event-27411-27412-6-data
>>> Size:               5120 kB
>>> -Rss:                3572 kB
>>> -Pss:                1785 kB
>>> +Rss:                4936 kB
>>> +Pss:                2467 kB
>>> Shared_Clean:          0 kB
>>> -Shared_Dirty:       3572 kB
>>> +Shared_Dirty:       4936 kB
>>> Private_Clean:         0 kB
>>> Private_Dirty:         0 kB
>>> -Referenced:         3572 kB
>>> +Referenced:         4936 kB
>>> Anonymous:             0 kB
>>> AnonHugePages:         0 kB
>>> Swap:                  0 kB
>>> @@ -841,7 +841,7 @@
>>> 7f0c72b00000-7f0c72b1d000 r-xp 00000000 fd:00 119                        /lib64/libselinux.so.1
>>> Size:                116 kB
>>> Rss:                  36 kB
>>> -Pss:                   5 kB
>>> +Pss:                   4 kB
>>> Shared_Clean:         36 kB
>>> Shared_Dirty:          0 kB
>>> Private_Clean:         0 kB
>>> @@ -1401,7 +1401,7 @@
>>> 7f0c740c6000-7f0c74250000 r-xp 00000000 fd:00 45                         /lib64/libc-2.12.so
>>> Size:               1576 kB
>>> Rss:                 588 kB
>>> -Pss:                  20 kB
>>> +Pss:                  19 kB
>>> Shared_Clean:        588 kB
>>> Shared_Dirty:          0 kB
>>> Private_Clean:         0 kB
>>> 
>>> 
>>>> 
>>>> 
>>>>> Once we know all memory is being cleaned up, the next step is to check the size of things beforehand.
>>>>> 
>>>>> I'm hoping one or more of them show up as unnaturally large, indicating things are being added but not removed.
>>>>> 
>>>>>> (f.e like dlm_controld has(had???) for a
>>>>>> debugging buffer or like glibc resolver had in EL3). This cannot be
>>>>>> caught with valgrind if you use it in a standard way.
>>>>>> 
>>>>>> I believe we have the former. To prove that, it would be very
>>>>>> interesting to run under the valgrind *debugger* (--vgdb=yes|full) for a
>>>>>> sufficiently long period (2-3 weeks) and periodically get the memory
>>>>>> allocation state from there (with the 'monitor leak_check full reachable
>>>>>> any' gdb command). I wanted to do that a long time ago, but
>>>>>> unfortunately did not have enough spare time to even try it (although
>>>>>> I have tried running other programs under valgrind that way).
>>>>>> 
>>>>>> This is described in valgrind documentation:
>>>>>> http://valgrind.org/docs/manual/manual-core-adv.html#manual-core-adv.gdbserver
>>>>>> 
>>>>>> We probably do not need to specify '--vgdb-error=0' because we do not
>>>>>> need to install watchpoints at the start (and we do not need/want to
>>>>>> immediately connect to crmd with gdb to tell it to continue), we just
>>>>>> need to periodically get status of memory allocations
>>>>>> (stop-leak_check-cont sequence). Probably that should be done in a
>>>>>> 'fast' manner, so crmd does not stop for a long time, and the rest of
>>>>>> pacemaker does not see it as 'hung'. Again, I did not try that, and I do
>>>>>> not know if it's even possible to do that with crmd.
>>>>>> 
>>>>>> And, as pacemaker heavily utilizes glib, which has its own memory allocator
>>>>>> (slices), it is better to switch it to a 'standard' malloc/free for
>>>>>> debugging with G_SLICE=always-malloc env var.
>>>>>> 
>>>>>> Last, I did memleak checks for a 'static' (i.e. no operations except
>>>>>> monitors are performed) cluster for ~1.1.8, and did not find any. It
>>>>>> would be interesting to see if that is true for an 'active' one, which
>>>>>> starts/stops resources, handles failures, etc.
>>>>>> 
>>>>>>> 
>>>>>>> Sincerely,
>>>>>>> Yuichi
>>>>>>> 
>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Also, the measurements are in pages... could you run "getconf PAGESIZE" and let us know the result?
>>>>>>>>> I'm guessing 4096 bytes.
>>>>>>>>> 
>>>>>>>>> On 23/05/2013, at 5:47 PM, Yuichi SEINO <seino.cluster2 at gmail.com> wrote:
>>>>>>>>> 
>>>>>>>>>> Hi,
>>>>>>>>>> 
>>>>>>>>>> I retried the test after we updated the packages to the latest tags and updated the OS.
>>>>>>>>>> glue and booth are at their latest commits.
>>>>>>>>>> 
>>>>>>>>>> * Environment
>>>>>>>>>> OS:RHEL 6.4
>>>>>>>>>> cluster-glue:latest(commit:2755:8347e8c9b94f) +
>>>>>>>>>> patch[detail:http://www.gossamer-threads.com/lists/linuxha/dev/85787]
>>>>>>>>>> resource-agent:v3.9.5
>>>>>>>>>> libqb:v0.14.4
>>>>>>>>>> corosync:v2.3.0
>>>>>>>>>> pacemaker:v1.1.10-rc2
>>>>>>>>>> crmsh:v1.2.5
>>>>>>>>>> booth:latest(commit:67e1208973de728958432aaba165766eac1ce3a0)
>>>>>>>>>> 
>>>>>>>>>> * Test procedure
>>>>>>>>>> We regularly switch a ticket. The previous test used the same procedure.
>>>>>>>>>> There was no memory leak when we tested pacemaker-1.1 before
>>>>>>>>>> pacemaker used libqb.
>>>>>>>>>> 
>>>>>>>>>> * Result
>>>>>>>>>> As a result, I think that crmd may cause the memory leak.
>>>>>>>>>> 
>>>>>>>>>> crmd smaps (totals across all address ranges)
>>>>>>>>>> For details, we attached the smaps from the start and the end. I recorded
>>>>>>>>>> smaps every minute.
>>>>>>>>>> 
>>>>>>>>>> Start
>>>>>>>>>> RSS: 7396
>>>>>>>>>> SHR(Shared_Clean+Shared_Dirty):3560
>>>>>>>>>> Private(Private_Clean+Private_Dirty):3836
>>>>>>>>>> 
>>>>>>>>>> Interval(about 30h later)
>>>>>>>>>> RSS:18464
>>>>>>>>>> SHR:14276
>>>>>>>>>> Private:4188
>>>>>>>>>> 
>>>>>>>>>> End(about 70h later)
>>>>>>>>>> RSS:19104
>>>>>>>>>> SHR:14336
>>>>>>>>>> Private:4768
>>>>>>>>>> 
>>>>>>>>>> Sincerely,
>>>>>>>>>> Yuichi
>>>>>>>>>> 
>>>>>>>>>> 2013/5/15 Yuichi SEINO <seino.cluster2 at gmail.com>:
>>>>>>>>>>> Hi,
>>>>>>>>>>> 
>>>>>>>>>>> I ran the test for about two days.
>>>>>>>>>>> 
>>>>>>>>>>> Environment
>>>>>>>>>>> 
>>>>>>>>>>> OS:RHEL 6.3
>>>>>>>>>>> pacemaker-1.1.9-devel (commit 138556cb0b375a490a96f35e7fbeccc576a22011)
>>>>>>>>>>> corosync-2.3.0
>>>>>>>>>>> cluster-glue latest+patch(detail:http://www.gossamer-threads.com/lists/linuxha/dev/85787)
>>>>>>>>>>> libqb- 0.14.4
>>>>>>>>>>> 
>>>>>>>>>>> There may be a memory leak in crmd and lrmd. I regularly recorded the RSS reported by ps.
>>>>>>>>>>> 
>>>>>>>>>>> start-up
>>>>>>>>>>> crmd:5332
>>>>>>>>>>> lrmd:3625
>>>>>>>>>>> 
>>>>>>>>>>> interval(about 30h later)
>>>>>>>>>>> crmd:7716
>>>>>>>>>>> lrmd:3744
>>>>>>>>>>> 
>>>>>>>>>>> ending(about 60h later)
>>>>>>>>>>> crmd:8336
>>>>>>>>>>> lrmd:3780
>>>>>>>>>>> 
>>>>>>>>>>> I have not yet run the test with pacemaker-1.1.10-rc2, so I will run that test next.
>>>>>>>>>>> 
>>>>>>>>>>> Sincerely,
>>>>>>>>>>> Yuichi
>>>>>>>>>>> 
>>>>>>>>>>> --
>>>>>>>>>>> Yuichi SEINO
>>>>>>>>>>> METROSYSTEMS CORPORATION
>>>>>>>>>>> E-mail:seino.cluster2 at gmail.com
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> --
>>>>>>>>>> Yuichi SEINO
>>>>>>>>>> METROSYSTEMS CORPORATION
>>>>>>>>>> E-mail:seino.cluster2 at gmail.com
>>>>>>>>>> <smaps_log.tar.gz>
>>>>>>>>>> _______________________________________________
>>>>>>>>>> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
>>>>>>>>>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>>>>>>>>> 
>>>>>>>>>> Project Home: http://www.clusterlabs.org
>>>>>>>>>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>>>>>>>>>> Bugs: http://bugs.clusterlabs.org
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> --
>>>>>>> Yuichi SEINO
>>>>>>> METROSYSTEMS CORPORATION
>>>>>>> E-mail:seino.cluster2 at gmail.com
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>>> 
>>>>> 
>>>> 
>>> 
>>> 
>> 
>> 
>> 
>> --
>> Yuichi SEINO
>> METROSYSTEMS CORPORATION
>> E-mail:seino.cluster2 at gmail.com
> 
> --
> Yuichi SEINO
> METROSYSTEMS CORPORATION
> E-mail:seino.cluster2 at gmail.com
> <test_info.tar.bz>