[Pacemaker] lrmd Memory Usage
Andrew Beekhof
andrew at beekhof.net
Tue May 6 09:06:14 UTC 2014
On 6 May 2014, at 6:05 pm, Greg Murphy <greg.murphy at gamesparks.com> wrote:
> Attached are the valgrind outputs from two separate runs of lrmd with the
> suggested variables set. Do they help narrow the issue down?
They do, somewhat. I'll investigate. But much of the memory is still reachable:
==26203== indirectly lost: 17,945,950 bytes in 642,546 blocks
==26203== possibly lost: 2,805 bytes in 60 blocks
==26203== still reachable: 26,104,781 bytes in 544,782 blocks
==26203== suppressed: 8,652 bytes in 176 blocks
==26203== Reachable blocks (those to which a pointer was found) are not shown.
==26203== To see them, rerun with: --leak-check=full --show-reachable=yes
Could you add --show-reachable=yes to the VALGRIND_OPTS variable?
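
For reference, the full setting would then look something like this (same
options as before, just with --show-reachable=yes added; adjust the paths to
your install):

  VALGRIND_OPTS="--leak-check=full --show-reachable=yes --trace-children=no \
      --num-callers=25 --log-file=/var/lib/pacemaker/valgrind-%p \
      --suppressions=/usr/share/pacemaker/tests/valgrind-pcmk.suppressions \
      --gen-suppressions=all"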
>
>
> Thanks
>
> Greg
>
>
> On 02/05/2014 03:01, "Andrew Beekhof" <andrew at beekhof.net> wrote:
>
>>
>> On 30 Apr 2014, at 9:01 pm, Greg Murphy <greg.murphy at gamesparks.com>
>> wrote:
>>
>>> Hi
>>>
>>> I'm running a two-node Pacemaker cluster on Ubuntu Saucy (13.10),
>>> kernel 3.11.0-17-generic and the Ubuntu Pacemaker package, version
>>> 1.1.10+git20130802-1ubuntu1.
>>
>> The problem is that I have no way of knowing what code is/isn't included
>> in '1.1.10+git20130802-1ubuntu1'.
>> You could try setting the following in your environment before starting
>> pacemaker, though:
>>
>> # Variables for running child daemons under valgrind and/or checking for
>> memory problems
>> G_SLICE=always-malloc
>> MALLOC_PERTURB_=221 # or 0
>> MALLOC_CHECK_=3 # or 0,1,2
>> PCMK_valgrind_enabled=lrmd
>> VALGRIND_OPTS="--leak-check=full --trace-children=no --num-callers=25
>> --log-file=/var/lib/pacemaker/valgrind-%p
>> --suppressions=/usr/share/pacemaker/tests/valgrind-pcmk.suppressions
>> --gen-suppressions=all"
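
(In case it saves a hunt: on Debian/Ubuntu these would normally live in the
environment file the init script sources -- I'd expect /etc/default/pacemaker,
but verify against your packaging -- so that pacemakerd exports them to the
child daemons, e.g.:

  # /etc/default/pacemaker   (path assumed -- check your package)
  G_SLICE=always-malloc
  MALLOC_PERTURB_=221
  MALLOC_CHECK_=3
  PCMK_valgrind_enabled=lrmd
  VALGRIND_OPTS="--leak-check=full --trace-children=no --num-callers=25 \
      --log-file=/var/lib/pacemaker/valgrind-%p \
      --suppressions=/usr/share/pacemaker/tests/valgrind-pcmk.suppressions \
      --gen-suppressions=all"

then restart pacemaker at a convenient moment so lrmd is re-spawned with them.)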
>>
>>
>>> The cluster is configured with a DRBD master/slave set and then a
>>> failover resource group containing MySQL (along with its DRBD
>>> filesystem) and a Zabbix Proxy and Agent.
>>>
>>> Since I built the cluster around two months ago I've noticed that on
>>> the active node the memory footprint of lrmd gradually grows to
>>> quite a significant size. The cluster was last restarted three weeks
>>> ago, and now lrmd has over 1GB of mapped memory on the active node and
>>> only 151MB on the passive node. Current excerpts from /proc/PID/status
>>> are:
>>>
>>> Active node
>>> VmPeak:  1146740 kB
>>> VmSize:  1146740 kB
>>> VmLck:         0 kB
>>> VmPin:         0 kB
>>> VmHWM:    267680 kB
>>> VmRSS:    188764 kB
>>> VmData:  1065860 kB
>>> VmStk:       136 kB
>>> VmExe:        32 kB
>>> VmLib:     10416 kB
>>> VmPTE:      2164 kB
>>> VmSwap:   822752 kB
>>>
>>> Passive node
>>> VmPeak:   220832 kB
>>> VmSize:   155428 kB
>>> VmLck:         0 kB
>>> VmPin:         0 kB
>>> VmHWM:      4568 kB
>>> VmRSS:      3880 kB
>>> VmData:    74548 kB
>>> VmStk:       136 kB
>>> VmExe:        32 kB
>>> VmLib:     10416 kB
>>> VmPTE:       172 kB
>>> VmSwap:        0 kB
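
(If you want to keep tracking this without hand-copying, a rough sketch --
untested, and it assumes a single lrmd process on the node:

  # append a timestamped Vm* summary for lrmd, e.g. from cron
  pid=$(pgrep -x lrmd | head -n1)
  { date; grep '^Vm' /proc/$pid/status; echo; } >> /var/tmp/lrmd-status.log

Appending with a timestamp gives you a history to line up against cluster
events.)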
>>>
>>> During the last week or so I've taken a couple of snapshots of
>>> /proc/PID/smaps on the active node, and the heap particularly stands out
>>> as growing: (I have the full outputs captured if they'll help)
>>>
>>> 20140422
>>> 7f92e1578000-7f92f218b000 rw-p 00000000 00:00 0          [heap]
>>> Size: 274508 kB
>>> Rss: 180152 kB
>>> Pss: 180152 kB
>>> Shared_Clean: 0 kB
>>> Shared_Dirty: 0 kB
>>> Private_Clean: 0 kB
>>> Private_Dirty: 180152 kB
>>> Referenced: 120472 kB
>>> Anonymous: 180152 kB
>>> AnonHugePages: 0 kB
>>> Swap: 91568 kB
>>> KernelPageSize: 4 kB
>>> MMUPageSize: 4 kB
>>> Locked: 0 kB
>>> VmFlags: rd wr mr mw me ac
>>>
>>>
>>> 20140423
>>> 7f92e1578000-7f92f305e000 rw-p 00000000 00:00 0          [heap]
>>> Size: 289688 kB
>>> Rss: 184136 kB
>>> Pss: 184136 kB
>>> Shared_Clean: 0 kB
>>> Shared_Dirty: 0 kB
>>> Private_Clean: 0 kB
>>> Private_Dirty: 184136 kB
>>> Referenced: 69748 kB
>>> Anonymous: 184136 kB
>>> AnonHugePages: 0 kB
>>> Swap: 103112 kB
>>> KernelPageSize: 4 kB
>>> MMUPageSize: 4 kB
>>> Locked: 0 kB
>>> VmFlags: rd wr mr mw me ac
>>>
>>> 20140430
>>> 7f92e1578000-7f92fc01d000 rw-p 00000000 00:00 0          [heap]
>>> Size: 436884 kB
>>> Rss: 140812 kB
>>> Pss: 140812 kB
>>> Shared_Clean: 0 kB
>>> Shared_Dirty: 0 kB
>>> Private_Clean: 744 kB
>>> Private_Dirty: 140068 kB
>>> Referenced: 43600 kB
>>> Anonymous: 140812 kB
>>> AnonHugePages: 0 kB
>>> Swap: 287392 kB
>>> KernelPageSize: 4 kB
>>> MMUPageSize: 4 kB
>>> Locked: 0 kB
>>> VmFlags: rd wr mr mw me ac
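
(Same idea for the heap mapping, if capturing these by hand gets tedious --
again just a sketch under the same single-process assumption:

  # snapshot the [heap] entry from lrmd's smaps
  pid=$(pgrep -x lrmd | head -n1)
  { date; grep -A 15 '\[heap\]' /proc/$pid/smaps; echo; } >> /var/tmp/lrmd-heap.log
)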
>>>
>>> I noticed in the release notes for 1.1.10-rc1
>>> (https://github.com/ClusterLabs/pacemaker/releases/tag/Pacemaker-1.1.10-r
>>> c1) that there was work done to fix "crmd: lrmd: stonithd: fixed memory
>>> leaks" but I'm not sure which particular bug this was related to. (And
>>> those fixes should be in the version I'm running anyway).
>>>
>>> I've also spotted a few memory leak fixes in
>>> https://github.com/beekhof/pacemaker, but I'm not sure whether they
>>> relate to my issue (assuming I have a memory leak and this isn't
>>> expected behaviour).
>>>
>>> Is there additional debugging that I can perform to check whether I
>>> have a leak, or is there enough evidence to justify upgrading to 1.1.11?
>>>
>>> Thanks in advance
>>>
>>> Greg Murphy
>>
>
> <lrmd.tgz>