[Pacemaker] lrmd Memory Usage

Greg Murphy greg.murphy at gamesparks.com
Tue May 6 04:05:30 EDT 2014


Attached are the valgrind outputs from two separate runs of lrmd with the
suggested variables set. Do they help narrow the issue down?


Thanks

Greg


On 02/05/2014 03:01, "Andrew Beekhof" <andrew at beekhof.net> wrote:

>
>On 30 Apr 2014, at 9:01 pm, Greg Murphy <greg.murphy at gamesparks.com>
>wrote:
>
>> Hi
>> 
>> I'm running a two-node Pacemaker cluster on Ubuntu Saucy (13.10),
>>kernel 3.11.0-17-generic and the Ubuntu Pacemaker package, version
>>1.1.10+git20130802-1ubuntu1.
>
>The problem is that I have no way of knowing what code is/isn't included
>in '1.1.10+git20130802-1ubuntu1'.
>You could try setting the following in your environment before starting
>pacemaker, though:
>
># Variables for running child daemons under valgrind and/or checking for
>memory problems
>G_SLICE=always-malloc
>MALLOC_PERTURB_=221 # or 0
>MALLOC_CHECK_=3     # or 0,1,2
>PCMK_valgrind_enabled=lrmd
>VALGRIND_OPTS="--leak-check=full --trace-children=no --num-callers=25
>--log-file=/var/lib/pacemaker/valgrind-%p
>--suppressions=/usr/share/pacemaker/tests/valgrind-pcmk.suppressions
>--gen-suppressions=all"
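>
>(One convenient place to set these on Debian/Ubuntu is the environment file
>sourced by the init script, commonly /etc/default/pacemaker; that path is an
>assumption here, so check what your package's init script actually sources.
>A minimal sketch:)
>
># /etc/default/pacemaker  (assumed location)
>export G_SLICE=always-malloc
>export MALLOC_PERTURB_=221     # or 0
>export MALLOC_CHECK_=3         # or 0,1,2
>export PCMK_valgrind_enabled=lrmd
>export VALGRIND_OPTS="--leak-check=full --trace-children=no \
>  --num-callers=25 --log-file=/var/lib/pacemaker/valgrind-%p \
>  --suppressions=/usr/share/pacemaker/tests/valgrind-pcmk.suppressions \
>  --gen-suppressions=all"
>
># Restart the cluster stack on that node afterwards so lrmd is re-spawned
># under valgrind; expect it to run noticeably slower while instrumented.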
>
>
>> The cluster is configured with a DRBD master/slave set and then a
>>failover resource group containing MySQL (along with its DRBD
>>filesystem) and a Zabbix Proxy and Agent.
>> 
>> Since I built the cluster around two months ago I've noticed that on
>>the active node the memory footprint of lrmd gradually grows to
>>quite a significant size. The cluster was last restarted three weeks
>>ago, and now lrmd has over 1GB of mapped memory on the active node and
>>only 151MB on the passive node. Current excerpts from /proc/PID/status
>>are:
>> 
>> Active node
>> VmPeak:  1146740 kB
>> VmSize:  1146740 kB
>> VmLck:         0 kB
>> VmPin:         0 kB
>> VmHWM:    267680 kB
>> VmRSS:    188764 kB
>> VmData:  1065860 kB
>> VmStk:       136 kB
>> VmExe:        32 kB
>> VmLib:     10416 kB
>> VmPTE:      2164 kB
>> VmSwap:   822752 kB
>> 
>> Passive node
>> VmPeak:   220832 kB
>> VmSize:   155428 kB
>> VmLck:         0 kB
>> VmPin:         0 kB
>> VmHWM:      4568 kB
>> VmRSS:      3880 kB
>> VmData:    74548 kB
>> VmStk:       136 kB
>> VmExe:        32 kB
>> VmLib:     10416 kB
>> VmPTE:       172 kB
>> VmSwap:        0 kB
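>> 
>> (For reference, the excerpts above can be pulled on each node with a
>> one-liner along these lines, assuming a single lrmd process per node:)
>> 
>> grep '^Vm' /proc/$(pidof lrmd)/status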
>> 
>> During the last week or so I've taken a couple of snapshots of
>>/proc/PID/smaps on the active node, and the heap particularly stands out
>>as growing: (I have the full outputs captured if they'll help)
>> 
>> 20140422
>> 7f92e1578000-7f92f218b000 rw-p 00000000 00:00 0
>> [heap]
>> Size:             274508 kB
>> Rss:              180152 kB
>> Pss:              180152 kB
>> Shared_Clean:          0 kB
>> Shared_Dirty:          0 kB
>> Private_Clean:         0 kB
>> Private_Dirty:    180152 kB
>> Referenced:       120472 kB
>> Anonymous:        180152 kB
>> AnonHugePages:         0 kB
>> Swap:              91568 kB
>> KernelPageSize:        4 kB
>> MMUPageSize:           4 kB
>> Locked:                0 kB
>> VmFlags: rd wr mr mw me ac
>> 
>> 
>> 20140423
>> 7f92e1578000-7f92f305e000 rw-p 00000000 00:00 0
>> [heap]
>> Size:             289688 kB
>> Rss:              184136 kB
>> Pss:              184136 kB
>> Shared_Clean:          0 kB
>> Shared_Dirty:          0 kB
>> Private_Clean:         0 kB
>> Private_Dirty:    184136 kB
>> Referenced:        69748 kB
>> Anonymous:        184136 kB
>> AnonHugePages:         0 kB
>> Swap:             103112 kB
>> KernelPageSize:        4 kB
>> MMUPageSize:           4 kB
>> Locked:                0 kB
>> VmFlags: rd wr mr mw me ac
>> 
>> 20140430
>> 7f92e1578000-7f92fc01d000 rw-p 00000000 00:00 0
>> [heap]
>> Size:             436884 kB
>> Rss:              140812 kB
>> Pss:              140812 kB
>> Shared_Clean:          0 kB
>> Shared_Dirty:          0 kB
>> Private_Clean:       744 kB
>> Private_Dirty:    140068 kB
>> Referenced:        43600 kB
>> Anonymous:        140812 kB
>> AnonHugePages:         0 kB
>> Swap:             287392 kB
>> KernelPageSize:        4 kB
>> MMUPageSize:           4 kB
>> Locked:                0 kB
>> VmFlags: rd wr mr mw me ac
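>> 
>> (For completeness, a simple way to keep collecting these heap snapshots,
>> assuming lrmd's PID stays the same between samples and using an arbitrary
>> log path, is something along these lines:)
>> 
>> # Append a timestamped copy of lrmd's [heap] mapping from smaps
>> PID=$(pidof lrmd)
>> { date; awk '/\[heap\]/{f=1} f && /^VmFlags/{print; exit} f' \
>>     /proc/$PID/smaps; } >> /var/tmp/lrmd-heap.log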
>> 
>> I noticed in the release notes for 1.1.10-rc1
>>(https://github.com/ClusterLabs/pacemaker/releases/tag/Pacemaker-1.1.10-rc1)
>>that there was work done to fix "crmd: lrmd: stonithd: fixed memory
>>leaks" but I'm not sure which particular bug this was related to. (And
>>those fixes should be in the version I'm running anyway).
>> 
>> I've also spotted a few memory leak fixes in
>>https://github.com/beekhof/pacemaker, but I'm not sure whether they
>>relate to my issue (assuming I have a memory leak and this isn't
>>expected behaviour).
>> 
>> Is there additional debugging that I can perform to check whether I
>>have a leak, or is there enough evidence to justify upgrading to 1.1.11?
>> 
>> Thanks in advance
>> 
>> Greg Murphy
>> _______________________________________________
>> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>> 
>> Project Home: http://www.clusterlabs.org
>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>> Bugs: http://bugs.clusterlabs.org
>

-------------- next part --------------
A non-text attachment was scrubbed...
Name: lrmd.tgz
Type: application/octet-stream
Size: 11286 bytes
Desc: lrmd.tgz
URL: <https://lists.clusterlabs.org/pipermail/pacemaker/attachments/20140506/c5e0f089/attachment-0003.obj>

