[Pacemaker] lrmd Memory Usage
Greg Murphy
greg.murphy at gamesparks.com
Tue May 6 09:47:14 UTC 2014
Here you go - I’ve only run lrmd for 30 minutes since installing the debug
package, but hopefully that’s enough - if not, let me know and I’ll do a
longer capture.
On 06/05/2014 10:08, "Andrew Beekhof" <andrew at beekhof.net> wrote:
>Oh, and any chance you could install the debug packages? It will make the
>output even more useful :-)
>
>On 6 May 2014, at 7:06 pm, Andrew Beekhof <andrew at beekhof.net> wrote:
>
>>
>> On 6 May 2014, at 6:05 pm, Greg Murphy <greg.murphy at gamesparks.com> wrote:
>>
>>> Attached are the valgrind outputs from two separate runs of lrmd with the
>>> suggested variables set. Do they help narrow the issue down?
>>
>> They do somewhat. I'll investigate. But much of the memory is still reachable:
>>
>> ==26203== indirectly lost: 17,945,950 bytes in 642,546 blocks
>> ==26203== possibly lost: 2,805 bytes in 60 blocks
>> ==26203== still reachable: 26,104,781 bytes in 544,782 blocks
>> ==26203== suppressed: 8,652 bytes in 176 blocks
>> ==26203== Reachable blocks (those to which a pointer was found) are not shown.
>> ==26203== To see them, rerun with: --leak-check=full --show-reachable=yes
>>
>> Could you add --show-reachable=yes to the VALGRIND_OPTS variable?
>>
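When comparing several valgrind runs, it can help to tally the leak summary lines rather than read them by eye. The following is only a minimal sketch in Python, assuming the summary format quoted above; `parse_leak_summary` is a made-up helper name, not part of valgrind or Pacemaker.

```python
import re

# Matches summary lines such as
#   ==26203== still reachable: 26,104,781 bytes in 544,782 blocks
# Any other "<label>: N bytes in M blocks" line (for example "in use at exit")
# is picked up as well, keyed by its label.
SUMMARY_RE = re.compile(
    r"==\d+==\s+(?P<kind>[a-z ]+?):\s+(?P<bytes>[\d,]+) bytes in (?P<blocks>[\d,]+) blocks"
)


def parse_leak_summary(text):
    """Return {category: (bytes, blocks)} from valgrind log text."""
    result = {}
    for line in text.splitlines():
        match = SUMMARY_RE.search(line)
        if match:
            nbytes = int(match.group("bytes").replace(",", ""))
            nblocks = int(match.group("blocks").replace(",", ""))
            result[match.group("kind")] = (nbytes, nblocks)
    return result


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python leak_summary.py valgrind-26203
    with open(sys.argv[1]) as log:
        for kind, (nbytes, nblocks) in parse_leak_summary(log.read()).items():
            print(f"{kind:>16}: {nbytes:>12} bytes in {nblocks} blocks")
```

Because the pattern is searched for anywhere in the line, it also works on quoted text such as the excerpt in this message.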
>>>
>>>
>>> Thanks
>>>
>>> Greg
>>>
>>>
>>> On 02/05/2014 03:01, "Andrew Beekhof" <andrew at beekhof.net> wrote:
>>>
>>>>
>>>> On 30 Apr 2014, at 9:01 pm, Greg Murphy <greg.murphy at gamesparks.com>
>>>> wrote:
>>>>
>>>>> Hi
>>>>>
>>>>> I'm running a two-node Pacemaker cluster on Ubuntu Saucy (13.10),
>>>>> kernel 3.11.0-17-generic and the Ubuntu Pacemaker package, version
>>>>> 1.1.10+git20130802-1ubuntu1.
>>>>
>>>> The problem is that I have no way of knowing what code is/isn't included
>>>> in '1.1.10+git20130802-1ubuntu1'.
>>>> You could try setting the following in your environment before starting
>>>> pacemaker though
>>>>
>>>> # Variables for running child daemons under valgrind and/or checking for
>>>> # memory problems
>>>> G_SLICE=always-malloc
>>>> MALLOC_PERTURB_=221 # or 0
>>>> MALLOC_CHECK_=3 # or 0,1,2
>>>> PCMK_valgrind_enabled=lrmd
>>>> VALGRIND_OPTS="--leak-check=full --trace-children=no --num-callers=25
>>>> --log-file=/var/lib/pacemaker/valgrind-%p
>>>> --suppressions=/usr/share/pacemaker/tests/valgrind-pcmk.suppressions
>>>> --gen-suppressions=all"
>>>>
>>>>
>>>>> The cluster is configured with a DRBD master/slave set and then a
>>>>> failover resource group containing MySQL (along with its DRBD
>>>>> filesystem) and a Zabbix Proxy and Agent.
>>>>>
>>>>> Since I built the cluster around two months ago I've noticed that on
>>>>> the active node the memory footprint of lrmd gradually grows to
>>>>> quite a significant size. The cluster was last restarted three weeks
>>>>> ago, and now lrmd has over 1GB of mapped memory on the active node and
>>>>> only 151MB on the passive node. Current excerpts from /proc/PID/status
>>>>> are:
>>>>>
>>>>> Active node
>>>>> VmPeak:  1146740 kB
>>>>> VmSize:  1146740 kB
>>>>> VmLck:         0 kB
>>>>> VmPin:         0 kB
>>>>> VmHWM:    267680 kB
>>>>> VmRSS:    188764 kB
>>>>> VmData:  1065860 kB
>>>>> VmStk:       136 kB
>>>>> VmExe:        32 kB
>>>>> VmLib:     10416 kB
>>>>> VmPTE:      2164 kB
>>>>> VmSwap:   822752 kB
>>>>>
>>>>> Passive node
>>>>> VmPeak:   220832 kB
>>>>> VmSize:   155428 kB
>>>>> VmLck:         0 kB
>>>>> VmPin:         0 kB
>>>>> VmHWM:      4568 kB
>>>>> VmRSS:      3880 kB
>>>>> VmData:    74548 kB
>>>>> VmStk:       136 kB
>>>>> VmExe:        32 kB
>>>>> VmLib:     10416 kB
>>>>> VmPTE:       172 kB
>>>>> VmSwap:        0 kB
>>>>>
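To track these figures over time instead of copying them by hand, the Vm* lines can be read from /proc/PID/status with a short script. This is a rough sketch only, assuming a Linux /proc layout like the excerpts above; the helper names `parse_vm_fields` and `read_vm_fields` are invented for illustration.

```python
def parse_vm_fields(status_text):
    """Return {field: kB} for the Vm* lines of /proc/PID/status text."""
    fields = {}
    for line in status_text.splitlines():
        if not line.startswith("Vm"):
            continue
        name, _, value = line.partition(":")
        parts = value.split()
        # Vm* lines are expected to look like "VmRSS:   188764 kB".
        if len(parts) == 2 and parts[1] == "kB" and parts[0].isdigit():
            fields[name] = int(parts[0])
    return fields


def read_vm_fields(pid):
    """Read and parse /proc/<pid>/status for a running process."""
    with open(f"/proc/{pid}/status") as status:
        return parse_vm_fields(status.read())


if __name__ == "__main__":
    import sys

    # Usage: python vm_fields.py <pid of lrmd>
    vm = read_vm_fields(int(sys.argv[1]))
    # Pages that are resident plus pages swapped out, in kB.
    print("VmRSS + VmSwap:", vm.get("VmRSS", 0) + vm.get("VmSwap", 0), "kB")
```

With the active-node numbers above, VmRSS + VmSwap is 188764 + 822752 = 1011516 kB, so most of the mapped memory has been touched and then swapped out rather than merely reserved.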
>>>>> During the last week or so I've taken a couple of snapshots of
>>>>> /proc/PID/smaps on the active node, and the heap particularly stands out
>>>>> as growing: (I have the full outputs captured if they'll help)
>>>>>
>>>>> 20140422
>>>>> 7f92e1578000-7f92f218b000 rw-p 00000000 00:00 0 [heap]
>>>>> Size: 274508 kB
>>>>> Rss: 180152 kB
>>>>> Pss: 180152 kB
>>>>> Shared_Clean: 0 kB
>>>>> Shared_Dirty: 0 kB
>>>>> Private_Clean: 0 kB
>>>>> Private_Dirty: 180152 kB
>>>>> Referenced: 120472 kB
>>>>> Anonymous: 180152 kB
>>>>> AnonHugePages: 0 kB
>>>>> Swap: 91568 kB
>>>>> KernelPageSize: 4 kB
>>>>> MMUPageSize: 4 kB
>>>>> Locked: 0 kB
>>>>> VmFlags: rd wr mr mw me ac
>>>>>
>>>>>
>>>>> 20140423
>>>>> 7f92e1578000-7f92f305e000 rw-p 00000000 00:00 0 [heap]
>>>>> Size: 289688 kB
>>>>> Rss: 184136 kB
>>>>> Pss: 184136 kB
>>>>> Shared_Clean: 0 kB
>>>>> Shared_Dirty: 0 kB
>>>>> Private_Clean: 0 kB
>>>>> Private_Dirty: 184136 kB
>>>>> Referenced: 69748 kB
>>>>> Anonymous: 184136 kB
>>>>> AnonHugePages: 0 kB
>>>>> Swap: 103112 kB
>>>>> KernelPageSize: 4 kB
>>>>> MMUPageSize: 4 kB
>>>>> Locked: 0 kB
>>>>> VmFlags: rd wr mr mw me ac
>>>>>
>>>>> 20140430
>>>>> 7f92e1578000-7f92fc01d000 rw-p 00000000 00:00 0 [heap]
>>>>> Size: 436884 kB
>>>>> Rss: 140812 kB
>>>>> Pss: 140812 kB
>>>>> Shared_Clean: 0 kB
>>>>> Shared_Dirty: 0 kB
>>>>> Private_Clean: 744 kB
>>>>> Private_Dirty: 140068 kB
>>>>> Referenced: 43600 kB
>>>>> Anonymous: 140812 kB
>>>>> AnonHugePages: 0 kB
>>>>> Swap: 287392 kB
>>>>> KernelPageSize: 4 kB
>>>>> MMUPageSize: 4 kB
>>>>> Locked: 0 kB
>>>>> VmFlags: rd wr mr mw me ac
>>>>>
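The heap growth between snapshots like these can be quantified with a small script. The sketch below is illustrative only: it assumes the smaps layout shown above (a mapping header line followed by "Name: value kB" lines), and `heap_fields` and `growth_per_day` are hypothetical helper names.

```python
from datetime import date


def heap_fields(smaps_text):
    """Return {field: kB} for the [heap] mapping in /proc/PID/smaps text."""
    fields = {}
    in_heap = False
    for line in smaps_text.splitlines():
        tokens = line.split()
        if not tokens:
            continue
        # A mapping header starts with an address range such as
        # "7f92e1578000-7f92f218b000"; field lines start with "Name:".
        if "-" in tokens[0] and not tokens[0].endswith(":"):
            in_heap = tokens[-1] == "[heap]"
            continue
        # Keep only "Name: <number> kB" lines; this skips VmFlags.
        if in_heap and len(tokens) == 3 and tokens[2] == "kB":
            fields[tokens[0].rstrip(":")] = int(tokens[1])
    return fields


def growth_per_day(first, last):
    """Average growth in kB/day between two (date, kB) samples."""
    (day0, kb0), (day1, kb1) = first, last
    return (kb1 - kb0) / (day1 - day0).days


if __name__ == "__main__":
    # Heap "Size" values taken from the 20140422 and 20140430 snapshots.
    rate = growth_per_day((date(2014, 4, 22), 274508), (date(2014, 4, 30), 436884))
    print(f"heap grew by about {rate:.0f} kB per day")
```

For the snapshots quoted here, the heap Size went from 274508 kB to 436884 kB in eight days, which is roughly 20 MB per day. Rss alone would understate this, since a growing share of the heap is in Swap.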
>>>>> I noticed in the release notes for 1.1.10-rc1
>>>>> (https://github.com/ClusterLabs/pacemaker/releases/tag/Pacemaker-1.1.10-rc1)
>>>>> that there was work done to fix "crmd: lrmd: stonithd: fixed memory
>>>>> leaks" but I'm not sure which particular bug this was related to. (And
>>>>> those fixes should be in the version I'm running anyway).
>>>>>
>>>>> I've also spotted a few memory leak fixes in
>>>>> https://github.com/beekhof/pacemaker, but I'm not sure whether they
>>>>> relate to my issue (assuming I have a memory leak and this isn't
>>>>> expected behaviour).
>>>>>
>>>>> Is there additional debugging that I can perform to check whether I
>>>>> have a leak, or is there enough evidence to justify upgrading to 1.1.11?
>>>>>
>>>>> Thanks in advance
>>>>>
>>>>> Greg Murphy
>>>>> _______________________________________________
>>>>> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
>>>>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>>>>
>>>>> Project Home: http://www.clusterlabs.org
>>>>> Getting started:
>>>>>http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>>>>> Bugs: http://bugs.clusterlabs.org
>>>>
>>>
>>> <lrmd.tgz>
>>
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: lrmd-dbg.tgz
Type: application/octet-stream
Size: 61898 bytes
Desc: lrmd-dbg.tgz
URL: <https://lists.clusterlabs.org/pipermail/pacemaker/attachments/20140506/1271c01d/attachment-0004.obj>