[Pacemaker] lrmd Memory Usage

Andrew Beekhof andrew at beekhof.net
Thu May 1 22:01:03 EDT 2014


On 30 Apr 2014, at 9:01 pm, Greg Murphy <greg.murphy at gamesparks.com> wrote:

> Hi
> 
> I’m running a two-node Pacemaker cluster on Ubuntu Saucy (13.10), kernel 3.11.0-17-generic and the Ubuntu Pacemaker package, version 1.1.10+git20130802-1ubuntu1.

The problem is that I have no way of knowing what code is/isn't included in '1.1.10+git20130802-1ubuntu1'.
You could, however, try setting the following in your environment before starting Pacemaker:

# Variables for running child daemons under valgrind and/or checking for memory problems
G_SLICE=always-malloc        # make GLib bypass its slice allocator so valgrind can track every allocation
MALLOC_PERTURB_=221          # or 0 to disable; glibc fills allocated/freed memory with a known byte to expose stale reads
MALLOC_CHECK_=3              # or 0,1,2; 3 = print a diagnostic and abort on heap corruption
PCMK_valgrind_enabled=lrmd   # run only the lrmd child daemon under valgrind
VALGRIND_OPTS="--leak-check=full --trace-children=no --num-callers=25 --log-file=/var/lib/pacemaker/valgrind-%p --suppressions=/usr/share/pacemaker/tests/valgrind-pcmk.suppressions --gen-suppressions=all"
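
A minimal sketch of how that could look on your Ubuntu install (it assumes the Debian packaging sources /etc/default/pacemaker at startup; that path is an assumption, so verify it against your init script):

    # /etc/default/pacemaker (hypothetical location; check your package)
    export G_SLICE=always-malloc
    export MALLOC_PERTURB_=221
    export MALLOC_CHECK_=3
    export PCMK_valgrind_enabled=lrmd
    export VALGRIND_OPTS="--leak-check=full --trace-children=no --num-callers=25 \
        --log-file=/var/lib/pacemaker/valgrind-%p \
        --suppressions=/usr/share/pacemaker/tests/valgrind-pcmk.suppressions \
        --gen-suppressions=all"

Valgrind only writes its leak summary when the process exits, so after the cluster has run long enough to show the growth, stop pacemaker cleanly and look for "definitely lost" records in the /var/lib/pacemaker/valgrind-<pid> log.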


> The cluster is configured with a DRBD master/slave set and then a failover resource group containing MySQL (along with its DRBD filesystem) and a Zabbix Proxy and Agent.
> 
> Since I built the cluster around two months ago I’ve noticed that on the active node the memory footprint of lrmd gradually grows to quite a significant size. The cluster was last restarted three weeks ago, and now lrmd has over 1 GB of mapped memory on the active node but only 151 MB on the passive node. Current excerpts from /proc/PID/status are below (followed by a sketch of how such figures can be captured over time):
> 
> Active node
> VmPeak:  1146740 kB
> VmSize:  1146740 kB
> VmLck:         0 kB
> VmPin:         0 kB
> VmHWM:    267680 kB
> VmRSS:    188764 kB
> VmData:  1065860 kB
> VmStk:       136 kB
> VmExe:        32 kB
> VmLib:     10416 kB
> VmPTE:      2164 kB
> VmSwap:   822752 kB
> 
> Passive node
> VmPeak:   220832 kB
> VmSize:   155428 kB
> VmLck:         0 kB
> VmPin:         0 kB
> VmHWM:      4568 kB
> VmRSS:      3880 kB
> VmData:    74548 kB
> VmStk:       136 kB
> VmExe:        32 kB
> VmLib:     10416 kB
> VmPTE:       172 kB
> VmSwap:        0 kB
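> 
> Something like the following loop captures those figures hourly (a rough sketch; it assumes the daemon’s process name is simply "lrmd"):
> 
>     while sleep 3600; do
>         date
>         grep -E 'Vm(Size|RSS|Data|Swap):' "/proc/$(pidof lrmd)/status"
>     done >> /var/tmp/lrmd-mem.log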
> 
> During the last week or so I’ve taken a couple of snapshots of /proc/PID/smaps on the active node, and the heap in particular stands out as growing (I have the full outputs captured if they’ll help; a one-liner for extracting the heap size from each snapshot follows them):
> 
> 20140422
> 7f92e1578000-7f92f218b000 rw-p 00000000 00:00 0                          [heap]
> Size:             274508 kB
> Rss:              180152 kB
> Pss:              180152 kB
> Shared_Clean:          0 kB
> Shared_Dirty:          0 kB
> Private_Clean:         0 kB
> Private_Dirty:    180152 kB
> Referenced:       120472 kB
> Anonymous:        180152 kB
> AnonHugePages:         0 kB
> Swap:              91568 kB
> KernelPageSize:        4 kB
> MMUPageSize:           4 kB
> Locked:                0 kB
> VmFlags: rd wr mr mw me ac
> 
> 
> 20140423
> 7f92e1578000-7f92f305e000 rw-p 00000000 00:00 0                          [heap]
> Size:             289688 kB
> Rss:              184136 kB
> Pss:              184136 kB
> Shared_Clean:          0 kB
> Shared_Dirty:          0 kB
> Private_Clean:         0 kB
> Private_Dirty:    184136 kB
> Referenced:        69748 kB
> Anonymous:        184136 kB
> AnonHugePages:         0 kB
> Swap:             103112 kB
> KernelPageSize:        4 kB
> MMUPageSize:           4 kB
> Locked:                0 kB
> VmFlags: rd wr mr mw me ac
> 
> 20140430
> 7f92e1578000-7f92fc01d000 rw-p 00000000 00:00 0                          [heap]
> Size:             436884 kB
> Rss:              140812 kB
> Pss:              140812 kB
> Shared_Clean:          0 kB
> Shared_Dirty:          0 kB
> Private_Clean:       744 kB
> Private_Dirty:    140068 kB
> Referenced:        43600 kB
> Anonymous:        140812 kB
> AnonHugePages:         0 kB
> Swap:             287392 kB
> KernelPageSize:        4 kB
> MMUPageSize:           4 kB
> Locked:                0 kB
> VmFlags: rd wr mr mw me ac
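> 
> The heap size can be pulled out of each saved snapshot with an awk one-liner along these lines (a sketch; "smaps-20140430" is a hypothetical filename for a saved copy):
> 
>     awk '/\[heap\]/ {hit=1} hit && /^Size:/ {print $2, $3; exit}' smaps-20140430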
> 
> I noticed in the release notes for 1.1.10-rc1 (https://github.com/ClusterLabs/pacemaker/releases/tag/Pacemaker-1.1.10-rc1) that there was work done to fix “crmd: lrmd: stonithd: fixed memory leaks”, but I’m not sure which particular bug that work related to. (And those fixes should be in the version I’m running anyway.)
> 
> I’ve also spotted a few memory leak fixes in https://github.com/beekhof/pacemaker, but I’m not sure whether they relate to my issue (assuming I have a memory leak and this isn’t expected behaviour).
> 
> Is there additional debugging that I can perform to check whether I have a leak, or is there enough evidence to justify upgrading to 1.1.11?
> 
> Thanks in advance
> 
> Greg Murphy
