[Pacemaker] Memory leak in pengine on the DC when the disk becomes full

Yuusuke IIDA iidayuus at intellilink.co.jp
Mon Sep 19 22:31:34 EDT 2011


Hi, Andrew

When disk utilization on the DC node reached 100%, I found that pengine
was consuming a large amount of memory.

This memory consumption seems to occur when pengine fails to write the
pe-input file.
When the write fails, the following log message is output:
Sep  1 14:15:50 sby2 pengine: [3156]: ERROR: write_xml_file: bzWriteClose() failed: -6
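
For reference, -6 is the value of BZ_IO_ERROR in bzlib.h, which would fit a
write failing on a full disk. A minimal sketch for decoding such codes
follows. bz_error_name is a hypothetical helper written for this note; it is
not part of libbz2 or Pacemaker, and the table is copied from the constants
in bzlib.h rather than read from the library at run time.

```c
#include <stddef.h>

/* Return codes as defined in bzlib.h (copied here so the sketch
 * builds without the libbz2 headers). */
struct bz_code {
    int value;
    const char *name;
};

static const struct bz_code bz_codes[] = {
    {  0, "BZ_OK" },
    {  1, "BZ_RUN_OK" },
    {  2, "BZ_FLUSH_OK" },
    {  3, "BZ_FINISH_OK" },
    {  4, "BZ_STREAM_END" },
    { -1, "BZ_SEQUENCE_ERROR" },
    { -2, "BZ_PARAM_ERROR" },
    { -3, "BZ_MEM_ERROR" },
    { -4, "BZ_DATA_ERROR" },
    { -5, "BZ_DATA_ERROR_MAGIC" },
    { -6, "BZ_IO_ERROR" },
    { -7, "BZ_UNEXPECTED_EOF" },
    { -8, "BZ_OUTBUFF_FULL" },
    { -9, "BZ_CONFIG_ERROR" },
};

/* Hypothetical helper: map a libbz2 return code to its symbolic name.
 * Returns "unknown" for values that are not in the table. */
const char *bz_error_name(int code)
{
    size_t i;

    for (i = 0; i < sizeof(bz_codes) / sizeof(bz_codes[0]); i++) {
        if (bz_codes[i].value == code) {
            return bz_codes[i].name;
        }
    }
    return "unknown";
}
```

With this table, bz_error_name(-6) yields "BZ_IO_ERROR".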

According to valgrind, memory allocated inside libbz2 (via BZ2_bzWriteOpen)
does not appear to be released.

(1) ==1606== 4,384,028 (10,208 direct, 4,373,820 indirect) bytes in 2 blocks are definitely lost in loss record 104 of 109
(1) ==1606==    at 0x4A05FDE: malloc (vg_replace_malloc.c:236)
(1) ==1606==    by 0x37E960B972: BZ2_bzWriteOpen (in /lib64/libbz2.so.1.0.4)
(1) ==1606==    by 0x4E584C9: write_xml_file (xml.c:744)
(1) ==1606==    by 0x52A7818: process_pe_message (pengine.c:191)
(1) ==1606==    by 0x4012DF: pe_msg_callback (main.c:60)
(1) ==1606==    by 0x59127A9: G_CH_dispatch_int (in /usr/lib64/libplumb.so.2.1.0)
(1) ==1606==    by 0x37D5E38F0D: g_main_context_dispatch (in /lib64/libglib-2.0.so.0.2200.5)
(1) ==1606==    by 0x37D5E3C937: ??? (in /lib64/libglib-2.0.so.0.2200.5)
(1) ==1606==    by 0x37D5E3CD54: g_main_loop_run (in /lib64/libglib-2.0.so.0.2200.5)
(1) ==1606==    by 0x401929: main (main.c:177)
[snip]
(1) ==1606== 22,322,828 (446,144 direct, 21,876,684 indirect) bytes in 8 blocks are definitely lost in loss record 109 of 109
(1) ==1606==    at 0x4A05FDE: malloc (vg_replace_malloc.c:236)
(1) ==1606==    by 0x37E960AF4A: BZ2_bzCompressInit (in /lib64/libbz2.so.1.0.4)
(1) ==1606==    by 0x37E960B9F1: BZ2_bzWriteOpen (in /lib64/libbz2.so.1.0.4)
(1) ==1606==    by 0x4E584C9: write_xml_file (xml.c:744)
(1) ==1606==    by 0x52A7818: process_pe_message (pengine.c:191)
(1) ==1606==    by 0x4012DF: pe_msg_callback (main.c:60)
(1) ==1606==    by 0x59127A9: G_CH_dispatch_int (in /usr/lib64/libplumb.so.2.1.0)
(1) ==1606==    by 0x37D5E38F0D: g_main_context_dispatch (in /lib64/libglib-2.0.so.0.2200.5)
(1) ==1606==    by 0x37D5E3C937: ??? (in /lib64/libglib-2.0.so.0.2200.5)
(1) ==1606==    by 0x37D5E3CD54: g_main_loop_run (in /lib64/libglib-2.0.so.0.2200.5)
(1) ==1606==    by 0x401929: main (main.c:177)
[snip]
(1) ==1606== LEAK SUMMARY:
(1) ==1606==    definitely lost: 456,352 bytes in 10 blocks
(1) ==1606==    indirectly lost: 26,250,504 bytes in 30 blocks
(1) ==1606==      possibly lost: 20,977,780 bytes in 215 blocks
(1) ==1606==    still reachable: 7,237 bytes in 43 blocks
(1) ==1606==         suppressed: 0 bytes in 0 blocks
(1) ==1606== Reachable blocks (those to which a pointer was found) are not shown.
(1) ==1606== To see them, rerun with: --leak-check=full --show-reachable=yes
(1) ==1606==
(1) ==1606== For counts of detected and suppressed errors, rerun with: -v
(1) ==1606== Use --track-origins=yes to see where uninitialised values come from
(1) ==1606== ERROR SUMMARY: 134454 errors from 95 contexts (suppressed: 6 from 6)

Best Regards,
Yuusuke
-- 
----------------------------------------
METRO SYSTEMS CO., LTD

Yuusuke Iida
Mail: iidayuus at intellilink.co.jp
----------------------------------------



