[Pacemaker] socket is incremented after running crm shell
David Vossel
dvossel at redhat.com
Tue Apr 3 17:53:57 CEST 2012
----- Original Message -----
> From: "Junko IKEDA" <tsukishima.ha at gmail.com>
> To: "The Pacemaker cluster resource manager" <pacemaker at oss.clusterlabs.org>
> Sent: Friday, March 30, 2012 1:07:37 AM
> Subject: [Pacemaker] socket is incremented after running crm shell
>
> Hi,
>
> I encountered the following error message while running the crm
> shell.
>
> crmd: [6837]: ERROR: socket_accept_connection: accept(sock=10): Too
> many open files
>
> The same error is discussed here:
> https://developerbugs.linuxfoundation.org/show_bug.cgi?id=2626
> but that one seems to be specific to Ubuntu and upstart,
> so my case is probably a different issue.
>
> I can reproduce a similar case as follows.
> In this case, "bl460g6b" is the DC,
> and the socket count increases by one, on the DC only, after running the crm shell.
>
>
> * Initial status
>
> # date; hostname; lsof -p $(pgrep crmd) | wc -l
> Fri Mar 30 14:26:04 JST 2012
> bl460g6a
> 42
>
> # date; hostname; lsof -p $(pgrep crmd) | wc -l
> Fri Mar 30 14:26:16 JST 2012
> bl460g6b
> 46
>
>
>
> * Load the resource configuration (it just starts one Dummy RA)
>
> # crm configure load update cib.crm
> # crm_mon -1
>
> ============
> Last updated: Fri Mar 30 14:26:52 2012
> Stack: Heartbeat
> Current DC: bl460g6b (22222222-2222-2222-2222-222222222222) -
> partition with quorum
> Version: 1.0.12-unknown
> 2 Nodes configured, unknown expected votes
> 1 Resources configured.
> ============
>
> Online: [ bl460g6a bl460g6b ]
>
> dummy01 (ocf::pacemaker:Dummy): Started bl460g6a
>
> # date; hostname; lsof -p $(pgrep crmd) | wc -l
> Fri Mar 30 14:27:10 JST 2012
> bl460g6a
> 42
>
> # date; hostname; lsof -p $(pgrep crmd) | wc -l
> Fri Mar 30 14:27:16 JST 2012
> bl460g6b
> 47 <==== +1
I see the same thing. I'm using the latest Pacemaker source from the master branch, so the problem definitely still exists. For me, the file descriptor leak occurs every time I issue a "cibadmin --replace --xml-file" command. The shell runs the same command internally when adding and removing resources, so I see it there as well.
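For anyone who wants to check this on their own cluster, here is a rough sketch of how the growth could be measured around a single command. It assumes Linux with /proc available; the helper names count_fds and fd_delta are made up for illustration and are not part of Pacemaker. Counts taken this way will not match the "lsof | wc -l" numbers above exactly, since lsof also lists non-descriptor entries, but the difference before and after should show the same +1.

```shell
# Count the open file descriptors of a process by listing /proc/<pid>/fd.
# Assumption: Linux with /proc mounted; prints 0 if the PID cannot be read.
count_fds() {
    ls "/proc/$1/fd" 2>/dev/null | wc -l
}

# Run a command and print how many descriptors the given PID gained
# (positive) or lost (negative) while the command ran.
# Usage: fd_delta PID COMMAND [ARGS...]
fd_delta() {
    pid=$1
    shift
    before=$(count_fds "$pid")
    "$@" >/dev/null 2>&1
    after=$(count_fds "$pid")
    echo $((after - before))
}
```

On the DC this might be called as: fd_delta "$(pgrep crmd)" cibadmin --replace --xml-file cib.xml (cib.xml is a placeholder for your own CIB file). The update may be processed asynchronously, so it may be worth sleeping for a second before trusting the result, and repeating the call a few times to confirm that the count keeps growing.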
I opened a bug report for this.
http://bugs.clusterlabs.org/show_bug.cgi?id=5051
I'll keep investigating it.
-- Vossel