[Pacemaker] socket is incremented after running crm shell

David Vossel dvossel at redhat.com
Wed Apr 4 15:05:06 EDT 2012


----- Original Message -----
> From: "Junko IKEDA" <tsukishima.ha at gmail.com>
> To: "The Pacemaker cluster resource manager" <pacemaker at oss.clusterlabs.org>
> Sent: Tuesday, April 3, 2012 9:54:42 PM
> Subject: Re: [Pacemaker] socket is incremented after running crm shell
> 
> Hi,
> 
> Here is what I found.
> When "crm configure" or "cibadmin" is called,
> it seems that crmd tries to restart the pengine process.
> 
> Apr  2 14:10:01 bl460g6b crmd: [7186]: info: start_subsystem: Starting sub-system "pengine"
> Apr  2 14:10:01 bl460g6b crmd: [7186]: WARN: start_subsystem: Client pengine already running as pid 7190
> Apr  2 14:10:05 bl460g6b crmd: [7186]: info: do_dc_takeover: Taking over DC status for this partition
> 
> The process is already running, so the pengine restart is canceled,
> but a new IPC channel is still added.
> That is why the number of open file descriptors also increases.
> Is that correct?

Your patch isn't wrong, but I believe the patch below is a bit simpler.  There is a flag, R_PE_CONNECTED in fsa_input_register, that we can check to see whether we are already connected to the PE.

diff --git crmd/pengine.c crmd/pengine.c
index 989601b..ae60a59 100644
--- crmd/pengine.c
+++ crmd/pengine.c
@@ -181,7 +181,7 @@ do_pe_control(long long action,
         }
     }
 
-    if (action & start_actions) {
+    if ((action & start_actions) && (is_set(fsa_input_register, R_PE_CONNECTED) == FALSE)) {
         if (cur_state != S_STOPPING) {
             if (is_openais_cluster()) {
                 set_bit_inplace(fsa_input_register, pe_subsystem->flag_required);
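
To illustrate why the extra check stops the descriptor growth, here is a small standalone sketch. It is not Pacemaker code: struct demo_crmd, DEMO_R_PE_CONNECTED and demo_start_pe() are made-up names, and socketpair() merely stands in for the real IPC channel setup. The point is only that a "connected" bit, tested before the start path runs, keeps repeated start requests from opening a second channel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical stand-in for the R_PE_CONNECTED bit; not the real value. */
#define DEMO_R_PE_CONNECTED 0x1ULL

/* Hypothetical, much simplified model of the crmd state involved. */
struct demo_crmd {
    unsigned long long input_register; /* bit flags, like fsa_input_register */
    int channels_opened;               /* how many IPC channels were created */
};

void demo_init(struct demo_crmd *c)
{
    assert(c != NULL);
    c->input_register = 0;
    c->channels_opened = 0;
}

/*
 * Handle a "start the PE" request.  With guard == true the request is
 * ignored when the connected bit is already set (the idea of the patch).
 * With guard == false every request opens another channel, and the old
 * descriptors are deliberately left open to mimic the reported leak.
 * Returns 0 on success, -1 if the channel could not be created.
 */
int demo_start_pe(struct demo_crmd *c, bool guard)
{
    int sv[2];

    assert(c != NULL);
    if (guard && (c->input_register & DEMO_R_PE_CONNECTED)) {
        return 0; /* already connected: nothing to do */
    }
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0) {
        return -1;
    }
    c->input_register |= DEMO_R_PE_CONNECTED;
    c->channels_opened++;
    return 0;
}
```

Calling demo_start_pe(&c, true) twice on a freshly initialised struct leaves channels_opened at 1, while a further call with guard set to false raises it to 2, which is the behaviour you observed as a growing socket count.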


> Please see the attached.
> 
> By the way, during the status check of pengine, crmd calls sleep(4)

That is strange.  I have no idea what the reasoning behind that is.  It only occurs when using the heartbeat stack, though.  It looks like an attempt to let the child process launched in start_subsystem() initialize something before the parent process proceeds.  That kind of logic is never a good idea: a fixed sleep is either too short to guarantee the child is ready or longer than necessary.
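
For comparison, a common alternative to sleeping is to have the child tell the parent when it is ready. The sketch below is only an illustration under my own assumptions, not what crmd does and not a proposed patch: demo_spawn_and_wait_ready() is a hypothetical helper, and the child here exits right after signalling, where a real daemon would carry on serving requests.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a child and block until it reports readiness over a pipe.
 * child_init may be NULL; if given, it runs in the child before the
 * readiness byte is written.  Returns the child's pid on success, or -1
 * if the fork failed or the child went away without reporting readiness.
 */
pid_t demo_spawn_and_wait_ready(void (*child_init)(void))
{
    int ready_pipe[2];
    pid_t pid;
    char byte = 0;
    ssize_t n;

    if (pipe(ready_pipe) != 0) {
        return -1;
    }

    pid = fork();
    if (pid < 0) {
        close(ready_pipe[0]);
        close(ready_pipe[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: finish initialisation, then signal the parent. */
        char ok = 'R';
        ssize_t w;

        close(ready_pipe[0]);
        if (child_init != NULL) {
            child_init();
        }
        w = write(ready_pipe[1], &ok, 1);
        close(ready_pipe[1]);
        _exit(w == 1 ? 0 : 1); /* demo only: a real child keeps running */
    }

    /* Parent: wait exactly as long as the child needs, no fixed delay. */
    close(ready_pipe[1]);
    do {
        n = read(ready_pipe[0], &byte, 1);
    } while (n < 0 && errno == EINTR);
    close(ready_pipe[0]);

    if (n != 1 || byte != 'R') {
        /* EOF or error: the child exited before it became ready. */
        waitpid(pid, NULL, 0);
        return -1;
    }
    return pid;
}
```

The parent returns as soon as the byte arrives, and a child that dies early shows up as end-of-file on the pipe instead of being silently assumed ready after four seconds. The caller is still responsible for reaping the child with waitpid().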


-- Vossel

> in do_pe_control().
> I don't think it is reasonable to do this check on every "crm configure"
> or "cibadmin" call.
> It will delay the transition.
> 
> Thanks,
> Junko
> 
> _______________________________________________
> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> 
> Project Home: http://www.clusterlabs.org
> Getting started:
> http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
> 



