[Pacemaker] Pacemaker and LDAP (389 Directory Server)
veghead
sean at studyblue.com
Mon Jun 27 21:33:12 UTC 2011
Sorry for the questions. Some days my brain is just slow. :)
Serge Dubrouski <sergeyfd at ...> writes:
> If you want to make your LDAP independent of the IP, just remove your
> colocation: colocation ldap-with-eip inf: elastic_ip ldap-clone
Is that really what I want to do? I mean, I need the Elastic IP assigned to
~one~ of the machines... and if LDAP fails on that machine, I need Pacemaker to
start the Elastic IP on the other machine.
If I remove the colocation, won't the elastic_ip resource just stay where it
is, regardless of what happens to LDAP?
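For context, the constraint in question looks roughly like this in my crm
configuration (a sketch from memory, not a verbatim dump; the ordering
constraint is only something I am wondering about adding, not something I
have in place):
---snip---
# keep the Elastic IP on a node that is running an LDAP clone instance
colocation ldap-with-eip inf: elastic_ip ldap-clone

# possible addition: require LDAP to be started before the IP starts there
order ldap-before-eip inf: ldap-clone elastic_ip
---snip---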
> But I'd rather try to find out why monitoring for the IP fails. Maybe
> it just needs an increased timeout on the monitor operation, though it
> looks like you've already increased it. What's in your log files
> when that monitor fails?
Originally, I had the monitor interval on the elastic_ip resource set to 10
seconds. The error in the logs was:
---snip---
pengine: [16980]: notice: unpack_rsc_op: Operation elastic_ip_monitor_0 found resource elastic_ip active on ldap1.example.ec2
pengine: [16980]: WARN: unpack_rsc_op: Processing failed op elastic_ip_monitor_10000 on ldap1.example.ec2: unknown exec error (-2)
pengine: [16980]: WARN: unpack_rsc_op: Processing failed op elastic_ip_stop_0 on ldap1.example.ec2: unknown exec error (-2)
pengine: [16980]: info: native_add_running: resource elastic_ip isnt managed
pengine: [16980]: notice: unpack_rsc_op: Operation ldap:1_monitor_0 found resource ldap:1 active on ldap2.example.ec2
pengine: [16980]: WARN: unpack_rsc_op: Processing failed op elastic_ip_start_0 on ldap2.example.ec2: unknown exec error (-2)
pengine: [16980]: notice: native_print: elastic_ip (lsb:elastic-ip): Started ldap1.example.ec2 (unmanaged) FAILED
pengine: [16980]: notice: clone_print: Clone Set: ldap-clone
pengine: [16980]: notice: short_print: Stopped: [ ldap:0 ldap:1 ]
pengine: [16980]: info: get_failcount: elastic_ip has failed INFINITY times on ldap1.example.ec2
pengine: [16980]: WARN: common_apply_stickiness: Forcing elastic_ip away from ldap1.example.ec2 after 1000000 failures (max=1000000)
pengine: [16980]: info: get_failcount: elastic_ip has failed INFINITY times on ldap2.example.ec2
pengine: [16980]: WARN: common_apply_stickiness: Forcing elastic_ip away from ldap2.example.ec2 after 1000000 failures (max=1000000)
pengine: [16980]: info: native_color: Unmanaged resource elastic_ip allocated to 'nowhere': failed
pengine: [16980]: notice: RecurringOp: Start recurring monitor (15s) for ldap:0 on ldap1.example.ec2
pengine: [16980]: notice: RecurringOp: Start recurring monitor (15s) for ldap:1 on ldap2.example.ec2
pengine: [16980]: notice: LogActions: Leave resource elastic_ip (Started unmanaged)
pengine: [16980]: notice: LogActions: Start ldap:0 (ldap1.example.ec2)
pengine: [16980]: notice: LogActions: Start ldap:1 (ldap2.example.ec2)
---snip---
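If I am reading these right, "unknown exec error (-2)" means lrmd got a result
out of the elastic-ip script that it couldn't classify at all, so one thing I
want to rule out is the script simply not being LSB-compliant. Something like
this is what I plan to run by hand on a quiet node (the path and name are from
my setup):
---snip---
# rough LSB-compliance check for the script behind lsb:elastic-ip
/etc/init.d/elastic-ip start  ; echo "start rc:  $?"   # expect 0
/etc/init.d/elastic-ip status ; echo "status rc: $?"   # expect 0 while running
/etc/init.d/elastic-ip stop   ; echo "stop rc:   $?"   # expect 0
/etc/init.d/elastic-ip status ; echo "status rc: $?"   # expect 3 once stopped
---snip---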
Now that I have set the monitor interval for the elastic_ip resource to "0",
Pacemaker keeps thinking everything is either stopped or should be stopped:
---snip---
pengine: [7287]: notice: unpack_rsc_op: Operation elastic_ip_monitor_0 found resource elastic_ip active on ldap1.example.ec2
pengine: [7287]: notice: unpack_rsc_op: Operation ldap:0_monitor_0 found resource ldap:0 active on ldap2.example.ec2
pengine: [7287]: notice: native_print: elastic_ip (lsb:elastic-ip): Stopped
pengine: [7287]: notice: clone_print: Clone Set: ldap-clone
pengine: [7287]: notice: short_print: Stopped: [ ldap:0 ldap:1 ]
pengine: [7287]: notice: LogActions: Leave resource elastic_ip (Stopped)
pengine: [7287]: notice: LogActions: Leave resource ldap:0 (Stopped)
pengine: [7287]: notice: LogActions: Leave resource ldap:1 (Stopped)
---snip---
Very strange.
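One thing I only realised while writing this up: if I understand the docs
correctly, interval="0" doesn't mean "check constantly", it removes the
recurring monitor entirely, so only the one-off probe ever runs. Once the
script itself checks out, I would put the monitor back with a more generous
interval/timeout and clear the accumulated failcounts, roughly like this (the
timeout values are guesses, not tested):
---snip---
# intended primitive definition (sketch; timeouts are guesses)
primitive elastic_ip lsb:elastic-ip \
        op monitor interval="30s" timeout="60s" \
        op start timeout="60s" \
        op stop timeout="60s"

# clear the INFINITY failcounts so elastic_ip is allowed back on both nodes
crm resource cleanup elastic_ip
---snip---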