[Pacemaker] The active trap of the SNMP is delayed.
renayama19661014 at ybb.ne.jp
Wed Jun 15 01:29:35 UTC 2011
Hi All,
I found a problem with an SNMP trap sent by hbagent.
The trap that reports a node as active can apparently be delayed.
The problem is intermittent; it does not occur every time.
I confirmed it with the following procedure.
Step1) Start a node.
============
Last updated: Wed Jun 15 19:23:39 2011
Stack: Heartbeat
Current DC: srv02 (afe72fff-b7b4-4663-b845-872df29c635d) - partition WITHOUT quorum
Version: 1.0.11-6e010d6b0d49a6b929d17c0114e9d2d934dc8e04
2 Nodes configured, unknown expected votes
1 Resources configured.
============
Online: [ srv01 srv02 ]
Resource Group: group-1
prmDummy1 (ocf::heartbeat:Dummy): Started srv01
Migration summary:
* Node srv02:
* Node srv01:
Step2) Block one of the Heartbeat communication interfaces.
# iptables -A INPUT -i eth1 -s ! 192.168.10.110 -j DROP
# iptables -A INPUT -i eth1 -s ! 192.168.10.120 -j DROP
Step3) The SNMP manager receives the following traps.
(snip)
Jun 15 19:24:30 snmp-manager snmptrapd[4771]: 2011-06-15 19:24:30 <UNKNOWN> [UDP: [192.168.40.120]:59010]: DISMAN-EVENT-MIB::sysUpTimeInstance = Timeticks: (23014) 0:03:50.14 SNMPv2-MIB::snmpTrapOID.0 = OID: LINUX-HA-MIB::LHAIFStatusUpdate LINUX-HA-MIB::LHANodeName = STRING: srv01 LINUX-HA-MIB::LHAIFName = STRING: eth1 LINUX-HA-MIB::LHAIFStatus = INTEGER: down(2)
----> No problem.
Jun 15 19:24:32 snmp-manager snmptrapd[4771]: 2011-06-15 19:24:32 <UNKNOWN> [UDP: [192.168.40.110]:44001]: DISMAN-EVENT-MIB::sysUpTimeInstance = Timeticks: (23597) 0:03:55.97 SNMPv2-MIB::snmpTrapOID.0 = OID: LINUX-HA-MIB::LHANodeStatusUpdate LINUX-HA-MIB::LHANodeName = STRING: srv02 LINUX-HA-MIB::LHANodeStatus = INTEGER: active(3)
----> The active trap should not arrive at this point.
Jun 15 19:24:34 snmp-manager snmptrapd[4771]: 2011-06-15 19:24:34 <UNKNOWN> [UDP: [192.168.40.110]:44001]: DISMAN-EVENT-MIB::sysUpTimeInstance = Timeticks: (23803) 0:03:58.03 SNMPv2-MIB::snmpTrapOID.0 = OID: LINUX-HA-MIB::LHAIFStatusUpdate LINUX-HA-MIB::LHANodeName = STRING: srv02 LINUX-HA-MIB::LHAIFName = STRING: eth1 LINUX-HA-MIB::LHAIFStatus = INTEGER: down(2)
----> No problem.
(snip)
It is strange that the node's active trap arrives between the interface-down traps for the blocked interface.
I think the active trap should have been sent earlier.
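For anyone who wants to check a longer snmptrapd log for the same ordering, here is a minimal sketch in Python. It is not part of hbagent or Heartbeat; the function names are made up for illustration, and it assumes the log lines have the same format as the ones shown above. It flags any LHANodeStatusUpdate trap that arrives directly between two LHAIFStatusUpdate traps.

```python
import re

# Matches the trap name and node name in snmptrapd lines of the format
# shown in this report (assumption: other setups may log differently).
TRAP_RE = re.compile(
    r"snmpTrapOID\.0 = OID: LINUX-HA-MIB::(?P<trap>\w+)"
    r".*?LHANodeName = STRING: (?P<node>\S+)"
)


def parse_traps(lines):
    """Return (trap_name, node_name) tuples in arrival order."""
    traps = []
    for line in lines:
        match = TRAP_RE.search(line)
        if match:
            traps.append((match.group("trap"), match.group("node")))
    return traps


def find_sandwiched_node_traps(traps):
    """Return indexes of node-status traps that sit directly between
    two interface-status traps (hypothetical helper)."""
    hits = []
    for i in range(1, len(traps) - 1):
        if (traps[i][0] == "LHANodeStatusUpdate"
                and traps[i - 1][0] == "LHAIFStatusUpdate"
                and traps[i + 1][0] == "LHAIFStatusUpdate"):
            hits.append(i)
    return hits


if __name__ == "__main__":
    import sys

    # Usage: python check_traps.py < /path/to/snmptrapd.log
    parsed = parse_traps(sys.stdin)
    for index in find_sandwiched_node_traps(parsed):
        print("suspicious node trap:", parsed[index])
```

This only checks arrival order on the manager side, so it can show that the pattern occurs but not where the delay comes from.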
This problem seems to occur with Heartbeat 2.1.4.
I looked through some of the source code, and I suspect that the problem is in Heartbeat's client_lib.
It seems that the transmitted F_STATUS message is handled late.
Best Regards,
Hideo Yamauchi.