[Pacemaker] Not connected to AIS
Proskurin Kirill
k.proskurin at corp.mail.ru
Fri Jun 24 08:56:23 UTC 2011
Hello.
I have a strange problem.
One node in the cluster is not working correctly.
In the logs:
Jun 23 20:25:25 mysender39.example.com lrmd: [10371]: WARN: For LSB init script, no additional parameters are needed.
Jun 23 20:25:25 mysender39.example.com lrmd: [30679]: info: RA output: (onlineconf.init:3:stop:stdout) Stopping onlineconf_updater:
Jun 23 20:25:25 mysender39.example.com lrmd: [30679]: info: RA output: (onlineconf.init:3:stop:stdout) [
Jun 23 20:25:25 mysender39.example.com lrmd: [30679]: info: RA output: (onlineconf.init:3:stop:stdout) OK
Jun 23 20:25:25 mysender39.example.com lrmd: [30679]: info: RA output: (onlineconf.init:3:stop:stdout) ]
Jun 23 20:25:25 mysender39.example.com crmd: [30682]: info: process_lrm_event: LRM operation onlineconf.init:3_stop_0 (call=181, rc=0, cib-update=683339, confirmed=true) ok
Jun 23 20:25:25 mysender39.example.com cib: [30678]: ERROR: send_ais_message: Not connected to AIS
After that there are many errors, and this line repeats over and over.
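To see which daemon is producing the repeated message and when it started, the log can be summarized with a short script. This is only a minimal sketch: it assumes each entry is on a single line in the syslog file with the same prefix as in the excerpt above, and the function name count_matches is made up for illustration.

```python
import re
from collections import Counter

# Syslog prefix as seen in the excerpt, e.g.
#   "Jun 23 20:25:25 mysender39.example.com cib: [30678]: ERROR: ..."
PREFIX = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+(?P<host>\S+)\s+"
    r"(?P<daemon>[\w.-]+):\s+\[(?P<pid>\d+)\]:\s+(?P<level>\w+):"
)

def count_matches(lines, needle="Not connected to AIS"):
    """Count entries containing `needle`, grouped by (daemon, pid).

    Returns (counts, first_seen), where first_seen maps each
    (daemon, pid) to the timestamp of its first matching entry.
    """
    counts = Counter()
    first_seen = {}
    for line in lines:
        if needle not in line:
            continue
        m = PREFIX.match(line)
        if m is None:
            continue  # entry does not have the expected prefix
        key = (m.group("daemon"), int(m.group("pid")))
        counts[key] += 1
        first_seen.setdefault(key, m.group("ts"))
    return counts, first_seen
```

It can be fed an open log file directly, for example count_matches(open(path)) where path is wherever syslog writes the cluster messages on that node.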
But in crm_mon everything looks quiet:
Last updated: Fri Jun 24 12:35:05 2011
Stack: openais
Current DC: mysender6.example.com - partition with quorum
Version: 1.0.11-1554a83db0d3c3e546cfd3aaff6af1184f79ee87
4 Nodes configured, 4 expected votes
7 Resources configured.
Online: [ mysender6.example.com mysender31.example.com mysender38.example.com mysender39.example.com ]
And the clone resource on this node is "unmanaged":
onlineconf.init:3 (lsb:onlineconf): Started mysender39.example.com (unmanaged) FAILED
Failed actions:
onlineconf.init:3_monitor_5000 (node=mysender39.example.com, call=180, rc=7, status=complete): not running
onlineconf.init:3_stop_0 (node=mysender39.example.com, call=-1, rc=1, status=Timed Out): unknown error
In the logs:
Jun 24 12:43:15 mysender39.example.com attrd: [30680]: WARN: attrd_cib_callback: Update 333725 for fail-count-onlineconf.init:2=(null) failed: Remote node did not respond
But if I run it by hand, it answers immediately:
# /etc/init.d/onlineconf status
onlineconf_updater is stopped
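The same manual check can be scripted so that it also records the exit code and how long the call took. A minimal sketch, assuming the init script follows the LSB status exit codes (3 means "not running", which Pacemaker reports as "not running" for lsb: resources); the helper name timed_status and the 20 second timeout are made up for illustration and are not taken from the cluster configuration.

```python
import subprocess
import time

# Exit codes for the "status" action as defined by the LSB init script spec.
LSB_STATUS = {
    0: "running",
    1: "dead, pid file exists",
    2: "dead, lock file exists",
    3: "not running",
    4: "status unknown",
}

def timed_status(cmd, timeout=20.0):
    """Run `cmd` and return (exit_code, elapsed_seconds, meaning).

    exit_code is None if the command did not finish within `timeout`.
    """
    start = time.monotonic()
    try:
        proc = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return None, time.monotonic() - start, "timed out"
    elapsed = time.monotonic() - start
    meaning = LSB_STATUS.get(proc.returncode, "reserved/other")
    return proc.returncode, elapsed, meaning
```

For example, timed_status(["/etc/init.d/onlineconf", "status"]) would show whether the script itself is slow or returns an unexpected code, as opposed to the cluster layer failing to deliver the result.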
I ran /etc/init.d/corosync restart.
I waited for 5 minutes, but it was still at "Waiting for corosync services to unload",
so I killed it with -9 and restarted.
After that everything started normally again.
What went wrong?
Corosync-1.2.7
Pacemaker-1.0.11
--
Best regards,
Proskurin Kirill