[Pacemaker] pacemaker unable to start

Shravan Mishra shravan.mishra at gmail.com
Wed Oct 21 11:13:02 EDT 2009


Hello guys,

We are running:

corosync-1.0.0
heartbeat-2.99.1
pacemaker-1.0.4

The corosync.conf under /etc/corosync/ is:

============
# Please read the corosync.conf.5 manual page
compatibility: whitetank

aisexec {
       user: root
       group: root
}
totem {
       version: 2
       secauth: off
       threads: 0
       interface {
               ringnumber: 0
               bindnetaddr: 172.30.0.0
               mcastaddr: 226.94.1.1
               mcastport: 5406
       }
}

logging {
       fileline: off
       to_stderr: yes
       to_logfile: yes
       to_syslog: yes
       logfile: /tmp/corosync.log
       debug: on
       timestamp: on
       logger_subsys {
               subsys: pacemaker
               debug: on
               tags: enter|leave|trace1|trace2| trace3|trace4|trace6
       }
}


service {
       name: pacemaker
       ver: 0
       # use_mgmtd: yes
       # use_logd: yes
}


corosync {
       user: root
       group: root
}


amf {
       mode: disabled
}
============
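
To double-check that the file really contains a service block naming pacemaker, a simplified reader for this brace-and-colon layout could look like the sketch below. It is not corosync's own parser: the names parse_conf and find_sections are hypothetical, lines it does not recognise are silently ignored, and it only assumes the "section { key: value }" shape seen in the file above.

```python
def parse_conf(text):
    """Parse 'section { key: value }' text into a nested dict tree."""
    root = {"name": None, "values": {}, "children": []}
    stack = [root]
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and padding
        if not line:
            continue
        if line.endswith("{"):
            # Open a new section nested under the current one.
            node = {"name": line[:-1].strip(), "values": {}, "children": []}
            stack[-1]["children"].append(node)
            stack.append(node)
        elif line == "}":
            if len(stack) == 1:
                raise ValueError("unexpected '}'")
            stack.pop()
        elif ":" in line:
            # Split on the first colon only, so values may contain colons.
            key, value = line.split(":", 1)
            stack[-1]["values"][key.strip()] = value.strip()
    if len(stack) != 1:
        raise ValueError("missing '}'")
    return root


def find_sections(root, name):
    """Return the key/value dicts of all top-level sections called `name`."""
    return [c["values"] for c in root["children"] if c["name"] == name]


# A shortened copy of the configuration above, used as sample input.
SAMPLE = """\
compatibility: whitetank
totem {
       version: 2
       interface {
               ringnumber: 0
               mcastaddr:226.94.1.1
       }
}
service {
       name: pacemaker
       ver: 0
    #   use_mgmtd: yes
}
"""

tree = parse_conf(SAMPLE)
print(find_sections(tree, "service"))
```

An empty list from find_sections(tree, "service") would mean the service block is missing or misspelled. A non-empty list only confirms that the directive is present, not that the plugin can be loaded.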


Running

# service corosync start

starts the messaging layer but fails to load pacemaker. From /tmp/corosync.log:

==================

Oct 21 11:05:43 corosync [MAIN  ] Corosync Cluster Engine ('trunk'): started and ready to provide service.
Oct 21 11:05:43 corosync [MAIN  ] Successfully read main configuration file '/etc/corosync/corosync.conf'.
Oct 21 11:05:43 corosync [TOTEM ] Token Timeout (1000 ms) retransmit timeout (238 ms)
Oct 21 11:05:43 corosync [TOTEM ] token hold (180 ms) retransmits before loss (4 retrans)
Oct 21 11:05:43 corosync [TOTEM ] join (50 ms) send_join (0 ms) consensus (800 ms) merge (200 ms)
Oct 21 11:05:43 corosync [TOTEM ] downcheck (1000 ms) fail to recv const (50 msgs)
Oct 21 11:05:43 corosync [TOTEM ] seqno unchanged const (30 rotations) Maximum network MTU 1500
Oct 21 11:05:43 corosync [TOTEM ] window size per rotation (50 messages) maximum messages per rotation (17 messages)
Oct 21 11:05:43 corosync [TOTEM ] send threads (0 threads)
Oct 21 11:05:43 corosync [TOTEM ] RRP token expired timeout (238 ms)
Oct 21 11:05:43 corosync [TOTEM ] RRP token problem counter (2000 ms)
Oct 21 11:05:43 corosync [TOTEM ] RRP threshold (10 problem count)
Oct 21 11:05:43 corosync [TOTEM ] RRP mode set to none.
Oct 21 11:05:43 corosync [TOTEM ] heartbeat_failures_allowed (0)
Oct 21 11:05:43 corosync [TOTEM ] max_network_delay (50 ms)
Oct 21 11:05:43 corosync [TOTEM ] HeartBeat is Disabled. To enable set heartbeat_failures_allowed > 0
Oct 21 11:05:43 corosync [TOTEM ] Initializing transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
Oct 21 11:05:43 corosync [TOTEM ] Receive multicast socket recv buffer size (262142 bytes).
Oct 21 11:05:43 corosync [TOTEM ] Transmit multicast socket send buffer size (262142 bytes).
Oct 21 11:05:43 corosync [TOTEM ] The network interface [172.30.0.145] is now up.
Oct 21 11:05:43 corosync [TOTEM ] Created or loaded sequence id 184.172.30.0.145 for this ring.
Oct 21 11:05:43 corosync [TOTEM ] entering GATHER state from 15.
Oct 21 11:05:43 corosync [SERV  ] *Service failed to load 'pacemaker'.*
Oct 21 11:05:43 corosync [SERV  ] Service initialized 'corosync extended virtual synchrony service'
Oct 21 11:05:43 corosync [SERV  ] Service initialized 'corosync configuration service'
Oct 21 11:05:43 corosync [SERV  ] Service initialized 'corosync cluster closed process group service v1.01'
Oct 21 11:05:43 corosync [SERV  ] Service initialized 'corosync cluster config database access v1.01'
Oct 21 11:05:43 corosync [SERV  ] Service initialized 'corosync profile loading service'
Oct 21 11:05:43 corosync [MAIN  ] Compatibility mode set to whitetank.  Using V1 and V2 of the synchronization engine.
Oct 21 11:05:43 corosync [TOTEM ] Creating commit token because I am the rep.
Oct 21 11:05:43 corosync [TOTEM ] Saving state aru 0 high seq received 0
Oct 21 11:05:43 corosync [TOTEM ] Storing new sequence id for ring bc
Oct 21 11:05:43 corosync [TOTEM ] entering COMMIT state.
Oct 21 11:05:43 corosync [TOTEM ] got commit token
Oct 21 11:05:43 corosync [TOTEM ] entering RECOVERY state.
Oct 21 11:05:43 corosync [TOTEM ] position [0] member 172.30.0.145:
Oct 21 11:05:43 corosync [TOTEM ] previous ring seq 184 rep 172.30.0.145
Oct 21 11:05:43 corosync [TOTEM ] aru 0 high delivered 0 received flag 1
Oct 21 11:05:43 corosync [TOTEM ] Did not need to originate any messages in recovery.
Oct 21 11:05:43 corosync [TOTEM ] got commit token
Oct 21 11:05:43 corosync [TOTEM ] Sending initial ORF token
Oct 21 11:05:43 corosync [TOTEM ] token retrans flag is 0 my set retrans flag0 retrans queue empty 1 count 0, aru 0
Oct 21 11:05:43 corosync [TOTEM ] install seq 0 aru 0 high seq received 0
Oct 21 11:05:43 corosync [TOTEM ] token retrans flag is 0 my set retrans flag0 retrans queue empty 1 count 1, aru 0
Oct 21 11:05:43 corosync [TOTEM ] install seq 0 aru 0 high seq received 0
Oct 21 11:05:43 corosync [TOTEM ] token retrans flag is 0 my set retrans flag0 retrans queue empty 1 count 2, aru 0
Oct 21 11:05:43 corosync [TOTEM ] install seq 0 aru 0 high seq received 0
Oct 21 11:05:43 corosync [TOTEM ] token retrans flag is 0 my set retrans flag0 retrans queue empty 1 count 3, aru 0
Oct 21 11:05:43 corosync [TOTEM ] install seq 0 aru 0 high seq received 0
Oct 21 11:05:43 corosync [TOTEM ] retrans flag count 4 token aru 0 install seq 0 aru 0 0
Oct 21 11:05:43 corosync [TOTEM ] recovery to regular 1-0
Oct 21 11:05:43 corosync [TOTEM ] Delivering to app 1 to 0
Oct 21 11:05:43 corosync [SYNC  ] This node is within the primary component and will provide service.
Oct 21 11:05:43 corosync [TOTEM ] entering OPERATIONAL state.
Oct 21 11:05:43 corosync [TOTEM ] A processor joined or left the membership and a new membership was formed.
Oct 21 11:05:43 corosync [TOTEM ] mcasted message added to pending queue
Oct 21 11:05:43 corosync [TOTEM ] Delivering 0 to 1
Oct 21 11:05:43 corosync [TOTEM ] Delivering MCAST message with seq 1 to pending delivery queue
Oct 21 11:05:43 corosync [SYNC  ] confchg entries 1
Oct 21 11:05:43 corosync [SYNC  ] Barrier Start Received From -1862263124
Oct 21 11:05:43 corosync [SYNC  ] Barrier completion status for nodeid -1862263124 = 1.
==================


I'm curious how corosync/openais actually loads pacemaker. The service
directive in the config is supposed to do the magic, but apparently not in
my case. What should I be looking for? The log message gives hardly any
information.
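
Since the relevant lines are buried among the TOTEM output, one way to pull out just the service-loading results is sketched below. It assumes the log format shown above (a "[SERV  ]" tag followed by "Service initialized '...'" or "Service failed to load '...'"); the function name summarize_services is hypothetical. Whitespace is collapsed first so that entries wrapped across lines still match.

```python
import re

# Matches the two SERV messages seen in the log above. The optional '*'
# allows for the emphasis markers around the failure line.
SERV_RE = re.compile(
    r"\[SERV\s*\]\s*\*?Service (initialized|failed to load) '([^']*)'"
)


def summarize_services(log_text):
    """Group service names by whether corosync initialized or failed to load them."""
    flat = " ".join(log_text.split())  # undo line wrapping and repeated spaces
    result = {"initialized": [], "failed to load": []}
    for status, name in SERV_RE.findall(flat):
        result[status].append(name)
    return result


# Two entries copied from the log above, one of them wrapped.
sample = """\
Oct 21 11:05:43 corosync [SERV  ] *Service failed to load 'pacemaker'.*
Oct 21 11:05:43 corosync [SERV  ] Service initialized 'corosync extended
virtual synchrony service'
"""

summary = summarize_services(sample)
print("failed:", summary["failed to load"])
print("loaded:", summary["initialized"])
```

For the real file, the text could be read with open("/tmp/corosync.log").read() and passed to the same function. This only narrows down which service failed; it does not explain why.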

Pacemaker comprises a bunch of daemons such as crmd and stonithd. I ran
them individually to check for permission problems, for example on
/var/lib/heartbeat and /var/run/heartbeat, which should be owned by
hacluster:haclient.
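
The ownership check described above can be scripted. The following is a minimal sketch using only the Python standard library; the helper name check_owner is hypothetical, and the two paths and the hacluster:haclient owner are taken from the description above rather than from any Pacemaker documentation.

```python
import grp
import os
import pwd


def check_owner(path, user, group):
    """Return a list of problems found; an empty list means owner and group match."""
    try:
        st = os.stat(path)
    except OSError as exc:
        return [f"{path}: cannot stat ({exc.strerror})"]

    # Fall back to the numeric id when it has no passwd/group entry.
    try:
        owner = pwd.getpwuid(st.st_uid).pw_name
    except KeyError:
        owner = str(st.st_uid)
    try:
        group_name = grp.getgrgid(st.st_gid).gr_name
    except KeyError:
        group_name = str(st.st_gid)

    problems = []
    if owner != user:
        problems.append(f"{path}: owner is {owner}, expected {user}")
    if group_name != group:
        problems.append(f"{path}: group is {group_name}, expected {group}")
    return problems


if __name__ == "__main__":
    for path in ("/var/lib/heartbeat", "/var/run/heartbeat"):
        for problem in check_owner(path, "hacluster", "haclient"):
            print(problem)
```

No output means both directories exist with the expected owner and group. It does not check permission bits or the contents of the directories.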


Even after fixing those, pacemaker still fails to load.


Please advise what I should do.


Thanks
Shravan