[Pacemaker] pacemaker unable to start
Steven Dake
sdake at redhat.com
Wed Oct 21 15:49:42 UTC 2009
I recommend using corosync 1.1.1, which includes several bug fixes, one
of them critical for proper pacemaker operation. It won't fix this
particular problem, however.
Corosync loads pacemaker by searching for a pacemaker lcrso file. These
files are installed by default in /usr/libexec/lcrso, but may be in a
different location depending on your distribution. Check that one exists
on your system.
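
As a rough way to check for the file, here is a minimal Python sketch
that looks for a pacemaker lcrso file. Only /usr/libexec/lcrso comes from
the description above; the other directories, the "pacemaker" name
fragment, and the function name find_lcrso are assumptions for
illustration, so adjust them for your distribution.

```python
import os

# Only the first directory is named in this message; the others are
# guesses at where a distribution might place lcrso files.
CANDIDATE_DIRS = [
    "/usr/libexec/lcrso",
    "/usr/lib/lcrso",      # assumption
    "/usr/lib64/lcrso",    # assumption
]


def find_lcrso(dirs, name_fragment="pacemaker"):
    """Return paths of *.lcrso files whose file name contains name_fragment."""
    found = []
    for directory in dirs:
        if not os.path.isdir(directory):
            continue  # skip directories that do not exist on this system
        for entry in sorted(os.listdir(directory)):
            if entry.endswith(".lcrso") and name_fragment in entry:
                found.append(os.path.join(directory, entry))
    return found


if __name__ == "__main__":
    hits = find_lcrso(CANDIDATE_DIRS)
    if hits:
        for path in hits:
            print("found:", path)
    else:
        print("no pacemaker lcrso file found in:", ", ".join(CANDIDATE_DIRS))
```

If nothing is found, that would be consistent with the "Service failed to
load 'pacemaker'" line in the log, though other causes are possible.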
Regards
-steve
On Wed, 2009-10-21 at 11:13 -0400, Shravan Mishra wrote:
> Hello guys,
>
> We are running
>
> corosync-1.0.0
> heartbeat-2.99.1
> pacemaker-1.0.4
>
> the corosync.conf under /etc/corosync/ is
>
> ============
> # Please read the corosync.conf.5 manual page
> compatibility: whitetank
>
> aisexec {
> user: root
> group: root
> }
> totem {
> version: 2
> secauth: off
> threads: 0
> interface {
> ringnumber: 0
> bindnetaddr: 172.30.0.0
> mcastaddr:226.94.1.1
> mcastport: 5406
> }
> }
>
> logging {
> fileline: off
> to_stderr: yes
> to_logfile: yes
> to_syslog: yes
> logfile: /tmp/corosync.log
> debug: on
> timestamp: on
> logger_subsys {
> subsys: pacemaker
> debug: on
> tags: enter|leave|trace1|trace2| trace3|trace4|trace6
> }
> }
>
>
> service {
> name: pacemaker
> ver: 0
> # use_mgmtd: yes
> # use_logd:yes
> }
>
>
> corosync {
> user: root
> group: root
> }
>
>
> amf {
> mode: disabled
> }
> ============
>
>
> #service corosync start
>
> starts the messaging but fails to load pacemaker,
>
> /tmp/corosync.log ---
>
> ==================
>
> Oct 21 11:05:43 corosync [MAIN ] Corosync Cluster Engine ('trunk'):
> started and ready to provide service.
> Oct 21 11:05:43 corosync [MAIN ] Successfully read main configuration
> file '/etc/corosync/corosync.conf'.
> Oct 21 11:05:43 corosync [TOTEM ] Token Timeout (1000 ms) retransmit
> timeout (238 ms)
> Oct 21 11:05:43 corosync [TOTEM ] token hold (180 ms) retransmits
> before loss (4 retrans)
> Oct 21 11:05:43 corosync [TOTEM ] join (50 ms) send_join (0 ms)
> consensus (800 ms) merge (200 ms)
> Oct 21 11:05:43 corosync [TOTEM ] downcheck (1000 ms) fail to recv
> const (50 msgs)
> Oct 21 11:05:43 corosync [TOTEM ] seqno unchanged const (30 rotations)
> Maximum network MTU 1500
> Oct 21 11:05:43 corosync [TOTEM ] window size per rotation (50
> messages) maximum messages per rotation (17 messages)
> Oct 21 11:05:43 corosync [TOTEM ] send threads (0 threads)
> Oct 21 11:05:43 corosync [TOTEM ] RRP token expired timeout (238 ms)
> Oct 21 11:05:43 corosync [TOTEM ] RRP token problem counter (2000 ms)
> Oct 21 11:05:43 corosync [TOTEM ] RRP threshold (10 problem count)
> Oct 21 11:05:43 corosync [TOTEM ] RRP mode set to none.
> Oct 21 11:05:43 corosync [TOTEM ] heartbeat_failures_allowed (0)
> Oct 21 11:05:43 corosync [TOTEM ] max_network_delay (50 ms)
> Oct 21 11:05:43 corosync [TOTEM ] HeartBeat is Disabled. To enable set
> heartbeat_failures_allowed > 0
> Oct 21 11:05:43 corosync [TOTEM ] Initializing transmit/receive
> security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
> Oct 21 11:05:43 corosync [TOTEM ] Receive multicast socket recv buffer
> size (262142 bytes).
> Oct 21 11:05:43 corosync [TOTEM ] Transmit multicast socket send
> buffer size (262142 bytes).
> Oct 21 11:05:43 corosync [TOTEM ] The network interface [172.30.0.145]
> is now up.
> Oct 21 11:05:43 corosync [TOTEM ] Created or loaded sequence id
> 184.172.30.0.145 for this ring.
> Oct 21 11:05:43 corosync [TOTEM ] entering GATHER state from 15.
> Oct 21 11:05:43 corosync [SERV ] Service failed to load 'pacemaker'.
> Oct 21 11:05:43 corosync [SERV ] Service initialized 'corosync
> extended virtual synchrony service'
> Oct 21 11:05:43 corosync [SERV ] Service initialized 'corosync
> configuration service'
> Oct 21 11:05:43 corosync [SERV ] Service initialized 'corosync
> cluster closed process group service v1.01'
> Oct 21 11:05:43 corosync [SERV ] Service initialized 'corosync
> cluster config database access v1.01'
> Oct 21 11:05:43 corosync [SERV ] Service initialized 'corosync
> profile loading service'
> Oct 21 11:05:43 corosync [MAIN ] Compatibility mode set to
> whitetank. Using V1 and V2 of the synchronization engine.
> Oct 21 11:05:43 corosync [TOTEM ] Creating commit token because I am
> the rep.
> Oct 21 11:05:43 corosync [TOTEM ] Saving state aru 0 high seq received
> 0
> Oct 21 11:05:43 corosync [TOTEM ] Storing new sequence id for ring bc
> Oct 21 11:05:43 corosync [TOTEM ] entering COMMIT state.
> Oct 21 11:05:43 corosync [TOTEM ] got commit token
> Oct 21 11:05:43 corosync [TOTEM ] entering RECOVERY state.
> Oct 21 11:05:43 corosync [TOTEM ] position [0] member 172.30.0.145:
> Oct 21 11:05:43 corosync [TOTEM ] previous ring seq 184 rep
> 172.30.0.145
> Oct 21 11:05:43 corosync [TOTEM ] aru 0 high delivered 0 received flag
> 1
> Oct 21 11:05:43 corosync [TOTEM ] Did not need to originate any
> messages in recovery.
> Oct 21 11:05:43 corosync [TOTEM ] got commit token
> Oct 21 11:05:43 corosync [TOTEM ] Sending initial ORF token
> Oct 21 11:05:43 corosync [TOTEM ] token retrans flag is 0 my set
> retrans flag0 retrans queue empty 1 count 0, aru 0
> Oct 21 11:05:43 corosync [TOTEM ] install seq 0 aru 0 high seq
> received 0
> Oct 21 11:05:43 corosync [TOTEM ] token retrans flag is 0 my set
> retrans flag0 retrans queue empty 1 count 1, aru 0
> Oct 21 11:05:43 corosync [TOTEM ] install seq 0 aru 0 high seq
> received 0
> Oct 21 11:05:43 corosync [TOTEM ] token retrans flag is 0 my set
> retrans flag0 retrans queue empty 1 count 2, aru 0
> Oct 21 11:05:43 corosync [TOTEM ] install seq 0 aru 0 high seq
> received 0
> Oct 21 11:05:43 corosync [TOTEM ] token retrans flag is 0 my set
> retrans flag0 retrans queue empty 1 count 3, aru 0
> Oct 21 11:05:43 corosync [TOTEM ] install seq 0 aru 0 high seq
> received 0
> Oct 21 11:05:43 corosync [TOTEM ] retrans flag count 4 token aru 0
> install seq 0 aru 0 0
> Oct 21 11:05:43 corosync [TOTEM ] recovery to regular 1-0
> Oct 21 11:05:43 corosync [TOTEM ] Delivering to app 1 to 0
> Oct 21 11:05:43 corosync [SYNC ] This node is within the primary
> component and will provide service.
> Oct 21 11:05:43 corosync [TOTEM ] entering OPERATIONAL state.
> Oct 21 11:05:43 corosync [TOTEM ] A processor joined or left the
> membership and a new membership was formed.
> Oct 21 11:05:43 corosync [TOTEM ] mcasted message added to pending
> queue
> Oct 21 11:05:43 corosync [TOTEM ] Delivering 0 to 1
> Oct 21 11:05:43 corosync [TOTEM ] Delivering MCAST message with seq 1
> to pending delivery queue
> Oct 21 11:05:43 corosync [SYNC ] confchg entries 1
> Oct 21 11:05:43 corosync [SYNC ] Barrier Start Received From
> -1862263124
> Oct 21 11:05:43 corosync [SYNC ] Barrier completion status for nodeid
> -1862263124 = 1.
> ==================
>
>
>
>
> I'm curious to know how corosync/openais actually loads pacemaker. The
> config directive seems to be what does the magic, but apparently not in
> my case.
> What should I be looking for? The log message gives hardly any
> information.
>
>
> Pacemaker comprises a bunch of daemons such as crmd and stonithd. I
> ran them individually to check for permission problems on directories
> like /var/lib/heartbeat and /var/run/heartbeat, which should be owned
> by hacluster:haclient.
>
>
>
>
> Even after doing those it fails to load.
>
>
>
>
> Please advise me on what I should do.
>
>
>
>
> Thanks
> Shravan
>
>
>
>
>
>
> _______________________________________________
> Pacemaker mailing list
> Pacemaker at oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker