[Pacemaker] Pacemaker Digest, Vol 59, Issue 59

Dejan Muhamedagic dejanmm at fastmail.fm
Tue Oct 30 08:17:06 EDT 2012


On Thu, Oct 25, 2012 at 07:37:36PM +0530, vishal kumar wrote:
> >
> > Hi Dejan,
> >
> 
> I tried lrmadmin -M lsb httpd NULL and was able to get the meta-data.
> 
> I was able to add a resource via lrmadmin, but I cannot see that resource
> in the crm shell.

Right now I'm out of ideas. If your installation is OK, then it
should work. Do you use the "user" option in crm? Do you run crm
as the same user you ran lrmadmin as?
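
You could compare directly, e.g. (a sketch; "hacluster" is only an
example, substitute whatever user crm is configured to run commands as):

  sudo -u hacluster lrmadmin -M lsb httpd NULL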

Thanks,

Dejan


> Many Thanks
> Vishal
> 
> > Message: 1
> > Date: Tue, 23 Oct 2012 13:48:24 +0200
> > From: Dejan Muhamedagic <dejanmm at fastmail.fm>
> > To: The Pacemaker cluster resource manager
> >         <pacemaker at oss.clusterlabs.org>
> > Subject: Re: [Pacemaker] Pacemaker Digest, Vol 59, Issue 57
> > Message-ID: <20121023114823.GA19526 at walrus.homenet>
> > Content-Type: text/plain; charset=us-ascii
> >
> > Hi,
> >
> > On Tue, Oct 23, 2012 at 10:18:36AM +0530, vishal kumar wrote:
> > > Hello Florian,
> > >
> > > Thanks for the help.
> > >
> > > I tried crm ra meta sshd lsb but it gave me the same meta-data error.
> > >
> > > The problem is that when adding an LSB resource such as httpd from the
> > > crm shell (I am using RHEL), it gives the following error:
> > > ERROR: lsb:httpd: could not parse meta-data:
> > > ERROR: lsb:httpd: no such resource agent
> > >
> > > The httpd package is installed and there is an httpd file in the
> > > /etc/init.d directory. I am unable to add any resource that is in the
> > > /etc/init.d directory.
> >
> > Do you have cluster-glue installed? And is pacemaker running?
> > lrmadmin (part of cluster-glue) gets the meta-data from lrmd. You
> > can try it by hand like this:
> >
> > lrmadmin -M lsb httpd NULL
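> >
> > If everything is in order, that should print the agent's meta-data as an
> > XML document which lrmd generates from the init script.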
> >
> > Thanks,
> >
> > Dejan
> >
> > > Many thanks
> > >  Vishal
> > >
> > >
> > > > Message: 1
> > > > Date: Mon, 22 Oct 2012 17:53:23 +0530
> > > > From: vishal kumar <vishal.kanojwar at gmail.com>
> > > > To: pacemaker at oss.clusterlabs.org
> > > > Subject: [Pacemaker] lsb: could not parse meta-data
> > > > Message-ID: <CANz=RdkSq4t-SbbF+KBNWoB2fGR7xu7G6MRUgBPfZASBvNPdnA at mail.gmail.com>
> > > > Content-Type: text/plain; charset="iso-8859-1"
> > > >
> > > > Hi
> > > >
> > > > I am trying to configure pacemaker with corosync on RHEL 6.
> > > >
> > > > While trying to add LSB resources I get "could not parse meta-data".
> > > >
> > > > The pacemaker version is 1.1.7. Below is the error I get when I check
> > > > for meta-data in the crm shell:
> > > >
> > > > crm(live)ra# list lsb
> > > > abrt-ccpp        abrt-oops        abrtd              acpid            atd
> > > > auditd           autofs           certmonger         cgconfig         cgred
> > > > corosync         corosync-notifyd cpuspeed           crond            cups
> > > > haldaemon        halt             heartbeat          httpd            ip6tables
> > > > iptables         irqbalance       kdump              killall          ktune
> > > > lvm2-lvmetad     lvm2-monitor     matahari-broker    matahari-host    matahari-network
> > > > matahari-rpc     matahari-service matahari-sysconfig matahari-sysconfig-console
> > > > mcelogd          mdmonitor        messagebus         netconsole       netfs
> > > > network          nfs              nfslock            ntpd             ntpdate
> > > > oddjobd          pacemaker        portreserve        postfix          psacct
> > > > qpidd            quota_nld        rdisc              restorecond      rhnsd
> > > > rhsmcertd        rngd             rpcbind            rpcgssd          rpcidmapd
> > > > rpcsvcgssd       rsyslog          sandbox            saslauthd        single
> > > > smartd           sshd             sssd               sysstat          tuned
> > > > udev-post        ypbind
> > > > crm(live)ra# meta lsb heartbeat
> > > > ERROR: heartbeat:lsb: could not parse meta-data:
> > > > crm(live)ra# meta lsb sshd
> > > > ERROR: sshd:lsb: could not parse meta-data:
> > > > crm(live)ra# meta lsb ntpd
> > > > ERROR: ntpd:lsb: could not parse meta-data:
> > > > crm(live)ra# meta lsb httpd
> > > > ERROR: httpd:lsb: could not parse meta-data
> > > >
> > > > Please suggest where I am going wrong.
> > > > Thanks for the help.
> > > >
> > > > Thanks
> > > > Vishal
> > > >
> > > > ------------------------------
> > > >
> > > > Message: 2
> > > > Date: Mon, 22 Oct 2012 14:48:31 +0200
> > > > From: Florian Crouzat <gentoo at floriancrouzat.net>
> > > > To: pacemaker at oss.clusterlabs.org
> > > > Subject: Re: [Pacemaker] lsb: could not parse meta-data
> > > > Message-ID: <5085409F.801 at floriancrouzat.net>
> > > > Content-Type: text/plain; charset=ISO-8859-1; format=flowed
> > > >
> > > > On 22/10/2012 14:23, vishal kumar wrote:
> > > > > Hi
> > > >
> > > > > Please do suggest me where am i going wrong.
> > > > > Thanks for the help.
> > > > >
> > > >
> > > > See: crm ra help meta
> > > > Then try something like: crm ra meta sshd lsb  # parameter order matters
> > > >
> > > > Anyway, you won't learn much from the meta-data of an LSB init script:
> > > > it is just a script, not a cluster-oriented, real resource agent. It is
> > > > not multi-state or anything like that; it offers only
> > > > start/stop/monitor and the default mandatory settings.
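> > > >
> > > > For illustration, the meta-data crm shows for an LSB script is derived
> > > > from the standard LSB comment header at the top of the init script. A
> > > > generic sketch (not the actual RHEL httpd script) looks like this:
> > > >
> > > > ### BEGIN INIT INFO
> > > > # Provides:          httpd
> > > > # Required-Start:    $local_fs $network
> > > > # Required-Stop:     $local_fs $network
> > > > # Default-Start:     2 3 4 5
> > > > # Default-Stop:      0 1 6
> > > > # Short-Description: start and stop the Apache HTTP server
> > > > ### END INIT INFO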
> > > >
> > > >
> > > > --
> > > > Cheers,
> > > > Florian Crouzat
> > > >
> > > >
> >
> >
> >
> >
> > ------------------------------
> >
> > Message: 2
> > Date: Tue, 23 Oct 2012 16:44:29 +0200
> > From: Dejan Muhamedagic <dejanmm at fastmail.fm>
> > To: The Pacemaker cluster resource manager
> >         <pacemaker at oss.clusterlabs.org>
> > Subject: Re: [Pacemaker] Crm configure Error In Centos 5.4
> > Message-ID: <20121023144428.GA3970 at squib>
> > Content-Type: text/plain; charset=us-ascii
> >
> > Hi,
> >
> > On Mon, Oct 15, 2012 at 01:17:47PM +0530, Vinoth Narasimhan wrote:
> > > Hi Guys,
> > >
> > > I just installed the latest heartbeat, pacemaker, and corosync per the
> > > guidelines at the link below:
> > >
> > > http://www.clusterlabs.org/wiki/Install#READ_ME_FIRST
> > >
> > > I was able to configure heartbeat and corosync successfully and start
> > > them as well.
> > >
> > > But when I try to add a resource I get an error from Python:
> > >
> > > [root at CHE-PSS-072 lib64]# crm configure
> > > Traceback (most recent call last):
> > >   File "/usr/sbin/crm", line 41, in ?
> > >     crm.main.run()
> > >   File "/usr/lib64/python2.4/site-packages/crm/main.py", line 240, in run
> > >     parse_line(levels,["configure"])
> > >   File "/usr/lib64/python2.4/site-packages/crm/main.py", line 123, in parse_line
> > >     lvl.new_level(pt[token],token)
> > >   File "/usr/lib64/python2.4/site-packages/crm/levels.py", line 70, in new_level
> > >     self.current_level = level_obj()
> > >   File "/usr/lib64/python2.4/site-packages/crm/ui.py", line 1295, in __init__
> > >     cib_factory.initialize()
> > >   File "/usr/lib64/python2.4/site-packages/crm/cibconfig.py", line 1780, in initialize
> > >     if not self.import_cib():
> > >   File "/usr/lib64/python2.4/site-packages/crm/cibconfig.py", line 1454, in import_cib
> > >     self.doc,cib = read_cib(cibdump2doc)
> > >   File "/usr/lib64/python2.4/site-packages/crm/xmlutil.py", line 72, in read_cib
> > >     doc = fun(params)
> > >   File "/usr/lib64/python2.4/site-packages/crm/xmlutil.py", line 53, in cibdump2doc
> > >     doc = xmlparse(p.stdout)
> > >   File "/usr/lib64/python2.4/site-packages/crm/xmlutil.py", line 30, in xmlparse
> > >     except xml.parsers.expat.ExpatError,msg:
> > > AttributeError: 'module' object has no attribute 'expat'
> >
> > Hmm, the official python documentation
> > (http://docs.python.org/library/pyexpat.html) states that expat
> > is "New in version 2.0". Can you check your python installation?
> > AFAIK, crmsh is used on various EL5 systems.
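> >
> > A quick check, for what it's worth (python2 assumed, as shipped on EL5):
> >
> >   python -c 'import xml.parsers.expat; print xml.parsers.expat.ExpatError'
> >
> > If that raises an ImportError, the pyexpat module is missing or broken
> > in that Python build.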
> >
> > Thanks,
> >
> > Dejan
> >
> > > However, I do get the status correctly.
> > >
> > > [root at CHE-PSS-072 lib64]# crm status
> > > ============
> > > Last updated: Mon Oct 15 00:46:53 2012
> > > Stack: Heartbeat
> > > Current DC: che-pss-072.ps.in (dd4d1fd0-97ff-4a28-aac2-302ee1066e4c) -
> > > partition with quorum
> > > Version: 1.0.12-unknown
> > > 2 Nodes configured, unknown expected votes
> > > 0 Resources configured.
> > > ============
> > >
> > > Online: [ che-pss-072.ps.in ops-pss-084.ps.in ]
> > >
> > > Any help solving the error is greatly appreciated.
> > >
> > > Thanks,
> > > vinoth.
> >
> >
> >
> >
> >
> > ------------------------------
> >
> > Message: 3
> > Date: Tue, 23 Oct 2012 16:46:14 +0200
> > From: Dejan Muhamedagic <dejanmm at fastmail.fm>
> > To: The Pacemaker cluster resource manager
> >         <pacemaker at oss.clusterlabs.org>
> > Subject: Re: [Pacemaker] external/ssh stonith and repeated reboots
> > Message-ID: <20121023144614.GB3970 at squib>
> > Content-Type: text/plain; charset=us-ascii
> >
> > Hi,
> >
> > On Tue, Oct 16, 2012 at 03:00:08PM +1100, Andrew Beekhof wrote:
> > > On Sun, Oct 14, 2012 at 5:04 PM, James Harper
> > > <james.harper at bendigoit.com.au> wrote:
> > > > I'm using external/ssh in my test cluster (a bunch of vm's), and for
> > some reason the cluster has tried to terminate it but failed, like:
> > >
> > > Try fence_xvm instead.  Its actually reliable.
> > > You'd need the fence-virtd on the host and guests package and I've had
> > > plenty of success with the following as the config file on the host.
> > > Make sure key_file exists everywhere, start fence-virtd and test with
> > > "fence_xvm -o list" on the guest(s)
> >
> > There's also external/libvirt which should do fine.
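> >
> > A minimal sketch of configuring it (the hostlist and hypervisor_uri
> > parameter names come from the external/libvirt agent; the node names
> > and URI here are made up):
> >
> >   crm configure primitive st-libvirt stonith:external/libvirt \
> >     params hostlist="node0,node1" \
> >            hypervisor_uri="qemu+tcp://192.168.122.1/system" \
> >     op monitor interval="60s"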
> >
> > > SSH-based fencing isn't just "not for production"; it's a flat-out
> > > terrible idea.
> > > With much handwaving it is barely usable even for testing, as it
> > > requires the target to be alive, reachable, and behaving.
> >
> > Indeed.
> >
> > Thanks,
> >
> > Dejan
> >
> >
> >
> > ------------------------------
> >
> > Message: 4
> > Date: Tue, 23 Oct 2012 10:04:31 -0500 (CDT)
> > From: Andrew Martin <amartin at xes-inc.com>
> > To: The Pacemaker cluster resource manager
> >         <pacemaker at oss.clusterlabs.org>
> > Subject: Re: [Pacemaker] Behavior of Corosync+Pacemaker with DRBD
> >         primary power   loss
> > Message-ID: <5651a700-b509-40ec-8d79-49b77e53c30e at zimbra>
> > Content-Type: text/plain; charset="iso-8859-1"
> >
> > Hello,
> >
> > Under the Clusters from Scratch documentation, allow-two-primaries is set
> > in the DRBD configuration for an active/passive cluster:
> >
> > http://www.clusterlabs.org/doc/en-US/Pacemaker/1.1-crmsh/html-single/Clusters_from_Scratch/index.html#_write_the_drbd_config
> >
> > "TODO: Explain the reason for the allow-two-primaries option"
> >
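> > For reference, the option under discussion lives in the net section of
> > the DRBD resource definition (a DRBD 8.3-style sketch):
> >
> >   resource r0 {
> >     net {
> >       allow-two-primaries;
> >     }
> >   }
> >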
> > Is the reason for allow-two-primaries in this active/passive cluster
> > (using ext4, a non-cluster filesystem) to allow failover in the type of
> > situation I have described (where the old primary/master suddenly goes
> > offline, as with a power supply failure)? Are split-brains prevented
> > because Pacemaker ensures that only one node is promoted to Primary at
> > any time?
> >
> > Is it possible to recover from such a failure without allow-two-primaries?
> >
> > Thanks,
> >
> > Andrew
> >
> > ----- Original Message -----
> >
> > From: "Andrew Martin" <amartin at xes-inc.com>
> > To: "The Pacemaker cluster resource manager" <
> > pacemaker at oss.clusterlabs.org>
> > Sent: Friday, October 19, 2012 10:45:04 AM
> > Subject: [Pacemaker] Behavior of Corosync+Pacemaker with DRBD primary
> > power loss
> >
> >
> > Hello,
> >
> > I have a 3-node Pacemaker + Corosync cluster with two "real" nodes, node0
> > and node1, running a DRBD resource (single-primary), and the 3rd node in
> > standby acting as a quorum node. If node0 is running the DRBD resource,
> > and thus is DRBD Primary, and its power supply fails, will the DRBD
> > resource be promoted to Primary on node1?
> >
> > If I simply cut the DRBD replication link, node1 reports the following
> > state:
> >
> > Role:             Secondary/Unknown
> > Disk State:       UpToDate/DUnknown
> > Connection State: WFConnection
> >
> >
> > I cannot manually promote the DRBD resource because the peer is not
> > outdated:
> > 0: State change failed: (-7) Refusing to be Primary while peer is not
> > outdated
> > Command 'drbdsetup 0 primary' terminated with exit code 11
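> >
> > As far as I understand, I could force the promotion by hand with the
> > DRBD 8.3 command below (it declares the local data authoritative, so it
> > must be used with care), but what I am after is automatic recovery:
> >
> >   drbdadm -- --overwrite-data-of-peer primary r0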
> >
> > I have configured the CIB-based crm-fence-peer.sh utility in my drbd.conf:
> > fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
> > but I do not believe it is applicable in this scenario.
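> >
> > For completeness, the surrounding configuration follows the usual sketch
> > from the DRBD/Pacemaker integration docs (resource-only fencing is
> > simply what I picked):
> >
> >   disk {
> >     fencing resource-only;
> >   }
> >   handlers {
> >     fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
> >     after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
> >   }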
> >
> > If node0 goes offline like this and doesn't come back (e.g. after a
> > STONITH), does Pacemaker have a way to tell node1 that its peer is outdated
> > and to proceed with promoting the resource to primary?
> >
> > Thanks,
> >
> > Andrew
> >
> >
> >
> > ------------------------------
> >
> > Message: 5
> > Date: Tue, 23 Oct 2012 10:29:44 -0500
> > From: Justin Pasher <justinp at distribion.com>
> > To: pacemaker at oss.clusterlabs.org
> > Subject: Re: [Pacemaker] "Simple" LVM/drbd backed Primary/Secondary
> >         NFS cluster doesn't always failover cleanly
> > Message-ID: <5086B7E8.7010900 at distribion.com>
> > Content-Type: text/plain; charset=ISO-8859-1; format=flowed
> >
> > ----- Original Message -----
> >  > From: Andreas Kurz <andreas at hastexo.com>
> >  > Date: Sun, 21 Oct 2012 01:38:46 +0200
> >  > Subject: Re: [Pacemaker] "Simple" LVM/drbd backed Primary/Secondary
> > NFS cluster doesn't always failover cleanly
> >  > To: pacemaker at oss.clusterlabs.org
> >  >
> >  >
> > > On 10/18/2012 08:02 PM, Justin Pasher wrote:
> > >> I have a pretty basic setup by most people's standards, but there must
> > >> be something that is not quite right about it. Sometimes when I force a
> > >> resource failover from one server to the other, the clients with the NFS
> > >> mounts don't cleanly migrate to the new server. I configured this using
> > >> a few different "Pacemaker-DRBD-NFS" guides out there for reference (I
> > >> believe they were the Linbit guides).
> > > Are you using the latest "exportfs" resource agent from the GitHub
> > > repo? There have been bugfixes and improvements. Also, try moving the
> > > VIP for each export to the end of its group, so the IP the clients
> > > connect to is started last and stopped first.
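> > >
> > > A sketch of that ordering in crm syntax (resource names made up; group
> > > members start in listed order and stop in reverse):
> > >
> > >   group g_nfs p_lvm_nfs p_fs_nfs p_exportfs_nfs p_ip_nfs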
> > >
> > > Regards,
> > > Andreas
> >
> > I'm currently running the version that comes with the Debian
> > squeeze-backports resource-agents package (1:3.9.2-5~bpo60+1). I went
> > ahead and grabbed a copy of exportfs from the git repository. It's a
> > little risky for me to update the file right now, since the two
> > resources I am most worried about are the NFS shares for the XenServer
> > VDIs; when there is a hiccup in the connection to the NFS server, things
> > start exploding (e.g. guest VMs get disk errors and go read-only).
> >
> > I quickly scanned through the changes, and the biggest change I noticed
> > was how the .rmtab backup file is restored (it sorts and filters unique
> > entries instead of just concatenating the results to the end of
> > /var/lib/nfs/rmtab). I had actually tweaked that a little bit myself
> > earlier while trying to track down the problem.
> >
> > Ultimately I think my problem is more related to the NFS server itself
> > and how it handles "unknown" client connections after a failover. I've
> > seen people here and there mention that /var/lib/nfs should be on the
> > replicated device to maintain consistency after failover, but the
> > exportfs resource agent doesn't do anything like that. Is that no longer
> > needed? In any case, in my situation the problem is that I am
> > maintaining four independent NFS shares, each of which can fail over
> > separately (and run on either server at any time), so a simple copy of
> > the directory won't work since there is no "master" server at any given
> > time.
> >
> > Also, I did find a bug in the way backup_rmtab() filters the export list
> > for its backup. Because it looks for a leading AND a trailing colon (:),
> > it doesn't properly copy information about mounts pulled from
> > subdirectories under the NFS export (e.g. instead of mounting /home, a
> > client might mount /home/username, as with autofs, and that entry won't
> > get copied to the .rmtab backup). I'll file a bug report about that.
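> >
> > To illustrate (the rmtab entry format, as I understand it, is
> > host:directory:counter): a client that mounts /home/username leaves an
> > entry like
> >
> >   client.example.com:/home/username:0x00000001
> >
> > which a filter matching ":/home:" (leading and trailing colon) never
> > sees.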
> >
> > Thanks.
> >
> > --
> > Justin Pasher
> >
> >
> >
> > ------------------------------
> >
> > Message: 6
> > Date: Tue, 23 Oct 2012 10:50:11 -0500
> > From: Cal Heldenbrand <cal at fbsdata.com>
> > To: pacemaker at oss.clusterlabs.org
> > Subject: [Pacemaker] crm_simulate a resource failure
> > Message-ID: <CAAcwKhf6iLv9q9H3BgTSQ5scLPpuDmUeZ1fLWhYf09nRwRjn2g at mail.gmail.com>
> > Content-Type: text/plain; charset="iso-8859-1"
> >
> > Hi everyone,
> >
> > I'm not able to find documentation or examples on this. If I have a
> > cloned primitive set across a cluster, how can I simulate the failure of
> > the resource on an individual node? I mainly want to see the scores
> > behind why a particular action is taken so I can adjust my configs.
> >
> > I think the --op-fail parameter is what I need, but I just don't get the
> > syntax of its value from the man page.
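> >
> > My best guess at the syntax (an operation key of the form
> > rsc_task_interval@node=rc, with the interval in milliseconds and rc 7
> > meaning "not running"; the resource and node names are made up) would be:
> >
> >   crm_simulate -L -S -s --op-fail="myclone:0_monitor_10000@node1=7"
> >
> > but I am not sure that is right.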
> >
> > Thank you!
> >
> > --Cal
> >
> > ------------------------------
> >
> >
> >
> > End of Pacemaker Digest, Vol 59, Issue 59
> > *****************************************
> >

> _______________________________________________
> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> 
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org




