[Pacemaker] crm node delete

Tue Jun 15 16:04:55 EDT 2010

Hi,

On Tue, Jun 15, 2010 at 05:09:14PM +0100, Maros Timko wrote:
> > On Fri, Jun 11, 2010 at 03:45:19PM +0100, Maros Timko wrote:
> >> Hi all,
> >>
> >> using heartbeat stack. I have a system with one node offline:
> >>  ============
> >>  Last updated: Fri Jun 11 13:52:40 2010
> >>  Stack: Heartbeat
> >>  Current DC: vsp7.example.com (ba6d6332-71dd-465b-a030-227bcd31a25f) - partition with quorum
> >>  Version: 1.0.7-d3fa20fc76c7947d6de66db7e52526dc6bd7d782
> >>  2 Nodes configured, 2 expected votes
> >>  3 Resources configured.
> >>  ============
> >>
> >>  Online: [ vsp7.example.com ]
> >>  OFFLINE: [ vsp8.example.com ]
> >>
> >> If I try to remove this offline node, I get:
> >>  [root@vsp7 ~]# crm node delete vsp8.example.com
> >>  WARNING: crm_node bad format:
> >>  ERROR: node vsp8.example.com/state "lost" not found in the id list
> >>  INFO: check output of crm_node -l
> >>  [root@vsp7 ~]# crm_node -l
> >>  [root@vsp7 ~]# echo $?
> >>  0
> >>  [root@vsp7 ~]# crm_node --list
> >>  [root@vsp7 ~]# echo $?
> >>  0
> >>  [root@vsp7 ~]# crm configure show
> >>  node $id="ba6d6332-71dd-465b-a030-227bcd31a25f" vsp7.example.com
> >>  node $id="edc0ba6f-017f-424e-9dbf-302021a2cbce" vsp8.example.com
> >>
> >> Pacemaker Explained suggests using lower-level commands for both Heartbeat and AIS:
> >>  cibadmin --delete --obj_type nodes --crm_xml '<node uname="pcmk-1"/>'
> >>  cibadmin --delete --obj_type status --crm_xml '<node_state uname="pcmk-1"/>'
> >>
> >>  [root@vsp7 ~]# crm_node --help | grep list
> >>   -l, --list  (AIS-Only) Display all known members (past and present) of this cluster
> >>
> >> So which is it: is "crm node delete" supported with heartbeat
> >> or not?
> >
> > Yes it is, but it looks like the stack wasn't recognized
> > correctly, i.e. crm thought it was running on openais. This is
> > the command for the check:
> >
> >        ps -e -o pid,command | grep -qs 'heartbeat:.[m]aster'
> >
> Thanks Dejan, so it is caused by my old issue that heartbeat does not
> report the details of the processes:
> # ps -ef|grep heart
> root     10682     1  0 11:39 ?        00:00:00 /usr/lib64/heartbeat/ha_logd -d -c /etc/logd.cf
> root     10683 10682  0 11:39 ?        00:00:00 /usr/lib64/heartbeat/ha_logd -d -c /etc/logd.cf
> root     10764     1  0 11:39 ?        00:00:00 /usr/lib64/heartbeat/heartbeat
> root     10783 10764  0 11:39 ?        00:00:00 /usr/lib64/heartbeat/heartbeat
> root     10784 10764  0 11:39 ?        00:00:00 /usr/lib64/heartbeat/heartbeat
> root     10785 10764  0 11:39 ?        00:00:00 /usr/lib64/heartbeat/heartbeat
> root     10786 10764  0 11:39 ?        00:00:00 /usr/lib64/heartbeat/heartbeat
> root     10788 10764  0 11:39 ?        00:00:00 /usr/lib64/heartbeat/heartbeat
> root     10791 10764  0 11:39 ?        00:00:00 /usr/lib64/heartbeat/heartbeat
> root     10792 10764  0 11:39 ?        00:00:00 /usr/lib64/heartbeat/heartbeat
> root     10793 10764  0 11:39 ?        00:00:00 /usr/lib64/heartbeat/heartbeat
> root     10794 10764  0 11:39 ?        00:00:00 /usr/lib64/heartbeat/heartbeat
> 781      10809 10764  0 11:39 ?        00:00:00 /usr/lib64/heartbeat/dopd
> 781      10810 10764  0 11:39 ?        00:00:00 /usr/lib64/heartbeat/ccm
> 781      10811 10764  0 11:39 ?        00:00:02 /usr/lib64/heartbeat/cib
> root     10812 10764  0 11:39 ?        00:00:00 /usr/lib64/heartbeat/lrmd -r
> root     10813 10764  0 11:39 ?        00:00:00 /usr/lib64/heartbeat/stonithd
> 781      10814 10764  0 11:39 ?        00:00:00 /usr/lib64/heartbeat/attrd
> 781      10815 10764  0 11:39 ?        00:00:00 /usr/lib64/heartbeat/crmd
> root     21477 17322  0 17:06 pts/2    00:00:00 grep heart
> 781      23957 10815  0 12:56 ?        00:00:00 /usr/lib64/heartbeat/pengine
> 
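To make the failure concrete: the check quoted above only matches when heartbeat has rewritten its process title to something like "heartbeat: master control process". In the listing above every heartbeat process shows only its binary path, so the grep finds nothing and crm falls back to openais. A minimal sketch of that behaviour, using the same pattern on two sample lines (the function name matches_heartbeat_master is made up for illustration, and the sample lines are assumptions modelled on the output above):

```shell
#!/bin/sh
# Hypothetical helper: reads ps-style output on stdin and succeeds only
# if a line carries the rewritten heartbeat master process title.
matches_heartbeat_master() {
    grep -qs 'heartbeat:.[m]aster'
}

# A line as it appears when heartbeat rewrites its process title.
printf '%s\n' '10764 heartbeat: master control process' \
    | matches_heartbeat_master && echo "titled: heartbeat detected"

# A line as it appears in the listing above (binary path only).
printf '%s\n' '10764 /usr/lib64/heartbeat/heartbeat' \
    | matches_heartbeat_master || echo "untitled: heartbeat NOT detected"
```

The second case is what happens on a system like this one, which is why crm node delete went down the openais path and called crm_node -l.
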
> I have not had a chance to try with the latest heartbeat yet. Will try
> as soon as possible.

I don't think that would help; there haven't been many changes in
heartbeat recently. Anyway, I'll figure out a better way to detect
the stack. Probably the best option is to check the
cluster-infrastructure property in the running CIB. Alternatively,
crm could try cl_status hbstatus (that should always exit with 0).
Does anybody have a better idea?
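
A rough sketch of the first idea, reading cluster-infrastructure from the running CIB instead of looking at process titles. This is only an illustration, not what crm does today: the function names are made up, the short crm_attribute options (-t, -n, -G, -Q) are assumed to behave as in the installed Pacemaker version, and only the two values "heartbeat" and "openais" are handled.

```shell
#!/bin/sh
# Hypothetical helper: map a cluster-infrastructure value to a stack name.
# Anything other than the two known values is reported as "unknown".
stack_from_infrastructure() {
    case "$1" in
        heartbeat) echo heartbeat ;;
        openais)   echo openais ;;
        *)         echo unknown ;;
    esac
}

# Hypothetical helper: ask the running CIB for the property.
# Prints nothing if the property is unset, the CIB is unreachable,
# or crm_attribute is not installed.
query_infrastructure() {
    crm_attribute -t crm_config -n cluster-infrastructure -G -Q 2>/dev/null
}

# Combine the two: query the CIB, then classify the answer.
detect_stack() {
    stack_from_infrastructure "$(query_infrastructure)"
}
```

Calling detect_stack on a cluster node would print the stack name; an "unknown" result is where a fallback such as the cl_status check could come in.
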

Thanks,

Dejan

> > Thanks,
> >
> > Dejan
> >
> >> Thanks,
> >> Tino