[Pacemaker] /var/lib/pacemaker/cores cleanup
Andrew Beekhof
andrew at beekhof.net
Thu Nov 7 23:27:39 UTC 2013
On 7 Oct 2013, at 5:52 pm, Mailing List SVR <lists at svrinformatica.it> wrote:
> On 07/10/2013 04:16, Andrew Beekhof wrote:
>> On 05/10/2013, at 7:11 AM, Mailing List SVR <lists at svrinformatica.it> wrote:
>>
>>
>>> Hi,
>>>
>>> I have a Pacemaker cluster that has been running fine for 2 months. I noticed that the folder /var/lib/pacemaker/cores/root holds about 1.5 GB of core.xxxx files. Who is responsible for cleaning up these files?
>>>
>> Ideally they would have been reported upstream so the underlying problem that caused them could be fixed.
>
> If you are interested, here are some core dumps:
>
> http://195.250.34.59/temp/cores.tar.bz2
>
Dammit, we're not correctly collecting metadata for the 'service' class.
These core files are produced when we try to parse the result as XML.
At least, that's what I see when comparing:
[root at pcmk-5 ~]# crm_resource --show-metadata service:nfs
Usage: nfs {start|stop|status|restart|reload|force-reload|condrestart|try-restart|condstop}
vs.
[root at pcmk-5 ~]# crm_resource --show-metadata lsb:nfs
<?xml version='1.0'?>
<!DOCTYPE resource-agent SYSTEM 'ra-api-1.dtd'>
<resource-agent name='nfs' version='0.1'>
  <version>1.0</version>
  <longdesc lang='en'>
NFS is a popular protocol for file sharing across networks.
This service provides NFS server functionality, which is \
configured via the /etc/exports file.
  </longdesc>
  <shortdesc lang='en'>nfs</shortdesc>
  <parameters>
  </parameters>
  <actions>
    <action name='meta-data' timeout='5' />
    <action name='start' timeout='15' />
    <action name='stop' timeout='15' />
    <action name='status' timeout='15' />
    <action name='restart' timeout='15' />
    <action name='force-reload' timeout='15' />
    <action name='monitor' timeout='15' interval='15' />
  </actions>
  <special tag='LSB'>
    <Provides></Provides>
    <Required-Start></Required-Start>
    <Required-Stop></Required-Stop>
    <Should-Start></Should-Start>
    <Should-Stop></Should-Stop>
    <Default-Start></Default-Start>
    <Default-Stop></Default-Stop>
  </special>
</resource-agent>
Fixed in https://github.com/beekhof/pacemaker/commit/644752e
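
For illustration only, and not the code from the commit above: a minimal Python sketch of the failure mode described here. The 'service' agent printed a usage line where XML metadata was expected, so a consumer should confirm the text parses as XML before using it. The function name parse_metadata and the sample strings are hypothetical.

```python
import xml.etree.ElementTree as ET


def parse_metadata(text):
    """Return the root element if text is well-formed resource-agent XML, else None."""
    try:
        root = ET.fromstring(text)
    except ET.ParseError:
        # Not XML at all, e.g. an init script's "Usage: ..." line.
        return None
    if root.tag != "resource-agent":
        # Well-formed XML, but not agent metadata.
        return None
    return root


# Hypothetical samples modelled on the two outputs shown above.
usage_output = "Usage: nfs {start|stop|status|restart}"
xml_output = "<resource-agent name='nfs' version='0.1'><version>1.0</version></resource-agent>"

print(parse_metadata(usage_output))            # None
print(parse_metadata(xml_output).get("name"))  # nfs
```

Rejecting the usage line up front means the caller can report "no metadata available" rather than failing inside the parser.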
> This is a Pacemaker/CMAN cluster on CentOS 6.4.
>
> pacemaker-libs-1.1.8-7.el6.x86_64
> pacemaker-cluster-libs-1.1.8-7.el6.x86_64
> pacemaker-1.1.8-7.el6.x86_64
> pacemaker-cli-1.1.8-7.el6.x86_64
> cman-3.0.12.1-49.el6_4.2.x86_64
>
> pcs config
> Corosync Nodes:
>
> Pacemaker Nodes:
> server3.<domain.com> server4.<domain.com>
>
> Resources:
> Master: DatiClone
> Resource: Dati (provider=linbit type=drbd class=ocf)
> Attributes: drbd_resource=dati
> Operations: monitor interval=120s
> Resource: DatiFs (provider=heartbeat type=Filesystem class=ocf)
> Attributes: device=/dev/drbd/by-res/dati directory=/srv/dati fstype=ext4 options=noatime,nodiratime,nodev run_fsck=force
> Resource: ClusterIp (provider=heartbeat type=IPaddr2 class=ocf)
> Attributes: ip=172.16.20.9 cidr_netmask=32
> Operations: monitor interval=60s
> Resource: Smb (type=smb class=service)
> Operations: monitor interval=1min
> Resource: Nmb (type=nmb class=service)
> Operations: monitor interval=1min
> Resource: PgSQL (type=postgresql class=service)
> Operations: monitor interval=1min
> Resource: SmbManager (type=smbmanager class=service)
> Operations: monitor interval=5min
> Resource: ipmi-fencing3 (type=fence_ipmilan class=stonith)
> Attributes: pcmk_host_list=server3.<domain.com> ipaddr=172.16.20.6 login=root passwd=pwd123 lanplus=1
> Operations: monitor interval=60s
> Resource: ipmi-fencing4 (type=fence_ipmilan class=stonith)
> Attributes: pcmk_host_list=server4.<domain.com> ipaddr=172.16.20.7 login=root passwd=pwd123 lanplus=1
> Operations: monitor interval=60s
>
> Location Constraints:
> Resource: ipmi-fencing4
> Disabled on: server4.<domain.com>
> Resource: ipmi-fencing3
> Disabled on: server3.<domain.com>
> Ordering Constraints:
> start ClusterIp then start Smb
> start Nmb then start Smb
> promote DatiClone then start DatiFs
> start DatiFs then start Nmb
> start DatiFs then start PgSQL
> start PgSQL then start SmbManager
> Colocation Constraints:
> ClusterIp with Smb
> Smb with Nmb
> Smb with DatiFs
> DatiFs with DatiClone (with-rsc-role:Master)
> PgSQL with DatiFs
> SmbManager with DatiFs
>
> Cluster Properties:
> dc-version: 1.1.8-7.el6-394e906
> cluster-infrastructure: cman
> no-quorum-policy: ignore
> stonith-enabled: true
>
>
>>
>>> Is it safe to remove files older than a month with a cron script?
>>>
>> Yes
>
> ok thanks,
> Nicola
>
>>
>>> thanks
>>> Nicola
>>>
>>> _______________________________________________
>>> Pacemaker mailing list:
>>> Pacemaker at oss.clusterlabs.org
>>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>>
>>>
>>> Project Home:
>>> http://www.clusterlabs.org
>>>
>>> Getting started:
>>> http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>>>
>>> Bugs:
>>> http://bugs.clusterlabs.org
>
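
On the cron cleanup question quoted above: a minimal sketch of such a script in Python, assuming the core files are named core.* and that "older than a month" means a modification time more than 30 days ago. The function name remove_old_cores and its parameters are hypothetical, and it defaults to a dry run so nothing is deleted until dry_run=False is passed.

```python
import os
import time


def remove_old_cores(directory, max_age_days=30, dry_run=True, now=None):
    """Select core.* files under directory older than max_age_days.

    Deletes them unless dry_run is True. Returns the list of selected paths.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds per day
    selected = []
    for dirpath, _dirnames, filenames in os.walk(directory):
        for name in filenames:
            if not name.startswith("core."):
                continue  # leave anything that is not a core file alone
            path = os.path.join(dirpath, name)
            try:
                if os.path.getmtime(path) >= cutoff:
                    continue  # recent enough to keep
                if not dry_run:
                    os.remove(path)
            except OSError:
                continue  # file vanished or is not removable; skip it
            selected.append(path)
    return selected


if __name__ == "__main__":
    # Dry run: only list what would be removed.
    for path in remove_old_cores("/var/lib/pacemaker/cores"):
        print("would remove", path)
```

Before deleting anything, it is worth keeping a copy of recent cores so they can still be reported upstream, as suggested earlier in the thread.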