[ClusterLabs] custom resource agent FAILED (blocked)
Bishoy Mikhael
b.s.mikhael at gmail.com
Thu Apr 12 17:38:21 EDT 2018
Hi All,
I'm trying to create a resource agent that promotes a standby HDFS namenode to
active when the virtual IP fails over to another node.
I've taken the skeleton from the Dummy OCF agent.
The modifications I've made to the Dummy agent are as follows:
HDFSHA_start() {
    HDFSHA_monitor
    if [ $? = $OCF_SUCCESS ]; then
        /opt/hadoop/sbin/hdfs-ha.sh start
        return $OCF_SUCCESS
    fi
}

HDFSHA_stop() {
    HDFSHA_monitor
    if [ $? = $OCF_SUCCESS ]; then
        /opt/hadoop/sbin/hdfs-ha.sh stop
    fi
    return $OCF_SUCCESS
}

HDFSHA_monitor() {
    # Monitor _MUST!_ differentiate correctly between running
    # (SUCCESS), failed (ERROR) or _cleanly_ stopped (NOT RUNNING).
    # That is THREE states, not just yes/no.
    active_nn=$(hdfs haadmin -getAllServiceState | grep active | cut -d":" -f 1)
    current_node=$(uname -n)
    if [[ ${active_nn} == ${current_node} ]]; then
        return $OCF_SUCCESS
    fi
}

HDFSHA_validate() {
    return $OCF_SUCCESS
}
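For comparison, here is a minimal sketch of how monitor, start and stop could be arranged so that monitor reports three distinct states. In the functions as posted, HDFSHA_monitor has no explicit return when this node is not the active namenode, and in shell an if statement whose condition is false exits with status 0, so the function appears to report success in every case. HDFSHA_start also only runs the helper script when monitor already reports success. The sketch below rests on several assumptions: that hdfs haadmin -getAllServiceState prints one "host:port state" line per namenode, that the host part matches the output of uname -n, and that the helper script exits non-zero on failure. HDFS_HA_SCRIPT is a hypothetical variable introduced only for illustration. This is not a tested agent.

```shell
#!/bin/sh
# Sketch only. A real agent would source ocf-shellfuncs for these codes.
: "${OCF_SUCCESS:=0}"
: "${OCF_ERR_GENERIC:=1}"
: "${OCF_NOT_RUNNING:=7}"
# Hypothetical variable so the helper script path can be overridden.
: "${HDFS_HA_SCRIPT:=/opt/hadoop/sbin/hdfs-ha.sh}"

HDFSHA_monitor() {
    # Assumption: a failing haadmin call means the state is unknown (ERROR).
    states=$(hdfs haadmin -getAllServiceState 2>/dev/null) || return "$OCF_ERR_GENERIC"
    # Assumed output format: "host:port   active|standby", one line per namenode.
    active_nn=$(printf '%s\n' "$states" | grep -w active | cut -d: -f1)
    if [ "$active_nn" = "$(uname -n)" ]; then
        return "$OCF_SUCCESS"        # this node holds the active namenode
    fi
    return "$OCF_NOT_RUNNING"        # explicit, instead of falling through with 0
}

HDFSHA_start() {
    # Already active here: nothing to do.
    HDFSHA_monitor && return "$OCF_SUCCESS"
    "$HDFS_HA_SCRIPT" start || return "$OCF_ERR_GENERIC"
    return "$OCF_SUCCESS"
}

HDFSHA_stop() {
    # Only call the helper when this node is currently active.
    if HDFSHA_monitor; then
        "$HDFS_HA_SCRIPT" stop || return "$OCF_ERR_GENERIC"
    fi
    return "$OCF_SUCCESS"
}
```

Whether a failed haadmin call should map to an error or to "not running" depends on how probes are expected to behave on nodes without a namenode, so that choice is open.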
I've created the resource as follows:
# pcs resource create hdfs-ha ocf:heartbeat:HDFSHA op monitor interval=30s
The resource fails right away as follows:
# pcs status
Cluster name: hdfs_cluster
Stack: corosync
Current DC: taulog (version 1.1.16-12.el7_4.8-94ff4df) - partition with quorum
Last updated: Thu Apr 12 03:30:57 2018
Last change: Thu Apr 12 03:30:54 2018 by root via cibadmin on lingcod
3 nodes configured
2 resources configured
Online: [ dentex lingcod taulog ]
Full list of resources:
VirtualIP (ocf::heartbeat:IPaddr2): Started taulog
hdfs-ha (ocf::heartbeat:HDFSHA): FAILED (blocked)[ taulog dentex ]
Failed Actions:
* hdfs-ha_stop_0 on taulog 'insufficient privileges' (4): call=12,
status=complete, exitreason='none',
last-rc-change='Thu Apr 12 03:17:37 2018', queued=0ms, exec=1ms
* hdfs-ha_stop_0 on dentex 'insufficient privileges' (4): call=10,
status=complete, exitreason='none',
last-rc-change='Thu Apr 12 03:17:43 2018', queued=0ms, exec=1ms
Daemon Status:
corosync: active/enabled
pacemaker: active/enabled
pcsd: active/enabled
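In the OCF return-code convention, 4 is OCF_ERR_PERM, and both failed stop operations above finished in exec=1ms. One thing that may be worth ruling out (an assumption, not a diagnosis) is that the agent file is missing or not executable on taulog and dentex. A minimal sketch of such a check follows. check_agent is a hypothetical helper, and the path in the usage note assumes the agent was copied to the usual location for the ocf:heartbeat provider.

```shell
#!/bin/sh
# Hypothetical helper: report whether an agent file exists and is executable.
# The return values mirror the OCF codes for "not installed" (5) and
# "insufficient privileges" (4); this is for illustration only.
check_agent() {
    agent=$1
    if [ ! -e "$agent" ]; then
        echo "missing: $agent"
        return 5
    fi
    if [ ! -x "$agent" ]; then
        echo "not executable: $agent"
        return 4
    fi
    echo "ok: $agent"
    return 0
}
```

It would be run on each of the three nodes, for example as: check_agent /usr/lib/ocf/resource.d/heartbeat/HDFSHA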
I debugged the resource as follows, and every operation returns 0:
# pcs resource debug-monitor hdfs-ha
Operation monitor for hdfs-ha (ocf:heartbeat:HDFSHA) returned 0
> stderr: DEBUG: hdfs-ha monitor : 0
# pcs resource debug-stop hdfs-ha
Operation stop for hdfs-ha (ocf:heartbeat:HDFSHA) returned 0
> stderr: DEBUG: hdfs-ha stop : 0
# pcs resource debug-start hdfs-ha
Operation start for hdfs-ha (ocf:heartbeat:HDFSHA) returned 0
> stderr: DEBUG: hdfs-ha start : 0
I don't understand what I am doing wrong!
Regards,
Bishoy Mikhael