[ClusterLabs] Pacemaker resource parameter reload confusion

Ferenc Wágner wferi at niif.hu
Fri Sep 22 10:23:47 EDT 2017


Hi,

I'm running a custom resource agent with several reloadable parameters
under Pacemaker 1.1.16:

$ /usr/sbin/crm_resource --show-metadata=ocf:niif:TransientDomain | fgrep unique=
<parameter name="domxml" unique="1" required="1">
<parameter name="graceful" unique="0" required="0">
<parameter name="desturi_template" unique="0" required="1">
<parameter name="migrateuri_template" unique="0" required="0">
<parameter name="migr_timeout" unique="0" required="0">
<parameter name="admins" unique="0" required="0">
<parameter name="expect_startup_signal" unique="0" required="0">
<parameter name="dummy" unique="1" required="0">
<parameter name="dummy_delay" unique="0" required="0">
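
For readers who want only the names of the reloadable parameters, the listing above can be filtered further. This is a minimal sketch, assuming (as this post does) that the unique="0" parameters are the reloadable ones, and that each <parameter> tag sits on one line with name= before unique= as in the output above. The function name reloadable_params is made up for illustration; it is not part of Pacemaker.

```shell
# Hypothetical helper: read agent metadata on stdin and print the name of
# every parameter declared with unique="0", one per line.
# Assumes one <parameter ...> tag per line, with name= ahead of unique=.
reloadable_params() {
    sed -n 's/.*<parameter name="\([^"]*\)"[^>]*unique="0".*/\1/p'
}

# Possible usage on a cluster node (not run here):
#   crm_resource --show-metadata=ocf:niif:TransientDomain | reloadable_params
```

With the metadata shown above, this should print graceful, desturi_template, migrateuri_template, migr_timeout, admins, expect_startup_signal and dummy_delay.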

I used to routinely change the unique="0" parameters without having the
corresponding resources restarted.  But now something like

$ sudo crm_resource -r vm-alder -p admins -v "kissg wferi"

restarts the resource in a somewhat strange way:

crmd[27037]:   notice: State transition S_IDLE -> S_POLICY_ENGINE
pengine[27036]:   notice: Reload  vm-alder#011(Started vhbl05)
pengine[27036]:   notice: Calculated transition 1309, saving inputs in /var/lib/pacemaker/pengine/pe-input-1033.bz2
crmd[27037]:   notice: Initiating stop operation vm-alder_stop_0 on vhbl05
crmd[27037]:   notice: Initiating reload operation vm-alder_reload_0 on vhbl05
crmd[27037]:   notice: Transition aborted by deletion of lrm_rsc_op[@id='vm-alder_last_failure_0']: Resource operation removal
crmd[27037]:   notice: Transition 1309 (Complete=10, Pending=0, Fired=0, Skipped=1, Incomplete=2, Source=/var/lib/pacemaker/pengine/pe-input-1033.bz2): Stopped
pengine[27036]:   notice: Calculated transition 1310, saving inputs in /var/lib/pacemaker/pengine/pe-input-1034.bz2
crmd[27037]:   notice: Initiating monitor operation vm-alder_monitor_60000 on vhbl05
crmd[27037]:  warning: Action 228 (vm-alder_monitor_60000) on vhbl05 failed (target: 0 vs. rc: 7): Error
crmd[27037]:   notice: Transition aborted by operation vm-alder_monitor_60000 'create' on vhbl05: Event failed
crmd[27037]:   notice: Transition 1310 (Complete=7, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-1034.bz2): Complete
pengine[27036]:  warning: Processing failed op monitor for vm-alder on vhbl05: not running (7)
pengine[27036]:   notice: Recover vm-alder#011(Started vhbl05)
pengine[27036]:   notice: Calculated transition 1311, saving inputs in /var/lib/pacemaker/pengine/pe-input-1035.bz2
pengine[27036]:  warning: Processing failed op monitor for vm-alder on vhbl05: not running (7)
pengine[27036]:   notice: Recover vm-alder#011(Started vhbl05)
pengine[27036]:   notice: Calculated transition 1312, saving inputs in /var/lib/pacemaker/pengine/pe-input-1036.bz2
crmd[27037]:   notice: Initiating stop operation vm-alder_stop_0 on vhbl05
crmd[27037]:   notice: Initiating start operation vm-alder_start_0 on vhbl05
crmd[27037]:   notice: Initiating monitor operation vm-alder_monitor_60000 on vhbl05
crmd[27037]:   notice: Transition 1312 (Complete=10, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-1036.bz2): Complete
crmd[27037]:   notice: State transition S_TRANSITION_ENGINE -> S_IDLE
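
To make the order of operations in a log like this easier to see, the "Initiating ... operation" lines can be reduced to action, operation ID and node. This is only a sketch that relies on the message wording shown above; the function name initiated_ops is hypothetical.

```shell
# Hypothetical helper: read crmd log lines on stdin and print
# "<action> <operation id> <node>" for each "Initiating ... operation" line.
# Lines that do not match the pattern are dropped.
initiated_ops() {
    sed -n 's/.*Initiating \([a-z]*\) operation \([^ ]*\) on \([^ ]*\).*/\1 \2 \3/p'
}
```

Fed the excerpt above, the first two output lines are the stop and the reload of vm-alder on vhbl05, both initiated before transition 1309 is reported as stopped.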

I've got info-level logs as well, but those are rather long, and maybe
someone can pinpoint my problem without going through them.  I remember
past discussions about "doing reload right", but I'm not sure what was
implemented in the end, and I can't find anything in the changelog
either.  So, what am I missing here?  The parallel reload and stop look
rather suspicious, though...
-- 
Thanks,
Feri



