[ClusterLabs] Pacemaker resource parameter reload confusion
Ferenc Wágner
wferi at niif.hu
Fri Sep 22 10:23:47 EDT 2017
Hi,
I'm running a custom resource agent under Pacemaker 1.1.16, which has
several reloadable parameters:
$ /usr/sbin/crm_resource --show-metadata=ocf:niif:TransientDomain | fgrep unique=
<parameter name="domxml" unique="1" required="1">
<parameter name="graceful" unique="0" required="0">
<parameter name="desturi_template" unique="0" required="1">
<parameter name="migrateuri_template" unique="0" required="0">
<parameter name="migr_timeout" unique="0" required="0">
<parameter name="admins" unique="0" required="0">
<parameter name="expect_startup_signal" unique="0" required="0">
<parameter name="dummy" unique="1" required="0">
<parameter name="dummy_delay" unique="0" required="0">
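To make the list above easier to scan, here is a minimal sketch that picks out the names of the unique="0" parameters, i.e. the ones I expect to be reloadable. The helper name list_reloadable is hypothetical, and the sed pattern assumes the attribute order shown above (name first, then unique); the sample input is simply the output quoted above.

```shell
# Sample input: the metadata lines quoted above.
metadata='<parameter name="domxml" unique="1" required="1">
<parameter name="graceful" unique="0" required="0">
<parameter name="desturi_template" unique="0" required="1">
<parameter name="migrateuri_template" unique="0" required="0">
<parameter name="migr_timeout" unique="0" required="0">
<parameter name="admins" unique="0" required="0">
<parameter name="expect_startup_signal" unique="0" required="0">
<parameter name="dummy" unique="1" required="0">
<parameter name="dummy_delay" unique="0" required="0">'

# list_reloadable (hypothetical helper): read metadata on stdin and print
# the name of every parameter declared with unique="0", one per line.
# Assumes the name attribute immediately precedes the unique attribute.
list_reloadable() {
    sed -n 's/.*<parameter name="\([^"]*\)" unique="0".*/\1/p'
}

printf '%s\n' "$metadata" | list_reloadable
```

On the live system the same helper could presumably be fed directly from the crm_resource --show-metadata command shown above instead of the sample variable.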
I used to routinely change the unique="0" parameters without having the
corresponding resources restarted. But now something like
$ sudo crm_resource -r vm-alder -p admins -v "kissg wferi"
restarts the resource in a somewhat strange way:
crmd[27037]: notice: State transition S_IDLE -> S_POLICY_ENGINE
pengine[27036]: notice: Reload vm-alder#011(Started vhbl05)
pengine[27036]: notice: Calculated transition 1309, saving inputs in /var/lib/pacemaker/pengine/pe-input-1033.bz2
crmd[27037]: notice: Initiating stop operation vm-alder_stop_0 on vhbl05
crmd[27037]: notice: Initiating reload operation vm-alder_reload_0 on vhbl05
crmd[27037]: notice: Transition aborted by deletion of lrm_rsc_op[@id='vm-alder_last_failure_0']: Resource operation removal
crmd[27037]: notice: Transition 1309 (Complete=10, Pending=0, Fired=0, Skipped=1, Incomplete=2, Source=/var/lib/pacemaker/pengine/pe-input-1033.bz2): Stopped
pengine[27036]: notice: Calculated transition 1310, saving inputs in /var/lib/pacemaker/pengine/pe-input-1034.bz2
crmd[27037]: notice: Initiating monitor operation vm-alder_monitor_60000 on vhbl05
crmd[27037]: warning: Action 228 (vm-alder_monitor_60000) on vhbl05 failed (target: 0 vs. rc: 7): Error
crmd[27037]: notice: Transition aborted by operation vm-alder_monitor_60000 'create' on vhbl05: Event failed
crmd[27037]: notice: Transition 1310 (Complete=7, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-1034.bz2): Complete
pengine[27036]: warning: Processing failed op monitor for vm-alder on vhbl05: not running (7)
pengine[27036]: notice: Recover vm-alder#011(Started vhbl05)
pengine[27036]: notice: Calculated transition 1311, saving inputs in /var/lib/pacemaker/pengine/pe-input-1035.bz2
pengine[27036]: warning: Processing failed op monitor for vm-alder on vhbl05: not running (7)
pengine[27036]: notice: Recover vm-alder#011(Started vhbl05)
pengine[27036]: notice: Calculated transition 1312, saving inputs in /var/lib/pacemaker/pengine/pe-input-1036.bz2
crmd[27037]: notice: Initiating stop operation vm-alder_stop_0 on vhbl05
crmd[27037]: notice: Initiating start operation vm-alder_start_0 on vhbl05
crmd[27037]: notice: Initiating monitor operation vm-alder_monitor_60000 on vhbl05
crmd[27037]: notice: Transition 1312 (Complete=10, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-1036.bz2): Complete
crmd[27037]: notice: State transition S_TRANSITION_ENGINE -> S_IDLE
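To condense the excerpt above, here is a rough sketch that groups the operations crmd initiated under the transition that reports them. The function name summarize_transitions is hypothetical, the awk field positions assume exactly the log format shown above, and the sample input contains only the relevant crmd lines copied from the excerpt.

```shell
# Sample input: the "Initiating" and "Transition N (...)" lines from above.
log='crmd[27037]: notice: Initiating stop operation vm-alder_stop_0 on vhbl05
crmd[27037]: notice: Initiating reload operation vm-alder_reload_0 on vhbl05
crmd[27037]: notice: Transition 1309 (Complete=10, Pending=0, Fired=0, Skipped=1, Incomplete=2, Source=/var/lib/pacemaker/pengine/pe-input-1033.bz2): Stopped
crmd[27037]: notice: Initiating monitor operation vm-alder_monitor_60000 on vhbl05
crmd[27037]: notice: Transition 1310 (Complete=7, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-1034.bz2): Complete
crmd[27037]: notice: Initiating stop operation vm-alder_stop_0 on vhbl05
crmd[27037]: notice: Initiating start operation vm-alder_start_0 on vhbl05
crmd[27037]: notice: Initiating monitor operation vm-alder_monitor_60000 on vhbl05
crmd[27037]: notice: Transition 1312 (Complete=10, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-1036.bz2): Complete'

# summarize_transitions (hypothetical helper): collect the operation type
# (4th field) of each "Initiating ... operation" line, and print the
# collected list when the matching "Transition N (...)" summary line appears.
summarize_transitions() {
    awk '
        / Initiating [a-z]+ operation / { ops = ops (ops ? " " : "") $4 }
        / Transition [0-9]+ \(/         { print $4 ": " ops; ops = "" }
    '
}

printf '%s\n' "$log" | summarize_transitions
```

For the lines above this prints "1309: stop reload", "1310: monitor" and "1312: stop start monitor", which makes the stop and reload issued within the same transition 1309 easy to spot.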
I've got info level logs as well, but those are rather long and maybe
someone can pinpoint my problem without going through those. I remember
past discussions about "doing reload right", but I'm not sure what was
implemented in the end, and I can't find anything in the changelog
either. So, what am I missing here? The parallel reload and stop look rather
suspicious, though...
--
Thanks,
Feri