[Pacemaker] hangs pending
Andrew Beekhof
andrew at beekhof.net
Thu Feb 20 09:50:31 UTC 2014
On 20 Feb 2014, at 5:33 pm, Andrey Groshev <greenx at yandex.ru> wrote:
>
>
> 20.02.2014, 01:22, "Andrew Beekhof" <andrew at beekhof.net>:
>> On 20 Feb 2014, at 4:18 am, Andrey Groshev <greenx at yandex.ru> wrote:
>>
>>> 19.02.2014, 06:47, "Andrew Beekhof" <andrew at beekhof.net>:
>>>> On 18 Feb 2014, at 9:29 pm, Andrey Groshev <greenx at yandex.ru> wrote:
>>>>> Hi, ALL and Andrew!
>>>>>
>>>>> Today was a good day - I killed a lot, and a lot was shot back at me.
>>>>> In general - I am happy (almost like an elephant) :)
>>>>> Apart from the resources, eight processes on the node matter to me: corosync, pacemakerd, cib, stonithd, lrmd, attrd, pengine, crmd.
>>>>> I killed them with different signals (4, 6, 11 and even 9).
>>>>> The behavior does not depend on the signal number - that's good.
>>>>> If STONITH sends a reboot to the node, it reboots and rejoins the cluster - that's also good.
>>>>> But the behavior differs depending on which daemon is killed.
>>>>>
>>>>> They fall into four groups:
>>>>> 1. corosync, cib - STONITH works 100%.
>>>>> Killed with any signal - STONITH is called and the node reboots.
>>>> excellent
>>>>> 3. stonithd, attrd, pengine - no STONITH needed.
>>>>> These daemons simply restart; resources stay running.
>>>> right
>>>>> 2. lrmd, crmd - strange STONITH behavior.
>>>>> Sometimes STONITH is called - with the corresponding reaction.
>>>>> Sometimes the daemon just restarts
>>>> The daemon will always try to restart; the only variable is how long it takes the peer to notice and initiate fencing.
>>>> If the failure happens just before they're due to receive the totem token, the failure will be detected very quickly and the node fenced.
>>>> If the failure happens just after, then detection will take longer - giving the node longer to recover and avoid being fenced.
>>>>
>>>> So fence/not fence is normal and to be expected.
>>>>> and the MS:pgsql resource restarts with a large delay.
>>>>> One time, after a crmd restart, pgsql did not restart.
>>>> I would not expect pgsql to ever restart - if the RA does its job properly, anyway.
>>>> In the case where the node is not fenced, the crmd will respawn and the PE will request that it re-detect the state of all resources.
>>>>
>>>> If the agent reports "all good", then there is nothing more to do.
>>>> If the agent is not reporting "all good", you should really be asking why.
>>>>> 4. pacemakerd - nothing happens.
>>>> On non-systemd based machines, correct.
>>>>
>>>> On a systemd based machine pacemakerd is respawned and reattaches to the existing daemons.
>>>> Any subsequent daemon failure will be detected and the daemon respawned.
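
For reference, on a systemd distro that respawn behaviour comes from the packaged unit file; you can confirm it with something like this (the exact path and directive may vary by distro and version):

  grep Restart /usr/lib/systemd/system/pacemaker.service
  # expect a line such as: Restart=on-failure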
>>> Ah! I almost forgot about THAT!
>>> Do other (NORMAL) variants, methods, ideas exist?
>>> Without this ... @$%#$%&$%^&$%^&##@#$$^$%& !!!!!
>>> Otherwise it's a complete epic fail ;)
>>
>> -ENOPARSE
>
> OK, I'll set aside my personal attitude toward "systemd".
> Let me explain.
>
> Somewhere near the beginning of this thread, I wrote:
> A.G.: Who knows who runs lrmd?
> A.B.: Pacemakerd.
> That's one!
>
> Let's see the list of processes:
> #ps -axf
> .....
> 6067 ? Ssl 7:24 corosync
> 6092 ? S 0:25 pacemakerd
> 6094 ? Ss 116:13 \_ /usr/libexec/pacemaker/cib
> 6095 ? Ss 0:25 \_ /usr/libexec/pacemaker/stonithd
> 6096 ? Ss 1:27 \_ /usr/libexec/pacemaker/lrmd
> 6097 ? Ss 0:49 \_ /usr/libexec/pacemaker/attrd
> 6098 ? Ss 0:25 \_ /usr/libexec/pacemaker/pengine
> 6099 ? Ss 0:29 \_ /usr/libexec/pacemaker/crmd
> .....
> That's two!
What's two? I don't follow.
> And more, more...
> Now you must understand why I want this process to always be running.
> I don't think anyone here even needs that explained!
>
> And now you say "pacemakerd works nicely, but only on systemd distros" !!!
No, I'm saying it works _better_ on systemd distros.
On non-systemd distros it takes quite a few unlikely failures to get into trouble, and even then the node still gets fenced and recovered (assuming no-one saw any of the error messages and ran "service pacemaker restart" before the additional failures occurred).
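
If you want to see the difference for yourself, something along these lines should show it (just a sketch - the exact respawn timing and unit-file settings vary):

  # on the node under test, kill only the master process
  pkill -9 pacemakerd
  sleep 10
  # on a systemd distro pacemakerd should have been respawned and
  # reattached to the still-running children; on a non-systemd distro
  # nothing will restart it until you do so by hand
  pgrep -l pacemakerd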
> What should I do now?
> * Integrate systemd in CentOS?
> * Migrate to Fedora?
> * Buy RHEL7 !?
Option 3 is particularly good :)
> Each variant is great, but none of them fits me.
>
> P.S. And I'm not even talking about the distros which haven't migrated to systemd (and never will).
Are there any? Even Debian and Ubuntu have raised the white flag.
> Don't be offended! We do the same.
> We build a secret military factory,
> put a large concrete fence around it,
> top the wall with barbed wire, but forget to install the gates. :)
>
>
>>>>> And then I can kill any process of the third group. They do not restart.
>>>> Until they become needed.
>>>> E.g. if the DC goes to invoke the policy engine, that will fail, causing the crmd to fail and the node to be fenced.
>>>>> Generally: don't touch corosync, cib, and maybe lrmd, crmd.
>>>>>
>>>>> What do you think about this?
>>>>> The main question of this topic we have settled.
>>>>> But this varied behavior is another big problem.
>>>>>
>>>>> 17.02.2014, 08:52, "Andrey Groshev" <greenx at yandex.ru>:
>>>>>> 17.02.2014, 02:27, "Andrew Beekhof" <andrew at beekhof.net>:
>>>>>>> With no quick follow-up, dare one hope that means the patch worked? :-)
>>>>>> Hi,
>>>>>> No, unfortunately my boss changed my plans on Friday and I spent the whole day on a parallel project.
>>>>>> I hope to have time today to carry out the necessary tests.
>>>>>>> On 14 Feb 2014, at 3:37 pm, Andrey Groshev <greenx at yandex.ru> wrote:
>>>>>>>> Yes, of course. Starting to build everything and test now )
>>>>>>>>
>>>>>>>> 14.02.2014, 04:41, "Andrew Beekhof" <andrew at beekhof.net>:
>>>>>>>>> The previous patch wasn't quite right.
>>>>>>>>> Could you try this new one?
>>>>>>>>>
>>>>>>>>> http://paste.fedoraproject.org/77123/13923376/
>>>>>>>>>
>>>>>>>>> [11:23 AM] beekhof at f19 ~/Development/sources/pacemaker/devel ☺ # git diff
>>>>>>>>> diff --git a/crmd/callbacks.c b/crmd/callbacks.c
>>>>>>>>> index ac4b905..d49525b 100644
>>>>>>>>> --- a/crmd/callbacks.c
>>>>>>>>> +++ b/crmd/callbacks.c
>>>>>>>>> @@ -199,8 +199,7 @@ peer_update_callback(enum crm_status_type type, crm_node_t * node, const void *d
>>>>>>>>> stop_te_timer(down->timer);
>>>>>>>>>
>>>>>>>>> flags |= node_update_join | node_update_expected;
>>>>>>>>> - crm_update_peer_join(__FUNCTION__, node, crm_join_none);
>>>>>>>>> - crm_update_peer_expected(__FUNCTION__, node, CRMD_JOINSTATE_DOWN);
>>>>>>>>> + crmd_peer_down(node, FALSE);
>>>>>>>>> check_join_state(fsa_state, __FUNCTION__);
>>>>>>>>>
>>>>>>>>> update_graph(transition_graph, down);
>>>>>>>>> diff --git a/crmd/crmd_utils.h b/crmd/crmd_utils.h
>>>>>>>>> index bc472c2..1a2577a 100644
>>>>>>>>> --- a/crmd/crmd_utils.h
>>>>>>>>> +++ b/crmd/crmd_utils.h
>>>>>>>>> @@ -100,6 +100,7 @@ void crmd_join_phase_log(int level);
>>>>>>>>> const char *get_timer_desc(fsa_timer_t * timer);
>>>>>>>>> gboolean too_many_st_failures(void);
>>>>>>>>> void st_fail_count_reset(const char * target);
>>>>>>>>> +void crmd_peer_down(crm_node_t *peer, bool full);
>>>>>>>>>
>>>>>>>>> # define fsa_register_cib_callback(id, flag, data, fn) do { \
>>>>>>>>> fsa_cib_conn->cmds->register_callback( \
>>>>>>>>> diff --git a/crmd/te_actions.c b/crmd/te_actions.c
>>>>>>>>> index f31d4ec..3bfce59 100644
>>>>>>>>> --- a/crmd/te_actions.c
>>>>>>>>> +++ b/crmd/te_actions.c
>>>>>>>>> @@ -80,11 +80,8 @@ send_stonith_update(crm_action_t * action, const char *target, const char *uuid)
>>>>>>>>> crm_info("Recording uuid '%s' for node '%s'", uuid, target);
>>>>>>>>> peer->uuid = strdup(uuid);
>>>>>>>>> }
>>>>>>>>> - crm_update_peer_proc(__FUNCTION__, peer, crm_proc_none, NULL);
>>>>>>>>> - crm_update_peer_state(__FUNCTION__, peer, CRM_NODE_LOST, 0);
>>>>>>>>> - crm_update_peer_expected(__FUNCTION__, peer, CRMD_JOINSTATE_DOWN);
>>>>>>>>> - crm_update_peer_join(__FUNCTION__, peer, crm_join_none);
>>>>>>>>>
>>>>>>>>> + crmd_peer_down(peer, TRUE);
>>>>>>>>> node_state =
>>>>>>>>> do_update_node_cib(peer,
>>>>>>>>> node_update_cluster | node_update_peer | node_update_join |
>>>>>>>>> diff --git a/crmd/te_utils.c b/crmd/te_utils.c
>>>>>>>>> index ad7e573..0c92e95 100644
>>>>>>>>> --- a/crmd/te_utils.c
>>>>>>>>> +++ b/crmd/te_utils.c
>>>>>>>>> @@ -247,10 +247,7 @@ tengine_stonith_notify(stonith_t * st, stonith_event_t * st_event)
>>>>>>>>>
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> - crm_update_peer_proc(__FUNCTION__, peer, crm_proc_none, NULL);
>>>>>>>>> - crm_update_peer_state(__FUNCTION__, peer, CRM_NODE_LOST, 0);
>>>>>>>>> - crm_update_peer_expected(__FUNCTION__, peer, CRMD_JOINSTATE_DOWN);
>>>>>>>>> - crm_update_peer_join(__FUNCTION__, peer, crm_join_none);
>>>>>>>>> + crmd_peer_down(peer, TRUE);
>>>>>>>>> }
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> diff --git a/crmd/utils.c b/crmd/utils.c
>>>>>>>>> index 3988cfe..2df53ab 100644
>>>>>>>>> --- a/crmd/utils.c
>>>>>>>>> +++ b/crmd/utils.c
>>>>>>>>> @@ -1077,3 +1077,13 @@ update_attrd_remote_node_removed(const char *host, const char *user_name)
>>>>>>>>> crm_trace("telling attrd to clear attributes for remote host %s", host);
>>>>>>>>> update_attrd_helper(host, NULL, NULL, user_name, TRUE, 'C');
>>>>>>>>> }
>>>>>>>>> +
>>>>>>>>> +void crmd_peer_down(crm_node_t *peer, bool full)
>>>>>>>>> +{
>>>>>>>>> + if(full && peer->state == NULL) {
>>>>>>>>> + crm_update_peer_state(__FUNCTION__, peer, CRM_NODE_LOST, 0);
>>>>>>>>> + crm_update_peer_proc(__FUNCTION__, peer, crm_proc_none, NULL);
>>>>>>>>> + }
>>>>>>>>> + crm_update_peer_join(__FUNCTION__, peer, crm_join_none);
>>>>>>>>> + crm_update_peer_expected(__FUNCTION__, peer, CRMD_JOINSTATE_DOWN);
>>>>>>>>> +}
>>>>>>>>>
>>>>>>>>> On 16 Jan 2014, at 7:24 pm, Andrey Groshev <greenx at yandex.ru> wrote:
>>>>>>>>>> 16.01.2014, 01:30, "Andrew Beekhof" <andrew at beekhof.net>:
>>>>>>>>>>> On 16 Jan 2014, at 12:41 am, Andrey Groshev <greenx at yandex.ru> wrote:
>>>>>>>>>>>> 15.01.2014, 02:53, "Andrew Beekhof" <andrew at beekhof.net>:
>>>>>>>>>>>>> On 15 Jan 2014, at 12:15 am, Andrey Groshev <greenx at yandex.ru> wrote:
>>>>>>>>>>>>>> 14.01.2014, 10:00, "Andrey Groshev" <greenx at yandex.ru>:
>>>>>>>>>>>>>>> 14.01.2014, 07:47, "Andrew Beekhof" <andrew at beekhof.net>:
>>>>>>>>>>>>>>>> Ok, here's what happens:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 1. node2 is lost
>>>>>>>>>>>>>>>> 2. fencing of node2 starts
>>>>>>>>>>>>>>>> 3. node2 reboots (and cluster starts)
>>>>>>>>>>>>>>>> 4. node2 returns to the membership
>>>>>>>>>>>>>>>> 5. node2 is marked as a cluster member
>>>>>>>>>>>>>>>> 6. DC tries to bring it into the cluster, but needs to cancel the active transition first.
>>>>>>>>>>>>>>>> Which is a problem since the node2 fencing operation is part of that
>>>>>>>>>>>>>>>> 7. node2 is in a transition (pending) state until fencing passes or fails
>>>>>>>>>>>>>>>> 8a. fencing fails: transition completes and the node joins the cluster
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> That's in theory, except we automatically try again. Which isn't appropriate.
>>>>>>>>>>>>>>>> This should be relatively easy to fix.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 8b. fencing passes: the node is incorrectly marked as offline
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> This I have no idea how to fix yet.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On another note, it doesn't look like this agent works at all.
>>>>>>>>>>>>>>>> The node has been back online for a long time and the agent is still timing out after 10 minutes.
>>>>>>>>>>>>>>>> So "Once the script makes sure that the victim will rebooted and again available via ssh - it exit with 0." does not seem true.
>>>>>>>>>>>>>>> Damn. Looks like you're right. At some point I broke my agent and hadn't noticed. Go figure.
>>>>>>>>>>>>>> I repaired my agent - after sending the reboot it waits on STDIN.
>>>>>>>>>>>>>> The "normal" behavior returned - it hangs in "pending" until I manually send a reboot. :)
>>>>>>>>>>>>> Right. Now you're in case 8b.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Can you try this patch: http://paste.fedoraproject.org/68450/38973966
>>>>>>>>>>>> Spent the whole day on experiments.
>>>>>>>>>>>> It turns out like this:
>>>>>>>>>>>> 1. Built the cluster.
>>>>>>>>>>>> 2. On node-2, sent a signal (-4) - killed corosync.
>>>>>>>>>>>> 3. From node-1 (the DC) - stonith sent a reboot.
>>>>>>>>>>>> 4. The node rebooted and resources started.
>>>>>>>>>>>> 5. Again: on node-2, sent a signal (-4) - killed corosync.
>>>>>>>>>>>> 6. Again: from node-1 (the DC) - stonith sent a reboot.
>>>>>>>>>>>> 7. Node-2 rebooted and hangs in "pending".
>>>>>>>>>>>> 8. Waited, waited..... manually rebooted.
>>>>>>>>>>>> 9. Node-2 rebooted and resources started.
>>>>>>>>>>>> 10. GOTO step 2.
>>>>>>>>>>> Logs?
>>>>>>>>>> Yesterday I wrote a separate letter explaining why I hadn't posted the logs.
>>>>>>>>>> Please read it; it contains a few more questions.
>>>>>>>>>> Today it again started hanging and going through the same cycle.
>>>>>>>>>> Logs here: http://send2me.ru/crmrep2.tar.bz2
>>>>>>>>>>>>>> New logs: http://send2me.ru/crmrep1.tar.bz2
>>>>>>>>>>>>>>>> On 14 Jan 2014, at 1:19 pm, Andrew Beekhof <andrew at beekhof.net> wrote:
>>>>>>>>>>>>>>>>> Apart from anything else, your timeout needs to be bigger:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Jan 13 12:21:36 [17223] dev-cluster2-node1.unix.tensor.ru stonith-ng: ( commands.c:1321 ) error: log_operation: Operation 'reboot' [11331] (call 2 from crmd.17227) for host 'dev-cluster2-node2.unix.tensor.ru' with device 'st1' returned: -62 (Timer expired)
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On 14 Jan 2014, at 7:18 am, Andrew Beekhof <andrew at beekhof.net> wrote:
>>>>>>>>>>>>>>>>>> On 13 Jan 2014, at 8:31 pm, Andrey Groshev <greenx at yandex.ru> wrote:
>>>>>>>>>>>>>>>>>>> 13.01.2014, 02:51, "Andrew Beekhof" <andrew at beekhof.net>:
>>>>>>>>>>>>>>>>>>>> On 10 Jan 2014, at 9:55 pm, Andrey Groshev <greenx at yandex.ru> wrote:
>>>>>>>>>>>>>>>>>>>>> 10.01.2014, 14:31, "Andrey Groshev" <greenx at yandex.ru>:
>>>>>>>>>>>>>>>>>>>>>> 10.01.2014, 14:01, "Andrew Beekhof" <andrew at beekhof.net>:
>>>>>>>>>>>>>>>>>>>>>>> On 10 Jan 2014, at 5:03 pm, Andrey Groshev <greenx at yandex.ru> wrote:
>>>>>>>>>>>>>>>>>>>>>>>> 10.01.2014, 05:29, "Andrew Beekhof" <andrew at beekhof.net>:
>>>>>>>>>>>>>>>>>>>>>>>>> On 9 Jan 2014, at 11:11 pm, Andrey Groshev <greenx at yandex.ru> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>> 08.01.2014, 06:22, "Andrew Beekhof" <andrew at beekhof.net>:
>>>>>>>>>>>>>>>>>>>>>>>>>>> On 29 Nov 2013, at 7:17 pm, Andrey Groshev <greenx at yandex.ru> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi, ALL.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> I'm still trying to deal with the fact that, after fencing, the node hangs in "pending".
>>>>>>>>>>>>>>>>>>>>>>>>>>> Please define "pending". Where did you see this?
>>>>>>>>>>>>>>>>>>>>>>>>>> In crm_mon:
>>>>>>>>>>>>>>>>>>>>>>>>>> ......
>>>>>>>>>>>>>>>>>>>>>>>>>> Node dev-cluster2-node2 (172793105): pending
>>>>>>>>>>>>>>>>>>>>>>>>>> ......
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> The experiment was like this:
>>>>>>>>>>>>>>>>>>>>>>>>>> Four nodes in the cluster.
>>>>>>>>>>>>>>>>>>>>>>>>>> On one of them, kill corosync or pacemakerd (signal 4, 6 or 11).
>>>>>>>>>>>>>>>>>>>>>>>>>> After that, the remaining nodes constantly reboot it, under various pretexts - "softly whistling", "flying low", "not a cluster member!" ...
>>>>>>>>>>>>>>>>>>>>>>>>>> Then "Too many failures ...." fell out in the log.
>>>>>>>>>>>>>>>>>>>>>>>>>> All this time the status in crm_mon was "pending".
>>>>>>>>>>>>>>>>>>>>>>>>>> Depending on the wind direction, it changed to "UNCLEAN".
>>>>>>>>>>>>>>>>>>>>>>>>>> Much time has passed and I cannot accurately describe the behavior...
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> Now I am in the following state:
>>>>>>>>>>>>>>>>>>>>>>>>>> I tried to locate the problem and came up with this.
>>>>>>>>>>>>>>>>>>>>>>>>>> I set a big value for the property stonith-timeout="600s".
>>>>>>>>>>>>>>>>>>>>>>>>>> And got the following behavior:
>>>>>>>>>>>>>>>>>>>>>>>>>> 1. pkill -4 corosync
>>>>>>>>>>>>>>>>>>>>>>>>>> 2. The node with the DC calls my fence agent "sshbykey".
>>>>>>>>>>>>>>>>>>>>>>>>>> 3. It sends a reboot to the victim and waits until it comes back to life.
>>>>>>>>>>>>>>>>>>>>>>>>> Hmmm.... what version of pacemaker?
>>>>>>>>>>>>>>>>>>>>>>>>> This sounds like a timing issue that we fixed a while back
>>>>>>>>>>>>>>>>>>>>>>>> It was version 1.1.11 from December 3.
>>>>>>>>>>>>>>>>>>>>>>>> I'll now do a full update and retest.
>>>>>>>>>>>>>>>>>>>>>>> That should be recent enough. Can you create a crm_report the next time you reproduce?
>>>>>>>>>>>>>>>>>>>>>> Of course. A little delay.... :)
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> ......
>>>>>>>>>>>>>>>>>>>>>> cc1: warnings being treated as errors
>>>>>>>>>>>>>>>>>>>>>> upstart.c: In function ‘upstart_job_property’:
>>>>>>>>>>>>>>>>>>>>>> upstart.c:264: error: implicit declaration of function ‘g_variant_lookup_value’
>>>>>>>>>>>>>>>>>>>>>> upstart.c:264: error: nested extern declaration of ‘g_variant_lookup_value’
>>>>>>>>>>>>>>>>>>>>>> upstart.c:264: error: assignment makes pointer from integer without a cast
>>>>>>>>>>>>>>>>>>>>>> gmake[2]: *** [libcrmservice_la-upstart.lo] Error 1
>>>>>>>>>>>>>>>>>>>>>> gmake[2]: Leaving directory `/root/ha/pacemaker/lib/services'
>>>>>>>>>>>>>>>>>>>>>> make[1]: *** [all-recursive] Error 1
>>>>>>>>>>>>>>>>>>>>>> make[1]: Leaving directory `/root/ha/pacemaker/lib'
>>>>>>>>>>>>>>>>>>>>>> make: *** [core] Error 1
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> I'm trying to solve this problem.
>>>>>>>>>>>>>>>>>>>>> It won't be solved quickly...
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> https://developer.gnome.org/glib/2.28/glib-GVariant.html#g-variant-lookup-value
>>>>>>>>>>>>>>>>>>>>> g_variant_lookup_value () Since 2.28
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> # yum list installed glib2
>>>>>>>>>>>>>>>>>>>>> Loaded plugins: fastestmirror, rhnplugin, security
>>>>>>>>>>>>>>>>>>>>> This system is receiving updates from RHN Classic or Red Hat Satellite.
>>>>>>>>>>>>>>>>>>>>> Loading mirror speeds from cached hostfile
>>>>>>>>>>>>>>>>>>>>> Installed Packages
>>>>>>>>>>>>>>>>>>>>> glib2.x86_64 2.26.1-3.el6 installed
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> # cat /etc/issue
>>>>>>>>>>>>>>>>>>>>> CentOS release 6.5 (Final)
>>>>>>>>>>>>>>>>>>>>> Kernel \r on an \m
>>>>>>>>>>>>>>>>>>>> Can you try this patch?
>>>>>>>>>>>>>>>>>>>> Upstart jobs won't work, but the code will compile
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> diff --git a/lib/services/upstart.c b/lib/services/upstart.c
>>>>>>>>>>>>>>>>>>>> index 831e7cf..195c3a4 100644
>>>>>>>>>>>>>>>>>>>> --- a/lib/services/upstart.c
>>>>>>>>>>>>>>>>>>>> +++ b/lib/services/upstart.c
>>>>>>>>>>>>>>>>>>>> @@ -231,12 +231,21 @@ upstart_job_exists(const char *name)
>>>>>>>>>>>>>>>>>>>> static char *
>>>>>>>>>>>>>>>>>>>> upstart_job_property(const char *obj, const gchar * iface, const char *name)
>>>>>>>>>>>>>>>>>>>> {
>>>>>>>>>>>>>>>>>>>> + char *output = NULL;
>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>> +#if !GLIB_CHECK_VERSION(2,28,0)
>>>>>>>>>>>>>>>>>>>> + static bool err = TRUE;
>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>> + if(err) {
>>>>>>>>>>>>>>>>>>>> + crm_err("This version of glib is too old to support upstart jobs");
>>>>>>>>>>>>>>>>>>>> + err = FALSE;
>>>>>>>>>>>>>>>>>>>> + }
>>>>>>>>>>>>>>>>>>>> +#else
>>>>>>>>>>>>>>>>>>>> GError *error = NULL;
>>>>>>>>>>>>>>>>>>>> GDBusProxy *proxy;
>>>>>>>>>>>>>>>>>>>> GVariant *asv = NULL;
>>>>>>>>>>>>>>>>>>>> GVariant *value = NULL;
>>>>>>>>>>>>>>>>>>>> GVariant *_ret = NULL;
>>>>>>>>>>>>>>>>>>>> - char *output = NULL;
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> crm_info("Calling GetAll on %s", obj);
>>>>>>>>>>>>>>>>>>>> proxy = get_proxy(obj, BUS_PROPERTY_IFACE);
>>>>>>>>>>>>>>>>>>>> @@ -272,6 +281,7 @@ upstart_job_property(const char *obj, const gchar * iface, const char *name)
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> g_object_unref(proxy);
>>>>>>>>>>>>>>>>>>>> g_variant_unref(_ret);
>>>>>>>>>>>>>>>>>>>> +#endif
>>>>>>>>>>>>>>>>>>>> return output;
>>>>>>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>>>>>> OK :) I patched the source.
>>>>>>>>>>>>>>>>>>> Typed "make rc" - the same error.
>>>>>>>>>>>>>>>>>> Because it's not building your local changes
>>>>>>>>>>>>>>>>>>> Made a new copy via "fetch" - the same error.
>>>>>>>>>>>>>>>>>>> It seems that if ClusterLabs-pacemaker-Pacemaker-1.1.11-rc3.tar.gz does not exist, it gets downloaded.
>>>>>>>>>>>>>>>>>>> Otherwise the existing archive is used.
>>>>>>>>>>>>>>>>>>> Trimmed log .......
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> # make rc
>>>>>>>>>>>>>>>>>>> make TAG=Pacemaker-1.1.11-rc3 rpm
>>>>>>>>>>>>>>>>>>> make[1]: Entering directory `/root/ha/pacemaker'
>>>>>>>>>>>>>>>>>>> rm -f pacemaker-dirty.tar.* pacemaker-tip.tar.* pacemaker-HEAD.tar.*
>>>>>>>>>>>>>>>>>>> if [ ! -f ClusterLabs-pacemaker-Pacemaker-1.1.11-rc3.tar.gz ]; then \
>>>>>>>>>>>>>>>>>>> rm -f pacemaker.tar.*; \
>>>>>>>>>>>>>>>>>>> if [ Pacemaker-1.1.11-rc3 = dirty ]; then \
>>>>>>>>>>>>>>>>>>> git commit -m "DO-NOT-PUSH" -a; \
>>>>>>>>>>>>>>>>>>> git archive --prefix=ClusterLabs-pacemaker-Pacemaker-1.1.11-rc3/ HEAD | gzip > ClusterLabs-pacemaker-Pacemaker-1.1.11-rc3.tar.gz; \
>>>>>>>>>>>>>>>>>>> git reset --mixed HEAD^; \
>>>>>>>>>>>>>>>>>>> else \
>>>>>>>>>>>>>>>>>>> git archive --prefix=ClusterLabs-pacemaker-Pacemaker-1.1.11-rc3/ Pacemaker-1.1.11-rc3 | gzip > ClusterLabs-pacemaker-Pacemaker-1.1.11-rc3.tar.gz; \
>>>>>>>>>>>>>>>>>>> fi; \
>>>>>>>>>>>>>>>>>>> echo `date`: Rebuilt ClusterLabs-pacemaker-Pacemaker-1.1.11-rc3.tar.gz; \
>>>>>>>>>>>>>>>>>>> else \
>>>>>>>>>>>>>>>>>>> echo `date`: Using existing tarball: ClusterLabs-pacemaker-Pacemaker-1.1.11-rc3.tar.gz; \
>>>>>>>>>>>>>>>>>>> fi
>>>>>>>>>>>>>>>>>>> Mon Jan 13 13:23:21 MSK 2014: Using existing tarball: ClusterLabs-pacemaker-Pacemaker-1.1.11-rc3.tar.gz
>>>>>>>>>>>>>>>>>>> .......
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Well, "make rpm" built the rpms and I created the cluster.
>>>>>>>>>>>>>>>>>>> I ran the same tests and confirmed the behavior.
>>>>>>>>>>>>>>>>>>> crm_report log here - http://send2me.ru/crmrep.tar.bz2
>>>>>>>>>>>>>>>>>> Thanks!