[Pacemaker] RFC: Any interesting in 2.0.0 betas?

Wed Nov 7 20:57:40 EST 2012

On Mon, Nov 5, 2012 at 7:26 PM, Vladislav Bogdanov <bubble at hoster-ok.com> wrote:
> 05.11.2012 09:28, Andrew Beekhof wrote:
> ...
>>> But you can guess it, as admins usually name nodes the same way. If not
>>> - that is the admins' problem.
>>
>> No, it's the problem of developers that get yelled at by admins :)
>
> :)
>
>>
>>>
>>>>
>>>>> Something tells me this would provide better backwards
>>>>> compatibility, while the visible result for the discussed use-case will be
>>>>> exactly the same. I know of at least one cluster (not mine) which will
>>>>> break if we just strip everything at the first dot - it uses long
>>>>> hostnames (and this is the default for a freshly installed redhat/fedora
>>>>> if you enter an FQDN in the anaconda prompt when installing a node).
>>>>
>>>> How do they configure corosync.conf / cluster.conf though?
>>>
>>> That is for corosync1. But when/if they decide to migrate to corosync2 -
>>
>> Rephrase?
>>
>
> I mean: imagine that somebody has a working cluster based on
> corosync1 and wants to migrate to corosync2 (with a quick 5-minute restart).
> corosync.conf either has a memberlist with node IP addresses in the interface
> clause with udpu or just uses mcast without an explicit node list. Cluster
> nodes have unames in FQDN format. The CIB has a number of location constraints
> which refer to the uname. The admin changes the necessary minimum in
> corosync.conf (adds votequorum and probably copy-pastes the memberlist to the
> nodelist, leaving IP addresses there). I would say that is the natural way to
> do such a migration.
>
> If you just blindly strip everything after the first dot, then that
> setup will be severely broken. Location constraints will not work and
> the CIB will have duplicate entries for all nodes: one is the FQDN (which
> remains there from the corosync1-based setup), the other is the new stripped
> name. But with my proposal it should start cleanly after the upgrade without
> any modifications to the CIB. With that proposal you can guess the remote node
> uname with a high chance of being correct (unless cluster members have been
> set up differently with regard to uname, but I would say that such a setup is
> brain-dead). And admins will get the expected result - they had it
> configured that way and they now have it configured the same way.
> Nothing changed, everything works.
>
> And one more interesting issue arises - what will be used as the node
> name for multi-ring clusters? Even if corosync.conf has names in the
> nodelist instead of addresses, which one will be used? ring0?

ring0_addr: or name:

> I think it
> would be natural to look at all ring-specific names/addresses and choose
> the one which matches the local uname (with a reverse DNS lookup for
> addresses and maybe a double DNS lookup for names - name->address->fqdn).
> After that you can guess remote unames based on the ring id and the domain
> name obtained, with the method I propose, from the address for that ring and
> the local uname. That double lookup (but in reverse order -
> address->fqdn->address) is common for mail servers btw.
>
> The more I think about it the more I believe that is the right way to
> go. IMHO it is the most universal method.

Not a chance in hell. Sorry. Way too many moving parts (equals more
fun ways to get screwed up, bonus points for any solution using the
word "guess") just to get a name.

If anything I'm more inclined to drop the lookup and go for:
1. nodelist -> name: if present, otherwise
2. uname()
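
To make the two-step order above concrete, here is a minimal sketch. It is
written in Python purely for illustration (Pacemaker itself is C), and
pick_node_name and its nodelist_name parameter are hypothetical names, not
anything from the Pacemaker or corosync APIs. It assumes the caller has
already read the optional name: value for the local node out of the nodelist.

```python
import os
from typing import Optional


def pick_node_name(nodelist_name: Optional[str] = None) -> str:
    """Pick the local node name without any DNS lookups or guessing.

    nodelist_name is the (hypothetical) value of "name:" from the local
    node's nodelist entry, or None if the entry has no such key.
    """
    # Step 1: an explicitly configured, non-empty name always wins.
    if nodelist_name and nodelist_name.strip():
        return nodelist_name.strip()
    # Step 2: otherwise fall back to what uname() reports, unmodified
    # (so an FQDN stays an FQDN and a short name stays short).
    return os.uname().nodename


if __name__ == "__main__":
    print(pick_node_name("node1.example.com"))  # configured name is used
    print(pick_node_name())                     # falls back to uname()
```

Nothing is stripped or resolved in either branch, so an existing cluster
whose unames are FQDNs would keep seeing the same names it has in the CIB.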