[Pacemaker] Can somebody please explain pengine's urge to move all resources?

Thu Sep 23 03:28:53 EDT 2010

On Wed, Sep 22, 2010 at 12:37 PM, Raoul Bhatia [IPAX] <r.bhatia at ipax.at> wrote:
> hi,
>
> i have a 2node cluster with drbd+nfs+webservices(clones)
>
> basically, i have some rules:
>
> 1. promote drbd before starting fs+nfs-server (group_www_data)
>> order drbd_before_group_www_data : ms_drbd_www:promote group_www_data:start
>
> 2. start nfs-server (group_www_data) before nfsclient+apache
> (clone_webservice)
>> order group_www_data_before_webservices : group_www_data:start clone_webservice:start
>
> 3. start ftp-server after everything is up:
>> order fs_www_before_pure-ftpd 0: clone_webservice:start pure-ftpd:start
>> order webservices_before_group_ftpd 0: clone_webservice:start group_ftpd:start
>
> (actually, from what i see now, these two rules are redundant, right?)

Well, one of them is. You probably want the other one though :-)
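
Going by the crm status output quoted further down, pure-ftpd appears to be a member of group_ftpd, so the group-level constraint should already cover it. A minimal sketch of what would remain, assuming nothing else refers to fs_www_before_pure-ftpd by id:

    order webservices_before_group_ftpd 0: clone_webservice:start group_ftpd:start

The fs_www_before_pure-ftpd constraint would then be the one to drop, since a group starts its members only after the group itself is allowed to start.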

>
>
> i also colocate clone_webservice (nfs client) with the ftps server, so
> that the ftp server can actually serve the user's folders ;)
>> colocation colo_webservices_group_ftpd inf: group_ftpd clone_webservice:Started
>
>
> crm status:
>>  Resource Group: group_www_data
>>      fs_www_data        (ocf::heartbeat:Filesystem):    Started wc01
>>      nfs-kernel-server  (lsb:nfs-kernel-server):        Started wc01
>>      intip_nfs  (ocf::heartbeat:IPaddr2):       Started wc01
>>      backupip_nfs       (ocf::heartbeat:IPaddr2):       Started wc01
> ...
>>  Master/Slave Set: ms_drbd_www
>>      Masters: [ wc02 ]
>>      Slaves: [ wc01 ]
>>  Clone Set: clone_nfs-common
>>      Started: [ wc01 wc02 ]
>>  Clone Set: clone_webservice
>>      Started: [ wc02 wc01 ]
>>  Resource Group: group_ftpd
>>      intip_ftp  (ocf::heartbeat:IPaddr2):       Started wc01
>>      pure-ftpd  (ocf::heartbeat:Pure-FTPd):     Started wc01
>
>
> now i want to move pure-ftpd from wc01 to wc02:
>> crm resource migrate pure-ftpd
>
> imho, as clone_webservice is running on both wc01 and wc02, only
> group_ftpd should be stopped and (re-)started.
>
> but pengine thinks:
>> Sep 22 11:24:06 wc01 pengine: [4083]: notice: LogActions: Move resource fs_www_data (Started wc01 -> wc02)
>> Sep 22 11:24:06 wc01 pengine: [4083]: notice: LogActions: Move resource nfs-kernel-server (Started wc01 -> wc02)
>> Sep 22 11:24:06 wc01 pengine: [4083]: notice: LogActions: Move resource intip_nfs (Started wc01 -> wc02)
>> Sep 22 11:24:06 wc01 pengine: [4083]: notice: LogActions: Move resource backupip_nfs (Started wc01 -> wc02)
>
> can someone please explain the reason for that?

Probably a bug.
The good news is that 1.1.3 doesn't have that behavior.
Let's see how 1.0 goes once all the relevant patches have been backported.

> hb report attached.

Note to self: pe-input-90.bz2 from wc01 is the relevant test file.
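
For anyone wanting to replay that file, something along these lines should show what the policy engine decides without touching the live cluster. This is only a sketch: the options are quoted from memory and may differ between versions, and the path assumes the file has been extracted from the hb_report into the current directory.

    # Pacemaker 1.0 (ptest): show allocation scores for the saved input
    ptest -x pe-input-90.bz2 -s -VV

    # Pacemaker 1.1 (crm_simulate): simulate the resulting transition
    crm_simulate -x pe-input-90.bz2 -S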

> thanks,
> raoul
> --
> ____________________________________________________________________
> DI (FH) Raoul Bhatia M.Sc.          email.          r.bhatia at ipax.at
> Technischer Leiter
>
> IPAX - Aloy Bhatia Hava OG          web.          http://www.ipax.at
> Barawitzkagasse 10/2/2/11           email.            office at ipax.at
> 1190 Wien                           tel.               +43 1 3670030
> FN 277995t HG Wien                  fax.            +43 1 3670030 15
> ____________________________________________________________________