[Pacemaker] Re: FW: time pressure - software raid cluster, raid1 resource agent, help needed
Holger Teutsch
holger.teutsch at web.de
Mon Mar 7 10:30:46 UTC 2011
Hi,
SAN drivers often have large timeouts configured, so are you patient
enough?
At least this demonstrates that the problem is currently not in the
cluster...
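
For reference, the usual culprit is the multipath queueing policy: with the default behaviour, I/O to a device whose paths are all gone is queued indefinitely, which is exactly what makes mdadm appear to hang. A hedged sketch of the relevant setting (values are illustrative, not taken from your setup; also check the per-device SCSI timeout in /sys/block/*/device/timeout):

```
# /etc/multipath.conf (illustrative fragment)
defaults {
    # Fail I/O promptly when all paths are lost, instead of queueing
    # forever ("queue"), so md can mark the mirror leg as failed.
    no_path_retry    fail
}
```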
- holger
On Mon, 2011-03-07 at 11:04 +0100, Patrik.Rapposch at knapp.com wrote:
> Hi,
>
> thanks for the answer. I tested this now; the problem is that mdadm
> hangs completely when we simulate the failure of one storage. (We
> already tried two ways: 1. removing the mapping; 2. removing one path
> and then disabling the remaining path through the port on the SAN
> switch - which is nearly the same as a total failure of the storage.)
>
> So I can't get the output of mdadm, because it hangs.
>
> I think it must be a problem with mdadm. This is my mdadm.conf:
>
> DEVICE /dev/mapper/3600a0b800050c94e000007874d2e0028_part1 /dev/mapper/3600a0b8000511f54000014b14d2df1b1_part1 /dev/mapper/3600a0b800050c94e000007874d2e0028_part2 /dev/mapper/3600a0b8000511f54000014b14d2df1b1_part2 /dev/mapper/3600a0b800050c94e000007874d2e0028_part3 /dev/mapper/3600a0b8000511f54000014b14d2df1b1_part3
> ARRAY /dev/md0 metadata=0.90 UUID=c411c076:bb28916f:d50a93ef:800dd1f0
> ARRAY /dev/md1 metadata=0.90 UUID=522279fa:f3cdbe3a:d50a93ef:800dd1f0
> ARRAY /dev/md2 metadata=0.90 UUID=01e07d7d:5305e46c:d50a93ef:800dd1f0
>
> kr Patrik
>
>
> Mit freundlichen Grüßen / Best Regards
>
> Patrik Rapposch, BSc
> System Administration
>
> KNAPP Systemintegration GmbH
> Waltenbachstraße 9
> 8700 Leoben, Austria
> Phone: +43 3842 805-915
> Fax: +43 3842 805-500
> patrik.rapposch at knapp.com
> www.KNAPP.com
>
> Commercial register number: FN 138870x
> Commercial register court: Leoben
>
> The information in this e-mail (including any attachment) is
> confidential and intended to be for the use of the addressee(s) only.
> If you have received the e-mail by mistake, any disclosure, copy,
> distribution or use of the contents of the e-mail is prohibited, and
> you must delete the e-mail from your system. As e-mail can be changed
> electronically KNAPP assumes no responsibility for any alteration to
> this e-mail or its attachments. KNAPP has taken every reasonable
> precaution to ensure that any attachment to this e-mail has been swept
> for virus. However, KNAPP does not accept any liability for damage
> sustained as a result of such attachment being virus infected and
> strongly recommend that you carry out your own virus check before
> opening any attachment.
>
>
> Holger Teutsch <holger.teutsch at web.de> wrote on 06.03.2011 19:56
> (to: The Pacemaker cluster resource manager <pacemaker at oss.clusterlabs.org>,
> subject: Re: [Pacemaker] FW: time pressure - software raid cluster, raid1 resource agent, help needed):
>
> On Sun, 2011-03-06 at 12:40 +0100, Patrik.Rapposch at knapp.com wrote:
> Hi,
> I assume the basic problem is in your RAID configuration.
>
> If you unmap one box, the devices should not be in status FAILED but
> degraded.
>
> So what is the exit status of
>
> mdadm --detail --test /dev/md0
>
> after unmapping?
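
For reference, mdadm(8) documents the exit codes that `--detail --test` returns, which is what the Raid1 agent's monitor relies on. A small helper mapping them to state names (a sketch for illustration, not code from the agent itself):

```shell
#!/bin/sh
# Map the exit status of "mdadm --detail --test /dev/mdX" to a state
# name, per the mdadm(8) man page.
md_status() {
    case "$1" in
        0) echo "clean" ;;      # array functioning normally
        1) echo "degraded" ;;   # at least one failed device
        2) echo "failed" ;;     # too many failed devices, unusable
        *) echo "error" ;;      # could not get information about the device
    esac
}

# Usage on a real system:
#   mdadm --detail --test /dev/md0; md_status $?
md_status 1   # prints "degraded"
```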
>
> Furthermore I would start with one isolated group containing the
> RAID, LVM, and FS to keep it simple.
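
A minimal sketch of such a group in crm shell syntax (all resource names, device paths, and parameter values here are placeholders, not taken from the poster's cib.xml):

```
primitive p_raid1 ocf:heartbeat:Raid1 \
    params raidconf="/etc/mdadm.conf" raiddev="/dev/md0" \
    op monitor interval="30s" timeout="60s"
primitive p_lvm ocf:heartbeat:LVM \
    params volgrpname="vg_example" \
    op monitor interval="30s"
primitive p_fs ocf:heartbeat:Filesystem \
    params device="/dev/vg_example/lv_example" directory="/srv/example" fstype="ext3" \
    op monitor interval="30s"
group g_example p_raid1 p_lvm p_fs
```

The group enforces ordered start (RAID, then LVM, then filesystem) and reverse-ordered stop, which keeps the failure analysis simple.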
>
> Regards
> Holger
>
> > Hi,
> >
> >
> > does anyone have an idea about this? I only have the servers until
> > next Friday, so to my regret I am under time pressure :(
> >
> >
> >
> > As I already wrote, I would appreciate and test any idea you have.
> > Also, if someone has already built clusters with LVM mirroring, I
> > would be happy to see a CIB or some configuration examples.
> >
> > Thank you very much in advance.
> >
> > kr Patrik
> >
> >
> > Patrik.Rapposch at knapp.com wrote on 03.03.2011 15:11
> > (to: pacemaker at oss.clusterlabs.org, reply to: The Pacemaker cluster resource manager,
> > subject: [Pacemaker] software raid cluster, raid1 resource agent, help needed):
> >
> >
> >
> > Good day,
> >
> > I have a 2-node active/passive cluster which is connected to two IBM
> > 4700 storages. I configured 3 RAIDs, and I use the Raid1 resource
> > agent for managing the RAID1s in the cluster.
> > When I disable the mapping of one storage, to simulate the failure of
> > one storage, the Raid1 resources change to the state "FAILED", and the
> > second node then takes over the resources and is able to start the
> > RAID devices.
> >
> > So I am confused why the active node can't keep the Raid1 resources,
> > while the formerly passive node can take them over and start them
> > correctly.
> >
> > I would really appreciate your advice, or maybe someone already has
> > an example configuration for Raid1 with two storages.
> >
> > Thank you very much in advance. Attached you can find my cib.xml.
> >
> > kr Patrik
> >
> >
> >
> > _______________________________________________
> > Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
> > http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> >
> > Project Home: http://www.clusterlabs.org
> > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> > Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker
More information about the Pacemaker mailing list