					Unclipped Condor
  in Windows®
   via coLinux
         Henry Neeman, Horst Severini,
         Chris Franklin, Josh Alexander
                University of Oklahoma
                    Sumanth J.V.
            University of Nebraska-Lincoln
 Condor Week, University of Wisconsin, Tuesday May 1 2007
             Condor: Linux vs Windows
   Condor inside Linux: full featured
   Condor inside Windows®: “clipped”
       No autocheckpointing
       No job automigration
       No remote system calls
       No Standard Universe
                                 http://www.our-picks.com/archives/2006/10/page/2/




                            Unclipped Condor in Windows via coLinux
                               Condor Week, Tuesday May 1 2007                       2
             Lots of PCs in IT Labs
At many institutions, there are lots of PC labs
   managed by a central IT organization.
If the head of IT (e.g., CIO) is on board,
   then all of these PCs can be Condorized.
But, these labs tend to be Windows® labs, not Linux.
   So you can’t take the Windows® desktop
   experience away from the desktop users, just to get
   Condor.
So, how can we have Linux Condor AND Windows®
   desktop on the same PC at the same time?
         Solution Attempt #1: VMware
Attempted solution: VMware
 Linux as native host OS

 Condor inside Linux
 VMware inside Linux

 Windows® inside VMware

Tested on ~200 PCs in IT PC labs (Union, library,
   dorms, Physics Dept)
In production for over a year



               VMware Disadvantages
Attempted solution: VMware
 Linux as native host OS

 Condor inside Linux

 VMware inside Linux

 Windows® inside VMware
Disadvantages
 VMware costs money! (Less so now than then.)

 Crashy

 VMware performance tuning (straight to disk) was unstable
 Sensitive to hardware heterogeneity

 Painful to manage

 CD/DVD burners and USB drives didn’t work in some PCs.


            A Better Solution: coLinux
Cooperative Linux (coLinux)
http://www.colinux.org/
   FREE!
   Runs inside native Windows®
   No sensitivity to hardware type
   Better performance
   Easier to customize
   Smaller disk footprint and lower CPU usage when idle
   Minimal management required (~10 hours/month)
               Compatibility Issue
About 30 of the 200 lab PCs we installed coLinux on
 had problems with it, so those PCs now run a
 prerelease version of coLinux.
We have no idea why the production version of
 coLinux was a miserable failure on these 30 PCs,
 nor why the prerelease version succeeded.




                 Preventing BSOD
The Data Execution Prevention feature inside
  Windows®, when running on some newer
  processors, can conflict with coLinux and cause
  system failure. The solution to this problem is to
  add the /NOEXECUTE switch to the Windows®
  boot.ini.
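
The boot.ini change can be sketched in Python as below. This is a minimal illustration, not the procedure used at OU or UNL. The slide names only the /NOEXECUTE switch; the policy value AlwaysOff (one of the values Windows accepts for that switch, alongside OptIn, OptOut and AlwaysOn) is an assumption here, as are the function name set_noexecute and the sample boot.ini text.

```python
import re


def set_noexecute(boot_ini_text, policy="AlwaysOff"):
    """Return boot.ini text with /NoExecute=<policy> on every OS entry.

    The default policy is an assumption; the slide only names the switch.
    """
    switch = "/NoExecute=" + policy
    result = []
    in_os_section = False
    for line in boot_ini_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("["):
            # Track whether we are inside the [operating systems] section.
            in_os_section = stripped.lower() == "[operating systems]"
        elif in_os_section and "=" in stripped:
            # Drop any existing /NoExecute switch, then append the new one.
            line = re.sub(r"\s*/noexecute(=\S+)?", "", line, flags=re.IGNORECASE)
            line = line.rstrip() + " " + switch
        result.append(line)
    return "\n".join(result)


if __name__ == "__main__":
    # Hypothetical sample file contents, for illustration only.
    sample = (
        "[boot loader]\n"
        "timeout=30\n"
        "[operating systems]\n"
        'multi(0)disk(0)rdisk(0)partition(1)\\WINDOWS="Microsoft Windows XP Professional"'
        " /fastdetect /NoExecute=OptIn"
    )
    print(set_noexecute(sample))
```

The function works on text only, so it can be checked against a copy of boot.ini before the real file is touched.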




                     Network Issue
Networking options
 Bridged: Each PC has to have a second IP address,
  so the institution has to have plenty of spare IP
  addresses available. (Oklahoma solution)
 NAT: The Condor pool requires a Generic
  Connection Broker (GCB) on a separate, dedicated
  PC (hardware $), and has some instability.
  Switched to OpenVPN. (Nebraska solution)
      Nebraska experimented with port forwarding in
       Windows®, but abandoned it for OpenVPN because of
       security and usability concerns.

        Traversing NATs and Firewalls
   What is GCB (Connection Broker)?
       Socket level approach.
       A broker arranges connections between machines inside the firewall
        and machines outside the firewall.
   What is OpenVPN (Open Virtual Private Network)?
       A network within a network.
       Virtual network adapter.
       Virtual IP (static/dynamic).
       TCP within UDP.
       Client/Server architecture.
       All to All communication.
       All traffic is encrypted by default.

                                     OpenVPN
   When using GCB, each machine is represented by a unique port on the
    broker.
        Central Manager sees all the machines as <GCB_IP:port 1>, <GCB_IP:port 2>,
         etc.
        Only applications linked against GCB work. (Condor is already linked)
   When using OpenVPN, each machine has a unique virtual IP address in
    the VPN.
        Simplifies troubleshooting.
   Central Manager is also part of the OpenVPN and runs in server mode.
   ClientConnect.py
        Determines Virtual IP of a new client based on its Real IP.
             E.g., node-25-55, with real IP 129.93.25.55, gets virtual IP 10.1.25.55
        Pushes this configuration to the clients.
        Updates /etc/hosts.
   OpenVPN lockups can be fixed with mssfix 1200 and fragment 1200
    options?
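
The real-IP-to-virtual-IP mapping can be sketched as follows. This is not the actual ClientConnect.py, which the slides do not show. It assumes, from the single example above, that the last two octets of the real address are carried over into a 10.1.x.y virtual address and into the node-x-y host name. The function names and the vpn_prefix parameter are hypothetical.

```python
import ipaddress


def _last_two_octets(real_ip):
    # ipaddress validates the input and raises ValueError on a bad address.
    octets = str(ipaddress.IPv4Address(real_ip)).split(".")
    return octets[2], octets[3]


def virtual_ip_for(real_ip, vpn_prefix="10.1"):
    """Map a real IP to a virtual IP by reusing its last two octets."""
    third, fourth = _last_two_octets(real_ip)
    return "{}.{}.{}".format(vpn_prefix, third, fourth)


def hostname_for(real_ip):
    """Build the node-x-y host name from the same two octets."""
    third, fourth = _last_two_octets(real_ip)
    return "node-{}-{}".format(third, fourth)


def hosts_line(real_ip):
    """Format an /etc/hosts style entry: virtual IP, then host name."""
    return "{}\t{}".format(virtual_ip_for(real_ip), hostname_for(real_ip))


if __name__ == "__main__":
    print(hosts_line("129.93.25.55"))
```

How the address is pushed to the client and how /etc/hosts is updated are not described on the slide, so the sketch stops at computing the values.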

                                 OpenVPN
   No modification of application required to use OpenVPN.
       We have successfully mounted NFS shares (CMS stack).
       Inbound SSH access
       Since all-to-all communication is present, even MPICH works.
            Remember all-to-all communication still has to go through the
             OpenVPN server.
   Secure
       No firewall required in coLinux.




                    Monitoring Issue
Condor inside Linux monitors keyboard and mouse usage to
   decide when to suspend a job.
In coLinux, this is tricky.
We had to set up a Visual Basic script on the Windows® side
   to send the keyboard and mouse information to coLinux.
UNL implements a similar idea in C++, and OU is now doing
   likewise.
UNL collects all the keyboard and mouse data on a server,
   while OU does it on each local machine. But the result is the
   same.


                 Monitoring coLinux Labs
   How to determine whether all the machines in each lab are up and running?
        condor_status only displays working machines. What about missing
         machines?
   We need a list of expected but missing nodes per lab.
   We need a physical layout of the nodes in each lab.
   MySQL database to store lab info.
        Need to separately handle static and dynamic IP labs.
        Static IPs are easy to handle.
             Store IP and relative coordinates of the node.
        Dynamic IPs: store a lambda function expressing how to determine whether a
         machine belongs to a lab.
             E.g., lambda x: '18-' in x matches node-18-2, node-18-3, …
        Store expected number of machines per lab, and known hardware/software issues
         as notes per machine.
        Compare output of condor_status against the MySQL database.
   Demo: http://mindspawn.unl.edu/condor/stats
   Web front-end developed for mod_python.
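
The comparison described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the UNL implementation. The list of node names reported by condor_status is taken as already parsed into `seen`, the database lookup is replaced by plain Python values, and all function names are hypothetical.

```python
def missing_nodes(expected, seen):
    """Static-IP lab: names that are expected but not reported as up."""
    return sorted(set(expected) - set(seen))


def nodes_in_lab(belongs_to_lab, seen):
    """Dynamic-IP lab: reported nodes that the lab's predicate accepts."""
    return sorted(name for name in seen if belongs_to_lab(name))


def dynamic_lab_shortfall(belongs_to_lab, expected_count, seen):
    """Dynamic-IP lab: how many machines fewer than expected are up."""
    up = len(nodes_in_lab(belongs_to_lab, seen))
    return max(0, expected_count - up)


if __name__ == "__main__":
    # Hypothetical data standing in for condor_status output and the database.
    seen = ["node-18-2", "node-18-3", "node-25-55"]
    static_lab = ["node-25-55", "node-25-56"]
    lab_18 = lambda x: '18-' in x  # predicate from the slide's example

    print(missing_nodes(static_lab, seen))
    print(nodes_in_lab(lab_18, seen))
    print(dynamic_lab_shortfall(lab_18, 4, seen))
```

A substring predicate such as '18-' in x would also accept a name like node-118-2, so a stricter match may be needed where subnet numbers overlap.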

       How to Build a Multistate Grid
To make a prairie
It takes a clover and one bee.
One clover, and a bee, and reverie.
The reverie alone will do,
If bees are few.
       – Emily Dickinson, 1858
       http://magickcanoe.com/blog/2006/08/24/on-our-walk/




         OU’s NSF CI-TEAM Project
OU recently received a grant from the National
  Science Foundation’s Cyberinfrastructure Training,
  Education, Advancement, and Mentoring for Our
  21st Century Workforce (CI-TEAM) program.
Objectives:
 Teach general HPC concepts to a broad audience

 Provide Condor resources to the national
  community
 Teach users to use Condor and sysadmins to deploy
  and administer it
 Teach bioinformatics students to use BLAST over
  Condor
                OU NSF CI-TEAM Project
  Cyberinfrastructure Education for Bioinformatics and Beyond
Objectives:
   teach students and faculty to use FREE Condor middleware,
    stealing computing time on idle PCs;
   teach system administrators to deploy and maintain Condor on PCs;
   teach bioinformatics students to use BLAST on Condor;
   provide Condor Cyberinfrastructure to the national community (FREE).
OU will provide:
   Condor pool of 750 desktop PCs (already part of the Open Science Grid);
   Supercomputing in Plain English workshops via videoconferencing;
   Cyberinfrastructure rounds (consulting) via videoconferencing;
   drop-in CDs for installing full-featured Condor on a Windows PC
    (Cyberinfrastructure for FREE);
   sysadmin consulting for installing and maintaining Condor on desktop PCs.
OU’s team includes: High School, Minority Serving, 2-year, 4-year, masters-granting;
   11 of the 15 institutions are in 4 EPSCoR states (AR, KS, NE, OK).
                 OU NSF CI-TEAM Project
Participants at OU
(29 faculty/staff in 16 depts)
   Information Technology
         OSCER: Neeman (PI)
   College of Arts & Sciences
         Botany & Microbiology: Conway, Wren
         Chemistry & Biochemistry: Roe (Co-PI), Wheeler
         Mathematics: White
         Physics & Astronomy: Kao, Severini (Co-PI), Skubic, Strauss
         Zoology: Ray
   College of Earth & Energy
         Sarkeys Energy Center: Chesnokov
   College of Engineering
         Aerospace & Mechanical Engr: Striz
         Chemical, Biological & Materials Engr: Papavassiliou
         Civil Engr & Environmental Science: Vieux
         Computer Science: Dhall, Fagg, Hougen, Lakshmivarahan, McGovern, Radhakrishnan
         Electrical & Computer Engr: Cruz, Todd, Yeary, Yu
         Industrial Engr: Trafalis
   OU Health Sciences Center, Oklahoma City
         Biochemistry & Molecular Biology: Zlotnick
         Radiological Sciences: Wu (Co-PI)
         Surgery: Gusev

Participants at other institutions
(19 faculty/staff at 14 institutions)
   California State U Pomona (masters-granting, minority serving): Lee
   Contra Costa College (2-year, minority serving): Murphy
   Earlham College (4-year): Peck
   Emporia State U (masters-granting, EPSCoR): Pheatt, Ballester
   Kansas State U (EPSCoR): Andresen, Monaco
   Langston U (masters-granting, minority serving, EPSCoR): Snow
   Oklahoma Baptist U (4-year, EPSCoR): Chen, Jett, Jordan
   Oklahoma School of Science & Mathematics (high school, EPSCoR): Samadzadeh
   St. Gregory’s U (4-year, EPSCoR): Meyer
   U Arkansas (EPSCoR): Apon
   U Central Oklahoma (masters-granting, EPSCoR): Lemley, Wilson
   U Kansas (EPSCoR): Bishop
   U Nebraska-Lincoln (EPSCoR): Swanson
   U Northern Iowa (masters-granting): Gray

       How to Create a Multistate Grid?
    Grids aren’t primarily about technology!
    You need to recruit people, by offering them
     more than you ask them to provide.
1.   Go to their institution.
2.   Give a really fun and interesting talk
     about your stuff.
3.   Tell them that they can use your stuff
     for free.
4.   Make them commit to using your stuff.
5.   Help them use your stuff.
6.   If possible, get them to visit you and see your
     stuff.
Thanks for your
  attention!

  Questions?

				