NPACI AHM03 – Tutorial #10 Grid Computing Portals


                   Mary Thomas
       Texas Advanced Computing Center
        The University of Texas at Austin
                          and
                        NPACI
     Presented at the NPACI AHM 2003, San Diego, CA
                        Goals
• Introduce basic portal technologies and concepts
• Provide enough knowledge to go out and begin the process
  of evaluating/understanding technologies
• Provide enough knowledge to build a computing portal
  based on GridPort/Perl/CGI
• Not going to teach you how to install Grid software:
  assume you can do this or you have someone who takes
  care of it
• Audience: application developers or scientists
• Not a sales pitch – will address realities and tradeoffs
                      Outline

• Motivation for Computational Grid Portals
• Understanding Portals and Web Technologies
    – Grid Portal = Portals + Web + the Grid
•   Emerging Technologies
•   The GridPort Toolkit
•   GridPort-Based Portals
•   Future Directions
Motivation for Computational
         Grid Portals
    Why Use Portals for Computational
               Science?
• Computational science environment is complex:
   – Users have access to a variety of distributed resources (compute,
     storage, etc.).
   – Interfaces, OSes, and Grid tools for these resources vary and change often
   – Environment changes:
        Relocation/upgrade of binaries
   – Policies at sites sometimes differ, allocations change
   – Using multiple resources can be cumbersome
• The Grid adds complexity for programmers and users; portals
  can simplify it if done well.
• Best fit: community models, with enough users to justify
  the development effort
    Portals Provide Simple Interfaces
• Portals are web based, which has advantages:
   – Users know & understand the web
• Can serve as a layer in the middle-tier infrastructure of
  the Grid
   – Integrate various Grid services and resources
• Users can be isolated from resource specific details
• Single web interface isolates system
  changes/differences
• Not an end-all solution: several issues/challenges remain
   – Performance, scalability
   – Tradeoffs
  “Simple” Computational Grid
[Figure: diagram of a computational grid; the only label that survives is LSF]
Resource View:
  •Full Functionality
  •Very Complex
Portal View:
  •Hides Complexities
  •Limits User Functionality
Grid Portal = Portals +
    Web + the Grid
What is the Web?
            Web Server Technologies
• Web Servers:
   – Run on a machine, and clients access the process
   – Common Versions:
       Netscape (http://www.netscape.com)
       Apache (http://www.apache.org) - open source

• OSes: Windows, Unix, Macintosh, Linux, etc.
• Web Programming Languages
   – Server: Java, JavaScript, Python, PHP, Perl
   – Client: HTML, JavaScript
   – Protocols/Components: HTTP, CGI, Servlets, Applets,
     Jetspeed/portlets
• Security: HTTPS, SSL, Encryption, Cookies, Certificates
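The CGI model listed above can be sketched in a few lines: the web server hands request parameters to a script through environment variables such as QUERY_STRING, and the script writes a header block, a blank line, and a body to standard output. This is a minimal illustration in Python rather than the Perl used by GridPort; the function name and the `resource` parameter are hypothetical.

```python
import html
import os
from urllib.parse import parse_qs


def render_status_page(query_string: str) -> str:
    """Build a CGI response (headers plus HTML body) for a status request."""
    params = parse_qs(query_string)
    # 'resource' is a hypothetical parameter name used only for illustration.
    resource = params.get("resource", ["unknown"])[0]
    body = "<html><body><h1>Status for {}</h1></body></html>".format(
        html.escape(resource)  # escape user input before echoing it back
    )
    # A CGI response is a header block, a blank line, then the body.
    return "Content-Type: text/html\r\n\r\n" + body


if __name__ == "__main__":
    # Under a real web server the CGI gateway sets QUERY_STRING.
    print(render_status_page(os.environ.get("QUERY_STRING", "")), end="")
```

Because the script is started per request and only reads the environment and writes text, it is easy to install, which is one reason a Perl/CGI toolkit can be described as "easy to install" and "dynamic".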
                       Web Clients
• Multiple display devices:
   – Desktop workstations, PCs, PDAs, cell phones, pagers, other
     wireless devices, televisions
• Various viewing tools:
   – Browsers: Internet Explorer, Netscape Navigator, Opera
   – Tools for visually impaired users
• OSes: Windows/WinCE, Unix, Mac, Linux, Palm, etc.
• Web Programming Languages
   – HTML, JavaScript
   – Perl, Java for 'scrapers'
• Security
   – HTTPS, SSL, Encryption, Cookies
   – Certificates
     Portal Features and Capabilities
• Web sites that provide centralized access to a set of
  resources: Yahoo, MSNBC, Google
• Characterized by richer features than a set of HTML pages
   – Personalization
   – Security/authentication/authorization
   – What you see often changes based on what you are looking for
     (e.g., ads)
   – Navigation/choices
• Gateway for Web access to distributed
  resources/information
   – Hub from which users can locate all the Web content that they
     commonly need.
                  Classes of portals
• Horizontal or “mega-portals”:
   – information from search engines and ISPs (Yahoo)
   – everybody comes in, sees the same thing
   – allow personalization to some degree
• Vertical:
   – portals that are customized by the system
   – the system recognizes who you are and gives you a tailored
     view of the university or company whose portal you are using
   – more specialized (Amazon, Travelocity, etc.)
• Intranet:
   – portals inside a company that give particular people the
     information that they need
              Scientific Web Portals
• Use the Grid
• Three standard types:
   – User Portals:
       » simplify the user's ability to interact with and utilize a complex,
         often distributed environment
       » direct access to resources (compute, data, archival, etc.)
   – Application Interfaces:
       » enable scientists to conduct simulations on multiple
         resources
   – Customized Portals:
       » users can roll out their own portals by writing web pages
         using standard HTML or Perl/CGI, Java/JSP, etc.
• NOTE: Open Grid Services Architecture (OGSA) is
  changing all of this (will address later)
Conceptual: Web + Grid
Reality: GridPort Architecture
[Figure: GridPort architecture diagram. Surviving labels: NPACI, Horizon/1124,
 Power4/224, SV/16, AMD/256, AMD/100, SP/176, SP/24]
            Portal Technology Choices
• Commercial: more and more are adopting Grid (OGSA)
   –   Sun: Java Servlets, iPlanet
   –   IBM: WebSphere
   –   MSFT: .NET
   –   AVAKI (A. Grimshaw, Legion): ahead of the curve
   –   Special interest groups:
            » Jetspeed (emerging grid portal standard), uPortal
• R&D within Grid community:
   –   GCE-RG Common Portal Architecture – portlets (future)
   –   GridPort Toolkit (http://gridport.npaci.edu)
   –   GPDK (http://www.nlanr.org) – Java based
   –   GridSphere (Ed Seidel, GridLab)
   –   Gateway (Fox)
   –   CCA (Gannon, large DOE project)
   –   GRADS (large NSF Project)
The GridPort Toolkit
                              GridPort 2.0
• Part of NPACKage
• Perl/CGI:
   – Easy to install
   – Dynamic
• Multiportal architecture
• Account data:
   – manage certs/keys,
     session info for users
• Grid: Globus, SRB,
  NWS, etc.
• Thin client
                         PACI HotPage
• Access portal to all resources
• Information Portal to all users
• Secure access for authorized
  users
• PACI Grid Software used:
    – Globus Toolkit (GRAM, GSI,
      GRIS, GIIS), SRB, MyProxy,
      NWS
• Built with the GridPort Toolkit
    – GP 2.0: Perl/CGI
• Services provided:
    –   Resource information/status
    –   job control
    –   data collection management
    –   command execution
    –   personalization
                    Portal Accounts
• Portal accounts are not the same as resource accounts.
   – Users must be valid Grid users on each resource and need allocations
   – Processes run under the user's own account, with the same access and
     privileges as if the user had logged onto the resource
• Portal users must have a digital certificate signed by a
  known Certificate Authority (CA)
   – The certificate's distinguished name (DN) must also be added to the
     resource's mapfile
• Accounts for NPACI users are obtained via an on-line web
  form:
   – It can generate a certificate; the certificate and key are placed in a
     secure repository
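The DN-to-account mapping mentioned above can be illustrated with a small parser. This sketch assumes the common Globus grid-mapfile layout, in which each line holds a quoted certificate DN followed by a local account name; the function name and the sample DN and account are hypothetical, and real deployments may use additional syntax not handled here.

```python
import shlex


def parse_grid_mapfile(text: str) -> dict:
    """Map each certificate DN to a local account name.

    Assumes one entry per line: a double-quoted DN, whitespace, an account.
    Blank lines, '#' comments, and malformed lines are skipped.
    """
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = shlex.split(line)  # honours the quotes around the DN
        if len(parts) != 2:
            continue
        dn, account = parts
        mapping[dn] = account
    return mapping


# Hypothetical entry, for illustration only.
SAMPLE = '''
# DN -> local account
"/O=Grid/OU=Example/CN=Jane Doe" jdoe
'''

if __name__ == "__main__":
    print(parse_grid_mapfile(SAMPLE))
```

A portal performs no mapping of its own here: the resource decides which local account a certificate holder runs under, which is why portal accounts and resource accounts stay distinct.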
GridPort 3.0: GCE Portal
              • Expanded CE layer (thin
                client, GCE Shell, portals,
                portlets, apps, etc.)
              • Distributed grid and web
                services (OGSA)
              • Workflow – interaction
                between components
              • Component approach
                 – needs OOP capability:
                   Java
                 – Python, PHP/Perl
              • XML, database at core
       Grid Technologies Employed
• Globus GT 2.x (NMI R1, R2; also earlier versions)
   – GT 3.0 in next version
• Security:
   – GSI is key enabling infrastructure
   – Globus Grid Security Infrastructure (GSI), SSH
   – MyProxy for remote proxies
• Job Execution:
   – Globus/GRAM Gatekeeper (key)
       used to run batch, interactive jobs and tasks on remote
        resources
   – Scheduler: Platform Computing (LSF, Multicluster);
       Integration with SGE, AVAKI, others (Texas grid)
   – Queuing systems include PBS and others
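A portal that submits work through the GRAM gatekeeper has to turn form input into a job description. The sketch below assumes the GT 2.x convention of describing a job as an RSL string of `(attribute=value)` pairs prefixed with `&`; the helper name is hypothetical, and it deliberately rejects any value that would need RSL quoting rather than guessing at the quoting rules.

```python
import re

# Only plain path-like or numeric values are accepted; anything that
# would need RSL quoting is rejected instead of being escaped.
_SAFE_VALUE = re.compile(r"^[A-Za-z0-9_./-]+$")


def build_rsl(attributes: dict) -> str:
    """Build a simple RSL job description from attribute/value pairs."""
    parts = []
    for name, value in attributes.items():
        value = str(value)
        if not _SAFE_VALUE.match(value):
            raise ValueError("unsupported value for %s: %r" % (name, value))
        parts.append("({}={})".format(name, value))
    return "&" + "".join(parts)


if __name__ == "__main__":
    print(build_rsl({"executable": "/bin/date", "count": 1}))
```

Validating input before building the string matters in a portal setting, because the values originate from a web form and end up in a command executed on a remote resource.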
  Grid Technologies Employed (cont.)
• Information & Monitoring Services:
   – Globus MDS 2.2, GIIS, GRIS
   – NWS, data from LSF, United Devices, etc.
   – Web service based GIS archival system: Grid-IAS
       Custom information provider scripts
       Grid Monitoring System (Java enhanced version of NCSA)

• File Management:
   – GridFTP --> key technology
   – SDSC Storage Resource Broker (SRB)
       for file collection management
   – Sun SAN between TACC/Campus
Emerging Technologies:
       OGSA
  Jetspeed/Portlets
                     Web Services
• Architecture mechanisms for
    – dynamic service discovery (UDDI)
    – separation of implementation from function (WSDL)
    – known protocols (SOAP/HTTP, SOAP/RPC)
•   Service provider encapsulates implementation details
•   Client doesn't need details, just where/how to send the request
•   Commercial world is developing P2P web services
•   In some ways, Globus/GRAM is a web service
•   Advantage: language independent, so can run on any
    system
    – Community is pursuing Python, Java, C++ at this time
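To make the "client only needs to know where and how to send the request" point concrete, here is a minimal sketch of building a SOAP 1.1 request envelope with the Python standard library. The operation name, service namespace, and parameter are hypothetical; a real service would publish them in its WSDL.

```python
import xml.etree.ElementTree as ET

# SOAP 1.1 envelope namespace.
SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"


def build_soap_request(operation: str, service_ns: str, params: dict) -> bytes:
    """Serialize a SOAP envelope whose body holds one operation call."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element("{%s}Envelope" % SOAP_NS)
    body = ET.SubElement(envelope, "{%s}Body" % SOAP_NS)
    # The operation element lives in the service's own namespace.
    call = ET.SubElement(body, "{%s}%s" % (service_ns, operation))
    for name, value in params.items():
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="utf-8")


if __name__ == "__main__":
    # 'submitJob' and 'urn:example:jobs' are illustrative names only.
    xml_bytes = build_soap_request(
        "submitJob", "urn:example:jobs", {"executable": "/bin/date"}
    )
    print(xml_bytes.decode("utf-8"))
```

The resulting bytes would be sent as the body of an HTTP POST; nothing in the message depends on the language the service is written in, which is the language-independence advantage noted above.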
     Open Grid Services Architecture
• IBM and Globus team integrated key concepts of Grid and web
• Taking Grid community to next level – services are interoperable
• Protocol based rather than implementation based
   – PROTOCOLS examples: telnet, ftp, ssh
   – telnet
        Login
        password
• Grid:
   – Security (PKI, GSI)
   – persistence – stateless web is gone: track task, user info, etc.
   – Handles to instances
• Web:
   – HTTP transport layer
   – Simple Object Access Protocol (SOAP)
   – XML
   – Web Services Description Language (WSDL)
OGSA Component Approach: Workflow




• Grid & Web services components
• Standard interface
• Dynamic composition and exchange of data
           Jetspeed and Portlets
• New direction for grid computing portal
  community based on Apache and open source
• Uses Java plug-in software behind web servers
• Builds dynamic web pages based on client
  request:
  – Executes set of components (Java Portlets)
  – Composites them into a web page
  – Returns page to user
• Portlets exchanged by sharing code
  – WSDL will be employed
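The execute, composite, return cycle described above can be sketched independently of Jetspeed itself. In this illustration a "portlet" is simply a function that returns an HTML fragment; the names are hypothetical and this is not the Jetspeed API, which is Java based.

```python
import html
from typing import Callable, Dict, List

# Here a "portlet" is any function that turns a request into an HTML fragment.
Portlet = Callable[[Dict[str, str]], str]


def login_status_portlet(request: Dict[str, str]) -> str:
    user = html.escape(request.get("user", "guest"))
    return "<div class='portlet'>Logged in as {}</div>".format(user)


def job_list_portlet(request: Dict[str, str]) -> str:
    return "<div class='portlet'>No jobs queued</div>"


def compose_page(portlets: List[Portlet], request: Dict[str, str]) -> str:
    """Run each portlet, then composite the fragments into one page."""
    fragments = [portlet(request) for portlet in portlets]
    return "<html><body>\n" + "\n".join(fragments) + "\n</body></html>"


if __name__ == "__main__":
    print(compose_page([login_status_portlet, job_list_portlet], {"user": "jdoe"}))
```

Because each component only has to honour the fragment-returning interface, a portal can add a portlet written by someone else by appending it to the list, which is the sharing model the slide describes.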
     Portlet-based Tools and Technology
Provided Capability
  – Management of user proxy
     certificates
  – Remote file management via
     GridFTP
  – Collaboration tools:
     news/message systems
  – Event/logging service
  – Access to OGSA services
  – Specialized application
     factories
  – Access to directory services
     and metadata tools
See http://www.extreme.indiana.edu/xportlets
            Jetspeed + GridPort 2.0
• Demonstrates the advantage of using Jetspeed portlet
  classes to create URL connections to Perl/CGI code
  which makes calls to GridPort.
   – This is the method we use to wrap GridPort with Jetspeed.
   – Needed for backwards compatibility of existing NPACI portals
• Only minor modifications made to GridPort
   – Perl modules: authentication
   – pass Jetspeed session data; set GridPort cookies
• Current Progress
   – GridPort Login/Logout
   – Globus Run
               Jetspeed Advantages
• Overall portal customization
   – Java portlets are small pieces of code that perform tasks.
   – Can install someone else's portlets
• Individual user customization
   – Fulfills users' need to tailor their portal interface to
     their liking.
• Open Source
   – Always being debugged, re-released.
   – One downside of Open Source is that documentation is limited,
     but a tight user/developer community provides some assistance.
• Template interfaces such as Velocity and JSP allow the
  presentation layer to be separated from the Java program
  layer.
                 Wrapping GridPort
• By utilizing Jetspeed portlet classes we can create URL
  connections to Perl/CGI scripts which make calls to
  GridPort.
• Some minor modifications need to be made
  to the GridPort Perl modules.
   – This is mostly passing session information in from Jetspeed and
     forcing GridPort to use that instead of its own.
• Current Progress
   – GridPort Login/Logout
   – Globus Run
http://gridtest.tacc.utexas.edu:2080/jetspeed
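The wrapping approach, in which a portlet opens a URL connection to an existing CGI script and passes its session along, can be sketched as follows. The portlets in the deck are Java; this Python version only illustrates the request that would be built. The host, script name, cookie name, and parameters are all hypothetical, and the request is constructed but not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_cgi_request(base_url: str, script: str,
                      session_id: str, params: dict) -> Request:
    """Prepare a request to a wrapped CGI script, carrying the portal session."""
    url = "{}/{}?{}".format(base_url.rstrip("/"), script, urlencode(params))
    request = Request(url)
    # 'portal_session' is a made-up cookie name standing in for whatever
    # session token the wrapped CGI code expects.
    request.add_header("Cookie", "portal_session={}".format(session_id))
    return request


if __name__ == "__main__":
    req = build_cgi_request(
        "https://portal.example.org/cgi-bin/",
        "globus_run.cgi",
        "abc123",
        {"host": "example.org", "command": "/bin/date"},
    )
    print(req.full_url)
    print(req.get_header("Cookie"))
```

Passing the session in from the front end, instead of letting the CGI layer create its own, is what keeps a single login valid across both layers.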
Services for Grid Computing Portals (OGSA)

Grid-specific
• Security
• Account and allocation management
• Information, discovery and monitoring
• Resource scheduling and management
• Data and collection management
• Application support

Portal-specific
• View customization, user session management, and portal logging
• Groups, roles, sharing, access control
• Collaboration and communication systems: chat/instant messaging
  services, whiteboards, calendars, newsgroups, citation browsers
• Ubiquitous access: browsers, cmd line, cell, PDA, etc.
          Recommended Technologies
Grid
•   OGSI/OGSA; Globus 3.0
•   Globus GT 2.x (NMI R1, R2; also earlier versions)
•   Security:
     – GSI is key enabling technology
     – Grid Security Infrastructure
     – MyProxy for remote proxies
•   Job Execution:
     – Globus GRAM Gatekeeper (key)
          » used to run batch, interactive jobs and tasks on remote
            resources
     – Scheduler: Platform Computing (LSF, Multi-cluster);
          » integration with SGE, AVAKI, others (Texas grid)
     – Queuing systems – PBS, LSF, etc.
•   Information & Monitoring Services:
     – Globus MDS 2.2, GIIS, GRIS
     – NWS, data from LSF, United Devices, etc.
     – Web service based GIS archival system: Grid-IAS
          » Custom information provider scripts
          » Grid Monitoring System (Java enhanced version of NCSA)
•   File Management:
     – GridFTP --> key technology
     – SDSC Storage Resource Broker (SRB)
          » for file collection management
     – Sun SAN between TACC/Campus
Portals
•   Java, Jetspeed, portlets, CC
•   Web services (in addition to grid)
•   Database back end
•   XML
GridPort-Based Portals
      Variety of GridPort Applications
• Current applications in production:
  – NPACI/PACI HotPages https://hotpage.npaci.edu
  – Telescience (Ellisman):
      https://gridport.npaci.edu/Telescience
  – Protein Data Bank CE Portal (Phil Bourne)
      https://gridport.npaci.edu/CE
  – LAPK Portal: Pharmacokinetic Modeling (live demo of
    Pharmacokinetic Modeling Portal)
      https://gridport.npaci.edu/LAPK
  – GAMESS:
      https://gridport.npaci.edu/GAMESS
   Telescience: Access to Instruments/Data




• Uses GridPort to integrate
  Telescience technologies with the
  Grid
    – Access to instruments
    – Globus job control
    – SRB data collections
• Migrating to BIRN
           NBCR GAMESS Portal
• Community Model
• XML database
• Access variety of
  compute resources
• Couples to proprietary
  visualization rendering
  services
• Part of multi-portal
  system
                  Future Directions
• Portals Workshop on Friday (open to all):
   – Goal is to bring NPACI Portal developers & users together
• GridPort 2.2 is part of NPACKage, compliant with the NMI
  program (NSF)
   – Perl has no GSI security capabilities, so we are moving away from it
   – Developing Jetspeed/Portlet solutions for GridPort
   – Planning a pyGlobus version: pyGridPort
• Collaborating with U. Mich, Indiana, Argonne, NCSA to
  develop grid portlet repository
• Developing GridPort GCE
   – OGSA, Java/portlets, GCEShell interfaces
             GridPort Project Team
• GridPort Project represents collaboration efforts
  spanning the PACI Program:
   – Mary Thomas, Jay Boisseau, Maytal Dahan, Eric Roberts,
     Tomislav Urban (TACC)
   – Cathie Mills, Steve Mock, Kurt Mueller (SDSC)
   – Charles Severance, Joseph Hardin (U. Mich)
   – Dennis Gannon, Geoffrey Fox, Marlon Pierce (Indiana)
   – Argonne/ISI: Globus development team
• And input from other Institutions:
   – NASA/IPG
   – GGF/GCE Research Group
                     References
• Related AHM Sessions:
   –   Tutorial #7: SRB
   –   Tutorial #9: Grid Portals
   –   Parallel Session #2 (Weds): Grid Experiences
   –   Workshop on Portals (Friday)
• GridPort Toolkit contact: Mary Thomas
  (mthomas@tacc.utexas.edu)
   – Project Websites: http://gridport.npaci.edu
   – Download: http://gridport.npaci.edu/download
• HotPages:
   – https://hotpage.npaci.edu, https://hotpage.paci.org

						