CSWIM Framework and Software
Randall Bramley
Indiana University
Overview
• Bad news
– Taking a code that runs at a limited number of sites and turning
it into a community resource is hard
– Documentation and support specifications are necessary
• Good news
– We can do this: right people, right tools, right time
– Introduces new capabilities for plasma research that no one
else will have (but a lot of up-front investment needed)
• Some basic terminology …
19 Sept 2005 Fusion Simulation Project 2
A Component Architecture is
• A set of objects called components, and the
rules that govern their interactions called
interfaces
Framework Has Different Meanings
• General software composition environment
– Examples: CCA, CORBA Component Model, …
• A problem-solving environment for applications in a particular domain
(numerical relativity, computational chemistry, …)
– Earth System Modeling Framework
– Computational Facility for Reacting Flow Science
• A workflow management system
– Example: Kepler
• Integrated data analysis and visualization environments
• More loosely, an informal environment for composing applications
from subparts
Components, Ports, Interfaces
• A component is some collection of functionality (a
subroutine, a library, a complete code)
• Components have published interfaces specifying
how other components may interact with them
– How the component implements its functionality is hidden
– Components may interact with world without using
interfaces: I/O to files, MPI communications, …
• A port is a collection of methods (OO-speak for
functions or subroutines). We’ll drop this
intermediate idea; it’s not needed here
Basic Motivation (from TSTT/TOPS)
[Figure, two panels. Panel 1: mesh packages (Overture, NWGrid, MOAB, Mesquite, GRUMMP, Frontier, FMDB) and linear solver packages (ISIS++, Trilinos, PETSc) connected through "Interfaces". Panel 2: the same packages connected through the TSTT Unstructured Mesh Interface and the Linear Solver Interface.]
Components and Interfaces
[Figure: two components, EquilibriumAdvancer and Plasma State Comp, drawn with the interface methods GetDistribution, GetState, SetState, InitializeState, and LockState.]
• Components provide or use one or more interfaces
• As long as the interfaces don’t change, you can do
anything you like on the inside
– Or use any Equilibrium Advancer that has the same
interfaces the community agrees upon.
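A minimal sketch of that idea, assuming a Python rendering of the two components on this slide. The method names `get_state`, `set_state`, and `advance`, the two toy advancers, and the "pressure" field are hypothetical and chosen only for illustration; they are not the CSWIM interfaces.

```python
from abc import ABC, abstractmethod


class PlasmaState:
    """Hypothetical shared state component exposing get_state/set_state."""

    def __init__(self):
        self._data = {}

    def get_state(self, name):
        return self._data[name]

    def set_state(self, name, value):
        self._data[name] = value


class EquilibriumAdvancer(ABC):
    """The agreed interface: every advancer must provide advance()."""

    @abstractmethod
    def advance(self, state, dt):
        """Read from and write to the plasma state for one step of size dt."""


class RelaxingAdvancer(EquilibriumAdvancer):
    # Toy implementation A: scale the pressure down each step (not real physics).
    def advance(self, state, dt):
        p = state.get_state("pressure")
        state.set_state("pressure", p * (1.0 - dt))


class FrozenAdvancer(EquilibriumAdvancer):
    # Toy implementation B: leave the equilibrium unchanged.
    def advance(self, state, dt):
        state.set_state("pressure", state.get_state("pressure"))


def run(advancer, steps=2, dt=0.5):
    """Driver that depends only on the interface, not on the implementation."""
    state = PlasmaState()
    state.set_state("pressure", 8.0)
    for _ in range(steps):
        advancer.advance(state, dt)
    return state.get_state("pressure")
```

The driver `run` never inspects which advancer it was handed, so either implementation can be swapped in without touching the rest of the system.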
Components, Interfaces, CSWIM
• We are going to adopt a more informal and
flexible approach
• Use ideas of components and interfaces, but
we are forging into new territory here
– We’ll use what’s useful and helpful
– We’ll help guide the CS software component world
to handle this kind of new community effort
• I’ll use framework for the overall execution
system, components for the computational
parts, and infrastructure for shared utilities
Stages to Get Where We Need to Be
Stage 1: define components and their
interfaces at least informally
• A component need not be an existing code
– In the CSWIM design, a component represents a category of codes or
smaller utilities
• One code may provide the services specified by
multiple components.
• Vital that we not define a component in terms of what
code XYZ does
• Avoid encumbering interface definitions with some
cool, useful, but unique capability code XYZ has
– Make it a subclass of the general interface, or
– make it a separate interface
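One way to picture the "subclass of the general interface" option, as a sketch only. The solver classes, the toy diagonal system, and the `condition_estimate` capability are all hypothetical stand-ins for "a cool, useful, but unique capability code XYZ has".

```python
from abc import ABC, abstractmethod


class LinearSolver(ABC):
    """General interface: only what every solver can provide."""

    @abstractmethod
    def solve(self, diag, rhs):
        """Solve the toy diagonal system diag[i] * x[i] = rhs[i]."""


class ConditionEstimating(LinearSolver):
    """Extension interface for a capability only some codes have."""

    @abstractmethod
    def condition_estimate(self, diag):
        """Return an estimate of the condition number."""


class PlainSolver(LinearSolver):
    def solve(self, diag, rhs):
        return [b / d for d, b in zip(diag, rhs)]


class SolverXYZ(ConditionEstimating):
    def solve(self, diag, rhs):
        return [b / d for d, b in zip(diag, rhs)]

    def condition_estimate(self, diag):
        mags = [abs(d) for d in diag]
        return max(mags) / min(mags)


def solve_and_report(solver, diag, rhs):
    """Works with any LinearSolver; uses the extra capability only if present."""
    x = solver.solve(diag, rhs)
    cond = None
    if isinstance(solver, ConditionEstimating):
        cond = solver.condition_estimate(diag)
    return x, cond
```

Callers written against `LinearSolver` keep working with every code, while the unique capability stays out of the general definition.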
Stage 2: define components and their
interfaces formally
• Best: use Scientific Interface Definition
Language (SIDL)
• At minimum, provide a Fortran 2003 module,
or a subroutine with argument list declarations,
which can then be translated into SIDL
• Reason for SIDL
– Provides formal definition
– Has a parse-check mechanism to verify semantics
– Can have stubs automatically generated for
implementations in Fortran95, Fortran77, C/C++,
Java, Python, …
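To illustrate what a formal definition buys, here is a rough Python analogue of checking an implementation against a declared interface. This is not SIDL and not how the SIDL tools work; `PlasmaStateInterface`, `conformance_errors`, and the two example classes are hypothetical, and the check compares only method names and argument names.

```python
import inspect


class PlasmaStateInterface:
    """Hypothetical formal declaration: method names and argument lists only."""

    def get_state(self, name: str) -> float: ...

    def set_state(self, name: str, value: float) -> None: ...


def conformance_errors(interface, implementation):
    """Return a list of mismatches between an implementation and the interface."""
    errors = []
    for name, declared in inspect.getmembers(interface, inspect.isfunction):
        actual = getattr(implementation, name, None)
        if actual is None:
            errors.append(f"missing method: {name}")
            continue
        want = list(inspect.signature(declared).parameters)
        got = list(inspect.signature(actual).parameters)
        if want != got:
            errors.append(f"{name}: expected arguments {want}, found {got}")
    return errors


class GoodState:
    def __init__(self):
        self._d = {}

    def get_state(self, name):
        return self._d[name]

    def set_state(self, name, value):
        self._d[name] = value


class BadState:
    # Wrong argument name, and set_state is missing entirely.
    def get_state(self, key):
        return None
```

Calling `conformance_errors(PlasmaStateInterface, BadState)` reports both problems before any coupled run is attempted, which is the kind of early feedback a machine-checkable definition makes possible.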
Stage 3: Compromises with reality
• Perhaps no code yet provides interface IJK, but we
can modify the interface to something existing codes provide
• Data translators and converters should be separate
components
– As simple as interpolating between two finite difference
mesh representations of a field
– As complex as taking a field represented in spectral space,
handing off to a real space finite element or PIC
representation
– May need to compromise/prioritize on these converters
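The simple end of that range can be sketched as a standalone converter. This assumes 1-D meshes and plain linear interpolation; the function name `interpolate_field` and its arguments are hypothetical, and a real converter would need to handle multi-dimensional meshes and conservation properties.

```python
from bisect import bisect_right


def interpolate_field(src_x, src_f, dst_x):
    """Linearly interpolate field values from one 1-D mesh onto another.

    src_x must be strictly increasing, and every point in dst_x must lie
    inside [src_x[0], src_x[-1]].
    """
    out = []
    for x in dst_x:
        if not src_x[0] <= x <= src_x[-1]:
            raise ValueError(f"point {x} lies outside the source mesh")
        # Index of the source cell [src_x[i], src_x[i + 1]] containing x.
        i = min(bisect_right(src_x, x) - 1, len(src_x) - 2)
        t = (x - src_x[i]) / (src_x[i + 1] - src_x[i])
        out.append((1.0 - t) * src_f[i] + t * src_f[i + 1])
    return out


coarse_x = [0.0, 1.0, 2.0]
coarse_f = [0.0, 10.0, 20.0]
fine_x = [0.0, 0.5, 1.0, 1.5, 2.0]
fine_f = interpolate_field(coarse_x, coarse_f, fine_x)
```

Because the converter is its own unit, neither the producing nor the consuming code needs to know about the other's mesh.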
Stage 4: Implement the interfaces
– This page intentionally blank in IBM-speak
Stage 5: Connect communicating
components via file I/O
1. Component XYZ executes
2. Component XYZ writes data per interface spec
3. Framework service is notified that writing has completed
4. Framework moves files to where “next” component
needs them
5. Component UVW started, reads in data from file per
interface spec
6. Lather, rinse, repeat
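The steps above can be sketched with two toy components and a framework function that stages files between them. Everything here is an assumption made for illustration: the JSON file format, the file names, the directory layout, and the use of a plain function return as the "writing completed" notification.

```python
import json
import shutil
import tempfile
from pathlib import Path


def component_xyz(workdir):
    """Steps 1-2: execute, then write output in the agreed (here JSON) format."""
    out = workdir / "xyz_out.json"
    out.write_text(json.dumps({"time": 0.1, "pressure": [1.0, 2.0, 3.0]}))
    return out


def component_uvw(workdir):
    """Step 5: read the staged input file and compute something from it."""
    data = json.loads((workdir / "uvw_in.json").read_text())
    return sum(data["pressure"])


def framework_run(root):
    """Steps 3-4: the framework sequences components and moves files."""
    xyz_dir, uvw_dir = root / "xyz", root / "uvw"
    xyz_dir.mkdir()
    uvw_dir.mkdir()
    written = component_xyz(xyz_dir)  # returning stands in for the notification
    shutil.move(str(written), str(uvw_dir / "uvw_in.json"))
    return component_uvw(uvw_dir)


def demo():
    """Run the whole chain in a throwaway directory."""
    with tempfile.TemporaryDirectory() as tmp:
        return framework_run(Path(tmp))
```

Neither component knows where the other runs or where its files live; only the framework does.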
Stage 5.5: Connect communicating
components via parallel file I/O
• File I/O carried out in parallel
• File transfer is done via parallel ftp or other
HPC protocol
Stage 6: Live connections w/o
writing/reading to/from hard drives
• A change that should not be visible to the
application user
• Uses virtual serialization
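One possible reading of this stage, sketched under the assumption that "virtual serialization" means components keep writing and reading through the same stream-style interface while the framework substitutes an in-memory buffer for the disk file. The function names and the JSON encoding are hypothetical.

```python
import io
import json


def write_state(stream, state):
    """Producer side: serialize per the interface spec to any text stream."""
    json.dump(state, stream)


def read_state(stream):
    """Consumer side: deserialize from any text stream."""
    return json.load(stream)


def couple_in_memory(state):
    """Framework wiring: an in-memory buffer stands in for the file."""
    buffer = io.StringIO()
    write_state(buffer, state)
    buffer.seek(0)  # rewind so the consumer reads from the start
    return read_state(buffer)
```

The component code is the same whether the stream is an open file or the buffer, which is why the change need not be visible to the application user.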
What do we owe each other?
• Framework using a script/portal interface
– Launch executables on “suitable” platform
– Monitor and inform you of job status (statii?)
– Track and move data as needed
– Connect components that share an interface
• Infrastructure services
– Parallel/serial I/O interface for initial coupling
– Linear solver interface to access advanced solvers
– Centralized simulation time services?
– Shared/reusable data converters?
What do we owe each other?
• The interface definitions
– Done right, this is much harder than you think
– Done right, it will clarify much thinking
• Aids to sharing/running codes by others
– Invoices of what executable components expect and require
for input files, output locations, parameter settings, run-time
libraries, …
– Documentation; codes must be runnable by nonexperts
and as portable as possible
– Version control use, contact person for problems
– Much of what NTCC worked towards, but now for parallel
HPC systems
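A sketch of what such a per-component record might look like in machine-readable form. The class name `ComponentManifest`, its fields, and the example values are hypothetical; the actual contents would be whatever the community agrees a component must declare.

```python
from dataclasses import asdict, dataclass, field
import json


@dataclass
class ComponentManifest:
    """Hypothetical record of what an executable component needs to run."""

    name: str
    version: str
    contact: str
    input_files: list = field(default_factory=list)
    output_dir: str = "."
    parameters: dict = field(default_factory=dict)
    runtime_libraries: list = field(default_factory=list)

    def missing(self, available_files):
        """Return the required input files that have not been staged."""
        return [f for f in self.input_files if f not in available_files]

    def to_json(self):
        """Serialize the record so a framework or another site can read it."""
        return json.dumps(asdict(self), indent=2, sort_keys=True)


manifest = ComponentManifest(
    name="xyz",
    version="1.0",
    contact="someone@example.org",
    input_files=["a.dat", "b.dat"],
)
```

A framework could call `manifest.missing(...)` before launch to tell a nonexpert user exactly which inputs are absent.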
Interfaces are an Investment
• The larger the community, the greater the time &
effort required to obtain agreement
– True in component and non-component environments
– MPI 1.0 took 1.5 years of regular meetings
– CCSM/ESMF are still evolving after 10 years
• Formality of the “standards” process will increase with
time, but the imperative now is to get physics done
along with the software development
Interfaces are an Investment
• Biggerstaff’s Rule of Threes (from Gary Kumfert)
– Must look at at least three systems to understand what is
common (reusable)
– Reusable software requires three times the effort of usable
software
– Payback only after third release
• Bramley’s addendum
– May take three years before system becomes productive
– May provide the team with three times the capabilities of
competitors
• Mike Norman’s Corollary
– Ultimately places apps scientist two years ahead of
competitors