Scalability Panel
James B. White III (Trey)
1
Domain-specific performance metrics
• Simulated years / day
• Efficacy = 1 / (time × error)
- Mousseau, Knoll, Reisner (2002)
• Climate/weather efficacy = skill / time
• Strong scaling
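The metrics on this slide can be made concrete with a small sketch. The function names, argument names, and sample numbers below are illustrative assumptions, not taken from the talk; only the efficacy formula, 1 / (time × error), comes from the slide. Strong-scaling efficiency is computed in the usual way, as speedup divided by the increase in node count at fixed problem size.

```python
SECONDS_PER_DAY = 86400.0


def simulated_years_per_day(simulated_years, wall_clock_seconds):
    """Throughput: simulated years completed per wall-clock day."""
    return simulated_years * SECONDS_PER_DAY / wall_clock_seconds


def efficacy(time, error):
    """Efficacy = 1 / (time x error), the definition given on the slide."""
    return 1.0 / (time * error)


def strong_scaling_efficiency(t_base, n_base, t_n, n):
    """Parallel efficiency at fixed problem size, relative to a baseline run.

    t_base, n_base: run time and node count of the baseline run.
    t_n, n: run time and node count of the scaled run.
    """
    speedup = t_base / t_n
    ideal_speedup = n / n_base
    return speedup / ideal_speedup


if __name__ == "__main__":
    # Hypothetical numbers, chosen only to exercise the formulas.
    print(simulated_years_per_day(10, 2 * SECONDS_PER_DAY))
    print(efficacy(time=4.0, error=0.5))
    print(strong_scaling_efficiency(t_base=100.0, n_base=64, t_n=30.0, n=256))
```

Under this definition, halving either the run time or the error doubles the efficacy, so a slower but more accurate method can still score higher.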
Role of accelerators
• Relatively cheap and available
• Favor move to higher resolution, more physical processes
• Favor use of higher-order methods
- More flops per memory op
- Does high order help with low-order parameterizations and noisy initial and boundary conditions?
• See next slide
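The "more flops per memory op" point can be illustrated with a rough arithmetic-intensity estimate. This is a minimal sketch under stated assumptions, not a model of any particular climate code: it uses a 1D centered finite-difference stencil as a stand-in for a higher-order method, counts one multiply per stencil point plus the adds needed to sum them, and assumes ideal caching so that each grid value is read from memory once and written once. The function name and parameters are hypothetical.

```python
def stencil_arithmetic_intensity(radius, bytes_per_value=8):
    """Estimate flops per byte for a 1D centered stencil of the given radius.

    Assumptions (illustrative only):
      - the stencil has 2*radius + 1 points,
      - each point costs one multiply, and the products are summed,
      - with ideal caching, memory traffic per grid point is one load
        and one store of a value of size bytes_per_value.
    """
    points = 2 * radius + 1
    flops = points + (points - 1)      # multiplies + adds
    bytes_moved = 2 * bytes_per_value  # one load + one store
    return flops / bytes_moved


if __name__ == "__main__":
    for radius in (1, 2, 4):
        print(radius, stencil_arithmetic_intensity(radius))
```

In this simplified count the flops grow with the stencil width while the memory traffic per grid point stays fixed, which is the sense in which higher-order methods suit flop-rich accelerators.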
Themes for HPC application development
• Last 5 years: Scale out
- More nodes
• Next 5 years: Scale in
- Parallelism within node/socket
• 5 years after that: Scale in time?
- Longer global time steps
- Increasing asynchrony
- Redundant and speculative computation
- Hierarchical fault recovery
- Aggressive in situ analysis and data reduction
Programming models
• Hardware will continue to drive software
• HPC architectures keep gaining levels in hierarchy
- Memory and process hierarchies
- Becoming more like a web than a hierarchy?
• Evolution to more levels in programming model
• Driven by availability, visibility, and robustness of implementations
- Not potential for productivity of model
• No new languages (that I know of) show enough promise to justify large-scale refactoring
• See next slide
Characteristics of a new highly productive programming language
• Strong separation of concerns between (1) specification of computation, (2) specification of job, (3) mapping to hardware
• Core language for minimalist specification of computation
- Emphasis on correctness and minimum dependencies (maximum parallelism)
- One copy of code per application version, portable
• Meta-language to specify details of a job
- Input/output, parameters that affect output values
- One copy of code per use case, portable
• Meta-language for mapping computation to hardware
- Doesn't change semantics, doesn't affect correctness
- Doesn't change output values (significantly?)
- Multiple copies of code per computing environment
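The three-way separation above can be mimicked, very loosely, in an existing language. The sketch below uses Python only as a stand-in; it is not the language the slide envisions, and every name in it (`relax_cell`, `JOB`, `serial_map`, `threaded_map`, `run`) is hypothetical. The computation is a pure per-cell function with no dependencies between cells, the job is plain data, and the two mappings differ only in how the independent updates are executed.

```python
from concurrent.futures import ThreadPoolExecutor


# (1) Specification of computation: a pure function of one cell, with no
#     dependencies between cells, so every cell can be updated in parallel.
def relax_cell(value, rate):
    return value + rate * (1.0 - value)


# (2) Specification of job: inputs and parameters that affect output values.
JOB = {"initial": [0.0, 0.25, 0.5, 0.75], "rate": 0.5, "steps": 3}


# (3) Mapping to hardware: how the independent updates are executed.
#     One variant per computing environment; neither changes semantics.
def serial_map(fn, values):
    return [fn(v) for v in values]


def threaded_map(fn, values, workers=2):
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so results line up with cells.
        return list(pool.map(fn, values))


def run(job, mapping):
    """Apply the computation for job['steps'] steps using the given mapping."""
    state = list(job["initial"])
    for _ in range(job["steps"]):
        state = mapping(lambda v: relax_cell(v, job["rate"]), state)
    return state


if __name__ == "__main__":
    print(run(JOB, serial_map))
    print(run(JOB, threaded_map))
```

Swapping `serial_map` for `threaded_map` leaves the output values unchanged here because each cell update is independent. A mapping that reordered a floating-point reduction could change results slightly, which may be what the "(significantly?)" on the slide hints at.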
HPC system attributes
• Deeper hierarchies / more-complex webs
• Zillions of concurrent flops
• Huge data volumes
• Stagnant global latencies