Exploration & Production Technology
delivering breakthrough solutions
SEG HPC Summer Workshop 2011
Richard Clarke

Talk outline
• 10 years ago
• Bounce around the present
  − reflects an industry with lots of changing options
• A possible future?

~10 years ago…
• CM5 just retired
• We were using SGI Origins
• First trials with a Linux cluster
  − 32 dual-cpu computers
  − Not using MPI (did it exist?)
  − "manual" job submission using rsh
  − manual reboot
• Mix of production processing & research
• ~99% utilization rate

HPPC size = f(Moore's law, oil price)
[Chart: growth of Flops, Memory, Disk and I/O bandwidth over Time]

Today
• almost 5000 times bigger than 1999:
  − 450 teraflops
  − 3500 computers with over 40,000 CPUs
  − 230 terabytes of physical memory
  − 15 petabytes of raw disk storage
  − mix of large/medium memory systems
• used for applied research – NOT production processing
  − team of geophysicists, mathematicians, computer scientists
  − focus on quick development & application of new technology
• SGE job submission/monitoring
• Utilization is ~99%

What keeps the computers busy?
• Geophysically:
  − Propagating waves forwards, backwards, up, down…
  − Summing waves and images together
  − Cross-correlating waves and images
  − Solving Inverse Problems
• Mathematically:
  − FFT
  − Finite difference extrapolation
  − Convolutions / Cross correlation (often via FFT; sketched after these slides)
  − Linear algebra
• Computationally, we typically do:
  − something embarrassingly parallel, followed by
  − (maybe) transpose/summation and user QC,…
  − and then something else embarrassingly parallel,…

What keeps the users busy?
• The trivial stuff
  − Book-keeping
  − Summing
  − Sorting
  − Quality Control
  − How best to QC terabytes of data?
  − Diagnostic statistics help, but it is hard to replace the human eye
  − Debugging
  − Users often need to know algorithm implementation details to be able to run a program successfully

What keeps the developers busy?
• Debugging!
• 10 years ago:
  − almost all our codes were written in Fortran77
  − researchers developed codes alone
  − debugging was done with print statements and Sun Workshop
• Today:
  − almost all codes are written in Fortran77
  − a few are written in Fortran2003
  − the "Fortran-2003-ers" are sharing code!
  − parallelization is done with OpenMP and MPI
  − debugging is mostly done with print statements
  − debugging MPI codes is painful, even with debuggers

What do we use the large memory machines for?
• Open/interactive use
  − Shared by lots of users for doing the trivial stuff
  − Need lots of cpus, memory, and for folks to "play nicely"
  − Avoids the "idle nodes waiting for a large MPI job" user frustration
• "Debugging"
  − Initial R&D on large problems
  − It is much easier to debug first, then optimize later with MPI, etc.

Fault Tolerance
• Increasingly an issue
  − more machines means higher risk of faults
  − increasing use of local disk
• Not handled well with current MPI

GPU clusters & Cloud Computing
• Not currently using GPU clusters
  − sponsoring & monitoring external projects
  − large effort required for porting
  − short half-life of the technology, hence a strategy of longer access through vendors
  − energy requirements (we need a new building)
• Cloud computing
  − Upload 50Tb, compute, create 100Tb tmp files, download 50Tb??
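A minimal sketch of the cross-correlation-via-FFT kernel mentioned under "Mathematically" above, for illustration only: it uses NumPy rather than the Fortran codes the talk refers to, and the traces a and b and the helper xcorr_fft are hypothetical names, not anything from the in-house software.

```python
# Illustrative sketch only (NumPy, not the in-house Fortran): cross-correlation
# of two traces computed via the FFT.
import numpy as np

def xcorr_fft(b, a):
    """Cross-correlation of 1-D signals b and a via the frequency domain."""
    n = len(a) + len(b) - 1                 # zero-pad to avoid circular wrap-around
    A = np.fft.rfft(a, n)
    B = np.fft.rfft(b, n)
    return np.fft.irfft(B * np.conj(A), n)  # corr(b, a)[k] = sum_n b[n] * a[n - k]

# Hypothetical example: recover a known time shift between two synthetic traces.
a = np.random.randn(1024)
b = np.roll(a, 17)                          # b is a copy of a delayed by 17 samples
print(np.argmax(xcorr_fft(b, a)))           # correlation peak at lag 17
```

Zero-padding to len(a)+len(b)-1 turns the circular FFT correlation into a linear one, which is the form normally wanted when scanning for time shifts between wavefields.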
Challenges & Initial Conclusions
• Still lots of problems that are too expensive to run
  − Fewer approximations (anisotropic, elastic, visco-elastic)
  − Finer scale, larger datasets
• Still lots of problems that require lots of memory
  − Local disk use is growing as a stopgap
  − fault tolerance is an issue
• Visualizing large (many-terabyte) datasets
  − Viz clusters are an option, but expensive for lots of users
• Steep learning curve for developing parallel applications
• The standard conclusions: bigger, faster, easier, please.

Let's have some fun…
• Connectivity to the end "user"
  − Keeping computers busy does not directly generate $
  − Need to get concise data to the end decision maker
• Let's look at some examples:
  − A past example
  − An almost-there example
  − A futuristic example…

A field example: Valhall
• ~10 years ago
  − Processed the 1st 3D OBC survey at Valhall
  − Start-to-finish processing times were about 12-18 months
  − 2 months to migrate ~25Gb of data on 2 x 32-cpu SGI machines
  − Chopped the input data into pieces, created pieces of the output, then summed them (this pattern is sketched at the end)
  − Lots of approximations to speed it up: isotropic, coarse traveltime grids,…
• In 2003 the idea of permanently installed sensors materialized…
  − The volume of data exploded: the first survey was 7Tb
  − A new survey is acquired every 4-6 months

Valhall Life of Field Seismic (LoFS), 2003-present
[Workflow diagram labels: Data Recording, Pre-processing on the platform, Data transfer, Imaging]
• 3.5Tb of data per survey
• Fastest turnaround: 4 days
• Still in use: 13 surveys acquired to date
• Enabled cross-discipline integration

Now an "almost there" example…

Velocity Model Building
• Salt model building is still intensely manual
• A recent project imaged ~100 different models
• Modelling uncertainties

A futuristic example…

There's an app for that…
• accelerometers
• wireless networks
• commodity

If iPhones could fly…
• Combine with the $20 toy from the local mall…
• Robotization research proposal from U. Delft
• Perfect for rough terrain
• Similarly for marine:
  − this AUV from Rutgers U. crossed the Atlantic

…what if thousands could fly in formation…
• senseable.mit.edu/flyfire

…add HPC to get 'imaging adaptive surveys'
[Diagram: Image → Redeploy sensors → Image]
• complements rapid ISS acquisition

A few issues…
Coupling…
Any Hitchcock fans?

Conclusions
• "Build it and they will come"
  − demand continues to grow
  − we are still limited by compute power & memory
• We need simpler tools for:
  − development of parallel applications
  − fault tolerance
• Connectivity to end users is important!

Acknowledgements
• I would like to thank
  − bp for permission to give this presentation,
  − the organizers for inviting me to talk today,
  − you for listening.
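As an illustration of the computational pattern described earlier (something embarrassingly parallel, followed by a summation) and of the Valhall-era "chop the input into pieces, create pieces of the output, then sum" workflow, here is a minimal sketch. It is not code from the talk: Python's multiprocessing stands in for a cluster scheduler, and migrate_piece is a hypothetical placeholder for a real migration kernel.

```python
# Illustrative sketch only: chop the input into pieces, process each piece
# independently (embarrassingly parallel), then sum the partial outputs.
import numpy as np
from multiprocessing import Pool

N_OUT = 64                                   # size of the common output "image" grid

def migrate_piece(piece):
    """Hypothetical stand-in kernel: each input piece contributes to the same grid."""
    image = np.zeros(N_OUT)
    image += piece.sum() / N_OUT             # placeholder contribution
    return image

def run(data, n_pieces=8, n_workers=4):
    pieces = np.array_split(data, n_pieces)          # chop the input into pieces
    with Pool(n_workers) as pool:
        partials = pool.map(migrate_piece, pieces)   # embarrassingly parallel step
    return np.sum(partials, axis=0)                  # sum the partial images

if __name__ == "__main__":
    print(run(np.ones(1000))[:4])
```

In practice each independent piece would run as its own cluster job (e.g. under a scheduler such as SGE), with the summation performed as a final reduction step once all pieces have completed.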