klasky

Information about klasky

Published on November 20, 2007

Author: Charlo

Source: authorstream.com

Content

On the Path to Petascale: Top Challenges to Scientific Discovery

Scott A. Klasky, NCCS Scientific Computing End-to-End Task Lead

1. Code Performance

From 2004 to 2008, computing power for codes like GTC will go up 3 orders of magnitude!
2 paths to petascale computing for most simulations. More physics. Larger problems. Code coupling.
My personal definition of leadership-class computing: “Simulation runs on >50% of cores, running for >10 hours.”
One ‘small’ simulation will cost $38,000 on a Pflop computer.
Science scales with processors.
XGC and GTC fusion simulations will run on 80% of cores for 80 hours ($400,000/simulation).

Data Generated

MTF will be ~2 days.
Restarts contain critical information to replay the simulation at different times.
Typical restarts = 1/10 of memory, dumped every 1 hour (the Big 3 apps support this claim).
Analysis files dump every physical timestep, typically every 5 minutes of simulation.
Analysis files vary. We estimate that for ITER-size simulations, data output will be roughly 1 GB per 5 minutes.
Demand: I/O < 5% of calculation.
Total simulation will potentially produce 1280 TB + 960 GB.
Need > (16*1024+12)/(3600 * 0.05) = 91 GB/sec.
Asynchronous I/O is needed! The Big 3 apps (combustion, fusion, astro) allow buffers.
This reduces the I/O rate to (16*1024+12)/3600 = 4.5 GB/sec (with lower overhead).
Get the data off the HPC system and over to another system!
Produce HDF5 files on another system (too expensive for the HPC system).

Workflow automation is desperately needed (with high-speed data-in-transit techniques)

Need to integrate autonomics into workflows.
Need to make it easy for the scientists.
Need to make it fault tolerant/robust.

A few days in the life of Sim Scientist: Day 1, morning
8:00AM Get coffee. Check to see if job is running. Ssh into jaguar.ccs.ornl.gov (job 1). Ssh into seaborg.nersc.gov (job 2) (this is running, yea!). Run gnuplot to see if the run is going OK on seaborg. This looks OK.
9:00AM Look at data from old run for post-processing. Legacy code (IDL, Matlab) to analyze most data. Visualize some of the data to see if there is anything interesting. Is my job running on jaguar? I submitted this 4K-processor job 2 days ago!
10:00AM Scp some files from seaborg to my local cluster. Luckily I only have 10 files (which are only 1 GB/file).
10:30AM First file appears on my local machine for analysis. Visualize data with Matlab. Seems to be OK.
11:30AM See that the second file had trouble coming over. Scp the files over again… Dohhh

Day 1 evening

1:00PM Look at the output from the second file. Oops, I had a mistake in my input parameters. Ssh into seaborg, kill job. Emacs the input, submit job. Ssh into jaguar, see status. Cool, it’s running. Bbcp 2 files over to my local machine (8 GB/file). Gnuplot data. This looks OK too, but still need to see more information.
1:30PM Files are on my cluster. Run Matlab on HDF5 output files. Looks good. Write down some information in my notebook about the run. Visualize some of the data. All looks good. Go to meetings.
4:00PM Return from meetings. Ssh into jaguar. Run gnuplot. Still looks good. Ssh into seaborg. My job still isn’t running…
8:00PM Are my jobs running? Ssh into jaguar. Run gnuplot. Still looks good. Ssh into seaborg. Cool. My job is running. Run gnuplot. Looks good this time!

And Later

4:00AM Yawn… is my job on jaguar done? Ssh into jaguar. Cool. Job is finished. Start bbcp of files over to my work machine (2 TB of data).
8:00AM @@!# Bbcp is having troubles. Resubmit some of my bbcp transfers from jaguar to my local cluster.
8:00AM (next day) Oops, still need to get the rest of my 200 GB of data over to my machine.
3:00PM My data is finally here!
Run Matlab. Run Ensight. Oops… Something’s wrong! Where did that instability come from?
6:00PM Finish screaming!

Need metadata integrated into the high-performance I/O, and integrated for simulation monitoring

Typical monitoring: Look at volume-averaged quantities. At 4 key times this quantity looks good. Code had 1 error which didn’t appear in the typical ASCII output used to generate this graph. Typically users run gnuplot/grace to monitor output.
More advanced monitoring: In 5 seconds, move 600 MB and process the data. Really need to use FFT for 3D data, and then process data + particles. In 50 seconds (10 time steps), move and process data. 8 GB for 1/100 of the 30 billion particles. Demand low overhead, <5%!

Parallel Data Analysis

Most applications use scalar data analysis: IDL, Matlab, NCAR Graphics.
Need techniques such as PCA.
Need help, since data analysis is written quickly and changed often… No hardened versions… Maybe…

Statistical Decomposition of Time-Varying Simulation Data

Transform to reduce non-linearity in distribution (often density-based).
PCA computed via SVD (or ICA, FA, etc.).
Construction of component movies.
Interpretation of spatial, time, and movie components.
Pairs of equal singular values indicate periodic motion.
G. Ostrouchov
ETG GTC simulation data (Z. Lin, UCI and S. Klasky, ORNL).
Decomposition shows transitions between wave components in time.

New Visualization Challenges

Finding the needle in the haystack.
Feature identification/tracking!
Analysis of 5D+time phase-space (with 1x10^12 particles)!
Real-time visualization of codes during execution.
Debugging visualization.

Where is my data?

ORNL, NERSC, HPSS (NERSC, ORNL), local cluster, laptop?
We need to keep track of multiple copies.
We need to query the data.
Query-based visualization methods.
Don’t want to distinguish between different disks/tapes.
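
The bandwidth figures on the "Data Generated" slide can be reproduced with a few lines of arithmetic. The sketch below is only a worked check of the slide's numbers, not anything from the talk itself: the 16 TB hourly restart size is inferred from the slide's (16*1024+12) term, and every constant and function name is illustrative.

```python
# Worked check of the I/O estimates from the "Data Generated" slide.
# Assumption: one restart dump is 16 TB (inferred from the 16*1024 GB term).

RESTART_GB_PER_HOUR = 16 * 1024   # one restart dump per hour, in GB
ANALYSIS_GB_PER_HOUR = 12         # 1 GB every 5 minutes -> 12 GB per hour
RUN_HOURS = 80                    # length of the large fusion run
IO_BUDGET = 0.05                  # I/O may take at most 5% of wall-clock time


def required_bandwidth_gb_s(gb_per_hour: float, io_fraction: float) -> float:
    """Bandwidth needed to write gb_per_hour within io_fraction of each hour."""
    return gb_per_hour / (3600 * io_fraction)


hourly_gb = RESTART_GB_PER_HOUR + ANALYSIS_GB_PER_HOUR

total_restart_tb = RESTART_GB_PER_HOUR * RUN_HOURS / 1024
total_analysis_gb = ANALYSIS_GB_PER_HOUR * RUN_HOURS

# Synchronous I/O: all writes must fit into 5% of each hour.
sync_gb_s = required_bandwidth_gb_s(hourly_gb, IO_BUDGET)
# Asynchronous I/O: buffered writes can be spread over the whole hour.
async_gb_s = required_bandwidth_gb_s(hourly_gb, 1.0)

print(f"total output: {total_restart_tb:.0f} TB + {total_analysis_gb} GB")
print(f"synchronous:  {sync_gb_s:.1f} GB/s")
print(f"asynchronous: {async_gb_s:.2f} GB/s")
```

The factor of 20 between the two rates is just 1/0.05: buffering lets the same hourly volume drain over the full hour instead of the 5% window.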
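
The remark on the statistical decomposition slide that pairs of equal singular values indicate periodic motion can be illustrated on synthetic data. The sketch below is not the presenters' analysis code: it assumes NumPy, uses a made-up travelling wave and standing wave in place of GTC output, introduces a hypothetical helper `pca_singular_values`, and skips the distribution-flattening transform mentioned on the slide.

```python
import numpy as np

# Synthetic stand-in for simulation output on a periodic grid.
# Rows are time snapshots, columns are spatial points.
nt, nx = 64, 128
t = np.linspace(0.0, 2.0 * np.pi, nt, endpoint=False)
x = np.linspace(0.0, 2.0 * np.pi, nx, endpoint=False)

# A travelling wave (periodic motion) and a standing wave for contrast.
travelling = np.cos(3.0 * x[None, :] - 2.0 * t[:, None])
standing = np.cos(3.0 * x[None, :]) * np.cos(2.0 * t[:, None])


def pca_singular_values(snapshots: np.ndarray) -> np.ndarray:
    """Return the PCA singular values of a (time, space) snapshot matrix."""
    # PCA via SVD: remove the time mean at each spatial point, then decompose.
    centered = snapshots - snapshots.mean(axis=0, keepdims=True)
    return np.linalg.svd(centered, compute_uv=False)


s_travel = pca_singular_values(travelling)
s_stand = pca_singular_values(standing)

print("travelling wave:", np.round(s_travel[:3], 3))
print("standing wave:  ", np.round(s_stand[:3], 3))
```

For the travelling wave the two leading singular values coincide, because the motion splits into a cosine and a sine component of equal weight in quadrature; the standing wave needs only one component.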
