991112 price grid


Published on October 24, 2007

Author: Miguel

Source: authorstream.com

Content

Grid Computing for HEP
L. E. Price, Argonne National Laboratory
HEP-CCC Meeting, CERN, November 12, 1999

The Challenge
- Providing rapid access to event samples and subsets from massive datastores, growing from hundreds of Terabytes in 2000 to 100 Petabytes by 2010
- Transparent access to computing resources, throughout the U.S. and throughout the world
- The extraction of small or subtle new physics signals from large and potentially overwhelming backgrounds
- Enabling access to the data, and to the rest of the physics community, across an ensemble of networks of varying capability and reliability, using heterogeneous computing resources

Achieving a Balance
- Proximity of the data to central computing and data handling resources
- Proximity of frequently accessed data to the users, to be processed on desktops, in local facilities, or at regional centers
- Making efficient use of limited network bandwidth, especially transoceanic
- Making appropriate use of regional and local computing and data handling
- Involving scientists and students in each world region in the physics analysis

Need for Optimization
- Meeting the demands of hundreds of users who need transparent access to local and remote data in disk caches and tape stores
- Prioritizing hundreds to thousands of requests from the local and remote communities
- Structuring and organizing the data; providing the tools for locating, moving, and scheduling data transport between tape and disk and across networks
- Ensuring that the overall system is dimensioned correctly to meet the aggregate need

Science and Massive Datasets
Massive dataset generation is the new norm in science:
- High Energy Physics
- Nuclear Physics
- LIGO
- Automated astronomical scans (e.g., Sloan Digital Sky Survey)
- The Earth Observing System (EOS)
- The Earth System Grid
- Geophysical data (e.g., seismic)
- Satellite weather image analysis
- The Human Brain Project (time series of 3-D images)
- Protein Data Bank
- The Human Genome Project
- Molecular structure crystallography data

Proposed Solution
A data analysis grid for High Energy Physics.

Analogy to Computing Grid
- Because the resources needed to solve complex problems are rarely collocated
- A topic of intensive computer science research for a number of years already
- Computing (or data) resources from a "plug on the wall"

Why a Hierarchical Data Grid?
Physical reasons:
- Appropriate resource use: data proximity to users and labs
- Efficient network use: local > regional > national > oceanic
- Scalable growth: avoid bottlenecks
Human reasons:
- A central lab cannot manage, help, or care about thousands of users
- Cleanly separates the functionality of different resource types
- University and regional computing complements the national labs and their funding agencies
- Easier to leverage resources, maintain control, and assert priorities at the regional/local level
- Effective involvement of scientists and students independently of location
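
As an illustration of the tiered access pattern behind "Why a Hierarchical Data Grid?", the sketch below resolves a dataset to the nearest tier that already holds a replica, falling back from local to regional to national to transoceanic resources. It is a toy Python example; the tier names, replica catalog, and dataset names are invented for illustration and are not part of the original proposal.

```python
# Toy resolver for the hierarchical data grid idea: prefer the closest tier
# that already holds a replica of the requested dataset. All names and the
# catalog contents below are hypothetical.

from typing import Optional

# Tiers ordered from closest (cheapest to reach) to farthest.
TIER_ORDER = ["desktop", "university", "regional-center", "national-lab", "cern"]

# Hypothetical replica catalog: logical dataset name -> tiers holding a copy.
REPLICA_CATALOG = {
    "higgs-candidates-1999": {"national-lab", "cern"},
    "minbias-sample-042": {"university", "regional-center", "cern"},
}

def nearest_replica(dataset: str) -> Optional[str]:
    """Return the closest tier holding `dataset`, or None if it is not cached anywhere."""
    holders = REPLICA_CATALOG.get(dataset, set())
    for tier in TIER_ORDER:
        if tier in holders:
            return tier
    return None  # would trigger a staging request up the hierarchy

if __name__ == "__main__":
    for name in REPLICA_CATALOG:
        print(f"{name}: read from {nearest_replica(name)}")
```
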
Logical Steps toward Data Grid
[Timeline, 1995-2010: Basic Research -> Testbeds -> Design/Optimization -> (Pre)Production -> Production]

U.S. Grid Technology Projects
[Timeline, 1995-2010: PASS/Globus/HENP-GC/MONARC/GIOD/Nile -> Clipper/NGI-PPDG -> APOGEE -> LHC, GriPhyN]

In Progress
- "Business as usual": laboratory- and experiment-specific development, deployment and operation (hardware and software); tool development in HENP, Computer Science, and industry
- The Particle Physics Data Grid: an NGI-funded project aiming (initially) at jump-starting the exploitation of CS and HENP software components to make major improvements in data access

Proposals being Developed
- GriPhyN (Grid Physics Networking): targeted at NSF; focus on the long-term university-based grid infrastructure for major physics and astronomy experiments
- APOGEE (A Physics-Optimized Grid Environment for Experiments): targeted at DoE HENP (and/or DoE SSI); focus on medium- to long-term software needs for HENP distributed data management; initial focus on instrumentation, modeling and optimization

PPDG, APOGEE and GriPhyN
- A coherent program of work
- Substantial common management proposed
- A focus for HENP collaboration with Computer Science and industry
- PPDG/APOGEE will create the "middleware" needed by data-intensive science, including the LHC (synergy but no overlap with CMS/ATLAS planning)

Data Grid Projects in Context (build-up diagram, consolidated)
- Construction and operation of HENP data management and data analysis systems at DoE laboratories (Tiers 0/1): >> $20M/yr of existing funding at HENP labs; e.g., SLAC FY1999: ~$7M equipment for BaBar (of which < $2M physics CPU), plus ~$3M labor and M&S
- GriPhyN: HENP data management at major university centers (Tier 2); draft proposal for NSF funding of $5-16M/year ($16M = $8M hardware + $5M labor/R&D + $3M network)
- Widely applicable technology and computer science (not only from HENP; hundreds of non-HEP FTEs): OO databases and analysis tools, resource management tools, metadata catalogs, WAN data movers, mass storage management systems, matchmaking
- PPDG (Particle Physics Data Grid, an NGI project): large-scale tests and services focused on use of existing components
- APOGEE: unified project management, optimization and evaluation, instrumentation, modeling and simulation; a new level of rigor as the foundation for future progress; R&D plus contacts with CS/industry, long-term goals, testbeds

Overall Program Goal
A coordinated approach to the design and optimization of a data analysis grid for HENP experiments.

Particle Physics Data Grid
- Universities, DoE accelerator labs, DoE Computer Science
- Funded by DoE-NGI at $1.2M for the first year

PPDG Collaborators (Particle Physics / Accelerator Laboratory / Computer Science)
- ANL: X / - / X
- LBNL: X / - / X
- BNL: X / X / x
- Caltech: X / - / X
- Fermilab: X / X / x
- Jefferson Lab: X / X / x
- SLAC: X / X / x
- SDSC: - / - / X
- Wisconsin: - / - / X

First Year PPDG Deliverables
Implement and run two services in support of the major physics experiments at BNL, FNAL, JLAB, SLAC:
- "High-Speed Site-to-Site File Replication Service": data replication at up to 100 Mbytes/s
- "Multi-Site Cached File Access Service": based on deployment of file-cataloging, transparent cache-management, and data-movement middleware; first-year goal is optimized cached read access to files in the range of 1-10 Gbytes, from a total data set of order one Petabyte
Both services use middleware components already developed by the proponents.
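
To make the second deliverable concrete, here is a minimal sketch of what a multi-site cached file access layer does: look a logical file name up in a catalog, stage the file into a local disk cache on first access, and serve later reads from the cache. The catalog entries, paths, and the use of a plain file copy in place of a WAN data mover are assumptions for illustration, not PPDG code.

```python
# Minimal sketch of transparent cached file access across sites.
# The catalog, cache location, and copy mechanism are placeholders.

import shutil
from pathlib import Path

CACHE_DIR = Path("/tmp/ppdg-cache")          # hypothetical local disk cache
CATALOG = {                                  # logical file name -> remote replica path
    "run1999/events-000123.db": "/remote/primary-site/store/events-000123.db",
}

def open_cached(logical_name: str):
    """Open `logical_name`, staging it into the local cache if it is not already there."""
    local = CACHE_DIR / logical_name
    if not local.exists():
        remote = CATALOG[logical_name]       # catalog lookup
        local.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy(remote, local)           # stands in for a WAN data mover
    return open(local, "rb")                 # every later read is a local cache hit
```
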
PPDG Site-to-Site Replication Service
- Network protocols tuned for high throughput
- Use of DiffServ for (1) predictable, high-priority delivery of high-bandwidth data streams and (2) reliable background transfers
- Use of integrated instrumentation to detect, diagnose, and correct problems in long-lived high-speed transfers [NetLogger plus DoE/NGI developments]
- Coordinated reservation/allocation techniques for storage-to-storage performance

PPDG Multi-Site Cached File Access System
[Architecture diagram: a primary site (data acquisition, tape, CPU, disk, robot) linked to satellite sites (tape, CPU, disk, robot) and universities (CPU, disk, users)]

PPDG Middleware Components
[Diagram of middleware components]

APOGEE: Focus on Instrumentation and Modeling
- Planned proposal to DOE, originally targeted at SSI
- Roughly the same collaborators as PPDG
- Intended to be the next step after PPDG

Understanding Complex Systems (Writing into the BaBar Object Database at SLAC)
- Aug. 1: ~4.7 Mbytes/s
- Oct. 1: ~28 Mbytes/s
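
The jump from ~4.7 to ~28 Mbytes/s above is the kind of behaviour the proposed instrumentation is meant to expose. The sketch below shows one way to wrap a long-lived transfer with timestamped progress events and report sustained throughput, loosely in the spirit of the NetLogger-style monitoring mentioned earlier; the event names and chunk size are illustrative choices, not part of the original design.

```python
# Sketch of instrumenting a long-lived transfer with timestamped events
# and a sustained-throughput summary. Event names and chunk size are assumptions.

import time

def instrumented_copy(src, dst, chunk_size=4 * 1024 * 1024, log=print):
    """Copy the file object `src` to `dst`, logging progress and final throughput."""
    total = 0
    start = time.time()
    log(f"{start:.3f} transfer.start")
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        dst.write(chunk)
        total += len(chunk)
        log(f"{time.time():.3f} transfer.progress bytes={total}")
    elapsed = time.time() - start
    rate = (total / elapsed / 1e6) if elapsed > 0 else 0.0
    log(f"{time.time():.3f} transfer.end bytes={total} mbytes_per_s={rate:.1f}")
    return total
```
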
APOGEE Manpower Requirements (FTE), by fiscal year FY00 / FY01 / FY02 / FY03 / FY04
Instrumentation:
- Low-level data capture: 0.5 / 1 / 0.75 / 0.75 / 0.75
- Filtering and collecting agents: 0.5 / 1 / 1 / 1 / 1
- Data analysis and presentation: 0.5 / 1 / 1 / 0.75 / 0.75
- HENP workload profiling: 0.5 / 1 / 0.5 / 0.5 / 0.5
Simulation:
- Framework design and development: 1 / 2 / 1.5 / 1 / 0.5
- User workload simulation: 0.5 / 1 / 0.75 / 0.75 / 0.5
- Component simulations (network, mass-storage system, object DB, etc.): 1.25 / 2.5 / 2 / 1.5 / 1
- Site simulation packages: - / - / 1 / 1 / 1
Instrumentation/Simulation Testbed:
- Instrumentation of existing experiment(s) (e.g., PPDG): 0.5 / 1 / 1 / 1 / 1
- Acquire and simulate performance measurements: 0.25 / 0.5 / 0.5 / 0.75 / 1
- Acquire user workload profile: 0.25 / 0.5 / 0.5 / 0.25 / 0.25
- Test prediction and optimization: - / - / 0.5 / 0.75 / 0.75
Evaluation and Optimization:
- Quantify evolving needs of physics (including site policies, etc.): 0.25 / 0.5 / 0.5 / 0.5 / 0.5
- Develop metrics for usefulness of data management facilities: 0.5 / 1 / 1 / 1 / 1
- Optimize model systems: - / - / 0.5 / 1 / 1.5
Long-Term Strategy (Towards "Virtual Data"):
- Tracking and testing HENP/CS/Industry developments: 1 / 2 / 1.5 / 1.5 / 1.5
- Development projects in collaboration with HENP/CS/Industry: - / - / 0.5 / 1 / 1.5
Project Management (APOGEE and PPDG):
- Project leader (physicist): 0.5 / 1 / 1 / 1 / 1
- Lead computer scientist: 0.5 / 1 / 1 / 1 / 1
TOTALS: 8.5 / 17 / 17 / 17 / 17

APOGEE Funding Needs ($k), by fiscal year FY00 / FY01 / FY02 / FY03 / FY04
Manpower:
- Instrumentation: 250 / 500 / 406 / 375 / 375
- Simulation: 344 / 688 / 656 / 531 / 375
- Instrumentation/Simulation Testbed: 125 / 250 / 313 / 344 / 375
- Evaluation and Optimization: 94 / 188 / 250 / 313 / 375
- Long-Term Strategy (Towards "Virtual Data"): 125 / 250 / 250 / 313 / 375
- Project Management (APOGEE and PPDG): 225 / 450 / 450 / 450 / 450
Commercial software: 100 / 250 / 375 / 500 / 500
Testbed hardware (in addition to parasitic use of production systems): 150 / 400 / 400 / 400 / 400
Workstations, M&S, Travel: 128 / 255 / 255 / 255 / 255
TOTALS: 1540 / 3230 / 3355 / 3480 / 3480

GriPhyN Proposal
- Addresses several massive dataset problems: ATLAS, CMS, LIGO, Sloan Digital Sky Survey (SDSS)
- Tier 2 computing centers (university based): hardware (commodity CPU / disk / tape) and system support
- Networking: transatlantic link to CERN ("high-speed"); multi-gigabit/sec Tier 2 backbone
- R&D: leverage Tier 2 plus existing resources into a grid
- Computer Science partnership, software

GriPhyN Goals
- Build a production grid that exploits all computing resources most effectively
- Enable US physicists to participate fully in the LHC program (also LIGO, SDSS): eliminate the disadvantage of not being at CERN; early physics analysis at LHC startup
- Maintain and extend US leadership
- Build collaborative infrastructure for students and faculty: a training ground for the next generation of leaders

Tier 2 Regional Centers
- Total number ≈ 20 (ATLAS: 6, CMS: 6, LIGO: 5, SDSS: 2-3)
- Flexible architecture and mission; complements the national labs
- Intermediate-level data handling; makes regional collaborations possible
- Well suited to universities (training, mentoring and education)
- Scale: Tier 2 = (university × laboratory)^(1/2); in one scenario the Tier 2 centers together equal Tier 1, so each Tier 2 is roughly 20% of Tier 1
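
The sizing rule above is a geometric mean. As a worked illustration (with invented capacities in arbitrary units, since the slide gives no absolute numbers), a Tier 2 center sized between a university of capacity 10 and a Tier 1 laboratory of capacity 250 comes out at 50, i.e. the roughly 20% of Tier 1 quoted on the slide:

```python
# Worked example of the Tier 2 sizing rule: Tier 2 = sqrt(university * laboratory).
# The capacities below are hypothetical, in arbitrary units.

from math import sqrt

university_capacity = 10.0   # assumed
tier1_lab_capacity = 250.0   # assumed

tier2_capacity = sqrt(university_capacity * tier1_lab_capacity)
print(f"Tier 2 scale: {tier2_capacity:.0f} "
      f"(~{tier2_capacity / tier1_lab_capacity:.0%} of Tier 1)")
```
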
GriPhyN Funding (Very Rough)

R&D Proposal: $15M (Jan. 1999)
- R&D goals (complementary to APOGEE/PPDG): data and resource management over the wide area; fault-tolerant distributed computing over the LAN; high-speed networks as they relate to data management; grid testbeds (with end users)
- Simulations crucial to success: MONARC group, together with APOGEE/PPDG
- Leverage the resources available to us: strong connections with Computer Science people, existing R&D projects, commercial connections

Grid Computing: Conclusions
- HENP is at the frontier of Information Technology: collaboration with Computer Science, collaboration with industry, outreach to other sciences
- Visibility (and scrutiny) of HENP computing
- Enabling revolutionary advances in data analysis in the LHC era
- Increasing the value of the vital investment in experiment-specific data-analysis software
