Tree of Life 3 11 03

Information about Tree of Life 3 11 03

Published on August 11, 2007

Author: WoodRock

Source: authorstream.com

Content

The Tree of Life: Challenges for Discrete Mathematics and Theoretical Computer Science:
Fred S. Roberts
DIMACS, Rutgers University

Slide2:
The tree of life problem raises new challenges for mathematics and computer science, just as it does for biological science.

Slide3:
For math and CS to become more effectively utilized, we need to:
- develop new tools;
- establish working partnerships between mathematical scientists and biological scientists;
- introduce the two communities to each other’s problems, language, and tools;

Slide4:
- introduce outstanding junior researchers from both sides to the issues, problems, and challenges arising from the tree of life;

Slide5:
- involve biological and mathematical scientists together to define the agenda and develop the tools of this field.

Slide6:
These are some of the motivations for this meeting. I will lay out some of the challenges for math and CS, with emphasis on discrete math (DM) and theoretical CS (TCS).

What are DM and TCS?:
DM deals with:
- arrangements
- designs
- codes
- patterns
- schedules
- assignments

TCS deals with the theory of computer algorithms:
During the first 30-40 years of the computer age, TCS, aided by powerful mathematical methods, had a direct impact on technology, by developing models, data structures, algorithms, and lower bounds that are now at the core of computing.

Slide9:
DM and TCS have found extensive use in many areas of science and public policy, for example in Molecular Biology.
These tools seem especially relevant to problems of the tree of life.

DM and TCS Continued:
These tools are made especially relevant to the tree of life problem because of:
- Geographic Information Systems
- Availability of large and disparate computerized databases on subjects relating to species, and the relevance of modern methods of data mining.

Outline:
- Phylogenetic Tree Reconstruction
- Database Issues
- Nomenclature
- Setting up a Species Bank
- Digitization of Natural History Collections
- Interoperability
- The Many Applications of Research on the Tree of Life

Phylogenetic Tree Reconstruction:

Phylogeny (continued):
New methods of phylogenetic tree reconstruction owe a significant amount to modern methods of DM/TCS. Trees, supertrees, and consensus trees will all be discussed at length in this meeting. I will only make a few brief remarks about them.

Phylogenetic Challenges for DM/TCS:
- Tailoring phylogenetic methods to describe the idiosyncrasies of viral evolution -- going beyond a binary tree with a small number of contemporaneous species appearing as leaves.
- Dealing with trees of thousands of vertices, many of high degree.
- Making use of data about species at internal vertices (e.g., when data comes from serial sampling of patients).
- Network representations of evolutionary history, needed if recombination has taken place.
- Modeling viral evolution by a collection of trees -- to recognize the 'quasispecies' nature of viruses.
- Devising fast methods to average the quantities of interest over all likely trees.
Thanks to Eddie Holmes and Mike Steel for ideas.
DIMACS Working Group on Phylogenetic Trees and Rapidly Evolving Diseases, Sept. 3-6, 2003

Database Issues:
Assembling the tree of life requires collecting massive amounts of data about the world’s scientific species. Making it a collaborative project requires making such data universally available. There are great challenges for Math and CS, specifically DM and TCS. Thanks to the Global Biodiversity Information Facility (GBIF) for many of the following ideas.

Complexity of Data:
In many ways, data about the world’s species are far more complex than genetic or protein sequence data. (GBIF)
There are databases of images, databases in numerous forms, etc. Data is heterogeneous. Data has errors and inconsistencies.

Nomenclature:
There are some 1.75M named species. By some estimates, there are up to 10M actual species.
The same species is often named more than once. On average, each species has two additional names (synonyms) besides its own name. (GBIF)
Thus, there is a need to assemble names in an electronic catalogue, with synonyms and common misspellings. This would be of fundamental importance in aiding research on biodiversity.
Because of errors, one major challenge for TCS is data cleaning.
Another challenge is to search a database to see if two entries are similar. This is a standard problem in database theory. TCS algorithms involving k-nearest neighbor and other methods are very helpful here.

Setting up a Species Bank:
A species bank would provide not only names, but also data about a species:
- Type
- Distribution
- Ecological role
- Phylogenetic history
- Physiology
- Genomics
This involves issues about huge datasets.
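
To make the Nomenclature challenge concrete (an electronic catalogue with synonyms and common misspellings, plus a search for similar entries), here is a minimal Python sketch. It is not a method from the talk. It stands in for a nearest-neighbor search by returning the single closest catalogue entry under the string-similarity ratio from the standard library's difflib. The catalogue contents, the function names build_index and resolve_name, and the 0.8 cutoff are all illustrative assumptions.

```python
from difflib import get_close_matches

# Illustrative catalogue (an assumption, not data from the talk):
# accepted name -> synonyms and misspellings recorded for it.
CATALOGUE = {
    "Puma concolor": ["Felis concolor", "Puma concolour"],
    "Panthera leo": ["Felis leo"],
}


def build_index(catalogue):
    """Map every known spelling (lower-cased) to its accepted name."""
    index = {}
    for accepted, variants in catalogue.items():
        index[accepted.lower()] = accepted
        for variant in variants:
            index[variant.lower()] = accepted
    return index


def resolve_name(query, index, cutoff=0.8):
    """Return the accepted name for a possibly misspelled query, or None.

    Exact hits (including listed synonyms) are answered directly; otherwise
    the single most similar known spelling is used if it clears `cutoff`.
    """
    key = query.strip().lower()
    if key in index:
        return index[key]
    matches = get_close_matches(key, list(index), n=1, cutoff=cutoff)
    return index[matches[0]] if matches else None


if __name__ == "__main__":
    idx = build_index(CATALOGUE)
    for name in ["Felis concolor", "Pantera leo", "Homo sapiens"]:
        print(name, "->", resolve_name(name, idx))
```

A linear scan like this would not scale to millions of names. It only illustrates the lookup itself; the indexing structures that make such searches fast are where the TCS work the slides call for would come in.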

Setting up a Species Bank (cont’d):
NASA earth science satellites alone beam home image data at the rate of 1.2 terabytes a day. By 2010, this is expected to grow to 10 petabytes a day. (Kathleen Bergen, U. Michigan)
The problem is even worse: We need to combine information from many databases. There is no known way to catalogue all species of plants in one place given current database systems techniques. (Jessie Kennedy, Napier University, Edinburgh)
One possible approach: Tree and graph methods to support overlapping classifications as directed acyclic graphs or with complex objects (taxa or specimens) as nodes. (Jessie Kennedy)

Digitizing Natural History Collections:
It has been estimated that there are between 1.5 and 3 billion specimens in the world’s natural history collections, including herbaria, living microorganism stock centers, and other repositories (GBIF).
If we could digitize information about these specimens, and make them available, we would 'have a treasure trove of information about the world’s biota.' (GBIF)
Pilot projects have shown that utilizing digitized data from several institutions’ databases can be a powerful tool.
(GBIF)
- Challenge: Digitization and reference of non-standard data (photos, sonograms, field notes).
- Challenge: Develop methods for visualizing the data (e.g., species’ distributions).
- Challenge: Develop search engines for real-time searching of such extremely large data sets.
- Challenge: Make information access on the web more knowledge-based so humans and intelligent software can work together. (Susan Gauch, U. Kansas)
- Challenge: Use 'intelligent agents' to organize and present relevant information on the web. (Susan Gauch)
- Challenge: Use partial information as 'training data' for classification algorithms. (Susan Gauch)
One approach: Use training data and classification algorithms with learning capabilities. (See: DIMACS project on Monitoring Message Streams)
Another approach to problems posed by digitization: Use tools of 'knowledge inferencing'. (Yannis Ioannidis, University of Wisconsin)
Still another approach: Use methods of spatio-temporal data mining. (Ioannidis; see work of Muthukrishnan at Rutgers)

Interoperability:
Goal: Devise standards for datasets so as to allow researchers to collaborate across datasets; develop standards leading to database interoperability.
(GBIF)
- Challenge: How do we develop ways to more accurately represent observational or experimental data so that others may use them? (Jessie Kennedy)
- Challenge: Deal with issues of inconsistency and scalability.
- Challenge: Formalize issues of policy with regard to others’ databases.
- Challenge: Interoperability over a diversity of users and types of equipment.
One approach: 'Semantic Web', the idea used to express the growing desire to make information access on the Web more knowledge-based so humans and intelligent software can work together. (Susan Gauch)
Another approach: Make use of languages such as XML developed to aid interoperability in business and military collaborations.

The Many Applications of Research on the Tree of Life:
Side benefits in many fields:
- Agriculture
- Biomedicine
- Biotechnology
- Natural resource management
- Pest control
- Control of emergent diseases
- Sustainable use of biodiversity resources
- Global climate change
Let’s say you’re importing bananas from South America.
A camera in the hold of the ship sees a spider. What kind of spider is it? Is it safe to unload your cargo of bananas?
Luckily, you have a digitized natural history database. With an efficient search feature. (Thanks to Diana Lipscomb for this example)
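
Returning to the species-bank suggestion of supporting overlapping classifications as directed acyclic graphs, the following Python sketch shows one way such a structure might look. It is an illustration under stated assumptions, not Kennedy's design: the class name ClassificationGraph, its methods, and the placeholder taxon labels are hypothetical. Each taxon may have several parents (one per competing classification), and an edge is rejected if it would create a cycle.

```python
from collections import defaultdict


class ClassificationGraph:
    """Taxa as nodes; an edge child -> parent means 'child is placed in parent'.

    A taxon may have several parents, so overlapping classifications can
    coexist. Edges that would introduce a cycle are refused, which keeps
    the graph a DAG.
    """

    def __init__(self):
        self.parents = defaultdict(set)

    def ancestors(self, taxon):
        """Return every taxon reachable from `taxon` by following parent edges."""
        seen = set()
        stack = [taxon]
        while stack:
            node = stack.pop()
            for parent in self.parents.get(node, ()):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen

    def place(self, child, parent):
        """Record that `child` is placed in `parent`, rejecting cycles."""
        if child == parent or child in self.ancestors(parent):
            raise ValueError(f"placing {child!r} under {parent!r} creates a cycle")
        self.parents[child].add(parent)


if __name__ == "__main__":
    g = ClassificationGraph()
    # Placeholder labels: two classifications disagree about the genus.
    g.place("species_x", "genus_a")   # classification 1
    g.place("species_x", "genus_b")   # classification 2
    g.place("genus_a", "family_f")
    g.place("genus_b", "family_f")
    print(sorted(g.ancestors("species_x")))
```

Storing parent sets rather than a single parent pointer is what distinguishes this from an ordinary tree: both placements of species_x survive side by side, and a query can see every classification that mentions it.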
