Microarray Data Standard

Published on November 7, 2007

Author: Belly

Source: authorstream.com

Content

Microarray Data Standardisation:
  Microarray Gene Expression Database group (MGED)
  December 2000

Public data repositories for microarray data:
  There is a growing consensus in the life science community that public repositories of gene expression data are needed, analogous to DDBJ/EMBL/GenBank for sequences

Some of the reasons:
  Gradually building up gene expression profiles for various organisms, tissues, cell types, developmental stages and states, under the influence of various compounds
  Links to other genomics databases build up systematic knowledge about gene functions and networks
  Comparison of profiles, access and analysis of data by third parties
  Cross validation of results and platforms - quality control

Systematic gene expression profiling initiatives in public domain:
  The International Life Sciences Institute (ILSI) is coordinating a program undertaken by ~25 pharmaceutical and food companies to
    generate toxicity related gene expression data under defined experimental conditions
    evaluate gene expression profiles in standardised test systems following exposure to toxicants
    relate changes in gene expression to other measures of toxicity

Microarray data handling and analysis - a major bottleneck (Calculations by Jerry Lanfear):
  Experiments:
    100 000 genes in human
    320 cell types
    2000 compounds
    3 time points
    2 concentrations
    2 replicates
  Data:
    8 x 10^11 data points
    1 x 10^15 bytes = 1 petabyte of data

Expression data repository projects:
  Public repositories in the making:
    GEO - NCBI
    GeneX - NCGR
    ArrayExpress - EBI
  In-house databases - Stanford, MIT, University of Pennsylvania
  Organism specific databases: mouse (Jackson Laboratory)
  Proprietary databases - Gene Logic, NCI
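
The bottleneck estimate attributed to Jerry Lanfear can be checked with a few lines of arithmetic. The sketch below multiplies the six experimental factors listed on the slide and then divides the quoted petabyte by the result. It assumes that one petabyte means 10^15 bytes, as the slide's own figure suggests; the variable names are illustrative only. The per-data-point figure is an inference from the two slide numbers, not something the slides state.

```python
from math import prod

# Experimental factors exactly as listed on the slide.
factors = {
    "genes": 100_000,
    "cell_types": 320,
    "compounds": 2_000,
    "time_points": 3,
    "concentrations": 2,
    "replicates": 2,
}

# Every combination of factors yields one measurement.
data_points = prod(factors.values())

# Slide figure: 1 x 10^15 bytes (assumed decimal petabyte).
total_bytes = 10**15

# Implied storage per data point (an inference, not a slide value).
bytes_per_point = total_bytes / data_points

print(f"{data_points:.2e} data points")
print(f"about {bytes_per_point:.0f} bytes per data point")
```

The product is 7.68 x 10^11, which the slide rounds to 8 x 10^11. The implied storage of roughly 1.3 kB per data point is plausible only if the raw images and intermediate results are kept alongside the final expression values.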

Difficulties:
  Raw data are images
  What is needed for higher level analysis and mining is the gene expression matrix (genes/samples/gene expression levels)
  Lack of standard measurement units for gene expression
  Lack of standards for sample annotation

Raw data - images:
  Treated sample labeled red (Cy5)
  Control sample labeled green (Cy3)
  Competitive hybridization onto chip
  Red dot - gene overexpressed in treated sample
  Green dot - gene underexpressed in treated sample
  Yellow - equally expressed
  Intensity - “absolute” level
  red/green - ratio of expression
    2 - 2x overexpressed
    0.5 - 2x underexpressed
  log2(red/green) - “log ratio”
    1 - 2x overexpressed
    -1 - 2x underexpressed
  cDNA plotted microarray, Stanford University (yeast, 1997)

Gene expression matrix:
  A matrix with genes along one axis, samples along the other, and gene expression levels as the entries

Gene expression levels:
  What we would like to have
    gene expression levels expressed in some standard units (e.g. molecules per cell)
    a reliability measure associated with each value (e.g. standard deviation)
  What we do have
    each experiment using different units
    no reliability information

Comparing expression data:
  [figure slides]

Measurement units:
  In perspective:
    standard controls for experiments (on chips and in the samples)
    replicate measurements
  Temporary solution: storing intermediate analysis results (including the images) and annotations of how they were obtained, i.e., the evidence

Comparing expression data - problem 2:
  How do gene names relate in different data matrices?
  How do samples relate in different data matrices?
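
The ratio and log-ratio conventions from the "Raw data - images" slide, and the gene expression matrix they feed, can be illustrated with a minimal sketch. The gene names, sample names and intensity values below are invented for illustration, and a nested dictionary stands in for the matrix; the slides do not prescribe any particular representation.

```python
import math


def log_ratio(red: float, green: float) -> float:
    """Return log2(red/green).

    +1 means 2x overexpressed in the treated (red) sample,
    -1 means 2x underexpressed, 0 means equally expressed.
    """
    if red <= 0 or green <= 0:
        raise ValueError("intensities must be positive")
    return math.log2(red / green)


# Hypothetical (red, green) spot intensities; all values are made up.
intensities = {
    "gene_A": {"sample_1": (2000.0, 1000.0), "sample_2": (500.0, 1000.0)},
    "gene_B": {"sample_1": (800.0, 800.0), "sample_2": (400.0, 100.0)},
}

# Gene expression matrix: genes x samples -> log ratio.
matrix = {
    gene: {sample: log_ratio(red, green) for sample, (red, green) in spots.items()}
    for gene, spots in intensities.items()
}

for gene, row in matrix.items():
    print(gene, row)
```

The log ratio makes over- and underexpression symmetric around zero, which the plain ratio (2 versus 0.5) does not. It does nothing for the problems raised on the later slides: values from different experiments still lack common units and a reliability measure.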

Sample annotation:
  Gene expression data have meaning only in the context of the experimental conditions of the target system
  Controlled vocabularies and ontologies (species, cell types, compound nomenclature, treatments, etc.) are needed for unambiguous sample annotation
  Sample annotations in current public databases are typically useless

In perspective:
  Standard units for gene expression measurements
  Standards for sample annotation

More immediate actions:
  Understand what information about microarray experiments should be captured to make the descriptions reasonably self-contained
  Develop a data exchange format able to capture this minimum information
  Develop recommendations on how data should be normalised and what controls should be used

MGED group:
  The MGED group is an open discussion group initially established at the Microarray Gene Expression Database meeting MGED 1 (14-15 November, 1999, Cambridge, UK).
  The goal of the group is to facilitate the adoption of standards for DNA-array experiment annotation and data representation, as well as the introduction of standard experimental controls and data normalisation methods.
  The underlying goal is to facilitate the establishment of gene expression data repositories, comparability of gene expression data from different sources, and interoperability of different gene expression databases and data analysis software.
  Since 1999 the group has had two general meetings, and the third one is planned for 2001.
  For more see www.mged.org

MGED participants including:
  Affymetrix
  Berkeley
  DDBJ
  DKFZ
  EMBL
  Gene Logic
  Incyte
  Max Planck Institute
  NCBI
  NCGR
  NHGRI
  Sanger Centre
  Stanford
  University of Pennsylvania
  University of Washington
  Whitehead Institute

Working groups:
  Microarray experiment annotations and minimum information standards (A. Brazma)
  XML-data communication standards and interfaces (P. Spellman)
  Ontology for sample description (M. Bittner)
  Cross platform comparison and normalisation (F. Holstege, R. Bumgarner)
  Future user group - queries, query languages and data mining (M. Vingron)

MGED state of the art:
  Formulation of the “minimum information about a microarray experiment” (MIAME) to ensure its interpretability and reproducibility
  Data exchange format based on XML, the microarray markup language (MAML), submitted to OMG in November

MIAME six parts:
  1. Experimental design: the set of the hybridisation experiments as a whole
  2. Array design: each array used and each element (spot) on the array
  3. Samples: samples used, the extract preparation and labeling
  4. Hybridizations: procedures and parameters
  5. Measurements: images, quantitation, specifications
  6. Controls: types, values, specifications
  see www.mged.org for details

MIAME concepts:
  MIAME is aimed at the co-operative data submitter
  Concept of “qualifier, value, source” lists, where source is either user defined or an external reference
  Reusable information can be referenced, but should be provided at least once (array descriptions, standard protocols)
  Raw data should be reported, together with the authors' interpretations

MAML:
  MAML is an XML based data exchange format able to capture MIAME compliant information
  The work is still in progress; the first draft has been submitted to OMG as a data exchange standard for microarray data

MAML concepts:
  Annotations + data; data can be given as a set of external 2D matrices
  Data format independent of any particular scanner or image analysis software
  Sample and treatment can be represented as a DAG
  Concept of composite images and composite spots

Sample and treatment representation:
  [figure with nodes: Sample 1, Sample 2, Sample 3, Treatments, Array 1, Array 2]

Expression matrix - raw and processed:
  [figure]

Microarray image analysis data representation:
  Images: primary images; composite images (e.g., green/red ratios)
  Spots: primary spots; composite spots
  Quantitations

MAML future:
  The NOMAD microarray LIMS system will export data in MAML format
  ArrayExpress and GEO will import data in MAML format
  We hope that OMG will accept MAML as the industry standard
  We hope that MAML will become a de facto standard

MGED steering committee:
  Meeting in Bethesda on 17 Nov 2000
  MIAME accepted, and a publication urging the journals and funding agencies to adopt it will be prepared
  MGED will become an ISCB Special Interest Group
  Next general MGED meeting in Stanford, March 29-31

Top level object model for gene expression database:
  [figure]
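
Two ideas from the MIAME and MAML concept slides lend themselves to a small sketch: annotations as “qualifier, value, source” lists, and samples and treatments forming a directed acyclic graph (DAG). The code below is not MAML and does not follow any MGED schema. The node names, annotation values and the shape of the graph are all assumptions made for illustration, and the figure on the "Sample and treatment representation" slide may connect its nodes differently.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+
from typing import NamedTuple


class Annotation(NamedTuple):
    """One entry of a MIAME-style "qualifier, value, source" list."""
    qualifier: str
    value: str
    source: str  # "user defined" or the name of an external reference


# Hypothetical annotations; every name and value here is made up.
annotations = {
    "sample_1": [
        Annotation("species", "Saccharomyces cerevisiae", "NCBI Taxonomy"),
        Annotation("growth medium", "YPD", "user defined"),
    ],
    "sample_2": [Annotation("treatment", "heat shock, 30 min", "user defined")],
    "sample_3": [Annotation("treatment", "untreated control", "user defined")],
}

# Each node maps to the set of nodes it was derived from (assumed layout):
# samples 2 and 3 come from sample 1; each array is hybridised with both.
derived_from = {
    "sample_1": set(),
    "sample_2": {"sample_1"},
    "sample_3": {"sample_1"},
    "array_1": {"sample_2", "sample_3"},
    "array_2": {"sample_2", "sample_3"},
}


def ancestors(node: str) -> set[str]:
    """Return every node that `node` is directly or indirectly derived from."""
    found: set[str] = set()
    stack = list(derived_from[node])
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            stack.extend(derived_from[current])
    return found


# static_order() raises graphlib.CycleError if the graph is not a DAG,
# so building the order doubles as a validity check.
order = list(TopologicalSorter(derived_from).static_order())

print(order)
print(sorted(ancestors("array_1")))
```

Walking the ancestors of an array collects every sample and treatment step whose annotations are needed to interpret that hybridisation, which is the sense in which a description becomes "reasonably self-contained".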
