Published on December 31, 2007
Slide1: Functional Genomics and Bioinformatics Applied to Understanding Oxidative Stress Resistance in Plants. Ruth Grene Alscher, Lenwood S. Heath, Naren Ramakrishnan. Virginia Tech. December 6, 2001.
Slide2: Overview. Organization of our group. About environmental stress and reactive oxygen species (ROS). Plant responses to ROS. Analysis of responses to stress on a chip: microarray technology. Expresso: management system for microarrays. Managing expression experiments. Analyzing expression data. Reaching conclusions. Where do we go from here?
Slide3: Ruth Alscher; Carol Loopstra; Senior Collaborators; Students.
Slide4: Iterative strategy for detection of stress-mediated effects on gene expression using microarrays and CS expertise. Diagram labels (steps numbered 1 to 4): Detection of stress-mediated gene expression effects on microarrays; Computational tools to infer interaction among genes, pathways; Revised / New Tools and Experiments; Genetic Regulatory Networks; Test inferences with varying conditions and genotypes.
Slide5: Proposed Project: 2002-2005. Plant Biology (with co-PIs: Ron Sederoff, NCSU; Carol Loopstra, TAMU). An investigation of drought stress responses in loblolly pine in a variety of provenances. Quantitative RT-PCR to confirm and expand results obtained with microarrays. In situ hybridization to stressed and unstressed cell and tissue types.
Slide6: Plant Response to Stress. Plants adapt to changing environmental conditions through global cellular responses involving successive changes in, and interactions among, expression patterns of numerous genes. Our group studies these changes through a combination of bioinformatics and genomic techniques.
Slide7: Long Term Goals. Biological: To identify molecular stress resistance mechanisms in tree and crop species. Bioinformatic: To support iterative experimentation in plant genomics, capture and analyze experimental data, integrate biological information from diverse sources, and close the experimental loop.
Slide8: The Paradox of Aerobiosis. Oxygen is essential, but toxic. Aerobic cells face constant danger from reactive oxygen species (ROS). ROS can act as mutagens, cause lipid peroxidation, and denature proteins.
Slide9: ROS Arise as a Result of Exposure to: Ozone; Sulfur dioxide; High light; Paraquat; Extremes of temperature; Salinity; Drought.
Slide10: Redox Regulation of Cellular Systems. Diagram labels: Membrane Receptors; Environmental Stress; Metabolite Defense; Protein kinases and phosphatases; Transcription factors; Gene Expression; Defense, Repair, Apoptosis; Prooxidants (ROS); Antioxidants.
Slide11: Scenarios for Effects of Abiotic Stress on Gene Expression in Plants.
Slide12: Hypotheses. There is a group of genes whose expression confers resistance to drought stress. Based on previous work (Alscher and others for superoxide dismutases and glutathione reductases), increased expression of defense genes is co-regulated and is correlated with resistance to oxidative stress. Failure to cope is correlated with little or no defense gene activation. A common core of defense genes exists, which responds to several different stresses.
Slide13: Components of 1999 Stress Study. Diagram labels: 1999 Pine Drought Stress Experiments; 2000 Expresso Prototype; Design and Print Microarrays; Select 384 Pine cDNAs; Design Functional Hierarchy; Capture Spot Intensities; Integrate and Analyze; Inductive Logic Programming (ILP).
Slide14: Imposition of Successive Cycles of Mild or Severe Drought Stress on 1-year-old Loblolly Pine Seedlings. Figure with two panels; legend: water potential (bars) and PS (photosynthesis); axis labels: DAYS; 0, -2, -10, -15. Cycles of Mild Drought Stress: four DRY DOWN periods (water withheld), each followed by RECOVERY (water given), with RNA Harvests I to IV. Cycles of Severe Drought Stress: Cycles I to III of DRY DOWN (water withheld) and RECOVERY (water given), with RNA Harvests I to III.
Slide15: Categories within Protective and Protected Processes. Diagram labels: Plant Growth Regulation; Environmental Change; Gene Expression; Signal Transduction; Protective Processes; Protected Processes; ROS and Stress; Cell Wall Related; Phenylpropanoid Pathway; Development; Metabolism; Chloroplast Associated; Carbon Metabolism; Respiration and Nucleic Acids; Mitochondrion; Cells; Tissues; Cytoskeleton; Secretion; Trafficking; Nucleus; Protease-associated.
Slide16: Categories within “Protective Processes”.
Slide17: Hypotheses versus Results, 1999 Experiment. Among the genes responding to mild stress, there exists a population of genes whose expression confers resistance: candidate stress resistance genes. Genes in 69 categories (e.g., HSP70s, HSP100s, and aquaporins, but not HSP80s) responded positively to mild stress. The effect of severe stress was not detectable or negative.
Slide18: Hypotheses versus Results, 1999 Experiment. Genes associated with other stresses responded to drought stress. Isoflavone reductase homologs and GSTs responded positively to mild drought stress. These categories have previously been documented to respond to biotic stress and xenobiotics, respectively. However, both isoflavone reductase homologs and GSTs responded positively to severe drought stress.
Slide19: Flow of a Microarray Experiment. Diagram labels: Select cDNAs; PCR; Test of Hypotheses; Extract RNA; Replication and Randomization; Reverse Transcription and Fluorescent Labeling; Robotic Printing.
Slide20: Spot and Clone Analysis. Image Analysis: gridding, spot identification, intensity and background calculation, normalization. Statistics: fold or ratio estimation; combining replicates. Higher-level Analysis: clustering methods; inductive logic programming (ILP).
Slide21: Data Mining: Inductive Logic Programming. ILP is a data mining algorithm expressly designed for inferring relationships. By expressing relationships as rules, it provides new information and resultant testable hypotheses. ILP groups related data and favors relationships that have short descriptions. ILP can also flexibly incorporate a priori biological knowledge (e.g., categories and alternate classifications).
Slide22: Rule Inference in ILP. Infers rules relating gene expression levels to categories, both within a probe pair and across probe pairs, without explicit direction. Example Rule: [Rule 142] [Pos cover = 69 Neg cover = 3] level(A,moist_vs_severe,not positive) :- level(A,moist_vs_mild,positive). Interpretation: “If the moist versus mild stress comparison was positive for some clone named A, it was negative or unchanged in the moist versus severe comparison for A, with a confidence of 95.8%.”
Slide23: ILP subsumes two forms of reasoning. Unsupervised learning: “Find clusters of genes that have similar/consistent expression patterns”. Supervised learning: “Find a relationship between a priori functional categories and gene expression”. Hybrid reasoning (Information Integration): “Is there a relationship between genes in a given functional category and genes in a particular expression cluster?” ILP mines this information in a single step.
Slide24: NGS-Supported Work of 2001: Expresso.
Slide25: NGS-Supported Work of 2001: Expresso Progress to Date. Margaret Ellis and Logan Hanks (computer science graduate students). MEL: Semistructured data model for experiment capture. Parsing: Automatic parser generators to drive archival storage. Database: Loading and cataloging MEL data in a Postgres RDBMS. Pipeline: Linkages to data analysis and data mining software.
Slide26: NGS-Supported Work of 2001: Progress to Date. Cecilia Vasquez (plant biology graduate student): Loblolly pine seedlings were subjected to the same cycles of drought stress as in 1999, with photosynthesis, water potential measurements, and RNA isolations carried out throughout the time course of the experiment. Jonathan Watkinson (post-doctoral associate): RNA was hybridized to an array of 2400 pine cDNAs at NCSU. Data capture.
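The 95.8% confidence quoted for Rule 142 on Slide22 is consistent with dividing the positive cover by the total cover, 69 / (69 + 3). The sketch below is an illustration under that assumption, not the Expresso or ILP implementation: the function names, the dictionary layout, and the three toy clones are hypothetical, and "not positive" is read as "negative or unchanged", following the slide's interpretation.

```python
def rule_confidence(pos_cover: int, neg_cover: int) -> float:
    """Share of covered examples that agree with the rule's conclusion.

    Assumes confidence = pos / (pos + neg), which reproduces the
    95.8% reported for Rule 142 (Pos cover = 69, Neg cover = 3).
    """
    total = pos_cover + neg_cover
    if total == 0:
        raise ValueError("rule covers no examples")
    return pos_cover / total


def cover_counts(levels: dict) -> tuple:
    """Count clones that support or contradict Rule 142.

    `levels` maps a clone name to {comparison: level}. The layout is
    hypothetical; the slides do not describe Expresso's data format.
    """
    pos = neg = 0
    for comparisons in levels.values():
        # Rule body: level(A, moist_vs_mild, positive)
        if comparisons.get("moist_vs_mild") != "positive":
            continue  # the rule says nothing about this clone
        # Rule head: level(A, moist_vs_severe, not positive)
        if comparisons.get("moist_vs_severe") != "positive":
            pos += 1
        else:
            neg += 1
    return pos, neg


# Counts taken from Slide22.
print(f"Rule 142 confidence: {rule_confidence(69, 3):.1%}")

# Made-up clones, for illustration only.
toy_levels = {
    "clone_a": {"moist_vs_mild": "positive", "moist_vs_severe": "unchanged"},
    "clone_b": {"moist_vs_mild": "positive", "moist_vs_severe": "positive"},
    "clone_c": {"moist_vs_mild": "negative", "moist_vs_severe": "negative"},
}
print(cover_counts(toy_levels))
```

Here clone_a supports the rule, clone_b contradicts it, and clone_c is not covered because the rule body does not hold for it.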
Slide27: Imposition of Successive Cycles of Mild or Severe Drought Stress on 1-year-old Loblolly Pine Seedlings. Figure with two panels; legend: water potential (bars) and PS (photosynthesis); axis labels: DAYS; 0, -2, -10, -15. Cycles of Mild Drought Stress: four DRY DOWN periods (water withheld), each followed by RECOVERY (water given), with RNA Harvests I to IV. Cycles of Severe Drought Stress: Cycles I to III of DRY DOWN (water withheld) and RECOVERY (water given), with RNA Harvests I to III.
Slide28: Final Harvest; Control versus Mild Stress; 2001. Figure labels: Cy3 TIFF Image; Cy5 TIFF Image; Replication; Differential Expression.
Slide29: Final Harvest; Control versus Mild Stress; 2001. Cy5 to Cy3 ratios. Final harvest after four drought cycles. RNA harvested 24 hours after final watering. Cy5 = treated; Cy3 = control. Aquaporins responded positively, while HSP80s were unaffected, as in the 1999 results.
Slide30: Drought Stress Responses in Loblolly Pine: Questions to be Addressed. Can a hierarchy of drought stress resistance mechanisms be identified? Can a clear distinction be made between rapidly responding and long-term adaptational mechanisms? Can particular subgroups within gene families be associated with drought tolerance?
Slide31: Drought Stress Responses in Loblolly Pine: Proposed Bioinformatics Goals. Support incorporation of biological information in the form of functional hierarchies and gene families. Close the computational and experimental loop to support iterative experimental regimes. Integrate information from multiple experiments involving multiple provenances, drought stresses, and EST sets.
Slide32: Proposed Project: 2002-2005. Sources of cDNAs for 2002-2005 arrays: NCSU ESTs selected on the basis of function. Stressed cDNA libraries from roots and stems of drought tolerant families from East Texas and Lost Pines, and from the Atlantic Coastal Plain (humid conditions). Homologs of drought-responsive Arabidopsis genes.
Slide33: Gene Discovery in the Arabidopsis Transcriptome. Diagram labels: Possible Identification of Novel Drought Responsive Genes in Arabidopsis; Drought Stress (short and long term); Hybridize to Arabidopsis Transcriptome; Scanning, Image Processing.
Slide34: Identification of Drought Responsive Genes and Pathways Across Provenances in Loblolly Pine. Diagram labels: Select Pine cDNAs Via Contigs; Robotic Replication and Printing; Data Capture; Postgres Database; Database Queries; Statistical Analysis and Clustering; Data Mining, ILP; Drought Stress Experiments on NC, TX Pine; Hybridization; Scanning, Image Processing; Identification of Drought Responsive Pine Genes; Close The Loop; Arabidopsis Drought Responsive genes.
Slide35: Proposed Project: 2002-2005. Bioinformatics I (Alscher, Heath, Ramakrishnan). Constraint-based selection of cDNAs, including intelligent use of contigs. Assignment of pine ESTs to subgroups within protein families (ProDom, Pfam). Extend information integration in ILP to include Mendel classification of gene families. Integrating data across provenances and known degrees of drought tolerance.
Slide36: Proposed Project: 2002-2005. Bioinformatics II (Ramakrishnan, Heath). Specialize ILP for particular biological information sources. Automatic tuning of ILP parameters. Pushing data mining functionality into the database. Interleaving and iteration of query, data analysis, and data mining operations.
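Slide20 lists fold or ratio estimation and combining replicates as the statistical steps, and Slide29 reports Cy5 to Cy3 ratios with Cy5 as treated and Cy3 as control. The following is a minimal sketch of how such a step might look, not the group's actual procedure: the slides give no formulas, so the background subtraction, the log2 transform, the averaging of replicates, the twofold threshold, and all function names and intensity values are assumptions made for illustration.

```python
import math
from statistics import mean


def log2_ratio(cy5, cy3, bg5=0.0, bg3=0.0):
    """Log2 of background-corrected Cy5 (treated) over Cy3 (control).

    Returns None when either corrected signal is not positive,
    since the ratio is then undefined.
    """
    signal5 = cy5 - bg5
    signal3 = cy3 - bg3
    if signal5 <= 0 or signal3 <= 0:
        return None
    return math.log2(signal5 / signal3)


def combine_replicates(ratios):
    """Average the usable log ratios of replicate spots (assumed rule)."""
    usable = [r for r in ratios if r is not None]
    return mean(usable) if usable else None


def call_level(log_ratio, threshold=1.0):
    """Map a log ratio to positive / negative / unchanged.

    threshold=1.0 means a twofold change; this cutoff is an
    assumption, not a value taken from the slides.
    """
    if log_ratio is None:
        return "unknown"
    if log_ratio >= threshold:
        return "positive"
    if log_ratio <= -threshold:
        return "negative"
    return "unchanged"


# Made-up (Cy5, Cy3) intensities for three replicate spots of one clone.
replicates = [(800.0, 200.0), (600.0, 300.0), (50.0, 0.0)]
ratios = [log2_ratio(cy5, cy3) for cy5, cy3 in replicates]
combined = combine_replicates(ratios)
print(ratios, combined, call_level(combined))
```

Working in log2 keeps induction and repression symmetric around zero, which is why the same threshold can be applied in both directions.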