Published on November 29, 2007
ARB - a software Environment for Sequence Data

Why ARB and what is it?
ARB: from ARBor, Latin for tree. A graphically oriented package of software tools for establishing, handling and using databases of sequences with associated information. The ARB project is a joint initiative of the Lehrstuhl für Mikrobiologie and the Lehrstuhl für Rechnertechnik und Rechnerorganisation of the Technical University of Munich.

Description
ARB is a PACKAGE of programs, "a software environment", for maintaining databases of molecular sequences and additional information, and for analyzing the sequence data with emphasis on phylogeny reconstruction.

ARB main window
The major concept is to combine access to the data via a graphically presented hierarchy (tree) with sequence data analysis.

Deals with
The programs have primarily been developed for rRNA sequences (mostly SSU) and contain special tools for alignment and analysis (tree reconstruction) of these structures. However, ARB can be used for any nucleic or amino acid sequence data.

ARB features - the tree is the key
Any database entry can be visualised in the main window along with a phylogenetic tree. Database access and navigation are possible via mouse click in the displayed tree or by using a search tool.

ARB features - DB information
Sequence data and/or associated information can be displayed in the ARB main window.

Other ARB features
Sequence and additional data can be imported and exported in a variety of commonly used file formats (for example RDP).
Publication-ready trees can be generated and exported in various formats (very nice and handy).
Distance matrix, maximum parsimony and maximum likelihood based phylogenetic treeing can be performed with the respective integrated tools.
A special maximum parsimony approach allows reconstruction and optimisation of comprehensive trees representing the full sequence data set (currently more than 20,000 entries).

More ARB features - editor
A powerful editor for real or virtual primary structures includes versatile tools for string searching, automated alignment of primary structures, local alignment optimisation, automated secondary structure checking and secondary structure visualisation.

More ARB features - probes
The ARB PT server (positional tree) provides the basis for rapid searching for closest relatives or specific sequence signatures. Such signatures can be evaluated as taxon-specific probes against the background of the full database and used as hybridization probes or PCR primers.

More ARB features - genomics
Currently, the ARB software is under further development for genome sequence database management.

Who made and who knows ARB
ARB was developed by Oliver Strunk and Wolfgang Ludwig at the Department of Microbiology at the Technical University of Munich. http://www.arb-home.de/
You can also ask Frank Oliver Glöckner of the Max Planck Institute for Marine Microbiology for advice. http://www.mpi-bremen.de/molecol/arb/

You may want to use ARB if
You work with 16S/23S/18S rRNA sequences
You deal with hundreds of sequences
You do extensive phylogeny
You design oligonucleotide (16S/23S) probes (primers)
You are not afraid to learn some Linux/UNIX (it is best to do the whole installation yourself)

What do you need to run ARB? (Hardware requirements)
Processor: Pentium II 300 or better
Memory (RAM): 128 MB or more (most important)
Monitor and resolution: 17 inch or larger, 1152x864 or higher
Disk space: 5 GB or more

What do you need to run ARB? (Software requirements)
Linux (RedHat, SuSE, Mandrake, etc.) or UNIX, Sun/Solaris (which I don't know much about)
ARB installation files: http://www.arb-home.de/ or http://www.mpi-bremen.de/molecol/arb/
Time and patience

What is not so good?
There is no comprehensive help literature
Direct download of sequences from EMBL (as in SeqLab from GCG) is NOT possible
ARB remains under development; some functions do not work or are incompletely developed
ARB is less suitable for data other than rRNA and for small datasets

ARB Web sites
The ARB project at Technical University of Munich, the official ARB homepage: http://www.arb-home.de/
ARB at MPI Bremen, the unofficial ARB homepage: http://www.arb.de.vu/
Introduction to ARB by Dave Kysela, MBL: http://www.mbl.edu/arb_tutorial/
ARB tutorial by German Jurgens, University of Helsinki: http://honeybee.helsinki.fi/users/GJURGENS/Arb/arb_page.htm

Main steps to go (from unaligned sequences to a beautiful tree with your sequences and reference sequences)
Let's assume that ARB runs successfully on a Linux PC, that all devices (network, floppy or ZIP drive, printer, etc.) are working, and that we have a database with some aligned sequences which we think are related to our unknown sequence, plus a basic initial tree.
0. Perform FASTA with GCG, just to make sure that the sequence is rRNA of some organism (or search for a known rRNA sequence in the EMBL database, for example by accession number, AC).
1. Start Linux and transfer the sequence(s) to the Linux OS (an entry from EMBL can be downloaded directly and saved in a Linux directory using the Linux Netscape browser).
2. Start ARB and import the sequence(s) into ARB.
3. Make up a name for this entry so that it can be found in the database (less necessary if the sequence comes from the EMBL database).
4. Start the ARB sequence editor and perform automatic alignment of all imported sequences.
5. Using the raw data (curves), the secondary structure and other similar aligned sequences (if any), carefully check and correct the alignment. This step is IMPORTANT for the later phylogeny.
6. Add the aligned sequence to the existing tree; the parsimony method is used. Already at this point we have an initial suggestion about what our organism is.
7. Choose reference sequences from the initial tree (or from any other trees or sequences) and build a real tree using the neighbor joining, parsimony or maximum likelihood method, with different parameters and with variation of the corresponding filters to cut some positions from the alignment.
8. Edit the resulting tree with the available ARB tools (grouping, switching branches, tree representation such as a radial tree, etc.) and edit the picture of the tree with the special ARB drawing software (add text, figures, etc.).
9. Print the tree or export it to another format (pdf, gif, etc.).
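
The idea in the tree-building step of applying a filter that cuts positions from the alignment before computing distances can be illustrated with a small Python sketch. This is not ARB's own code or file format. The function names, the gap-fraction rule used as the filter, the simple p-distance and the three toy sequences are all assumptions made for illustration. The resulting matrix is the kind of input a distance method such as neighbor joining works from.

```python
# Illustrative sketch only: names, filter rule and toy data are assumptions,
# not ARB functions or ARB data.

GAP_CHARS = set("-.")


def column_filter(aligned, max_gap_fraction=0.5):
    """Return one boolean per alignment column; True means the column is kept."""
    seqs = list(aligned.values())
    length = len(seqs[0])
    if any(len(seq) != length for seq in seqs):
        raise ValueError("all aligned sequences must have the same length")
    keep = []
    for col in range(length):
        gaps = sum(seq[col] in GAP_CHARS for seq in seqs)
        keep.append(gaps / len(seqs) <= max_gap_fraction)
    return keep


def p_distance(a, b, keep):
    """Fraction of differing bases over kept columns where neither sequence has a gap."""
    compared = 0
    differing = 0
    for x, y, kept in zip(a.upper(), b.upper(), keep):
        if not kept or x in GAP_CHARS or y in GAP_CHARS:
            continue
        compared += 1
        if x != y:
            differing += 1
    return differing / compared if compared else float("nan")


def distance_matrix(aligned, keep):
    """Pairwise p-distances as a dict keyed by (name, name)."""
    names = sorted(aligned)
    return {(m, n): p_distance(aligned[m], aligned[n], keep)
            for m in names for n in names}


# Toy aligned sequences (hypothetical): one unknown plus two references.
aligned = {
    "unknown": "ACGU-ACGUA",
    "ref_A":   "ACGUUACGUA",
    "ref_B":   "ACGA-ACCUA",
}

keep = column_filter(aligned)          # drops the mostly-gap column
dist = distance_matrix(aligned, keep)  # input for a distance-based tree method

for (m, n), d in sorted(dist.items()):
    if m < n:
        print(f"{m:8s} {n:8s} {d:.3f}")
```

Changing `max_gap_fraction` changes which columns are compared, so the distances and the resulting tree can shift. That is why the step above suggests trying different filters and parameters.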