NCBI OVERVIEW
SHUBHJEET MANDAL
M.Sc. INDUSTRIAL BIOTECHNOLOGY, 2nd SEMESTER

NCBI:
NCBI stands for National Center for Biotechnology Information.
It is a very well-known name in the field of bioinformatics.
It was established in 1988.
It is part of the National Library of Medicine at the National Institutes of Health (NIH), Bethesda, Maryland, USA.
NCBI, EMBL and DDBJ comprise the International Nucleotide Sequence Database Collaboration (INSDC), whose members exchange data daily to ensure a uniform and comprehensive collection of sequence information.

Aim of NCBI:
Establish public databases.
Develop software tools for sequence analysis.
Disseminate biomedical information.
Conduct research in computational biology.

GenBank:
GenBank is NCBI's primary database.
It is a comprehensive public database of nucleotide sequences with supporting bibliographic and biological annotation.
GenBank makes its data available at no cost over the Internet, along with a wide range of online retrieval and analysis services that operate on the database.

ENTREZ:
Developed by NCBI.
Entrez is the search and retrieval tool for all of NCBI. The name is French for 'enter'.
Entrez allows users to search all of the NCBI databases, including PubMed, nucleotide, protein, structure, etc.
Entrez provides extensive links within and between database records.
Entrez is easy to use, although the information it offers to search is more limited.

SEQUENCE SUBMISSION TOOL:
The databases are constantly updated through new sequence submissions. This is done using the following tools:
BankIt: A web-based GenBank sequence submission tool. It is the tool of choice for simple submissions, when only one or a small number of records are to be submitted. It is also used by submitters to update their existing records.
Sequin: Developed by NCBI for submitting and updating sequence data. It handles multiple-sequence submissions and provides increased capacity for complex submissions containing long sequences, multiple annotations, segmented sets of DNA, or phylogenetic and population studies. It also provides graphical viewing and editing options.

BLAST:
Basic Local Alignment Search Tool.
Widely used as a similarity search tool against a variety of sequence databases.
It uses a heuristic that approximates the Smith-Waterman local alignment algorithm, rather than running the full algorithm.
It finds high-scoring local alignments.
It reports the statistical significance of each match.

BLAST categories:
blastn: compares a nucleotide query sequence against a nucleotide sequence database.
blastp: compares an amino acid query sequence against a protein sequence database.
blastx: compares a nucleotide query sequence, translated in all six reading frames, against a protein sequence database.
tblastn: compares a protein query sequence against a nucleotide sequence database translated in all six reading frames.
tblastx: compares the six-frame translations of a nucleotide query sequence against the six-frame translations of a nucleotide sequence database.

ORF finder:
A graphical analysis tool that finds all open reading frames of a selectable minimum size in a user-supplied sequence or in a sequence already available in the database.
It uses the standard or alternative genetic codes to identify all open reading frames.
Helpful in preparing complete and accurate sequence submissions.
Also packaged with the Sequin sequence submission software.

e-PCR:
Electronic polymerase chain reaction.
A computational procedure used to identify sequence-tagged sites (STSs) within DNA sequences.
It searches for subsequences that closely match the PCR primers and have the correct order, orientation, and spacing, and that could therefore represent the PCR primers used to generate known STSs.
e-PCR provides a search mode that runs a query sequence against a sequence database.

UniGene:
UniGene is an experimental system for automatically partitioning GenBank sequences into a non-redundant set of gene-oriented clusters.
Each UniGene cluster contains sequences that represent a unique gene, as well as related information such as the tissue types in which the gene has been expressed and its map location.
UniGene focuses on mRNA and EST (expressed sequence tag) information, and is often used in microarray experiments.

RefSeq:
The Reference Sequence (RefSeq) collection aims to provide a comprehensive, integrated, non-redundant set of sequences, including genomic DNA, transcripts, and proteins.
RefSeq standards serve as the basis for medical, functional, and diversity studies; they provide a stable reference for gene identification and characterization, mutation analysis, expression studies, polymorphism discovery, and comparative analyses.

Map Viewer:
Map Viewer supports search and display of genomic information by chromosomal position; genomes are displayed visually.
Regions of interest can be retrieved by text queries (e.g. gene or marker name) or by sequence alignment (BLAST).
It shows the "neighborhood" of your gene.

PubMed:
PubMed provides access to bibliographic information, which includes MEDLINE, NLM's premier bibliographic database.
MEDLINE contains bibliographic citations and author abstracts from more than 4,600 biomedical journals published in the United States and 70 other countries.
Full-text articles are usually available on a "pay-per-view" basis from the supporting journal websites.

MMDB:
Molecular Modeling Database.
Contains 3D macromolecular structures.
The protein structures have been determined experimentally through X-ray crystallography and NMR spectroscopy.
The structures provide a wealth of information about biological function, mechanism, evolutionary history, and relationships between macromolecules.
The objective behind establishing MMDB is to make this information accessible and useful to researchers.

Reference:
Zhumur Ghosh and Bibekanand Mallick, 'Bioinformatics: Principles and Applications', Oxford University Press, 2012.
Jonathan Pevsner, 'Bioinformatics and Functional Genomics', Wiley-Liss, 2003.
NCBI (http://)
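
Supplementary sketch (ORF finder and six-frame translation):
The ORF finder and blastx/tblastx slides both rely on reading a nucleotide sequence in all six reading frames (three on each strand). The Python sketch below illustrates that idea only; it is not NCBI's ORF finder. It assumes the standard stop codons, treats ATG as the only start codon, and the names find_orfs, reverse_complement, and min_len are hypothetical, chosen just for this example.

```python
# Minimal six-frame ORF scan (illustrative sketch, not NCBI's ORF finder).
# Assumptions: ATG is the only start codon, the standard stop codons are
# used, and an ORF runs from a start codon to the first in-frame stop.

STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_orfs(seq: str, min_len: int = 30):
    """Yield (strand, frame, start, end, orf) for ORFs of at least min_len nt.

    start and end are 0-based positions on the scanned strand (end is
    exclusive); the stop codon is included in the returned ORF.
    """
    seq = seq.upper()
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            # Walk the strand codon by codon in this reading frame.
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and codon in STOP_CODONS:
                    end = i + 3
                    if end - start >= min_len:
                        yield strand, frame, start, end, s[start:end]
                    start = None


if __name__ == "__main__":
    demo = "CCATGAAATTTGGGTAACC"  # made-up sequence for illustration
    for hit in find_orfs(demo, min_len=9):
        print(hit)
```

The min_len parameter plays the role of the "selectable minimum size" mentioned on the ORF finder slide. Supporting alternative genetic codes would mean swapping in a different start and stop codon set.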

