Published on November 16, 2007
Principles of Database Design: Principles of Database Design NLM/MBL Medical Informatics Session Outline: Session Outline Why learn this? Database Principles and Paradigms Principles of Relational Database Design System design and building methods Exercise: Transforming flat files to tables Why Learn about Database Design?: Why Learn about Database Design? Vendors will sell you on user interfaces, but the power and flexibility is in the data model Evaluating and comparing products Communicating with vendors and IT support staff Building your own databases What is a Database?: What is a Database? An organized collection of information Computer-based representation Systematic, automated retrieval Systematic, automated symbol manipulation Historical Evolution of Databases: Historical Evolution of Databases Dedicated files created & maintained by application software (sequential, random access) Database Management Systems (DBMSs) Hierarchical Databases: Hierarchical Databases Lab Results 5/30/96 Serum Na+ Pt=Smith Advantages: efficient storage and I/O, rapid access via predetermined data hierarchies Disadvantages: difficult to view/retrieve data from other perspectives, hard to modify underlying structure Information Network Databases: Information Network Databases Advantages: Can model complex many-to-many relationships as well as hierarchies and simple lists Disadvantages: difficult to predict & control effects of transitive relationships; recursion; I/O intensive, potential to become incomprehensible “Database as Hypertext” Relational Databases: Relational Databases Advantages: Understandable, permits variety of logical aggregation or “views” of data elements, structure easily modifiable, new elements generally do not “break” existing programs Disadvantages: I/O intensive, 1 logical record may = many physical records, relational integrity is a constant concern & must be under software control “Rows & Columns with inter-table references” Pt-UI Testname Date 12345 Serum_Na 5/30/96 42353 CBC 5/30/96 47756 ESR 5/30/96 12348 HBsAg 5/30/96 34523 Amylase 5/30/96 Lab_test Pt-UI Lname Fname 12345 Smith Elmer 12346 Jones Barbara 12347 Clark Arthur 12348 Jones Casey 12349 Sample Steve Patient Object-Oriented Databases: Object-Oriented Databases Multiple data types including text, graphics, sound, signals, etc. Encapsulation of data & programs Interprocess messaging: e.g., “Print Yourself” Advantages: applications programs consist of high level commands & functions which do not need to know the underlying data organization; modularity, reusability and portability between systems Disadvantages: early in commercialization; CPU intensive; few standards for query & object sharing Fundamental Assertions about Systems Design: Fundamental Assertions about Systems Design The Data Model is the most critical aspect of system design and function Data Models should reflect real world objects and their relationships to ensure durability A correct Data Model subserves and outlasts applications, including many not anticipated at system start-up Object-oriented Systems design:Basic Concepts: Object-oriented Systems design: Basic Concepts The World contains Things e.g., Collies, Terriers, Bloodhounds We develop abstractions of things called “objects” e.g., dog We group objects by criteria which represent the abstract object as an empty table Dog Name Breed Favorite Food Birthdate Basic Concepts, cont’d: Basic Concepts, cont’d Empty tables can be filled in to represent the real world things from which the object was abstracted Dog Name Breed Favorite Food Birthdate Boris St. Bernard Canned Jan 81 Fifi Poodle Dry May 92 Fido Pomeranian Canned Apr 87 Basic Concepts, cont’d: Basic Concepts, cont’d There are Relationships between objects which are attributes of those objects Dog Name License Owner Name Lic. Date Relationship: “OWNS” Dog Owner OWNS Dogs Objects: Objects All of the real-world things in the set (the “instances”) have the same characteristics All instances conform to the same rules Types of Objects (ie., types of tables): Types of Objects (ie., types of tables) Tangible Things e.g., book Roles e.g., doctor, patient, supervisor Incidents (=events, occurences) e.g., ordering of a lab test Interactions (bind two or more other objects via a transaction) e.g., Purchase relates Buyer to Seller Specifications (definition tables of tangible things) Table Notation: Table Notation Graphical Form: Patient_Admissions * Pt_ID -Date_Adm -Time_Adm -Unit -Room Textual Form: Patient_Admissions (Pt_ID, Date_Adm, Time_Adm, Unit, Room) Formalisms for Tables: Formalisms for Tables Rule 1: One instance of an object has exactly one value for each attribute (i.e, only one data element at each row-column intersection; no repeating groups, no true “holes” in table) Rule 2: Attributes must contain no internal structure Name Age-Sex Smith 38-F Jones 22-M Clark 18-M Not OK: If Rules 1 and 2 are obeyed, the data model is in “First Normal Form” Formalisms for Tables, cont’d: Formalisms for Tables, cont’d Rule 3: Every attribute should represent a characteristic of the entire object, not a characteristic of a limited part of the object Not OK: Attribute of hospital staff appointment, not committee Hospital Committee Membership * Person Name * Committee Name -Date committee term expires OK: Relationships: Relationships A relationship is the abstraction of a set of associations that hold systematically between different kinds of real world things Patient OCCUPIES bed Library CONTAINS books Specimen IS ASSAYED by Lab Method Most relationships may be stated in the inverse also: Library LENDS book Book IS LENT BY Library Relationship Types: Relationship Types State Governor One-to-One: has governs Many-to-Many Author writes Book is written by One-to-Many Dog Owner owns Dog is owned by Modeling Many-to-Many Relationships: Modeling Many-to-Many Relationships DRUG MANUFACTURER * manufacturer name - other attributes DRUG *generic name - other attributes LICENSE * manufacturer name * generic name - date licensed Overall System Design Process: Overall System Design Process Build the Entity-Relationship diagram for all defined objects (tables), [including an Object Specification Document] [Create a State Transition Model which describes changes to objects based on events or transactions] [Create a Data Flow diagram which models the information elements which cause State Transitions] [Recommended for multi-programmer projects] Exercise: Devise a Relational Model for MEDLINE citations: Exercise: Devise a Relational Model for MEDLINE citations Slide24: UI - 90134185 AU - Greenes RA ; Shortliffe EH TI - Medical Informatics. An Emerging academic discipline and institutional priority MH - Hospital Information Systems; Career Choice; Medical Informatics/EDUCATION/*TRENDS PT - JOURNAL ARTICLE; REVIEW; TUTORIAL EM - 9005 AB - Information management constitutes a major activity of the health care profession. Currently a number of forces are focusing attention on this function... AD - Department of Radiology, Brigham and Women’s Hosp., Boston, MA 02115 SO - JAMA 1990 Feb 23; 263(8):1114-20 Sample MEDLINE citation The “Bottom Line” in Database Design: The “Bottom Line” in Database Design The Data Model is the most critical aspect of system design and function Data Models should reflect real world objects and their relationships to ensure durability A correct Data Model subserves and outlasts applications, including many not anticipated at system start-up Questions?: Questions?