XMLDB M6 2005

Information about XMLDB M6 2005

Published on December 3, 2007

Author: Elliott

Source: authorstream.com

Content

Module 6 XML and RDBMSs (Shredding, SQL / XML, Products):  Module 6 XML and RDBMSs (Shredding, SQL / XML, Products) Why bother?:  Why bother? Most (structured) data stored in RDBMS Seemless integration of XML and rel. Data Exploit capabilities of RDBMS Scalability, Availability, Performance, ... XML - RDBMS Applications Legal documents, decision support on XML When not to use RDBMS Streaming data (RSS, SOAP), IR (Google) -> Fine line, often a difficult decision! XML-RDB Overview (Kernfach):  XML-RDB Overview (Kernfach) Store XML as a BLOB in relational database index by materializing indexed expressions in separate columns Plus: store XML in parsed and validated form Minus: proprietary solution (blob is a black box) Minus: replicate data for indexing Model-driven Shredding (Florescu, Kossmann 99) Edge, binary approach + alternatives Corresponds to generic APIs for Java Plus: very general; integrates well with relational data Minus: poor performance Schema-based Shredding (Shanmugasundaram et al. 99) Map XML Schema / DTD to SQL DDL Plus: integrates well with relational data Minus:  missing tools, complicated  SQL / XML (latest product developments 2005) Extend SQL with XML data type Plus: integrates well with relational data Minus: not clear how it integrates with application, odd „marriage“ Overview: SQL / XML:  Overview: SQL / XML Store XML Data in Relational Databases Publish relational data as XML Query XML data in RDBMS using XQuery Literature: C. Türker: SQL:1999&SQL:2003. Dpunkt, 2003. A. Chaudhuri, A. Rashid, R. Zicari: XML Data Management. Addison Wesley, 2003. A. Eisenberg, J. Melton: Advancements in SQL /XML. ACM SIGMOD Record, 2004. History of SQL / XML:  History of SQL / XML First edition part of SQL:2003 Part 14 of the SQL standard Pre-dates XQuery standard!!! Limited functionality - storage and publishing Second edition: work in progress More complete integration of XQuery + XQuery Data Model Advanced Query capabilities Expected to be published in 2006 XML Type in SQL:  XML Type in SQL A new type (like varchar, date, numeric) SQL:2003 - XML type restricted to XML document or XML element or Sequence of XML elements SQL / XML, 2nd edition Full support of XQuery Data Model XML(SEQUENCE), XML(ANY CONTENT), ... N.B. XML in different rows is incomparable Example (SQL:2003):  Example (SQL:2003) create table books( title varchar(20), authors XML); No schema validation, no typing! Publishing Rel. Data as XML:  Publishing Rel. Data as XML SQL / XML provides ways to publish relational data as XML. RDBMS looks like an XML Database XML view on relational data Enables XML Data Integration Idea: Map SQL Types to XML Schema Types Encode special characters: e.g., „<“ -> „&lt;“ Provide functions to construct XML Data (XML data can be published 1:1) Publishing Rel. Data as XML:  Publishing Rel. Data as XML <Phantasy-People> <row> <Id>4711</Id> <Name>Wutz</Name> </row> <row> <Id>4711</Id> <Name>Wutz</Name> </row> </Phantasy-People> Phantasy-People XML Schema of Publishing Data:  XML Schema of Publishing Data <xsd:complexType name=„ROW.Phantasy-People“> <xsd:sequence> <xsd:element name=„Id“ type=„xsd:integer“/> <xsd:element name=„Name“ type=„varchar20“/> </xsd:sequence> </xsd:complexType> <xsd:complexType name=„TABLE.Phantasy-People“> <xsd:sequence> <xsd:element name=„row“ type=„ROW.Ph...“/> </xsd:sequence> </xsd:complexType> <xsd:element name=„Phantasy-People“ type=„TABLE.Pha...“/> create table Phantasy-People(Id integer, Name varchar(20)); XML Schema for CHAR(20):  XML Schema for CHAR(20) <xsd:simpleType name=„char20“> <xsd:annotation> <xsd:appinfo> <sqlxml:sqltype kind=„PREDEFINED“ name=„CHAR“ length=„20“ characterSetName=„LATIN1“ collation=„DEUTSCH“/> </xsd:appinfo> <xsd:annotation> <xsd:restriction base=„xsd:string“> <xsd:length value=„20“/> </xsd:restriction> </xsd:simpleType> Publishing Functions:  Publishing Functions XMLGEN Generate an XML Document using Xquery XMLELEMENT, XMLATTRIBUTES, XMLFOREST, XMLCONCAT Simplified versions of XMLGEN for typical cases XMLAGG Aggregate XML Elements into a group Example: XML View on Rel. Data:  Example: XML View on Rel. Data Phantasy-People SELECT XMLGEN( <Person id = „{$Id}“> {$Name} </Person>) as Person FROM Phantasy-People Example: XML View on XML Data:  Example: XML View on XML Data SELECT Title, XMLGEN(<pa>{$Authors[1]/text()}</pa>) as PrimA FROM MyAuthors; XMLAGG:  XMLAGG SELECT Product, XMLAGG( XMLELEMENT(NAME „S“, Sales)) AS AllSales FROM SalesTable GROUP BY Product; SalesTable SQL / XML: 2nd Edition:  SQL / XML: 2nd Edition XML datatype will support XQuery data model XML(UNTYPED CONTENT) – old XML infoset model XML(SEQUENCE) – holds heterogeneous sequences ... (other parameterized types; validated data possible! Non well-formed XML data possible, too.) Full XML Schema support and validation XMLQuery() function create XML content using XQuery XMLTable() function Shred XML to rel. Data using Xquery Mapping between SQL & XQuery data model XMLCAST between XML and SQL types XMLExists:  XMLExists SELECT Title FROM books WHERE XMLEXISTS(Authors, //author = „et al.“); Explicit PASSING also possible (see XMLQuery) VALID Predicate:  VALID Predicate Check whether XML data validates to an XML data type. XML Schemas must be registered by admin Otherwise security problems xml-value-expression IS [NOT] VALID [identity-constraint-option] [validity-target] Identity constraings: deal with ID/IDREFs Validity target: Typically, URI of Schema XMLQuery expression:  XMLQuery expression SQL Expression – use in select for constructing XML select XMLQuery( ‘for $i in ./PurchaseOrder where $i/PoNo = $j/val return $i//Item ‘ passing p.pocol , xmlelement(“val”,2100) as “j” returning content) from purchaseorder p <Item itemno=“21”><Quantity>200</Quantity>..</Item> <Item itemno=“22”><Quantity>22</Quantity>..</Item> Pocol maps to default item XMLElement value maps to $j XML Query:  XML Query XMLQuery( Xquery-expression PASSING { BY REF | BY VALUE} (value-expression AS identifier [BY REF | BY VALUE])* RETURNING { CONTENT|SEQUENCE } { BY REF|BY VALUE} If PASSING value has no identifier, then that is context node BY REF - preserves Id (of an XML type) BY VALUE - creates a copy of the data XMLCast:  XMLCast XMLCAST (value-expression AS type) Cast XML (or SQL) value into one of the XML types (SEQUENCE, ANY CONTENT, ...) XMLTable construct:  XMLTable construct Used in FROM clause: translate XML into relational data Splits up result into SQL columns, passing always BY REF select items.pos, items.itemno, items.quantity from purchaseorder p, XMLTable(‘for $i in /PurchaseOrder//Items where $i/Quantity > 200 return $i’ passing p.pocol columns pos for ordinality, itemno number path ‘ItemNo’ quantity number DEFAULT 0 path ‘Quantity’ ) items; POS ITEMNO QUANTITY ------ ----------- ------------ 1 21 21 2 22 0 Default value is used If path does not return value Ordinality returns sequential position Relational columns returned in result XMLTable (ctd.):  XMLTable (ctd.) Can be chained (master-detail drill down) select items.pono, items.itemno, locs.city from purchaseorder p, XMLTable(‘for $i in /PurchaseOrder//Items where $i/Quantity > 200 return $i’ passing p.pocol columns pono number path ‘../PurchaseOrder/@PoNo’ itemno number path ‘ItemNo’ locs xmltype path ‘Locations’ ) items, XMLTable(‘for $i in /Location where $i/State = “CA” return $i’ passing items.locs columns city varchar2(20) path ‘City’ ) locs; PONO ITEMNO CITY ------- ----------- ------------ 100 21 Redwood City 100 21 Foster City 100 22 Berkeley 101 30 Los Angeles Microsoft SQL Server 2005:  Microsoft SQL Server 2005 XML Parser XML Validation XML datatype (binary XML) Schema Collection XML Relational XML Schemata OpenXML/nodes() FOR XML with TYPE directive Rowsets query() modify() Node Table PATH Index PROP Index VALUE Index PRIMARY XML INDEX query() Microsoft: Indexing:  Microsoft: Indexing Create XML index on XML column CREATE PRIMARY XML INDEX idx_1 ON docs (xDoc) Creates secondary indexes on tags, values, paths Speeds up queries Results can be served directly from index Entire query is optimized Same award winning cost based optimizer Indexes are used as available Microsoft: Updates:  Microsoft: Updates Insert, update, & delete a la XUpdate E.g.: Add a new section after Section 1 UPDATE docs SET xDoc.modify( 'insert <section num=''2''> <heading>Background</heading> </section> after (/doc/section[@num=1])[1]') XQuery Support in Oracle:  XQuery Support in Oracle XMLDB integrated database engine SQL / XML standard support Optimized queries – rewrite to relational Standalone Java query engine 100% Java Integrated into Oracle App Server -XDS Interoperates with XSLT/XPath First relational database to ship an XQuery implementation ! XQuery database support:  XQuery database support Production in Oracle Database 10gr2 Supports XMLQuery and XMLTable construct Native compilation into SQL /XML structures Returns XMLType(Content) Can query over relational, O-R, XMLType data fn:doc - Maps to XDB Repository on server SQLPlus provides xquery command to execute XQuery XSL-T will also get compiled to XQuery Architecture:  Architecture XQuery XSL-T Parser Compiler Rewrite to SQLX Normalization XQuery Type check SQLX rewrite Execution engine XQueryX Compiled XQuery Tree SQL Metadata XMLSchema Repository Statically Type checked Tree Normalized Tree (casts, treat ) SQL/XML Operand Tree Relational Optimizer SQL Operand Tree XML Indexes, Text Indexes XQuery F&O Execution Structures S Q L XQUERY Example SQL integration:  Example SQL integration SQL tables (get all 200K+ employees) select XMLQuery(‘ for $i in ora:view(“SCOTT”,”EMP”)/ROW where $i/SALARY > 200000 return $i/EMPNO’ returning content) from dual; <EMPNO>2100</EMPNO> <EMPNO>344</EMPNO> rich employee Example: XMLType Integration:  Example: XMLType Integration XMLType tables as input select p.XQuery(‘<PO pono=“{./PoNo}”/>’) from purchaseorder p where p.XQuery(‘@ShipAddr/City=“Fresno”]‘) is not null; After rewrite select XMLElement(“PO”, XMLAttributes(p.xmldata.”PONo”)) from purchaseorder p where p.xmldata.”ShipAddr”.”City” = ‘Fresno’ Example: Repository Integration:  Example: Repository Integration Query files from repository doc() queries files from repository (rewrite) collection() maps to directories select XQuery(‘ for $i in doc(“/public/foo.xml”) return $i’) from dual; <FOO> ....</FOO> Java Layer Architecture:  DB Adapter Java Execution engine Java Layer Architecture XQJ API Dr i ver XQuery XQueryX XPath XQJ API (Java) conn = datasrc.getConnection(); expr = conn.prepareExpression( ‘for $i in doc(“xxx”) return $i’); res = expr.executeQuery(); while (res.next()) { … } Push down query select * from XMLTable(‘for $i in doc(“xx”) return $i); User XMLDB SQL + XQuery or XQueryX XQuery aware SQL engine Rewrite to SQLX Normalization XQuery Type check SQL Compiler XMLQuery, XMLTable SQL/XML Rel optimizer, Execution engine Compiler XQuery Parser SQLX rewrite Query Pushdown Normalization Type check Compiler Parser XDS data sources, files Push down XQuery Java Engine XQuery Java implementation:  XQuery Java implementation XQuery or XQueryX input Extensible function implementation Compiles into rowsource like structures Optimization – push XQuery to XMLDB XQJ API driver – for accessing mid tier/backend Shared data model with XSL/XPath Shared F&O – pre-defined & external Standard Function implementation interfaces Write Java func once – use it in XQuery/XSLT IBM DB2:  IBM DB2 Hybrid SQL/XQuery Compiler XML Navigation Query Evaluation Run-time SQL/XML Parser XQuery Parser Table Storage XML Indexes XML Storage Native XML storage High-performance XPath processing Unified internal representation XML node-level, text indexes SQL extended for XML Dynamic function dispatch XML data type in SQL XQuery top-level parser Languages Compose See System RX in Session 10 Design Point:  Design Point XML Parsing is slow Thus: Native storage of the XQuery Data Model (QDM) Direct access to nodes From node to node From indexes The XML Schema repository:  The XML Schema repository register xmlschema ‘http://ibm.com/xsr/recipe-v1.xsd’ from ‘recipe-v1.xsd’ as recipe1 complete Stable, fast access to schemas stored in the database Schemas in many documents Recommended, but not required: Schemas have unique URIs for each schema version Reuse target namespace, except for drastic changes Documents specify the schema URI in xsi:schemaLocation schema URI local file DB2 id only one doc A Recipe in XML (v1):  A Recipe in XML (v1) <Recipe xmlns=“http://ibm.com/xsr/recipe” xmlns:xsi=“http://www.w3.org/2001/XMLSchema-instance” xsi:schemaLocation=“http://ibm.com/xsr/recipe http://ibm.com/xsr/recipe-v1.xsd” TimeToPrepare="5" CookMethod="Grill" Difficulty="Easy" Serves="1" Category="Entrees"> <Title>Veggie Dog with Onions</Title> <Ingredients> <Ingredient Name=“veggie sausage" Amount="1" /> <Ingredient Name=“hot dog bun" Amount="1" /> <Ingredient Name=“sliced onion" Amount="1" Unit=“ounce" /> <Ingredient Name=“mustard" Amount=“1" Unit="teaspoon" /> <Ingredient Name=“relish" Amount="2" Unit="teaspoon" /> </Ingredients> … <Comment>Some like spicy mustard, or ketchup instead</Comment> </Recipe> A Recipe in XML (v2):  A Recipe in XML (v2) <Recipe xmlns=“http://ibm.com/xsr/recipe” xmlns:xsi=“http://www.w3.org/2001/XMLSchema-instance” xsi:schemaLocation=“http://ibm.com/xsr/recipe http://ibm.com/xsr/recipe-v2.xsd” TimeToPrepare="5" CookMethod="Grill" Difficulty="Easy" Serves="1" Category="Entrees"> <Title>Veggie Dog with Onions</Title> <Ingredients> <Ingredient Name=“veggie sausage" Amount="1" /> … </Ingredients> <Nutrition> <Calories>300</Calories> <FatGrams>18.5</FatGrams> <CarboGrams>12</CarboGrams> <ProteinGrams>9.5</ProteinGrams> </Nutrition> </Recipe> Schema requires ‘Nutrition’ Incompatible schema change Very few applications affected Changing namespace or table means all applications affected A Recipe in XML (v3):  A Recipe in XML (v3) <Recipe xmlns=“http://ibm.com/xsr/recipe” xmlns:xsi=“http://www.w3.org/2001/XMLSchema-instance” xsi:schemaLocation=“http://ibm.com/xsr/recipe http://ibm.com/xsr/recipe-v3.xsd” TimeToPrepare=“5” CookMethod=“Grill” Difficulty=“Easy” Serves=“1” Category=“Entrees”> <Title>Veggie Dog with Onions</Title> <Ingredients> <Ingredient Name=“veggie sausage” > <Amount Unit=‘item’>1</Amount> </Ingredient> <Ingredient Name=“sliced onion” /> <Amount Unit=“ounce">1</Amount> <Amount Unit=“grams">30</Amount> </Ingredient> <Ingredient Name=“mustard” /> <Amount Unit=“teaspoon">1</Amount> <Amount Unit=“ml">5</Amount> … Internationalization: Amount in multiple units Incompatible schema change Applications using Amount affected, but not many others. Changing namespace or table means all applications affected Insert rows with XML:  Insert rows with XML insert into recipes (id,recipe) values (1, xmlParse(document ‘<?xml …’)) Schema derived from the document itself Schema (or not) per document xmlValidate() accepts computed XML too Including xmlValidate( xmlParse( … )) Parse XML from varchar, clob, or blob into QDM without validation insert into recipes (id,recipe) values (2, xmlValidate(?)) Validate XML from a parameter marker using xsi:schemaLocation Design Point:  Design Point Queries need to work with conflicting schemas Thus: Document-level validation No strict static typing There and back in SQL /XML:  There and back in SQL /XML xmlElement Create an element from relational data xmlForest Create elements from relational data xmlAttributes Create attributes from relational data xmlNamespaces Declare namespaces xmlSerialize Convert XML data to textual form xml2Clob Convert XML data to textual form xmlCast Convert XML to/from relational data Manipulating XML in SQL /XML:  Manipulating XML in SQL /XML xmlQuery Invoke XQuery to produce a sequence Takes SQL arguments, including XML Returns an XML sequence xmlTable Invoke XQuery to produce a table Takes SQL+XML, returns SQL+XML xmlExists Test XQuery result is nonempty Takes SQL+XML, returns Boolean xmlAgg Aggregate XML into a sequence xmlConcat Concatenate an XML sequence xmlExists example:  xmlExists example Return recipes that are both easy and tasty Both XML and relational predicates select recipe from recipes where xmlExists(‘ $r[Recipe/@Difficulty = “easy”] ’ passing by ref recipe as “r”) and rating > 3 xmlQuery example:  xmlQuery example Return the title of each recipe select xmlQuery(‘ $r/Recipe/Title ’ passing by ref recipe as “r” returning sequence) from recipes xmlTable example:  xmlTable example Return the title, servings, and ingredients of each easy recipe select R.id, X.title, X.howMany, X.stuff from recipes as R, xmlTable(‘$r/Recipe[@Difficulty = “easy”]’ passing R.recipe as “r” columns title varchar(50) path ‘./Title’, howMany int path ‘./@Serves’, stuff xml by ref path ‘./Ingredients/Ingredient’ returning sequence) as X XML data type is a XQuery Data Model sequence:  XML data type is a XQuery Data Model sequence DB2 always uses XML(sequence) vs. XML(content), … pass by ref vs. by value returning sequence vs. content Other flavors cause construction and copying Construction has side-effects: Node identity changed Simple values smashed together and lose types … Interferes with query optimization Returning sequences:  Returning sequences Return a list of ingredients select id, xmlQuery(‘ $r/ Recipe/ Ingredients/ Ingredient/ @Name/ data(.) ’ passing by ref recipe as “r” returning sequence) as stuff from recipes Design Point:  Design Point Cross-language optimization is necessary Thus: Single unified model Avoid copying as much as possible XQuery join example:  XQuery join example Return the recipes that the family likes xquery for $r in db2-fn:xmlcolumn(“RECIPES.RECIPE)/ Recipe for $yummyOne in db2-fn:sqlquery( “select xmlForest(title) from familyLikes”) where $r/ Title = $yummyOne return $r Same join in SQL /XML:  Same join in SQL /XML Return the recipes that the family likes select R.recipe from familyLikes as L, recipies as R where xmlExists(‘ $r/ Recipe[ Title = $yummyOne ] ’ passing by ref R.recipe as “r”, L.title as “yummyOne”) Design Point:  Design Point SQL is better than XQuery XQuery is better than SQL Thus: Support both languages Support language composition Give XQuery access to database data XML Indexes and query matching:  XML Indexes and query matching Indexing every node in a document is very expensive Slow insertion time Large space, log requirements Fast queries Index value of nodes returned by a simple XQuery ( (‘/’ | ‘//’) (axis)? (name-test | kind-test) )+ Given a query Q and an index I Determine if the index is applicable Build index pushdown QI : Q = QC (QI) Shredding XML into tables:  Shredding XML into tables Some applications require relational data Annotated schemas Can break document into XML fragments <xsd:element name=“phone” type=“xsd:string” sql:relation=“employee_tab” sql:field=“phone_col” />

Related presentations


Other presentations created by Elliott

Intergeo 07
01. 10. 2007
0 views

Intergeo 07

SSUP Aug2003
08. 11. 2007
0 views

SSUP Aug2003

dc ship design hf
05. 11. 2007
0 views

dc ship design hf

Insomnia 1
28. 11. 2007
0 views

Insomnia 1

West Bengal
11. 12. 2007
0 views

West Bengal

Chapter25t
12. 12. 2007
0 views

Chapter25t

Objy HSM chep98
25. 10. 2007
0 views

Objy HSM chep98

Roman Jeopardy
29. 10. 2007
0 views

Roman Jeopardy

Heuscher
29. 10. 2007
0 views

Heuscher

H113f
01. 11. 2007
0 views

H113f

HypatianAAAStalk
02. 11. 2007
0 views

HypatianAAAStalk

development
15. 11. 2007
0 views

development

mmir challenges
15. 11. 2007
0 views

mmir challenges

Powerpoint re walk and talk 2
19. 11. 2007
0 views

Powerpoint re walk and talk 2

MATUTKIMUS2002
20. 11. 2007
0 views

MATUTKIMUS2002

logarbindef
26. 11. 2007
0 views

logarbindef

Ruddy Ray
30. 12. 2007
0 views

Ruddy Ray

hong stigma vietnam
02. 01. 2008
0 views

hong stigma vietnam

mit 2002
03. 01. 2008
0 views

mit 2002

Wetlands poster
03. 01. 2008
0 views

Wetlands poster

DoesGenderMatterWEB
04. 01. 2008
0 views

DoesGenderMatterWEB

Lecture15
07. 01. 2008
0 views

Lecture15

Health Politics L2
07. 01. 2008
0 views

Health Politics L2

Lesson08 Running Fixes
07. 11. 2007
0 views

Lesson08 Running Fixes

ASPNETAJAX sunum
28. 11. 2007
0 views

ASPNETAJAX sunum

Logic Lesson 2
30. 10. 2007
0 views

Logic Lesson 2

CAP11 12 Pellicelli
20. 11. 2007
0 views

CAP11 12 Pellicelli

SDUT
24. 02. 2008
0 views

SDUT

02 DataFormats
27. 02. 2008
0 views

02 DataFormats

Fusion2007 Converse
06. 11. 2007
0 views

Fusion2007 Converse

24241
19. 11. 2007
0 views

24241

prezent eng
26. 10. 2007
0 views

prezent eng

Bartending and Me
07. 12. 2007
0 views

Bartending and Me

02Fischer
27. 03. 2008
0 views

02Fischer

bongi sw meetingpamela051006
31. 10. 2007
0 views

bongi sw meetingpamela051006

Synapsid1
02. 01. 2008
0 views

Synapsid1

TARSUV230
31. 10. 2007
0 views

TARSUV230

abortimi 1
06. 11. 2007
0 views

abortimi 1

Theil III
26. 11. 2007
0 views

Theil III

battlemind training novids
28. 12. 2007
0 views

battlemind training novids

Apr05NEIT Nichols presentation
06. 11. 2007
0 views

Apr05NEIT Nichols presentation

Alliance Group BRAC
30. 10. 2007
0 views

Alliance Group BRAC

RAIDEN Group 2007
14. 11. 2007
0 views

RAIDEN Group 2007

dgiavuss ENG
01. 11. 2007
0 views

dgiavuss ENG

Monterey Experiment Plan
06. 11. 2007
0 views

Monterey Experiment Plan