s409 guha

Information about s409 guha

Published on January 3, 2008

Author: Pumbaa

Source: authorstream.com

Content

SPACE EFFICENCY OF SYNOPSIS CONSTRUCTION ALGORITHMS:  SPACE EFFICENCY OF SYNOPSIS CONSTRUCTION ALGORITHMS Sudipto Guha UPENN Synopses:  Synopses Given n input numbers, summarize the input using B numbers, minimizing some error. Examples Histograms – piecewise constant repn. Wavelets – uses the wavelet basis Fourier, Bessel, SVD, what have you… Why space efficiency:  Why space efficiency “Interestingly, according to modern astronomers, space is finite. This is a very comforting thought – particularly for people who can never remember where they left things.” Woody Allen. From a computational viewpoint however… Space is the cruelest resource :  Space is the cruelest resource Resources Time : tweedle thumbs Access (stream): make more passes Program simply will not run – or if data is shifted to disk, will run quite slow(er). Further, if we had more space, maybe we can compute a better (more accurate) synopsis Examples - I:  Examples - I Histograms Many error measures V-OPT, Jagadish etal, 1998 O(n2B) time O(nB) space Only O(n) space at a time (working space) O(n2B2) time and O(n) space Is that the best ? Here: O(n2B) time O(n) space. Example - II:  Example - II (Haar) Wavelets Orthonormal systems For l2 error store the largest B coeffs of input Does not work for non l2 Find the best B coeffs to retain (note, restricted). Garofalakis & Kumar, 04 O(n2B log B) time O(n2B) space, but O(nB) needed at a time (for l1 ) Here O(n) space, and O(n2) time Example - III:  Example - III Extended Wavelets Multiple measures Optimization is similar to Knapsack with choices. Previous best – Deligiannakis and Rossopoulos, 04, O(Mn(B+ log n)) time and space O(MnB), but needing O(nM+MB) at a time Guha, Kim, Shim, 04, reduced space to O(BM+min {nM,B2}) Here, O(BM) space What we will not talk about:  What we will not talk about Approximation algorithms for histograms Range Query Histograms Basically improvement of a factor B in space across the board. B is not always small, specially when n is large The main idea:  The main idea Can we solve using a non DP paradigm ? Well, divide & conquer … Small details – how do we divide ? Interaction Does a small interaction partitioning exist ? How (much size) to represent it ? Ease of finding it (in the given representation) ? A case study - Histograms:  A case study - Histograms Formally, given a signal X find a piecewise constant representation H with at most B pieces minimizing ||X-H||2 Consider one bucket. The mean is the best value. A natural DP … The DP for histograms:  The DP for histograms Err[i,b] = Error of approximating x1,…,xi using b buckets For i=1 to n do For 2 to B do For j=1 to i-1 do Err[i,b] = min Err[i,b], Err[j,b-1] + error(j+1,i) B n What if:  What if We could figure out what was the story at the middlepoint ! Two questions So what ? How ? (use a DP) Wait a minute …:  Wait a minute … We just replaced a DP by another and claimed something … !!! Exactly. The second DP needs only O(n) space. So as the conquer steps re-use/share the same space; the total space is O(n) too. The idea is to use divide and conquer; and use a (small) DP to find the divide step. Is it really that simple ? The code:  The code The end of working space:  The end of working space If you can partition a problem using the working space – you can recompute the solution of the parts at a little extra cost. Working space = total space. How much is little ?:  How much is little ? Wavelets:  Wavelets A set of vectors {1,-1,0,0,0,0…}, {0,0,1,-1,0,0,…},{0,0,0,0,1,-1,0,0},{0,0,0,0,0,0,1-1} {1,1,-1,-1,0,0,0,0},{0,0,0,0,1,1,-1,-1} {1,1,1,1,-1,-1,-1,-1},{1,1,1,1,1,1,1,1} A natural multi-resolution Wavelet Synopsis Construction :  Wavelet Synopsis Construction Formally, given a signal X and the Haar basis {i} find a representation F=i zi i with at most B non-zero zi minimizing some error which a fn of X-F Restriction. Zi is either 0 or h X,i i Debate. Unrestricted or restricted. Omit. Wavelets:  Wavelets ||X-F||1 Long history Matias, Vitter Wang ’98 Garofalakis, Gibbons, ’02 Garofalakis, Kumar, ’04 State of the Art O(n2B log B) time O(n2B) space O(nB) working space Here O(n2log B) time O(n) space SEE ALSO NEXT TALK … What happens to wavelets [GK04] ?:  What happens to wavelets [GK04] ? Extensions:  Extensions Approximation Algorithms Range Query Histograms Extended Wavelets Histograms:  Histograms Saves space across all algorithms except algorithms which extend to general error measure over streams Range Query:  Range Query Same story Open Q: faster algorithm obeying synopsis size That’s all folks:  That’s all folks

Related presentations


Other presentations created by Pumbaa

christmas
16. 08. 2007
0 views

christmas

VERTEBRATES
12. 10. 2007
0 views

VERTEBRATES

Lec1Ch1 2IntroandHardware
15. 10. 2007
0 views

Lec1Ch1 2IntroandHardware

naos harbor project
22. 10. 2007
0 views

naos harbor project

MLA
05. 09. 2007
0 views

MLA

Bronx health disparities
05. 09. 2007
0 views

Bronx health disparities

writewithppt
05. 09. 2007
0 views

writewithppt

Persuading
05. 09. 2007
0 views

Persuading

JF KENNEDY
23. 10. 2007
0 views

JF KENNEDY

M3infectiousdisease sept
23. 10. 2007
0 views

M3infectiousdisease sept

Gerard Fries
24. 10. 2007
0 views

Gerard Fries

hen
04. 10. 2007
0 views

hen

prworkshop 07
02. 11. 2007
0 views

prworkshop 07

reptiles
26. 10. 2007
0 views

reptiles

csf
02. 11. 2007
0 views

csf

Turin OIAs
14. 11. 2007
0 views

Turin OIAs

Chapter15Personality1
17. 11. 2007
0 views

Chapter15Personality1

Passing Off
16. 08. 2007
0 views

Passing Off

Resurrection Slides
16. 08. 2007
0 views

Resurrection Slides

crucifixion
16. 08. 2007
0 views

crucifixion

MMRv2004PPT
28. 12. 2007
0 views

MMRv2004PPT

SlideShow2006 web
03. 01. 2008
0 views

SlideShow2006 web

17 a b
07. 10. 2007
0 views

17 a b

LBA2000
29. 10. 2007
0 views

LBA2000

nchrp w43
05. 01. 2008
0 views

nchrp w43

Personality Disorders
09. 08. 2007
0 views

Personality Disorders

rusko
09. 08. 2007
0 views

rusko

plenary4slides
09. 08. 2007
0 views

plenary4slides

chuang
09. 08. 2007
0 views

chuang

AALAS 02
07. 11. 2007
0 views

AALAS 02

Test score gaps Rev
05. 09. 2007
0 views

Test score gaps Rev

Presentacion PFIF Mar 2005
22. 10. 2007
0 views

Presentacion PFIF Mar 2005

09 user generated content 200407
17. 10. 2007
0 views

09 user generated content 200407

T4 08
22. 10. 2007
0 views

T4 08

Fry1441Lec19
28. 12. 2007
0 views

Fry1441Lec19

ruggiero
05. 09. 2007
0 views

ruggiero

Ferhat Ozcam
26. 11. 2007
0 views

Ferhat Ozcam

OpeningRestrInNYC
05. 09. 2007
0 views

OpeningRestrInNYC

etacharacteristicspp
02. 10. 2007
0 views

etacharacteristicspp

TEVTA
14. 02. 2008
0 views

TEVTA

2 Unit 1 lifestyle
20. 02. 2008
0 views

2 Unit 1 lifestyle

gc
03. 01. 2008
0 views

gc

INDIA06 11 CHengevoss Military
04. 03. 2008
0 views

INDIA06 11 CHengevoss Military

oil moc nyc072407
05. 09. 2007
0 views

oil moc nyc072407

travel
10. 03. 2008
0 views

travel

loh intranets portals
11. 03. 2008
0 views

loh intranets portals

Parasitology
09. 08. 2007
0 views

Parasitology

APrisonEpistles
25. 03. 2008
0 views

APrisonEpistles

2006080704
26. 03. 2008
0 views

2006080704

THE FUTURE OF AVIATION
26. 03. 2008
0 views

THE FUTURE OF AVIATION

GrayHarborWarm
07. 04. 2008
0 views

GrayHarborWarm

econ 3171 ppt slides ch 17
09. 04. 2008
0 views

econ 3171 ppt slides ch 17

PandemicPreparedness compress
10. 04. 2008
0 views

PandemicPreparedness compress

FMLAmediaframing
13. 04. 2008
0 views

FMLAmediaframing

P1D IntroReview 121407
16. 04. 2008
0 views

P1D IntroReview 121407

Gold Mine Pesentation
17. 04. 2008
0 views

Gold Mine Pesentation

IFM Barnhill 06 12 2001
22. 04. 2008
0 views

IFM Barnhill 06 12 2001

ce presentation
29. 02. 2008
0 views

ce presentation

lastdays welsh
16. 08. 2007
0 views

lastdays welsh

coop ppp
23. 11. 2007
0 views

coop ppp

ELL NM
09. 08. 2007
0 views

ELL NM

cbsss3
15. 10. 2007
0 views

cbsss3

Passion2
09. 08. 2007
0 views

Passion2

Personality Disorders Handout
09. 08. 2007
0 views

Personality Disorders Handout

Gaukinlec Marxism
14. 12. 2007
0 views

Gaukinlec Marxism

SÃtningsled
26. 11. 2007
0 views

SÃtningsled

buspres usA
29. 12. 2007
0 views

buspres usA

a thai way
16. 06. 2007
0 views

a thai way

MSG350 laahs SW2 show notes
09. 08. 2007
0 views

MSG350 laahs SW2 show notes

howe project
13. 03. 2008
0 views

howe project

PrayersandWritingsSt Edmund
09. 08. 2007
0 views

PrayersandWritingsSt Edmund

joshi revised
03. 10. 2007
0 views

joshi revised

harkins ref data
05. 09. 2007
0 views

harkins ref data

3474
02. 01. 2008
0 views

3474

bruschi
08. 10. 2007
0 views

bruschi

115 tunnelling 2006
15. 11. 2007
0 views

115 tunnelling 2006

SectorMeeting PrivateSchools
05. 09. 2007
0 views

SectorMeeting PrivateSchools

070227 JointBudgetHearingFi nal
05. 09. 2007
0 views

070227 JointBudgetHearingFi nal

DerryHymanRoughDraft
07. 12. 2007
0 views

DerryHymanRoughDraft

2Hochstein Trends
21. 11. 2007
0 views

2Hochstein Trends

HJepardy
16. 08. 2007
0 views

HJepardy

dcc life cycle
09. 08. 2007
0 views

dcc life cycle

Presentation Liu
16. 10. 2007
0 views

Presentation Liu

sr AkzoNobel4
12. 10. 2007
0 views

sr AkzoNobel4