Bayesian analysis and problems


Published on June 15, 2007

Author: funnyside

Source: authorstream.com


Bayesian analysis and problems with the frequentist approach
Rencontres de Moriond (QCD), La Thuile, 17-24 March 2007
Glen Cowan, Physics Department, Royal Holloway, University of London
[email protected], www.pp.rhul.ac.uk/~cowan

Outline
1 Probability: frequentist vs. subjective (Bayesian)
2 Fitting with systematic errors: nuisance parameters; marginalization; treatment of inconsistent data; error on the error
3 Bayesian methods for limits
4 Comment on Bayesian goodness-of-fit
5 Comment on Bayesian computation (MCMC)
plus extra slides

Frequentist statistics: general philosophy
In frequentist statistics, probabilities are associated only with the data, i.e., with outcomes of repeatable observations: probability = limiting frequency. Probabilities such as P(Higgs boson exists) or P(0.117 < alpha_s < 0.121) are either 0 or 1, but we don't know which. The tools of frequentist statistics tell us what to expect, under the assumption of certain probabilities, about hypothetical repeated observations. The preferred theories (models, hypotheses, ...) are those for which our observations would be considered 'usual'.

Bayesian statistics: general philosophy
In Bayesian statistics, the interpretation of probability is extended to degree of belief (subjective probability), which can be used for hypotheses. Bayes' theorem then reads

  P(H | x) = P(x | H) π(H) / Σ_i P(x | H_i) π(H_i),

where P(H | x) is the posterior probability (i.e., after seeing the data), π(H) is the prior probability (before seeing the data), P(x | H) is the probability of the data assuming hypothesis H (the likelihood), and the normalization involves a sum over all possible hypotheses. Bayesian methods can provide a more natural treatment of non-repeatable phenomena: systematic uncertainties, the probability that the Higgs boson exists, ... There is no golden rule for priors (the 'if-then' character of Bayes' theorem).

Statistical vs. systematic errors
Statistical errors: how much would the result fluctuate upon repetition of the measurement? This implies some set of assumptions to define the probability of the outcome of the measurement. Systematic errors: what is the uncertainty in my result due to uncertainty in my assumptions, e.g., model (theoretical) uncertainty, or the modelling of the measurement apparatus? Systematic errors are usually taken to mean sources of error that do not vary upon repetition of the measurement; they often result from uncertain values of, e.g., calibration constants, efficiencies, etc.

Systematic errors and nuisance parameters
The response of the measurement apparatus (x = true value, y = measured value) is never modelled perfectly. The model can be made to approximate the truth better by including more free parameters: systematic uncertainty ↔ nuisance parameters.

Example: fitting a straight line
Data: (x_i, y_i, σ_i), i = 1, ..., n. Model: the measured y_i are independent and Gaussian with mean θ0 + θ1 x_i; assume the x_i and σ_i are known. Goal: estimate θ0 (we don't care about θ1).

Frequentist approach: the correlation between the estimators of θ0 and θ1 causes the errors to increase; the standard deviations are read from tangent lines to the χ² contour. Additional information on θ1, e.g. a measurement t1 of θ1, improves the accuracy of the estimate of θ0.

Bayesian method: we need to associate prior probabilities with θ0 and θ1, e.g., π(θ0, θ1) = π0(θ0) π1(θ1), where π1(θ1) is based on the previous measurement and π0(θ0) reflects 'prior ignorance' (in any case much broader than the likelihood). Putting this into Bayes' theorem gives: posterior ∝ likelihood × prior.

Bayesian method (continued): the ability to marginalize over nuisance parameters is an important feature of Bayesian statistics. We integrate (marginalize) p(θ0, θ1 | x) over θ1 to find p(θ0 | x). In this example we can do the integral in closed form (rare).

Digression: marginalization with MCMC
Bayesian computations involve integrals like ∫ p(θ0, θ1 | x) dθ1, often of high dimensionality and impossible in closed form; they are also impossible with 'normal' acceptance-rejection Monte Carlo.
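In the two-parameter straight-line example the marginalization is still feasible by brute force on a grid; MCMC becomes necessary only in higher dimensions. A minimal sketch with invented numbers (flat priors, not the values used in the slides):

```python
import numpy as np

rng = np.random.default_rng(1)

# toy data: y_i = theta0 + theta1 * x_i + Gaussian noise (invented numbers)
theta0_true, theta1_true, sigma = 1.0, 0.5, 0.2
x = np.linspace(0.0, 1.0, 10)
y = theta0_true + theta1_true * x + sigma * rng.standard_normal(x.size)

# joint posterior p(theta0, theta1 | y) on a grid, flat priors
t0 = np.linspace(0.0, 2.0, 400)
t1 = np.linspace(-1.0, 2.0, 400)
T0, T1 = np.meshgrid(t0, t1, indexing="ij")
chi2 = (((y - (T0[..., None] + T1[..., None] * x)) / sigma) ** 2).sum(axis=-1)
post = np.exp(-0.5 * (chi2 - chi2.min()))

# marginalize over the nuisance parameter theta1
p_t0 = post.sum(axis=1)
p_t0 /= p_t0.sum()
mean_t0 = (t0 * p_t0).sum()   # posterior mean of the parameter of interest
```

The marginal p(θ0 | y) is then summarized by its mean and standard deviation, as on the slides.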
Markov Chain Monte Carlo (MCMC) has revolutionized Bayesian computation (Google for 'MCMC', 'Metropolis', 'Bayesian computation', ...). MCMC generates a correlated sequence of random numbers, so it cannot be used for many applications, e.g., detector MC, and the effective statistical error is greater than for n independent values. The basic idea: sample the multidimensional posterior, then look, e.g., only at the distribution of the parameters of interest.

Example: posterior pdf from MCMC
Sample the posterior pdf from the previous example with MCMC, and summarize the pdf of the parameter of interest with, e.g., its mean, median, standard deviation, etc. Although the numerical values of the answer here are the same as in the frequentist case, the interpretation is different (sometimes unimportant?).

Bayesian method with vague prior
Suppose we don't have a previous measurement of θ1 but rather some vague information, e.g., a theorist tells us: θ1 ≥ 0 (essentially certain), and θ1 should have order of magnitude less than 0.1 'or so'. Under pressure, the theorist sketches a prior; from this we obtain posterior probabilities for θ0. We do not need to get the theorist to 'commit' to this prior; the final result has an 'if-then' character.

Sensitivity to prior
Vary the prior to explore how extreme your prior beliefs would have to be to justify various conclusions (sensitivity analysis): try an exponential with different mean values, try different functional forms, ...

A more general fit (symbolic)
Given measurements y_i and (usually) covariances V_ij, with predicted values μ_i(x; θ) + b_i (x = control variable, θ = parameters, b = bias), often taking E[y_i] = μ_i(x; θ), minimize

  χ²(θ) = (y − μ(θ))ᵀ V⁻¹ (y − μ(θ)).

This is equivalent to maximizing L(θ) ∝ e^(−χ²/2), i.e., least squares is the same as maximum likelihood with a Gaussian likelihood function.

Its Bayesian equivalent
Take the joint probability for all parameters and use Bayes' theorem, p(θ, b | y) ∝ L(y | θ, b) π(θ) π_b(b). To get the desired probability for θ, integrate (marginalize) over b. The posterior is Gaussian with mode the same as the least-squares estimator, and σ_θ the same as from χ² = χ²_min + 1. (Back where we started!)
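The 'back where we started' result can be checked numerically in the simplest case: a single measurement with a Gaussian bias prior and a flat prior for the parameter, where marginalizing over b should give a Gaussian posterior of width sqrt(σ_stat² + σ_sys²). A sketch with invented numbers:

```python
import numpy as np

y_meas, s_stat, s_sys = 3.0, 0.4, 0.3     # invented numbers
mu = np.linspace(-2.0, 8.0, 2001)
b = np.linspace(-3.0, 3.0, 601)
M, B = np.meshgrid(mu, b, indexing="ij")

# likelihood x Gaussian bias prior, flat prior in mu
joint = np.exp(-0.5 * ((y_meas - M - B) / s_stat) ** 2) \
      * np.exp(-0.5 * (B / s_sys) ** 2)
post = joint.sum(axis=1)       # marginalize over the bias b
post /= post.sum()

mean = (mu * post).sum()
std = np.sqrt(((mu - mean) ** 2 * post).sum())
# std should match sqrt(0.4**2 + 0.3**2) = 0.5, the usual quadrature sum
```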
The error on the error
Some systematic errors are well determined, e.g., the error from a finite Monte Carlo sample. Some are less obvious: do the analysis in n 'equally valid' ways and extract the systematic error from the 'spread' in the results. Some are educated guesses: guess the possible size of missing terms in a perturbation series, vary the renormalization scale, ... Can we incorporate this 'error on the error'? (cf. G. D'Agostini 1999; Dose & von der Linden 1999)

A prior for the bias, π_b(b), with longer tails
A longer-tailed prior for the bias represents the 'error on the error'; the standard deviation of π_s(s) is σ_s. For a pure Gaussian (σ_s = 0), P(|b| > 4σ_sys) = 6.3 × 10^-5; for σ_s = 0.5, P(|b| > 4σ_sys) = 6.5 × 10^-3.

A simple test
Suppose the fit effectively averages four measurements; take σ_sys = σ_stat = 0.1, uncorrelated. Case #1: the data appear compatible. We usually summarize the posterior p(μ | y) with its mode and standard deviation.

Simple test with inconsistent data
Case #2: there is an outlier. → The Bayesian fit is less sensitive to the outlier. → The error is now connected to goodness-of-fit.

Goodness-of-fit vs. size of error
In a least-squares fit, the value of the minimized χ² does not affect the size of the error on the fitted parameter. In a Bayesian analysis with a non-Gaussian prior for the systematics, a high χ² corresponds to a larger error (and vice versa). (Shown with 2000 repetitions of the experiment, σ_s = 0.5, no actual bias.)

Summary
Bayesian methods allow (indeed require) prior information about the parameters being fitted.
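The inconsistent-data behaviour can be imitated with a toy version of the four-measurement test. Here the longer-tailed bias prior is modelled as a Gaussian core plus a wide Gaussian tail (a stand-in for the prior on the slides; the mixture parameters are invented), and the posterior for μ is computed on a grid:

```python
import numpy as np

def posterior_summary(y, sigma, tail_frac=0.0, tail_scale=5.0):
    """Posterior mean/std for mu with flat prior; each measurement's density
    is a Gaussian core plus an optional wide-Gaussian tail (longer tails)."""
    mu = np.linspace(0.0, 2.5, 5001)
    logp = np.zeros_like(mu)
    for yi in y:
        r = (yi - mu) / sigma
        core = np.exp(-0.5 * r**2)
        tail = np.exp(-0.5 * (r / tail_scale) ** 2) / tail_scale
        logp += np.log((1 - tail_frac) * core + tail_frac * tail)
    p = np.exp(logp - logp.max())
    p /= p.sum()
    mean = (mu * p).sum()
    std = np.sqrt(((mu - mean) ** 2 * p).sum())
    return mean, std

y_out = [1.0, 1.0, 1.0, 1.6]   # case #2: one outlier
g_mean, g_std = posterior_summary(y_out, 0.14, tail_frac=0.0)  # pure Gaussian
t_mean, t_std = posterior_summary(y_out, 0.14, tail_frac=0.1)  # longer tails
# with longer tails the fit is pulled less by the outlier and the error grows
```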
This type of prior information can be difficult to incorporate into a frequentist analysis. It will be particularly relevant when estimating uncertainties on predictions of LHC observables that may stem from theoretical uncertainties, parton densities based on inconsistent data, etc. Prior ignorance is not well defined; if that's what you've got, don't expect Bayesian methods to provide a unique solution. Try a reasonable variation of priors: if that yields large variations in the posterior, you don't have much information coming in from the data. You do not have to be exclusively a Bayesian or a frequentist; use the right tool for the job.

Extra slides

Happy Birthday, MINUIT
(Thanks to Al Eisner for pointing this out!)

Bayes' theorem
From the definition of conditional probability we have P(A|B) = P(A ∩ B) / P(B) and P(B|A) = P(A ∩ B) / P(A), but P(A ∩ B) = P(B ∩ A), so

  P(A|B) = P(B|A) P(A) / P(B)   (Bayes' theorem).

First published (posthumously) by the Reverend Thomas Bayes (1702−1761): An essay towards solving a problem in the doctrine of chances, Philos. Trans. R. Soc. 53 (1763) 370; reprinted in Biometrika 45 (1958) 293.

Some Bayesian references
P. Gregory, Bayesian Logical Data Analysis for the Physical Sciences, CUP, 2005
D. Sivia, Data Analysis: A Bayesian Tutorial, OUP, 2006
S. Press, Subjective and Objective Bayesian Statistics: Principles, Models and Applications, 2nd ed., Wiley, 2003
A. O'Hagan, Kendall's Advanced Theory of Statistics, Vol. 2B: Bayesian Inference, Arnold Publishers, 1994
A. Gelman et al., Bayesian Data Analysis, 2nd ed., CRC, 2004
W. Bolstad, Introduction to Bayesian Statistics, Wiley, 2004
E.T. Jaynes, Probability Theory: The Logic of Science, CUP, 2003

Example: Poisson data with background
Count n events, e.g., in a fixed time or integrated luminosity, with s = expected number of signal events and b = expected number of background events, so that n ~ Poisson(s + b):

  P(n; s, b) = (s + b)^n e^(−(s+b)) / n!

Sometimes b is known; other times it is uncertain in some way.
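The Bayes' theorem slide above can be illustrated with a quick numerical example (hypothetical diagnostic-test numbers, not from the slides):

```python
# Bayes' theorem P(A|B) = P(B|A) P(A) / P(B) on a toy diagnostic test:
# hypothetical numbers: P(disease) = 0.01, P(pos|disease) = 0.95,
# P(pos|healthy) = 0.05.
p_d = 0.01
p_pos_d = 0.95
p_pos_h = 0.05

# normalization: sum over the possible hypotheses (disease, healthy)
p_pos = p_pos_d * p_d + p_pos_h * (1 - p_d)

# posterior probability of disease given a positive result
p_d_pos = p_pos_d * p_d / p_pos
# a positive result raises the probability from 1% to about 16%
```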
Goal: measure or place limits on s, taking into consideration the uncertainty in b. This is widely discussed in the HEP community; see e.g. the proceedings of the PHYSTAT meetings (Durham, Fermilab, CERN workshops, ...).

The Bayesian approach to limits
In Bayesian statistics we need to start with a prior pdf π(θ), which reflects the degree of belief about θ before doing the experiment. Bayes' theorem tells us how our beliefs should be updated in light of the data x: p(θ | x) ∝ L(x | θ) π(θ). Integrate the posterior pdf p(θ | x) to give an interval with any desired probability content; e.g., for a Poisson parameter, the 95% CL upper limit s_up is found from

  0.95 = ∫_0^s_up p(s | n) ds.

Bayesian prior for a Poisson parameter
Include the knowledge that s ≥ 0 by setting the prior π(s) = 0 for s < 0. One often tries to reflect 'prior ignorance' with, e.g., a flat prior π(s) = constant for s ≥ 0. This is not normalized, but that is OK as long as L(s) dies off for large s. It is not invariant under a change of parameter: if we had instead used a flat prior for, say, the mass of the Higgs boson, this would imply a non-flat prior for the expected number of Higgs events. It doesn't really reflect a reasonable degree of belief, but it is often used as a point of reference, or viewed as a recipe for producing an interval whose frequentist properties can be studied (the coverage will depend on the true s).

Bayesian interval with flat prior for s
Solve numerically to find the limit s_up. For the special case b = 0, the Bayesian upper limit with a flat prior is numerically the same as the classical one (a 'coincidence'). Otherwise the Bayesian limit is everywhere greater than the classical one ('conservative'). It never goes negative, and it does not depend on b if n = 0.

Upper limit versus b
If n = 0 is observed, should the upper limit depend on b?
Classical: yes. Bayesian: no. Feldman-Cousins: yes (Feldman & Cousins, PRD 57 (1998) 3873).

Coverage probability of confidence intervals
Because of the discreteness of Poisson data, the probability for the interval to include the true value is in general greater than the confidence level ('over-coverage').

Bayesian limits with uncertainty on b
The uncertainty on b goes into the prior, e.g., π(s, b) = π_s(s) π_b(b). Put this into Bayes' theorem, marginalize over b, then use p(s | n) to find intervals for s with any desired probability content. The controversial part here is the prior for the signal, π_s(s) (the treatment of the nuisance parameters is easy).

Discussion on limits
Different sorts of limits answer different questions. A frequentist confidence interval does not (necessarily) answer 'what do we believe the parameter's value is?'. Coverage is nice, but is it crucial? Look at the sensitivity, e.g., E[s_up | s = 0]. Consider also: politics and the need for consensus/conventions; convenience and the ability to combine results; ... For any result, the consumer will compute (mentally or otherwise) a posterior using the consumer's own prior, so we need to report the likelihood (or a summary thereof).

Uncertainty from the parametrization of PDFs
Try, e.g., the MRST or CTEQ functional forms. The form should be flexible enough to describe the data; a frequentist analysis has to decide how many parameters are justified. In a Bayesian analysis we can insert as many parameters as we want, but constrain them with priors. Suppose, e.g., based on a theoretical bias for things not too bumpy, that a certain parametrization 'should hold to 2%'. How do we translate this into a set of prior probabilities?

Residual function
Try, e.g., multiplying by a 'residual function' r(x), where r(x) is something very flexible, e.g., a superposition of Bernstein polynomials (mathworld.wolfram.com). Assign priors for the coefficients centred around 0, with widths chosen to reflect the uncertainty in xf(x) (e.g. a couple of percent). → Ongoing effort.
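The flat-prior Poisson upper limit discussed earlier can be computed by direct numerical integration of the posterior; a minimal sketch (the grid size and cutoff are arbitrary choices):

```python
import numpy as np

def upper_limit(n, b, cl=0.95, s_max=50.0, n_grid=200001):
    """Bayesian upper limit on s with a flat prior for s >= 0, n ~ Poisson(s + b)."""
    s = np.linspace(0.0, s_max, n_grid)
    # log-likelihood up to constants: n*log(s + b) - (s + b)
    logL = n * np.log(s + b + 1e-300) - (s + b)
    L = np.exp(logL - logL.max())
    cdf = np.cumsum(L)
    cdf /= cdf[-1]                       # normalized posterior CDF
    return s[np.searchsorted(cdf, cl)]   # smallest s with CDF >= cl

lim0 = upper_limit(0, 0.0)
lim5 = upper_limit(0, 5.0)
```

For n = 0 the limit is −ln(0.05) ≈ 3.0 independently of b, as stated on the slide; for b = 0 the limits reproduce the classical values (the 'coincidence' noted above).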
MCMC basics: the Metropolis-Hastings algorithm
Goal: given an n-dimensional pdf p(θ), generate a sequence of points θ(1), θ(2), ...
1) Start at some point θ(0).
2) Generate a proposal θ' from a proposal density q(θ'; θ(i)), e.g., a Gaussian centred about θ(i).
3) Form the Hastings test ratio α = min[1, p(θ') q(θ(i); θ') / (p(θ(i)) q(θ'; θ(i)))].
4) Generate u uniform in [0, 1].
5) If u ≤ α, move to the proposed point, θ(i+1) = θ'; else repeat the old point, θ(i+1) = θ(i).
6) Iterate.

Metropolis-Hastings (continued)
This rule produces a correlated sequence of points (note how each new point depends on the previous one). For our purposes this correlation is not fatal, but the statistical errors are larger than the naive estimate for independent points. The proposal density can be (almost) anything, but choose it so as to minimize the autocorrelation. The proposal density is often taken to be symmetric, q(θ'; θ) = q(θ; θ'), in which case the Metropolis-Hastings test ratio reduces to α = min[1, p(θ') / p(θ(i))]: if the proposed step is to a point of higher p(θ), take it; if not, take the step only with probability p(θ') / p(θ(i)). If the proposed step is rejected, hop in place.

Metropolis-Hastings caveats
One can only prove that the sequence of points follows the desired pdf in the limit where it runs forever. There may be a 'burn-in' period during which the sequence does not yet follow p(θ). Unfortunately there are few useful theorems to tell us when the sequence has converged: look at trace plots and the autocorrelation; check the result with a different proposal density; and if you think it has converged, try it again with 10 times more points.
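The steps above, in the symmetric-proposal (Metropolis) case, can be sketched as follows; the 2-d Gaussian target and the step size are invented stand-ins for a real posterior:

```python
import numpy as np

rng = np.random.default_rng(42)

def log_post(theta):
    # example target: standard 2-d Gaussian (stand-in for p(theta0, theta1 | y))
    return -0.5 * (theta[0] ** 2 + theta[1] ** 2)

def metropolis(log_p, theta_start, n_steps, step=1.0):
    """Metropolis algorithm with a symmetric Gaussian proposal."""
    theta = np.asarray(theta_start, float)
    chain = np.empty((n_steps, theta.size))
    lp = log_p(theta)
    for i in range(n_steps):
        prop = theta + step * rng.standard_normal(theta.size)
        lp_prop = log_p(prop)
        # accept with probability min(1, p(prop) / p(theta))
        if np.log(rng.random()) < lp_prop - lp:
            theta, lp = prop, lp_prop
        chain[i] = theta   # if rejected, 'hop in place'
    return chain

chain = metropolis(log_post, [5.0, -5.0], 20000)
burned = chain[2000:]   # discard a burn-in period
```

In practice one would examine trace plots and the autocorrelation of `burned`, tune `step` to balance the acceptance rate against correlation, and rerun with a different proposal density as a cross-check.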
