# prosper

Published on December 28, 2007

Author: Lucianna

Source: authorstream.com

## Bayesian Within the Gates: A View from Particle Physics

Harrison B. Prosper, Florida State University. SAMSI, 24 January 2006.

## Outline

- Measuring Zero as Precisely as Possible!
- Signal/Background Discrimination
- 1-D Example
- 14-D Example
- Some Open Issues
- Summary

## Measuring Zero!

Diamonds may not be forever: neutron ↔ anti-neutron transitions. CRISP Experiment (1982–1985), Institut Laue-Langevin, Grenoble, France.

Method: Fire a gas of cold neutrons onto a graphite foil. Look for annihilation of the anti-neutron component.

Count the number of signal + background events, N. Suppress the putative signal and count the background events, B, independently.

Results: N = 3, B = 7.

Classic 2-parameter counting experiment:

$$N \sim \mathrm{Poisson}(s + b), \qquad B \sim \mathrm{Poisson}(b)$$

Wanted: a statement like $s < u(N, B)$ at 90% CL.

In 1984, no exact solution existed in the particle physics literature! But surely it must have been solved by statisticians. Alas, from Kendall and Stuart I learnt that calculating exact confidence intervals is "a matter of very considerable difficulty".

Exact in what way? Over the ensemble of statements of the form $s \in [0, u)$, at least 90% of them should be true whatever the true value of the signal s AND whatever the true value of the background parameter b. Blame… Neyman (1937).

"Keep it simple, but no simpler" (Albert Einstein)

## Bayesian @ the Gate (1984)

Solution:

- The likelihood: $p(N, B \mid s, b) = \mathrm{Poisson}(N \mid s + b)\,\mathrm{Poisson}(B \mid b)$
- The prior: $p(s, b) = \mathrm{uniform}(s, b)$
- Compute the posterior density: $p(s, b \mid N, B) = p(N, B \mid s, b)\, p(s, b) / p(N, B)$
- Marginalize over b: $p(s \mid N, B) = \int p(s, b \mid N, B)\, db$

This reasoning was compelling to me then, and is much more so now!
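The four steps above can be sketched numerically for the quoted counts, N = 3 and B = 7. This is a minimal illustration, not the calculation used in 1984. The function names, the grid ranges and resolutions, and the reading of "s < u at 90% CL" as the 90% quantile of the marginal posterior are all assumptions made for the example.

```python
import numpy as np


def trapezoid_weights(n, h):
    """Trapezoid-rule quadrature weights on a uniform grid of n points, spacing h."""
    w = np.full(n, h)
    w[0] = w[-1] = 0.5 * h
    return w


def posterior_s(N, B, s_max=30.0, b_max=40.0, n_s=601, n_b=801):
    """Marginal posterior p(s | N, B) on a grid, with a flat prior on s, b >= 0.

    The grid limits and sizes are illustrative assumptions; they must be
    wide enough that the posterior is negligible at the upper edges.
    """
    s = np.linspace(0.0, s_max, n_s)
    b = np.linspace(0.0, b_max, n_b)
    S, Bk = np.meshgrid(s, b, indexing="ij")
    # Likelihood Poisson(N | s+b) * Poisson(B | b); constant factorials dropped
    # because they cancel when the posterior is normalized.
    like = (S + Bk) ** N * np.exp(-(S + Bk)) * Bk ** B * np.exp(-Bk)
    post = like @ trapezoid_weights(n_b, b[1] - b[0])    # marginalize over b
    post /= post @ trapezoid_weights(n_s, s[1] - s[0])   # normalize over s
    return s, post


def upper_limit(s, post, cl=0.90):
    """Value u such that the posterior probability of s < u equals cl."""
    ds = s[1] - s[0]
    cdf = np.concatenate(([0.0], np.cumsum(0.5 * (post[1:] + post[:-1]) * ds)))
    return float(np.interp(cl, cdf, s))


s, post = posterior_s(N=3, B=7)
u = upper_limit(s, post)
print(f"90% credible upper limit on s: {u:.2f}")
```

A grid is used here only because it mirrors the slide's steps one to one; for this flat-prior problem the integral over b can also be done in closed form.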
## Particle Physics Data

proton + anti-proton → positron (e+), neutrino (ν), Jet1, Jet2, Jet3, Jet4

This event "lives" in 3 + 2 + 3 × 4 = 17 dimensions.

CDF/Dzero discovery of the top quark (1995).

[Figure: data in red, signal in green, background in blue and magenta. Dzero: 17-D → 2-D]

But that was then, and now is now! Today we have 2 GHz laptops with 2 GB of memory! It is fun to deploy huge, sometimes unreliable, computational resources, that is, brains, to reduce the dimensionality of data. But perhaps it is now feasible to work directly in the original high-dimensional space, using hardware!

## Signal/Background Discrimination

The optimal solution is to compute

$$p(S \mid x) = \frac{p(x \mid S)\, p(S)}{p(x \mid S)\, p(S) + p(x \mid B)\, p(B)}$$

Every signal/background discrimination method is ultimately an algorithm to approximate this solution, or a mapping thereof. Therefore, if a method is already at the Bayes limit, no other method, however sophisticated, can do better!

Given: $D = (x, y)$, with $x = \{x_1, \ldots, x_N\}$ and $y = \{y_1, \ldots, y_N\}$, a set of N training examples.

Infer: a discriminant function $f(x, w)$, with parameters w.

$$p(w \mid x, y) = \frac{p(x, y \mid w)\, p(w)}{p(x, y)} = \frac{p(y \mid x, w)\, p(x \mid w)\, p(w)}{p(y \mid x)\, p(x)} = \frac{p(y \mid x, w)\, p(w)}{p(y \mid x)}$$

assuming $p(x \mid w) \to p(x)$.

A typical likelihood for classification:

$$p(y \mid x, w) = \prod_{i=1}^{N} f(x_i, w)^{y_i}\, [1 - f(x_i, w)]^{1 - y_i}$$

where $y_i = 0$ for background events and $y_i = 1$ for signal events.

If $f(x, w)$ is flexible enough, then maximizing $p(y \mid x, w)$ with respect to w yields $f = p(S \mid x)$, asymptotically.

However, in a full Bayesian calculation one usually averages with respect to the posterior density:

$$y(x) = \int f(x, w)\, p(w \mid D)\, dw$$

Questions:

1. Do suitably flexible functions f(x, w) exist?
2. Is there a feasible way to do the integral?
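The classification likelihood and the posterior average can be written down concretely. The sketch below is purely illustrative. It assumes a hypothetical two-parameter logistic form for f(x, w), which is far less flexible than what the talk has in mind. It also assumes that a list of weight vectors drawn from p(w|D) is already available, so the integral is replaced by a sample mean. The weight values shown are made up, not sampled.

```python
import numpy as np


def f(x, w):
    """Hypothetical 1-D discriminant: logistic function of w[0] + w[1] * x."""
    return 1.0 / (1.0 + np.exp(-(w[0] + w[1] * x)))


def log_likelihood(w, x, y):
    """log p(y | x, w) = sum_i [ y_i log f_i + (1 - y_i) log(1 - f_i) ]."""
    p = np.clip(f(x, w), 1e-12, 1.0 - 1e-12)  # guard against log(0)
    return float(np.sum(y * np.log(p) + (1 - y) * np.log(1.0 - p)))


def posterior_average(x, w_samples):
    """Monte Carlo estimate of y(x): the mean of f(x, w_k) over samples w_k."""
    return np.mean([f(x, w) for w in w_samples], axis=0)


# Toy data: y = 1 marks signal, y = 0 marks background.
x = np.array([-1.0, 0.0, 2.0])
y = np.array([0, 0, 1])

# Stand-in "posterior samples" (assumed values, for illustration only).
w_samples = [np.array([0.0, 1.0]), np.array([0.5, 2.0]), np.array([-0.2, 1.5])]

print(log_likelihood(w_samples[0], x, y))
print(posterior_average(x, w_samples))
```

Each weight vector defines one discriminant, so the average is taken over functions, not over parameters.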
## Answer 1: Hilbert's 13th Problem!

Prove that the following is impossible:

$$y(x, y, z) = F(A(x), B(y), C(z))$$

In 1957, Kolmogorov proved the contrary conjecture:

$$y(x_1, \ldots, x_n) = F(f_1(x_1), \ldots, f_n(x_n))$$

I'll call such functions, F, Kolmogorov functions.

## Kolmogorov Functions

A neural network is an example of a Kolmogorov function, that is, a function capable of approximating arbitrary mappings $f: \mathbb{R}^N \to U$. The parameters w = (u, a, v, b) are called weights.

## Answer 2: Use Hybrid MCMC

Computational method: Generate a Markov chain (MC) of N points {w} drawn from the posterior density p(w|D) and average over the last M points. Each point corresponds to a network.

Software: Flexible Bayesian Modeling by Radford Neal, http://www.cs.utoronto.ca/~radford/fbm.software.html

## A 1-D Example

- Signal: p + pbar → t q b
- Background: p + pbar → W b b
- NN model class: (1, 15, 1)
- MCMC: 500 tqb + Wbb events. Use the last 20 networks in a MC chain of 500.

[Figure: distributions of x for Wbb and tqb events]

[Figure, as a function of x. Dots: $p(S \mid x) = H_S / (H_S + H_B)$, where $H_S$ and $H_B$ are 1-D histograms. Curves: individual NNs $n(x, w_k)$. Black curve: $\langle n(x, w) \rangle$]

## A 14-D Example (Finding Susy!)

Transverse momentum spectra. Signal: black curve. Signal/Noise: 1/100,000.

Missing transverse momentum spectrum (caused by escape of neutrinos and Susy particles).

Variable count: 4 × (E_T, η, φ) + (E_T, φ) = 14

[Equations: likelihood and prior]

- Signal: 250 p + pbar → top + anti-top (MC) events
- Background: 250 p + pbar → gluino gluino (MC) events
- NN model class: (14, 40, 1) (641-D parameter space!)
- MCMC: Use the last 100 networks in a Markov chain of 10,000, skipping every 20.

## But does it Work?

Signal to noise can reach 1/1 with an acceptable signal strength.

Let $d(x) = N\, p(x \mid S) + N\, p(x \mid B)$ be the density of the data, containing 2N events, assuming, for simplicity, p(S) = p(B). A properly trained classifier y(x) approximates

$$p(S \mid x) = \frac{p(x \mid S)}{p(x \mid S) + p(x \mid B)}$$

Therefore, if the signal and background events are weighted with y(x), we should recover the signal density, since $y(x)\, d(x) \approx N\, p(x \mid S)$.

Amazingly well!

## Some Open Issues

- Why does this insane function $p(w_1, \ldots, w_{641} \mid x_1, \ldots, x_{500})$ behave so well? 641 parameters > 500 events!
- How should one verify that an n-D (n ~ 14) swarm of simulated background events matches the n-D swarm of observed events (in the background region)?
- How should one verify that y(x) is indeed a reasonable approximation to the Bayes discriminant, p(S|x)?

## Summary

- Bayesian methods have been, and are being, used with considerable success by particle physicists. Happily, the frequentist/Bayesian Cold War is abating!
- The application of Bayesian methods to highly flexible functions, e.g., neural networks, is very promising and should be broadly applicable.
- Needed: A powerful way to compare high-dimensional swarms of points. Agree, or not agree, that is the question!
