Published on October 15, 2007
Slide1: A Young Mathematician’s Reflection on Vision: Illposedness & Regularizations. Jackie (Jianhong) Shen, School of Mathematics, University of Minnesota, Minneapolis, MN 55455. Workshop on Regularization in Statistics, Banff International Research Station, Canada; September 6-11, 2003.
Abstract: Vision, the perception of the 3-D world from its 2-D partial projections onto the left and right retinas, is fundamentally an illposed inverse problem. But after millions of years of evolution, human vision has become astonishingly accurate and satisfactory. How could it have become such a remarkable inverse problem solver, and what are the hidden (or subconscious) regularization techniques it employs? This talk attempts to reflect and shed some light on these “billion-dollar” or Nobel-level questions (the works of several Nobel Laureates will indeed be mentioned), based on the limited but unique experience and philosophy of a young mathematician. In particular, we discuss the topological, geometric, statistical/Bayesian, and psychological regularization techniques of visual perception. I owe deep appreciation to my former advisors, mentors, and teachers for leading me to the Door of Vision: Professors Gil Strang, Tony Chan, Stan Osher, David Mumford, Stu Geman, and Dan Kersten.
Agenda: An Abstract View of Vision: Imaging and Perception. Illposedness of Visual Perception: Root and Solutions. Conscious or Subconscious Regularizations: Topological Regularization: Generic Viewpoint Principle; Geometric Regularization: Perception & Role of Curvature; Statistical Regularization: Gibbs Fields and Learning; Psychological Regularization: Role of Weber’s Law.
Dedication: “Dedicated to all pioneering mathematicians in Mathematical Image and Vision Analysis (Miva), on whose shoulders we the younger generations are standing, and on whose shoulders we the younger must think deeper, speak louder, and look further beyond …” - Jackie Shen
Slide5: An Abstract View of Vision: Passive Imaging & Active Visual Perception
Biological or Digital (Passive) Imaging Process: [Figure: a 3-D world scene; θ: viewing position | angle; lattice (or continuum) of photoreceptors; u: 2-D image on the biological or digital retina.] The optical imaging process (human vision or digital camera) can be modeled as a function (or operator): Here, 2 and 3 indicate the dimension of the spatial variables. G and R denote the configuration (scene | geometry) and the reflectance. All the α’s are parameters such as I and θ.
Active Visual Perception: [Figure: same setting as the passive imaging process.] Perception is to reconstruct the 3-D world (geometry, topology, material surface properties, light source, etc.) from the observed 2-D image:
A Quick Overview of Mathematicians’ Missions: Develop mathematical formulations for all the important psychological and cognitive discoveries in visual perception. As in geometry and physics (Hamiltonian, Statistical or Quantum Mechanics), unify and extract the most fundamental laws or axioms of perception, and develop their mathematical foundations. Decode the computational and information-theoretic efficiency behind human visual perception, and integrate such knowledge into digital and computational decision and optimization algorithms.
Develop novel mathematical tools and theories arising from such studies, and apply them to other scientific and engineering fields, such as data visualization/mining; pattern recognition/learning/coding; multiscale modeling.
Is It Risky for a Young Mathematician to Study Perception?: Quoting Albert Einstein: “The object of all science, whether natural science or psychology, is to co-ordinate our experiences and to bring them into a logical system.” - from “Space and Time in Pre-Relativity Physics,” May, 1921. To me, as for Special or General Relativity, being logical necessarily means being mathematical: cause-effect modeling and description (e.g., learning theory); formulating basic psychological/perceptual laws (e.g., Weber); simulating brain computation using computers; verifying existing data, and furnishing reasonable predictions; etc.
Slide10: Illposedness of Visual Perception: Root of Illposedness; Solutions to Illposedness
Vision is an Inverse Computer Graphics Problem: 3-D Computer Graphics (Hollywood animated movies): G: geometric configuration of a 3-D scene (bike, table, …); R: material surface property: reflectance field; I: illuminance (incident lights, light source and type). Goal: Generate a 2-D image U, which looks exactly like the image that one sees if facing such a real 3-D scene. Visual Perception is an Inverse C.G. Problem (Kersten): Given a 2-D image U, one attempts to reconstruct its 3-D world: G, R, I, viewing position & angle, etc. Dan Kersten (1997)
Vision is an illposed Inverse Problem (A-1): A. Geometry is not invertible: depth or range is lost! Mathematical Model (1): Projective Imaging. P: (x1, x2, x3) → (x1, x2) is not invertible! For any given 2-D curve γ2, there are infinitely many 3-D curves γ3, so that P(γ3) = γ2. Example.
γ3 = (cos t, sin t, t); γ2 is just like the projection of a circle.
Vision is an illposed Inverse Problem (A-2): Mathematical Model (2): Perspective Imaging. [Figure: imaging plane (or retina); an ellipse, a parabola, a hyperbola.] All three different types of curves are imaged as a 2-D circle!
Vision is an illposed Inverse Problem (B): B. The Reflectance-Illuminance entanglement. [Figure: two different (R, I) pairs; different 3-D scenes, identical images!]
But vision still makes sense, after millions of years of evolution, which implies that: The human vision system is a well-developed system of software and hardware, which can solve this highly ill-posed inverse problem efficiently and robustly. Fundamental Questions: What kind of regularization techniques does the human vision system employ to conquer the illposedness? What kind of features or variables does it regularize?
Deterministic Model: Tikhonov Regularization: Tikhonov conditioning technique for inverse problems: forward data generation: U0 = F(X); backward (inverse) reconstruction of X: min d[F(X), U0] + R(X), where d is a suitable discrepancy metric, and R is the regularizing/conditioning term. Example (cleaning and sharpening of astronomical images): U0 = h * X + n (h: atmospheric blurring; n: white noise). Such an inverse problem is typically solved by minimizing ‖h * X − U0‖² + R(X), i.e., the model above with a least-squares discrepancy.
Tikhonov Meets Bayes: Perception as Inference: Bayesian Perception: maximize Prob(X | U0) ∝ Prob(U0 | X) Prob(X), where X = (geometry G, reflectance R, …). Gibbs’ energy formula: Prob(y) = 1/Z exp(−E[y] / kT). Thus, in terms of energies, we have E[X | U0] = E[U0 | X] + E[X] (up to an additive constant), which leads to Tikhonov Inversion! Conclusion: It is our a priori knowledge of the world (i.e., Prob[X]) that regularizes our visual perception!
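The Tikhonov recipe on the slide above can be tried on a small 1-D deblurring problem. The sketch below is only an illustration under stated assumptions: the blur h is a hypothetical moving average written as a matrix H, the discrepancy d is least squares, and the regularizer is the simplest quadratic choice R(X) = λ‖X‖², which the slides do not prescribe. In Bayesian terms this corresponds to a Gaussian noise model and a Gaussian prior on X.

```python
import numpy as np

def blur_matrix(n, width=2):
    """Hypothetical blur h: a moving average over up to 2*width+1 samples."""
    H = np.zeros((n, n))
    for i in range(n):
        lo, hi = max(0, i - width), min(n, i + width + 1)
        H[i, lo:hi] = 1.0 / (hi - lo)
    return H

def tikhonov_deblur(H, u0, lam):
    """Minimize ||H x - u0||^2 + lam * ||x||^2 (assumed quadratic R).

    Setting the gradient to zero gives the normal equations
    (H^T H + lam I) x = H^T u0, solved here directly.
    """
    n = H.shape[1]
    return np.linalg.solve(H.T @ H + lam * np.eye(n), H.T @ u0)

# Forward model U0 = h * X + n on a synthetic box signal.
rng = np.random.default_rng(0)
n = 40
x_true = np.zeros(n)
x_true[15:25] = 1.0
H = blur_matrix(n)
u0 = H @ x_true + 0.01 * rng.standard_normal(n)

# Two regularization strengths: a larger lam pulls the estimate toward 0.
x_small = tikhonov_deblur(H, u0, 1e-3)
x_large = tikhonov_deblur(H, u0, 1e-1)
```

The weight `lam` plays the role of the prior strength: as it grows, the reconstruction relies less on the data term and more on the assumed prior.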
A Priori Knowledge (Common Sense) of the World Regularizes Vision: G: Knowledge of curve and shape geometry; I: Knowledge of light sources & illuminance (sun, lamps, indoor or outdoor, …); R: Knowledge of materials (wood, bricks, …) and surface reflectance (metal shines and water sparkles, …); Θ: Knowledge of the viewers (often standing perpendicularly to the ground, viewing more horizontally, several feet away for indoor scenes, …) ……
Example I: Prior Knowledge on Curves & Boundaries: Straight lines: length(γ), or 1-D Hausdorff measure. Euler’s elastica (Mumford, Chan-Kang-Shen): Piecewise elastica (Shah): which allows corners or hinges (i.e. the discrete set S) along the curve.
Example II. Prior Knowledge on Material Reflectance: The Mumford-Shah free-boundary model: [Equation labels: curve energy; reflectance distribution R(x) in the visual field Ω.] Many challenging free-boundary problems. Connections to interface motions (such as the mean curvature motion).
Deeper Question: What to Learn & How: How does the vision system subconsciously choose what knowledge to learn and store, out of the massive visual data in daily life? For such knowledge, to what degree of regularity or compression does the vision system “decide” to process it, in order to achieve maximum efficiency and robustness? How can such activities be mathematically modeled (or quantified)? Partial answer by Nobel Laureate (1972) Gerald Edelman: Neural Darwinism (Basic Books, Inc., 1987): The vision system in each individual operates as a selective system resembling natural selection in evolution, but operating by different mechanisms.
Slide22: Topological Regularization: Principle of Generic Viewpoint vs.
Theorem of Transversality in Differential Topology
Transversal Phenomenon: [Figure: projection; A, B, C; digital or biological 2-D retina; 3-D world.] … and … are transversal, in the sense of differential topology: … and … are not tangent at any point.
Transversal Is Generic: Stability & Universality: Suppose that … (2, 3 are dimensions, and P is the projection) are transversal. Then for any perturbation to …, within an ε distance under a suitable smoothness norm, … are still transversal. Theorem (Milnor): Let M and N be two smooth manifolds, and P ⊂ N a submanifold. Then any smooth mapping f: M → N can be approximated arbitrarily closely by one which is transversal to P. [Figure: M, N, P.] Stability or Openness; Universality or Denseness.
Principle of Generic Viewpoint in Vision: Generic Viewpoint (Nakayama & Shimojo, Science; Freeman, Nature): What we see is generic w.r.t. the viewpoint! Mathematical Model: Suppose we see an image … Then …, where … denote small perturbations of the α’s, controlled by ε, and D is a vision-motivated measure, metric or distance. In terms of curves, D can be based on invariants of differential topology: smoothness, corners, connected components, branching, …
How G.V. Principle Regularizes Perception: 3 Examples: How the Generic Viewpoint Principle regularizes visual perception (conditioning the illposedness): Set up preference/biases among all possible scenes. (Uniqueness) Lead to topologically unique solutions. (Stability) Lead to the stability of perceptual solutions.
Example 1: Curve Perception: Given the image γ, which 3-D scene (curve) satisfies the G.V. Principle? Answer: G1. For G2, when the viewing angle θ is perturbed a little bit, the image would be: … which is not diffeomorphic to γ (since branching degree d and number of connected components c are invariants).
Take the metric D, for example, D = difference in d + difference in c + other visual metric terms. [Figure: a pitchfork OR two curves.]
Example 2: Shape Perception: Given the image u, which 3-D scenario satisfies the G.V. Principle? Answer: Planar Plate. If the viewer moves a little bit (i.e. viewing angle θ → θ+ε), the image of the V-wing scene would become: … The two 2-D shapes are NOT diffeomorphic since a corner is an invariant. (This is because: if f is a diffeomorphism, then the Jacobian Jf is non-singular, and a corner can never be smoothed out into a straight line.) [Figure: a painted planar plate; a uniform V-wing.]
Example 3: Shape Perception: Simian or Human: For image u, we “see” two human faces. No doubt. If we move the face contours closer till touching (image v), do we still perceive two human faces, or two simians? Psychologists show that most of us perceive the latter. WHY? My explanation based on the G.V. Principle: In 3-D, give the two faces a slight distance difference (from the viewer). Then the perception of two human faces violates the G.V. Principle, while the perception of two (100% transparent except along the outlines) simian faces satisfies the G.V. Principle, or the Stability Principle. [Classical interpretation is based on Gestalt.] Dan Kersten, Vision Psychology, UMN. [Figure: image u (distance d): two human faces; image v (d=0): two simian faces.]
Slide30: Geometric Regularization: Perception and Roles of Curvature
Why Curvature: (A) Architectural Evidence: If human vision were blind to curvature, architects would not have bothered to build these beautiful structures! (Disclaimer: All pictures are Internet downloads, not taken by the author.) St.
Louis Arch, USA; Eiffel Tower, Paris; MIT’s Big Dome, Cambridge, USA.
Why Curvature: (B) Anatomic Evidence: The work of Nobel Laureates (1981) David Hubel & Torsten Wiesel (1962): [Figure: visual cortical surface; down to the interior: quantized array of cortical visual neurons and their oriented receptive fields.] In my opinion, this provides the most intimate evidence of visual processing of the curvature information. WHY? Because Curvature = Spatial Organization of Orientations: κ = dθ/ds!
Why Curvature: (C) Heuristic Mathematical Argument: Taylor expansion for a curve: Question: Which orders matter to human visual perception? Answer: Like Newtonian mechanics, only up to the second order. Heuristic Mathematical Argument: First order matters, since we can detect rotation! Second order matters, since we have a strong sense of concavity/convexity/inflection. Third order?
What Do All These Mean?: The preceding evidence clearly demonstrates the significance, feasibility, and legitimacy of curvature processing in visual perception. So what? It implies that we should: Employ curvature as a deterministic regularizer for solving numerous ill-posed inverse problems in image and vision analysis (Mumford’s proposal of Euler’s elastica regularizer (1994), also Masnou-Morel (1998, ICIP), Chan-Kang-Shen (2002, SJAP), Esedoglu-Shen (2002, EJAP)); Encode curvature in data structure and organization (e.g., Candes-Donoho’s curvelets (1999; 2002, IEEE Trans. IP)); Develop and study curvature-based mathematical models (e.g., Chan-Shen’s CDD third-order nonlinear PDE model (2001, JVCIR), Esedoglu-Shen’s proposal of the Mumford-Shah-Euler model (2002, EJAP); Euclidean and affine invariant scale spaces (Alvarez-Guichard-Lions-Morel, 1993; Calabi-Olver-Tannenbaum, 1996); mean curvature motions (Evans-Spruck, 1991)). We focus on this point.
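The statement that curvature is the rate of change of orientation along arc length can be checked numerically on a sampled curve. The sketch below is a minimal finite-difference illustration, not a method from the talk; the function names are hypothetical, and the elastica-type energy ∫(a + b κ²) ds with weights a and b is assumed here as the standard form of Euler's elastica.

```python
import numpy as np

def curvature_and_arclength(x, y):
    """Discrete curvature kappa = d(theta)/ds along a sampled open curve."""
    dx, dy = np.diff(x), np.diff(y)
    ds = np.hypot(dx, dy)                    # length of each segment
    theta = np.unwrap(np.arctan2(dy, dx))    # orientation of each segment
    ds_mid = 0.5 * (ds[:-1] + ds[1:])        # arc length between segment midpoints
    kappa = np.diff(theta) / ds_mid          # change of orientation per unit length
    return kappa, ds_mid

def elastica_energy(x, y, a=1.0, b=1.0):
    """Riemann sum of (a + b * kappa^2) ds over the interior vertices."""
    kappa, ds_mid = curvature_and_arclength(x, y)
    return float(np.sum((a + b * kappa**2) * ds_mid))

# A circle of radius R should have constant curvature 1/R.
R = 2.0
t = np.linspace(0.0, 2.0 * np.pi, 2001)
x, y = R * np.cos(t), R * np.sin(t)
kappa, _ = curvature_and_arclength(x, y)
```

For the circle, the energy returned by `elastica_energy(x, y)` should be close to (a + b/R²)·2πR, which makes the trade-off visible: the a-term charges for length, the b-term for bending.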
Perceptual Interpolation of Missing Boundaries: Problem: Find a curve γ(t), 0<t<1, which passes through the two endpoints p and q, and looks “natural.” Vision background: Most objects in our material world are not transparent. Occlusion is universal. Human vision must be able to integrate the dissected information (by suitable interpolation), in order to successfully perceive the world. Why regularization is needed: (1) (non-uniqueness) otherwise there are too many curves to choose from: Brownian paths, say; (2) to regularize is to properly model “being natural.” [Figure: endpoints p and q; occluding object.]
Second Order Polynomial Regularization: Modeling “being natural”: let us try polynomial regularization: But which order is generically sufficient? Count the constraints: in 2-D, totally 2+2+1+1=6 conditions. Theorem. As long as …, there exists a unique parabolic interpolant in the form of … This solution echoes the earlier claim of second order sufficiency.
Second Order Geometric Regularization: Euler’s Elastica: Variational approach: under these constraints, to minimize e[γ] = ∫ (a + b κ²) ds, called Euler’s elastica energy. The energy was studied by Euler (1744) to model the steady shape of a thin and torsion-free elastic rod. The equilibrium equation: … which is an elliptic integral, and the solution can be expressed by elliptic functions (Mumford).
Image Interpolation or Inpainting: Inpainting is an artistic way of saying Image Interpolation (as first used in IP by Bertalmio et al. (SIGGRAPH, 2000)). Partial image information loss is very common: occlusion caused by non-transparent objects; data loss in wireless transmission; cracks in ancient paintings due to pigment aging/weather; insufficient number of image acquisition sensors.
etc. [Figure: Example I: Occlusion; Example II: Cracks.] Read Shen (SIAM News, 36(5), Inpainting and Fundamental Problem of IP, 2003)
Chan-Shen’s Inpainting Model via BV Regularizer: Chan-Shen (SIAM J. Appl. Math., 62(3), 2001). The TV inpainting model: … [Equation labels: BV regularizer; least square (for uniform Gaussian noise); fidelity index.] The associated formal Euler-Lagrange equation on Ω: … with Neumann adiabatic condition along the boundary of Ω. Existence is guaranteed, but uniqueness is not. Formally looks very similar to Rudin-Osher-Fatemi’s denoising model!
Chan-Shen’s Model for Inpainting Noisy & Blurry Images: Chan-Shen (AMS Contemporary Math., 2002). Suppose K = G_t is the Gaussian kernel. Then the model gives a good inversion of heat diffusion. Without the BV regularization, backward diffusion is notoriously ill-posed.
The BV Regularizer is Insufficient for Inpainting: (Kanizsa, Nitzberg-Mumford, Chan-Shen) Long distance connection is too expensive for the TV cost! Cheaper to simply break it. We need curvature!
Lifting Curve Regularity to Image Regularity: Using the level sets of an image, we can “lift” a curve model to an image model (formally; theoretical study by Bellettini, et al., 1992): … [Equation label: curvature of level sets.] Connection to the mean curvature flow (Evans-Spruck, 1991):
Elastica Inpainting Model: Curvature Incorporated: Theorem (associated E.-L. PDE). The gradient descent flow is given by … where V is called the flux field, with proper boundary conditions. Chan-Kang-Shen (SJAP, 2002; also ref.
to Masnou-Morel, 1998)
Elastica Inpainting: Nonlinear Transport + CDD: Transport along the isophotes (level sets): … Curvature driven diffusion (CDD) across the isophotes: … Conclusion: Elastica inpainting unifies the earlier work of Bertalmio, Sapiro, Caselles, and Ballester (SIGGRAPH, 2000) on transport based inpainting, and that of Chan and Shen (JVCIR, 2001) on CDD inpainting (motivated by human visual perception). Chan-Kang-Shen (SJAP, 2002). [Figure: level sets with normal n and tangent t; normal component; tangential component.]
Elastica Inpainting. I. Smoother Completion: Effect 1: as b/a increases, connection becomes smoother. Chan-Kang-Shen (SJAP, 2002)
Elastica Inpainting. II. Long Distance is Cheaper: Effect 2: as b/a increases, long distance connection gets cheaper. For more theoretical and computational (4th order nonlinear!) details, please see Chan-Kang-Shen (SJAP, 63(2), 2002).
Inpainting Regularized by Mumford-Shah Regularizer: The Mumford-Shah (1989) image model was initially designed for the segmentation application: … Chan-Shen (SJAP, 2000), Tsai-Yezzi-Willsky (2001), Esedoglu-Shen (EJAP, 2002). Mumford-Shah based inpainting is to minimize: … [Equation labels: inpainting domain; possible blurring.] A free boundary optimization problem.
Mumford-Shah Inpainting: Algorithm: For the current guess of edge completion Γ, find u to minimize … equivalent to solving the elliptic equation on Ω\Γ: … This updated guess of u then guides the motion of Γ: … [Equation labels: M. C. motion; jump across Γ of the roughness measure, R− and R+.] [We can then benefit from the level-set implementation by Chan-Vese.] Esedoglu-Shen (Europ. J. Appl. Math., 2002)
M.-S. Inpainting via Γ-Convergence Approximation: The edge Γ is approximated by a signature function z. [Figure: z=1; z=0.] The Γ-convergence approximation of Ambrosio-Tortorelli (1990): … Esedoglu-Shen shows that inpainting is the perfect market for Γ-convergence. Esedoglu-Shen (Europ. J. Appl. Math., 2002)
Leading to Simple Elliptic Implementation: The associated equilibrium PDEs are two coupled elliptic equations for u and z, with Neumann boundary conditions: … which can be solved numerically by any efficient elliptic solver. Esedoglu-Shen (2002)
Applications: Disocclusion and Text Removal: [Figure: inpainting domain; inpainted u; the edge signature z.] Esedoglu-Shen (2002)
Insufficiency of Mumford-Shah Regularity for Inpainting: Defect I: Artificial corners. Defect II: Fails to realize the Connectivity Principle, like BV.
Esedoglu-Shen’s Proposal: Mumford-Shah-Euler: Idea: change the straight-line curve model embedded in the Mumford-Shah model to Euler’s elastica: … The Γ-convergence approximation (conjecture) of De Giorgi (1991): … Esedoglu-Shen (2002). For the technical and computational details, please see Esedoglu-Shen.
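The remark above that the equilibrium equations can be handed to any elliptic solver can be illustrated with the simplest elliptic fill-in. The sketch below is a deliberately reduced stand-in, not the coupled system of the slides: it assumes a 1-D signal, no edge signature z, and noise-free known data, so the equation inside the missing interval reduces to u'' = 0, solved here by plain Jacobi iterations. The function name and parameters are hypothetical.

```python
import numpy as np

def harmonic_inpaint_1d(u0, known, n_iter=5000):
    """Fill unknown samples by Jacobi iterations for u'' = 0.

    u0    : 1-D array of data (values at unknown samples are ignored)
    known : boolean mask, True where u0 is trusted; both endpoints of the
            array are assumed to be known so the wrap-around in np.roll
            never influences the result.
    """
    u = np.where(known, u0, 0.0).astype(float)
    for _ in range(n_iter):
        avg = 0.5 * (np.roll(u, 1) + np.roll(u, -1))  # mean of the two neighbors
        u = np.where(known, u0, avg)                  # keep data, relax in the hole
    return u

# A smooth signal with samples 5..10 missing.
n = 16
x = np.arange(n)
u0 = np.sin(0.3 * x)
known = np.ones(n, dtype=bool)
known[5:11] = False
u = harmonic_inpaint_1d(u0, known)
```

In 1-D the harmonic fill-in is just the straight line joining the two boundary values of the hole. That makes it a convenient sanity check, and it also shows why such second-order models cannot reproduce curved completions.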
Inpainting Based on Mumford-Shah-Euler Regularity: Esedoglu-Shen (2002). Issues: very costly (4th order PDE); many local minima; without approximation, it is difficult to implement geometry.
Conclusion for Curvature Based Regularization: Curvature processing has its anatomical vision foundation. Curvature processing has its cognitive/perceptual foundation. Curvature based regularization is necessary for: image analysis and coding; image processing and modeling; image computation: algorithms and implementation schemes. Curvature poses both theoretical & computational challenges: non-quadratic objective functionals; nonlinear 3rd and 4th order PDEs; insufficient theory for wellposedness (existence, uniqueness, definition domains, etc.) …
Slide56: Statistical Regularization: Gibbs Fields and Learning
Stochastic View of Images: Random Fields: An observed image u is treated as a randomly drawn sample. The sample space is an ensemble of images with its own (often unknown) distribution. An ensemble equipped with a probability distribution μ or p naturally leads to a random field (R. F.) on the image domain. Two distinguished features of such a stochastic view: We are not interested in the pixelwise details of a particular image, but rather in the key statistical features of the associated R. F. The R. F. is assumed to be ergodic: for a typical sample image u, spatial statistics converge to ensemble/field statistics. In this view, image regularization is built into the distribution μ.
Geman-Geman (1984), Grenander (1993), Zhu-Mumford (1997)
Stochastic View of Images: Examples: [Figure: crops (Internet download); wood texture (Internet download); reaction-diffusion spots (Shen-Jung, 2003); reaction-diffusion stripes (Shen-Jung, 2003).]
Stochastic Regularization: Gibbs Fields: Stochastic regularity is encoded or specified by the random field distribution p(u). If p is a uniform distribution, then there is not much tangible “information” associated with the images, or in Shannon’s language, the information content is the lowest (equivalently, no regularity is present, as the entropy reaches its highest). In the language of Grenander (founder of Pattern Theory, Brown U), textures are built from basic building elements (“atoms”), and these “atoms,” like molecules, are bound together by local regularity energies, leading to informative images. Thus the Gibbs Fields Model seems natural to characterize such stochastic regularity: Gibbs Canonical Ensemble/Formula: p(u) = 1/Z exp(−E[u] / kT). [Equation labels: partition function; “regularity energy”; visual “temperature.”] Geman-Geman (1984), Grenander (1993), Zhu-Mumford (1997)
Regularization: Visual Potentials and Their Duals: What can a box of air (molecules) tell us: Microscopically, it never stops fluctuating (lacking regularity). Macroscopically, there are a few key feature potentials: temperature T; pressure p; chemical potentials μ. These potentials have their dual (additive) variables: energy E, dual to the inverse temperature β = 1/kT; volume V, dual to the pressure p; mole numbers N, dual to the chemical potential μ. Gibbs Generalized Ensemble Model says, Prob(a micro state) = 1/Z · exp(−βE[·] − βpV[·] + βμN[·]). Conclusion: to apply Gibbs regularity in image analysis, one needs to properly identify visually meaningful potentials and their duals. (These feature parameters will regularize images.)
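The Gibbs formula can be made concrete on a state space small enough to enumerate. The sketch below is a toy illustration under stated assumptions: the images are 2×2 binary (±1) arrays, and the energy is an assumed nearest-neighbor spin energy with hypothetical coupling J and bias H, chosen only because it is the simplest energy that rewards regular (constant) images. It is not a model of natural images.

```python
import itertools
import numpy as np

def energy(u, J=1.0, H=0.0):
    """Assumed toy energy: -J * (sum over neighbor pairs of u_a * u_b) - H * sum(u)."""
    u = np.asarray(u)
    pairs = np.sum(u[:, :-1] * u[:, 1:]) + np.sum(u[:-1, :] * u[1:, :])
    return -J * pairs - H * np.sum(u)

def gibbs_distribution(beta, shape=(2, 2), J=1.0, H=0.0):
    """Enumerate all binary images and return (states, energies, probabilities).

    Probabilities follow p(u) = exp(-beta * E[u]) / Z, with beta = 1/kT.
    """
    n = shape[0] * shape[1]
    states = [np.array(s).reshape(shape)
              for s in itertools.product([-1, 1], repeat=n)]
    E = np.array([energy(s, J, H) for s in states], dtype=float)
    w = np.exp(-beta * E)
    return states, E, w / w.sum()    # w.sum() is the partition function Z

# High "temperature" (beta = 0): every image is equally likely, no regularity.
_, _, p_hot = gibbs_distribution(beta=0.0)
# Low "temperature" (beta = 2): mass concentrates on the lowest-energy images.
states, E, p_cold = gibbs_distribution(beta=2.0)
```

Comparing `p_hot` and `p_cold` shows the regularizing role of the energy: at β = 0 the distribution is uniform, while at larger β almost all probability sits on the two constant images.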
Example of Gibbs Canonical Images: Binary Ising Images: Ising’s Model (1925, originally for ferromagnetic phase transition): Binary images on the Cartesian lattice Z². u = u(i,j) = +1 or −1 (spin up or down). In an external (biasing) magnetic field H, the energy associated with each observed field u is: E[u] = −J Σ_{a~b} u(a) u(b) − H Σ_c u(c). J: internal energy of magnetic dipole (neighbor pair); a, b, c: representing general pixels; ~: neighboring. But generic (natural) images are not generated from such clean physics. The challenge is to develop good models for (generalized) energies, and properly model short-range or long-range visual interactions.
What Features to Regularize: Visual Filters: Treat the human vision system as a system of linear filters: T = (F1, F2, …, FN). Each filter Fn is characterized by its special capability in resolving a particular orientation θ with a particular spatial frequency ω at a particular scale σ. That is, the Fn’s are parametrized by the feature vector (θ, ω, σ). Basic Assumption: To human vision, the random image fields are completely describable (at least in satisfactory approximation) by the filter response T[u]. All other features are blindly filtered out, and treated as visually insignificant. [This is the hidden rule of visual regularization!!] Justification (from Statistical Mechanics): Though the dimension of the phase space of a box of air molecules is huge, in equilibrium such a system can be accurately described by three feature parameters: temperature, pressure, and volume. Zhu-Mumford (1997)
From Visual Filters to Potentials: Maximum Entropy Learning of Zhu-Mumford: Following the preceding basic assumption, a Gibbs image model would be based ONLY on the statistics of the filter outputs: T[u] = (F1[u], F2[u], …, FN[u]). But how exactly?
Zhu-Mumford’s MEL Model/Scheme: Each filter output vn = Fn[u] is by itself a random field. In the ideal case, suppose we do know the random field distributions: qn(vn), n = 1:N. Then, these can be used as a set of constraints on the original Gibbs image u. That is, p = p(u) should lead to Prob(vn) = qn(vn), n = 1:N. Under this set of constraints, one can use the Gibbs variational formulation of Statistical Mechanics to find the unique Gibbs field that maximizes the entropy, which necessarily takes the form of … [Equation label: Dirac’s delta.] In reality, the spatial structure of vn is often ignored and each is treated as a field of i.i.d. variables. Thus the joint p.d.f. qn is a direct tensor product, and is replaced by its 1-D histogram. Zhu-Mumford (1997)
Learning of Regularization Will Be Momentous …: A brief conclusion: Statistical regularization is often achieved through visually meaningful filters and filtering processes. The Gibbs field model can be learned based on the empirical statistics (e.g. histograms) of these filter outputs of a typical image sample, and the ergodicity assumption is thus crucial in such learning processes. Maximum Entropy based Learning Processes (Melp) will play an increasingly important role in vision and pattern analysis.
Slide65: Psychological Regularization: Role of Weber’s Law: A Case Study
Weber’s Law: Weber’s Law in Psychology: “Let u denote the mean field of a background (sound/light), and δu the intensity increment just detectable by human perception (ears/eyes) [or the so-called JND: just-noticeable-difference]. Then δu/u = constant.” First qualitatively described by German physiologist E. H. Weber in 1834; later formulated quantitatively by the great experimental psychologist G. T. Fechner in 1858.
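Weber's Law as quoted above is easy to express in code. The sketch below is a minimal illustration; the Weber fraction of 0.02 is a hypothetical placeholder rather than a measured value, and the function names are invented for this example. The last helper counts how many just-noticeable steps separate two intensities, which grows with the logarithm of their ratio.

```python
import numpy as np

def jnd(u, weber_fraction=0.02):
    """Just-noticeable difference: proportional to the background level u."""
    return weber_fraction * np.asarray(u, dtype=float)

def is_detectable(u, du, weber_fraction=0.02):
    """An increment du on background u is detectable if |du| / u >= the fraction."""
    return abs(du) >= weber_fraction * u

def jnd_steps(u_lo, u_hi, weber_fraction=0.02):
    """Number of successive JND steps from u_lo up to u_hi.

    Each step multiplies the intensity by (1 + weber_fraction), so the count
    is log(u_hi / u_lo) / log(1 + weber_fraction).
    """
    return float(np.log(u_hi / u_lo) / np.log(1.0 + weber_fraction))

# The same absolute increment is visible on a dim background but not on a bright one.
dim_visible = is_detectable(10.0, 0.5)        # 0.5 / 10   = 0.05   >= 0.02
bright_visible = is_detectable(1000.0, 0.5)   # 0.5 / 1000 = 0.0005 <  0.02
```

This is the quantitative content of the stadium example on the next slide: the louder the background, the larger the increment needed to be noticed.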
Search your own experience for validating Weber’s Law: In a fully packed stadium: high background sound u => one has to cry loud to be effectively heard by other folks (i.e., δu has to be high as well); “月明星稀” (an ancient Chinese idiom): translated as “In a night with a bright full moon, the stars always look scarce.” Shen (Physica D, 175, 2003)
Is Weber’s Law Psychological or Physiological?: Jackie Shen’s Theorem: Any commonly shared psychological phenomenon (among most of the 6,000,000,000 people on this planet) has to be physiological. Weber’s Law expresses the light adaptive capability of the front end of the entire vision system: the two retinas. Without Weber’s Law, the retinas would not be able to operate over a wide range of light intensities, from several photons to bright solar light, since the membrane potential V of a neuron cell has a saturated maximum value. Weber’s Law is the result of a feedback mechanism (Tranchina-Peskin, 1988) of the retina system, which is “implemented” by the biochemical physiology (ion channels) of the photoreceptors (McNaughton, 1990). James Keener (Math. Physiology, 1998)
Weber’s Law: Regularization Can Respect Real Perception: What does Weber’s Law have to do with regularization? Before Shen (2003), all variational image regularizers in image and vision analysis were defined by either the Sobolev norm, as in the Linear Filtering Theory and the Mumford-Shah model (1989), or the Total Variation (TV) Radon measure, as in the Rudin-Osher-Fatemi model (Physica D, 1992). The fundamental assumption of such regularizers is that “human visual sensitivity to small fluctuations (or irregularities) depends on nothing else but themselves.” But this is inappropriate according to Weber’s Law! Shen (Physica D, vol.
175, 2003)
Weberized Regularization and Applications: Thus Shen (2003) proposed to Weberize (a word conveniently coined) the conventional Sobolev or TV regularizers to … or … For example, the Weberized TV denoising and deblurring model (for additive Gaussian noise) would be to minimize the energy … where the light intensity field (or image) u is non-negative. Shen (Physica D, vol. 175, 2003)
Existence and Uniqueness of Weberization: Admissible space of u: … Existence Theorem: Assume that … and … Then there exists at least one minimizer in the admissible space D. Uniqueness Theorem: Assume that u = z(x) in D is a minimizer and … at each pixel x. Then z(x) is unique. Shen (Physica D, vol. 175, 2003)
Weberized TV Restoration: An Example: [Figure: profile of one horizontal slice from the noisy image; the same slice after Weberized TV restoration.]
One-Sentence Conclusion of Today’s Talk: Regularization is crucial for visual perception, and presents numerous challenges as well as opportunities for further statistical and mathematical modeling.
Slide73: That is all, folks… Thank you for your patience! Jackie
Acknowledgments: School of Mathematics and IMA, University of Minnesota (UMN). Tony Chan, Stan Osher, Luminita Vese, Selim Esedoglu (UCLA); S.-H. Kang (U. Kentucky); Yoon-Mo Jung (UMN). Gil Strang (my Ph.D. advisor, MIT) for his vision, guidance, and support on research. David Mumford and Stu Geman (Division Appl Math, Brown U). Dan Kersten and Paul Schrater (Psychology & EECS, UMN). S. Masnou and J.-M. Morel (France); G. Sapiro and M. Bertalmio (EECS, UMN). Fadil Santosa, Peter Olver, Hans Othmer, Bob Gulliver, Willard Miller, Doug Arnold, Mitch Luskin (Colleagues at Math, UMN). National Science Foundation (NSF), Program of Applied Mathematics; Office of Naval Research (ONR).
All the generous support and warm words from: Jayant Shah, Andrea Bertozzi, David Donoho, James Murray, Rachid Deriche.