# Tutorial07

Published on February 24, 2008

Author: fazil

Source: authorstream.com

Tutorial by Ma’ayan Fishelson. Changes made by Anna Tzemach.

## The Given Problem

Input: a pedigree plus phenotype information about some of the people. These people are called typed.

Output: the probability of the observed data, given some probability model for the transmission of alleles.

Q: What is the probability of the observed data composed of?

A: There are three types of probability functions: founder probabilities, penetrance probabilities, and transmission probabilities.

## Founder Probabilities – One Locus

Founders are individuals whose parents are not in the pedigree. We need to assign probabilities to their genotypes. This is done by assuming Hardy-Weinberg equilibrium.

Suppose the gene frequency of d is 0.05, then:

$$P(d/h) = 2 \cdot 0.05 \cdot 0.95$$

## Founder Probabilities – Multiple Loci

According to linkage equilibrium, the probability of the multi-locus genotype of founder $k$ over $n$ loci is:

$$\Pr(x_k) = \Pr(x_k^1) \cdots \Pr(x_k^n)$$

## Penetrance Probabilities

Penetrance: the probability of the phenotype, given the genotype.

Examples: a dominant disease with complete penetrance; a recessive disease with incomplete penetrance. (The penetrance tables from the slide are not preserved.)

Penetrance can be, for example, sex-dependent, age-dependent, or environment-dependent.

## Transmission Probabilities

Transmission probability: the probability of a child having a certain genotype given the parents’ genotypes, $\Pr(x_c \mid x_m, x_f)$.

If we split the ordered genotype $x_c$ into the maternal allele $x_{cm}$ and the paternal allele $x_{cf}$, we get:

$$\Pr(x_c \mid x_m, x_f) = \Pr(x_{cm} \mid x_m)\,\Pr(x_{cf} \mid x_f)$$

The inheritance from each parent is independent.

## Transmission Probabilities – One Locus

The transmission is according to the 1st law of Mendel.
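The founder and one-locus transmission probabilities above can be sketched in Python. This is a minimal sketch, not part of the original slides: the function names (`founder_prob`, `allele_transmission`, `child_prob`) and the encoding of a genotype as a pair of allele labels are assumptions made for illustration.

```python
from fractions import Fraction


def founder_prob(genotype, freq):
    """Hardy-Weinberg probability of an unordered genotype (a, b)."""
    a, b = genotype
    p = freq[a] * freq[b]
    # A heterozygote can arise in two orders, hence the factor 2.
    return p if a == b else 2 * p


def allele_transmission(allele, parent):
    """Mendel's first law: each of the parent's two alleles is passed on with probability 1/2."""
    return Fraction(parent.count(allele), 2)


def child_prob(child, mother, father):
    """Pr(child | mother, father) for an unordered child genotype, summed over both phases."""
    a, b = child
    p = allele_transmission(a, mother) * allele_transmission(b, father)
    if a != b:
        # Other phase: b comes from the mother and a from the father.
        p += allele_transmission(b, mother) * allele_transmission(a, father)
    return p


# Gene frequency of d is 0.05, as in the founder example.
freq = {"d": Fraction(1, 20), "h": Fraction(19, 20)}
print(founder_prob(("d", "h"), freq))                   # 19/200 (= 0.095)
print(child_prob(("d", "h"), ("h", "h"), ("d", "h")))   # 1/2
```

Exact fractions are used so that the results can be compared directly with hand calculations.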
As a worked example of one-locus transmission:

$$\Pr(X_c = d/h \mid X_m = h/h, X_f = d/h) = \Pr(X_{cm} = h \mid X_m = h/h)\,\Pr(X_{cf} = d \mid X_f = d/h) = 1 \cdot \tfrac{1}{2} = \tfrac{1}{2}$$

We also need to add the inheritance probability of the other phase (maternal allele d, paternal allele h), but we can see that it is zero, because an h/h mother cannot transmit d.

Different children are independent given the genotypes of their parents:

$$\Pr(X_3 = d/h, X_4 = h/h, X_5 = d/h \mid X_1 = d/h, X_2 = h/h) = (1 \cdot \tfrac{1}{2}) \cdot (1 \cdot \tfrac{1}{2}) \cdot (1 \cdot \tfrac{1}{2}) = \tfrac{1}{8}$$

## Transmission Probabilities – Multiple Loci

Let’s look at paternal inheritance, for example. We generate all possible recombination sequences $(s_1, s_2, \ldots, s_n)$, where $s_l = 1$ or $s_l = -1$ ($2^n$ sequences for $n$ loci). Each sequence determines a selection of paternal alleles $p_1, p_2, \ldots, p_n$ and therefore has a probability of inheritance. (The selection rule and the probability formula from the slide are not preserved.) We need to sum the probabilities of all $2^n$ recombination sequences.

## Calculating the Likelihood of Family Data – Summary

For a pedigree with $m$ people:

$$P(x) = \sum_{g} P(x \mid g)\,P(g)$$

where $x = (x_1, \ldots, x_m)$ and $g = (g_1, \ldots, g_m)$.

The likelihood of the data is the probability of the observed data (the known phenotypes), given certain values for the unknown recombination fractions.

Written out per individual:

$$L = \sum_{G_1} \cdots \sum_{G_m} \; \prod_{i=1}^{k} P(G_i) \prod_{i=k+1}^{m} P(G_i \mid G_{m(i)}, G_{f(i)}) \prod_{i=1}^{m} P(x_i \mid G_i)$$

- $G_i$: genotype vector for individual $i$.
- Founders: individuals $1..k$.
- Non-founders: individual $i$, with mother $m(i)$ and father $f(i)$.
- $P(G_i)$: founder priors, by Hardy-Weinberg.
- $P(G_i \mid G_{m(i)}, G_{f(i)})$: recombination probabilities.
- $P(x_i \mid G_i)$: penetrances.

## Computational Problem

Complexity disaster: performing a multiple sum over all possible genotype combinations for all members of the pedigree is

- exponential in the number of markers, and
- exponential in the number of individuals.

## Elston-Stewart Algorithm

The Elston-Stewart algorithm provides a means for evaluating the multiple sum in a streamlined fashion, for simple pedigrees.
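To make the multiple sum concrete before it is streamlined, here is a minimal brute-force sketch in Python for a single locus. It is an illustration under stated assumptions, not the method of any particular program: the function names, the dictionary encoding of the pedigree, and the penetrance model (dominant disease with complete penetrance, untyped people contributing a factor of 1) are all chosen for the example.

```python
from fractions import Fraction
from itertools import product

GENOTYPES = [("d", "d"), ("d", "h"), ("h", "h")]  # unordered one-locus genotypes


def founder_prob(g, freq):
    """Hardy-Weinberg prior for a founder genotype."""
    a, b = g
    p = freq[a] * freq[b]
    return p if a == b else 2 * p


def transmit(allele, parent):
    """Probability that the parent passes on the given allele (Mendel's first law)."""
    return Fraction(parent.count(allele), 2)


def child_prob(c, m, f):
    """Pr(child genotype | parents), summed over both phases."""
    a, b = c
    p = transmit(a, m) * transmit(b, f)
    if a != b:
        p += transmit(b, m) * transmit(a, f)
    return p


def penetrance(phenotype, g):
    """Assumed model: dominant disease, complete penetrance; None means untyped."""
    if phenotype is None:
        return 1
    affected = "d" in g
    return 1 if (phenotype == "affected") == affected else 0


def likelihood(pedigree, phenotypes, freq):
    """Sum over every genotype assignment; cost grows as 3 ** (number of people).

    pedigree maps each person to (mother, father), or to None for a founder.
    """
    people = list(pedigree)
    total = 0
    for combo in product(GENOTYPES, repeat=len(people)):
        g = dict(zip(people, combo))
        term = 1
        for person in people:
            parents = pedigree[person]
            if parents is None:
                term *= founder_prob(g[person], freq)
            else:
                mother, father = parents
                term *= child_prob(g[person], g[mother], g[father])
            term *= penetrance(phenotypes.get(person), g[person])
        total += term
    return total


freq = {"d": Fraction(1, 20), "h": Fraction(19, 20)}
trio = {"mother": None, "father": None, "child": ("mother", "father")}
print(likelihood(trio, {"child": "affected"}, freq))  # 39/400
```

For the trio, the result equals 1 - (19/20)^2, the probability that the child carries at least one d allele. The loop over `product(GENOTYPES, repeat=len(people))` is exactly what makes this approach exponential in the number of individuals.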
The computation is more efficient than the direct multiple sum:

- exponential in the number of markers, but
- linear in the number of individuals.

## Simple Pedigree

- No consanguineous marriages (marriages of blood-related individuals), i.e. no loops in the pedigree.
- There is one pair of founders from which the whole pedigree is generated.
- There is exactly one nuclear family T at the top generation.
- Every other nuclear family has exactly one parent who is a direct descendant of the two parents in family T and one parent who has no ancestors in the pedigree (such a person is called a founder).
- There are no multiple marriages.
- One of the parents in T is treated as the proband.

## “Peeling” Order

Assume that the individuals in the pedigree are ordered such that parents precede their children. Then the pedigree likelihood can be represented as:

$$P(x) = \sum_{g_1} P(x_1 \mid g_1)\,P(g_1 \mid \cdot) \sum_{g_2} P(x_2 \mid g_2)\,P(g_2 \mid \cdot) \cdots \sum_{g_m} P(x_m \mid g_m)\,P(g_m \mid \cdot)$$

where $P(g_i \mid \cdot)$ is $P(g_i)$ if $i$ is a founder, or $P(g_i \mid g_{m(i)}, g_{f(i)})$ otherwise, with $m(i)$ and $f(i)$ the mother and father of $i$.

In this way, we first sum over all possible genotypes of the children and only then over the possible genotypes of the parents.

## An Example for “Peeling” Order

For a founder $i$:

$$h(g_i) = P(x_i \mid g_i)\,P(g_i)$$

For a child $c$ with mother $m$ and father $f$:

$$h(g_m, g_f, g_c) = P(x_c \mid g_c)\,P(g_c \mid g_m, g_f)$$

## Elston-Stewart “Peeling” Order

As can be seen, this “peeling” order “clips off” branches (sibships) of the pedigree, one after the other, in a bottom-up order.

## Elston-Stewart – Computational Complexity

The computational complexity of the algorithm is linear in the number of people but exponential in the number of loci.

## Variation on the Elston-Stewart Algorithm in Fastlink

The pedigree traversal order in Fastlink is a modification of the Elston-Stewart algorithm. Assume no multiple marriages.

Nuclear family graph:

Vertices: each nuclear family is a vertex.
Edges: if some individual is a child in nuclear family x and a parent in nuclear family y, then x and y are connected by an edge x-y, which is called a “down” edge w.r.t. x and an “up” edge w.r.t. y.

## Traversal Order

    Visit(w) {
        While w has an unvisited neighbor x reachable via an up edge:
            Visit(x);
        While w has an unvisited neighbor y reachable via a down edge:
            Visit(y);
        Update w;
    }

One individual A is chosen to be the “proband”. For each genotype g, the probability is computed that A has genotype g, conditioned on the known phenotypes for the rest of the pedigree and the assumed recombination fractions.

The first family that is visited is a family containing the proband, preferably a family in which he is a child.

## Traversal Order – Updates

- If nuclear family w is reached via a down edge from z, the parent in w that nuclear families w and z share is updated.
- If nuclear family w is reached via an up edge from z, the child that w and z share is updated.

Example 1 and Example 2 appear on the slides as figures only.
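The traversal order and update rule described above can be sketched in Python as follows. This is a minimal sketch and not Fastlink's actual code: the names `build_family_graph` and `traversal_updates`, the encoding of a nuclear family as a pair of sets `(parents, children)`, and the three-family pedigree are assumptions for illustration. The "update" here only records which individual would be updated; it does not compute any probabilities.

```python
from collections import defaultdict


def build_family_graph(families):
    """Return (up, down) adjacency lists of the nuclear family graph.

    families maps a family name to (parents, children), both sets of IDs.
    """
    up, down = defaultdict(list), defaultdict(list)
    for x, (_, children_x) in families.items():
        for y, (parents_y, _) in families.items():
            # Someone is a child in x and a parent in y.
            if x != y and children_x & parents_y:
                down[x].append(y)  # "down" edge w.r.t. x
                up[y].append(x)    # "up" edge w.r.t. y
    return up, down


def traversal_updates(families, start, proband):
    """Run Visit(start) and return the (family, updated individual) pairs in order."""
    up, down = build_family_graph(families)
    visited, updates = set(), []

    def visit(w, came_from=None, edge=None):
        visited.add(w)
        for x in up[w]:            # unvisited neighbors via up edges first
            if x not in visited:
                visit(x, came_from=w, edge="up")
        for y in down[w]:          # then unvisited neighbors via down edges
            if y not in visited:
                visit(y, came_from=w, edge="down")
        parents_w, children_w = families[w]
        if edge == "down":         # reached from above: update the shared parent in w
            shared = parents_w & families[came_from][1]
        elif edge == "up":         # reached from below: update the shared child in w
            shared = children_w & families[came_from][0]
        else:                      # the first family: update the proband
            shared = {proband}
        updates.append((w, min(shared)))

    visit(start)
    return updates


# Hypothetical pedigree: F1 is the top family; F2 and F3 are its children's families.
families = {
    "F1": ({"1", "2"}, {"3", "4"}),
    "F2": ({"3", "5"}, {"6"}),
    "F3": ({"4", "7"}, {"8"}),
}
print(traversal_updates(families, "F2", "6"))
```

Starting from F2, the family in which proband 6 is a child, the sketch updates individual 4 in F3, then individual 3 in F1, and finally the proband in F2, so every other family is folded in before the proband's family is updated.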
