Published on October 15, 2007
Protein Structure Prediction and Determination
Zhijun Wu, Department of Mathematics, Iowa State University

Slide2: Biological Building Blocks
DNA: GAA GTT GAA AAT CAG GCG AAC CCA CGA CTG
RNA: GAA GUU GAA AAU CAG GCG AAC CCA CGA CUG
PROTEIN: GLU VAL GLU ASN GLN ALA ASN PRO ARG LEU

Slide3: Protein Folding
Sequence: GLU VAL GLU ASN GLN ALA ASN PRO ARG LEU
[Figure: the same residues (GLU GLU ASN VAL LEU ARG PRO ASN ALA GLN . . .) arranged along a folded chain]

Slide4: Examples
Prion: Stanley B. Prusiner, 1997 Nobel Prize in Physiology or Medicine
Myoglobin: John Kendrew, 1962 Nobel Prize in Chemistry

Slide5: Methods for Structure Prediction and Determination
Potential Energy Minimization
Nuclear Magnetic Resonance
Protein X-ray Crystallography
Homology Modeling, Fold Recognition, Inverse Protein Folding
Molecular Dynamics Simulation

Slide6: X-ray Crystallography Computing
In X-ray crystallography, the protein first needs to be purified and crystallized, which may take months or years and may fail altogether. The protein crystal is then placed in an X-ray instrument to record a diffraction image. The diffraction image can be used to determine the three-dimensional structure of the protein.
80% of the structures in the Protein Data Bank (PDB) were determined by X-ray crystallography.
The process is time consuming, and some proteins cannot be crystallized at all.
A mathematical problem, called the phase problem, needs to be solved before a crystal structure can be fully determined from the diffraction data.

Slide7: NMR Structure Determination
The NMR approach is based on the fact that nuclei spin and generate magnetic fields. When two nuclei are close, their spins interact, and the intensity of the interaction depends on the distance between the nuclei. The distances between certain pairs of atoms can therefore be estimated by measuring the intensities of the nuclear spin-spin couplings.
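As a rough illustration of how a measured intensity becomes a distance estimate, the sketch below assumes that the intensity falls off as the inverse sixth power of the distance, the dependence commonly used for NOE cross peaks, and calibrates against one pair of nuclei whose distance is known. The function name, the reference values, and the inverse-sixth-power model are assumptions made for illustration and are not taken from the slides.

```python
def distance_from_intensity(intensity, ref_intensity, ref_distance):
    """Estimate an interatomic distance from a measured intensity.

    Assumes intensity is proportional to distance**-6 (an assumption,
    commonly used for NOE cross peaks) and calibrates with a reference
    pair of nuclei whose distance is already known.
    """
    if intensity <= 0 or ref_intensity <= 0:
        raise ValueError("intensities must be positive")
    return ref_distance * (ref_intensity / intensity) ** (1.0 / 6.0)


# Hypothetical numbers: a reference pair 2.0 angstroms apart gives intensity 64.0.
# A pair that is 64 times weaker is then estimated to be twice as far apart.
estimate = distance_from_intensity(1.0, ref_intensity=64.0, ref_distance=2.0)
print(f"{estimate:.2f}")  # prints 4.00
```

Because the sixth root compresses errors in the intensity, a weak signal still gives a usable estimate, but the value is best read as an approximate range rather than an exact distance.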
The distance data obtained from the NMR experiment can be used to deduce structural information about the molecule. One way of doing so is based on molecular distance geometry.
15% of the structures in the Protein Data Bank (PDB) were determined by NMR spectroscopy.
The structure can be determined by solving a distance geometry problem with the distance data from the NMR experiments.
Not all distances between pairs of atoms can be detected, and in practice only lower and upper bounds on the detected distances can be obtained.

Slide8: Potential Energy Minimization
Hypothesis: the native structure of a protein has the lowest, or nearly the lowest, potential energy. It can therefore be located at the global minimum of the protein's potential energy.
A reasonably accurate potential energy function needs to be constructed.
Given such a function, a local minimizer is easy to find, but a global one is hard, especially if the function has many local minimizers.
No completely satisfactory algorithm has yet been developed for minimizing the potential energy of proteins.
Potential energy minimization has been used successfully for structure refinement, though.

Slide9: Molecular Dynamics Simulation
Folding can be simulated by following the movement of the atoms in the protein according to Newton's second law of motion.
The time step has to be small, on the order of femtoseconds, to achieve accuracy.
Current computing technology can reach only picoseconds to microseconds of simulated time, while protein folding may take seconds or even longer.
Molecular dynamics simulation has been used successfully to study other types of dynamical behavior of proteins.

Slide10: Sequence Structure Alignment
Homology Modeling: Sequence to Sequence
Fold Recognition: Sequence to Structure
Inverse Protein Folding: Structure to Sequence
Computing the best alignment is NP-hard in general when gaps are allowed.
Scoring functions may not be able to distinguish between good and bad matches.
[Diagram labels: Known Sequences / Structures; Sequence Structure Alignment; Ranking; Sequences / Structures]
The results are not accurate and carry only a certain level of confidence.
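The align-then-rank workflow above can be sketched for the simplest case, sequence-to-sequence alignment. With a score that depends only on the two aligned residues and a fixed penalty per gap, the standard Needleman-Wunsch dynamic program finds an optimal alignment score in time proportional to the product of the sequence lengths; the NP-hardness noted above concerns the harder sequence-to-structure setting. This is a minimal sketch under those assumptions: the scoring values, the function names, and the template names and sequences are hypothetical placeholders, and only the query is the ten-residue example sequence written in one-letter code.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment score with a linear gap penalty.

    The match, mismatch, and gap values are illustrative placeholders,
    not a real substitution matrix.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # Aligning a prefix against the empty string costs one gap per residue.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + pair,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,       # gap in b
                score[i][j - 1] + gap,       # gap in a
            )
    return score[-1][-1]


def rank_templates(query, templates):
    """Score the query against every known sequence, best score first."""
    scored = [(global_alignment_score(query, seq), name)
              for name, seq in templates.items()]
    return sorted(scored, reverse=True)


# GLU VAL GLU ASN GLN ALA ASN PRO ARG LEU in one-letter code.
query = "EVENQANPRL"
# Hypothetical "known sequences" used only to exercise the ranking step.
templates = {
    "template_a": "EVENQANPRL",
    "template_b": "EVENANPRL",
    "template_c": "GGGG",
}
for score, name in rank_templates(query, templates):
    print(name, score)
```

The ranking is only as good as the scoring function: with these placeholder scores a template that differs by one deletion still ranks close to an exact match, which is the kind of ambiguity the slide warns about.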