Motivation: Medicine and health sciences are changing from the classical
symptom-based paradigm to a more personalized, genetics-based one, with an
invaluable impact on health care. While advances in genetics were already
contributing significantly to the knowledge of the human organism, the
breakthrough achieved by several recent initiatives provided a comprehensive
characterization of the human genetic differences, paving the way for a new era
of medical diagnosis and personalized medicine.
Data generated from these and subsequent experiments are now becoming
available, but their volume is far beyond what humans can feasibly explore. It is
then the responsibility of computer scientists to create the means for extracting
the information and knowledge contained in that data.
Within the available data, genetic structures contain significant amounts of
encoded information that has been uncovered in the past decades. Finding,
reading and interpreting that information are necessary steps for building
computational models of genetic entities, organisms and diseases; a goal that
in due course leads to human benefits.
Aims: Numerous patterns can be found within the human variome and exome.
Exploring these patterns enables the computational analysis and manipulation
of digital genomic data...
Models in computational biology, such as those used in binding, docking, and folding, are often empirical and have adjustable parameters. Because few of these models are yet fully predictive, the problem may be nonoptimal choices of parameters. We describe an algorithm called ENPOP (energy function parameter optimization) that improves, and sometimes optimizes, the parameters for any given model and for any given search strategy that identifies the stable state of that model. ENPOP iteratively adjusts the parameters simultaneously to move the model global minimum energy conformation for each of m different molecules as close as possible to the true native conformations, based on some appropriate measure of structural error. A proof of principle is given for two very different test problems. The first involves three different two-dimensional model protein molecules having 12 to 37 monomers and four parameters in common. The parameters converge to the values used to design the model native structures. The second problem involves nine bumpy landscapes, each having between 4 and 12 degrees of freedom. For the three adjustable parameters, the globally optimal values are known in advance. ENPOP converges quickly to the correct parameter set.
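The idea of adjusting parameters until each model global minimum coincides with the native conformation can be illustrated with a toy sketch. This is not the ENPOP algorithm itself: it swaps in a simple perceptron-style update, assumes a linear energy function over two made-up features, and uses hypothetical conformations in which index 0 is taken to be the native state.

```python
import numpy as np

# Hypothetical toy data: each "molecule" is a small set of candidate
# conformations, each described by a feature vector (e.g. counts of two
# contact types). Index 0 is assumed to be the native conformation.
molecules = [
    np.array([[3.0, 1.0], [2.0, 2.0], [1.0, 3.0]]),
    np.array([[4.0, 0.0], [2.0, 3.0], [0.0, 5.0]]),
]
NATIVE = 0


def global_minimum(features, theta):
    """Index of the lowest-energy conformation, with E = features @ theta."""
    return int(np.argmin(features @ theta))


def fit_parameters(molecules, theta, lr=0.5, max_epochs=100):
    """Perceptron-style stand-in for parameter optimization.

    Whenever the model minimum is not the native conformation, shift theta
    so that the wrong minimum rises in energy relative to the native one.
    """
    theta = np.asarray(theta, dtype=float).copy()
    for _ in range(max_epochs):
        changed = False
        for feats in molecules:
            pred = global_minimum(feats, theta)
            if pred != NATIVE:
                theta += lr * (feats[pred] - feats[NATIVE])
                changed = True
        if not changed:  # every molecule already has its native as minimum
            break
    return theta


theta = fit_parameters(molecules, theta=[0.0, -1.0])
recovered = [global_minimum(f, theta) == NATIVE for f in molecules]
print(theta, recovered)
```

The same parameter vector is shared by all molecules, which mirrors the "parameters in common" setting described above; a realistic energy model would need a conformational search in place of the exhaustive `argmin`.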
In principle, given the amino acid sequence of a protein, it is possible to compute the corresponding three-dimensional structure. Methods for modelling structure based on this premise have been under development for more than 40 years. For the past decade, a series of community wide experiments (termed Critical Assessment of Structure Prediction (CASP)) have assessed the state of the art, providing a detailed picture of what has been achieved in the field, where we are making progress, and what major problems remain. The rigorous evaluation procedures of CASP have been accompanied by substantial progress. Lessons from this area of computational biology suggest a set of principles for increasing rigor in the field as a whole.
The cardiac cell is a complex biological system where various processes interact to generate electrical excitation (the action potential, AP) and contraction. During AP generation, membrane ion channels interact nonlinearly with dynamically changing ionic concentrations and varying transmembrane voltage, and are subject to regulatory processes. In recent years, a large body of knowledge has accumulated on the molecular structure of cardiac ion channels, their function, and their modification by genetic mutations that are associated with cardiac arrhythmias and sudden death. However, ion channels are typically studied in isolation (in expression systems or isolated membrane patches), away from the physiological environment of the cell where they interact to generate the AP. A major challenge remains the integration of ion-channel properties into the functioning, complex and highly interactive cell system, with the objective to relate molecular-level processes and their modification by disease to whole-cell function and clinical phenotype. In this article we describe how computational biology can be used to achieve such integration. We explain how mathematical (Markov) models of ion-channel kinetics are incorporated into integrated models of cardiac cells to compute the AP. We provide examples of mathematical (computer) simulations of physiological and pathological phenomena...
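As a minimal illustration of Markov ion-channel kinetics, the sketch below integrates a two-state (closed/open) channel with constant transition rates. The rate values, time step, and function name are assumptions chosen for illustration; models of real cardiac channels typically have more states and voltage-dependent rates.

```python
def simulate_open_probability(alpha, beta, p_open=0.0, dt=1e-3, steps=10_000):
    """Forward-Euler integration of a two-state Markov channel.

    closed --alpha--> open, open --beta--> closed
    dP_open/dt = alpha * (1 - P_open) - beta * P_open
    """
    for _ in range(steps):
        p_open += dt * (alpha * (1.0 - p_open) - beta * p_open)
    return p_open


# Hypothetical rates (per unit time); the analytic steady state is
# alpha / (alpha + beta).
p_final = simulate_open_probability(alpha=2.0, beta=1.0)
print(p_final)
```

In a whole-cell model, the open probability of each channel type would scale a conductance term in the membrane-voltage equation, and the rates would in turn depend on that voltage.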
Co-expressed genes may be under similar promoter-based and/or position-based regulation. Although data on expression, position and function of human genes are available, their true integration still represents a challenge for computational biology, hampering the identification of regulatory mechanisms. We carried out an integrative analysis of genomic position, functional annotation and promoters of genes expressed in myeloid cells. Promoter analysis was conducted by a novel multi-step method for discovering putative regulatory elements, i.e. over-represented motifs, in a selected set of promoters, as compared with a background model. The combination of transcriptional, structural and functional data allowed the identification of sets of promoters pertaining to groups of genes co-expressed and co-localized in regions of the human genome. The application of motif discovery to 26 groups of genes co-expressed during myeloid cell differentiation and co-localized in the genome showed that there are more over-represented motifs in promoters of co-expressed and co-localized genes than in promoters of simply co-expressed genes (CEG). Motifs that are similar to the binding sequences of known transcription factors, non-uniformly distributed along promoter sequences, and/or occurring in highly co-expressed subsets of genes were identified. Co-expressed and co-localized gene sets were grouped in two co-expressed genomic meta-regions...
Peptide-recognition modules (PRMs) are used throughout biology to mediate protein–protein interactions, and many PRMs are members of large protein domain families. Recent genome-wide measurements describe networks of peptide–PRM interactions. In these networks, very similar PRMs recognize distinct sets of peptides, raising the question of how peptide-recognition specificity is achieved using similar protein domains. The analysis of individual protein complex structures often gives answers that are not easily applicable to other members of the same PRM family. Bioinformatics-based approaches, on the other hand, may be difficult to interpret physically. Here we integrate structural information with a large, quantitative data set of SH2 domain–peptide interactions to study the physical origin of domain–peptide specificity. We develop an energy model, inspired by protein folding, based on interactions between the amino-acid positions in the domain and peptide. We use this model to successfully predict which SH2 domains and peptides interact and uncover the positions in each that are important for specificity. The energy model is general enough that it can be applied to other members of the SH2 family or to new peptides, and the cross-validation results suggest that these energy calculations will be useful for predicting binding interactions. It can also be adapted to study other PRM families...
Transporters play a vital role in both the resistance mechanisms of existing drugs and effective targeting of their replacements. Melarsoprol and diamidine compounds similar to pentamidine and furamidine are primarily taken up by trypanosomes of the genus Trypanosoma brucei through the P2 aminopurine transporter. In standardized competition experiments with [3H]adenosine, P2 transporter inhibition constants (Ki) have been determined for a diverse dataset of adenosine analogs, diamidines, Food and Drug Administration-approved compounds and analogs thereof, and custom-designed trypanocidal compounds. Computational biology has been employed to investigate compound structure diversity in relation to P2 transporter interaction. These explorations have led to models for inhibition predictions of known and novel compounds to obtain information about the molecular basis for P2 transporter inhibition. A common pharmacophore for P2 transporter inhibition has been identified along with other key structural characteristics. Our model provides insight into P2 transporter interactions with known compounds and contributes to strategies for the design of novel antiparasitic compounds. This approach offers a quantitative and predictive tool for molecular recognition by specific transporters without the need for structural or even primary sequence information of the transport protein.
For several decades, the standard model for high density lipoprotein (HDL) particles reconstituted from apolipoprotein A-I (apoA-I) and phospholipid (apoA-I/HDL) has been a discoidal particle ∼100 Å in diameter and the thickness of a phospholipid bilayer. Recently, Wu et al. (Wu, Z., Gogonea, V., Lee, X., Wagner, M. A., Li, X. M., Huang, Y., Undurti, A., May, R. P., Haertlein, M., Moulin, M., Gutsche, I., Zaccai, G., Didonato, J. A., and Hazen, S. L. (2009) J. Biol. Chem. 284, 36605–36619) used small angle neutron scattering to develop a new model they termed double superhelix (DSH) apoA-I that is dramatically different from the standard model. Their model possesses an open helical shape that wraps around a prolate ellipsoidal type I hexagonal lyotropic liquid crystalline phase. Here, we used three independent approaches, molecular dynamics, EM tomography, and fluorescence resonance energy transfer spectroscopy (FRET) to assess the validity of the DSH model. (i) By using molecular dynamics, two different approaches, all-atom simulated annealing and coarse-grained simulation, show that initial ellipsoidal DSH particles rapidly collapse to discoidal bilayer structures. These results suggest that, compatible with current knowledge of lipid phase diagrams...
Comparative genomics has become a tantalizing challenge in the postgenomic era, one magnified by the plethora of new genomes becoming available on a daily basis. The overwhelming list of new genomes to compare has pushed the field of bioinformatics and computational biology toward the design and development of methods capable of identifying patterns in a sea of noisy data. Despite many advances in this endeavor, persistent exceptions to the general patterns continue to make it difficult to generalize methods for comparative genomics. In this review, we discuss the different tools devised to undertake the challenge of comparative genomics and some of the exceptions that compromise the generality of such methods. We focus on endosymbiotic bacteria of insects because of the peculiarities of their genome dynamics when compared to free-living organisms.
12-Oxophytodienoic acid (OPDA) is isomerized in the gut of herbivorous insects to tetrahydrodicranenone B (iso-OPDA). The transformation is achieved by a glutathione S-transferase present in the gut epithelium. Experiments with 9-[2H]-iso-OPDA demonstrated the complete retention of the deuterium atom in the product 11-[2H]-OPDA consistent with an intramolecular 1,3-hydrogen shift. Homology modeling based on the x-ray structure of a glutathione S-transferase from Anopheles gambiae revealed that the co-factor glutathione does not covalently bind to the substrate but appears to be involved in the initial deprotonation and enolization of the OPDA. The transformation resembles that of a mammalian GST-catalyzed isomerization of Δ5-3-ketosteroids to Δ4-3-ketosteroids or the conversion of prostaglandin A1 to the biologically inactive prostaglandin B1.
Over the past two decades, high-throughput (HTP) technologies such as microarrays and mass spectrometry have fundamentally changed clinical cancer research. They have revealed novel molecular markers of cancer subtypes, metastasis, and drug sensitivity and resistance. Some have been translated into the clinic as tools for early disease diagnosis, prognosis, and individualized treatment and response monitoring. Despite these successes, many challenges remain: HTP platforms are often noisy and suffer from false positives and false negatives; optimal analysis and successful validation require complex workflows; and great volumes of data are accumulating at a rapid pace. Here we discuss these challenges, and show how integrative computational biology can help diminish them by creating new software tools, analytical methods, and data standards.
The canonical nuclear factor-κB (NF-κB) signaling pathway controls a gene network important in the cellular inflammatory response. Upon activation, NF-κB/RelA is released from cytoplasmic inhibitors, from where it translocates into the nucleus, subsequently activating negative feedback loops producing either monophasic or damped oscillatory nucleo-cytoplasmic dynamics. Although the population behavior of the NF-κB pathway has been extensively modeled, the sources of cell-to-cell variability are not well understood. We describe an integrated experimental-computational analysis of NF-κB/RelA translocation in a validated cell model exhibiting monophasic dynamics. Quantitative measures of cellular geometry and total cytoplasmic concentration and translocated RelA amounts were used as priors in Bayesian inference to estimate biophysically realistic parameter values based on dynamic live cell imaging studies of enhanced GFP-tagged RelA in stable transfectants. Bayesian inference was performed on multiple cells simultaneously, assuming identical reaction rate parameters, whereas cellular geometry and initial and total NF-κB concentration-related parameters were cell-specific. A subpopulation of cells exhibiting distinct kinetic profiles was identified that corresponded to differences in the IκBα translation rate. We conclude that cellular geometry...
In this study, we utilized an integrated bioinformatics and computational biology approach in search of new BH3-only proteins belonging to the BCL2 family of apoptotic regulators. The BH3 (BCL2 homology 3) domain mediates specific binding interactions among various BCL2 family members. It is composed of an amphipathic α-helical region of approximately 13 residues that has only a few amino acids that are highly conserved across all members. Using a generalized motif, we performed a genome-wide search for novel BH3-containing proteins in the NCBI Consensus Coding Sequence (CCDS) database. In addition to known pro-apoptotic BH3-only proteins, 197 proteins were recovered that satisfied the search criteria. These were categorized according to α-helical content and predictive binding to BCL-xL (encoded by BCL2L1) and MCL-1, two representative anti-apoptotic BCL2 family members, using position-specific scoring matrix models. Notably, the list is enriched for proteins associated with autophagy as well as a broad spectrum of cellular stress responses such as endoplasmic reticulum stress, oxidative stress, antiviral defense, and the DNA damage response. Several potential novel BH3-containing proteins are highlighted. In particular, the analysis strongly suggests that the apoptosis inhibitor and DNA damage response regulator...
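Scoring candidate sequences with a position-specific scoring matrix can be sketched as follows. The aligned peptides, the scanned sequence, the uniform background, and the pseudocount are all hypothetical choices for illustration; they are not the BH3 motif or the models used in the study.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(aligned, pseudocount=1.0):
    """Log-odds PSSM from equal-length peptides, uniform background assumed."""
    background = 1.0 / len(AMINO_ACIDS)
    pssm = []
    for pos in range(len(aligned[0])):
        column = [seq[pos] for seq in aligned]
        total = len(column) + pseudocount * len(AMINO_ACIDS)
        pssm.append({
            aa: math.log2(((column.count(aa) + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        })
    return pssm


def scan(sequence, pssm):
    """Score every window of the sequence; return (best_score, best_start)."""
    width = len(pssm)
    scores = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        scores.append((sum(pssm[i][aa] for i, aa in enumerate(window)), start))
    return max(scores)


# Hypothetical training peptides and query sequence.
pssm = build_pssm(["LRDG", "LKDG", "IRDG", "LRDA"])
best_score, best_start = scan("MKTLRDGAAS", pssm)
print(best_start, round(best_score, 2))
```

A genome-wide search would apply `scan` to every protein in a sequence database and keep hits above a score threshold calibrated on known family members.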
This thesis presents a number of novel computational methods for the analysis and design of protein-protein complexes, and their application to the study of the interactions of phosphopeptides with phosphopeptide-binding domains. A novel protein-protein interaction type, the action-at-a-distance interaction, is described in the complex of the TEM1 β-lactamase with the β-lactamase inhibitor protein (BLIP). New action-at-a-distance interactions were designed on the surface of BLIP and computed to enhance the affinity of that complex. A new method is described for the characterization and prediction of protein ligand-binding sites. This method was used to analyze the phosphoresidue-contacting sites of known phosphopeptide-binding domains, and to predict the sites of phosphoresidue contact on some protein domains for which the correct site was not known. The design of a library of variant WW domains that is predicted to be enriched in domains that might have specificity for "pS/pT-Q" peptide ligands is detailed. General methods for designing libraries of degenerate oligonucleotides for expressing protein libraries as accurately as possible are given, and applied to the described WW domain variant library. By Brian Alan Joughin. Thesis (Ph.D.), Massachusetts Institute of Technology...
Stochastic resonance is said to be observed when increases in levels of unpredictable fluctuations—e.g., random noise—cause an increase in a metric of the quality of signal transmission or detection performance, rather than a decrease. This counterintuitive effect relies on system nonlinearities and on some parameter ranges being “suboptimal”. Stochastic resonance has been observed, quantified, and described in a plethora of physical and biological systems, including neurons. Being a topic of widespread multidisciplinary interest, the definition of stochastic resonance has evolved significantly over the last decade or so, leading to a number of debates, misunderstandings, and controversies. Perhaps the most important debate is whether the brain has evolved to utilize random noise in vivo, as part of the “neural code”. Surprisingly, this debate has been for the most part ignored by neuroscientists, despite much indirect evidence of a positive role for noise in the brain. We explore some of the reasons for this and argue why it would be more surprising if the brain did not exploit randomness provided by noise—via stochastic resonance or otherwise—than if it did. We also challenge neuroscientists and biologists, both computational and experimental...
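The basic effect can be reproduced with a toy threshold detector: a subthreshold sine wave never crosses the threshold without noise, is detected best at moderate noise, and is swamped at very high noise. The signal amplitude, threshold, noise levels, and the covariance-based quality metric below are all assumptions made for this sketch.

```python
import numpy as np


def detection_quality(noise_std, rng, n_samples=20_000, threshold=1.0):
    """Covariance between a threshold detector's output and the input signal."""
    t = np.arange(n_samples)
    signal = 0.8 * np.sin(2 * np.pi * t / 100)  # peak 0.8 stays below threshold
    if noise_std > 0:
        noise = rng.normal(0.0, noise_std, n_samples)
    else:
        noise = np.zeros(n_samples)
    output = (signal + noise > threshold).astype(float)
    return float(np.mean((output - output.mean()) * (signal - signal.mean())))


rng = np.random.default_rng(0)
quality = {std: detection_quality(std, rng) for std in (0.0, 0.5, 50.0)}
print(quality)
```

The nonlinearity here is the hard threshold, and the "suboptimal" parameter is the threshold sitting above the signal peak; lowering the threshold below 0.8 would remove the benefit of adding noise.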
Source: University of Cambridge; Publisher: University of Cambridge
Type: Journal article
RIGHTS: This article is licensed under the BioMed Central licence at http://www.biomedcentral.com/about/license, which is similar to the 'Creative Commons Attribution Licence'. In brief, you may copy, distribute, and display the work; make derivative works; or make commercial use of the work, under the following conditions: the original author must be given credit; for any reuse or distribution, it must be made clear to others what the license terms of this work are.
Abstract: In this meeting report we give an overview of the 3rd International Society for Computational Biology Student Council Symposium. Furthermore, we explain the role of the Student Council and the symposium series in the context of large, international conferences.
Published version