Evolution of protein interdependence from pairs to networks

Read, W. J. (2015) Evolution of protein interdependence from pairs to networks. PhD thesis, University of Reading


I present a method for inferring a protein interdependency network, based on correlated evolution between proteins on a large phylogeny of 72 diverse eukaryotic species. My original contribution is in the span of the phylogenetic tree used to generate the network: similar studies have concentrated on more localised regions in the tree of life and have been undertaken with more limited intent. I show that the whole-eukaryotic correlated evolution network is a real network and has interesting features of its own. The method can be broken down into three major, sequential parts: binary trait derivation, phylogenetic inference and likelihood analysis.

I describe the implementation of a reciprocal BLAST protocol for the inference of a binary trait matrix corresponding to the presence or absence of orthologues in each species in the analysis, based on a reference human proteome. Rows in the matrix correspond to the reference proteins, columns to species: entries are 1, denoting presence, or 0, denoting absence. The matrix that I derive is mapped onto a phylogeny of the same set of species, to facilitate the detection of correlated evolution between proteins, or orthologous sets thereof, based on the pattern of gains and losses. The phylogeny is inferred from a set of genes, which are selected according to the criterion that they be present in all 72 species. Of the 15 genes identified from the trait matrix as meeting this criterion, 14 are aligned and used for phylogenetic inference. The inference itself is performed using the program BayesPhylogenies, which implements a phylogenetic mixture model using a Markov chain Monte Carlo (MCMC) method. A consensus phylogeny (tree) is calculated after the chain has been run for many millions of iterations; trees based on the individual genes are also inferred, for purposes of comparison.

I use the program BayesTraits to perform a likelihood analysis on pairs of proteins from the trait matrix. This method detects correlated evolution by means of a likelihood ratio statistic, which compares the likelihood of the two proteins having evolved independently with the likelihood of their having evolved in an interdependent, or correlated, fashion. If the likelihood ratio statistic exceeds a certain threshold, this is interpreted as the signature of correlated evolution. Using presumptively interacting protein pairs from the Human Protein Reference Database, and a control (or null) set of pairs where no interaction is expected, I present evidence for the efficacy of the method in detecting correlated evolution. I proceed to infer a network based on correlated evolution, wherein each link represents an instance of pairwise correlation, and demonstrate that a power law gives a good fit to the distribution of nodal degree within the network, which is also the case for a network of presumptive protein interactions with no filter for correlated evolution. Finally, I infer a new equation to characterise the evolutionary rules which fashioned the network. I propose a method for testing the equation, and discuss future directions.
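
As a rough illustration of two steps named in the abstract, the sketch below selects proteins present in every species from a toy presence/absence matrix, then forms a likelihood ratio statistic from a pair of log-likelihoods. It is not the thesis implementation. The species and protein names, the matrix values, the function names and the log-likelihoods are all hypothetical. The abstract does not state the threshold used, so the sketch assumes the statistic is compared against a chi-square distribution with 4 degrees of freedom (the difference in parameter count between the usual 8-rate dependent and 4-rate independent models for two binary traits) at a 0.05 significance level.

```python
import math

# Toy presence/absence matrix: rows are reference proteins, columns are species.
# All names and values are invented for illustration only.
SPECIES = ["sp_A", "sp_B", "sp_C", "sp_D"]
TRAITS = {
    "protein_1": [1, 1, 1, 1],
    "protein_2": [1, 0, 1, 0],
    "protein_3": [1, 0, 1, 0],
    "protein_4": [1, 1, 0, 1],
}


def universal_proteins(traits):
    """Return proteins scored 1 in every species (candidates for tree inference)."""
    return [name for name, row in traits.items() if all(v == 1 for v in row)]


def likelihood_ratio(lnl_independent, lnl_dependent):
    """Likelihood ratio statistic: 2 * (lnL_dependent - lnL_independent)."""
    return 2.0 * (lnl_dependent - lnl_independent)


def chi2_sf_4df(x):
    """Upper-tail probability of a chi-square variable with 4 degrees of freedom.

    For even degrees of freedom the tail has a closed form; with 4 df it is
    exp(-x/2) * (1 + x/2).
    """
    if x <= 0:
        return 1.0
    return math.exp(-x / 2.0) * (1.0 + x / 2.0)


def is_correlated(lnl_independent, lnl_dependent, alpha=0.05):
    """Assumed decision rule: flag the pair if the tail probability is below alpha."""
    lr = likelihood_ratio(lnl_independent, lnl_dependent)
    return chi2_sf_4df(lr) < alpha


if __name__ == "__main__":
    print("Present in all species:", universal_proteins(TRAITS))
    # Hypothetical log-likelihoods, as might be reported for one protein pair.
    lr = likelihood_ratio(-20.0, -12.0)
    print("LR statistic:", lr, "tail probability:", chi2_sf_4df(lr))
    print("Flagged as correlated:", is_correlated(-20.0, -12.0))
```

In practice the two log-likelihoods would come from fitting the independent and dependent models to a pair of rows from the trait matrix on the inferred phylogeny; here they are simply supplied as numbers.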

Item Type: Thesis (PhD)
Thesis Supervisor: Pagel, M.
Thesis/Report Department: School of Biological Sciences
Divisions: Life Sciences > School of Biological Sciences
ID Code: 68267

