
Kernel Methods for Protein-Protein Interaction Prediction


Title: Kernel Methods for Protein-Protein Interaction Prediction
Author(s): Lei, Jinmin
Contributor: University of Helsinki, Faculty of Science, Department of Computer Science
Language: English
Acceptance year: 2016
Abstract:
Despite the efficiency that high-throughput technology brings to detecting protein-protein interactions, each wet-lab method still has its own pitfalls. As a complementary strategy, dry-lab methods are less expensive and have the advantage of data fusion, which overcomes the biases of individual data sources. This thesis explores which features are indicative in the protein-protein interaction prediction task, the effect of a graph model on that task, and the capability of multiple kernel learning algorithms to improve prediction performance.
Different kernels are applied to different features. We integrate 14 global features and 10 graph features into the SVM framework via different kernel methods, and then compare the prediction performance of the two feature sets. For the graph features, we represent individual proteins as labeled graphs and apply three different graph kernels to explore which one best captures the relationships between proteins. For merging heterogeneous data, we apply different multiple kernel learning algorithms and explore their capabilities in improving prediction accuracy. We formulate the prediction of protein-protein interactions as a binary classification problem. In the SVM framework, this requires constructing a kernel that measures the similarity between protein pairs from a kernel that measures the similarity between individual proteins. To this end, we employ three different pairwise kernels and explore how well each captures the relationships between protein pairs. We perform experiments on 896 Saccharomyces cerevisiae (baker's yeast) proteins and report the prediction performance of the three pairwise kernels on the 10 graph features and the 14 global features, as well as the prediction results of the different multiple kernel learning algorithms.
Our experimental results reveal that the overall prediction performance achieved by the 10 graph features applied to the proposed graph model is better than that achieved by the 14 global protein features, and that among all multiple kernel learning methods, the align method outperforms the others in the protein-protein interaction prediction task. Our methods detect interacting proteins at a high level. Based on this work, low-level models can be devised to detect the exact interaction sites between proteins.
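To illustrate how a kernel between protein pairs can be built from a kernel between individual proteins, the following is a minimal sketch of one common construction, the tensor product pairwise kernel. The abstract does not name the three pairwise kernels used in the thesis, so this choice is an assumption. The linear base kernel, the protein names P1 to P3, and their feature vectors are hypothetical toy values rather than data from the thesis.

```python
from itertools import combinations


def linear_kernel(x, y):
    """Dot product of two feature vectors; a stand-in for any protein-level kernel."""
    return sum(xi * yi for xi, yi in zip(x, y))


def tensor_product_pairwise_kernel(pair1, pair2, base_kernel):
    """Similarity between two protein pairs (a, b) and (c, d).

    K((a, b), (c, d)) = k(a, c) * k(b, d) + k(a, d) * k(b, c)

    Summing over both matchings makes the value independent of the
    order in which the two proteins of a pair are listed.
    """
    a, b = pair1
    c, d = pair2
    return (base_kernel(a, c) * base_kernel(b, d)
            + base_kernel(a, d) * base_kernel(b, c))


def pairwise_gram_matrix(pairs, features, base_kernel):
    """Gram matrix over protein pairs, given per-protein feature vectors."""
    n = len(pairs)
    gram = [[0.0] * n for _ in range(n)]
    for i in range(n):
        left = (features[pairs[i][0]], features[pairs[i][1]])
        for j in range(i, n):
            right = (features[pairs[j][0]], features[pairs[j][1]])
            value = tensor_product_pairwise_kernel(left, right, base_kernel)
            gram[i][j] = value
            gram[j][i] = value  # the pairwise kernel is symmetric
    return gram


# Hypothetical toy data: three proteins with two-dimensional feature vectors.
features = {
    "P1": [1.0, 0.0],
    "P2": [0.0, 1.0],
    "P3": [1.0, 1.0],
}

# All unordered protein pairs: (P1, P2), (P1, P3), (P2, P3).
pairs = list(combinations(sorted(features), 2))
gram = pairwise_gram_matrix(pairs, features, linear_kernel)

for pair, row in zip(pairs, gram):
    print(pair, row)
```

A Gram matrix of this form could then be passed to an SVM implementation that accepts precomputed kernels, with one binary label (interacting or not) per protein pair.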


Files in this item

Files Size Format View
leijinmin_thesis.pdf 670.1Kb PDF
