Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/14581
Title: Discovering variable-length patterns in protein sequences for protein-protein interaction prediction
Authors: Hu, L
Chan, KCC 
Issue Date: 2015
Source: IEEE transactions on nanobioscience, 2015, v. 14, no. 4, p. 409-416
Abstract: Several computational approaches have recently been proposed to predict Protein-Protein Interactions (PPIs). Among them, sequence-based approaches are often preferred because they do not require prior knowledge about proteins. However, in deciding whether two proteins may interact, existing sequence-based approaches consider only fixed-length segments. We believe that if variable-length segments are also considered, interactions between proteins can be predicted more accurately. To consider variable-length segments for PPI prediction, we have developed an algorithm called VLASPD. Given a database of protein sequences, VLASPD proceeds in several steps. The protein database is first searched to identify frequent sequence segments (FSSs) of different lengths. The different combinations of the presence and absence of these FSSs are then used to form associative sequential patterns (ASPs). Based on a statistical measure, the ASPs that occur significantly frequently among proteins in the training set are then identified as significant associative sequential patterns (SASPs). If an SASP is found in a protein pair, it can be considered as providing some evidence to support or refute the existence of an interaction between the two proteins. The amount of evidence provided is then quantified with an information-theoretic measure. How likely two proteins are to interact is then decided by the total amount of evidence provided by the SASPs found in the protein pair. To test the effectiveness of VLASPD, we used several sets of real data. The experimental results show that VLASPD can be a promising approach for PPI prediction. VLASPD is available for use and testing at http://www.comp.polyu.edu.hk/~cslhu/resources/vlaspd/.
Keywords: Prediction
Protein-protein interaction
Sequence information
Variable-length pattern
Publisher: Institute of Electrical and Electronics Engineers
Journal: IEEE transactions on nanobioscience 
ISSN: 1536-1241
EISSN: 1558-2639
DOI: 10.1109/TNB.2015.2429672
Appears in Collections: Journal/Magazine Article
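
Illustrative sketch: the pipeline outlined in the abstract (frequent variable-length segments, patterns over protein pairs, evidence summed into a score) might look roughly like the Python below. This is not the authors' VLASPD implementation. All function names, parameters, and thresholds are hypothetical. The sketch only pairs segments that are present in both proteins (absence combinations are ignored), it skips the statistical significance filter, and it uses a smoothed log-odds weight as a stand-in because the abstract does not name the information-theoretic measure.

```python
from collections import Counter
from itertools import product
from math import log2


def frequent_segments(sequences, min_len=2, max_len=4, min_support=0.5):
    """Substrings of length min_len..max_len that occur in at least
    min_support of the sequences (counted once per sequence)."""
    counts = Counter()
    for seq in sequences:
        seen = {seq[i:i + k]
                for k in range(min_len, max_len + 1)
                for i in range(len(seq) - k + 1)}
        counts.update(seen)
    threshold = min_support * len(sequences)
    return {seg for seg, n in counts.items() if n >= threshold}


def pair_patterns(seq_a, seq_b, segments):
    """Simplified patterns for an ordered pair: (segment in A, segment in B).
    Assumption: only presence/presence combinations are considered."""
    in_a = [s for s in segments if s in seq_a]
    in_b = [s for s in segments if s in seq_b]
    return set(product(in_a, in_b))


def evidence_weights(pairs, labels, segments):
    """Smoothed log-odds weight per pattern (a stand-in measure, not
    necessarily the one used in the paper)."""
    pos, neg = Counter(), Counter()
    for (a, b), interacts in zip(pairs, labels):
        (pos if interacts else neg).update(pair_patterns(a, b, segments))
    n_pos = sum(1 for y in labels if y)
    n_neg = len(labels) - n_pos
    weights = {}
    for pat in set(pos) | set(neg):
        p_pos = (pos[pat] + 1) / (n_pos + 2)  # add-one smoothing
        p_neg = (neg[pat] + 1) / (n_neg + 2)
        weights[pat] = log2(p_pos / p_neg)
    return weights


def score(seq_a, seq_b, segments, weights):
    """Total evidence: sum of weights of known patterns found in the pair."""
    found = pair_patterns(seq_a, seq_b, segments)
    return sum(weights.get(p, 0.0) for p in found)


# Toy data (made up for illustration, not real proteins).
toy_sequences = ["ACDEA", "ACDKK", "WWYAC", "KKWWY"]
toy_pairs = [("ACDEA", "WWYAC"), ("ACDKK", "KKWWY"), ("ACDEA", "ACDKK")]
toy_labels = [True, True, False]

segs = frequent_segments(toy_sequences, min_len=2, max_len=3)
w = evidence_weights(toy_pairs, toy_labels, segs)
print(score("ACDEA", "KKWWY", segs, w))
```

A positive total suggests the pair resembles interacting training pairs, and a negative total suggests the opposite. A real system would also need the significance test on patterns and a principled decision threshold.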

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.