Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/14581
DC Field | Value | Language
dc.contributor | Department of Computing | -
dc.creator | Hu, L | -
dc.creator | Chan, KCC | -
dc.date.accessioned | 2015-10-13T08:27:03Z | -
dc.date.available | 2015-10-13T08:27:03Z | -
dc.identifier.issn | 1536-1241 | -
dc.identifier.uri | http://hdl.handle.net/10397/14581 | -
dc.language.iso | en | en_US
dc.publisher | Institute of Electrical and Electronics Engineers | en_US
dc.subject | Prediction | en_US
dc.subject | Protein-protein interaction | en_US
dc.subject | Sequence information | en_US
dc.subject | Variable-length pattern | en_US
dc.title | Discovering variable-length patterns in protein sequences for protein-protein interaction prediction | en_US
dc.type | Journal/Magazine Article | en_US
dc.identifier.spage | 409 | -
dc.identifier.epage | 416 | -
dc.identifier.volume | 14 | -
dc.identifier.issue | 4 | -
dc.identifier.doi | 10.1109/TNB.2015.2429672 | -
dcterms.abstract | Several computational approaches have recently been proposed to predict protein-protein interactions (PPIs). Among them, sequence-based approaches are often preferred because they do not require prior knowledge about proteins to perform their tasks. However, in deciding whether two proteins may interact with each other, existing sequence-based approaches consider only fixed-length segments. We believe that if variable-length segments can also be considered, interactions between proteins can be predicted more accurately. To consider variable-length segments for PPI prediction, we have developed the VLASPD algorithm. Given a database of protein sequences, VLASPD performs its task in several steps. The protein database is first searched to identify frequent sequence segments (FSSs) of different lengths. The different combinations of the presence and absence of these FSSs are then used to form associative sequential patterns (ASPs). Based on a statistical measure, the ASPs that occur with statistically significant frequency among proteins in the training set are then identified as significant associative sequential patterns (SASPs). If an SASP is found in a protein pair, it can be considered as providing some evidence to support or refute the existence of an interaction between the two proteins. The amount of evidence provided is then quantified with an information-theoretic measure. How likely two proteins are to interact with each other is then decided by the total amount of evidence provided by the SASPs found in the protein pair. To test the effectiveness of VLASPD, we used several sets of real data. The experimental results show that VLASPD can be a promising approach for PPI prediction. VLASPD is available for use and testing at http://www.comp.polyu.edu.hk/~cslhu/resources/vlaspd/. | -
dcterms.bibliographicCitation | IEEE transactions on nanobioscience, 2015, v. 14, no. 4, p. 409-416 | -
dcterms.isPartOf | IEEE transactions on nanobioscience | -
dcterms.issued | 2015 | -
dc.identifier.scopus | 2-s2.0-84930933387 | -
dc.identifier.eissn | 1558-2639 | -
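
Illustrative sketch (not part of the record, and not the authors' VLASPD code): the abstract describes mining frequent sequence segments of different lengths and then summing the evidence that patterns provide for or against an interaction. The Python below is a minimal sketch of that general idea under stated assumptions. The function names, the length and support thresholds, the choice of a pattern as a pair of segments (one per protein), and the add-one-smoothed log-odds used as the evidence measure are all hypothetical stand-ins; the abstract does not specify the statistical test or the information-theoretic measure, and the significance-filtering step is omitted here.

```python
import math
from collections import Counter


def frequent_segments(sequences, min_len=2, max_len=4, min_support=2):
    """Return segments of length min_len..max_len found in >= min_support sequences.

    Assumption: support is counted once per sequence, not per occurrence.
    """
    support = Counter()
    for seq in sequences:
        seen = set()
        for k in range(min_len, max_len + 1):
            for i in range(len(seq) - k + 1):
                seen.add(seq[i:i + k])
        support.update(seen)
    return {seg for seg, n in support.items() if n >= min_support}


def weight_of_evidence(pattern, pos_pairs, neg_pairs):
    """Smoothed log-odds of a pattern occurring in interacting vs non-interacting pairs.

    Assumption: a pattern is (a, b), present when segment a occurs in the first
    protein and segment b in the second. This log-odds is a stand-in measure.
    """
    a, b = pattern

    def hits(pairs):
        return sum(1 for p, q in pairs if a in p and b in q)

    p_pos = (hits(pos_pairs) + 1) / (len(pos_pairs) + 2)
    p_neg = (hits(neg_pairs) + 1) / (len(neg_pairs) + 2)
    return math.log(p_pos / p_neg)


def score_pair(p, q, patterns, pos_pairs, neg_pairs):
    """Sum the evidence of every pattern present in the protein pair (p, q)."""
    return sum(
        weight_of_evidence(pat, pos_pairs, neg_pairs)
        for pat in patterns
        if pat[0] in p and pat[1] in q
    )
```

Under these assumptions, a positive total from `score_pair` would count as evidence for an interaction and a negative total as evidence against one; where to place the decision threshold is left open.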
Appears in Collections: Journal/Magazine Article

Scopus citations: 5 (last week: 0; last month: 0), as of Sep 8, 2020
Web of Science citations: 6 (last week: 0; last month: 0), as of Oct 22, 2020
Page views: 150 (last week: 0), as of Oct 19, 2020

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.