Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/24611
Title: UPSEC : an algorithm for classifying unaligned protein sequences into functional families
Authors: Ma, PCH
Chan, KCC 
Keywords: Information theory
Pattern discovery
Protein sequence classification
Residual analysis
Weight of evidence
Issue Date: 2008
Publisher: Mary Ann Liebert Inc
Source: Journal of computational biology, 2008, v. 15, no. 4, p. 431-443
Journal: Journal of Computational Biology 
Abstract: To classify proteins into functional families based on their primary sequences, popular algorithms such as the k-NN-, HMM-, and SVM-based algorithms are often used. For many of these algorithms to perform their tasks, protein sequences need to be properly aligned first. Since the alignment process can be error-prone, protein classification may not be performed very accurately. To improve classification accuracy, we propose an algorithm, called the Unaligned Protein SEquence Classifier (UPSEC), which can perform its tasks without sequence alignment. UPSEC makes use of a probabilistic measure to identify residues that are useful for classification in both positive and negative training samples, and can handle multi-class classification with a single classifier and a single pass through the training data. UPSEC has been tested with real protein data sets. Experimental results show that UPSEC can effectively classify unaligned protein sequences into their corresponding functional families, and the patterns it discovers during the training process can be biologically meaningful.
URI: http://hdl.handle.net/10397/24611
DOI: 10.1089/cmb.2007.0113
Appears in Collections: Journal/Magazine Article
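
The abstract does not spell out UPSEC's probabilistic measure, so the following is only a rough illustration of the general idea it describes: classifying unaligned sequences into several families with one classifier built in a single pass over the training data. It is not the UPSEC algorithm. The use of short subsequences (k-mers) as features, the add-one smoothing, the particular log-ratio form of the weight of evidence, and all function names are assumptions made for this sketch, and the sequences in the usage note are toy strings rather than real protein data.

```python
import math
from collections import Counter, defaultdict


def kmers(seq, k=3):
    """Return the set of length-k substrings of seq; no alignment is required."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def train(samples, k=3):
    """Single pass over (sequence, family) pairs, counting k-mer presence per family."""
    counts = defaultdict(Counter)  # family -> k-mer -> number of sequences containing it
    totals = Counter()             # family -> number of training sequences
    for seq, family in samples:
        totals[family] += 1
        counts[family].update(kmers(seq, k))
    return counts, totals


def weight_of_evidence(kmer, family, counts, totals):
    """log [P(kmer | family) / P(kmer | all other families)], with add-one smoothing.

    Sequences from the other families act as negative samples for this family.
    """
    in_fam = counts[family][kmer]
    n_fam = totals[family]
    in_rest = sum(counts[f][kmer] for f in totals if f != family)
    n_rest = sum(totals.values()) - n_fam
    p_fam = (in_fam + 1) / (n_fam + 2)
    p_rest = (in_rest + 1) / (n_rest + 2)
    return math.log(p_fam / p_rest)


def classify(seq, counts, totals, k=3):
    """Sum the evidence of every k-mer present in seq and pick the best-scoring family."""
    present = kmers(seq, k)
    scores = {
        family: sum(weight_of_evidence(m, family, counts, totals) for m in present)
        for family in totals
    }
    return max(scores, key=scores.get), scores


# Toy usage: the query shares a motif with family "A" but at a shifted position.
toy_samples = [
    ("MKVLAAGIT", "A"),
    ("MKVLSAGIT", "A"),
    ("GHWPPQRRE", "B"),
    ("GHWPAQRRE", "B"),
]
toy_counts, toy_totals = train(toy_samples)
predicted, toy_scores = classify("TTMKVLAAG", toy_counts, toy_totals)
print(predicted, toy_scores)
```

Because each score only sums evidence from k-mers that occur anywhere in the query, the motif's position does not matter, which is the sense in which this sketch avoids alignment. A k-mer that is common in one family and rare in the others contributes positive evidence for that family and negative evidence for the rest.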


Scopus citations: 7 (last week: 0; last month: 0), as of May 22, 2017
Web of Science citations: 6 (last week: 0; last month: 0), as of May 21, 2017
Page views: 25 (last week: 0), checked on May 21, 2017

