Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/37942
Title: Speeding up subcellular localization by extracting informative regions of protein sequences for profile alignment
Authors: Wang, W
Mak, MW 
Kung, SY
Keywords: Bioinformatics
Cellular biophysics
Genetics
Macromolecules
Pattern classification
Proteins
Sorting
N-terminal sorting signal
Cleavage site prediction
Gene
Homology-based classifier
Informative region
Informative segment
Post-proteomics era
Profile alignment
Protein annotation
Protein sequence
Subcellular localization
Amino acids
Genomics
Peptides
Prediction methods
Sequences
Support vector machine classification
Support vector machines
Cleavage sites prediction
Profiles alignment
Protein sequences
Issue Date: 2010
Source: 2010 IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology (CIBCB), 2-5 May 2010, Montreal, QC, p. 1-8
Abstract: The functions of proteins are closely related to their subcellular locations. In the post-proteomics era, the amount of gene and protein data grows exponentially, which necessitates the prediction of subcellular localization by computational means. This paper proposes mitigating the computational burden of alignment-based approaches to subcellular localization prediction by using the information provided by N-terminal sorting signals. To this end, a cascaded fusion of cleavage site prediction and profile alignment is proposed. Specifically, the informative segments of protein sequences are identified by a cleavage site predictor. Then, only the informative segments are passed to a homology-based classifier for predicting the subcellular locations. Experimental results on a newly constructed dataset show that the method combines the best properties of both approaches and attains a higher accuracy than using the full-length sequences. Moreover, the method reduces the computation time by a factor of 20. We suggest that the method will be useful for biologists conducting large-scale protein annotation and for bioinformaticians performing preliminary investigations of new algorithms that involve pairwise alignments.
URI: http://hdl.handle.net/10397/37942
ISBN: 978-1-4244-6766-2
DOI: 10.1109/CIBCB.2010.5510320
Appears in Collections: Conference Paper
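
Illustrative sketch: the two-stage cascade described in the abstract (find a cleavage site, keep only the informative N-terminal segment, then classify by alignment) might look roughly like the Python below. This is not the authors' implementation. The cleavage rule `toy_cleavage_site` is a made-up placeholder, plain Needleman-Wunsch scoring on raw sequences stands in for profile alignment, and a nearest-neighbour vote stands in for the support vector machine classifier. All function names, parameters, and labels are hypothetical.

```python
def toy_cleavage_site(seq, lo=10, hi=40):
    # Toy rule (an assumption, not a real predictor): cut after the first
    # alanine found in the window [lo, hi); return 0 if none is found.
    for i in range(lo, min(hi, len(seq))):
        if seq[i] == "A":
            return i + 1
    return 0


def informative_segment(seq, site, margin=5, default_len=30):
    # Keep the N-terminal residues up to the predicted site plus a margin;
    # fall back to a fixed-length prefix when no site is predicted.
    end = site + margin if site > 0 else default_len
    return seq[:end]


def alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    # Needleman-Wunsch global alignment score. The work is proportional to
    # len(a) * len(b), which is why shorter segments are cheaper to align.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cur.append(max(diag, prev[j] + gap, cur[j - 1] + gap))
        prev = cur
    return prev[-1]


def predict_location(query, training, site_predictor=toy_cleavage_site):
    # Stage 1: reduce the query to its informative segment.
    q = informative_segment(query, site_predictor(query))

    # Stage 2: nearest neighbour on alignment score (stand-in for an SVM).
    def score(item):
        seq, _label = item
        return alignment_score(q, informative_segment(seq, site_predictor(seq)))

    return max(training, key=score)[1]


# Hypothetical toy data: (sequence, location label) pairs.
training = [
    ("M" + "L" * 11 + "A" + "RRRRR" + "G" * 40, "mitochondrion"),
    ("M" + "K" * 11 + "A" + "DDDDD" + "G" * 40, "secretory pathway"),
]
query = "M" + "L" * 11 + "A" + "RRRRR" + "T" * 40
print(predict_location(query, training))
```

In this toy setting each 58-residue sequence is cut to an 18-residue segment before alignment, so each pairwise comparison fills far fewer dynamic-programming cells than a full-length alignment would. The actual speed-up reported in the abstract depends on the authors' predictor, profiles, and dataset.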



Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.