Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/68907
Title: Compression of multiple DNA sequences using intra-sequence and inter-sequence similarities
Authors: Cheng, KO 
Wu, CPP 
Law, NF 
Siu, WC 
Keywords: Biology and genetics
Data compaction and compression
Issue Date: 2015
Publisher: ACM Special Interest Group
Source: IEEE/ACM transactions on computational biology and bioinformatics, 2015, v. 12, no. 6, p. 1322-1332
Journal: IEEE/ACM transactions on computational biology and bioinformatics 
Abstract: Traditionally, intra-sequence similarity is exploited to compress a single DNA sequence. Recently, remarkable compression performance has been achieved for individual DNA sequences from the same population by encoding each sequence's differences from a nearly identical reference sequence. Nevertheless, there is a lack of general algorithms that also allow less similar reference sequences. In this work, we extend intra-sequence similarity to inter-sequence similarity by finding approximate matches of subsequences between the DNA sequence and a set of reference sequences. Hence, a set of nearly identical DNA sequences from the same population, or a set of partially similar DNA sequences such as chromosome sequences and DNA sequences of related species, can be compressed together. For practical compressors, the compressed size is usually influenced by the order in which the sequences are compressed. Fast search algorithms for the optimal compression order are thus developed for multiple-sequence compression. Experimental results on artificial and real datasets demonstrate that the proposed multiple-sequence compression methods with fast compression order search achieve good compression performance under different levels of similarity among the DNA sequences.
URI: http://hdl.handle.net/10397/68907
ISSN: 1545-5963
EISSN: 1557-9964
DOI: 10.1109/TCBB.2015.2403370
Appears in Collections:Journal/Magazine Article

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.