Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/25884
DC Field | Value | Language
dc.contributor | Department of Electronic and Information Engineering | -
dc.creator | Wu, CPP | -
dc.creator | Law, NF | -
dc.creator | Siu, WC | -
dc.date.accessioned | 2015-08-28T04:30:33Z | -
dc.date.available | 2015-08-28T04:30:33Z | -
dc.identifier.issn | 0973-2063 | en_US
dc.identifier.uri | http://hdl.handle.net/10397/25884 | -
dc.language.iso | en | en_US
dc.publisher | Biomedical Informatics Publishing Group | en_US
dc.rights | © 2008 Biomedical Informatics Publishing Group | en_US
dc.rights | This is an open-access article, which permits unrestricted use, distribution, and reproduction in any medium, for non-commercial purposes, provided the original author and source are credited. | en_US
dc.rights | The following publication Wu, C. P. P., Law, N. F., & Siu, W. C. (2008). Cross chromosomal similarity for DNA sequence compression. Bioinformation, 2(9), 412-416 is available at https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2533061/ | en_US
dc.subject | DNA | en_US
dc.subject | Sequence | en_US
dc.subject | Chromosome | en_US
dc.subject | Prediction | en_US
dc.subject | S. cerevisiae | en_US
dc.title | Cross chromosomal similarity for DNA sequence compression | en_US
dc.type | Journal/Magazine Article | en_US
dc.identifier.spage | 412 | en_US
dc.identifier.epage | 416 | en_US
dc.identifier.volume | 2 | en_US
dc.identifier.issue | 9 | en_US
dcterms.abstract | Current DNA compression algorithms work by finding similar repeated regions within the DNA sequence and then encoding these regions together to achieve compression. Our study on chromosome sequence similarity reveals that the length of similar repeated regions within one chromosome is about 4.5% of the total sequence length. The compression gain is often not high because of these short lengths. It is well known that similarity exists among different regions of chromosome sequences. This implies that similar repeated sequences are found among different regions of chromosome sequences. Here, we study cross-chromosomal similarity for DNA sequence compression. The length and location of similar repeated regions among the sixteen chromosomes of S. cerevisiae are studied. It is found that the average percentage of similar subsequences found between two chromosome sequences is about 10%, of which 8% comes from cross-chromosomal prediction and 2% from self-chromosomal prediction. Among the 16 chromosomes studied, the percentage of similar subsequences is about 18%, of which only 1.2% comes from self-chromosomal prediction while the rest is from cross-chromosomal prediction. This suggests the importance of cross-chromosomal similarities in addition to self-chromosomal similarities in DNA sequence compression. An additional 23% of storage space could be saved on average by using self-chromosomal and cross-chromosomal predictions in compressing the 16 chromosomes of S. cerevisiae. | -
dcterms.accessRights | open access | en_US
dcterms.bibliographicCitation | Bioinformation, 2008, v. 2, no. 9, p. 412-416 | -
dcterms.isPartOf | Bioinformation | -
dcterms.issued | 2008 | -
dc.identifier.rosgroupid | r40621 | -
dc.description.ros | 2008-2009 > Academic research: refereed > Publication in refereed journal | en_US
dc.description.oa | Version of Record | en_US
dc.identifier.FolderNumber | OA_IR/PIRA | en_US
dc.description.pubStatus | Published | en_US
Appears in Collections: Journal/Magazine Article
Files in This Item:
File | Description | Size | Format
Wu_Cross_Chromosomal_Similarity.pdf | | 130.24 kB | Adobe PDF
Open Access Information
Status: open access
File Version: Version of Record