Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/75863
DC Field | Value | Language
dc.contributor | Department of Computing | -
dc.creator | Zhu, XJ | -
dc.creator | Zhang, Q | -
dc.creator | Ho, ED | -
dc.creator | Yu, KHO | -
dc.creator | Liu, C | -
dc.creator | Huang, TH | -
dc.creator | Cheng, ASL | -
dc.creator | Kao, B | -
dc.creator | Lo, E | -
dc.creator | Yip, KY | -
dc.date.accessioned | 2018-05-10T02:54:48Z | -
dc.date.available | 2018-05-10T02:54:48Z | -
dc.identifier.issn | 1471-2164 | -
dc.identifier.uri | http://hdl.handle.net/10397/75863 | -
dc.language.iso | en | en_US
dc.publisher | BioMed Central | en_US
dc.rights | © The Author(s). 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. | en_US
dc.rights | The following publication Zhu, X. J., Zhang, Q., Ho, E. D., Yu, K. H. O., Liu, C., Huang, T. H., … Yip, K. Y. (2017). START: a system for flexible analysis of hundreds of genomic signal tracks in few lines of SQL-like queries. BMC Genomics, 18, 749, 1-18 is available at https://dx.doi.org/10.1186/s12864-017-4071-1 | en_US
dc.subject | Human genomics | en_US
dc.subject | Signal tracks | en_US
dc.subject | Data analysis | en_US
dc.title | START: a system for flexible analysis of hundreds of genomic signal tracks in few lines of SQL-like queries | en_US
dc.type | Journal/Magazine Article | en_US
dc.identifier.epage | 18 | -
dc.identifier.volume | 18 | -
dc.identifier.doi | 10.1186/s12864-017-4071-1 | -
dcterms.abstract | Background: A genomic signal track is a set of genomic intervals associated with values of various types, such as measurements from high-throughput experiments. Analysis of signal tracks requires complex computational methods, which often make the analysts focus too much on the detailed computational steps rather than on their biological questions. Results: Here we propose Signal Track Query Language (STQL) for simple analysis of signal tracks. It is a Structured Query Language (SQL)-like declarative language, which means one only specifies what computations need to be done but not how these computations are to be carried out. STQL provides a rich set of constructs for manipulating genomic intervals and their values. To run STQL queries, we have developed the Signal Track Analytical Research Tool (START, http://yiplab.cse.cuhk.edu.hk/start/), a system that includes a Web-based user interface and a back-end execution system. The user interface helps users select data from our database of around 10,000 commonly-used public signal tracks, manage their own tracks, and construct, store and share STQL queries. The back-end system automatically translates STQL queries into optimized low-level programs and runs them on a computer cluster in parallel. We use STQL to perform 14 representative analytical tasks. By repeating these analyses using bedtools, Galaxy and custom Python scripts, we show that the STQL solution is usually the simplest, and the parallel execution achieves significant speed-up with large data files. Finally, we describe how a biologist with minimal formal training in computer programming self-learned STQL to analyze DNA methylation data we produced from 60 pairs of hepatocellular carcinoma (HCC) samples. Conclusions: Overall, STQL and START provide a generic way for analyzing a large number of genomic signal tracks in parallel easily. | -
dcterms.accessRights | open access | en_US
dcterms.bibliographicCitation | BMC genomics, 2017, v. 18, 749, p. 1-18 | -
dcterms.isPartOf | BMC genomics | -
dcterms.issued | 2017 | -
dc.identifier.isi | WOS:000411432600001 | -
dc.identifier.scopus | 2-s2.0-85029878806 | -
dc.identifier.pmid | 28938868 | -
dc.identifier.eissn | 1471-2164 | -
dc.identifier.artn | 749 | -
dc.description.validate | 201805 bcrc | -
dc.description.oa | Version of Record | en_US
dc.identifier.FolderNumber | OA_IR/PIRA | en_US
dc.description.pubStatus | Published | en_US
Appears in Collections: Journal/Magazine Article
Files in This Item:
File | Description | Size | Format
Zhu_System_Flexible_Hundreds.pdf | | 1.93 MB | Adobe PDF
Open Access Information
Status: open access
File Version: Version of Record
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.