Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/83857
Title: START : a system for flexible analysis of hundreds of genomic signal tracks in few lines of SQL-like queries
Authors: Zhang, Qiang
Degree: Ph.D.
Issue Date: 2016
Abstract: A genomic signal track is a set of genomic intervals associated with values of various types, such as measurements from high-throughput experiments. Analyzing signal tracks requires complex computational methods, which often force analysts to focus on detailed computational steps rather than on their biological questions. This thesis presents the Signal Track Analytical Research Tool (START) and the Signal Track Query Language (STQL) for easy analysis of signal tracks. STQL is an SQL-like declarative language: one specifies only what computations need to be done, not how they are to be carried out. STQL provides a rich set of constructs for manipulating genomic intervals and their values. To run STQL queries, we have developed START, a MapReduce-based system that includes a Web-based user interface and a back-end execution system. By running some typical analysis tasks, we show that the START+STQL solution is usually the simplest, and that parallel execution achieves significant speed-up on large data files.
Subjects: Genomics -- Data processing
Cellular signal transduction.
Genetic regulation.
Hong Kong Polytechnic University -- Dissertations
Pages: xvi, 128 pages : color illustrations
Appears in Collections: Thesis
