Please use this identifier to cite or link to this item:
Title: Spatial keyword query processing over road networks
Authors: Li, Wengen
Degree: Ph.D.
Issue Date: 2017
Abstract: With the rapid development of geo-positioning techniques and high-speed mobile networks, location-aware applications like location-based services, location-based social networks and intelligent driving navigation have been gaining tremendous popularity in recent years. When using these location-aware applications, users often issue queries with both spatial and textual requirements, e.g., searching restaurants within one kilometer from the current location, where the term "restaurants" is textual requirement while "within one kilometer from the current location" is spatial requirement. To provide users query services regarding both spatial and textual requirements, the research community has proposed Spatial Keyword Query which combines traditional spatial query in spatial databases and keyword query in information retrieval. Specifically, given a spatial keyword query with spatial requirement (e.g., query location and spatial range) and textual requirement (usually described by query keywords), both of (i) the spatial relationship (e.g., spatial proximity) between target objects and the spatial requirement and (ii) the textual relevance between the textual descriptions of target objects and query keywords are considered. In the past decade, extensive studies have been conducted to solve all kinds of spatial keyword queries. However, these studies mainly focus on Euclidean space and cannot effectively support spatial keyword queries over road networks on which the spatial proximity between two locations is network distance rather than Euclidean distance. Considering that people's travel routes are restricted by road networks, and the Euclidean distance and network distance between two locations could be quite different, it is of high necessity to study spatial keyword query over road networks. In addition, existing spatial keyword queries mainly aim to locate the target objects and ignore the function of routing, and are thus unable to provide complete travel solutions to get to the target objects. Motivated by these observations, this thesis focuses on "Spatial Keyword Query Processing over Road Networks" and aims to propose effective solutions to the following three typical types of interesting queries: First, we study services locating over road networks to find the locations of required services. For this type of query, we propose Range Spatial Keyword (RSK) query to retrieve all the objects that are textually relevant to the query keywords and locate within a given range from the query location. Though RSK query has received extensive studies in Euclidean space, little has been done to deal with it over road networks. To process RSK query over road networks efficiently, we first propose an expansion-based approach based on the locality of RSK query. Then, we improve the efficiency of this approach based on the observation that road network distance is always larger than or equal to the corresponding Euclidean distance between two locations. In addition, to ensure high scalability on large road networks and RSK queries of large query ranges, we design an Rnet Hierarchy index and devise an efficient query processing algorithm based on this index. According to the experiments, Rnet Hierarchy-based approach can deal with RSK queries over road networks of millions of vertices.
Second, to search a route relevant to query keywords, we propose two queries, i.e., Keyword Coverage Route (KCR) query and Bounded-Cost Information Route (BCIR) query. Given a set of query keywords, KCR query retrieves an optimal route such that its length is less than a distance threshold and it covers the most query keywords. By contrast, BCIR query retrieves the optimal route whose textual relevance to the query keywords is maximized and cost (e.g., length and travel time) is less than a cost budget. Different from KCR query, BCIR query considers general cost rather than the spatial distance and aims to maximize the global textual relevance rather than maximizing the number of query keywords on the routes. KCR query is particularly helpful for tourists to search the routes covering many interested objects while BCIR query is useful for tourists and city explorers to search for routes of specific topics. Both KCR query and BCIR query are NP-hard problems without solutions of polynominal time complexity. To solve KCR query, we propose an adaptive route sampling framework under which both static and dynamic route sampling techniques are proposed. Particularly, the dynamic route sampling can obtain routes of high quality by learning knowledge from the routes sampled previously. The proposed framework can flexibly compute query results of different qualities according to the response time limit, thus avoiding the low efficiency of traditional exact solutions and the low quality of approximate solutions. To solve BCIR query, we propose different solutions for different application scenarios. On the one hand, we design an exact solution for the BCIR queries of small cost budgets and propose multiple pruning techniques to reduce the searching space. On the other hand, we propose a time-bounded solution and an error-bounded solution for BCIR queries of large cost budgets. Time-bounded solution initializes a set of candidate routes and keeps refining them until reaching the given response time limit. Error-bounded solution is adapted from the exact solution by relaxing the pruning requirement to improve the pruning efficiency and can greatly reduce the query processing time while guaranteeing the quality of query results. Third, considering that the travel times of routes are uncertain and dynamically changing over time, we propose uncertain road network model which represents the travel time of each road by a dynamic discrete random variable. Then, we propose the Probabilistic Time-constrained Route (PTR) query to retrieve keyword-aware routes over this model such that (i) the returned routes sequentially pass multiple categories of POIs (points of interest, e.g., bank and restaurant) according to the order of given query keywords and (ii) the travel times of routes are small in high confidence. To answer PTR queries efficiently, we propose a two-phase query approach which first generates a small set of candidate routes by employing effective pruning strategies regarding the service time constraints on POIs and the probabilistic/rank requirements of queries. A refinement operation based on Monte Carlo sampling is then conducted over the candidate routes to compute the final results.
Subjects: Hong Kong Polytechnic University -- Dissertations
Location-based services
Pages: xx, 159 pages : illustrations
Appears in Collections:Thesis

Show full item record

Page views

Last Week
Last month
Citations as of Jun 4, 2023

Google ScholarTM


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.