Edge-side resource management for data-driven applications

Hu, Chuang

Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/86321

Title:	Edge-side resource management for data-driven applications
Authors:	Hu, Chuang
Degree:	Ph.D.
Issue Date:	2019
Abstract:	Data-driven applications exploit data mining and machine learning technologies to extract the potential value of data. Edge computing has been promoted to meet the increasing performance needs of data-driven applications by using computational and storage resources close to the end devices, at the edge of the current network. To achieve higher performance in this new paradigm, one has to consider how to use resources efficiently across all three layers of the architecture: end devices, edge devices, and the cloud. End devices and edge devices are resource-constrained, whereas the cloud has almost unlimited but distant resources. Providing and managing resources at the edge enables an end device to spare its own resources, speed up computations, and use resources it does not possess. Hence, there is a need for efficient resource management at the edge. In this research, we study resource management at the edge side and make the following original contributions. Firstly, we focus on optimizing communication resource management at the edge side for data-driven applications that need to transfer the data of end devices to the cloud. The emerging smart after-sales maintenance is one such application. Manufacturers and vendors collect data from the products they have sold into the cloud so that they can conduct analysis and improve the operation, maintenance, and servicing of their products. Manufacturers are looking for a self-contained solution for data transmission, since their products are typically deployed in a large number of different buildings, and it is neither feasible to negotiate with each building to use the building's network (e.g., WiFi) nor practical to establish their own network infrastructure. A dedicated channel can be rented from an ISP to act as a thing-to-cloud communication (TCC) link for each end device.
Since the readily available 3G/4G is too costly for most end devices, ISPs are developing new choices. Nevertheless, it can be expected that the choices offered by ISPs will not be fine-grained enough to match the hundreds or thousands of different cost and data-volume requirements of the end devices. To address this issue, in this thesis we propose sTube+ (sharing tube), a communication sharing architecture. sTube+ organizes a larger number of end devices, with heterogeneous data communication and cost requirements, to efficiently share a smaller number of communication resources, i.e., TCC links, and transmit their data to the cloud. We adopt a design of centralized price optimization and distributed network control. More specifically, we design a layered architecture for data delivery, develop algorithms to optimize the overall monetary cost, and prototype a fully functioning sTube+ system. We also develop a case study on smart maintenance of chillers and pumps, using sTube+ as the underlying network architecture. Secondly, we study computational allocation and optimization for DNN-based data-driven applications. DNN inference imposes a heavy computational burden on end devices, but offloading inference tasks to the cloud requires transmitting a large volume of data. Motivated by the fact that the data size of some intermediate DNN layers is significantly smaller than that of the raw input data, we design DNN surgery, which allows a partitioned DNN to be processed at both the edge and the cloud while limiting data transmission. DNN surgery considers the network bandwidth between the end device and the cloud, as well as their processing capabilities, when deciding whether to handle each layer of the DNN on the end device or in the cloud. The challenge is twofold: (1) network dynamics substantially influence the performance of a DNN partition, and (2) state-of-the-art DNNs are characterized by a directed acyclic graph (DAG) rather than a chain, so that partitioning is greatly complicated.
To solve these issues, we design a Dynamic Adaptive DNN Surgery (DADS) scheme, which optimally partitions the DNN under different network conditions. We conduct a comprehensive study of the partition problem under both lightly loaded and heavily loaded conditions. Under the lightly loaded condition, DNN Surgery Light (DSL) is developed, which minimizes the overall delay to process one frame. The minimization problem is equivalent to a min-cut problem, so a globally optimal solution can be derived. Under the heavily loaded condition, DNN Surgery Heavy (DSH) is developed, with the objective of maximizing throughput. However, this problem is NP-hard, so DSH resorts to an approximation method that achieves an approximation ratio of 3. Finally, we propose a Transmission-Analytic Processing Unit (TAPU), a novel accelerator that uses a multi-image FPGA to provide both computation and communication resources for edge devices. A multi-image FPGA can pre-store multiple images in the FPGA flash and switch quickly between them. We can then configure one image for computation and the other images for network functions, and multiplex the accelerator by controlling the switching of the images. The system design of TAPU, from the hardware to the software, is presented. In the hardware design, we discuss the choice of FPGA and abstract a set of transparent APIs for developers. For the software, offline modules are designed to determine which functions are offloaded to the FPGA, and runtime modules are designed to determine how to switch images in response to runtime variations. We choose to accelerate network functions first, and use the residual computation capacity to accelerate computation. We develop two schemes to estimate the residual computation capacity of TAPU, for the non-preemption case and the general case respectively. Then, to fully use the residual computation capacity, we design an inference task offloading algorithm that assigns video analytics tasks to the FPGA.
These algorithms collectively exploit the capacity of the FPGA for DNN inference acceleration to the maximum. In summary, we propose three methods to overcome the communication, computation, and resource challenges in edge/cloud computing for data-driven applications from an edge-side resource management perspective. The proposed methods are applied in important data-driven applications. Prototypes of the systems were developed to evaluate the effectiveness of the solutions. We identify the requirements and address the challenges therein, providing effective frameworks and solutions for practitioners.
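
The abstract states that, under light load, choosing where to partition a DAG-shaped DNN is equivalent to a min-cut problem. The sketch below illustrates that idea only; it is not the thesis's DADS implementation. It assumes a common textbook graph construction (per-layer compute times as terminal arcs, output transfer times as arcs between layers, one auxiliary node per layer so an output is transmitted at most once). All names (`min_cut`, `partition_dag`, `LAYERS`, `EDGES`) and all timings and sizes are hypothetical, and the size of the final result is assumed negligible.

```python
from collections import defaultdict, deque

INF = float("inf")


def min_cut(cap, s, t):
    """Edmonds-Karp max flow. Returns (cut value, nodes on the source side)."""
    res = defaultdict(lambda: defaultdict(float))  # residual capacities
    for u in cap:
        for v, c in cap[u].items():
            res[u][v] += c
            res[v][u] += 0.0  # make sure the reverse arc exists
    flow = 0.0
    while True:
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:  # BFS for a shortest augmenting path
            u = queue.popleft()
            for v, c in res[u].items():
                if c > 1e-12 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return flow, set(parent)  # nodes still reachable form the source side
        bottleneck, v = INF, t
        while parent[v] is not None:
            bottleneck = min(bottleneck, res[parent[v]][v])
            v = parent[v]
        v = t
        while parent[v] is not None:
            res[parent[v]][v] -= bottleneck
            res[v][parent[v]] += bottleneck
            v = parent[v]
        flow += bottleneck


def partition_dag(layers, edges, bandwidth):
    """layers: name -> (edge_time, cloud_time, output_size); edges: (producer, consumer).

    Returns (set of layers kept at the edge, latency of the best partition).
    """
    cap = defaultdict(dict)
    for name, (t_edge, t_cloud, out_size) in layers.items():
        cap["EDGE"][name] = t_cloud   # cut (paid) when the layer runs in the cloud
        cap[name]["CLOUD"] = t_edge   # cut (paid) when the layer runs at the edge
        cap[name][name + "/out"] = out_size / bandwidth  # auxiliary node: send output once
    for u, v in edges:
        cap[u + "/out"][v] = INF  # consumers read u's output via the auxiliary node
        cap[v][u] = INF           # forbid a cloud-side producer feeding an edge-side consumer
    latency, source_side = min_cut(cap, "EDGE", "CLOUD")
    return {n for n in layers if n in source_side}, latency


# Hypothetical DAG with one branch; times in seconds, sizes in Mbit.
LAYERS = {
    "input": (0.0, INF, 8.0),  # pinned to the edge: raw data originates there
    "conv1": (1.0, 0.5, 4.0),
    "branch_a": (1.0, 0.5, 1.0),
    "branch_b": (1.0, 0.5, 1.0),
    "concat": (0.5, 0.25, 0.5),
    "fc": (4.0, 0.5, 0.0),     # assumption: the final result is negligibly small
}
EDGES = [("input", "conv1"), ("conv1", "branch_a"), ("conv1", "branch_b"),
         ("branch_a", "concat"), ("branch_b", "concat"), ("concat", "fc")]
```

Calling `partition_dag(LAYERS, EDGES, bandwidth=1.0)` keeps every layer up to `concat` at the edge, because its small output is cheap to send, whereas `bandwidth=16.0` makes it cheaper to ship the raw input and run everything in the cloud. This mirrors the abstract's point that network dynamics change the best partition.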
Subjects:	Hong Kong Polytechnic University -- Dissertations; Computer networks; Cloud computing; Electronic data processing -- Distributed processing; Internet of things
Pages:	xxii, 126 pages : color illustrations
Appears in Collections:	Thesis

Access

View full-text via https://theses.lib.polyu.edu.hk/handle/200/10310
