Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/61154
Title: Joint scheduling of MapReduce jobs with servers : performance bounds and experiments
Authors: Ling, X
Yuan, Y 
Wang, D 
Liu, J
Yang, J
Keywords: Fast heuristic
MapReduce
NP-complete
Scheduling
Server assignment
Issue Date: 2016
Publisher: Elsevier Inc. for Academic Press
Source: Journal of parallel and distributed computing, 2016, v. 90-91, p. 52-66
Journal: Journal of parallel and distributed computing 
Abstract: MapReduce-like frameworks have achieved tremendous success in large-scale data processing in data centers. A key feature distinguishing MapReduce from previous parallel models is that it interleaves parallel and sequential computation. Past schemes for general parallel models, and especially their theoretical bounds, are therefore unlikely to apply to MapReduce directly. Many recent studies address MapReduce job and task scheduling, but they assume that the servers are assigned in advance. In current data centers, multiple MapReduce jobs of different importance levels run together. In this paper, we investigate a scheduling problem for MapReduce that also takes server assignment into consideration. We formulate a MapReduce server-job organizer problem (MSJO) and show that it is NP-complete. We develop a 3-approximation algorithm and a fast heuristic. We further propose a novel fine-grained practical algorithm for the general MapReduce-like task scheduling problem. Finally, we evaluate our algorithms through both simulations and experiments on Amazon EC2 using a Hadoop implementation. The results confirm the superiority of our algorithms.
URI: http://hdl.handle.net/10397/61154
ISSN: 0743-7315
DOI: 10.1016/j.jpdc.2016.02.002
Rights: © 2016 The Authors. Published by Elsevier Inc. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
The following publication Ling, X., Yuan, Y., Wang, D., Liu, J., & Yang, J. (2016). Joint scheduling of mapreduce jobs with servers: Performance bounds and experiments. Journal of Parallel and Distributed Computing, 90, 52-66 is available at https://doi.org/10.1016/j.jpdc.2016.02.002
Appears in Collections: Journal/Magazine Article

Files in This Item:
File: Ling_Joint_scheduling_MapReduce.pdf
Size: 1.19 MB
Format: Adobe PDF

Scopus citations: 6 (last week: 0), as of Feb 13, 2019
Web of Science citations: 6 (last week: 0), as of Feb 17, 2019
Page views: 63 (last week: 2), as of Feb 19, 2019
Downloads: 3, as of Feb 19, 2019

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.