Dynamic Processing Slots Scheduling for I/O Intensive Jobs of Hadoop MapReduce


Shiori KURAZUMI, Tomoaki TSUMURA, Shoichi SAITO, Hiroshi MATSUO: "Dynamic Processing Slots Scheduling for I/O Intensive Jobs of Hadoop MapReduce", Proc. 3rd Int'l Workshop on Advances in Networking and Computing (WANC'12), pp. 288--292 (Dec. 2012) (preprint)


Hadoop, which consists of Hadoop MapReduce and the Hadoop Distributed File System (HDFS), is a platform for storing and processing large-scale data. Distributed processing has become common as the volume of data has grown rapidly worldwide and the scale of processing has increased. As a result, Hadoop has attracted many cloud computing enterprises and technology enthusiasts, and its user base continues to expand. Our research aims to speed up the execution of Hadoop jobs. In this paper, we propose dynamic processing slots scheduling for I/O-intensive Hadoop MapReduce jobs, focusing on I/O wait during job execution. When CPUs with a high rate of I/O wait are detected on an active TaskTracker node, our method adds free slots on that node and assigns more tasks to them, which improves CPU performance. We implemented our method on Hadoop 1.0.3 and obtained an improvement of up to about 23% in execution time.
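The idea of adding slots when I/O wait is high can be illustrated with a minimal sketch. This is not the implementation from the paper. It assumes a Linux node where cumulative CPU counters are read from the first line of /proc/stat (simplified here to its first five fields). The names `CpuSample`, `iowait_ratio`, and `adjust_slots`, the thresholds, the one-slot step, the slot bounds, and the shrinking of slots when I/O wait drops are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class CpuSample:
    """Cumulative CPU time counters (jiffies), simplified to the first
    five fields of the 'cpu' line in Linux /proc/stat."""
    user: int
    nice: int
    system: int
    idle: int
    iowait: int

    def total(self) -> int:
        return self.user + self.nice + self.system + self.idle + self.iowait


def iowait_ratio(prev: CpuSample, curr: CpuSample) -> float:
    """Fraction of CPU time spent waiting on I/O between two samples."""
    elapsed = curr.total() - prev.total()
    if elapsed <= 0:
        return 0.0
    return (curr.iowait - prev.iowait) / elapsed


def adjust_slots(current_slots: int, ratio: float, *,
                 base_slots: int = 2, max_slots: int = 8,
                 high: float = 0.30, low: float = 0.10) -> int:
    """Return the new slot count for one node.

    All parameter defaults are hypothetical, not values from the paper.
    """
    if ratio >= high:
        # CPU is mostly stalled on I/O: open one more slot, up to the cap.
        return min(current_slots + 1, max_slots)
    if ratio <= low:
        # Assumption: fall back toward the configured base when I/O wait is low.
        return max(current_slots - 1, base_slots)
    return current_slots


# Two samples taken one monitoring interval apart on the same node.
prev = CpuSample(user=100, nice=0, system=50, idle=800, iowait=50)
curr = CpuSample(user=150, nice=0, system=70, idle=900, iowait=380)
new_slots = adjust_slots(2, iowait_ratio(prev, curr))
```

In this sketch, a node whose CPUs spent about two thirds of the interval in I/O wait gains one slot, so another task can use the otherwise idle CPU time. How the actual method chooses thresholds and slot counts is described in the paper, not here.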
