2012 IEEE Seventh International Conference on Networking, Architecture, and Storage

Abstract

As organizations begin to use data-intensive cluster computing systems such as Hadoop MapReduce to handle large-scale data, job scheduling becomes critical to achieving efficiency. In the default Hadoop MapReduce implementation, jobs are scheduled in FIFO order, which easily starves small jobs when resources are occupied by large jobs; the Fair Scheduler, in turn, is inefficient when handling large jobs and suffers from the sticky-slots problem. In this paper, we propose a new job scheduling algorithm, TDWS. The algorithm takes into account the characteristics of different applications to meet their different needs. In addition, it is highly robust to heterogeneity and can easily achieve optimal data locality. Our experiments demonstrate the feasibility and efficiency of the solution.
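To make the starvation problem concrete, the following is a minimal, self-contained Java sketch (Java being Hadoop's implementation language). It illustrates only the background problem the abstract describes, not the paper's TDWS algorithm: the job names, task counts, slot count, and the equal-share fair policy are all hypothetical.

import java.util.*;

/**
 * Minimal simulation contrasting FIFO with equal-share ("fair") slot
 * allocation. Illustrative sketch only; not the TDWS algorithm.
 */
public class SchedulerSketch {
    record Job(String name, int tasks) {}

    // FIFO: the head-of-line job takes every free slot until it finishes,
    // so a small job queued behind a large one waits for the whole job.
    static void fifo(Deque<Job> queue, int slots) {
        int time = 0;
        while (!queue.isEmpty()) {
            Job head = queue.poll();
            int remaining = head.tasks();
            while (remaining > 0) {
                remaining -= slots;   // all slots go to the head job
                time++;
            }
            System.out.printf("FIFO: %s done at t=%d%n", head.name(), time);
        }
    }

    // Fair sharing: each waiting job gets an equal share of slots per
    // round, so small jobs finish early even while a large job runs.
    static void fair(List<Job> jobs, int slots) {
        Map<String, Integer> left = new LinkedHashMap<>();
        jobs.forEach(j -> left.put(j.name(), j.tasks()));
        int time = 0;
        while (!left.isEmpty()) {
            int share = Math.max(1, slots / left.size());
            time++;
            for (Iterator<Map.Entry<String, Integer>> it =
                     left.entrySet().iterator(); it.hasNext(); ) {
                var e = it.next();
                e.setValue(e.getValue() - share);
                if (e.getValue() <= 0) {
                    System.out.printf("Fair: %s done at t=%d%n",
                                      e.getKey(), time);
                    it.remove();
                }
            }
        }
    }

    public static void main(String[] args) {
        int slots = 4;  // hypothetical cluster capacity
        List<Job> jobs = List.of(new Job("large", 40), new Job("small", 4));
        fifo(new ArrayDeque<>(jobs), slots);
        fair(jobs, slots);
    }
}

In this run the small job finishes at t=11 under FIFO, because it waits behind the entire large job, but at t=2 under equal sharing. This is the starvation the abstract attributes to the default FIFO scheduler, while the fair policy's own weaknesses (inefficiency on large jobs and sticky slots) are what motivate TDWS.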
