The Impact of Error in User-Provided Bandwidth Estimates on Multi-Site Parallel Job Scheduling Performance

W.M. Jones (USA)


parallel job scheduling, user estimates, clusters, multi-site


Multi-cluster schedulers can dramatically improve average job turnaround time by making use of fragmented node resources available throughout the grid. By carefully mapping jobs across potentially many clusters, jobs that would otherwise wait in the queue for local cluster resources can begin execution much earlier, thereby improving system utilization and reducing average queue waiting time. Recent research in this area leverages user-provided estimates of job communication characteristics to effectively partition the job across cluster boundaries. In this paper, we address the impact of inaccuracies in these estimates on overall system performance. Furthermore, we demonstrate that multi-site job scheduling techniques benefit from these estimates, even in the presence of considerable inaccuracy.
