P. Beckman, K. Iskra, K. Yoshii, S. Coghlan, and A. Nataraj (2008)
Benchmarking the Effects of Operating System Interference on Extreme-Scale Parallel Machines
Cluster Computing, 11(1):3-16.
We investigate operating system noise, which we identify as one of the main reasons for a lack of synchronicity in parallel applications. Using a microbenchmark, we measure the noise on several contemporary platforms and find that, even with a general-purpose operating system, noise can be limited if certain precautions are taken. We then inject artificially generated noise into a massively parallel system and measure its influence on the performance of collective operations. Our experiments indicate that on extreme-scale platforms, the performance is correlated with the largest interruption to the application, even if the probability of such an interruption on a single process is extremely small. We demonstrate that synchronizing the noise can significantly reduce its negative influence.
by Jennifer Harris — last modified 2009-04-20 10:08