A. Nataraj, A.D. Malony, A. Morris, D.C. Arnold, and B.P. Miller (2008)
In Search of Sweet-Spots in Parallel Performance Monitoring
IEEE International Conference on Cluster Computing (Cluster 2008).
Parallel performance monitoring extends parallel measurement systems with infrastructure and interfaces for online performance data access, communication, and analysis. At the same time, it raises concerns about the impact of monitor overhead on application execution. The application monitoring scheme, parameterized by the performance events to monitor, the access frequency, and the data analysis operation, defines a set of monitoring requirements. The monitoring infrastructure presents its own choices, particularly the amount and configuration of resources devoted explicitly to monitoring. The key to scalable, low-overhead parallel performance monitoring, then, is to match the application monitoring demands to the effective operating range of the monitoring system (or vice versa). A poor match can result in over-provisioning (wasted resources) or in under-provisioning (lack of scalability, high overheads, and poor quality of performance data). We present a methodology and evaluation framework to determine the sweet-spots for performance monitoring using TAU and MRNet.