Nishtala, R, Almasi, G, and Cascaval, C (2008). Performance without Pain = Productivity, Data layouts and Collectives in UPC In: Principles and Practices of Parallel Programming (PPoPP).
Raicu, I, Zhang, Z, Wilde, M, Foster, I, Beckman, P, Iskra, K, and Clifford, B (2008). Toward Loosely Coupled Programming on Petascale Systems Proceedings of the 20th ACM/IEEE Conference on Supercomputing.
Rosenblum, N, Zhu, X, Miller, B, and Hunt, K (2008). Learning to Analyze Binary Computer Code In: 23rd AAAI Conference on Artificial Intelligence (AAAI 2008).
Tallent, N, Mellor-Crummey, J, Adhianto, L, Fagan, M, and Krentel, M (2008). HPCToolkit: performance tools for scientific computing Proc. of the SciDAC 2008 Conference, J. Phys., 125(012088).
Williams, S, Carter, J, Oliker, L, Shalf, J, and Yelick, K (2008). Lattice Boltzmann Simulation Optimization on Leading Multicore Platforms In: IEEE International Parallel and Distributed Processing Symposium (IPDPS’08).
Williams, S, Patterson, D, Oliker, L, Shalf, J, and Yelick, K (2008). The Roofline Model: A pedagogical tool for auto-tuning kernels on multicore architectures In: HOT Chips, A Symposium on High Performance Chips.
Yoshii, K, Iskra, K, Broekema, P, Naik, H, and Beckman, P (2008). Characterizing the Performance of Big Memory on Blue Gene Linux Proceedings of the 2nd International Workshop on Parallel Programming Models and Systems Software for High-End Computing (P2S2).
Zhang, Z, Espinosa, A, Iskra, K, Raicu, I, Foster, I, and Wilde, M (2008). Design and Evaluation of a Collective I/O Model for Loosely-coupled Petascale Programming Proceedings of the 1st Workshop on Many-Task Computing on Grids and Supercomputers.
Agarwal, S, Barik, R, Bonachea, D, Sarkar, V, Shyamasundar, R, and Yelick, K (2007). Deadlock-Free Scheduling of X10 Computations with Bounded Resources In: Symposium on Parallel Algorithms and Architecture (SPAA), pp. 229–240, San Diego, California, ACM.
Arnold, DC, Ahn, DH, Supinski, BR, Lee, G, Miller, BP, and Schulz, M (2007). Stack Trace Analysis for Large Scale Debugging In: Proceedings of the 21st IEEE International Parallel and Distributed Processing Symposium (IPDPS 07), Long Beach, California, IEEE.
Bordelon, A (2007). Developing a Scalable, Extensible Parallel Performance Analysis Toolkit Master's thesis, Rice University, Department of Computer Science.
Budimlic, Z, Zhang, R, and Scherer, W (2007). Runtime Tuning of STM Validation Techniques In: Workshop on Exploiting Parallelism with Transactional Memory.
Buttari, A, Dongarra, J, Husbands, P, Kurzak, J, and Yelick, K (2007). Multithreading for Synchronization Tolerance in Matrix Factorization In: Proceedings of the SciDAC 2007 Conference, Boston, Massachusetts, Journal of Physics: Conference Series.
Buttari, A, Langou, J, Kurzak, J, and Dongarra, J (2007). A Class of Parallel Tiled Linear Algebra Algorithms for Multicore Architectures Parallel Computing.
Buttari, A, Langou, J, Kurzak, J, and Dongarra, J (2007). Parallel Tiled QR Factorization for Multicore Architectures Concurrency and Computation: Practice and Experience.
Chen, W (2007). Optimizing Partitioned Global Address Space Programs for Cluster Architectures PhD thesis, University of California-Berkeley, Computer Science Division.
Chen, W, Bonachea, D, Iancu, C, and Yelick, K (2007). Automatic Nonblocking Communication for Partitioned Global Address Space Programs In: Proceedings of the International Conference on Supercomputing (ICS), pp. 158–167, Seattle, Washington, ACM.
Coarfa, C, Mellor-Crummey, J, Froyd, N, and Dotsenko, Y (2007). Scalability Analysis of SPMD Codes Using Expectations In: Proceedings of the International Conference on Supercomputing, pp. 13–22, Seattle, Washington, ACM.
Demmel, J, Hoemmen, M, Mohiyuddin, M, and Yelick, K (2007). Avoiding Communication in Computing Krylov Subspaces University of California EECS Department.
Duell, J (2007). Pthreads or Processes: Which is Better for Implementing Global Address Space languages? Master's Thesis, UC Berkeley.
Husbands, P and Yelick, K (2007). Multithreading and One-Sided Communication in Parallel LU Factorization In: Proceedings of Supercomputing (SC07), Reno, Nevada, ACM.
Kamil, A and Yelick, K (2007). Hierarchical Pointer Analysis for Distributed Programs In: Static Analysis Symposium (SAS), pp. 281–297, Kongens Lyngby, Denmark, Springer Berlin / Heidelberg.
Kurzak, J and Dongarra, J (2007). Implementation of Mixed Precision in Solving Systems of Linear Equations on the Cell Processor Concurrency and Computation: Practice and Experience, 19(10):1371–1385.
Kurzak, J, Buttari, A, and Dongarra, J (2007). Solving Systems of Linear Equations on the CELL Processor Using Cholesky Factorization IEEE Transactions on Parallel and Distributed Systems.
CScADS Collaborators include: