Bibliography

References to publications, sorted by year and author.

Nishtala, R, Almasi, G, and Cascaval, C (2008).
Performance without Pain = Productivity, Data layouts and Collectives in UPC
In: Principles and Practices of Parallel Programming (PPoPP).

Raicu, I, Zhang, Z, Wilde, M, Foster, I, Beckman, P, Iskra, K, and Clifford, B (2008).
Toward Loosely Coupled Programming on Petascale Systems
Proceedings of the 20th ACM/IEEE Conference on Supercomputing.

Rosenblum, N, Zhu, X, Miller, B, and Hunt, K (2008).
Learning to Analyze Binary Computer Code
In: 23rd AAAI Conference on Artificial Intelligence (AAAI 2008).

Tallent, N, Mellor-Crummey, J, Adhianto, L, Fagan, M, and Krentel, M (2008).
HPCToolkit: performance tools for scientific computing
Proc. of the SciDAC 2008 Conference, J. Phys., 125(012088).

Williams, S, Carter, J, Oliker, L, Shalf, J, and Yelick, K (2008).
Lattice Boltzmann Simulation Optimization on Leading Multicore Platforms
In: IEEE International Parallel and Distributed Processing Symposium (IPDPS’08).

Williams, S, Patterson, D, Oliker, L, Shalf, J, and Yelick, K (2008).
The Roofline Model: A pedagogical tool for auto-tuning kernels on multicore architectures
In: HOT Chips, A Symposium on High Performance Chips.

Yoshii, K, Iskra, K, Broekema, P, Naik, H, and Beckman, P (2008).
Characterizing the Performance of Big Memory on Blue Gene Linux
Proceedings of the 2nd International Workshop on Parallel Programming Models and Systems Software for High-End Computing (P2S2).

Zhang, Z, Espinosa, A, Iskra, K, Raicu, I, Foster, I, and Wilde, M (2008).
Design and Evaluation of a Collective I/O Model for Loosely-coupled Petascale Programming
Proceedings of the 1st Workshop on Many-Task Computing on Grids and Supercomputers.

Agarwal, S, Barik, R, Bonachea, D, Sarkar, V, Shyamasundar, R, and Yelick, K (2007).
Deadlock-Free Scheduling of X10 Computations with Bounded Resources
In: Symposium on Parallel Algorithms and Architecture (SPAA), pp. 229–240, San Diego, California, ACM.

Arnold, DC, Ahn, DH, Supinski, BR, Lee, G, Miller, BP, and Schulz, M (2007).
Stack Trace Analysis for Large Scale Debugging
In: Proceedings of the 21st IEEE International Parallel and Distributed Processing Symposium (IPDPS 07), Long Beach, California, IEEE.

Bordelon, A (2007).
Developing a Scalable, Extensible Parallel Performance Analysis Toolkit
Master's thesis, Rice University, Department of Computer Science.

Budimlic, Z, Zhang, R, and Scherer, W (2007).
Runtime Tuning of STM Validation Techniques
In: Workshop on Exploiting Parallelism with Transactional Memory.

Buttari, A, Dongarra, J, Husbands, P, Kurzak, J, and Yelick, K (2007).
Multithreading for Synchronization Tolerance in Matrix Factorization
In: Proceedings of the SciDAC 2007 Conference, Boston, Massachusetts, Journal of Physics: Conference Series.

Buttari, A, Langou, J, Kurzak, J, and Dongarra, J (2007).
A Class of Parallel Tiled Linear Algebra Algorithms for Multicore Architectures
Parallel Computing.

Buttari, A, Langou, J, Kurzak, J, and Dongarra, J (2007).
Parallel Tiled QR Factorization for Multicore Architectures
Concurrency and Computation: Practice and Experience.

Chen, W (2007).
Optimizing Partitioned Global Address Space Programs for Cluster Architectures
PhD thesis, University of California-Berkeley, Computer Science Division.

Chen, W, Bonachea, D, Iancu, C, and Yelick, K (2007).
Automatic Nonblocking Communication for Partitioned Global Address Space Programs
In: Proceedings of the International Conference on Supercomputing (ICS), pp. 158–167, Seattle, Washington, ACM.

Coarfa, C, Mellor-Crummey, J, Froyd, N, and Dotsenko, Y (2007).
Scalability Analysis of SPMD Codes Using Expectations
In: Proceedings of the International Conference on Supercomputing, pp. 13–22, Seattle, Washington, ACM.

Demmel, J, Hoemmen, M, Mohiyuddin, M, and Yelick, K (2007).
Avoiding Communication in Computing Krylov Subspaces
University of California EECS Department.

Duell, J (2007).
Pthreads or Processes: Which is Better for Implementing Global Address Space languages?
Master's Thesis, UC Berkeley.

Husbands, P and Yelick, K (2007).
Multithreading and One-Sided Communication in Parallel LU Factorization
In: Proceedings of Supercomputing (SC07), Reno, Nevada, ACM.

Kamil, A and Yelick, K (2007).
Hierarchical Pointer Analysis for Distributed Programs
In: Static Analysis Symposium (SAS), pp. 281–297, Kongens Lyngby, Denmark, Springer Berlin / Heidelberg.

Kurzak, J and Dongarra, J (2007).
Implementation of Mixed Precision in Solving Systems of Linear Equations on the Cell Processor
Concurrency and Computation: Practice and Experience, 19(10):1371–1385.

Kurzak, J, Buttari, A, and Dongarra, J (2007).
Solving Systems of Linear Equations on the CELL Processor Using Cholesky Factorization
IEEE Transactions on Parallel and Distributed Systems.

Center for Scalable Application Development Software
