Day 1 -- March 29

Workshops and Tutorials
Full-Day Workshop
FastPath 2015 - Fourth International Workshop on Performance Analysis of Workload Optimized Systems
Morning Workshop
WDUD 2015 - Workshop on (Learning from) Discarded and Unsuccessful Data
Morning Tutorial
SIMICS 2015: System-level Program Analysis and Architectural Evaluation with Simics
Afternoon Tutorial
DSM 2015 - Tutorial on Datacenter Simulation Methodologies

Day 2 -- March 30

8:00-8:15 Welcome (General, Program Chairs)
8:15-9:15 Keynote I: James C. Hoe, Hardware Acceleration Comes of Age
9:30-10:50 Session I: Best Paper Candidates, Session chair: Jose Renau, UC Santa Cruz
Critical-Path Candidates: Scalable Performance Modeling for MPI Workloads
J. Chen, R. Clapp
DELPHI: A Framework for RTL-Based Architecture Design Evaluation Using DSENT Models
M. Papamichael, C. Cakir, C. Sun, O. Chen, J. Hoe, K. Mai, L. Peh, V. Stojanovic
Where Does the Time Go? Characterizing Tail Latency in Memcached
G. Blake, A. Saidi
Micro-Architecture Independent Analytical Processor Performance and Power Modeling
S. Van den Steen, S. De Pestel, M. Mechri, S. Eyerman, T. Carlson, L. Eeckhout, E. Hagersten, D. Black-Schaffer
11:15-11:55 Session II: Graphs, Session chair: Aaron Smith, Microsoft
Graph processing platforms at scale: practices and experiences
S. Lim, S. Lee, G. Ganesh, T. Brown, S. Sukumar
Graph-Matching-Based Simulation-Region Selection for Multiple Binaries
C. Yount, H. Patil, M. Islam, A. Srikanth
1:20-2:20 Session III: Sampling, Session chair: Stijn Eyerman, Ghent University
A Modeling Framework for Reuse Distance-based Estimation of Cache Performance
X. Pan, B. Jonsson
Multi-Program Benchmark Definition
A. Jacobvitz, A. Hilton, D. Sorin
Precise Computer Performance Comparisons Via Statistical Resampling Methods
B. Li, S. Chen, L. Peng
2:45-3:45 Session IV: Operating Systems, Session chair: Geoffrey Blake, ARM
PairMiner: mining for paired functions in kernel extensions
H. Liu, B. JiaJu, Y. Wang, S. Hu
Self-monitoring Overhead of the Linux perf_event Performance Counter Interface
V. Weaver
Hierarchical Cycle Accounting: A new method for application performance tuning
A. Nowak, D. Levinthal, W. Zwaenepoel
3:45-4:45 Session V: Insights, Session chair: Andrew Hilton, Duke University
Revisiting Symbiotic Job Scheduling
S. Eyerman, P. Michaud, W. Rogiest
Micro-Architecture Independent Branch Behavior Characterization
S. De Pestel, S. Eyerman, L. Eeckhout
Non-Volatile Memory Host Controller Interfaces Performance Analysis in High-Performance I/O Systems
A. Awad, B. Kettering, Y. Solihin
5:00-7:00 Poster Session and Reception

Day 3 -- March 31

8:00-9:00 Keynote II: Andreas Olofsson, What I Learned Building a Parallel Processor from Scratch
9:10-10:30 Session VI: Synthesizable and GPUs, Session chair: Andrew Hilton, Duke University
Nyami: Synthesizable GPU Architectural Model for General-Purpose and Graphics-Specific Workloads
J. Bush, P. Dexter, T. Miller, A. Carpenter
DRAW: Investigating Benefits of Adaptive Fetch Group Size on GPU
M. Yoon, Y. Oh, S. Lee, S. Kim, D. Kim, W. Ro
DNOC: An Accurate and Fast Virtual Channel and Deflection Routing Network-On-Chip Simulator
G. Oxman, S. Weiss
Performance Evaluation of a DySER FPGA Prototype System Spanning the Compiler, Microarchitecture, and Hardware Implementation
C. Ho, V. Govindaraju, T. Nowatzki, R. Nagaraju, Z. Marzec, P. Agarwal, C. Frericks, R. Cofell, K. Sankaralingam
10:55-11:55 Session VII: Mobile, Session chair: Mark Hempstead, Drexel University
Mosaic: Cross-Platform User-Interaction Record and Replay for the Fragmented Android Ecosystem
M. Halpern, Y. Zhu, R. Peri, V. Reddi
A Study of Mobile Device Utilization
C. Gao, A. Gutierrez, M. Rajan, R. Dreslinski, T. Mudge, C. Wu
Full-System Approach to Analyze the Impact of Next-Generation Mobile Flash Storage
R. de Jong, A. Hansson
1:20-2:40 Session VIII: Emulation/Simulation, Session chair: Benjamin C. Lee, Duke University
QTrace: A Framework for Customizable Full System Instrumentation
X. Tong, A. Moshovos
Pydgin: Generating Fast Instruction Set Simulators from Simple Architecture Descriptions with Meta-Tracing JIT Compilers
D. Lockhart, B. Ilbeyi, C. Batten
Reciprocal Abstraction for Computer Architecture Co-Simulation
M. Moeng, R. Melhem, A. Jones
SynchroTrace: Synchronization-aware Architecture-agnostic Traces for Light-Weight Multicore Simulation
S. Nilakantan, K. Sangaiah, A. More, G. Salvador, B. Taskin, M. Hempstead
3:05-4:45 Session IX: Real Hardware, Session chair: Vince Weaver, University of Maine
Performance and Energy Evaluation of Data Prefetching on Intel Xeon Phi
D. Guttman, M. Kandemir, M. Arunachalam, V. Calina
Emulating Cache Organizations on Real Hardware Using Performance Cloning
Y. Wang, Y. Solihin
Prometheus: Scalable and Accurate Emulation of Task-Based Applications on Many-Core Systems
G. Kestor, R. Gioiosa, D. Chavarria
Analyzing Communication Models for Distributed Thread-Collaborative Processors in Terms of Energy and Time
B. Klenk, L. Oden, H. Froening
Characterization and analysis of a Web Search benchmark
Z. Hadjilambrou, M. Kleanthous, Y. Sazeides
4:45 Concluding Remarks, Best Paper Award