FINAL PROGRAM

Day 1 April 26 (Sunday)	Workshops (UCAS-5, NetEval-2009, VPACT 2009)
	Breakfast (8:00 - 9:00) and lunch (12:00 - 1:30) will be provided
Day 2 April 27 (Monday)	8:00 - 8:45 Breakfast
	8:45 - 9:00 Welcome
	9:00 - 10:00 Keynote I
	10:00 - 10:30 Break
	10:30 - 12:00 Session 1
	12:00 - 1:30 Lunch
	1:30 - 3:00 Session 2
	3:00 - 3:30 Break
	Room A	Room B
	3:30 - 5:00 Session 3	3:30 - 5:00 Session 4
	5:30 - 7:00 Reception
Day 3 April 28 (Tuesday)	8:00 - 9:00 Breakfast
	9:00 - 10:00 Keynote II
	10:00 - 10:30 Break
	10:30 - 12:00 Session 5
	12:00 - 1:30 Lunch
	1:30 - 3:00 Session 6
	3:00 - 3:30 Break
	Room A	Room B
	3:30 - 5:00 Session 7	3:30 - 5:00 Session 8

Day 1 - April 26 (Sunday)

8:00 - 9:00 Breakfast

Morning sessions
- Room1: UCAS-5 (first half)
- Room2: NetEval-2009

12:00 - 1:30 Lunch

Afternoon sessions
- Room1: UCAS-5 (second half)
- Room2: VPACT 2009

Day 2 - April 27 (Monday)

8:00 - 8:45 Breakfast

8:45 - 9:00 Welcome (by the general chairs and program chair)

9:00 - 10:00 Keynote I (Session chair: Lieven Eeckhout, Ghent University)

Title: Performance Analysis in the Real World of On Line Services
Speaker: Dileep Bhandarkar, Microsoft

10:00 - 10:30 Break

10:30 - 12:00 Session 1: Real Hardware Measurements (Session chair: Derek Chiou, University of Texas at Austin)

Differentiating the Roles of IR Measurement and Simulation for Power and Temperature-Aware Design
Wei Huang, University of Virginia
Kevin Skadron, University of Virginia
Sudhanva Gurumurthi, University of Virginia
Robert Ribando, University of Virginia
Mircea R. Stan, University of Virginia

User- and Process-Driven Dynamic Voltage and Frequency Scaling
Bin Lin, Intel
Arindam Mallik, IMEC
Peter Dinda, Northwestern University
Gokhan Memik, Northwestern University
Robert Dick, Northwestern University

Accuracy of performance counter measurements
Dmitrijs Zaparanuks, University of Lugano
Milan Jovic, University of Lugano
Matthias Hauswirth, University of Lugano

12:00 - 1:30 Lunch

1:30 - 3:00 Session 2: Tools (Session chair: Todd Austin, University of Michigan)

GARNET: A Detailed On-Chip Network Model inside a Full-System Simulator
Niket Agarwal, Princeton University
Tushar Krishna, Princeton University
Li-Shiuan Peh, Princeton University
Niraj Jha, Princeton University

Cetra: A trace and analysis framework for the evaluation of Cell BE systems
Julio Merino, Universitat Politècnica de Catalunya
Lluc Alvarez, Universitat Politècnica de Catalunya
Marisa Gil, Universitat Politècnica de Catalunya
Nacho Navarro, Universitat Politècnica de Catalunya

Zesto: A Cycle-Level Simulator for Highly Detailed Microarchitecture Exploration
Gabriel H. Loh, Georgia Tech
Samantika Subramaniam, Georgia Tech
Yuejian Xie, Georgia Tech

3:00 - 3:30 Break

<<Parallel Session >>

3:30 - 5:00 Session 3: Parallelism (Session chair: Gilles Pokam, Intel)

Lonestar: A Suite of Parallel Irregular Programs
Milind Kulkarni, The University of Texas at Austin
Martin Burtscher, The University of Texas at Austin
Calin Cascaval, IBM Research
Keshav Pingali, The University of Texas at Austin

Exploring Speculative Parallelism in SPEC2006
Venkatesan Packirisamy, University of Minnesota
Antonia Zhai, University of Minnesota
Pen-Chung Yew, University of Minnesota

Machine Learning Based Online Performance Prediction for Runtime Parallelization and Task Scheduling
Jiangtian Li, North Carolina State University
Xiaosong Ma, North Carolina State University and Oak Ridge National Laboratory
Karan Singh, Cornell University
Martin Schulz, Lawrence Livermore National Laboratory
Bronis R. de Supinski, Lawrence Livermore National Laboratory
Sally A. McKee, Cornell University

3:30 - 5:00 Session 4: Architecture/OS Effects (Session chair: Harish Patil, Intel)

WARP: Enabling Fast CPU Scheduler Development and Evaluation
Haoqiang Zheng, VMware and Columbia University
Jason Nieh, Columbia University

CMPSched$im: Evaluating OS/CMP Interaction on Shared Cache Management
Jaideep Moses, Intel
Konstantinos Aisopos, Princeton University
Ravi Iyer, Intel
Ramesh Illikkal, Intel
Don Newell, Intel
Aamer Jaleel, Intel
Srihari Makineni, Intel

Understanding the Cost of Thread Migration for Multi-threaded Java Applications Running on a Multi-core Platform
Qi Ming Teng, IBM
Peter F. Sweeney, IBM
Evelyn Duesterwald, IBM

5:30 - 7:00 Reception

Day 3 - April 28 (Tuesday)

8:00 - 9:00 Breakfast

9:00 - 10:00 Keynote II (Session chair: Dean Tullsen, University of California, San Diego)

Title: Accelerating Architecture Research
Speaker: Joel Emer, Intel

10:00 - 10:30 Break

10:30 - 12:00 Session 5: Workload Characterization and Modeling (Session chair: David Brooks, Harvard University)

The Data-centricity of Web 2.0 Workloads and its Impact on Server Performance
Moriyoshi Ohara, IBM Research
Priya Nagpurkar, IBM Research
Yohei Ueda, IBM Research
Kazuaki Ishizaki, IBM Research

Characterizing and Optimizing the Memory Footprint of De Novo Short Read DNA Sequence Assembly
Jeffrey J. Cook, University of Illinois at Urbana-Champaign
Craig Zilles, University of Illinois at Urbana-Champaign

An Analytic Model of Optimistic Software Transactional Memory
Armin Heindl, University of Erlangen-Nuremberg, Germany
Gilles Pokam, Intel Corporation
Ali-Reza Adl-Tabatabai, Intel Corporation

12:00 - 1:30 Lunch

1:30 - 3:00 Session 6: GPU Workloads and Trace Compression (Session chair: Bronis de Supinski, LLNL)

Analyzing CUDA Workloads Using a Detailed GPU Simulator
Ali Bakhoda, University of British Columbia
George Lai Yuan, University of British Columbia
Wilson W. L. Fung, University of British Columbia
Henry Wong, University of British Columbia
Tor M. Aamodt, University of British Columbia

Evaluating GPUs for Network Packet Signature Matching
Randy Smith, University of Wisconsin--Madison
Neelam Goyal, University of Wisconsin--Madison
Justin Ormont, University of Wisconsin--Madison
Karthikeyan Sankaralingam, University of Wisconsin--Madison
Cristian Estan, University of Wisconsin--Madison

Online compression of cache-filtered address traces
Pierre Michaud, INRIA/IRISA

3:00 - 3:30 Break

<<Parallel Session >>

3:30 - 5:00 Session 7: Branch Prediction and Phase Detection (Session chair: Tor Aamodt, University of British Columbia)

Analysis of the TRIPS prototype block predictor
Nitya Ranganathan, University of Texas at Austin
Doug Burger, Microsoft Research
Stephen W. Keckler, University of Texas at Austin

Experiment Flows and Microbenchmarks for Reverse Engineering of Branch Predictor Structures
Vladimir Uzelac, University of Alabama in Huntsville
Aleksandar Milenkovic, University of Alabama in Huntsville

Analyzing the Impact of On-chip Network Traffic on Program Phases for CMPs
Yu Zhang, Northwestern University
Gokhan Memik, Northwestern University
Berkin Ozisikyilmaz, Northwestern University
John Kim, Northwestern University
Alok Choudhary, Northwestern University

3:30 - 5:00 Session 8: Simulation (Session chair: Ken Barr, VMware)

SuiteSpecks and SuiteSpots: A Methodology for the Automatic Conversion of Benchmarking Programs into Intrinsically Checkpointed Assembly Code
Jeff Ringenberg, University of Michigan
Trevor Mudge, University of Michigan

Accurately Approximating Superscalar Processor Performance from Traces
Kiyeon Lee, University of Pittsburgh
Shayne Evans, Lime Brokerage, LLC
Sangyeun Cho, University of Pittsburgh

QUICK: A Flexible Full-System Functional Model
Dam Sunwoo, The University of Texas at Austin
Joonsoo Kim, The University of Texas at Austin
Derek Chiou, The University of Texas at Austin

Keynote I:

Title: Performance Analysis in the Real World of On Line Services

Speaker: Dileep Bhandarkar, Microsoft

Abstract:

Performance analysis has always been an integral part of a computer architect's agenda. However, the term performance is used largely to measure "speed". The dictionary defines performance more broadly as "the manner in which or the efficiency with which something reacts or fulfills its intended purpose". In today's internet-based on line computing environment, performance has taken on a broader meaning. For example, power and energy efficiency are becoming as important as speed of computation, or more so, as measures of system performance. The industry's ability to deliver speed has outpaced the ability of most applications to consume it effectively. This talk will discuss how performance is viewed in the world of on line web services from an end user's point of view.

Bio:

Dr. Dileep Bhandarkar joined Microsoft as a Distinguished Engineer responsible for Server Hardware Architecture and Standards for Global Foundation Services in May 2007. Global Foundation Services delivers the foundational platform for Microsoft’s Online Services, including MSN and Windows Live-branded services, Microsoft communication and collaboration services, as well as 150 additional services and Web portals. He was elected an IEEE Fellow in 1997 for contributions and technical leadership in the design of complex and reduced instruction set architecture and in computer system performance analysis. In 1998, he was recognized as a Distinguished Alumnus of the Indian Institute of Technology, Bombay, where he received his B.Tech. in Electrical Engineering in 1970. He also has an M.S. and a Ph.D. in Electrical Engineering from Carnegie Mellon University, and has done graduate work in Business Administration at the University of Dallas. Prior to joining Microsoft, he was Director of Advanced Architecture in the CTO Office of Intel’s Digital Enterprise Group and a lead spokesperson for evangelizing Intel server platform technologies to the industry and financial analysts. He was an Intel Distinguished Lecturer for several years. He has held several Director-level positions related to CPU and Platform Architecture, and Strategic Planning over a 12-year career at Intel. He was instrumental in driving the strategic decision to implement AMD-compatible 64-bit x86 architecture at Intel, and pioneered the adoption of energy-efficient microprocessor cores across Intel’s product line. Prior to joining Intel in 1995, he spent almost 18 years at Digital Equipment Corporation, where he managed processor and system architecture, and performance analysis work related to the VAX, Prism, MIPS, and Alpha architectures. He also worked at Texas Instruments for 4 years in their research labs in a variety of areas including magnetic bubble memories, charge coupled devices, fault tolerant memories, and computer architecture. Dr. Bhandarkar holds 16 U.S. patents and has published more than 30 technical papers in various journals and conference proceedings. He is also the author of a book titled Alpha Implementations and Architecture. He has delivered several invited keynote speeches at computer and financial industry conferences.

Keynote II:

Title: Accelerating Architecture Research

Speaker: Joel Emer, Intel

Abstract:

With the recent demonstration of 32nm processors we have seen Moore's law providing another large increase in the number of transistors. While more transistors provide architects with a great opportunity, I believe we have been observing increasing challenges in finding the most effective uses for these transistors. Design team size, mask costs and fabrication costs are all increasing, thus there is increasing desire to make the right decisions about which research ideas to bring forward to design. Unfortunately, our existing evaluation methodologies are proving increasingly ineffective at providing compelling evidence that a new idea warrants inclusion in future designs. In this talk, I will elaborate on these challenges and discuss some approaches to improve our ability to prove the merit of architectural ideas. In particular, there is a recent movement toward using field-programmable gate arrays (FPGAs) as the basis for evaluating future systems. Therefore, I will outline the alternative approaches to using FPGAs with an emphasis on using FPGAs to do performance modeling. But designing hardware models is far more complicated than writing software models, so included in the discussion will be techniques to reduce that complexity. These will include a practical approach to modularizing the model, separation of the functional and timing aspects of the simulation, and additional infrastructure important for performance modeling.

Bio:

Dr. Joel S. Emer is an Intel Fellow working in the Digital Enterprise Group, where he is director of micro-architecture research. Before joining Intel he spent 22 years as a Digital/Compaq employee, where he worked on processor architecture, performance analysis and performance modeling methodologies for a number of VAX and Alpha CPUs. He is widely recognized for his architecture contributions to various VAX and Alpha processors, for pioneering efforts in simultaneous multithreading, for his analysis of the architectural impact of soft errors and for his seminal work on the now pervasive quantitative approach to processor evaluation. He has also researched heterogeneous distributed systems and networked file systems at DEC and during a three-year sabbatical at MIT. His current research interests include processor reliability, multithreaded processor organizations, techniques for increased instruction level parallelism, pipeline organization, instruction and data cache organizations, branch prediction schemes, and performance modeling. Dr. Emer holds a Ph.D. in Electrical Engineering from the University of Illinois, and M.S.E.E. and B.S.E.E. degrees from Purdue University. He is also a Fellow of both the ACM and the IEEE.