PRELIMINARY PROGRAM

Day 1

April 10

(Sunday)

Tutorial #1: Energy Efficient Data Centers and Systems (full-day: 8:30AM - ) [Lunch on your own]
Organizer: IBM Research, Austin

Tutorial #2: MV5: A Reconfigurable Simulator for Heterogeneous Multicore Architectures (half-day: 1:30PM - ) [Lunch on your own]
Organizer: Jiayuan Meng, Argonne National Lab

Day 2

April 11

(Monday)

8:45 - 9:00 Welcome by General and Program Chairs

9:00 - 10:00 Keynote I: The Era of Heterogeneity: Are we prepared?

10:30 - 12:10 Session 1

12:10 - 1:30 Lunch

1:30 - 3:10 Session 2

3:30 - 4:45 Session 3

5:00 - 6:30 Reception & Poster Session

Day 3

April 12

(Tuesday)

9:00 - 10:00 Keynote II: Integrated Modeling Challenges in Extreme-Scale Computing

10:30 - 12:10 Session 4

12:10 - 1:30 Lunch

1:30 - 3:35 Session 5

3:50 - 5:30 Session 6

5:30 - Concluding Remarks, Best Paper Award

Day 1 - April 10 (Sunday)

Tutorial #1: Energy Efficient Data Centers and Systems (full-day: 8:30AM - )
Organizer: IBM Research, Austin
Tutorial #2: MV5: A Reconfigurable Simulator for Heterogeneous Multicore Architectures (half-day: 1:30PM - )
Organizer: Jiayuan Meng, Argonne National Lab

Day 2 - April 11 (Monday)

8:45 - 9:00 Welcome (by the general and program chairs)

9:00 - 10:00 Keynote I: The Era of Heterogeneity: Are we prepared? Ravi Iyer (Intel)

(Session Chair: Rajeev Balasubramonian, Univ. of Utah)

10:30 - 12:10 Session 1: Best Paper Nominees

(Session Chair: David Christie, AMD)

Characterization and Dynamic Mitigation of Intra-Application Cache Interference
Carole-Jean Wu (Princeton University), Margaret Martonosi (Princeton University)
A Semi-Preemptive Garbage Collector for Solid State Drives [Slides]
Junghee Lee (Georgia Institute of Technology), Youngjae Kim (Oak Ridge National Laboratory), Galen M. Shipman (Oak Ridge National Laboratory), Sarp Oral (Oak Ridge National Laboratory), Jongman Kim (Georgia Institute of Technology), Feiyi Wang (Oak Ridge National Laboratory)
PRISM: Zooming in Persistent RAM Storage Behavior [Slides]
Ju-Young Jung (University of Pittsburgh), Sangyeun Cho (University of Pittsburgh)
Evaluation and Optimization of Multicore Performance Bottlenecks in Supercomputing Applications [Best Paper Award] [Slides]
Jeff Diamond (University of Texas at Austin), Martin Burtscher (Texas State University), John D. McCalpin (Texas Advanced Computing Center, University of Texas at Austin), Byoung-Do Kim (Texas Advanced Computing Center, University of Texas at Austin), Stephen W. Keckler (University of Texas at Austin, NVIDIA Corporation), James C. Browne (University of Texas at Austin)

12:10 - 1:30 Lunch

1:30 - 3:10 Session 2: Memory Hierarchies

(Session Chair: Suzanne Rivoire, Sonoma State Univ.)

Minimizing Interference through Application Mapping in Multi-Level Buffer Caches
Christina M Patrick (Pennsylvania State University), Nicholas Voshell (Pennsylvania State University), Mahmut Kandemir (Pennsylvania State University)
Analyzing the Impact of Useless Write-Backs on the Endurance and Energy Consumption of PCM Main Memory [Slides]
Santiago Bock (University of Pittsburgh), Bruce R. Childers (University of Pittsburgh), Rami G. Melhem (University of Pittsburgh), Daniel Mosse (University of Pittsburgh), Youtao Zhang (University of Pittsburgh)
Memory Access Pattern-Aware DRAM Performance Model for Multi-Core Systems [Slides]
Hyojin Choi (Seoul National University), Jongbok Lee (Hansung University), Wonyong Sung (Seoul National University)
Characterizing Multi-threaded Applications based on Shared-Resource Contention [Slides]
Tanima Dey (University of Virginia), Wei Wang (University of Virginia), Jack Davidson (University of Virginia), Mary Lou Soffa (University of Virginia)

3:30 - 4:45 Session 3: Tracing

(Session Chair: Tom Wenisch, Univ. of Michigan)

Trace-driven Simulation of Multithreaded Applications [Slides]
Alejandro Rico (Barcelona Supercomputing Center), Alejandro Duran (Barcelona Supercomputing Center), Felipe Cabarcas (Barcelona Supercomputing Center), Alex Ramirez (Barcelona Supercomputing Center and Universitat Politecnica de Catalunya), Yoav Etsion (Barcelona Supercomputing Center), Mateo Valero (Barcelona Supercomputing Center and Universitat Politecnica de Catalunya)
Efficient Memory Tracing by Program Skeletonization [Slides]
Alain Ketterlin (Université de Strasbourg (France) & INRIA), Philippe Clauss (Université de Strasbourg (France) & INRIA)
Portable Trace Compression through Instruction Interpretation
Svilen Kanev (Harvard University), Robert Cohn (Intel)

5:00 - 6:30 Reception and Poster Session

Day 3 - April 12 (Tuesday)

9:00 - 10:00 Keynote II: Integrated Modeling Challenges in Extreme-Scale Computing, Pradip Bose (IBM) [Slides]

(Session Chair: David Brooks, Harvard Univ.)

10:30 - 12:10 Session 4: Emerging Workloads

(Session Chair: Derek Chiou, UT Austin)

Where is the Data? Why you Cannot Debate CPU vs. GPU Performance Without the Answer [Slides]
Chris Gregg (University of Virginia), Kim Hazelwood (University of Virginia)
Accelerating Search and Recognition Workloads with SSE 4.2 String and Text Processing Instructions [Slides]
Guangyu Shi (UW-Madison), Min Li (UW-Madison), Mikko Lipasti (UW-Madison)
A Comprehensive Analysis and Parallelization of an Image Retrieval Algorithm [Slides]
Zhenman Fang (Fudan University), Weihua Zhang (Fudan University), Haibo Chen (Fudan University), Binyu Zang (Fudan University)
Performance Evaluation of Adaptivity in Software Transactional Memory [Slides]
Mathias Payer (ETH Zurich), Thomas R. Gross (ETH Zurich)

12:10 - 1:30 Lunch

1:30 - 3:35 Session 5: Simulation and Modeling

(Session Chair: David Murrell, Freescale)

Scalable, accurate NoC simulation for the 1000-core era
Mieszko Lis (MIT), Omer Khan (MIT)
A Single-Specification Principle for Functional-to-Timing Simulator Interface Design
David A. Penry (Brigham Young University)
WiLIS: Architectural Modelling of Wireless Systems
Kermin Fleming (MIT), Man Cheuk Ng (MIT), Sam Gross (MIT), Arvind (MIT)
Detecting Race Conditions in Asynchronous DMA Operations with Full-System Simulation
Michael Kistler (IBM), Daniel Brokenshire (IBM)
Mechanistic-Empirical Processor Performance Modeling for Constructing CPI Stacks on Real Hardware
Stijn Eyerman (Ghent University), Kenneth Hoste (Ghent University), Lieven Eeckhout (Ghent University)

3:50 - 5:30 Session 6: Power and Reliability

(Session Chair: Bronis de Supinski, LLNL)

Power Signature Analysis of the SPECpower_ssj2008 Benchmark
Chunghsing Hsu (ORNL), Stephen W. Poole (ORNL)
Analyzing Throughput of GPGPUs Exploiting Within-Die Core-to-Core Frequency Variation [Slides]
Jung Seob Lee (University of Wisconsin, Madison), Nam Sung Kim (University of Wisconsin, Madison)
Universal Rules Guided Design Parameter Selection for Soft Error Resilient Processors [Slides]
Lide Duan (LSU), Ying Zhang (LSU), Bin Li (LSU), Lu Peng (LSU)
A Dynamic Energy Management in Multi-Tier Data Centers
Seung-Hwan Lim (The Pennsylvania State University), Bikash Sharma (The Pennsylvania State University), Byung Chul Tak (The Pennsylvania State University), Chita R. Das (The Pennsylvania State University)

5:30 - Concluding Remarks, Best Paper Award

Keynote I: The Era of Heterogeneity: Are we prepared?

Bio: Ravi Iyer is a Principal Engineer and Director of the SoC Platform Architecture research group in Intel Labs. His research focus is on future SoC and CMP architectures, with specific emphasis on small cores, accelerators, cache/memory hierarchies, fabrics, QoS, emerging applications and performance evaluation. He has published 120+ papers and has filed 30+ patent applications. He will serve as the General Co-Chair for ISCA 2011 and was the Program Co-Chair for ANCS 2010. He is also an Associate Editor for ACM TACO and was previously an Associate Editor for IEEE TPDS. He has served on program committees for many conferences and workshops. Ravi received his Ph.D. in Computer Science from Texas A&M University.

Abstract: Usage models and applications are rapidly changing as a new class of devices (smart phones, smart TVs, etc.) and rich cloud computing services (on datacenter servers) enter the marketplace. In this talk, I will start by describing some key examples of these radical changes in usage models, applications and devices. I will then highlight why the next decade of computing (clients and servers) will be based on heterogeneous architectures consisting of asymmetric cores, accelerators and hybrid cache/memory structures. The rest of the talk will be an in-depth discussion of the power/performance analysis challenges for heterogeneous architectures, such as: (i) how do we analyze applications to determine the right mix of cores and accelerators? (ii) how do we provide performance/power prediction techniques for efficient OS scheduling on heterogeneous architectures? (iii) how do we enable runtimes and applications to achieve the required QoS on heterogeneous architectures? (iv) how do simulation/emulation methodologies and infrastructure have to change for rapid and consistent heterogeneous architecture exploration? For each of these, I will also give examples of ongoing work and outline potential areas for future work on performance/power analysis for heterogeneous architectures.

Keynote II: Integrated Modeling Challenges in Extreme-Scale Computing

Bio: Pradip Bose is a Research Staff Member and Manager of the Reliability- and Power-Aware Microarchitectures Department at IBM T. J. Watson Research Center. He has been with IBM for over twenty-five years, and has been involved in the definition and pre-silicon modeling of virtually all IBM POWER-series microprocessors. Dr. Bose is a member of the IBM Academy of Technology and is an IBM Master Inventor. He is a Fellow of IEEE.

Abstract: Extreme-scale computer systems of the future target orders of magnitude improvement in performance over current large-scale server or supercomputing systems. These targets must be achieved for the same power consumption and reliability at the system level. Accomplishing the goal requires investment in new-generation integrated pre-silicon modeling environments that allow rapid exploration of power, performance and reliability tradeoffs. In this talk, I present an overview of these modeling challenges and of hierarchical abstraction methods that ease the pre-silicon simulation bottleneck.