* All times are AM and in EDT

Start End Monday, March 29, 2021
09:00 09:10
Welcome
09:10 10:00
Speaker: Dan Lustig
(NVIDIA's Architecture Research Group)
10:00 10:21
Session 1: Benchmarking
Session Chair: Matthew D. Sinclair
(University of Wisconsin - Madison, AMD Research)
10:00 10:02
GenomicsBench: A Benchmark Suite for Genomics
Arun Subramaniyan (University of Michigan-Ann Arbor); Yufeng Gu (University of Michigan-Ann Arbor); Timothy Dunn (University of Michigan-Ann Arbor); Somnath Paul (Intel Corporation); Md Vasimuddin (Intel Corporation); Sanchit Misra (Intel Corporation); David Blaauw (University of Michigan-Ann Arbor); Satish Narayanasamy (University of Michigan-Ann Arbor); Reetuparna Das (University of Michigan-Ann Arbor)
10:02 10:04
GNNMark: A Benchmark Suite to Characterize Graph Neural Network Training on GPUs
Trinayan Baruah (Northeastern University); Kaustubh Shivdikar (Northeastern University); Shi Dong (Cerebras); Yifan Sun (William & Mary); Saiful A. Mojumder (Boston University); Kihoon Jung (KAIST); Jose L. Abellan (Universidad Catolica de Murcia); Yash Ukidave (Millennium Management); Ajay Joshi (Boston University); John Kim (KAIST); David Kaeli (Northeastern University)
10:04 10:06
AIBench Training: Balanced Industry-Standard AI Training Benchmarking
Fei Tang (State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences and University of Chinese Academy of Sciences); Wanling Gao (State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences); Jianfeng Zhan (State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences and University of Chinese Academy of Sciences); Chuanxin Lan (State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences); Xu Wen (State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences and University of Chinese Academy of Sciences); Lei Wang (State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences); Chunjie Luo (State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences); Jiahui Dai (Beijing Academy of Frontier Sciences and Technology); Zheng Cao (Alibaba Group); Xingwang Xiong (State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences and University of Chinese Academy of Sciences); Zihan Jiang (Institute of Computing Technology, Chinese Academy of Sciences); Tianshu Hao (State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences and University of Chinese Academy of Sciences); Fanda Fan (State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences and University of Chinese Academy of Sciences); Fan Zhang (State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences); Yunyou Huang (Guangxi Normal University); Jianan Chen (State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences and University of Chinese Academy of Sciences); Mengjia Du (State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences and University of Chinese Academy of Sciences); Rui Ren (China Electronics Technology Research Institute of Cyberspace Security); Chen Zheng (Institute of Software, Chinese Academy of Sciences); Daoyi Zheng (Baidu); Haoning Tang (Tencent); Kunlin Zhan (58.com); Biao Wang (NetEase); Defei Kong (ByteDance); Minghe Yu (Zhihu); Chongkang Tan (Lenovo); Huan Li (Paypal); Xinhui Tian (Moqi); Yatao Li (Microsoft Research Asia China); Gang Lu (Huawei); Junchao Shao (JD.com); Zhenyu Wang (CloudTa); Xiaoyu Wang (Intellifusion); Hainan Ye (Beijing Academy of Frontier Sciences and Technology)
10:06 10:21
Discussion & Questions
10:30 10:51
Session 2: GPUs
Session Chair: Rachata Ausavarungnirun
(TGGS, King Mongkut's University of Technology North Bangkok)
10:30 10:32
CoCoPeLia: Communication-Computation Overlap Prediction for Efficient Linear Algebra on GPUs
Petros Anastasiadis (National Technical University of Athens); Nikela Papadopoulou (National Technical University of Athens); Georgios Goumas (National Technical University of Athens); Nectarios Koziris (National Technical University of Athens)
10:32 10:34
Learning Sparse Matrix Row Permutations for Efficient SpMM on GPU Architectures
Atefeh Mehrabi (Duke University); Donghyuk Lee (NVIDIA); Niladrish Chatterjee (NVIDIA); Daniel J. Sorin (Duke University); Benjamin C. Lee (University of Pennsylvania); Mike O'Connor (NVIDIA / UT-Austin)
10:34 10:36
Analyzing Secure Memory Architecture for GPUs
Shougang Yuan (NC State University); Ardhi Yudha (University of Central Florida); Yan Solihin (University of Central Florida); Huiyang Zhou (NC State University)
10:36 10:51
Discussion & Questions
10:51 11:21
Poster Session A
MicroGrad: A Centralized Framework for Workload Cloning and Stress Testing
Gokul Subramanian Ravi (University of Chicago); Ramon Bertran and Pradip Bose (IBM Research); Mikko Lipasti (UW-Madison)
ViStA: Video Streaming and Analytics Benchmark
Navneet Raju, Hari Om, Rahul M Koushik and Subramaniam Kalambur (PES University, Bengaluru, India)
Analysis of Factors Affecting Power Consumption and Energy Efficiency of SGEMM Workload on Low-Power 28nm Myriad-2 VPU
Suyash Bakshi and Lennart Johnsson (University of Houston)
A Defense-Inspired Benchmark Suite
Pete Ehrett, Nathan Block, Bing Schaefer, Adrian Berding, John Paul Koenig, Pranav Srinivasan, Valeria Bertacco and Todd Austin (University of Michigan)
An Automated Traffic Generation Framework for Performance Evaluation of Networks-on-Chip for Real World Use Cases
Sri Harsha Gade (Arm Ltd., Bangalore); Anup Gangwar (Arm Ltd., Austin); Ambica Prasad, Nitin Kumar Agarwal and Ravishankar Sreedharan (Arm Ltd., Bangalore)
How Do Graph Relabeling Algorithms Improve Memory Locality?
Mohsen Koohi Esfahani, Peter Kilpatrick and Hans Vandierendonck (Queen's University Belfast)
Designing GPU Architecture for Memory Bandwidth Reservation
Emir C Marangoz, Kyoung-Don Kang and Seunghee Shin (The State University of New York at Binghamton)
Reducing BERT Computation by Padding Removal and Curriculum Learning
Wei Zhang, Wei Wei, Wen Wang, Lingling Jin and Zheng Cao (Alibaba Group)
Efficient Split Counter Mode Encryption for NVM
Qi Pei and Seunghee Shin (The State University of New York at Binghamton)
11:21 11:42
Session 3: Characterization
Session Chair: Omer Khan
(University of Connecticut)
11:21 11:23
AI Tax in Mobile SoCs: Quantifying the End-to-End AI Application Performance on Smartphones
Michael Buch (Harvard University); Zahra Azad (Boston University); Ajay Joshi (Boston University); Vijay Janapa Reddi (Harvard/UT Austin/Google)
11:23 11:25
Performance Characterization of .NET Benchmarks
Aniket Deshmukh (The University of Texas at Austin); Ruihao Li (The University of Texas at Austin); Rathijit Sen (Microsoft); Robert R. Henry (Microsoft); Monica Beckwith (Microsoft); Gagan Gupta (Microsoft)
11:25 11:27
Performance Analysis of Graph Neural Network Frameworks
Junwei Wu (University of Science and Technology of China); Jingwei Sun (University of Science and Technology of China); Hao Sun (University of Science and Technology of China); Guangzhong Sun (University of Science and Technology of China)
11:27 11:42
Discussion & Questions
11:50 12:11
Session 4: Software Analysis
Session Chair: Nikos Nikoleris
(Arm Research)
11:50 11:52
Loopapalooza: Investigating Limits of Loop-Level Parallelism with a Compiler-Driven Approach
Ali Zaidi (Arm Inc.); Konstantinos Iordanou (The University of Manchester); Mikel Lujan (The University of Manchester); Giacomo Gabrielli (Arm Inc.)
11:52 11:54
Real-Time Characterization of Data Access Correlations
Bryan Harris (University of Louisville); Michael Marzullo (University of Louisville); Nihat Altiparmak (University of Louisville)
11:54 11:56
Comparative Code Structure Analysis using Deep Learning for Performance Prediction
Nathan Pinnow (Lawrence Livermore National Laboratory); Tarek Ramadan (Texas State University); Tanzima Z. Islam (Texas State University); Chase Phelps (Texas State University); Jayaraman Thiagarajan (Lawrence Livermore National Laboratory)
11:56 12:11
Discussion & Questions
Start End Tuesday, March 30, 2021
09:00 09:50
Speaker: Reetuparna Das
(University of Michigan)
09:50 10:25
Session 5: Best Paper Nominations
Session Chair: Trevor E. Carlson
(National University of Singapore)
09:50 09:52
Understanding Capacity-Driven Scale-Out Neural Recommendation Inference
Michael Lui (Drexel University); Yavuz Yetim (Facebook); Oz Ozkan (Facebook); Zhuoran Zhao (Facebook); Shin-Yeh Tsai (Facebook); Carole-Jean Wu (Facebook); Mark Hempstead (Tufts University)
09:52 09:54
Re-establishing Fetch-Directed Instruction Prefetching: An Industry Perspective
Yasuo Ishii (Arm); Jaekyu Lee (Arm Research); Krishnendra Nathella (Arm Research); Dam Sunwoo (Arm Research)
09:54 09:56
Enabling reproducible and agile full-system simulation
Bobby R. Bruce (University of California, Davis); Ayaz Akram (University of California, Davis); Hoa Nguyen (University of California, Davis); Kyle Roarty (University of Wisconsin-Madison); Mahyar Samani (University of California, Davis); Marjan Fariborz (University of California, Davis); Trivikram Reddy (University of California, Davis); Matthew D. Sinclair (University of Wisconsin-Madison, AMD Research); Jason Lowe-Power (University of California, Davis)
09:56 09:58
A Case Against Hardware Managed DRAM Caches for NVRAM based Systems
Mark Hildebrand (University of California, Davis); Julian T. Angeles (University of California, Davis); Jason Lowe-Power (University of California, Davis); Venkatesh Akella (University of California, Davis)
09:58 10:00
Characterizing Massively Parallel Polymorphism
Mengchi Zhang (Purdue University); Ahmad Alawneh (Purdue University); Timothy G. Rogers (Purdue University)
10:00 10:25
Discussion & Questions
10:25 10:55
Poster Session B
Pinpointing the Memory Behaviors of DNN Training
Jiansong Li (Institute of Computing Technology, Chinese Academy of Sciences); Xiao Dong (Youtu Lab, Tencent); Guangli Li (Institute of Computing Technology, Chinese Academy of Sciences); Peng Zhao (2012 Labs, Huawei Technology Co., Ltd); Xueying Wang (Institute of Computing Technology, Chinese Academy of Sciences); Xianzhi Yu (Noah’s Ark Lab, Huawei Technology Co., Ltd); Wei Cao, Lei Liu, and Xiaobing Feng (Institute of Computing Technology, Chinese Academy of Sciences)
Thermal-Aware Overclocking for Smartphones
Guru Prasad Srinivasa (University at Buffalo); David Werner and Mark Hempstead (Tufts University); Geoffrey Challen (University of Illinois)
The Impact of SoC Integration and OS Deployment on the Reliability of Arm Processors
Pablo Bodmann (UFRGS); George Papadimitriou (University of Athens); Dimitris Gizopoulos (University of Athens); Paolo Rech (Politecnico di Torino)
Memory-Efficient Hardware Performance Counters with Approximate-Counting Algorithms
Jingyi Xu, Sehoon Kim, Borivoje Nikolic, and Yakun Sophia Shao (University of California, Berkeley)
Architecture-Level Energy Estimation for Heterogeneous Computing Systems
Francis Wang, Yannan Nellie Wu, Matthew Woicik, Vivienne Sze and Joel S. Emer (Massachusetts Institute of Technology)
Sparseloop: An Analytical, Energy-Focused Design Space Exploration Methodology for Sparse Tensor Accelerators
Yannan Nellie Wu (MIT); Po-An Tsai and Angshuman Parashar (NVIDIA); Vivienne Sze (MIT); Joel S. Emer (MIT/NVIDIA)
Splash-4: Improving Scalability with Lock-Free Constructs
Eduardo José Gómez Hernández and Ruixiang Shao (University of Murcia); Christos Sakalis and Stefanos Kaxiras (Uppsala University); Alberto Ros (University of Murcia)
Accelerating Fully Homomorphic Encryption Through Microarchitecture-Aware Analysis and Optimization
Wonkyung Jung (Seoul National University); Eojin Lee (Samsung Electronics); Sangpyo Kim, Namhoon Kim and Keewoo Lee (Seoul National University); Chohong Min (Ewha Womans University); Jung Hee Cheon and Jung Ho Ahn (Seoul National University)
Efficient Management of Scratch-Pad Memories in Deep Learning Accelerators
Subhankar Pal (University of Michigan); Swagath Venkataramani, Viji Srinivasan and Kailash Gopalakrishnan (IBM Research)
10:55 11:23
Session 6: Datacenters and HPC
Session Chair: Christina Delimitrou
(Cornell University)
10:55 10:57
Hardware Acceleration for DBMS Machine Learning Scoring: Is It Worth the Overheads?
Zahra Azad (Boston University); Rathijit Sen (Microsoft); Kwanghyun Park (Microsoft); Ajay Joshi (Boston University)
10:57 10:59
TPUPoint: Automatically Characterizing Hardware Accelerated Data Center Machine Learning Program Behavior
Abenezer Wudenhe (University of California, Riverside); Hung-Wei Tseng (University of California, Riverside)
10:59 11:01
Pitfalls of InfiniBand with On-Demand Paging
Takuya Fukuoka (The University of Tokyo); Shigeyuki Sato (The University of Tokyo); Kenjiro Taura (The University of Tokyo)
11:01 11:03
Analyzing the Interplay Between Random Shuffling and Storage Devices for Efficient Machine Learning
Zhi-Lin Ke (National Taiwan University); Hsiang-Yun Cheng (Academia Sinica); Chia-Lin Yang (National Taiwan University); Han-Wei Huang (National Taiwan University)
11:03 11:23
Discussion & Questions
11:30 11:51
Session 7: HW and Co-Design
Session Chair: Arrvindh Shriraman
(Simon Fraser University)
11:30 11:32
E3: A HW/SW Co-design Neuroevolution Platform for Autonomous Learning in Edge Device
Sheng-Chun Kao (Georgia Institute of Technology); Tushar Krishna (Georgia Institute of Technology)
11:32 11:34
FireMarshal: Making HW/SW Co-Design Reproducible and Reliable
Nathan Pemberton (University of California, Berkeley); Alon Amid (University of California, Berkeley)
11:34 11:36
COBRA: A Framework for Evaluating Compositions of Hardware Branch Predictors
Jerry Zhao (University of California, Berkeley); Abraham Gonzalez (University of California, Berkeley); Alon Amid (University of California, Berkeley); Sagar Karandikar (University of California, Berkeley); Krste Asanovic (University of California, Berkeley)
11:36 11:51
Discussion & Questions
11:51 12:01
Closing Remarks



This website is maintained by the ISPASS-2021 Committee.
