Start | End |
Monday, March 29, 2021 |
09:00 |
09:10 |
Welcome
|
09:10 |
10:00 |
Speaker: Dan Lustig
(NVIDIA's Architecture Research Group)
|
10:00 |
10:21 |
Session 1: Benchmarking
Session Chair: Matthew D. Sinclair
(University of Wisconsin-Madison, AMD Research)
|
10:00 |
10:02 |
GenomicsBench: A Benchmark Suite for Genomics
Arun Subramaniyan (University of Michigan-Ann Arbor);
Yufeng Gu (University of Michigan-Ann Arbor);
Timothy Dunn (University of Michigan-Ann Arbor);
Somnath Paul (Intel Corporation);
Md Vasimuddin (Intel Corporation);
Sanchit Misra (Intel Corporation);
David Blaauw (University of Michigan-Ann Arbor);
Satish Narayanasamy (University of Michigan-Ann Arbor);
Reetuparna Das (University of Michigan-Ann Arbor)
|
10:02 |
10:04 |
GNNMark: A Benchmark Suite to Characterize Graph Neural Network Training on GPUs
Trinayan Baruah (Northeastern University);
Kaustubh Shivdikar (Northeastern University);
Shi Dong (Cerebras);
Yifan Sun (William & Mary);
Saiful A. Mojumder (Boston University);
Kihoon Jung (KAIST);
Jose L. Abellan (Universidad Catolica de Murcia);
Yash Ukidave (Millennium Management);
Ajay Joshi (Boston University);
John Kim (KAIST);
David Kaeli (Northeastern University)
|
10:04 |
10:06 |
AIBench Training: Balanced Industry-Standard AI Training Benchmarking
Fei Tang (State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences and University of Chinese Academy of Sciences);
Wanling Gao (State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences);
Jianfeng Zhan (State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences and University of Chinese Academy of Sciences);
Chuanxin Lan (State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences);
Xu Wen (State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences and University of Chinese Academy of Sciences);
Lei Wang (State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences);
Chunjie Luo (State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences);
Jiahui Dai (Beijing Academy of Frontier Sciences and Technology);
Zheng Cao (Alibaba Group);
Xingwang Xiong (State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences and University of Chinese Academy of Sciences);
Zihan Jiang (Institute of Computing Technology, Chinese Academy of Sciences);
Tianshu Hao (State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences and University of Chinese Academy of Sciences);
Fanda Fan (State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences and University of Chinese Academy of Sciences);
Fan Zhang (State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences);
Yunyou Huang (Guangxi Normal University);
Jianan Chen (State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences and University of Chinese Academy of Sciences);
Mengjia Du (State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences and University of Chinese Academy of Sciences);
Rui Ren (China Electronics Technology Research Institute of Cyberspace Security);
Chen Zheng (Institute of Software, Chinese Academy of Sciences);
Daoyi Zheng (Baidu);
Haoning Tang (Tencent);
Kunlin Zhan (58.com);
Biao Wang (NetEase);
Defei Kong (ByteDance);
Minghe Yu (Zhihu);
Chongkang Tan (Lenovo);
Huan Li (PayPal);
Xinhui Tian (Moqi);
Yatao Li (Microsoft Research Asia China);
Gang Lu (Huawei);
Junchao Shao (JD.com);
Zhenyu Wang (CloudTa);
Xiaoyu Wang (Intellifusion);
Hainan Ye (Beijing Academy of Frontier Sciences and Technology)
|
10:06 |
10:21 |
Discussion & Questions
|
10:30 |
10:51 |
Session 2: GPUs
Session Chair: Rachata Ausavarungnirun
(TGGS, King Mongkut's University of Technology North Bangkok)
|
10:30 |
10:32 |
CoCoPeLia: Communication-Computation Overlap Prediction for Efficient Linear Algebra on GPUs
Petros Anastasiadis (National Technical University of Athens);
Nikela Papadopoulou (National Technical University of Athens);
Georgios Goumas (National Technical University of Athens);
Nectarios Koziris (National Technical University of Athens)
|
10:32 |
10:34 |
Learning Sparse Matrix Row Permutations for Efficient SpMM on GPU Architectures
Atefeh Mehrabi (Duke University);
Donghyuk Lee (NVIDIA);
Niladrish Chatterjee (NVIDIA);
Daniel J. Sorin (Duke University);
Benjamin C. Lee (University of Pennsylvania);
Mike O'Connor (NVIDIA / UT-Austin)
|
10:34 |
10:36 |
Analyzing Secure Memory Architecture for GPUs
Shougang Yuan (NC State University);
Ardhi Yudha (University of Central Florida);
Yan Solihin (University of Central Florida);
Huiyang Zhou (NC State University)
|
10:36 |
10:51 |
Discussion & Questions
|
10:51 |
11:21 |
Poster Session A
|
|
|
MicroGrad: A Centralized Framework for Workload Cloning and Stress Testing
Gokul Subramanian Ravi (University of Chicago); Ramon Bertran and Pradip Bose (IBM Research); Mikko Lipasti (UW-Madison)
|
|
|
ViStA: Video Streaming and Analytics Benchmark
Navneet Raju, Hari Om, Rahul M Koushik and Subramaniam Kalambur (PES University, Bengaluru, India)
|
|
|
Analysis of Factors Affecting Power Consumption and Energy Efficiency of SGEMM Workload on Low-Power 28nm Myriad-2 VPU
Suyash Bakshi and Lennart Johnsson (University of Houston)
|
|
|
A Defense-Inspired Benchmark Suite
Pete Ehrett, Nathan Block, Bing Schaefer, Adrian Berding, John Paul Koenig, Pranav Srinivasan, Valeria Bertacco and Todd Austin (University of Michigan)
|
|
|
An Automated Traffic Generation Framework for Performance Evaluation of Networks-on-Chip for Real World Use Cases
Sri Harsha Gade (Arm Ltd., Bangalore); Anup Gangwar (Arm Ltd., Austin); Ambica Prasad, Nitin Kumar Agarwal and Ravishankar Sreedharan (Arm Ltd., Bangalore)
|
|
|
How Do Graph Relabeling Algorithms Improve Memory Locality?
Mohsen Koohi Esfahani, Peter Kilpatrick and Hans Vandierendonck (Queen's University Belfast)
|
|
|
Designing GPU Architecture for Memory Bandwidth Reservation
Emir C Marangoz, Kyoung-Don Kang and Seunghee Shin (The State University of New York at Binghamton)
|
|
|
Reducing BERT Computation by Padding Removal and Curriculum Learning
Wei Zhang, Wei Wei, Wen Wang, Lingling Jin and Zheng Cao (Alibaba Group)
|
|
|
Efficient Split Counter Mode Encryption for NVM
Qi Pei and Seunghee Shin (The State University of New York at Binghamton)
|
11:21 |
11:42 |
Session 3: Characterization
Session Chair: Omer Khan
(University of Connecticut)
|
11:21 |
11:23 |
AI Tax in Mobile SoCs: Quantifying the End-to-End AI Application Performance on Smartphones
Michael Buch (Harvard University);
Zahra Azad (Boston University);
Ajay Joshi (Boston University);
Vijay Janapa Reddi (Harvard/UT Austin/Google)
|
11:23 |
11:25 |
Performance Characterization of .NET Benchmarks
Aniket Deshmukh (The University of Texas at Austin);
Ruihao Li (The University of Texas at Austin);
Rathijit Sen (Microsoft);
Robert R. Henry (Microsoft);
Monica Beckwith (Microsoft);
Gagan Gupta (Microsoft)
|
11:25 |
11:27 |
Performance Analysis of Graph Neural Network Frameworks
Junwei Wu (University of Science and Technology of China);
Jingwei Sun (University of Science and Technology of China);
Hao Sun (University of Science and Technology of China);
Guangzhong Sun (University of Science and Technology of China)
|
11:27 |
11:42 |
Discussion & Questions
|
11:50 |
12:11 |
Session 4: Software Analysis
Session Chair: Nikos Nikoleris
(Arm Research)
|
11:50 |
11:52 |
Loopapalooza: Investigating Limits of Loop-Level Parallelism with a Compiler-Driven Approach
Ali Zaidi (Arm Inc.);
Konstantinos Iordanou (The University of Manchester);
Mikel Lujan (The University of Manchester);
Giacomo Gabrielli (Arm Inc.)
|
11:52 |
11:54 |
Real-Time Characterization of Data Access Correlations
Bryan Harris (University of Louisville);
Michael Marzullo (University of Louisville);
Nihat Altiparmak (University of Louisville)
|
11:54 |
11:56 |
Comparative Code Structure Analysis using Deep Learning for Performance Prediction
Nathan Pinnow (Lawrence Livermore National Laboratory);
Tarek Ramadan (Texas State University);
Tanzima Z. Islam (Texas State University);
Chase Phelps (Texas State University);
Jayaraman Thiagarajan (Lawrence Livermore National Laboratory)
|
11:56 |
12:11 |
Discussion & Questions
|
Start | End |
Tuesday, March 30, 2021 |
09:00 |
09:50 |
Speaker: Reetuparna Das
(University of Michigan)
|
09:50 |
10:25 |
Session 5: Best Paper Nominations
Session Chair: Trevor E. Carlson
(National University of Singapore)
|
09:50 |
09:52 |
Understanding Capacity-Driven Scale-Out Neural Recommendation Inference
Michael Lui (Drexel University);
Yavuz Yetim (Facebook);
Oz Ozkan (Facebook);
Zhuoran Zhao (Facebook);
Shin-Yeh Tsai (Facebook);
Carole-Jean Wu (Facebook);
Mark Hempstead (Tufts University)
|
09:52 |
09:54 |
Re-establishing Fetch-Directed Instruction Prefetching: An Industry Perspective
Yasuo Ishii (Arm);
Jaekyu Lee (Arm Research);
Krishnendra Nathella (Arm Research);
Dam Sunwoo (Arm Research)
|
09:54 |
09:56 |
Enabling Reproducible and Agile Full-System Simulation
Bobby R. Bruce (University of California, Davis);
Ayaz Akram (University of California, Davis);
Hoa Nguyen (University of California, Davis);
Kyle Roarty (University of Wisconsin-Madison);
Mahyar Samani (University of California, Davis);
Marjan Fariborz (University of California, Davis);
Trivikram Reddy (University of California, Davis);
Matthew D. Sinclair (University of Wisconsin-Madison, AMD Research);
Jason Lowe-Power (University of California, Davis)
|
09:56 |
09:58 |
A Case Against Hardware Managed DRAM Caches for NVRAM based Systems
Mark Hildebrand (University of California, Davis);
Julian T. Angeles (University of California, Davis);
Jason Lowe-Power (University of California, Davis);
Venkatesh Akella (University of California, Davis)
|
09:58 |
10:00 |
Characterizing Massively Parallel Polymorphism
Mengchi Zhang (Purdue University);
Ahmad Alawneh (Purdue University);
Timothy G. Rogers (Purdue University)
|
10:00 |
10:25 |
Discussion & Questions
|
10:25 |
10:55 |
Poster Session B
|
|
|
Pinpointing the Memory Behaviors of DNN Training
Jiansong Li (Institute of Computing Technology, Chinese Academy of Sciences); Xiao Dong (Youtu Lab, Tencent); Guangli Li (Institute of Computing Technology, Chinese Academy of Sciences); Peng Zhao (2012 Labs, Huawei Technology Co., Ltd); Xueying Wang (Institute of Computing Technology, Chinese Academy of Sciences); Xianzhi Yu (Noah’s Ark Lab, Huawei Technology Co., Ltd); Wei Cao, Lei Liu, and Xiaobing Feng (Institute of Computing Technology, Chinese Academy of Sciences)
|
|
|
Thermal-Aware Overclocking for Smartphones
Guru Prasad Srinivasa (University at Buffalo); David Werner and Mark Hempstead (Tufts University); Geoffrey Challen (University of Illinois)
|
|
|
The Impact of SoC Integration and OS Deployment on the Reliability of Arm Processors
Pablo Bodmann (UFRGS); George Papadimitriou (University of Athens); Dimitris Gizopoulos (University of Athens); Paolo Rech (Politecnico di Torino)
|
|
|
Memory-Efficient Hardware Performance Counters with Approximate-Counting Algorithms
Jingyi Xu, Sehoon Kim, Borivoje Nikolic, and Yakun Sophia Shao (University of California, Berkeley)
|
|
|
Architecture-Level Energy Estimation for Heterogeneous Computing Systems
Francis Wang, Yannan Nellie Wu, Matthew Woicik, Vivienne Sze and Joel S. Emer (Massachusetts Institute of Technology)
|
|
|
Sparseloop: An Analytical, Energy-Focused Design Space Exploration Methodology for Sparse Tensor Accelerators
Yannan Nellie Wu (MIT); Po-An Tsai and Angshuman Parashar (NVIDIA); Vivienne Sze (MIT); Joel S. Emer (MIT/NVIDIA)
|
|
|
Splash-4: Improving Scalability with Lock-Free Constructs
Eduardo José Gómez Hernández and Ruixiang Shao (University of Murcia); Christos Sakalis and Stefanos Kaxiras (Uppsala University); Alberto Ros (University of Murcia)
|
|
|
Accelerating Fully Homomorphic Encryption Through Microarchitecture-Aware Analysis and Optimization
Wonkyung Jung (Seoul National University); Eojin Lee (Samsung Electronics); Sangpyo Kim, Namhoon Kim and Keewoo Lee (Seoul National University); Chohong Min (Ewha Womans University); Jung Hee Cheon and Jung Ho Ahn (Seoul National University)
|
|
|
Efficient Management of Scratch-Pad Memories in Deep Learning Accelerators
Subhankar Pal (University of Michigan); Swagath Venkataramani, Viji Srinivasan and Kailash Gopalakrishnan (IBM Research)
|
10:55 |
11:23 |
Session 6: Datacenters and HPC
Session Chair: Christina Delimitrou
(Cornell University)
|
10:55 |
10:57 |
Hardware Acceleration for DBMS Machine Learning Scoring: Is It Worth the Overheads?
Zahra Azad (Boston University);
Rathijit Sen (Microsoft);
Kwanghyun Park (Microsoft);
Ajay Joshi (Boston University)
|
10:57 |
10:59 |
TPUPoint: Automatically Characterizing Hardware Accelerated Data Center Machine Learning Program Behavior
Abenezer Wudenhe (University of California, Riverside);
Hung-Wei Tseng (University of California, Riverside)
|
10:59 |
11:01 |
Pitfalls of InfiniBand with On-Demand Paging
Takuya Fukuoka (The University of Tokyo);
Shigeyuki Sato (The University of Tokyo);
Kenjiro Taura (The University of Tokyo)
|
11:01 |
11:03 |
Analyzing the Interplay Between Random Shuffling and Storage Devices for Efficient Machine Learning
Zhi-Lin Ke (National Taiwan University);
Hsiang-Yun Cheng (Academia Sinica);
Chia-Lin Yang (National Taiwan University);
Han-Wei Huang (National Taiwan University)
|
11:03 |
11:23 |
Discussion & Questions
|
11:30 |
11:51 |
Session 7: HW and Co-Design
Session Chair: Arrvindh Shriraman
(Simon Fraser University)
|
11:30 |
11:32 |
E3: A HW/SW Co-design Neuroevolution Platform for Autonomous Learning in Edge Device
Sheng-Chun Kao (Georgia Institute of Technology);
Tushar Krishna (Georgia Institute of Technology)
|
11:32 |
11:34 |
FireMarshal: Making HW/SW Co-Design Reproducible and Reliable
Nathan Pemberton (University of California, Berkeley);
Alon Amid (University of California, Berkeley)
|
11:34 |
11:36 |
COBRA: A Framework for Evaluating Compositions of Hardware Branch Predictors
Jerry Zhao (University of California, Berkeley);
Abraham Gonzalez (University of California, Berkeley);
Alon Amid (University of California, Berkeley);
Sagar Karandikar (University of California, Berkeley);
Krste Asanovic (University of California, Berkeley)
|
11:36 |
11:51 |
Discussion & Questions
|
11:51 |
12:01 |
Closing Remarks
|
This website is maintained by the ISPASS-2021 Committee.