PACT 2023, October 21–25, 2023

Schedule

Saturday, October 21, 2023

Time What
09:00 - 10:30 Workshops and tutorials - first block
10:30 - 11:00 Coffee break
11:00 - 12:30 Workshops and tutorials - second block
12:30 - 13:30 Lunch break
13:30 - 15:00 Workshops and tutorials - third block
15:00 - 15:30 Coffee break
15:30 - 17:30 Workshops and tutorials - last block

Sunday, October 22, 2023

Time What
09:00 - 10:30 Workshops and tutorials - first block
10:30 - 11:00 Coffee break
11:00 - 12:30 Workshops and tutorials - second block
12:30 - 13:30 Lunch break
13:30 - 15:00 Workshops and tutorials - third block
15:00 - 15:30 Coffee break
15:30 - 17:30 Workshops and tutorials - last block

Monday, October 23, 2023

Time What
8:00 Opening
8:30 Keynote: Concurrent Data Sketches, Idit Keidar, Technion - Israel Institute of Technology
9:30 Break
10:00 Session: Compilers
Chair: Albert Cohen
  • 10:00 – 10:30
    CELLO: Compiler-Assisted Efficient Load-Load Ordering in Data-Race-Free Regions
    Sawan Singh, Josue Feliu, Manuel E. Acacio, Alexandra Jimborean, Alberto Ros
  • 10:30 – 11:00
    Automatic Code Generation for High-Performance Graph Algorithms
    Zhen Peng, Rizwan A. Ashraf, Luanzheng Guo, Ruiqin Tian, Gokcen Kestor
  • 11:00 – 11:30
    UWOmp𝑝𝑟𝑜: UWOmp++ with Point-to-Point Synchronization, Reduction and Schedules
    Aditya Agrawal, V. Krishna Nandivada
  • 11:30 – 12:00
    mlirSynth: Automatic, Retargetable Program Raising in Multi-Level IR using Program Synthesis
    Alexander Brauckmann, Elizabeth Polgreen, Tobias Grosser, Michael O'Boyle
12:00 Lunch
13:00 Session: Memory system
Chair: Bernhard Egger
  • 13:00 – 13:30
    Drishyam: An Image is Worth a Data Prefetcher
    Shubdeep Mohapatra, Biswabandan Panda
  • 13:30 – 14:00
    HugeGPT: Storing Guest Page Tables on Host Huge Pages to Accelerate Address Translation
    Weiwei Jia, Jiyuan Zhang, Jianchen Shan, Yiming Du, Xiaoning Ding, Tianyin Xu
  • 14:00 – 14:30
    PreFlush: Lightweight Hardware Prediction Mechanism for Cache Line Flush and Writeback
    Hussein Elnawawy, James Tuck, Gregory Byrd
14:30 Break
15:00 Session: Memory system (cont)
Chair: Tien-Pao Shih
  • 15:00 – 15:30
    SDM: Sharing-enabled Disaggregated Memory System with Cache Coherent Compute Express Link
    Hyokeun Lee, Kwanseok Choi, Hyuk-Jae Lee, Jaewoong Sim
  • 15:30 – 16:00
    SimplePIM: A Software Framework For Productive And Efficient In-Memory Processing
    Jinfan Chen, Juan Gomez Luna, Izzat El Hajj, YuXin Guo, Onur Mutlu
  • 16:00 – 16:30
    Virtual PIM: Resource-aware Dynamic DPU Allocation and Workload Scheduling Framework on Multi-DPU PIM Architecture
    Donghyeon Kim, Taehoon Kim, Inyong Hwang, Taehyeong Park, Hanjun Kim, Youngsok Kim, Yongjun Park
16:30 Break
17:00 Reception and poster session
19:00 Business meeting

Tuesday, October 24, 2023

Time What
8:30 Keynote: Energy-Efficient GPU Architectures for Real-Time Rendering, Antonio González, UPC Barcelona
9:30 Break
10:00 Session: GPUs
Chair: Gregory Byrd
  • 10:00 – 10:30
    Boustrophedonic Frames: Quasi-Optimal L2 Caching for Textures in GPUs
    Diya Joseph, Juan L. Aragón, Joan-Manuel Parcerisa, Antonio Gonzalez
  • 10:30 – 11:00
    G-Sparse: Compiler-Driven Acceleration for Generalized Sparse Computation for Graph Neural Networks on Modern GPUs
    Yue Jin, Heng Zhang, Chengying Huan, Yongchao Liu, Shuaiwen Leon Song, Rui Zhao, Yao Zhang, Charles He, Wenguang Chen
  • 11:00 – 11:30
    TSUNAMI: a GPU implementation of the WFA algorithm
    Giulia Gerometta, Alberto Zeni, Marco D. Santambrogio
  • 11:30 – 12:00
    Parallelizing Maximal Clique Enumeration on GPUs
    Mohammad Almasri, Yen-Hsiang Chang, Izzat El Hajj, Rakesh Nagi, Jinjun Xiong, Wen-mei Hwu
12:00 Lunch
13:00 Session: Algorithms
Chair: Michael Spear
  • 13:00 – 13:30
    Accelerating Decision-Tree-based Inference through Adaptive Parallelization
    Jan van Lunteren
  • 13:30 – 14:00
    Automatic Algorithm-Based Fault Tolerance (AABFT) of Stencil Computations
    Louis Narmour, Steven Derrien, Sanjay Rajopadhye
  • 14:00 – 14:30
    Performance Characterization of Popular DNN Models on Out-of-Order CPUs
    Pablo Prieto, Pablo Abad, Jose Angel Gregorio, Valentin Puente
  • 14:30 – 15:00
    GraphMini: Accelerating Subgraph Enumeration Using Auxiliary Graphs
    Juelin Liu, Sandeep Polisetty, Hui Guan, Marco Serafini
15:00 Break
15:30 Session: Architecture
Chair: Tamara Lehman
  • 15:30 – 16:00
    Barad-dur: Near-Storage Accelerator for Training Large Graph Neural Networks
    Jiyoung An, Sang-Woo Jun
  • 16:00 – 16:30
    A Silicon Photonic Multi-DNN Accelerator
    Yuan Li, Ahmed Louri, Avinash Karanth
  • 16:30 – 17:00
    Architecture-Aware Currying
    Mahmut Taylan Kandemir, Gulsum Gudukbay Akbulut, Wonil Choi, Mustafa Karakoy
  • 17:00 – 17:30
    SpecCheck: A Tool for Systematic Identification of Vulnerable Transient Execution in gem5
    Zack McKevitt, Ashutosh Trivedi, Tamara Silbergleit Lehman
17:30 Break
18:30 Conference dinner at the Motto am Fluss

Wednesday, October 25, 2023

Time What
8:30 Keynote: Optimizing Compilers in an Age of Ubiquitous AI, Albert Cohen, Google DeepMind
9:30 Break
10:00 SRC poster winners presentations
11:00 Session: Optimization
Chair: Riyadh Baghdadi
  • 11:00 – 11:30
    Separating Mechanism from Policy in STM
    Yaodong Sheng, Ahmed Hassan, Michael Spear
  • 11:30 – 12:00
    MBAPIS: Multi-Level Behavior Analysis Guided Program Interval Selection for Microarchitecture Studies
    Hongwei Cui, Yujie Cui, Honglan Zhan, Shuhao Liang, Xianhua Liu, Chun Yang, Xu Cheng
  • 12:00 – 12:30
    INTERPRET: Inter-Warp Register Reuse for GPU Tensor Core
    Jae Seok Kwak, Myung Kuk Yoon, Ipoom Jeong, Seunghyun Jin, Won Woo Ro
12:30 Closing


Keynotes

Monday, Oct 23, 2023: Concurrent Data Sketches, Idit Keidar

Data sketching algorithms have become an indispensable tool for high-speed computations over massive datasets. They maintain a succinct summary of a data stream’s state and answer queries on it using limited memory, at the cost of giving approximations rather than exact answers. For example, a Θ sketch estimates the number of unique items in a data stream, the CountMin sketch approximates the frequencies at which distinct stream elements occur, and a Quantiles sketch estimates the data distribution of a large input stream.

This talk will discuss efficient concurrent (multi-threaded) implementations of such objects. We will first present an efficient generic approach to parallelizing data sketches and allowing them to be queried in real time, while bounding the error that such parallelism introduces. When instantiated with the KMV Θ sketch, this solution achieves high scalability with a small error. Its implementation is now publicly available as part of the popular Apache Data Sketches library.

Second, we will discuss the correctness semantics of such objects. We will define Intermediate Value Linearizability (IVL), a correctness criterion that relaxes linearizability to allow more parallelism, and yet preserves the error bounds of sequential (probabilistic) sketches. To illustrate the power of this result, we will show a straightforward and efficient concurrent implementation of a CountMin sketch, which is IVL (albeit not linearizable).

Finally, we will consider the Quantiles sketch, which does not scale well using the generic concurrent sketches approach. We instead present Quancurrent, a highly scalable quantiles sketch.

Based on joint works with Edward Bortnikov, Shaked Elias-Zada, Eshcar Hillel, Lee Rhodes, Arik Rinberg, Hadar Serviansky, and Alexander Spiegelman.

Bio:

Idit Keidar is a Chaired Professor and the current Dean of the Viterbi Faculty of Electrical and Computer Engineering at the Technion – Israel Institute of Technology. She received her BSc (summa cum laude), MSc (summa cum laude), and PhD from the Hebrew University of Jerusalem in 1992, 1994, and 1998, respectively. Subsequently, she was a Rothschild Postdoctoral Fellow at MIT’s Laboratory for Computer Science. She was a Visiting Professor at Cornell and has consulted for several companies. Prof. Keidar has also served as the program chair for leading conferences (PODC, DISC, PPoPP, and SYSTOR). In her free time, she enjoys writing prose.

Tuesday, Oct 24, 2023: Energy-Efficient GPU Architectures for Real-Time Rendering, Antonio González

Mobile devices such as smartphones and tablets have become the most commonly used computing devices nowadays, and projections forecast significant future growth in both the number of shipped units and their capabilities. These devices have become quite powerful, and most applications that run on them make intensive use of graphics animation to provide a rich user experience. Energy consumption and the related heat dissipation issues are the main constraints on the capabilities provided by such systems. These systems are equipped with a very small battery that is expected to last for hours if not days, and since they are normally handheld, their external temperature cannot significantly exceed typical human levels. To provide richer user experiences on these devices, dramatic improvements in energy efficiency are required. This talk focuses on one of the main components of these systems, the GPU, and describes some novel microarchitectures that we have recently developed for increasing its performance and energy efficiency.

Bio:

Antonio González is a Full Professor at the Computer Architecture Department of the Universitat Politècnica de Catalunya, Barcelona (Spain), and the director of the Architecture and Compilers research group. He was the founding director of the Intel Barcelona Research Center from 2002 to 2014.

His research has focused on computer architecture and compilers, with special emphasis on cognitive computing systems and graphics processors in recent years. Antonio holds 53 patents, has published over 400 research papers and has given over 130 invited talks. He has a long track record of innovations through technology transfer of his research results to commercial products, especially microprocessors and computing systems in general.

Antonio has served as associate editor for five IEEE and ACM journals, program chair for ISCA, MICRO, HPCA, ICS and ISPASS, general chair for MICRO and HPCA, and member of the program committee for more than 140 symposia.

He is a recipient of multiple awards including the Rosina Ribalta award as the advisor of the best PhD project in Information Technology and Communications, the Duran Farell award for research in technology, the Aritmel National Award of Informatics to the Computer Engineer of the Year, the King Jaime I award in the area of New Technologies, and the ICREA Academia Award. He has been inducted into the “IEEE/ACM MICRO Hall of Fame”, the “IEEE HPCA Hall of Fame” and the “ACM/IEEE ISCA Hall of Fame”. Antonio is a Fellow of IEEE and ACM.

Wednesday, Oct 25, 2023: Optimizing Compilers in an Age of Ubiquitous AI, Albert Cohen

Compilers and AI have been working hand in hand for some time… almost “forever” actually. Yet, despite decades of mutually profitable inspiration and progress, the art of compiler construction has not seen changes comparable to the accelerating history of ML for the last 10 years. And yet again, we may actually be standing at the doorstep of such a radical shift in the design of compilers. I am not the only one to notice this of course: ML-enhanced compilers are becoming the norm rather than the exception, while high-performance ML is made possible by advances in domain-specific compilers. I would like to draw the PACT community’s attention to challenges that may require more radical changes to compiler construction, pertaining to correctness, performance and agility. For example, today’s highest performance libraries and heroic accelerator programming are only made possible at the expense of a dramatic loss of programmability. Are we ever going to find a way out of this portability/performance dilemma? What about the agility of compiler engineers? Can we build a software infrastructure scalable enough to compile billions of lines of code while leveraging advanced ML-based heuristics? Can we do so while enabling massive code reuse across domains, languages and hardware? We will shed some light on these questions, based on recent successes and half-successes in academia and industry. We will also extend an invitation to tackle these challenges in future research and software development.

Bio:

Albert Cohen is a research scientist at Google. An alumnus of École Normale Supérieure de Lyon and the University of Versailles, he has been a research scientist at Inria, a visiting scholar at the University of Illinois, an invited professor at Philips Research, and a visiting scientist at Facebook Artificial Intelligence Research. Albert works on parallelizing, optimizing and machine learning compilers, and on dataflow and synchronous programming languages, with applications to high-performance computing, artificial intelligence and reactive control.


Accepted papers

This is the preliminary list of papers and posters accepted at PACT 2023.

  • A Silicon Photonic Multi-DNN Accelerator, Yuan Li, George Washington University, Ahmed Louri, George Washington University, Avinash Karanth, Ohio University

  • Accelerating Decision-Tree-based Inference through Adaptive Parallelization, Jan van Lunteren, IBM Research

  • Architecture-Aware Currying, Mahmut Taylan Kandemir, The Pennsylvania State University, Gulsum Gudukbay Akbulut, The Pennsylvania State University, Wonil Choi, Hanyang University, Mustafa Karakoy, TUBITAK-BILGEM

  • Automatic Algorithm-Based Fault Tolerance (AABFT) of Stencil Computations, Louis Narmour, University of Rennes 1, CNRS, IRISA, Colorado State University, Steven Derrien, University of Rennes 1, CNRS, IRISA, Sanjay Rajopadhye, Colorado State University

  • Barad-dur: Near-Storage Accelerator for Training Large Graph Neural Networks, Jiyoung An, University of California Irvine, Sang-Woo Jun, University of California Irvine

  • Boustrophedonic Frames: Quasi-Optimal L2 Caching for Textures in GPUs, Diya Joseph, Polytechnic University of Catalonia, Juan L. Aragón, University of Murcia, Joan-Manuel Parcerisa, Polytechnic University of Catalonia, Antonio Gonzalez, Polytechnic University of Catalonia

  • CELLO: Compiler-Assisted Efficient Load-Load Ordering in Data-Race-Free Regions, Sawan Singh, University of Murcia, Josue Feliu, University of Murcia, Manuel E. Acacio, University of Murcia, Alexandra Jimborean, University of Murcia, Alberto Ros, University of Murcia

  • Drishyam: An Image is Worth a Data Prefetcher, Shubdeep Mohapatra, Student at BITS Pilani, Biswabandan Panda, IIT Bombay

  • G-Sparse: Compiler-Driven Acceleration for Generalized Sparse Computation for Graph Neural Networks on Modern GPUs, Yue Jin, Ant Group, Heng Zhang, Institute of Software, Chinese Academy of Sciences, Chengying Huan, Institute of Software, Chinese Academy of Sciences, Yongchao Liu, Ant Group, Shuaiwen Leon Song, Microsoft/University of Sydney, Rui Zhao, Ant Group, Yao Zhang, Microsoft, Charles He, Dipeak Ltd, Wenguang Chen, Ant Group

  • Automatic Code Generation for High-Performance Graph Algorithms, Zhen Peng, Pacific Northwest National Laboratory, Rizwan A. Ashraf, Pacific Northwest National Laboratory, Luanzheng Guo, Pacific Northwest National Laboratory, Ruiqin Tian, Horizon Robotics, Gokcen Kestor, Pacific Northwest National Laboratory

  • HugeGPT: Storing Guest Page Tables on Host Huge Pages to Accelerate Address Translation, Weiwei Jia, The University of Rhode Island, Jiyuan Zhang, University of Illinois Urbana-Champaign, Jianchen Shan, Hofstra University, Yiming Du, The University of Rhode Island, Xiaoning Ding, New Jersey Institute of Technology, Tianyin Xu, University of Illinois at Urbana-Champaign

  • INTERPRET: Inter-Warp Register Reuse for GPU Tensor Core, Jae Seok Kwak, Yonsei University, Myung Kuk Yoon, Ewha Womans University, Ipoom Jeong, University of Illinois Urbana-Champaign, Seunghyun Jin, Yonsei University, Won Woo Ro, Yonsei University

  • MBAPIS: Multi-Level Behavior Analysis Guided Program Interval Selection for Microarchitecture Studies, Hongwei Cui, School of Computer Science, Peking University, Yujie Cui, School of Computer Science, Peking University, Honglan Zhan, School of Computer Science, Peking University, Shuhao Liang, School of Computer Science, Peking University, Xianhua Liu, School of Computer Science, Peking University, Chun Yang, School of Computer Science, Peking University, Xu Cheng, School of Computer Science, Peking University

  • GraphMini: Accelerating Subgraph Enumeration Using Auxiliary Graphs, Juelin Liu, University of Massachusetts Amherst, Sandeep Polisetty, University of Massachusetts Amherst, Hui Guan, University of Massachusetts Amherst, Marco Serafini, University of Massachusetts Amherst

  • Parallelizing Maximal Clique Enumeration on GPUs, Mohammad Almasri, University of Illinois at Urbana-Champaign, Yen-Hsiang Chang, University of Illinois at Urbana-Champaign, Izzat El Hajj, American University of Beirut, Rakesh Nagi, University of Illinois at Urbana-Champaign, Jinjun Xiong, University at Buffalo, Wen-mei Hwu, NVIDIA, University of Illinois at Urbana-Champaign

  • Performance Characterization of Popular DNN Models on Out-of-Order CPUs, Pablo Prieto, Universidad de Cantabria, Pablo Abad, Universidad de Cantabria, Jose Angel Gregorio, Universidad de Cantabria, Valentin Puente, Universidad de Cantabria

  • PreFlush: Lightweight Hardware Prediction Mechanism for Cache Line Flush and Writeback, Hussein Elnawawy, North Carolina State University, James Tuck, North Carolina State University, Gregory Byrd, North Carolina State University

  • SDM: Sharing-enabled Disaggregated Memory System with Cache Coherent Compute Express Link, Hyokeun Lee, North Carolina State University, Kwanseok Choi, Seoul National University, Hyuk-Jae Lee, Seoul National University, Jaewoong Sim, Seoul National University

  • Separating Mechanism from Policy in STM, Yaodong Sheng, Lehigh University, Ahmed Hassan, Lehigh University, Michael Spear, Lehigh University

  • SimplePIM: A Software Framework For Productive And Efficient In-Memory Processing, Jinfan Chen, ETH Zurich, Juan Gomez Luna, ETH Zurich, Izzat El Hajj, American University of Beirut, YuXin Guo, ETH Zurich, Onur Mutlu, ETH Zurich

  • SpecCheck: A Tool for Systematic Identification of Vulnerable Transient Execution in gem5, Zack McKevitt, University of Colorado Boulder, Ashutosh Trivedi, University of Colorado Boulder, Tamara Silbergleit Lehman, University of Colorado Boulder

  • TSUNAMI: a GPU implementation of the WFA algorithm, Giulia Gerometta, Politecnico di Milano, Alberto Zeni, Politecnico di Milano, Marco D. Santambrogio, Politecnico di Milano, Italy

  • UWOmp𝑝𝑟𝑜: UWOmp++ with Point-to-Point Synchronization, Reduction and Schedules, Aditya Agrawal, IIT Madras, V. Krishna Nandivada, IIT Madras

  • Virtual PIM: Resource-aware Dynamic DPU Allocation and Workload Scheduling Framework on Multi-DPU PIM Architecture, Donghyeon Kim, Hanyang University, Taehoon Kim, Hanyang University, Inyong Hwang, Yonsei University, Taehyeong Park, Yonsei University, Hanjun Kim, Yonsei University, Youngsok Kim, Yonsei University, Yongjun Park, Yonsei University

  • mlirSynth: Automatic, Retargetable Program Raising in Multi-Level IR using Program Synthesis, Alexander Brauckmann, University of Edinburgh, Elizabeth Polgreen, University of Edinburgh, Tobias Grosser, University of Edinburgh, Michael O’Boyle, University of Edinburgh

Accepted posters

  • A CPU-FPGA Holistic Source-To-Source Compilation Approach for Partitioning and Optimizing C/C++ Applications, Tiago Santos, Faculty of Engineering, University of Porto, João M. P. Cardoso, Faculty of Engineering, University of Porto, João Bispo, Faculty of Engineering, University of Porto

  • Dynamic Allocation of Processor Cores to Graph Applications on Commodity Servers, Lucia Pons, Universitat Politècnica de València, Julio Sahuquillo, Universitat Politècnica de València, Timothy M. Jones, University of Cambridge

  • Breaking the Complexity Barrier: Enhancing Quality of Service in Simultaneous Multithreading Processors, Gürhan Küçük, Yeditepe University, Onur Demir, Yeditepe University, Sercan Sari, Yeditepe University, Yiğit Bilgin, Yeditepe University, Uğur Nezir, Yeditepe University, Mehmet Erdem Çakır, Yeditepe University

  • QeiHaN: An Energy-Efficient DNN Accelerator that Leverages Logarithmic Quantization in Near-Data Processing Architectures, Bahareh Khabbazan, Polytechnic University of Catalonia, Barcelona Tech (UPC), Marc Riera Villanueva, Polytechnic University of Catalonia, Antonio Gonzalez, Polytechnic University of Catalonia

  • Quickloop: An efficient, FPGA-accelerated exploration of parameterized DNN accelerators, Tayyeb Mahmood, Incheon National University, Kashif Inayat, Barcelona Supercomputing Center, Jaeyong Chung, Incheon National University

  • Retargeting Applications for Heterogeneous Systems with the Tribble Source-to-Source Framework, Luís Miguel Sousa, Faculty of Engineering, University of Porto / INESC TEC, Nuno Paulino, Faculty of Engineering, University of Porto / INESC TEC, João Bispo, Faculty of Engineering, University of Porto / INESC TEC

  • SLIDEX: Sliding Window Extension for Image Processing, Raúl Taranco, Polytechnic University of Catalonia, Jose Maria Arnau, Polytechnic University of Catalonia, Antonio Gonzalez, Polytechnic University of Catalonia

  • Thread-to-Core Allocation in ARM Processors Building Synergistic Pairs, Marta Navarro, Universitat Politècnica de València, Josué Feliu Pérez, Universitat Politècnica de València, Salvador Petit, Universitat Politècnica de València, Maria E. Gómez, Universitat Politècnica de València, Victor Lixin, HiSilicon, Julio Sahuquillo, Universitat Politècnica de València

  • SparseFT: Sparsity-aware Fault Tolerance for Reliable CNN Inference on GPUs, Gwangeun Byeon, Sungkyunkwan University, Seungtae Lee, Sungkyunkwan University, Seongwook Kim, Sungkyunkwan University, Yongjun Kim, Sungkyunkwan University, Prashant J. Nair, University of British Columbia, Seokin Hong, Sungkyunkwan University

Important Dates and Deadlines

Registration:

  • Early registration deadline: Sep 3, 2023

Conference Papers:

  • Abstract submission deadline: Mar 25, 2023
  • Paper submission deadline: Apr 1, 2023
    Extended to April 15, 2023
  • Round 1 rebuttal period: Jun 12-15, 2023
  • Round 2 rebuttal period: Jul 10-13, 2023
  • Author notification: Aug 1, 2023
  • Artifact submission: Aug 22, 2023
  • Camera ready papers: Sep 15, 2023

Workshops and Tutorials:

  • Workshop submission deadline: July 3, 2023
  • Tutorial submission deadline: August 21, 2023 August 14, 2023

Student Research Competition:

  • Abstract submission deadline: August 17, 2023
    Extended to August 21, 2023
  • Author notification: September 1, 2023
  • Poster session: October 23, 2023
  • Finalist presentations: October 25, 2023

Artifact Evaluation:

  • Artifact submission deadline: August 22, 2023
  • Author notification: September 13, 2023

Conference: October 21–25, 2023

