Hardware-aware Training for In-memory Computing Systems

Date
Dec 17, 2024, 10:00 am - 11:30 am
Location
EQUAD B205

Event Description

In-memory Computing (IMC) is an emerging approach to addressing the compute and data-movement costs inherent in high-dimensional matrix-vector multiplies (MVMs), a dominant operation in modern AI workloads. Despite its promise, IMC fundamentally faces two critical challenges: 1) low-level noise due to circuit non-idealities; and 2) ADC quantization at the compute output, both of which limit its applicability to complex AI tasks. In this talk, I will discuss approaches to address these challenges, enabling efficient AI on state-of-the-art IMC systems.
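As background for these two challenges, the sketch below models a single IMC matrix-vector multiply with additive analog noise and uniform ADC quantization at the output. It is not the speaker's model; the function name `imc_mvm` and the `noise_std` and `adc_bits` defaults are illustrative assumptions only.

```python
import numpy as np

def imc_mvm(weights, activations, noise_std=0.02, adc_bits=8):
    """Illustrative model of an IMC matrix-vector multiply.

    Analog circuit non-idealities are modeled as additive Gaussian noise on the
    ideal dot products; the ADC is modeled as uniform quantization of the
    column outputs to 2**adc_bits levels.
    """
    # Ideal MVM result (what a digital engine would compute).
    ideal = weights @ activations

    # 1) Circuit non-idealities: additive noise scaled to the output range.
    scale = np.abs(ideal).max() + 1e-12
    noisy = ideal + np.random.normal(0.0, noise_std * scale, size=ideal.shape)

    # 2) ADC quantization of the analog compute output.
    levels = 2 ** adc_bits
    step = 2 * scale / (levels - 1)
    quantized = np.clip(np.round(noisy / step), -(levels // 2), levels // 2 - 1) * step
    return quantized

# Example: a 256-dimensional MVM, roughly the size of a single IMC tile.
W = np.random.randn(128, 256) * 0.1
x = np.random.randn(256)
y_hw = imc_mvm(W, x)
y_ideal = W @ x
print("relative error:", np.linalg.norm(y_hw - y_ideal) / np.linalg.norm(y_ideal))
```

Both effects perturb the MVM outputs that downstream layers consume, which is why unmodified networks lose accuracy when deployed on IMC hardware.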

We first develop S-DDHR, a noise-aware training approach that addresses process variations, the dominant source of analog noise in advanced-node silicon technologies. This method significantly improves the accuracy of neural-network inference running on IMC systems built with emerging technologies. To tackle broader and more diverse analog-noise challenges, we propose a holistic framework comprising an enhanced training algorithm and an MVM-level modeling approach abstracted from circuit-level noise. This framework generalizes across different noise levels and types, and has demonstrated consistently high accuracy on practical IMC systems. To address ADC quantization effects, we introduce RAOQ, a novel method that reforms the statistics of activations and weights within neural networks, combined with optimization techniques that adapt the network to the presence of such quantization. RAOQ restores accuracy that would otherwise be degraded by quantization to baseline levels across multiple bit precisions, datasets, and tasks. The training overhead is minimized via our improved LoRA technique.
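To make the training side concrete, the following is a minimal sketch of a generic hardware-aware layer in the spirit of the methods above: weight noise is injected during the forward pass, and the layer output passes through a fake ADC quantizer with a straight-through estimator so gradients still flow. It is not an implementation of S-DDHR or RAOQ; the class name `NoisyQuantLinear` and the default `noise_std` and `adc_bits` values are assumptions for illustration.

```python
import torch
import torch.nn as nn

class NoisyQuantLinear(nn.Module):
    """Linear layer with injected weight noise and fake ADC quantization."""

    def __init__(self, in_features, out_features, noise_std=0.02, adc_bits=8):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.02)
        self.bias = nn.Parameter(torch.zeros(out_features))
        self.noise_std = noise_std
        self.adc_bits = adc_bits

    def forward(self, x):
        # Inject per-element weight noise to mimic analog non-idealities.
        w = self.weight
        if self.training:
            w = w + torch.randn_like(w) * self.noise_std * w.abs().mean()

        y = nn.functional.linear(x, w, self.bias)

        # Fake ADC quantization with a straight-through estimator:
        # quantize in the forward pass, identity gradient in the backward pass.
        levels = 2 ** self.adc_bits
        scale = y.detach().abs().max().clamp(min=1e-6)
        step = 2 * scale / (levels - 1)
        y_q = torch.round(y / step).clamp(-(levels // 2), levels // 2 - 1) * step
        return y + (y_q - y).detach()

# Drop-in usage inside a small model trained with a standard loop.
model = nn.Sequential(NoisyQuantLinear(256, 128), nn.ReLU(), NoisyQuantLinear(128, 10))
logits = model(torch.randn(4, 256))
```

Training with such a layer exposes the network to hardware-like perturbations, so the learned weights remain accurate when the real noise and quantization appear at inference time.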

These advancements collectively demonstrate the potential of IMC systems to support efficient, high-performance AI workloads under real-world constraints.

Adviser: Naveen Verma