Deep learning has advanced machine capabilities in a variety of fields typically associated with human intelligence, including image recognition, object detection, natural language processing, healthcare, and competitive games. The models behind deep learning, called Deep Neural Networks (DNNs), generally require billions of operations for inference, and a hundred- if not thousand-fold more operations for training. These operations are typically dominated by high-dimensionality Matrix-Vector Multiplies (MVMs). Due to this dominance, a number of hardware accelerators have been developed to enhance the compute efficiency of MVMs, but data movement and memory accesses typically remain key bottlenecks.

In-Memory Computing (IMC) has the potential to overcome these bottlenecks by performing computations in place within dense 2-D memory. However, IMC has challenges of its own, as it fundamentally trades dynamic range for gains in efficiency and throughput. This tradeoff is especially challenging for training, which typically demands higher dynamic range, in the form of higher compute-precision requirements and greater noise sensitivity, than inference.

In this work, we discuss how to enable deep-learning training with IMC by mitigating these challenges while mapping the substantial majority of operations to IMC. Key advancements include: (1) leveraging new high-precision IMC techniques, such as charge-based and analog-input IMC, to reduce noise sources; (2) mapping aggressive quantization techniques, such as radix-4 gradients, to IMC; and (3) adjusting ADC ranges to better capture output distributions and mitigate biases during training. These methods enable DNN training on IMC with over 400× energy savings, while achieving high test accuracy on a variety of DNN models.
Adviser: Naveen Verma
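To make advancement (2) concrete, the following is a minimal illustrative sketch of a radix-4 gradient quantizer, not the thesis implementation: each gradient is rounded to the nearest magnitude of the form `scale * 4**(-k)` (or zero), preserving sign. The function name, the number of levels, and the nearest-level rounding rule are all assumptions for illustration.

```python
import numpy as np

def radix4_quantize(g, num_levels=4):
    """Illustrative radix-4 quantizer (assumed form, not the thesis method).

    Maps each gradient to sign(g) * scale * 4**(-k), k in {0..num_levels-1},
    or to zero, choosing the nearest representable magnitude.
    """
    g = np.asarray(g, dtype=np.float64)
    scale = np.max(np.abs(g))
    if scale == 0:
        return np.zeros_like(g)
    # Candidate magnitudes: scale * 4**(-k), plus an explicit zero level.
    mags = scale * 4.0 ** (-np.arange(num_levels))
    levels = np.concatenate(([0.0], mags))
    # Round each |g| to the nearest representable magnitude.
    idx = np.argmin(np.abs(np.abs(g)[..., None] - levels), axis=-1)
    return np.sign(g) * levels[idx]
```

Because every nonzero level is a power of 4 times a common scale, the multiply in the weight update reduces to a cheap shift-like operation, which is what makes such gradients attractive to map onto IMC hardware.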
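Advancement (3) can likewise be sketched with a generic uniform-ADC model: clip the analog MVM output to a range, quantize within it, and choose that range from the observed output statistics (here, mean ± k·std) so the bulk of the distribution is captured without clipping bias. The coverage parameter and the mean/std heuristic are assumptions for illustration, not the thesis's exact adjustment scheme.

```python
import numpy as np

def adc_quantize(y, vmin, vmax, bits=8):
    """Generic uniform ADC model: clip to [vmin, vmax], then quantize."""
    step = (vmax - vmin) / (2 ** bits - 1)
    code = np.round((np.clip(y, vmin, vmax) - vmin) / step)
    return vmin + code * step

def adjust_adc_range(samples, coverage=3.0):
    """Center the ADC range on the observed output distribution (assumed
    mean +/- coverage * std heuristic) to capture most values."""
    mu, sigma = np.mean(samples), np.std(samples)
    return mu - coverage * sigma, mu + coverage * sigma
```

The point of the adjustment is that a fixed, poorly centered ADC range clips one tail of the MVM-output distribution, introducing a systematic bias that accumulates across training steps; re-centering the range on the observed distribution keeps the quantized mean close to the true mean.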