The unabated pursuit of omniscient and omnipotent AI is levying hefty latency, memory, and energy taxes at all computing scales. My research builds a heterogeneous set of self-adaptive solutions, co-optimized across the algorithm, architecture, and silicon stack, to generate breakthrough advances in arithmetic performance, compute density, and efficiency for on-chip machine learning, and natural language processing (NLP) in particular. On the algorithm front, I will discuss best-paper-award-winning work on a novel floating-point-based data type, AdaptivFloat, which enables resilient quantized AI computations and is particularly suitable for NLP networks with wide parameter distributions. Then, I will describe a 16nm chip prototype that adopts AdaptivFloat to accelerate noise-robust AI speech and machine translation tasks, and whose fidelity to the front-end application is verified via a formal hardware/software compiler interface. Towards the goal of lowering the prohibitive energy cost of large-language-model inference on TinyML devices, I will describe a principled algorithm-hardware co-design solution, validated in a 12nm chip tapeout, that accelerates Transformer workloads by tailoring the accelerator's latency and energy expenditures to the complexity of the input query it processes. Finally, I will conclude with some of my current research efforts to push the on-chip energy-efficiency frontier further by leveraging specialized, non-conventional dynamic memory structures for ML training, recently prototyped in a 16nm chip tapeout.
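To make the AdaptivFloat idea concrete, the sketch below illustrates the general principle of an adaptive floating-point encoding: the exponent range of a narrow float format is anchored, per tensor, to the largest magnitude in that tensor, so the few available exponent values cover exactly the dynamic range the weights occupy. This is a simplified illustration, not the published AdaptivFloat algorithm; the bit layout (1 sign bit, `n_exp` exponent bits, remaining mantissa bits), the function name, and the rounding details are all assumptions for the sake of the example.

```python
import numpy as np

def adaptivfloat_quantize(x, n_bits=8, n_exp=4):
    """Illustrative per-tensor adaptive float quantizer (not the exact
    published AdaptivFloat scheme). Assumed layout: 1 sign bit,
    n_exp exponent bits, and n_bits - 1 - n_exp mantissa bits."""
    x = np.asarray(x, dtype=np.float64)
    n_man = n_bits - 1 - n_exp

    # Anchor the representable exponent window to the tensor's
    # largest magnitude, instead of using a fixed IEEE-style bias.
    exp_max = int(np.floor(np.log2(np.abs(x).max())))
    exp_min = exp_max - (2**n_exp - 1)

    sign = np.sign(x)
    mag = np.abs(x)

    # Clamp each value's exponent into the representable window.
    safe = np.maximum(mag, 2.0**exp_min)
    e = np.clip(np.floor(np.log2(safe)), exp_min, exp_max)
    scale = 2.0**e

    # Round the normalized mantissa (implicit leading 1) to n_man bits.
    man = np.round(mag / scale * 2**n_man) / 2**n_man
    q = sign * man * scale

    # Flush magnitudes below the smallest representable value to zero.
    q[mag < 2.0**exp_min] = 0.0
    return q
```

Because the exponent window tracks each tensor's own distribution, large-magnitude outliers (common in NLP weight matrices) stay representable without sacrificing resolution for the bulk of the values, which is the intuition behind the format's resilience at low bit widths.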
Thierry Tambe is a final-year Electrical Engineering PhD candidate at Harvard University, advised by Prof. Gu-Yeon Wei and Prof. David Brooks. His current research focuses on designing energy-efficient, high-performance algorithms, hardware accelerators, and systems for machine learning, and natural language processing in particular. He also has a keen interest in agile SoC design methodologies. Prior to beginning his doctoral studies, Thierry was an engineer at Intel in Hillsboro, Oregon, USA, where he designed various mixed-signal architectures for high-bandwidth memory and peripheral interfaces on Xeon and Xeon Phi HPC SoCs. He received a B.S. (2010) and an M.Eng. (2012) in Electrical Engineering from Texas A&M University. Thierry Tambe is a recipient of the Best Paper Award at the 2020 ACM/IEEE Design Automation Conference, a 2021 NVIDIA Graduate PhD Fellowship, and a 2022 IEEE SSCS Predoctoral Achievement Award.