Details
The stagnant growth of general-purpose processors' performance has led to the rise of hardware acceleration, while the high non-recurring engineering costs of application-specific hardware accelerators have increased the popularity of field-programmable gate arrays (FPGAs). The growing diversity in FPGA applications motivates the design of domain-optimized FPGAs and their integration with other processing units such as processors. These two research topics constitute the focus of this dissertation.
Conventionally, FPGAs are built with customized electronic design automation (EDA) tools and large collections of custom-layout circuits. It is therefore costly and time-consuming to evaluate different FPGA architectures with transistor-level fidelity, let alone to prototype novel FPGAs in silicon or bring them to production quickly. To address this issue, this thesis investigates synthesizable FPGAs that can be designed with off-the-shelf EDA tools. This thesis first presents the Princeton Reconfigurable Gate Array (PRGA), an open-source FPGA research and prototyping framework. Given a user specification, PRGA generates the synthesizable Verilog description of a custom FPGA and all the scripts needed to configure several other open-source FPGA tools into a complete Verilog-to-bitstream toolchain for that FPGA. Leveraging PRGA, this thesis then proposes an algorithm for designing intrinsically cycle-free FPGAs, enabling automated optimization and accurate characterization using off-the-shelf EDA tools. Cycle-free FPGAs offer comparable routability and performance to conventional synthesizable FPGAs while consuming less area.
Complementing the optimization of FPGA architectures, this thesis proposes a cache-coherent, manycore-FPGA system named Duet. Unlike commercial CPU-FPGA systems-on-chip (SoCs), in which a few processors play a supporting role for a single, monolithic embedded FPGA (eFPGA), Duet integrates multiple, possibly heterogeneous eFPGAs with a manycore processor. This enables two paradigms of acceleration: fine-grained acceleration, which partitions an application into small tasks and offloads the compute-intensive ones onto eFPGA-emulated accelerators, leaving the less accelerable tasks to the processors; and hardware augmentation, which employs eFPGA-emulated hardware widgets to improve processor efficiency in certain execution models.
The synthesizable FPGA design methodology and the Duet system are evaluated with two SoC prototypes, CIFER and DECADES, both fabricated in a GlobalFoundries 12 nm FinFET technology. This thesis details the design and characterization of the eFPGAs and CPU-FPGA interfaces on the two chips.
Adviser: David Wentzlaff