🦋 Reconfigurable Butterfly Accelerator for Attention-based Neural Networks (MICRO’22)

We propose an algorithm-hardware co-design approach to accelerate Attention-based Neural Networks (AttNNs) with butterfly sparsity.

This work has been awarded three artifact badges (Artifacts Available, Artifacts Functional, and Results Reproduced). The code mainly contains:

  • Hardware Verilog implementation of a butterfly AttNN accelerator.
  • Software PyTorch implementation of an efficient AttNN (FABNet); see the butterfly-layer sketch after this list.
  • Automated scripts to validate hardware functionality, measure power consumption, and generate bitstreams.
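
As a rough illustration of the butterfly sparsity pattern that FABNet builds on, the minimal PyTorch sketch below factorizes an n x n linear layer into log2(n) butterfly stages of learnable 2x2 blocks, cutting the parameter count from O(n^2) to O(n log n). The `ButterflyLinear` class and its internals are illustrative assumptions, not the repository's actual implementation.

```python
import torch
import torch.nn as nn

class ButterflyLinear(nn.Module):
    """Illustrative n x n linear layer factorized into log2(n) butterfly stages,
    each mixing pairs of elements with its own learnable 2x2 block."""
    def __init__(self, n: int):
        super().__init__()
        assert n & (n - 1) == 0, "n must be a power of two"
        self.n = n
        self.num_stages = n.bit_length() - 1
        # One 2x2 block per pair per stage: (num_stages, n/2, 2, 2) parameters,
        # i.e. O(n log n) instead of O(n^2) for a dense layer.
        self.blocks = nn.Parameter(torch.randn(self.num_stages, n // 2, 2, 2) * 0.5)

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (batch, n)
        b = x.shape[0]
        for s in range(self.num_stages):
            stride = 1 << s
            # Pair element i with element i + stride inside each group of 2*stride.
            pairs = x.reshape(b, self.n // (2 * stride), 2, stride)
            pairs = pairs.transpose(2, 3).reshape(b, self.n // 2, 2)
            # Mix every pair with its own 2x2 block for this stage.
            pairs = torch.einsum('bpi,pij->bpj', pairs, self.blocks[s])
            # Restore the original element ordering.
            x = pairs.reshape(b, self.n // (2 * stride), stride, 2).transpose(2, 3).reshape(b, self.n)
        return x

layer = ButterflyLinear(8)
y = layer(torch.randn(4, 8))  # y has shape (4, 8)
```

The same regular, FFT-like pairing pattern is what makes the sparsity hardware-friendly: every stage has an identical data-flow structure, which a reconfigurable accelerator can exploit.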

🌟 FPGA-based Acceleration for Dropout-based Bayesian Neural Networks (DAC’23)

We propose several algorithm and hardware optimizations to accelerate Monte Carlo Dropout (MCD)-based BayesNNs and Masksembles. The code mainly contains:

  • Hardware HLS implementation for the proposed FPGA-based accelerators.
  • Software PyTorch implementation of dropout-based BayesNNs; see the MC Dropout inference sketch after this list.
  • Automated scripts for HLS prediction, synthesis, and implementation.
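
For context on what "MCD-based BayesNNs" means, the minimal PyTorch sketch below shows standard Monte Carlo Dropout inference: dropout is kept active at test time, several stochastic forward passes are averaged, and their variance serves as an uncertainty signal. The helper name `mc_dropout_predict` and the toy model are assumptions for illustration, not part of this repository.

```python
import torch
import torch.nn as nn

def mc_dropout_predict(model: nn.Module, x: torch.Tensor, num_samples: int = 20):
    """Run several stochastic forward passes with dropout kept active,
    then aggregate the predictions into a mean and an uncertainty estimate."""
    model.eval()
    # Re-enable only the dropout layers; everything else stays in eval mode.
    for m in model.modules():
        if isinstance(m, nn.Dropout):
            m.train()
    with torch.no_grad():
        samples = torch.stack([model(x).softmax(dim=-1) for _ in range(num_samples)])
    mean = samples.mean(dim=0)  # predictive mean over the dropout samples
    var = samples.var(dim=0)    # per-class variance as a simple uncertainty proxy
    return mean, var

# Hypothetical usage with a small dropout MLP classifier.
model = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Dropout(0.2), nn.Linear(64, 10))
mean, var = mc_dropout_predict(model, torch.randn(8, 16))
```

The repeated forward passes are exactly the workload the FPGA accelerators target; Masksembles follows the same pattern but replaces random dropout masks with a small fixed set of masks.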

💫 Sparsity-Aware Dynamic and Static Scheduling for Sparse Multi-DNN Workloads (MICRO’23)

We propose dynamic and static scheduling algorithms for sparse multi-DNN workloads. This work is under review in the MICRO’23 Artifact Evaluation. The code mainly contains:

  • Sparse Multi-DNN Benchmark: a collection of diverse Convolutional NNs (CNNs) and Attention-based NNs (AttNNs) sparsified with SparseML, drawn from three distinct applications: visual perception, personal assistant, and hand tracking.
  • Simulation-based Evaluation Infrastructure: Seamless integration with PyTorch to evaluate the performance of different multi-DNN scheduling approaches.
  • Hardware and Software Prototypes of our Static and Dynamic Schedulers: an FPGA-based implementation used to obtain hardware resource estimates. A toy sparsity-aware scheduling sketch follows this list.
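
The paper's actual schedulers live in the repository; purely to illustrate what "sparsity-aware" dispatch means, the toy Python sketch below estimates each request's effective compute by scaling dense MAC counts with measured layer densities and then dispatches the cheapest request first (shortest-job-first on effective work). The names (`Request`, `estimate_cost`, `dynamic_schedule`) and the numbers are hypothetical and not taken from the paper.

```python
import heapq
from dataclasses import dataclass, field

@dataclass(order=True)
class Request:
    """A queued inference request, ordered by its sparsity-aware cost estimate."""
    est_cost: float
    name: str = field(compare=False)

def estimate_cost(dense_macs, densities):
    """Scale each layer's dense MAC count by its measured weight/activation density."""
    return sum(m * d for m, d in zip(dense_macs, densities))

def dynamic_schedule(requests):
    """Toy dynamic policy: always dispatch the request with the lowest
    sparsity-aware cost estimate."""
    heap = list(requests)
    heapq.heapify(heap)
    order = []
    while heap:
        order.append(heapq.heappop(heap).name)
    return order

# Hypothetical workload: two CNNs and one AttNN with different layer densities.
reqs = [
    Request(estimate_cost([5e6, 5e6], [0.3, 0.4]), "cnn_perception"),
    Request(estimate_cost([2e6, 2e6], [0.9, 0.8]), "attnn_assistant"),
    Request(estimate_cost([1e6, 1e6], [0.5, 0.5]), "cnn_tracking"),
]
print(dynamic_schedule(reqs))  # cheapest effective-compute request first
```

A purely dense cost model would order these jobs differently; accounting for sparsity changes which DNN should run next, which is the effect the benchmark and simulator are built to measure.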

💫 Prior to 2022

Check my GitHub homepage for details. We have open-sourced a series of codebases, including software and hardware implementations for FPT’18, ASAP’19, DAC’21, and others.