JAIT 2024 Vol.15(10): 1163-1173
doi: 10.12720/jait.15.10.1163-1173

Optimizing Deep Learning Efficiency through Algorithm-Hardware Co-design

Joseph T. Santoso *, Mars C. Wibowo, and Budi Raharjo
Faculty of Computer and Business, University of Science and Computer Technology, Semarang, Indonesia
Email: joseph_teguh@stekom.ac.id (J.T.S.); caroline@stekom.ac.id (M.C.W.); budiraharjo@stekom.ac.id (B.R.)
*Corresponding author

Manuscript received February 1, 2024; revised June 6, 2024; accepted July 15, 2024; published October 23, 2024.

Abstract—This study proposes a collaborative approach between algorithms and hardware to improve the efficiency and effectiveness of deep learning by exploring the hardware architectures best suited to executing deep learning algorithms efficiently. The co-design approach jointly optimizes the algorithm and the hardware to reduce model complexity and build resource-efficient solutions. Regularization techniques are employed to lower the computational demands of deep learning models while preserving the effectiveness and accuracy of their outputs under deep compression. In addition, this research develops an Efficient Inference Engine (EIE), a specialized hardware accelerator that performs inference directly on compressed models, significantly improving performance and energy efficiency. The techniques used in this research are Deep Compression (DC), Dense-Sparse-Dense (DSD) training, and EIE. DSD periodically prunes and then restores connections in deep learning models to improve prediction accuracy and prevent overfitting. EIE performs inference directly on compressed models, saving memory bandwidth and improving inference speed and energy efficiency. The results show that the DC technique, DSD training, and the EIE hardware architecture together improve deep learning efficiency: deep compression shrinks DNN models by 17× to 49× without sacrificing prediction accuracy; DSD improves prediction accuracy across a range of deep learning models (convolutional neural networks, recurrent neural networks, and long short-term memory networks); and EIE achieves a 13-fold speedup and 3,400-fold better energy efficiency than a GPU, enabling faster, more energy-efficient, and more accurate use in a variety of applications.
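To make the Dense-Sparse-Dense cycle described above concrete, the sketch below illustrates pruning connections by magnitude, retraining under the resulting sparsity mask, and then removing the mask so pruned connections can recover. This is a minimal illustration, not the authors' implementation: the NumPy weight matrix, the 70% sparsity level, the learning rates, and the stand-in gradients are assumptions made only for the example.

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    """Zero out the lowest-magnitude fraction of weights; return pruned weights and the mask."""
    k = int(sparsity * weights.size)
    threshold = np.sort(np.abs(weights), axis=None)[k]   # k-th smallest magnitude
    mask = np.abs(weights) >= threshold
    return weights * mask, mask

rng = np.random.default_rng(0)

# Dense phase: a stand-in for a normally trained weight matrix.
W = rng.normal(size=(256, 256))

# Sparse phase: prune the smallest connections, then keep training
# while the mask holds pruned connections at zero.
W, mask = magnitude_prune(W, sparsity=0.7)
for _ in range(3):                      # a few illustrative update steps
    grad = rng.normal(size=W.shape)     # stand-in for a backpropagated gradient
    W -= 0.01 * grad * mask             # updates respect the sparsity pattern

# Re-dense phase: drop the mask so pruned connections can regrow,
# and continue training all weights (typically at a lower learning rate).
for _ in range(3):
    grad = rng.normal(size=W.shape)
    W -= 0.001 * grad                   # every connection trains again
```

In the paper's DSD training this cycle is applied to real networks with actual gradients; the sketch only shows how the sparsity mask gates updates during the sparse phase and is then removed for the final dense phase.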
 
Keywords—Deep Compression (DC), Efficient Inference Engine (EIE), Dense-Sparse-Dense (DSD), algorithm-hardware co-design

Cite: Joseph T. Santoso, Mars C. Wibowo, and Budi Raharjo, "Optimizing Deep Learning Efficiency through Algorithm-Hardware Co-design," Journal of Advances in Information Technology, Vol. 15, No. 10, pp. 1163-1173, 2024.

Copyright © 2024 by the authors. This is an open access article distributed under the Creative Commons Attribution-NonCommercial-NoDerivatives License (CC BY-NC-ND 4.0), which permits use, distribution, and reproduction in any medium, provided that the article is properly cited, the use is non-commercial, and no modifications or adaptations are made.