The forum provides a comprehensive full-stack (hardware and software) view of ML acceleration from cloud to edge. The first talk focuses on the main design and benchmarking challenges facing large general-purpose accelerators, including multi-die scaling, and describes strategies for conducting relevant research as the complexity gap between research prototype and product continues to widen. The second talk looks at how to leverage and specialize the open-source RISC-V ISA for edge ML, exploring the trade-offs between different forms of acceleration such as lightweight ISA extensions and tightly-coupled memory accelerators. The third talk details an approach based on a practical unified architecture for ML that can be easily 'tailored' to fit different scenarios, ranging from smart watches and smartphones to autonomous cars and the intelligent cloud. The fourth talk explores the co-design of hardware and DNN models to achieve state-of-the-art performance for real-time, extremely energy/throughput-constrained inference applications. The fifth talk deals with ML on reconfigurable logic, discussing many examples of specialization implemented on FPGAs and their impact on potential applications, flexibility, performance and efficiency. The sixth talk describes the software complexities of enabling ML APIs for various types of specialized hardware accelerators (GPUs and TPUs, including EdgeTPU). The seventh talk looks into how to optimize the training process for sparse and low-precision network models for general platforms as well as next-generation memristor-based ML engines.
Lim S., Liu Y.P., Benini L., Karnik T., Chang H.-C. (2021). F1: Striking the Balance between Energy Efficiency & Flexibility: General-Purpose vs Special-Purpose ML Processors. Institute of Electrical and Electronics Engineers Inc. [10.1109/ISSCC42613.2021.9365804].
F1: Striking the Balance between Energy Efficiency & Flexibility: General-Purpose vs Special-Purpose ML Processors
Benini L.
2021