Machine learning on tiny IoT devices based on microcontroller units (MCUs) is appealing but challenging: the memory of a microcontroller is 2-3 orders of magnitude smaller than even that of a mobile phone. I’ll present the MCUNet project, a framework that jointly designs an efficient neural architecture (TinyNAS) and a lightweight inference engine (TinyEngine), enabling ImageNet-scale inference on microcontrollers. Beyond inference, our latest work also enables fine-tuning the model and learning on the edge through quantized training and sparse backpropagation, requiring 1000x less memory than PyTorch and TensorFlow. Our study suggests that the era of always-on tiny machine learning on IoT devices has arrived.
Song Han is an associate professor at MIT EECS. He received his PhD degree from Stanford University. He proposed the “deep compression” technique, which is widely used in industry for efficient AI computing, and the “Efficient Inference Engine,” which first brought weight sparsity to neural network accelerators. His team’s work on hardware-aware neural architecture search (once-for-all network, MCUNet) brought deep learning to IoT devices that have only 256KB of memory and enabled learning on the edge. Song received the NSF CAREER Award for “efficient algorithms and hardware for accelerated machine learning” and was named to MIT Technology Review’s “35 Innovators Under 35” list.