Abstract
Continuous advances in Artificial Intelligence (AI) techniques have created new application domains for smaller and more efficient Machine Learning (ML) models. In embedded ML, network sparsification strategies have become crucial for fitting models within severe space constraints. This research therefore evaluates Neural Network (NN) sparsification and compression on embedded systems. To do so, we investigate miniaturised binary classifiers (i.e., disease detection) in the computer vision domain. We applied a constant pruning technique during the training of three architectures: a standard Convolutional Neural Network (CNN), i.e., AlexNet; a residual network (i.e., ResNet); and a densely connected CNN (i.e., DenseNet). We varied network sparsity (up to 95%), image resolution (from 8×8 up to 32×32), and quantisation. The results indicate that sparse networks have a significant impact on the accuracy of miniaturised binary classifiers. At 70% sparsity, accuracy improved by 4% on low-resolution images (i.e., 8×8) compared to the standard dense approach. Our findings suggest that sparse NNs can significantly reduce both the size and computational demands of the models while increasing their accuracy in these edge cases.
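To make the notion of a target sparsity level concrete, the following minimal sketch zeroes the smallest-magnitude weights of a toy layer until a requested fraction is zero. It is an illustration only: magnitude-based selection is an assumption made here for simplicity, not necessarily the constant pruning technique applied during training in this work, and the helper names `magnitude_prune` and `measured_sparsity` and the toy weights are hypothetical.

```python
def magnitude_prune(weights, sparsity):
    """Return a copy of `weights` with the smallest-magnitude fraction set to zero.

    Hypothetical helper for illustration; assumes one-shot magnitude pruning,
    which may differ from the pruning schedule used in the paper.
    """
    if not 0.0 <= sparsity < 1.0:
        raise ValueError("sparsity must be in [0, 1)")
    n_prune = int(round(sparsity * len(weights)))
    # Indices ordered by ascending absolute value; the first n_prune are zeroed.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:n_prune]:
        pruned[i] = 0.0
    return pruned


def measured_sparsity(weights):
    """Fraction of weights that are exactly zero."""
    return sum(1 for w in weights if w == 0.0) / len(weights)


# Toy example: ten weights pruned to 70% sparsity, so three weights survive.
weights = [0.9, -0.05, 0.3, -0.7, 0.01, 0.2, -0.15, 0.6, 0.02, -0.4]
pruned = magnitude_prune(weights, 0.7)
print(pruned)
print(measured_sparsity(pruned))
```

In this toy case only the three largest-magnitude weights remain non-zero, which is the property that lets sparse storage formats reduce model size on embedded targets.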