Computer vision applications are varied and multifaceted. While most training takes place on a server or a desktop computer, deep learning models are deployed on a variety of frontend devices, such as mobile phones, self-driving cars, and Internet of Things (IoT) devices. Because these devices have limited computing power, performance optimization becomes paramount.
In this chapter, we will introduce techniques to reduce model size and improve inference speed while maintaining good prediction quality. As a practical example, we will create a simple application that recognizes facial expressions on iOS and Android devices, as well as in the browser.
The following topics will be covered in this chapter:
- How to reduce model size and boost inference speed without impacting accuracy
- Analyzing model computational performance in depth
- Running models on...