24 May
FasterViT is a family of Vision Transformer models designed to be both faster and more accurate than comparable ViT models. It combines the local representation learning of CNNs with the global learning properties of ViTs. In this article, we will cover the FasterViT model for image classification.

Figure 1. FasterViT architecture, throughput, and benchmark on ImageNet1K.

We will go through image inference using the pretrained network, along with a brief overview of its architectural components. We will also fine-tune a FasterViT model for image classification.

We will cover the following topics in this article:

We will start with a discussion…