Mixture of Experts in Image Classification: What’s the Sweet Spot?




arXiv:2411.18322v1 Announce Type: cross
Abstract: Mixture-of-Experts (MoE) models have shown promising potential for parameter-efficient scaling across various domains. However, their adoption in computer vision remains limited and often requires large-scale datasets comprising billions of samples. In this study, we investigate the integration of MoE within computer vision models and explore various MoE configurations on open datasets. When introducing MoE layers in image classification, the best results are obtained for models with a moderate number of activated parameters per sample. However, these improvements gradually vanish as the number of activated parameters per sample increases.
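
To make "activated parameters per sample" concrete, below is a minimal sketch of a top-k gated MoE layer of the kind that can replace the MLP block in a vision transformer. This is not the paper's implementation; the class name, expert count, and hyperparameters are illustrative assumptions.

```python
# Minimal sketch of a top-k gated Mixture-of-Experts layer for a vision model.
# All names and hyperparameters (num_experts, top_k, dims) are illustrative
# assumptions, not taken from the paper.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MoELayer(nn.Module):
    def __init__(self, dim: int, hidden_dim: int, num_experts: int = 8, top_k: int = 2):
        super().__init__()
        self.top_k = top_k
        # Router scores each token against every expert.
        self.router = nn.Linear(dim, num_experts)
        # Each expert is a small MLP; only top_k experts are activated per token,
        # which keeps the number of activated parameters per sample moderate.
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(dim, hidden_dim), nn.GELU(), nn.Linear(hidden_dim, dim))
            for _ in range(num_experts)
        ])

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, tokens, dim) -> flatten tokens for per-token routing
        b, t, d = x.shape
        flat = x.reshape(-1, d)
        gate_logits = self.router(flat)                        # (b*t, num_experts)
        weights, indices = gate_logits.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)                   # normalize over selected experts
        out = torch.zeros_like(flat)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = indices[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot:slot + 1] * expert(flat[mask])
        return out.reshape(b, t, d)


if __name__ == "__main__":
    layer = MoELayer(dim=192, hidden_dim=768)
    tokens = torch.randn(4, 197, 192)   # e.g. ViT-Tiny patch tokens + CLS token
    print(layer(tokens).shape)          # torch.Size([4, 197, 192])
```

In this sketch, scaling up num_experts grows total parameters while top_k controls the activated parameters per token, which is the knob whose "sweet spot" the abstract describes.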




By stp2y
