BLR-MoE: Boosted Language-Routing Mixture of Experts for Domain-Robust Multilingual E2E ASR

arXiv:2501.12602v1 Announce Type: new
Abstract: Recently, the Mixture of Expert (MoE) architecture, such as LR-MoE, is often used to alleviate the impact of language confusion on the multilingual ASR (MASR) task. However, it still faces language confusion issues, especially in mismatched domain scenarios. In this paper, we decouple language confusion in LR-MoE into confusion in self-attention and router. To alleviate the language confusion in self-attention, based on LR-MoE, we propose to apply attention-MoE architecture for MASR. In our new architecture, MoE is utilized not only on feed-forward network (FFN) but also on self-attention. In addition, to improve the robustness of the LID-based router on language confusion, we propose expert pruning and router augmentation methods. Combining the above, we get the boosted language-routing MoE (BLR-MoE) architecture. We verify the effectiveness of the proposed BLR-MoE in a 10,000-hour MASR dataset.

Source link
lol

BLR-MoE: Boosted Language-Routing Mixture of Experts for Domain-Robust Multilingual E2E ASR

By stp2y

Leave a Reply Cancel reply