arXiv:2501.01470v1 Announce Type: new
Abstract: To address the modality imbalance caused by data heterogeneity, existing multi-modal learning (MML) approaches primarily try to rebalance the modalities through the optimization objective. However, almost all existing methods ignore the impact of the sample sequence: an inappropriate training order tends to induce learning bias in the model, further exacerbating modality imbalance. In this paper, we propose Balance-aware Sequence Sampling (BSS) to enhance the robustness of MML. Specifically, we first define a multi-perspective measurer to evaluate the balance degree of each sample. Based on this evaluation, we employ a heuristic scheduler based on curriculum learning (CL) that incrementally provides training subsets, progressing from balanced to imbalanced samples, to rebalance MML. Moreover, considering that sample balance may evolve as the model's capability increases, we propose a learning-based probabilistic sampling method that dynamically updates the training sequence at the epoch level, further improving MML performance. Extensive experiments on widely used datasets demonstrate the superiority of our method over state-of-the-art (SOTA) MML approaches.
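The two sampling ideas described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the abstract does not specify the measurer, the pacing function, or the sampling distribution, so everything below is an assumption. In particular, `scores` is a hypothetical per-sample balance score (higher is taken to mean "more balanced"), the linear pacing in `curriculum_subset` and the min-max scaling in `probabilistic_subset` are illustrative choices, and the scores are passed in as given rather than learned, whereas the abstract calls the sampling method learning-based.

```python
import random


def curriculum_subset(scores, epoch, total_epochs, start_frac=0.3):
    """Return sample indices exposed at `epoch` (0-based), most balanced first.

    Assumption: the exposed fraction grows linearly from `start_frac`
    to 1.0 over training. The pacing used in the paper may differ.
    """
    n = len(scores)
    progress = epoch / max(total_epochs - 1, 1)
    frac = min(1.0, start_frac + (1.0 - start_frac) * progress)
    k = max(1, round(frac * n))
    # Sort indices by descending balance score (balanced -> imbalanced).
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)
    return order[:k]


def probabilistic_subset(scores, keep_prob_floor=0.1, rng=None):
    """Keep each sample with a probability that increases with its score.

    Assumption: scores are min-max scaled into
    [keep_prob_floor, 1.0] and used directly as keep probabilities.
    Calling this once per epoch with refreshed scores would update
    the training sequence at the epoch level.
    """
    rng = rng or random.Random()
    lo, hi = min(scores), max(scores)
    span = (hi - lo) or 1.0  # avoid division by zero when all scores match
    kept = []
    for i, s in enumerate(scores):
        p = keep_prob_floor + (1.0 - keep_prob_floor) * (s - lo) / span
        if rng.random() < p:
            kept.append(i)
    return kept
```

With four samples scored `[0.9, 0.1, 0.5, 0.7]`, three epochs, and `start_frac=0.5`, the curriculum sketch exposes the two most balanced samples in the first epoch and all four in the last.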