arXiv:2412.20218v1 Announce Type: new
Abstract: In this work, we present the Yorùbá automatic diacritization (YAD) benchmark dataset for evaluating Yorùbá diacritization systems. In addition, we pre-train a text-to-text transformer (T5) model for Yorùbá and show that it outperforms several multilingually trained T5 models. Lastly, we show that more data and larger models improve diacritization for Yorùbá.