Exploring Retrieval Augmented Generation in Arabic

AmazUtah_NLP at SemEval-2024 Task 9: A MultiChoice Question Answering System for Commonsense Defying Reasoning


[Submitted on 14 Aug 2024]

View a PDF of the paper titled Exploring Retrieval Augmented Generation in Arabic, by Samhaa R. El-Beltagy and Mohamed A. Abdallah

View PDF

Abstract:Recently, Retrieval Augmented Generation (RAG) has emerged as a powerful technique in natural language processing, combining the strengths of retrieval-based and generation-based models to enhance text generation tasks. However, the application of RAG in Arabic, a language with unique characteristics and resource constraints, remains underexplored. This paper presents a comprehensive case study on the implementation and evaluation of RAG for Arabic text. The work focuses on exploring various semantic embedding models in the retrieval stage and several LLMs in the generation stage, in order to investigate what works and what doesn’t in the context of Arabic. The work also touches upon the issue of variations between document dialect and query dialect in the retrieval stage. Results show that existing semantic embedding models and LLMs can be effectively employed to build Arabic RAG pipelines.

Submission history

From: Samhaa R El-Beltagy [view email]
[v1]
Wed, 14 Aug 2024 10:03:28 UTC (1,283 KB)



Source link
lol

By stp2y

Leave a Reply

Your email address will not be published. Required fields are marked *

No widgets found. Go to Widget page and add the widget in Offcanvas Sidebar Widget Area.