BoundingDocs: a Unified Dataset for Document Question Answering with Spatial Annotations

stp2y, January 8, 2025

[Submitted on 6 Jan 2025]

Authors: Simone Giovannini and 3 other authors

Abstract: We present a unified dataset for document Question-Answering (QA), obtained by combining several public datasets related to Document AI and visually rich document understanding (VRDU). Our main contribution is twofold: on the one hand, we reformulate existing Document AI tasks, such as Information Extraction (IE), into a Question-Answering task, making the dataset a suitable resource for training and evaluating Large Language Models; on the other hand, we release the OCR of all the documents and include the exact position of the answer to be found in the document image as a bounding box. Using this dataset, we explore the impact of different prompting techniques (that might include bounding box information) on the performance of open-weight models, identifying the most effective approaches for document comprehension.
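To make the reformulation concrete, here is a minimal Python sketch of how an Information Extraction annotation (a field label, its value, and the box around it) might be turned into a question-answer record that keeps the answer's position. Everything in it is an assumption made for illustration: the `FieldAnnotation` class, the `ie_to_qa` function, the question template, and the record keys are hypothetical and do not reflect the dataset's actual schema or the authors' conversion code.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class FieldAnnotation:
    """Hypothetical IE annotation: a labelled field with its location."""
    label: str                        # e.g. "invoice_date"
    value: str                        # text of the field as read by OCR
    bbox: Tuple[int, int, int, int]   # assumed (x0, y0, x1, y1) in pixels


def ie_to_qa(ann: FieldAnnotation,
             template: str = "What is the {label}?") -> Dict[str, object]:
    """Turn one IE annotation into a QA record with a bounding box.

    The template and the output keys are illustrative assumptions,
    not the format used by the dataset itself.
    """
    readable_label = ann.label.replace("_", " ")
    return {
        "question": template.format(label=readable_label),
        "answer": ann.value,
        "bbox": list(ann.bbox),
    }


# Usage: a single made-up annotation converted to a QA record.
example = FieldAnnotation("invoice_date", "2024-03-01", (10, 20, 110, 40))
record = ie_to_qa(example)
print(record["question"])
```

A record of this shape carries both the textual answer and where it sits on the page, which is what would let a prompt optionally include bounding box information.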

Submission history

From: Simone Giovannini
[v1] Mon, 6 Jan 2025 21:46:22 UTC (5,992 KB)
