Image Based Character Recognition, Documentation System To Decode Inscription From Temple

AmazUtah_NLP at SemEval-2024 Task 9: A MultiChoice Question Answering System for Commonsense Defying Reasoning



arXiv:2405.17449v1 Announce Type: new
Abstract: This project undertakes the training and analysis of optical character recognition OCR methods applied to 10th century ancient Tamil inscriptions discovered on the walls of the Brihadeeswarar Temple.The chosen OCR methods include Tesseract,a widely used OCR engine,using modern ICR techniques to pre process the raw data and a box editing software to finetune our model.The analysis with Tesseract aims to evaluate their effectiveness in accurately deciphering the nuances of the ancient Tamil characters.The performance of our model for the dataset are determined by their accuracy rate where the evaluated dataset divided into training set and testing set.By addressing the unique challenges posed by the script’s historical context,this study seeks to contribute valuable insights to the broader field of OCR,facilitating improved preservation and interpretation of ancient inscriptions



Source link
lol

By stp2y

Leave a Reply

Your email address will not be published. Required fields are marked *

No widgets found. Go to Widget page and add the widget in Offcanvas Sidebar Widget Area.