Challenges in the Islamic Question Answering Corpora

Authors

  • Sarah Alnefaie, University of Leeds
  • Eric Atwell, University of Leeds
  • Mohammad Ammar Alsalka, University of Leeds

Keywords:

Quran Question Answering Dataset, Islamic Corpus, Islamic Knowledge Base

Abstract

In the past, researchers working on Islamic question-answering systems built their own datasets to evaluate their systems because no shared dataset existed, which made it difficult to compare system performance. In the last three years, several studies have released Islamic question-and-answer datasets to the research community. These can serve as gold-standard datasets for evaluating systems, as knowledge bases for systems, and as training datasets for pre-trained models. In this research, we review and explore the Islamic question-and-answer datasets, study the percentage of the Quran or Hadith that each covers, evaluate them using thirteen criteria, and identify their weaknesses, which could serve as a basis for future research. We conclude that the number of Quran questions is limited and that their answers are available in Arabic only. In addition, as far as we know, there is no Hadith Shareef question-and-answer dataset.

Published

2025-05-22