A Siamese Transformer-based Architecture for Detecting Semantic Similarity in the Quran

Authors

  • Menwa Alshammeri, University of Leeds
  • Eric Atwell, University of Leeds
  • Mhd Ammar Alsalka, University of Leeds

Keywords

The Quran, Transformer, Pre-trained Contextual Representations, Siamese Transformer Networks, AraBERT, SBERT

Abstract

Semantic similarity detection is a crucial task in natural language understanding and plays an important role in many NLP applications, such as information extraction, text summarization, and text clustering. This paper focuses on detecting semantic similarity in the Quran. We propose a Siamese transformer-based architecture for pairwise semantic similarity detection among Quranic verses. We exploit Arabic pre-trained contextual representations to derive semantically meaningful verse embeddings, and we then fine-tune the twin transformer networks on a semantic similarity dataset drawn from the Quran. We show that our model improves Quranic semantic similarity measurement and outperforms previous studies.
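The abstract describes an SBERT-style Siamese setup: both verses pass through the same encoder (weight sharing), each is pooled into a fixed-size embedding, and the pair is scored with cosine similarity. The sketch below illustrates only that structure; the toy token embeddings, the `encode` mean-pooling helper, and all function names are illustrative assumptions, standing in for the Arabic pre-trained contextual representations (e.g. AraBERT) the paper actually fine-tunes.

```python
import math

EMBED_DIM = 8  # toy dimensionality; real models use hundreds of dimensions


def embed_token(token, dim=EMBED_DIM):
    """Deterministic toy embedding; a stand-in for contextual token vectors."""
    seed = sum(ord(c) for c in token)
    return [math.sin(seed * (i + 1)) for i in range(dim)]


def encode(sentence, dim=EMBED_DIM):
    """Shared encoder applied to both inputs (the 'twin' in a Siamese network)."""
    vecs = [embed_token(t, dim) for t in sentence.split()]
    # Mean-pool token vectors into one sentence embedding, as SBERT-style
    # models do on top of the transformer's token outputs.
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]


def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def similarity(verse_a, verse_b):
    # Both inputs go through the SAME encoder (shared weights), then the
    # resulting embeddings are compared with cosine similarity.
    return cosine(encode(verse_a), encode(verse_b))
```

In the paper's actual pipeline, `encode` would be a pre-trained Arabic transformer whose weights are fine-tuned so that verse pairs labeled similar score higher than unrelated pairs; only the comparison structure is shown here.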

Published

2025-05-24