A Siamese Transformer-based Architecture for Detecting Semantic Similarity in the Quran
Keywords:
The Quran, Transformer, Pre-trained Contextual Representations, Siamese Transformer Networks, AraBERT, SBERT

Abstract
Semantic similarity detection is a crucial task in natural language comprehension and plays an important role in many NLP applications such as information extraction, text summarization, and text clustering. This paper focuses on semantic similarity in the Quran. We propose a Siamese transformer-based architecture for pairwise semantic similarity detection in the Quran. We exploit Arabic pre-trained contextual representations to derive semantically meaningful verse embeddings, then fine-tune the twin transformer networks on a semantic similarity dataset drawn from the Quran. We show that our model improves Quranic semantic similarity measurement and outperforms previous studies.
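The setup described in the abstract — twin transformer encoders sharing one set of weights, each producing a verse embedding that is compared by cosine similarity — can be sketched in PyTorch. The toy encoder below, its vocabulary, dimensions, and the random token IDs and gold labels are all illustrative assumptions standing in for the pre-trained AraBERT weights and the Quranic similarity dataset used in the paper; it is a minimal sketch of the SBERT-style training objective, not the authors' implementation.

```python
import torch
import torch.nn as nn

class SiameseEncoder(nn.Module):
    # Toy stand-in for a pre-trained Arabic transformer (e.g. AraBERT);
    # vocabulary size, model dimensions, and layer count are illustrative.
    def __init__(self, vocab_size=1000, d_model=64, nhead=4, num_layers=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers)

    def encode(self, token_ids):
        # Mean pooling over token positions yields one fixed-size
        # verse embedding per input sequence.
        hidden = self.encoder(self.embed(token_ids))
        return hidden.mean(dim=1)

    def forward(self, verse_a, verse_b):
        # Both verses pass through the SAME weights (the Siamese "twins"),
        # and similarity is scored by cosine distance of the embeddings.
        u = self.encode(verse_a)
        v = self.encode(verse_b)
        return nn.functional.cosine_similarity(u, v)

model = SiameseEncoder()
a = torch.randint(0, 1000, (2, 10))   # two verse pairs, 10 token IDs each
b = torch.randint(0, 1000, (2, 10))
scores = model(a, b)                  # cosine similarities in [-1, 1]
gold = torch.tensor([1.0, 0.0])       # hypothetical similarity labels
loss = nn.functional.mse_loss(scores, gold)  # regression objective
```

In a real fine-tuning run, the toy encoder would be replaced by pre-trained AraBERT and `loss.backward()` would update the shared weights, so that semantically related verse pairs move toward higher cosine scores.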
Published
2025-05-24