Semantic analysis of Hadith for topic classification using Latent Semantic Indexing (LSI) / Aiman Haziq Ibrahim

Ibrahim, Aiman Haziq (2024) Semantic analysis of Hadith for topic classification using Latent Semantic Indexing (LSI) / Aiman Haziq Ibrahim. Degree thesis, Universiti Teknologi MARA, Terengganu.

Abstract

The aim of this project is to provide a framework that uses Latent Semantic Indexing (LSI) to classify Hadith texts by topic through semantic analysis. Islamic teachings place a high value on the Hadith literature, which records the words and deeds of Prophet Muhammad (peace be upon him). To make pertinent information easy to access, retrieve, and comprehend, Hadith texts must be effectively organized and categorized by topic. The currently available approaches to Hadith topic classification have several drawbacks, including subjectivity, labor-intensive manual categorization, and insufficient capture of semantic links within texts. To address these challenges, an LSI-based framework is proposed that leverages the latent semantic meaning in Hadith texts. LSI captures the underlying semantic relationships between words and enables more accurate topic classification. The research framework consists of six phases: preliminary study, requirement analysis, data finding, development, evaluation, and documentation. The data finding phase involves collecting and preprocessing reliable Hadith datasets. Development focuses on creating an information retrieval system using LSI. The evaluation assesses the system's performance using cosine similarity, precision, recall, and F1 score. The experiment assessed the effectiveness of LSI using ten queries and their relevance judgements: precision ranged from 5.4% to 100% and recall from 0% to 65%, yielding an average F1 score of 19.4%. Finally, documentation encompasses writing a comprehensive report that includes background, methodology, findings, and conclusions.
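As an illustration of the retrieval and evaluation steps named in the abstract, the following is a minimal sketch and not the thesis implementation. It assumes raw term counts, a truncated SVD computed with NumPy, and query fold-in; the toy corpus, the function names, the value of `k`, the 0.5 similarity threshold, and the relevance judgements are all hypothetical placeholders.

```python
import numpy as np

# Hypothetical placeholder sentences, not real Hadith text.
docs = [
    "prayer times and daily prayer",
    "fasting during the month of ramadan",
    "charity given to the poor",
    "daily prayer and fasting",
]

def term_doc_matrix(texts):
    """Build a raw-count term-document matrix (terms x documents)."""
    vocab = sorted({w for t in texts for w in t.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}
    A = np.zeros((len(vocab), len(texts)))
    for j, t in enumerate(texts):
        for w in t.lower().split():
            A[index[w], j] += 1.0
    return A, index

def lsi_scores(texts, query, k=2):
    """Cosine similarity between a query and each document in a k-dim latent space.

    Assumes k does not exceed the rank of the term-document matrix.
    """
    A, index = term_doc_matrix(texts)
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    Uk, sk = U[:, :k], s[:k]
    doc_vecs = Vt[:k, :].T  # one row per document
    # Fold the query into the latent space: q_hat = q^T U_k S_k^{-1}
    q = np.zeros(len(index))
    for w in query.lower().split():
        if w in index:
            q[index[w]] += 1.0
    q_hat = (q @ Uk) / sk
    norms = np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(q_hat)
    safe = np.where(norms == 0, 1.0, norms)  # avoid division by zero
    return np.where(norms == 0, 0.0, (doc_vecs @ q_hat) / safe)

def precision_recall_f1(retrieved, relevant):
    """Set-based precision, recall, and F1 for one query."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1

if __name__ == "__main__":
    scores = lsi_scores(docs, "prayer", k=2)
    # Hypothetical cut-off: treat documents scoring at least 0.5 as retrieved.
    retrieved = [i for i, sc in enumerate(scores) if sc >= 0.5]
    # Hypothetical relevance judgements for this query.
    p, r, f1 = precision_recall_f1(retrieved, relevant=[0, 3])
    print(f"precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

A query that retrieves no relevant documents gives a recall and F1 of zero under this definition, which is consistent with a reported recall range that starts at 0%.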

Metadata

Item Type: Thesis (Degree)
Creators: Ibrahim, Aiman Haziq (Email / ID Num.: 20222779903)
Contributors: Thesis advisor: Sadjirin, Rosian (Email / ID Num.: UNSPECIFIED)
Subjects: Q Science > QA Mathematics > Instruments and machines > Electronic Computers. Computer Science > Algorithms
Divisions: Universiti Teknologi MARA, Terengganu > Kuala Terengganu Campus
Programme: Bachelor of Computer Science (Hons)
Keywords: Latent Semantic Indexing (LSI), Hadith Texts, Semantic Analysis
Date: 2024
URI: https://ir.uitm.edu.my/id/eprint/95548

Download

Text: 95548.pdf (91kB)

ID Number

95548
