Lucene search engine development: a beginner’s experience / Azilawati Azizan ... [et al.]

Azizan, Azilawati and Mohd Sanusi, Najwa Izzah Najihah and Khairuddin, Nurkhairizan and Shafie, Ana Salwa (2022) Lucene search engine development: a beginner’s experience / Azilawati Azizan ... [et al.]. Mathematical Sciences and Informatics Journal (MIJ), 3 (2). pp. 80-92. ISSN 2735-0703

Abstract

Lucene provides a basic library package for building a complete text-based search engine, and it can be used in various ways to benefit both researchers and users. For a beginner, however, creating a search engine with Lucene requires a thorough understanding of its procedures and library packages. This project therefore explores and demonstrates the development of a search engine, using Malay Quran translation text as the test dataset. The project applied the fundamental Information Retrieval (IR) model as its main methodology. The Apache Lucene framework, a full-text search engine library written in Java, was used to construct all of the search engine components, namely the indexer, searcher, query processor, and ranker. The resulting search engine was then evaluated using standard IR measures, achieving 67% precision and 32% recall. This paper provides a basic approach to developing a text-based search engine that can be used for IR testing purposes. The results may also benefit the IR community as a point of comparison for retrieval performance.
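
The abstract reports precision and recall without restating how they are defined. As a reminder of the standard IR definitions, precision is the fraction of retrieved documents that are relevant, and recall is the fraction of relevant documents that are retrieved. The sketch below computes both for a single query. It is an illustration only: the class name `PrecisionRecall`, the method names, and the document IDs are hypothetical, and the sample sets are not taken from the paper's dataset or results.

```java
import java.util.HashSet;
import java.util.Set;

public class PrecisionRecall {

    // Count the documents that are both retrieved and relevant.
    public static int hits(Set<String> retrieved, Set<String> relevant) {
        Set<String> both = new HashSet<>(retrieved);
        both.retainAll(relevant);
        return both.size();
    }

    // Precision = |retrieved AND relevant| / |retrieved|; 0.0 if nothing was retrieved.
    public static double precision(Set<String> retrieved, Set<String> relevant) {
        if (retrieved.isEmpty()) {
            return 0.0;
        }
        return (double) hits(retrieved, relevant) / retrieved.size();
    }

    // Recall = |retrieved AND relevant| / |relevant|; 0.0 if no document is relevant.
    public static double recall(Set<String> retrieved, Set<String> relevant) {
        if (relevant.isEmpty()) {
            return 0.0;
        }
        return (double) hits(retrieved, relevant) / relevant.size();
    }

    public static void main(String[] args) {
        // Made-up document IDs for one query (not the paper's data).
        Set<String> retrieved = Set.of("d1", "d2", "d3", "d4");
        Set<String> relevant = Set.of("d1", "d2", "d5", "d6", "d7", "d8");

        System.out.println("precision = " + precision(retrieved, relevant));
        System.out.println("recall    = " + recall(retrieved, relevant));
    }
}
```

In this made-up case two of the four retrieved documents are relevant, and two of the six relevant documents were found. A gap of that kind between precision and recall, as in the figures reported above, means the engine returns mostly relevant results but misses many of the relevant documents.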

Metadata

Item Type: Article
Creators (Email / ID Num.):
Azizan, Azilawati (azila899@uitm.edu.my)
Mohd Sanusi, Najwa Izzah Najihah (2017769845@isiswa.uitm.edu.my)
Khairuddin, Nurkhairizan (nurkh098@uitm.edu.my)
Shafie, Ana Salwa (anas674@uitm.edu.my)
Subjects: Q Science > QA Mathematics > Instruments and machines > Electronic Computers. Computer Science
Q Science > QA Mathematics > Instruments and machines > Electronic Computers. Computer Science > Algorithms
Q Science > QA Mathematics > Web databases
Divisions: Universiti Teknologi MARA, Perak > Tapah Campus > Faculty of Computer and Mathematical Sciences
Journal or Publication Title: Mathematical Sciences and Informatics Journal (MIJ)
UiTM Journal Collections: UiTM Journal > Mathematical Science and Information Journal (MIJ)
ISSN: 2735-0703
Volume: 3
Number: 2
Page Range: pp. 80-92
Keywords: Search Engine; Lucene; Quran Translation; Information Retrieval; Precision Recall
Date: November 2022
URI: https://ir.uitm.edu.my/id/eprint/74926