Evaluation of the effectiveness of clustering algorithm in retrieving Malay documents / Aminah Mahmood

Mahmood, Aminah (2004) Evaluation of the effectiveness of clustering algorithm in retrieving Malay documents / Aminah Mahmood. Degree thesis, Universiti Teknologi MARA.

Abstract

In recent years, we have witnessed a tremendous growth in the volume of text
documents available on the Internet, in digital libraries, news sources and company-wide
intranets. This has led to an increased interest in developing methods that can help users
to effectively navigate, summarize and organize this information with the ultimate goal
of helping them to find what they are looking for. The main issue in this information age
is the efficiency and effectiveness of the retrieval system that can be used by the
information provider. A good retrieval system should provide tools to perform searching
accurately based on user requirements. Cluster analysis is a multivariate analysis
technique that assigns items to automatically created groups based on a calculated
degree of association between items and groups. In the information retrieval (IR) field,
cluster analysis has been used to create groups of documents with the goal of improving
the efficiency and effectiveness of retrieval, or to determine the structure of the literature
of a field. The IR community has explored document clustering as an alternative method
of organizing retrieval results, but clustering has yet to be deployed on the major search
engines. This study evaluated the effectiveness of a clustering algorithm in a
Malay document retrieval system using a Hadith test collection, which consists of
Hadith documents, relevance judgments and one set of queries. Three types of
experiments were conducted. The first experiment used exact matching, with no
additional method applied. The second experiment used a stemming method. The last
experiment used a combination of stemming and clustering methods.
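
The three experimental conditions named in the abstract can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the affix lists, the Jaccard threshold, the single-pass clustering scheme, the cluster-expansion step and the three sample texts are hypothetical and are not the stemmer, clustering algorithm or Hadith collection used in the thesis.

```python
# Toy sketch only. The affix lists, threshold and sample texts are assumptions,
# not the stemmer, clustering algorithm or test collection used in the thesis.

PREFIXES = ("mem", "men", "me", "ber", "di")  # hypothetical and incomplete
SUFFIXES = ("kan", "an", "i")                 # hypothetical and incomplete

DOCS = {  # hypothetical sample texts
    "d1": "dia membaca kitab",
    "d2": "bacaan kitab itu",
    "d3": "pasar buah segar",
}


def stem(word):
    """Strip at most one prefix and one suffix, keeping at least 3 letters."""
    for p in PREFIXES:
        if word.startswith(p) and len(word) - len(p) >= 3:
            word = word[len(p):]
            break
    for s in SUFFIXES:
        if word.endswith(s) and len(word) - len(s) >= 3:
            word = word[:-len(s)]
            break
    return word


def jaccard(a, b):
    """Degree of association between two term sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def single_pass_clusters(term_sets, threshold=0.3):
    """Assign each document to the first cluster whose seed is similar enough."""
    clusters = []  # each cluster is a list of doc ids; the first id is the seed
    for doc_id, terms in term_sets.items():
        for cluster in clusters:
            if jaccard(terms, term_sets[cluster[0]]) >= threshold:
                cluster.append(doc_id)
                break
        else:
            clusters.append([doc_id])
    return clusters


def search(query, term_sets, normalise=lambda w: w):
    """Return ids of documents sharing at least one (normalised) query term."""
    q = {normalise(w) for w in query.lower().split()}
    return sorted(d for d, terms in term_sets.items() if q & terms)


def search_with_clusters(query, term_sets, clusters):
    """Stemmed search, then add every cluster-mate of each matching document."""
    hits = set(search(query, term_sets, stem))
    expanded = set()
    for cluster in clusters:
        if hits & set(cluster):
            expanded.update(cluster)
    return sorted(expanded)


EXACT_SETS = {d: set(t.split()) for d, t in DOCS.items()}
STEM_SETS = {d: {stem(w) for w in t.split()} for d, t in DOCS.items()}
CLUSTERS = single_pass_clusters(STEM_SETS)

exact_hits = search("baca", EXACT_SETS)          # condition 1: exact match
stemmed_hits = search("baca", STEM_SETS, stem)   # condition 2: stemming
clustered_hits = search_with_clusters("dia", STEM_SETS, CLUSTERS)  # condition 3
```

In this toy setting the query "baca" matches nothing exactly, because the texts contain only the affixed forms "membaca" and "bacaan", while the stemmed search finds both. The clustered search shows how a document that does not contain the query term can still be returned when it shares a cluster with one that does.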

Metadata

Item Type: Thesis (Degree)
Creators: Mahmood, Aminah
Divisions: Universiti Teknologi MARA, Shah Alam > Faculty of Computer and Mathematical Sciences
Keywords: Retrieval system, information retrieval, clustering algorithm
Date: 2004
URI: https://ir.uitm.edu.my/id/eprint/1370
