Keyword indexing for text documents using signature files / Abdul Hakim A. Gafa

A. Gafa, Abdul Hakim (2008) Keyword indexing for text documents using signature files / Abdul Hakim A. Gafa. Degree thesis, Universiti Teknologi MARA (UiTM).

Abstract

Information retrieval is the first step in developing retrieval systems for collections of text documents. Besides Inverted Files, Signature Files are a popular and effective method for searching and retrieval (Zobel and Moffat, 2006). This project explores the potential and limitations of a prototype text search engine that uses Signature Files on Malaysian text documents. The Malaysian text documents are the official reports of proceedings and debates in parliament, documented in the Malay language and maintained by the House of Parliament. These documents are categorized into House of Commons and House of Lords. Currently, searching and retrieving information from text documents in the Malay language is done manually. This process is tedious, very time-consuming, and inefficient. A text search engine prototype using Signature Files can speed up the process of searching and retrieving information from Malaysian text documents. The main objective of this project is to compare the effectiveness of searching text documents using the Signature Files algorithm against the Inverted Files algorithm. To achieve this objective, the Signature Files algorithm for indexing needs to be understood and implemented. A text search engine prototype for Malay text documents will be developed as a tool to evaluate the effectiveness of searching text documents using Signature Files and Inverted Files.
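
As a rough illustration of the indexing idea named in the abstract, the sketch below builds superimposed-coding signatures for a small collection and uses them to filter candidate documents before verifying each match against the text. It is not the thesis's implementation: the signature width, the number of bits set per keyword, the MD5-based hashing, whitespace tokenization, and all function names are assumptions chosen for the example.

```python
import hashlib

SIG_BITS = 64       # assumed signature width in bits
BITS_PER_WORD = 3   # assumed number of bits set per keyword


def word_signature(word):
    """Hash one keyword to a bit pattern with up to BITS_PER_WORD bits set."""
    sig = 0
    for i in range(BITS_PER_WORD):
        digest = hashlib.md5(f"{i}:{word.lower()}".encode("utf-8")).digest()
        pos = int.from_bytes(digest[:4], "big") % SIG_BITS
        sig |= 1 << pos
    return sig


def document_signature(text):
    """Superimpose (bitwise OR) the signatures of all words in the text."""
    sig = 0
    for word in text.split():
        sig |= word_signature(word)
    return sig


def build_signature_file(docs):
    """Map each document id to its signature."""
    return {doc_id: document_signature(text) for doc_id, text in docs.items()}


def search(query, docs, signatures):
    """Return ids of documents that contain every query word."""
    query_sig = document_signature(query)
    query_words = [w.lower() for w in query.split()]
    hits = []
    for doc_id, sig in signatures.items():
        # Signature test: every query bit must be set in the document signature.
        if sig & query_sig == query_sig:
            # Signatures can produce false matches, so verify against the text.
            words = {w.lower() for w in docs[doc_id].split()}
            if all(w in words for w in query_words):
                hits.append(doc_id)
    return hits


# Hypothetical sample collection, for illustration only.
docs = {
    "d1": "rang undang-undang dibentangkan di dewan",
    "d2": "belanjawan negara dibahaskan",
}
signatures = build_signature_file(docs)
matches = search("dewan", docs, signatures)
```

The signature test only narrows the candidate set. Because different words can set the same bits, a final check against the document text is needed to discard false matches.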

Metadata

Item Type: Thesis (Degree)
Creators: A. Gafa, Abdul Hakim (Email / ID Num.: 2005614250)
Contributors: Thesis advisor: Sheikh Aljunid, Syed Ahmad (Email / ID Num.: UNSPECIFIED)
Subjects: Z Bibliography. Library Science. Information Resources > Information organization
Divisions: Universiti Teknologi MARA, Shah Alam > Faculty of Computer and Mathematical Sciences
Programme: Bachelor Of Computer Science (Hons)
Keywords: Retrieval systems, text search engine prototype, Signature Files algorithm
Date: 2008
URI: https://ir.uitm.edu.my/id/eprint/98182

Download

Text: 98182.pdf (120kB)

Physical Copy

Item Status: Processing

ID Number

98182
