The development of the indexing prototype considering tags into the inverted file: case study on FTMSK’s official letter / Mohd Sharizan Mohd Shariff

Mohd Shariff, Mohd Sharizan (2005) The development of the indexing prototype considering tags into the inverted file: case study on FTMSK’s official letter / Mohd Sharizan Mohd Shariff. Degree thesis, Universiti Teknologi MARA.

Abstract

Combining information retrieval (IR) with a structured document form (XML) makes the retrieval process more powerful than before. As far as the effectiveness of document retrieval is concerned, each segment (part) of a letter (document) has its own meaning or usage. Term weight must therefore be taken into consideration so that each segment of the document becomes more meaningful and the retrieval process produces output that is more relevant to the user. This idea is the basis for the prototype development. The prototype was built on the Visual Basic platform, with MS Access providing the data storage and structure. The inverted file technique was chosen as the basis for the data structure in this prototype. Retrieval effectiveness is measured using redefined recall (R) and precision (P), which are used to evaluate structured documents. The evaluation compares CAS retrieval (the prototype) with CO retrieval (the benchmark). The results show that, for structured documents, term weighting helps produce output that is more relevant to the user's query than ignoring it does. Each segment in the structured form of the document becomes more identifiable during query processing when term weighting is inserted in the tags.
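
The approach described in the abstract can be illustrated with a minimal Python sketch. This is not the thesis prototype, which was written in Visual Basic with MS Access. The tag names (`subject`, `body`), the weight values in `TAG_WEIGHTS`, the sample letters, and all function names are assumptions made for illustration. The `precision_recall` helper computes the classic set-based measures, not the redefined versions used in the thesis.

```python
from collections import defaultdict
import re
import xml.etree.ElementTree as ET

# Hypothetical per-tag weights; the abstract does not state the actual values.
TAG_WEIGHTS = {"subject": 3.0, "body": 1.0}

# Hypothetical sample letters, each split into tagged segments.
LETTERS = {
    "L1": "<letter><subject>Exam schedule</subject>"
          "<body>Please see the lab booking list.</body></letter>",
    "L2": "<letter><subject>Lab booking</subject>"
          "<body>The exam schedule is attached. The exam hall is ready.</body></letter>",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(letters):
    """Build an inverted file: term -> {doc_id: {tag: occurrence count}}."""
    index = defaultdict(lambda: defaultdict(lambda: defaultdict(int)))
    for doc_id, xml_text in letters.items():
        root = ET.fromstring(xml_text)
        for element in root:  # each direct child is one segment of the letter
            for term in tokenize(element.text or ""):
                index[term][doc_id][element.tag] += 1
    return index


def search(index, query, tag=None):
    """Rank documents by the sum of tag-weighted term counts.

    With tag=None every segment is searched; with a tag given, only
    postings recorded under that tag contribute to the score.
    """
    scores = defaultdict(float)
    for term in tokenize(query):
        for doc_id, tag_counts in index.get(term, {}).items():
            for segment, count in tag_counts.items():
                if tag is None or segment == tag:
                    scores[doc_id] += TAG_WEIGHTS.get(segment, 1.0) * count
    # Highest score first; ties broken by document id for a stable order.
    return sorted(scores, key=lambda d: (-scores[d], d))


def precision_recall(retrieved, relevant):
    """Classic set-based precision and recall (not the redefined measures)."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


if __name__ == "__main__":
    index = build_index(LETTERS)
    print(search(index, "exam"))
    print(search(index, "exam", tag="subject"))
    print(precision_recall(search(index, "exam"), ["L1"]))
```

In this sketch, the letter that mentions "exam" once in its subject outranks the letter that mentions it twice in its body, because the assumed subject weight is higher. Restricting the query to the `subject` tag returns only the first letter.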

Metadata

Item Type: Thesis (Degree)
Creators: Mohd Shariff, Mohd Sharizan (Email / ID Num.: UNSPECIFIED)
Divisions: Universiti Teknologi MARA, Shah Alam > Faculty of Computer and Mathematical Sciences
Keywords: Indexing prototype; Tag; IR and structure form; XML
Date: 2005
URI: https://ir.uitm.edu.my/id/eprint/18274

Download

Text: TD_MOHD SHARIZAN MOHD SHARIFF CS 05_5.pdf (69kB)

ID Number: 18274
