The evaluation of content-oriented XML document retrieval: a case study of official letter / Hayati Abd Rahman

Abd Rahman, Hayati (2006) The evaluation of content-oriented XML document retrieval: a case study of official letter / Hayati Abd Rahman. [Research Reports] (Unpublished)


This research applies document segmentation, a process in which a document is separated into its constituent parts. Segmentation matters for document retrieval because the content of a document otherwise appears as one undivided block; later in retrieval development, the segments are used for indexing. A letter document has its own format, which consists of many parts. A prototype has been developed to segment letter documents and to make their content-based structure explicit. The documents are divided into smaller, recognized, labelled parts that are easier to manage, edit, and extract. The aims of this work are to apply the official letter standard in the system and to develop an algorithm that segments letter documents and converts them to XML documents. The prototype was built with Visual Basic 6.0.

Moreover, information retrieval makes retrieving a document or a collection of data from storage media more efficient, effective, relevant, faster, and more reliable. The indexing technique may influence the effectiveness of retrieval, and the extension component within the indexing structure may also influence retrieval performance. This research therefore develops a prototype indexing algorithm that takes tag weighting into account for XML documents, and tests the indexer on existing documents. Efficient retrieval requires an appropriate index structure or algorithm that includes structural information. The inverted file method was used to develop the indexing algorithm for FTMSK official letters. Using this algorithm, relevant documents were successfully retrieved, which shows that the prototype can increase the relevancy of document retrieval.
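The idea of an inverted file with tag weighting can be illustrated with a minimal sketch. This is not the report's implementation (the prototype was written in Visual Basic 6.0); the tag names (`subject`, `recipient`, `body`), the weight values, and the sample letters below are all hypothetical, chosen only to show how a term found in a more important segment can contribute more to a document's score.

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical tag weights (not taken from the report): a term in the
# subject line counts more than the same term in the body.
TAG_WEIGHTS = {"subject": 3.0, "recipient": 2.0, "body": 1.0}
DEFAULT_WEIGHT = 1.0

# Hypothetical segmented letters, already converted to XML.
LETTERS = {
    "letter1": "<letter><subject>Meeting schedule</subject>"
               "<body>Please confirm attendance.</body></letter>",
    "letter2": "<letter><subject>Leave application</subject>"
               "<body>The meeting is postponed.</body></letter>",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Build an inverted file: term -> {doc_id: accumulated tag weight}."""
    index = defaultdict(lambda: defaultdict(float))
    for doc_id, xml_text in docs.items():
        root = ET.fromstring(xml_text)
        for elem in root.iter():
            weight = TAG_WEIGHTS.get(elem.tag, DEFAULT_WEIGHT)
            # Only the element's own text is indexed in this sketch.
            for term in tokenize(elem.text or ""):
                index[term][doc_id] += weight
    return index


def search(index, query):
    """Rank documents by the summed weights of the query terms."""
    scores = defaultdict(float)
    for term in tokenize(query):
        for doc_id, weight in index.get(term, {}).items():
            scores[doc_id] += weight
    # Highest score first; ties are broken by document id.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    print(search(build_index(LETTERS), "meeting"))
```

In this sketch both letters contain the term "meeting", but the letter that has it in its subject segment is ranked first, which is the kind of effect tag weighting is meant to produce.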


Item Type: Research Reports
Abd Rahman, Hayati
Ahmad, Adnan (PM. Dr.)
Subjects: Z Bibliography. Library Science. Information Resources > ZA Information resources (General) > Research. Information retrieval. Information behavior. Information literacy
Divisions: Universiti Teknologi MARA, Shah Alam > Faculty of Computer and Mathematical Sciences
Programme: Master of Science
Keywords: XML, Document, Letter
Date: February 2006

Physical Copy

Physical status and holdings:
Item Status:
On Shelf
