Identifying topic for indexing Malay text document / Muhammad Alhafiz Hamzah

Hamzah, Muhammad Alhafiz (2011) Identifying topic for indexing Malay text document / Muhammad Alhafiz Hamzah. Degree thesis, Universiti Teknologi MARA (UiTM).

Abstract

The number of documents increases over time, and Malay documents, which include newspapers, articles, journals and so on, are one example of a collection that grows from day to day. However, not every document has a specific topic assigned to it, so it is hard for users to determine the exact topic they want. Based on word frequency, a topic can be determined by looking at the words most frequently used in a document. This prototype will therefore be implemented using Parliament documents (Hansard). It will calculate the 50 most frequently occurring words in every document, along with the probability of words occurring before and after each frequently used word. Furthermore, the frequently used words can be used to generate an indexing file. In conclusion, the prototype to be developed is capable of giving appropriate output to the user.
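The steps described in the abstract (frequent-word counting, neighbouring-word probabilities and index generation) can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the function names, the tokenisation rule (lowercase alphabetic words, with hyphenated reduplications such as "undang-undang" kept whole), the use of relative frequency as the probability estimate, and the dictionary-based index layout are all assumptions made for illustration.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and extract alphabetic tokens.

    Assumption: hyphenated reduplications (e.g. "undang-undang")
    are kept as single tokens.
    """
    return re.findall(r"[a-z]+(?:-[a-z]+)*", text.lower())


def top_words(tokens, n=50):
    """Return up to n (word, count) pairs, most frequent first."""
    return Counter(tokens).most_common(n)


def neighbour_probabilities(tokens, target):
    """Estimate how likely each word is to appear immediately before
    and immediately after `target`, as relative frequencies."""
    before, after = Counter(), Counter()
    for i, tok in enumerate(tokens):
        if tok != target:
            continue
        if i > 0:
            before[tokens[i - 1]] += 1
        if i + 1 < len(tokens):
            after[tokens[i + 1]] += 1

    def normalise(counts):
        total = sum(counts.values())
        return {w: c / total for w, c in counts.items()} if total else {}

    return normalise(before), normalise(after)


def build_index(documents, n=50):
    """Map each frequent word to the ids of the documents in which
    it is among the n most frequent words (hypothetical index layout)."""
    index = defaultdict(list)
    for doc_id, text in sorted(documents.items()):
        for word, _count in top_words(tokenize(text), n):
            index[word].append(doc_id)
    return dict(index)
```

In this sketch, `build_index` takes a dictionary of document ids to raw text, so a set of Hansard transcripts could be passed in once loaded from disk. A practical version would probably also filter out common function words before counting, which the abstract does not mention.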

Metadata

Item Type: Thesis (Degree)
Creators: Hamzah, Muhammad Alhafiz (ID Num. 2009298326)
Contributors: Md Hanum, Haslizatul Fairuz (Thesis advisor; Email / ID Num. UNSPECIFIED)
Subjects: Q Science > QA Mathematics > Analysis
Divisions: Universiti Teknologi MARA, Shah Alam > Faculty of Computer and Mathematical Sciences
Programme: Bachelor Of Computer Science (Hons)
Keywords: Word frequency, word probability, topic indexing
Date: 2011
URI: https://ir.uitm.edu.my/id/eprint/98092

Download

Text: 98092.pdf (135kB)

Physical Copy

Physical status and holdings:
Item Status: Processing

ID Number

98092