Abstract
The number of documents increases over time. Malay documents, which include newspapers, articles, journals and so on, are one example of a collection that grows from day to day. However, not every document has its own specific topic, so it is hard for users to find the exact topic that they want. Based on word frequency, a topic can be determined by looking at the words that are most frequently used in a document. This prototype will therefore be implemented using Parliament Hansard documents. It will calculate the 50 most frequently occurring words in every document, and the probability of the words that occur before and after each frequently used word. Furthermore, the frequently used words can be used to generate an index file. In conclusion, the prototype to be developed is capable of giving appropriate output to the user.
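The steps described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the thesis implementation: the function names (`tokenize`, `top_words`, `neighbour_probabilities`, `build_index`), the tokenization rule, and the index layout (word mapped to a list of document IDs) are all assumptions, and no stop-word filtering is applied because the abstract does not mention any.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into words.

    Hyphenated forms such as "undang-undang" are kept as one token
    (an assumption; the abstract does not specify tokenization).
    """
    return re.findall(r"[a-z]+(?:-[a-z]+)*", text.lower())


def top_words(tokens, n=50):
    """Return the n most frequent words, most frequent first."""
    return [word for word, _ in Counter(tokens).most_common(n)]


def neighbour_probabilities(tokens, target):
    """Estimate which words occur directly before and after `target`.

    Each probability is a relative frequency: the neighbour's count
    divided by the number of occurrences of `target` that have a
    neighbour on that side.
    """
    before, after = Counter(), Counter()
    for i, tok in enumerate(tokens):
        if tok != target:
            continue
        if i > 0:
            before[tokens[i - 1]] += 1
        if i + 1 < len(tokens):
            after[tokens[i + 1]] += 1

    def normalise(counts):
        total = sum(counts.values())
        return {w: c / total for w, c in counts.items()} if total else {}

    return normalise(before), normalise(after)


def build_index(documents, n=50):
    """Map each frequent word to the IDs of the documents it is frequent in.

    `documents` is a dict of {document ID: raw text}.
    """
    index = defaultdict(list)
    for doc_id, text in documents.items():
        for word in top_words(tokenize(text), n):
            index[word].append(doc_id)
    return dict(index)
```

For example, `build_index` could be called with a dictionary of Hansard sitting texts keyed by date, and the resulting mapping written to a file to serve as the index.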
Metadata
Item Type: | Thesis (Degree) |
---|---|
Creators: | Hamzah, Muhammad Alhafiz (ID Num. 2009298326) |
Contributors: | Thesis advisor: Md Hanum, Haslizatul Fairuz (Email / ID Num. UNSPECIFIED) |
Subjects: | Q Science > QA Mathematics > Analysis |
Divisions: | Universiti Teknologi MARA, Shah Alam > Faculty of Computer and Mathematical Sciences |
Programme: | Bachelor Of Computer Science (Hons) |
Keywords: | Word frequency, word probability, topic indexing |
Date: | 2011 |
URI: | https://ir.uitm.edu.my/id/eprint/98092 |