Abstract
In natural language processing, part-of-speech tagging plays a vital role. It is an important prerequisite for processing a human language computationally. Developing a part-of-speech tagger first requires a tag set for the language. This project presents a rule-based part-of-speech tagging system for the Malay language, applied to Malay Hansard documents, together with a tag set that supports the development of a parser for the language. The tagger's output is compared against a text in which each word has been tagged manually. A context-free grammar is applied to words that have more than one possible word class in order to improve the tagging results. The system uses a very simple architecture that gives reasonably good accuracy. The results show that the 1.37 percent of Hansard dictionary entries with the highest frequency are enough to tag more than 55 percent of the words in the Hansard document.
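The approach described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the thesis: the toy lexicon, the tag names, the `ALLOWED` tag pairs (a much simplified stand-in for context-free grammar rules), and the helper functions `tag`, `accuracy`, and `coverage`.

```python
from collections import Counter

# Hypothetical toy lexicon (word -> candidate tags); not the thesis's tag set.
LEXICON = {
    "saya": ["PRON"],
    "buah": ["NOUN"],
    "nasi": ["NOUN"],
    "itu": ["DET"],
    "masak": ["VERB", "ADJ"],  # ambiguous: "cook" or "ripe"
}

# Simplified stand-in for grammar rules: allowed (previous tag, tag) pairs.
ALLOWED = {
    ("PRON", "VERB"),
    ("NOUN", "ADJ"),
    ("VERB", "NOUN"),
    ("NOUN", "DET"),
    ("ADJ", "DET"),
}


def tag(tokens):
    """Tag tokens by dictionary lookup; use ALLOWED pairs to resolve ambiguity."""
    tagged = []
    prev = "START"
    for word in tokens:
        candidates = LEXICON.get(word.lower(), ["UNK"])
        choice = candidates[0]  # fallback: first listed tag
        if len(candidates) > 1:
            for cand in candidates:
                if (prev, cand) in ALLOWED:
                    choice = cand
                    break
        tagged.append((word, choice))
        prev = choice
    return tagged


def accuracy(predicted, gold):
    """Share of tokens whose predicted tag matches the manually assigned tag."""
    correct = sum(p == g for (_, p), (_, g) in zip(predicted, gold))
    return correct / len(gold)


def coverage(tokens, top_fraction):
    """Share of tokens covered by the most frequent fraction of word types."""
    counts = Counter(t.lower() for t in tokens)
    k = max(1, round(len(counts) * top_fraction))
    top = {w for w, _ in counts.most_common(k)}
    return sum(t.lower() in top for t in tokens) / len(tokens)


if __name__ == "__main__":
    print(tag(["saya", "masak", "nasi"]))
    print(tag(["buah", "masak", "itu"]))
```

In this sketch the ambiguous word "masak" receives a different tag depending on the tag of the preceding word, and `coverage` mirrors the kind of measurement reported above, namely how much of a text a small, high-frequency slice of the dictionary can account for.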
Metadata
Item Type: | Thesis (Degree) |
---|---|
Creators: | Abd Jalil, Mohd Razif (ID Num. 2009829638) |
Contributors: | Thesis advisor: Abu Bakar, Zainab (Email / ID Num. unspecified) |
Subjects: | Q Science > QA Mathematics > Analysis |
Divisions: | Universiti Teknologi MARA, Shah Alam > Faculty of Computer and Mathematical Sciences |
Programme: | Bachelor of Science (Hons) |
Keywords: | Part-of-speech tagging, Malay hansard document, context free grammar |
Date: | 2012 |
URI: | https://ir.uitm.edu.my/id/eprint/98198 |