Text-based tagging of Malay Hansard document / Mohd Razif Abd Jalil

Abd Jalil, Mohd Razif (2012) Text-based tagging of Malay Hansard document / Mohd Razif Abd Jalil. Degree thesis, Universiti Teknologi MARA (UiTM).

Abstract

Part-of-speech tagging plays a vital role in natural language processing and is a prerequisite for processing a human language computationally. Before a part-of-speech tagger can be developed, a tag set must be defined for the language. This project presents a rule-based part-of-speech tagging system for the Malay language, applied to the Malay Hansard document, together with a tag set that supports the development of a parser for the language. The tagger's output is compared against a text in which each word has been tagged manually. A context-free grammar is applied to words that have more than one possible word class in order to improve the tagging result. The architecture is very simple and gives reasonably good accuracy. The results show that the most frequent 1.37 percent of entries in the Hansard dictionary are enough to tag more than 55 percent of the words in the Hansard document.
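
The approach outlined in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the lexicon, the tag names, and the function names are all hypothetical, and the one-line contextual rule in `disambiguate` is only a simplified stand-in for the context-free grammar the thesis applies to ambiguous words. The sketch shows dictionary lookup, a fallback for ambiguous words, comparison against a manually tagged text, and how much of a text the most frequent word types cover.

```python
import math
from collections import Counter

# Hypothetical toy lexicon: word -> possible tags (not the thesis's tag set).
LEXICON = {
    "kerajaan": ["NOUN"],
    "rakyat": ["NOUN"],
    "akan": ["AUX"],
    "membantu": ["VERB"],
    "baru": ["ADV", "ADJ"],  # ambiguous: "just" (adverb) or "new" (adjective)
}


def disambiguate(candidates, prev_tag):
    """Toy contextual rule, standing in for the grammar-based step."""
    if prev_tag == "NOUN" and "ADJ" in candidates:
        return "ADJ"
    return candidates[0]


def tag(tokens, lexicon=LEXICON):
    """Tag each token by dictionary lookup; unknown words get 'UNK'."""
    tagged = []
    for token in tokens:
        candidates = lexicon.get(token.lower())
        if not candidates:
            tagged.append((token, "UNK"))
        elif len(candidates) == 1:
            tagged.append((token, candidates[0]))
        else:
            prev_tag = tagged[-1][1] if tagged else None
            tagged.append((token, disambiguate(candidates, prev_tag)))
    return tagged


def accuracy(predicted, gold):
    """Fraction of tokens whose predicted tag matches the manual tag."""
    correct = sum(1 for (_, p), (_, g) in zip(predicted, gold) if p == g)
    return correct / len(gold)


def top_type_coverage(tokens, fraction):
    """Share of tokens covered by the most frequent `fraction` of word types."""
    counts = Counter(t.lower() for t in tokens)
    k = max(1, math.ceil(fraction * len(counts)))
    covered = sum(count for _, count in counts.most_common(k))
    return covered / len(tokens)


if __name__ == "__main__":
    sentence = "kerajaan baru akan membantu rakyat".split()
    print(tag(sentence))
```

A figure such as "1.37 percent of the dictionary tags more than 55 percent of the words" corresponds to the kind of quantity `top_type_coverage` computes, though the thesis may define its dictionary and counts differently.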

Metadata

Item Type: Thesis (Degree)
Creators: Abd Jalil, Mohd Razif (Email / ID Num.: 2009829638)
Contributors: Thesis advisor: Abu Bakar, Zainab (Email / ID Num.: UNSPECIFIED)
Subjects: Q Science > QA Mathematics > Analysis
Divisions: Universiti Teknologi MARA, Shah Alam > Faculty of Computer and Mathematical Sciences
Programme: Bachelor of Science (Hons)
Keywords: Part-of-speech tagging, Malay hansard document, context free grammar
Date: 2012
URI: https://ir.uitm.edu.my/id/eprint/98198

Download

Text: 98198.pdf (109kB)

Physical Copy

Physical status and holdings:
Item Status: Processing

ID Number

98198
