Abstract
With the proliferation of online newspapers, automatic text summarization has become necessary for producing summaries that retain most of the important information in the original document. This project focused on keyword extraction using Latent Dirichlet Allocation (LDA) and on sentence selection using a rule-based concept approach to produce extractive summaries. To evaluate the effectiveness of the system, 100 Malay news documents covering general news, sports, health and technology were collected from Utusan Online. The project used only a single LDA topic and took the top 10 words of the selected topic as keywords. For evaluation, the summaries generated by the system were compared with summaries produced by a human expert using precision and recall. The system's best score was 63.7%, indicating that the generated summaries can help people read Malay news documents in a short time, since a summary lets readers grasp the important parts of a document without reading it from beginning to end.
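As an illustration of the precision and recall evaluation mentioned in the abstract, the following is a minimal sketch rather than the thesis's actual procedure. It assumes that summaries are compared at the sentence level, with each summary represented by the indices of the sentences it selects; the function name `precision_recall` and the example indices are hypothetical.

```python
def precision_recall(system_sentences, reference_sentences):
    """Return (precision, recall) for an extractive summary.

    Both arguments are collections of sentence identifiers (here, sentence
    indices in the source document). Sentence-level matching is an
    assumption; the abstract does not state the matching unit.
    """
    system = set(system_sentences)
    reference = set(reference_sentences)
    overlap = len(system & reference)
    # Precision: share of system-selected sentences the expert also chose.
    precision = overlap / len(system) if system else 0.0
    # Recall: share of expert-chosen sentences the system recovered.
    recall = overlap / len(reference) if reference else 0.0
    return precision, recall


# Hypothetical example: the system picks sentences 0, 2, 5 and 7,
# while the human expert picks sentences 0, 2 and 3.
precision, recall = precision_recall([0, 2, 5, 7], [0, 2, 3])
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Here two of the four system sentences match the expert's choice, so precision is 0.50, and two of the three expert sentences are recovered, so recall is about 0.67.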
Metadata
Item Type: | Thesis (Degree) |
---|---|
Creators: | Ramlan, Siti Nur Afiqah (Email / ID Num.: 2016645066) |
Contributors: | Thesis advisor: Abdul Rahman, Nurazzah (Email / ID Num.: UNSPECIFIED) |
Subjects: | Q Science > QA Mathematics > Instruments and machines > Electronic Computers. Computer Science; Q Science > QA Mathematics > Instruments and machines > Electronic Computers. Computer Science > Computer software |
Divisions: | Universiti Teknologi MARA, Shah Alam > Faculty of Computer and Mathematical Sciences |
Programme: | Bachelor of Computer Science (Hons) |
Keywords: | Automatic text summarization, Malay news documents, latent dirichlet allocation |
Date: | 2019 |
URI: | https://ir.uitm.edu.my/id/eprint/109371 |
Download
109371.pdf (187kB)