Automatic text summarization for Malay news documents using latent dirichlet allocation and sentence selection / Siti Nur Afiqah Ramlan

Ramlan, Siti Nur Afiqah (2019) Automatic text summarization for Malay news documents using latent dirichlet allocation and sentence selection / Siti Nur Afiqah Ramlan. Degree thesis, Universiti Teknologi MARA (UiTM).

Abstract

The proliferation of online newspapers has created a need for automatic text summarization that produces a summary containing most of the important information in the original document. This project focused on keyword extraction using Latent Dirichlet Allocation (LDA) and on sentence selection using a rule-based approach to produce an extractive summary. To evaluate the effectiveness of the system, 100 Malay news documents covering general news, sports, health and technology were collected from Utusan Online. The project used only a single LDA topic and took the top 10 words in the selected topic as the keywords. For evaluation, the summary generated by the system was compared with a summary produced by a human expert using precision and recall. The best score achieved by the system was 63.7%. The results suggest that the generated summaries can help people read Malay news documents in a short time, since a summary lets readers understand the important parts of a document without reading it from beginning to end.
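The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation. It assumes the keywords have already been taken from an LDA topic, it replaces the unspecified rule-based selection with a simple count of keyword hits per sentence, and it computes precision and recall at the sentence level, which may differ from the formula used in the thesis. The function names, the sample Malay text, the keyword list and the reference summary are all invented for illustration.

```python
import re


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def select_sentences(text, keywords, max_sentences=2):
    """Keep the sentences with the most keyword hits, in document order.

    Assumption: scoring by keyword count stands in for the thesis's
    rule-based selection, whose actual rules are not given in the abstract.
    """
    keyset = {k.lower() for k in keywords}
    sentences = split_sentences(text)
    scored = []
    for idx, sentence in enumerate(sentences):
        tokens = re.findall(r"\w+", sentence.lower())
        hits = sum(1 for token in tokens if token in keyset)
        scored.append((hits, idx))
    # Highest score first; ties go to the earlier sentence.
    top = sorted(scored, key=lambda pair: (-pair[0], pair[1]))[:max_sentences]
    chosen = sorted(idx for hits, idx in top if hits > 0)
    return [sentences[i] for i in chosen]


def precision_recall(system_summary, reference_summary):
    """Sentence-level precision and recall against a human reference."""
    system_set = set(system_summary)
    reference_set = set(reference_summary)
    overlap = len(system_set & reference_set)
    precision = overlap / len(system_set) if system_set else 0.0
    recall = overlap / len(reference_set) if reference_set else 0.0
    return precision, recall


# Invented example data (hypothetical, not from the thesis corpus).
text = (
    "Pasukan bola sepak negara menang perlawanan akhir. "
    "Cuaca di stadium agak panas. "
    "Jurulatih memuji pemain selepas perlawanan itu."
)
# Stand-in for the top words of a single LDA topic.
keywords = ["pasukan", "perlawanan", "pemain", "jurulatih"]

summary = select_sentences(text, keywords, max_sentences=2)
reference = [
    "Pasukan bola sepak negara menang perlawanan akhir.",
    "Cuaca di stadium agak panas.",
]
precision, recall = precision_recall(summary, reference)
print(summary)
print(precision, recall)
```

In practice the keyword list would come from a trained LDA model rather than being written by hand, and the thesis uses the top 10 words of one topic rather than the four shown here.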

Metadata

Item Type: Thesis (Degree)
Creators: Ramlan, Siti Nur Afiqah (ID Num. 2016645066)
Contributors: Thesis advisor: Abdul Rahman, Nurazzah (Email / ID Num.: UNSPECIFIED)
Subjects: Q Science > QA Mathematics > Instruments and machines > Electronic Computers. Computer Science
Q Science > QA Mathematics > Instruments and machines > Electronic Computers. Computer Science > Computer software
Divisions: Universiti Teknologi MARA, Shah Alam > Faculty of Computer and Mathematical Sciences
Programme: Bachelor of Computer Science (Hons)
Keywords: Automatic text summarization, Malay news documents, latent dirichlet allocation
Date: 2019
URI: https://ir.uitm.edu.my/id/eprint/109371

Download

Text: 109371.pdf (187kB)


ID Number: 109371
