Text Analysis of 2019 Auditor General’s Report / Aisyah Hamizah Azmi, Faresha Farhana Rahman and Noor Dzulaiqa Izafitriah Mohd Alias

Azmi, Aisyah Hamizah and Rahman, Faresha Farhana and Mohd Alias, Noor Dzulaiqa Izafitriah (2021) Text Analysis of 2019 Auditor General’s Report / Aisyah Hamizah Azmi, Faresha Farhana Rahman and Noor Dzulaiqa Izafitriah Mohd Alias. [Student Project] (Unpublished)

Abstract

Text data analysis has become an essential tool for extracting information from the enormous volume of online documents. One document that can be analyzed in this way is the Malaysian Auditor General’s Report. This research aims to help the National Audit Department extract valuable details from the report and visualise them in a form that is simple to monitor. The first objective is to explore word patterns in the Auditor General’s Report for 2019, using collocation analysis. The collocation telaga tiub (tube well) is found to have the highest association strength, as measured by standardized lambda. Since telaga and tiub are the pair of words most likely to follow each other directly, the second objective is to investigate the words related to telaga by cluster analysis, using Ward’s Minimum Variance method. Two clusters of words are formed. The first cluster can be classified as the authorities responsible for the Telaga Tiub project, namely Kementerian Pendidikan Malaysia and Jabatan Mineral dan Geologi. The second cluster represents the agencies that can benefit from the Telaga Tiub project. The third objective is to determine which words are significantly related to specific terms, namely penyelewengan (misappropriation), pembaziran (wastage), gagal (fail), kecuaian (negligence) and ketirisan (leakage), using multiple Fisher’s Exact tests. The term penyelewengan is highly significantly associated with the words wujud and pengawal. The term pembaziran is highly significantly associated with the words hpkk, pengawal, memandang, diharapkan and mengelakkan. The term gagal is highly significantly associated with the words bayaran, deposit, membayar, syarat and guaman. The term kecuaian is highly significantly associated with the words skop, kolam, spesifikasi and uwet. Finally, the term ketirisan is highly significantly associated with the words lkim, hasil, sewa and dikutip.
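As an illustration of the third objective, the following is a minimal sketch of a two-sided Fisher’s Exact test on a 2x2 table relating one term to one word. It is not the authors’ implementation. The function name, the choice of sentences as the counting unit, and the layout of the table are assumptions made for this example, and the counts are hypothetical, not taken from the report.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's Exact test p-value for the 2x2 table [[a, b], [c, d]].

    Assumed layout (hypothetical, for illustration only):
        a = sentences containing both the term and the word
        b = sentences containing the term but not the word
        c = sentences containing the word but not the term
        d = sentences containing neither
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of every table no more likely than the observed one.
    # The small tolerance guards against floating-point ties.
    p_value = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )
    return min(p_value, 1.0)


# Hypothetical counts, chosen only to show the call.
p = fisher_exact_two_sided(3, 1, 1, 3)
print(f"p-value: {p:.4f}")
```

When the test is repeated for many words against the same term, the resulting p-values would normally be adjusted for multiple comparisons; whether and how the authors did so is not stated in the abstract.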

Metadata

Item Type: Student Project
Creators: Azmi, Aisyah Hamizah; Rahman, Faresha Farhana; Mohd Alias, Noor Dzulaiqa Izafitriah
Contributors: Khairol Azmi, Dr. Nurul Nisa’ (Thesis advisor)
Subjects: H Social Sciences > HA Statistics > Statistical data
H Social Sciences > HA Statistics > Theory and method of social science statistics > Surveys. Sampling. Statistical survey methodology
H Social Sciences > HA Statistics > Theory and method of social science statistics > Data envelopment analysis
H Social Sciences > HA Statistics > Theory and method of social science statistics > Data processing. Including SPSS
Divisions: Universiti Teknologi MARA, Negeri Sembilan > Seremban Campus > Faculty of Computer and Mathematical Sciences
Programme: Bachelor of Science (Hons.) Statistics
Keywords: Text Analysis, Auditor General’s Report, Ward’s Minimum Variance
Date: 2021
URI: https://ir.uitm.edu.my/id/eprint/59915
