Visualizing the reputation of Malaysian communication service providers through twitter sentiment analysis using naïve bayes / Aidil Amirul Safwan Abdullah Sani

Abdullah Sani, Aidil Amirul Safwan (2020) Visualizing the reputation of Malaysian communication service providers through twitter sentiment analysis using naïve bayes / Aidil Amirul Safwan Abdullah Sani. Degree thesis, Universiti Teknologi MARA, Cawangan Melaka.

Abstract

A text classifier model optimized for short snippets such as tweets is developed to make bilingual sentiment analysis possible. The two languages explored are Bahasa Malaysia and English, since they are the two most commonly spoken languages in Malaysia. The classifier model is trained and tested on a large multi-domain dataset pre-labelled with the labels “0” and “1”, which represent “positive” and “negative” respectively. The Naïve Bayes machine learning (ML) technique is used as the core of the classifier model. All data are pre-processed, and once the classifier model has been developed, it is run on real-time data: public tweets from 2018 that directly or indirectly mention the three biggest communication service providers (CSPs) in Malaysia, namely Celcom, Maxis and Digi. The result of the analysis is incorporated into a web application built with Bootstrap on top of Python’s Flask, allowing interactive data visualization. Agile methodology is used throughout the development of the application to ensure that the project follows the guideline prepared in the design phase. Functionality testing is also done to ensure that there is no significant error that would render the application useless. In conclusion, the findings show that Naïve Bayes is fairly suitable for natural language processing (NLP) problems. Future work on this project could improve the corpus to include different varieties of Bahasa Malaysia slang and commonly used short forms, and add an extra class to represent texts that belong to neither “positive” nor “negative”.
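
As a rough illustration of the kind of classifier the abstract describes, the following is a minimal sketch of a multinomial Naïve Bayes text classifier with Laplace smoothing, written with the Python standard library only. It is not the thesis's implementation: the class name `NaiveBayes`, the `tokenize` helper, the smoothing value, and the six training sentences are all hypothetical and invented for illustration. Only the label convention (“0” for positive, “1” for negative) is taken from the abstract.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase a tweet, drop URLs and @mentions, and split into word tokens."""
    text = re.sub(r"https?://\S+|@\w+", " ", text.lower())
    return re.findall(r"[a-z']+", text)


class NaiveBayes:
    """Multinomial Naive Bayes with Laplace (add-alpha) smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # assumed smoothing value, not from the thesis

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)      # documents per class
        self.word_counts = defaultdict(Counter)  # token counts per class
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        scores = {}
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + self.alpha * len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for tok in tokenize(text):
                if tok in self.vocab:  # tokens never seen in training are skipped
                    score += math.log((counts[tok] + self.alpha) / denom)
            scores[label] = score
        return max(scores, key=scores.get)


# Invented bilingual toy data; 0 = positive, 1 = negative (as in the abstract).
train = [
    ("great coverage and fast internet", 0),
    ("servis sangat bagus dan laju", 0),
    ("terima kasih line laju hari ini", 0),
    ("internet is slow and keeps dropping", 1),
    ("line teruk sangat lambat", 1),
    ("worst service no signal again", 1),
]
texts, labels = zip(*train)
model = NaiveBayes().fit(texts, labels)

positive_guess = model.predict("@Provider internet laju and great")
negative_guess = model.predict("signal teruk and slow")
```

Working in log space avoids numeric underflow when many small probabilities are multiplied, and the smoothing term keeps a word seen in only one class from driving the other class's probability to zero.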

Metadata

Item Type: Thesis (Degree)
Creators: Abdullah Sani, Aidil Amirul Safwan (ID Num.: 2016782377)
Contributors: Thesis advisor: Abu Samah, Khyrina Airin Fariza (Email / ID Num.: UNSPECIFIED)
Subjects: H Social Sciences > HM Sociology > Groups and organizations > Social groups. Group dynamics
H Social Sciences > HM Sociology > Groups and organizations > Social groups. Group dynamics > Social networks > Online social networks > Particular networks, A-Z > Twitter
P Language and Literature > P Philology. Linguistics > Communication. Mass media
Divisions: Universiti Teknologi MARA, Melaka > Jasin Campus > Faculty of Computer and Mathematical Sciences
Keywords: Bilingual sentiment analysis; Twitter sentiment analysis; Communication service providers
Date: 2020
URI: https://ir.uitm.edu.my/id/eprint/31488

Download

Text: 31488.pdf (146kB)

ID Number: 31488
