Abstract
Micro-blogs, as a new textual domain, offer a unique proposition for sentiment analysis. Their short document length suggests that any sentiment they contain is compact and explicit. However, this short length, coupled with their noisy nature, can pose difficulties for standard machine learning document representations. The aim of this project is to classify Twitter messages into sentiment categories based on important keywords. The project methodology consists of five phases: preliminary study, data collection and preparation, model development, model evaluation, and documentation. The project uses a negative selection algorithm to automatically classify Twitter messages into sentiment categories based on important keyword recognition. To develop the classification model and prototype, 480 Twitter messages were used as training data and 120 Twitter messages as testing data to determine the accuracy of the classification model. The accuracy of this model was about 60 percent. A second experiment was carried out with the data reduced to 240 messages for training and 60 messages for testing. In the second experiment, accuracy improved to 63.33 percent.
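The abstract does not describe how the negative selection algorithm was applied, so the following is only a minimal sketch of the general technique under stated assumptions, not the thesis's implementation. It assumes that positive messages are treated as "self", that detectors are single keywords drawn from a hypothetical candidate list, and that any message matched by a detector is labelled negative. The function names (`tokenize`, `train_detectors`, `classify`, `accuracy`) and the toy messages are illustrative and do not come from the thesis.

```python
import re


def tokenize(text):
    """Lowercase a message and return its set of word tokens."""
    return set(re.findall(r"[a-z']+", text.lower()))


def train_detectors(self_messages, candidate_keywords):
    """Censoring phase: keep only candidates that match no 'self' message.

    Assumption: 'self' is the positive class and each detector is one keyword.
    """
    self_tokens = set()
    for message in self_messages:
        self_tokens |= tokenize(message)
    return {kw for kw in candidate_keywords if kw not in self_tokens}


def classify(message, detectors):
    """Monitoring phase: a detector hit marks the message as non-self."""
    return "negative" if tokenize(message) & detectors else "positive"


def accuracy(messages, labels, detectors):
    """Fraction of messages whose predicted label equals the true label."""
    correct = sum(classify(m, detectors) == y for m, y in zip(messages, labels))
    return correct / len(messages)


# Toy data, invented for illustration only.
positive_train = [
    "I love this phone",
    "great service today",
    "bad weather but great day",
]
candidates = {"bad", "hate", "awful", "great", "slow"}

# "bad" and "great" occur in self messages, so they are censored.
detectors = train_detectors(positive_train, candidates)

test_messages = ["awful battery", "love it", "so bad"]
test_labels = ["negative", "positive", "negative"]
score = accuracy(test_messages, test_labels, detectors)
```

In this toy run, the word "bad" is discarded as a detector because it appears in a positive training message, so the test message "so bad" is misclassified. This illustrates one way that short, noisy messages can limit a keyword-based negative selection classifier.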
Metadata
Item Type: | Thesis (Degree) |
---|---|
Creators: | Che Alhadi, Nazirah (Email / ID Num.: 2010608328) |
Contributors: | Thesis advisor: Jantan, Dr Hamidah (Email / ID Num.: UNSPECIFIED) |
Subjects: | Q Science > QA Mathematics > Elementary mathematics. Arithmetic; Q Science > QA Mathematics > Online data processing; Q Science > QA Mathematics > Evolutionary programming (Computer science). Genetic algorithms |
Divisions: | Universiti Teknologi MARA, Terengganu > Dungun Campus > Faculty of Computer and Mathematical Sciences |
Programme: | Bachelor of Computer Science (Hons) |
Keywords: | Sentiment Analysis, text mining, negative selection algorithm (NSA), Twitter. |
Date: | 2012 |
URI: | https://ir.uitm.edu.my/id/eprint/35377 |