Using text mining algorithm to detect gender deception based on Malaysian chatroom lingo / Associate Professor Dianne L.M. Cheong and Nur Atiqah Sia Abdullah@Sia Sze Yieng

L., Dianne M. Cheong and Sia Abdullah, Nur Atiqah (2006) Using text mining algorithm to detect gender deception based on Malaysian chatroom lingo / Associate Professor Dianne L.M. Cheong and Nur Atiqah Sia Abdullah@Sia Sze Yieng. [Research Reports] (Unpublished)

Abstract

E-mail is used for communication between strangers and friends. It can be a fantasy playground for identity experimentation, where players take on an imaginary persona and interact with each other in the virtual world. In communication, knowing the identity of those with whom you communicate is essential for understanding and evaluating an interaction. However, the presentation of self in the virtual world is often a conscious and deliberate endeavour. Therefore, gender deception is difficult and risky, and it can be abandoned at will. Inferences can be made both from writing style and from clues hidden in the posting data. A text-mining algorithm was designed to detect gender deception based on gender-preferential features at the word or clause level of Malaysian e-mail users. Based on this algorithm, a prototype was developed in Visual Basic. The prototype was tested with 16 documents, each consisting of 5 e-mail exchanges of the respective individuals. Our tests have shown that the prototype achieves an accuracy of 81.3%. This is consistent with a human reader of the documents. The tested prototype will be a tool to assist interested parties, such as the Criminology and Forensic Department, e-mail users and virtual communities, in successfully identifying gender deception.
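
The abstract does not give the algorithm's details, so the following is only a minimal sketch of the general idea of word-level gender-preferential features, written in Python for illustration rather than in the prototype's Visual Basic. The cue lists, the function names (tokenize, predict_gender, is_deceptive) and the simple majority-count rule are all assumptions made for this example; none of them come from the report.

```python
import re
from collections import Counter

# Hypothetical cue words, chosen only for illustration.
# They are NOT the gender-preferential features used in the report.
FEMALE_CUES = {"lovely", "really", "sorry", "hope", "wonderful"}
MALE_CUES = {"bro", "dude", "game", "score", "bet"}


def tokenize(text):
    """Lower-case the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def predict_gender(emails):
    """Guess the author's gender from cue-word counts across all e-mails.

    Assumed rule: whichever cue list is matched more often wins;
    a tie (including no matches at all) is reported as 'undetermined'.
    """
    counts = Counter()
    for email in emails:
        for token in tokenize(email):
            if token in FEMALE_CUES:
                counts["female"] += 1
            elif token in MALE_CUES:
                counts["male"] += 1
    if counts["female"] == counts["male"]:
        return "undetermined"
    return "female" if counts["female"] > counts["male"] else "male"


def is_deceptive(claimed_gender, emails):
    """Flag possible deception when the prediction contradicts the claim."""
    predicted = predict_gender(emails)
    return predicted != "undetermined" and predicted != claimed_gender


if __name__ == "__main__":
    sample = ["Hey bro, did you see the game score?", "Dude, I bet we win"]
    print(predict_gender(sample))
    print(is_deceptive("female", sample))
```

In this sketch a document is a list of e-mails from one individual, mirroring the 5-e-mail documents described in the abstract. A real feature set would have to be derived from the Malaysian e-mail data studied in the report.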

Metadata

Item Type: Research Reports
Creators:
Creators | Email / ID Num.
L., Dianne M. Cheong | UNSPECIFIED
Sia Abdullah, Nur Atiqah | UNSPECIFIED
Subjects: Q Science > QA Mathematics > Analysis
Q Science > QA Mathematics > Instruments and machines > Electronic Computers. Computer Science > Algorithms
Divisions: Universiti Teknologi MARA, Shah Alam > Research Management Centre (RMC) > Institute of Research, Development and Commercialization (IRDC)
Keywords: Gender detection, Gender of e-mail author, Text mining algorithm to detect gender
Date: 2006
URI: https://ir.uitm.edu.my/id/eprint/49538

Download

Text: 49538.pdf (158kB)

Fulltext

Fulltext is available at:
  • Restricted Access Collection (Koleksi Akses Terhad) | PTAR Utama | Shah Alam

ID Number

49538
