Using text mining for information extraction / Saliza Ramly

Ramly, Saliza (2007) Using text mining for information extraction / Saliza Ramly. Degree thesis, Universiti Teknologi MARA.


The growth of the Internet and the online availability of very large numbers of
documents containing valuable information have created a need for tools that help
users extract the relevant information quickly and effectively, without having to
read every document. An e-mail is
composed of date, e-mail address, subject, body of the e-mail, and so on. It is possible
for the body to include pictures, sounds, and programs, but usually the body is mainly
composed of textual data. Thus, it is possible to use text mining techniques in order to
analyze e-mails. The research focuses on the e-mails of students in the Faculty of
Information Technology and Quantitative Sciences (FTMSK). All three objectives of
the research were achieved. A survey was conducted to achieve the first objective.
The second objective was achieved through content analysis and website observation:
the researcher identified the basic techniques that are commonly used and tabulated
them. A number of organizations that have developed text miners as commercial
products were also identified. Finally, the third objective of the research was
achieved through the development of a tool using text mining techniques. The
Prototyping Methodology was chosen to develop the system. The researcher identified
appropriate techniques from past research and existing text mining tools. As a
result, categorization, clustering and summarization techniques were selected and
applied in the development of the Text Mining Application Tool (TMAT).


Item Type: Thesis (Degree)
Author: Ramly, Saliza
Subjects: Q Science > QA Mathematics > Instruments and machines > Electronic Computers. Computer Science
Divisions: Universiti Teknologi MARA, Shah Alam > Faculty of Computer and Mathematical Sciences
Date: 2007

