A guided hybrid k-means and genetic algorithm models for children handwriting legibility performance assessment / Norzehan Sakamat

Sakamat, Norzehan (2021) A guided hybrid k-means and genetic algorithm models for children handwriting legibility performance assessment / Norzehan Sakamat. PhD thesis, Universiti Teknologi MARA.

Abstract

Assessing and predicting children's handwriting legibility performance is necessary for providing early interventions to those with handwriting difficulties. Producing a good and reliable computerized handwriting assessment instrument depends heavily on the selection of appropriate handwriting features, handwriting recognition methods and clustering methods. Offline handwriting recognition is challenging because individual handwriting varies in shape, style and orientation, and the input is in static form. This research develops an offline handwriting recognition instrument and analyses its performance. The instrument combines observations, feature extraction methods and clustering methods, which are expected to produce predictive results in high agreement with human experts based on the evaluation of selected individually handwritten alphabets. Four handwriting components have been identified: time completion, readability, size consistency and shape formation. Time completion was calculated by observing the number of alphabets completed within the specified time. Readability was detected using a free online optical character recognition application called Aeosoft. Size was extracted using the Extreme Point Detection (EPD) algorithm, and the Hit or Miss Transformation (HMT) method was used to extract the stroke formation pattern. The K-Means algorithm, a popular and efficient clustering technique, and the genetic algorithm, a widely used evolutionary algorithm known for its adaptive nature, were combined to determine the level of handwriting legibility for each child. The hybridization of the two methods was proposed because of weaknesses in K-Means that were predicted to prevent it from producing the expected results for this study. The order of the input data and the rescaling of the input data for standardization both influence the accuracy of K-Means results.
Another weakness is the iterative nature of K-Means combined with the random initialization of centroids, which can leave the algorithm stuck in a local optimum and unable to converge to the optimal result. The combined method is called Hybrid K-MeansCGA. The K-Means structure was modified by inserting genetic algorithm (GA) operators and tuning the population. This study also tunes the generation size to see whether it has an impact on producing results in high agreement with human experts. Adjusting the population size is a commonly used strategy for tuning a GA that does not perform well; however, studies that tune the generation size in GAs to find the best solution are rare. Euclidean Distance, Pearson Correlation and Matching Matrix were used to measure the performance of the feature extraction and clustering methods. The recognition software achieved 87.14%, the EPD algorithm 73.57% and the HMT algorithm 74.30% prediction accuracy with occupational therapists (OTs). Hybrid K-MeansCGA with a fixed population size of 100 and varying generation sizes performs better than both the general K-Means algorithm and hybrid K-MeansCGA with a fixed generation size of 100 and varying population sizes. Hybrid K-MeansCGA with generation=150 and population=100 results in prediction accuracy of 87% with teachers and 85% with OTs. The findings show that using different generation sizes can improve the clustering results, which supports the statement from natural evolution theories that generations of species have a great impact on producing the fittest individuals. This research has achieved its objective, as the combined methods are reliable instruments that best imitate the assessment decisions of occupational therapists, who are the qualified professionals in treating issues related to the development of handwriting among children.
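
The abstract does not spell out the GA operators used inside Hybrid K-MeansCGA, so the following is only a minimal generic sketch of how a genetic algorithm can be wrapped around K-Means to reduce its sensitivity to random centroid initialization. The function names, binary tournament selection, uniform crossover, Gaussian mutation, elitism and the `mutation_scale` parameter are all assumptions made for illustration, not the thesis's method. Only the default `population=100` and `generations=150` echo the setting the abstract reports.

```python
import numpy as np

def assign(X, centroids):
    """Label each row of X with the index of its nearest centroid (Euclidean)."""
    d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return d.argmin(axis=1)

def sse(X, centroids):
    """Within-cluster sum of squared distances; lower means a fitter chromosome."""
    labels = assign(X, centroids)
    return float(((X - centroids[labels]) ** 2).sum())

def kmeans_step(X, centroids):
    """One K-Means update: move each centroid to the mean of its members."""
    labels = assign(X, centroids)
    new = centroids.copy()
    for j in range(len(centroids)):
        members = X[labels == j]
        if len(members):  # keep the old centroid if its cluster is empty
            new[j] = members.mean(axis=0)
    return new

def hybrid_kmeans_ga(X, k, population=100, generations=150,
                     mutation_scale=0.05, seed=0):
    """Illustrative GA over centroid sets, refined by K-Means each generation."""
    rng = np.random.default_rng(seed)
    # Each chromosome is a (k, d) array of centroids sampled from the data.
    pop = [X[rng.choice(len(X), k, replace=False)].astype(float)
           for _ in range(population)]
    best, best_cost = None, np.inf
    for _ in range(generations):
        pop = [kmeans_step(X, c) for c in pop]       # local refinement
        costs = np.array([sse(X, c) for c in pop])
        i = int(costs.argmin())
        if costs[i] < best_cost:
            best, best_cost = pop[i].copy(), float(costs[i])
        children = [best.copy()]                      # elitism (assumed)
        while len(children) < population:
            # Binary tournament selection for each parent (assumed operator).
            a, b = rng.integers(population, size=2)
            p1 = pop[a] if costs[a] <= costs[b] else pop[b]
            a, b = rng.integers(population, size=2)
            p2 = pop[a] if costs[a] <= costs[b] else pop[b]
            # Uniform crossover: take each centroid from one parent or the other.
            mask = rng.random(k) < 0.5
            child = np.where(mask[:, None], p1, p2)
            # Gaussian mutation scaled by the per-feature spread of the data.
            child = child + rng.normal(0.0, mutation_scale, child.shape) * X.std(axis=0)
            children.append(child)
        pop = children
    return best, assign(X, best), best_cost
```

Calling `hybrid_kmeans_ga(X, k)` on a feature matrix `X` (one row per child, one column per handwriting component) returns the best centroids found, a cluster label for each row, and the final sum of squared errors. Varying `generations` while holding `population` fixed, and vice versa, mirrors the kind of comparison the abstract describes.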

Metadata

Item Type: Thesis (PhD)
Creators:
Creators | Email / ID Num.
Sakamat, Norzehan | 2011477874
Contributors:
Contribution | Name | Email / ID Num.
UNSPECIFIED | Abd Khalid, Noor Elaiza (Dr.) | UNSPECIFIED
Subjects: Q Science > QA Mathematics > Algebra
Q Science > QA Mathematics > Analysis
Q Science > QA Mathematics > Instruments and machines > Electronic Computers. Computer Science > Algorithms
Divisions: Universiti Teknologi MARA, Shah Alam > Faculty of Computer and Mathematical Sciences
Programme: Doctor of Philosophy (Information Technology and Quantitative Sciences) - CS990
Keywords: Algorithm, Genetic, Handwriting
Date: February 2021
URI: https://ir.uitm.edu.my/id/eprint/53635

Download

Text: 53635.pdf (229kB)

Physical Copy

Physical status and holdings:
Item Status: On Shelf

ID Number

53635