Abstract
Machine Learning (ML) and Artificial Intelligence (AI) have attracted enormous attention in recent years, with researchers applying them to classification, recognition, and prediction tasks. Although Neural Networks (NN) have existed since their introduction decades ago, many renditions and variations have since been created for different purposes. Variations of the network design, such as the Deep Neural Network (DNN), are practical and applicable in many disciplines and fields of study that involve massive amounts of data. A growing body of research focuses on how DNN models can produce accurate results, and the reported results vary because they use different data, network designs, parameters, and optimization algorithms. This research aims to experiment with a new DNN model that functions modularly by examining several features that affect NN training dynamics. It also discusses several issues associated with deep networks, such as overfitting, scaling, and training time. The study is conducted through the development of a Modular Deep Neural Network (MDNN) and several experiments to enhance its training capabilities. The experiments are conducted in three phases. In the first phase, the NN is tested with two different data formats to determine which format is more suitable for the subsequent experiments. The grayscale format is found to work better, as it retains the original information of the inputs and produces better precision, with the highest accuracy of 98.11%. In the second phase, DNN models with different optimizers, batch sizes, and activation functions are trained and analysed. These experiments show that the Adamax optimizer with ReLU activation can produce a promising test accuracy of 96% with just a small batch size. In the third phase, an experiment is executed to enhance the accuracy of the MDNN model with seeded data. Adding a seeded dataset to the training data substantially changes its accuracy, reaching a maximum test accuracy of 94%, and greatly reduces training time from 20 hours to just 30 minutes. Finally, a t-test is conducted to compare several MDNN models and assess their consistency in producing good results.
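As an illustrative sketch only (not the thesis's actual implementation), the second-phase configuration described above, a DNN trained with the Adamax optimizer, ReLU activations, and a small batch size, might look roughly like the following Keras snippet. The layer widths, the MNIST grayscale dataset, and the batch size of 32 are assumptions made for illustration.

```python
# Sketch only: a small fully connected DNN trained with Adamax + ReLU
# and a small batch size, loosely following the second-phase experiments.
# Layer widths, MNIST, and batch_size=32 are illustrative assumptions.
import tensorflow as tf
from tensorflow import keras

# Grayscale image inputs, flattened and scaled to [0, 1]
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
x_train = x_train.reshape(-1, 784).astype("float32") / 255.0
x_test = x_test.reshape(-1, 784).astype("float32") / 255.0

model = keras.Sequential([
    keras.layers.Input(shape=(784,)),
    keras.layers.Dense(256, activation="relu"),
    keras.layers.Dense(128, activation="relu"),
    keras.layers.Dense(10, activation="softmax"),
])

model.compile(
    optimizer=keras.optimizers.Adamax(learning_rate=0.001),
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)

# Small batch size, as highlighted in the second-phase experiments
model.fit(x_train, y_train, batch_size=32, epochs=10,
          validation_data=(x_test, y_test))
```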
Metadata
| Field | Value |
|---|---|
| Item Type | Thesis (PhD) |
| Creators | Shamsuddin, Mohd Razif (2013463402) |
| Contributors | Advisor: Mohamed, Azlinah; Advisor: Abd. Rahman, Shuzlina |
| Subjects | Q Science > Q Science (General) > Back propagation (Artificial intelligence) |
| Divisions | Universiti Teknologi MARA, Shah Alam > College of Computing, Informatics and Media |
| Programme | Doctor of Philosophy (Computer Science) |
| Keywords | Modular deep neural network, generalization, artificial intelligence (AI) |
| Date | 2024 |
| URI | https://ir.uitm.edu.my/id/eprint/107353 |