Interpretable hybrid models of Kolmogorov–Arnold Networks and transformer for mental health classification in low-resource languages: a Malay social media case study

Ahmad, Zaaba (2025) Interpretable hybrid models of Kolmogorov–Arnold Networks and transformer for mental health classification in low-resource languages: a Malay social media case study. PhD thesis, Universiti Teknologi MARA (UiTM).

Abstract

Depression, anxiety, and stress (DAS) are among the most common global mental health disorders. Social media has become a key outlet where individuals express their psychological states. This research contributes to computational linguistics and mental health informatics by enhancing the classification of DAS in Malay social media, a linguistically diverse and low-resource context marked by extensive colloquial usage. The study addresses several core challenges: the lack of a gold-standard corpus, limitations in existing language models, feature overlap, class imbalance, and issues of model interpretability. A gold-standard annotated corpus is developed using a hybrid strategy that combines expert validation, community affiliations, and self-reported data to ensure reliability and cultural relevance. To address linguistic and computational limitations, this study employs a range of Natural Language Processing (NLP) techniques, including Word2Vec embeddings, Recurrent Neural Networks (RNNs) with attention mechanisms, and transformer-based models such as BERT. To mitigate class imbalance and feature overlap, novel strategies, namely the Class-Aware Attention Model (CAAM) and the Balancing Class Weight Algorithm (BCWA), are introduced, achieving a strong macro average F1-score of 0.88. Further improvement is realised through the integration of Kolmogorov-Arnold Networks (KAN) with BERT. This hybrid KAN-BERT model, enhanced with residual connections, attains a macro average F1-score of 0.92. The structured approach of KAN improves model interpretability by clarifying feature importance, thereby enhancing trust and potential usability in clinical or community mental health settings. Overall, this study delivers a validated corpus, a domain-specific language model, and innovative neural network approaches tailored for low-resource languages. These contributions significantly improve the accuracy and applicability of DAS classification in Malay-language social media, underscoring the role of NLP in addressing mental health challenges in underrepresented linguistic contexts.

Metadata

Item Type: Thesis (PhD)
Creators: Ahmad, Zaaba (Email / ID Num.: UNSPECIFIED)
Contributors:
Thesis advisor: Maskat, Ruhaila (Email / ID Num.: UNSPECIFIED)
Thesis advisor: Mohamed, Azlinah (Email / ID Num.: UNSPECIFIED)
Thesis advisor: Conway, Mike (Email / ID Num.: UNSPECIFIED)
Thesis advisor: Yusoff, Marina (Email / ID Num.: UNSPECIFIED)
Subjects: H Social Sciences > HM Sociology
H Social Sciences > HM Sociology > Social psychology > Interpersonal relations. Social behavior
Divisions: Universiti Teknologi MARA, Shah Alam > Faculty of Computer and Mathematical Sciences
Programme: Doctor of Philosophy (Computer Science)
Keywords: Gated Recurrent Unit (GRU), Multi-Layer Perceptron (MLP), Kolmogorov-Arnold Network (KAN)
Date: September 2025
URI: https://ir.uitm.edu.my/id/eprint/132613
