Malay dialect identification using Bi-LSTM trained on MFCC features

Sulaiman, Mohd Azman Hanif and Abd Aziz, Nurhakimah and Zabidi, Azlee and Jantan, Zuraidah and Mohd Yassin, Ihsan and Megat Ali, Megat Syahirul Amin (2025) Malay dialect identification using Bi-LSTM trained on MFCC features. Journal of Electrical and Electronic Systems Research (JEESR), 27 (1): 16. pp. 130-138. ISSN 1985-5389

Official URL: https://jeesr.uitm.edu.my

Identification Number (DOI): 10.24191/jeesr.v27i1.016

Abstract

The Malay language is a major language of the Austronesian family and is commonly spoken in various parts of Southeast Asia (SEA). Despite its many native speakers, research on intelligent techniques for analysing the language has been limited. In this paper, we present a Long Short-Term Memory (LSTM) network that performs dialect recognition for the Malay language. We collected 240 samples from 10 native dialect speakers for the experiments. We then represented the raw audio recordings as Mel Frequency Cepstral Coefficient (MFCC) features to train the LSTM classifier. The classifier achieved 98.20% classification accuracy, comparable to similar current methods.
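The MFCC representation mentioned in the abstract can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the sample rate, frame length, hop size, filter count, and number of coefficients below are assumed values chosen for illustration, and a synthetic tone stands in for a dialect recording. The function names are hypothetical.

```python
import math
import cmath

def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def power_spectrum(frame):
    """Naive DFT power spectrum of one Hamming-windowed frame (bins 0..N/2)."""
    n = len(frame)
    windowed = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    spec = []
    for k in range(n // 2 + 1):
        s = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                for i, x in enumerate(windowed))
        spec.append(abs(s) ** 2 / n)
    return spec

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters whose centres are evenly spaced on the mel scale."""
    top = hz_to_mel(sample_rate / 2.0)
    mel_points = [top * i / (n_filters + 1) for i in range(n_filters + 2)]
    bins = [int(math.floor((n_fft + 1) * mel_to_hz(m) / sample_rate))
            for m in mel_points]
    bank = []
    for j in range(1, n_filters + 1):
        lo, mid, hi = bins[j - 1], bins[j], bins[j + 1]
        filt = [0.0] * (n_fft // 2 + 1)
        for k in range(lo, mid):      # rising edge
            filt[k] = (k - lo) / (mid - lo)
        for k in range(mid, hi):      # peak and falling edge
            filt[k] = (hi - k) / (hi - mid)
        bank.append(filt)
    return bank

def dct2(values, n_out):
    """Unnormalised DCT-II, keeping the first n_out coefficients."""
    n = len(values)
    return [sum(v * math.cos(math.pi * k * (i + 0.5) / n)
                for i, v in enumerate(values))
            for k in range(n_out)]

def mfcc(signal, sample_rate=8000, frame_len=256, hop=128,
         n_filters=20, n_coeffs=13):
    """Return one list of n_coeffs MFCCs per frame (all defaults are assumptions)."""
    bank = mel_filterbank(n_filters, frame_len, sample_rate)
    features = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        spec = power_spectrum(signal[start:start + frame_len])
        energies = [sum(f * p for f, p in zip(filt, spec)) for filt in bank]
        log_e = [math.log(e + 1e-10) for e in energies]  # small offset avoids log(0)
        features.append(dct2(log_e, n_coeffs))
    return features

# Synthetic 440 Hz tone standing in for a speech recording (assumption).
tone = [math.sin(2 * math.pi * 440 * t / 8000) for t in range(1024)]
frames = mfcc(tone)
print(len(frames), len(frames[0]))
```

Each recording becomes a sequence of per-frame coefficient vectors, which is the kind of time-ordered input a recurrent classifier such as an LSTM consumes. In practice an audio library would replace the naive DFT used here.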

Metadata

Item Type: Article
Creators (Email / ID Num.):
Sulaiman, Mohd Azman Hanif (UNSPECIFIED)
Abd Aziz, Nurhakimah (UNSPECIFIED)
Zabidi, Azlee (UNSPECIFIED)
Jantan, Zuraidah (UNSPECIFIED)
Mohd Yassin, Ihsan (UNSPECIFIED)
Megat Ali, Megat Syahirul Amin (UNSPECIFIED)
Subjects: P Language and Literature > P Philology. Linguistics > Language. Linguistic theory. Comparative grammar
T Technology > TK Electrical engineering. Electronics. Nuclear engineering > Electronics > Computer engineering. Computer hardware
Divisions: Universiti Teknologi MARA, Shah Alam > Faculty of Electrical Engineering
Journal or Publication Title: Journal of Electrical and Electronic Systems Research (JEESR)
UiTM Journal Collections: UiTM Journals > Journal of Electrical and Electronic Systems Research (JEESR)
ISSN: 1985-5389
Volume: 27
Number: 1
Page Range: pp. 130-138
Keywords: Malay language, Dialect classification, Long Short Term Memory (LSTM) neural network, Mel Frequency Cepstral Coefficient (MFCC)
Date: October 2025
URI: https://ir.uitm.edu.my/id/eprint/126335


ID Number

126335
