Investigation of speech disfluencies classification on different threshold selection techniques using energy feature extraction / Raseeda Hamzah and Nursuriati Jamil

Hamzah, Raseeda and Jamil, Nursuriati (2019) Investigation of speech disfluencies classification on different threshold selection techniques using energy feature extraction / Raseeda Hamzah and Nursuriati Jamil. Malaysian Journal of Computing (MJoC), 4 (1). pp. 178-192. ISSN 2231-7473

Official URL: https://mjoc.uitm.edu.my

Abstract

Filled pause and elongation are two types of speech disfluency that need more suitable acoustic features, since they are frequently misclassified. This work concentrates on developing an accurate and robust energy feature extraction method for modelling filled pause and elongation by investigating different energy features based on the local maxima points of the speech energy. Method: We extracted peak values from each frame of a voiced signal, applying different thresholding techniques, to classify filled pause and elongation. Samples of sustained syllables and filled pauses in spontaneous speech were extracted from the Malaysian Parliamentary Debate Database of the year 2008. The energy features were evaluated with a statistical naïve Bayes classifier to assess their contribution to the classification process. We used F-measure evaluation to investigate significant differences in the means of the filled pause and elongation samples. Results: Our proposed feature, the local maxima of the speech energy (LM-E), improved classification, reaching F-measures of up to 71% and 75% for elongation and filled pause, respectively. Conclusion: The best accuracies achieved for both filled pause and elongation classification varied with the type of thresholding technique applied when extracting the local maxima of the speech energy. The thresholding technique that contributed most is our proposed one, which uses an adaptive height as the threshold for extracting the local maxima of the speech energy (LM-E).
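The abstract does not define the LM-E feature or its adaptive height threshold in detail, so the following is only a minimal illustrative sketch of the general idea: compute short-time energy per frame, find local maxima, and keep the peaks that reach a data-dependent height. The function names, the `scale` parameter, and the choice of the mean peak height as the threshold are assumptions made for illustration, not the authors' method. The `f_measure` helper shows the standard harmonic mean of precision and recall that the evaluation refers to.

```python
def frame_energies(signal, frame_len, hop):
    """Short-time energy: sum of squared samples in each frame."""
    return [
        sum(x * x for x in signal[start:start + frame_len])
        for start in range(0, len(signal) - frame_len + 1, hop)
    ]


def local_maxima(values):
    """Indices whose value is strictly greater than both neighbours."""
    return [
        i for i in range(1, len(values) - 1)
        if values[i - 1] < values[i] > values[i + 1]
    ]


def peaks_above_adaptive_height(values, scale=1.0):
    """Keep local maxima that reach `scale` times the mean peak height.

    Assumption: the mean peak height stands in for the paper's adaptive
    threshold, which the abstract does not specify.
    """
    peaks = local_maxima(values)
    if not peaks:
        return []
    height = scale * sum(values[i] for i in peaks) / len(peaks)
    return [i for i in peaks if values[i] >= height]


def f_measure(tp, fp, fn):
    """F-measure (F1): harmonic mean of precision and recall."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

In a pipeline of this shape, the indices returned by `peaks_above_adaptive_height` (or the energies at those indices) would be the features passed to a classifier such as naïve Bayes, and `f_measure` would be computed separately for the filled pause and elongation classes.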

Metadata

Item Type: Article
Creators (Email / ID Num.):
Hamzah, Raseeda (raseeda@tmsk.uitm.edu.my)
Jamil, Nursuriati (liza@uitm.edu.my)
Subjects: T Technology > TP Chemical technology > Chemical engineering > Special processes and operations > Extraction
Divisions: Universiti Teknologi MARA, Shah Alam > Faculty of Computer and Mathematical Sciences
Journal or Publication Title: Malaysian Journal of Computing (MJoC)
UiTM Journal Collections: UiTM Journal > Malaysian Journal of Computing (MJoC)
ISSN: 2231-7473
Volume: 4
Number: 1
Page Range: pp. 178-192
Keywords: Filled pause and elongation, Energy feature extraction, Automatic speech recognition
Date: June 2019
URI: https://ir.uitm.edu.my/id/eprint/43819


ID Number

43819
