Enhancing high-dimensional streaming data analysis: optimizing Online Feature Selection for handling drift using optimization technique and ensemble learning

Kamaru-Zaman, Ezzatul Akmal (2024) Enhancing high-dimensional streaming data analysis: optimizing Online Feature Selection for handling drift using optimization technique and ensemble learning. PhD thesis, Universiti Teknologi MARA (UiTM).

Abstract

In the era of data-driven decision-making, managing dynamic data streams characterized by evolving data distributions and high dimensionality presents a formidable challenge for online feature selection. This research addresses the challenge by developing innovative solutions for optimizing Online Feature Selection (OFS) to manage feature irrelevancy and redundancy, tackling the issue of feature drift, and rigorously validating the proposed algorithms on high-dimensional dynamic data streams. The research employs a structured methodology and introduces two novel methods: PSO-OSFS (Particle Swarm Optimization for Online Streaming Feature Selection), an optimized online feature selection method, and its enhancement, PSO-OSFS+ HEFT, designed to handle feature drift. The PSO-OSFS method is underpinned by an adaptive threshold particle representation for particle swarm optimization and an enhanced fitness function that minimizes the mean absolute deviation of dependency among feature subsets. The adaptive threshold particle representation introduces a novel way of defining the significance-level threshold, which ranges from 0.01 to 0.1.
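
The abstract names two ingredients of PSO-OSFS without giving their exact form: a particle that encodes a significance-level threshold between 0.01 and 0.1, and a fitness based on the mean absolute deviation of dependency among a feature subset. The following is a minimal illustrative sketch, not the thesis implementation. The linear decoding of a particle position, the use of precomputed p-values and dependency scores, the rule that a feature is kept when its p-value is below the threshold, and every function name are assumptions made for illustration.

```python
import random

ALPHA_MIN, ALPHA_MAX = 0.01, 0.1  # significance-level range stated in the abstract


def decode_threshold(position):
    """Map a particle position in [0, 1] to a significance level (assumed linear)."""
    clamped = min(1.0, max(0.0, position))
    return ALPHA_MIN + clamped * (ALPHA_MAX - ALPHA_MIN)


def select_features(p_values, alpha):
    """Keep indices of features whose (hypothetical) test p-value is below alpha."""
    return [i for i, p in enumerate(p_values) if p < alpha]


def mean_absolute_deviation(values):
    """Average absolute distance of each value from the mean."""
    mean = sum(values) / len(values)
    return sum(abs(v - mean) for v in values) / len(values)


def fitness(position, p_values, dependencies):
    """MAD of dependency scores over the selected subset; lower is better."""
    selected = select_features(p_values, decode_threshold(position))
    if not selected:
        return float("inf")  # assumption: an empty subset is treated as infeasible
    return mean_absolute_deviation([dependencies[i] for i in selected])


def pso_threshold(p_values, dependencies, n_particles=8, n_iters=30, seed=0):
    """Plain global-best PSO over a one-dimensional position in [0, 1]."""
    rng = random.Random(seed)
    w, c1, c2 = 0.7, 1.5, 1.5  # commonly used inertia and acceleration values
    pos = [rng.random() for _ in range(n_particles)]
    vel = [0.0] * n_particles
    best_pos = pos[:]
    best_fit = [fitness(x, p_values, dependencies) for x in pos]
    g = min(range(n_particles), key=best_fit.__getitem__)
    g_pos, g_fit = best_pos[g], best_fit[g]
    for _ in range(n_iters):
        for i in range(n_particles):
            # Standard velocity update: inertia + personal pull + global pull.
            vel[i] = (w * vel[i]
                      + c1 * rng.random() * (best_pos[i] - pos[i])
                      + c2 * rng.random() * (g_pos - pos[i]))
            pos[i] = min(1.0, max(0.0, pos[i] + vel[i]))
            f = fitness(pos[i], p_values, dependencies)
            if f < best_fit[i]:
                best_pos[i], best_fit[i] = pos[i], f
            if f < g_fit:
                g_pos, g_fit = pos[i], f
    return decode_threshold(g_pos), g_fit
```

Calling `pso_threshold(p_values, dependencies)` with two equal-length lists of made-up scores returns a threshold inside the stated range together with its fitness. Used alone, this fitness favours very small subsets, because a single feature has zero deviation, so a practical fitness would presumably combine it with other terms that the abstract does not describe.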

Metadata

Item Type: Thesis (PhD)
Creators: Kamaru-Zaman, Ezzatul Akmal (Email / ID Num.: UNSPECIFIED)
Contributors: Thesis advisor: Ahmad, Azlin (Email / ID Num.: UNSPECIFIED)
Thesis advisor: Mohamed, Azlinah (Email / ID Num.: UNSPECIFIED)
Subjects: Q Science > QA Mathematics > Multivariate analysis. Cluster analysis. Longitudinal method
Q Science > QA Mathematics > Online data processing
Divisions: Universiti Teknologi MARA, Shah Alam > College of Computing, Informatics and Mathematics
Programme: Doctor of Philosophy (Computer Science)
Keywords: High-dimensional data analysis, Online feature selection, Concept drift, Optimization techniques.
Date: 2024
URI: https://ir.uitm.edu.my/id/eprint/122888

Download

Text: 122888.pdf (221kB)

ID Number

122888
