Prediction of employee promotion using hybrid sampling method with machine learning architecture / Shahidan Shafie, Soek Peng Ooi and Khai Wah Khaw

Shafie, Shahidan and Soek, Peng Ooi and Khai, Wah Khaw (2023) Prediction of employee promotion using hybrid sampling method with machine learning architecture / Shahidan Shafie, Soek Peng Ooi and Khai Wah Khaw. Malaysian Journal of Computing (MJoC), 8 (1): 2. pp. 1264-1286. ISSN 2600-8238


Employee promotion plays an important role in an organization. It helps inspire employees to grow and develop their skills, which increases employee loyalty and reduces the turnover rate. This study predicts employee job promotion from employee promotion data by using a hybrid sampling method with machine learning. The purpose of this study is to accelerate the promotion process and to identify the important features that might be considered when promoting an employee. Eight machine learning algorithms are used: Logistic Regression, Decision Tree, Random Forest, K-Nearest Neighbors, Support Vector Machine, Naïve Bayes, Adaptive Boosting Classifier, and Extreme Gradient Boost. Eight algorithms are compared in order to find the most suitable model for predicting employee promotion. Additionally, two hybrid sampling methods are adopted to address the class imbalance in the dataset: Synthetic Minority Oversampling Technique combined with Edited Nearest Neighbor (SMOTE+ENN) and Synthetic Minority Oversampling Technique combined with Tomek Link (SMOTE+Tomek). For feature selection, the Recursive Feature Elimination method with the Random Forest Classifier model (RFE-RFC), the Explained Variance Ratio method with Principal Component Analysis (EVR-PCA), and the Rank Feature Importance method with the Extra Classifier Tree model (RFI-ECT) are applied. The top 5, 8, and 12 features ranked by RFI-ECT are selected to train the machine learning algorithms. The models are evaluated by precision, recall, and F1-score. The top five features ranked by RFI-ECT are region, department, previous year rating, KPIs met above 80%, and awards won. The results suggest that the combination of SMOTE+ENN and Extreme Gradient Boost with eight features is the highest-performing model in this study.
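The hybrid sampling idea described in the abstract can be illustrated with a minimal sketch. The code below is not the authors' implementation: it is a simplified, standard-library-only version of SMOTE followed by Tomek link removal, plus the three evaluation metrics named above. The toy data, the function names (`smote`, `remove_tomek_links`, `precision_recall_f1`), the choice of k = 3 neighbours, and the decision to drop both points of each Tomek link are all assumptions made for illustration. In practice a library such as imbalanced-learn, which provides `SMOTETomek` and `SMOTEENN` in `imblearn.combine`, would normally be used instead.

```python
import math
import random

def nearest(idx, points, candidates, k=1):
    """Return the k candidate indices closest to points[idx], excluding idx."""
    others = [j for j in candidates if j != idx]
    others.sort(key=lambda j: math.dist(points[idx], points[j]))
    return others[:k]

def smote(X, y, minority, k=3, seed=0):
    """Oversample the minority class (binary case) until both classes are equal.

    Each synthetic point lies on the segment between a random minority point
    and one of its k nearest minority neighbours.
    """
    rng = random.Random(seed)
    min_idx = [i for i, label in enumerate(y) if label == minority]
    n_needed = (len(y) - len(min_idx)) - len(min_idx)
    X_new, y_new = list(X), list(y)
    for _ in range(max(n_needed, 0)):
        i = rng.choice(min_idx)
        j = rng.choice(nearest(i, X, min_idx, k))
        gap = rng.random()  # interpolation factor in [0, 1)
        X_new.append(tuple(a + gap * (b - a) for a, b in zip(X[i], X[j])))
        y_new.append(minority)
    return X_new, y_new

def remove_tomek_links(X, y):
    """Drop both points of every Tomek link.

    A Tomek link is a pair of opposite-class points that are each other's
    nearest neighbour. Dropping both ends is an assumption of this sketch.
    """
    all_idx = range(len(X))
    nn = [nearest(i, X, all_idx)[0] for i in all_idx]
    linked = {i for i in all_idx if nn[nn[i]] == i and y[i] != y[nn[i]]}
    keep = [i for i in all_idx if i not in linked]
    return [X[i] for i in keep], [y[i] for i in keep]

def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall, and F1-score for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Hypothetical toy data: 8 "not promoted" (0) and 3 "promoted" (1) employees,
# each described by two made-up numeric features.
X = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.4), (1.5, 1.1), (0.7, 0.9),
     (1.1, 1.6), (1.4, 0.6), (2.4, 2.2),
     (3.0, 3.0), (3.2, 2.7), (2.6, 2.5)]
y = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1]

X_bal, y_bal = smote(X, y, minority=1)            # step 1: oversample
X_clean, y_clean = remove_tomek_links(X_bal, y_bal)  # step 2: clean the boundary

print("after SMOTE:", y_bal.count(0), "vs", y_bal.count(1))
print("after Tomek cleaning:", y_clean.count(0), "vs", y_clean.count(1))
print(precision_recall_f1([1, 1, 0, 0, 1], [1, 0, 0, 1, 1]))
```

The two steps are kept as separate functions because the hybrid methods differ only in the cleaning step: replacing `remove_tomek_links` with an Edited Nearest Neighbor rule would give a SMOTE+ENN variant of the same sketch.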


Item Type: Article
Creators: Shafie, Shahidan; Soek, Peng Ooi; Khai, Wah Khaw
Subjects: H Social Sciences > HD Industries. Land use. Labor > Labor. Work. Working class
Divisions: Universiti Teknologi MARA, Shah Alam > Arshad Ayub Graduate Business School (AAGBS)
Journal or Publication Title: Malaysian Journal of Computing (MJoC)
UiTM Journal Collections: UiTM Journal > Malaysian Journal of Computing (MJoC)
ISSN: 2600-8238
Volume: 8
Number: 1
Page Range: pp. 1264-1286
Keywords: Employee promotion prediction, hybrid sampling, imbalanced data, machine learning
Date: April 2023