Abstract
This project addresses the challenge of accurately pricing diamonds using machine learning techniques. Diamonds are priced based on attributes such as carat, cut, color, clarity, depth, and table, which exhibit complex interrelationships. Traditional methods struggle to model these complexities effectively, which motivates the adoption of more advanced algorithms to improve accuracy. The aim of this project is to develop a Diamond Price Prediction System using Random Forest that accurately predicts diamond prices from these attributes. The project also compares the performance of Random Forest with other regression models using key performance metrics. The methodology begins with data preprocessing, in which missing values, outliers, and inconsistencies are handled to ensure data quality. Two Random Forest models are then developed: a custom implementation and a library-based one. For both models, feature selection and hyperparameter tuning were carried out to improve performance. The custom model, the library-based model, and the other regression models were then compared on MAE, RMSE, and R². The best balance was achieved with a 70:30 train-test split. Of the six regressors tried, Random Forest had the highest predictive accuracy, outperforming Gradient Boosting and Decision Tree, while SVR had the weakest performance. Overall, the library-based Random Forest model gives consistently better accuracy than the custom one, achieving a lower MAE of 101.24 and RMSE of 203.53, together with an R² score of 99%. The research also indicates that a custom Random Forest model can surpass standard implementations when properly optimized, with the gain in predictive accuracy arising from tuning hyperparameters to better suit the dataset.
The model, however, is resource-intensive and requires some expertise in parameter tuning, which makes it less accessible to a layperson. Future work should focus on improving interpretability and extending the models to capture localized diamond pricing trends in real-life transactions.
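The evaluation described above relies on a 70:30 train-test split and on three metrics: MAE, RMSE, and R². As a rough illustration of what those quantities measure, the following is a minimal sketch in plain Python. The helper names (`split_70_30`, `mae`, `rmse`, `r2`) and the sample prices are hypothetical and are not taken from the thesis, which may well have used library routines instead.

```python
import math
import random


def split_70_30(rows, seed=0):
    """Shuffle a copy of `rows` and split it 70:30 (hypothetical helper)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(0.7 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def mae(y_true, y_pred):
    """Mean absolute error: average size of the prediction errors."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error: penalizes large errors more than MAE."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 minus residual over total variance."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up prices for illustration only; these are NOT results from the thesis.
actual = [500.0, 1200.0, 3400.0, 800.0]
predicted = [520.0, 1100.0, 3500.0, 790.0]

train, test = split_70_30(range(10))
print(len(train), len(test))
print(mae(actual, predicted), rmse(actual, predicted), r2(actual, predicted))
```

Because RMSE squares each error before averaging, it is never smaller than MAE on the same predictions, which is consistent with the reported RMSE (203.53) exceeding the reported MAE (101.24).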
Metadata
Item Type: | Thesis (Degree) |
---|---|
Creators: | Mohd Azmi, Nur Amirah (Email / ID Num.: 2023104431) |
Contributors: | Thesis advisor: Tan, Gloria Jennis (Email / ID Num.: UNSPECIFIED) |
Subjects: | Q Science > QA Mathematics > Instruments and machines > Electronic Computers. Computer Science > Algorithms |
Divisions: | Universiti Teknologi MARA, Terengganu > Kuala Terengganu Campus > Faculty of Computer and Mathematical Sciences |
Programme: | Bachelor of Computer Science (Hons) |
Keywords: | Diamond Price Prediction System, Forest Algorithm |
Date: | 2025 |
URI: | https://ir.uitm.edu.my/id/eprint/115275 |
ID Number
115275