Implementation of Random Oversampling Technique in the K-Nearest Neighbor Method for Creditworthiness Analysis
Keywords: creditworthiness, KNN, class imbalance, ROS
Abstract
Banks are financial institutions whose main activities include extending credit to customers. Extending credit requires the bank to assess whether a prospective debtor is creditworthy, because in practice lending is still frequently affected by bad credit. This problem can be mitigated by analyzing the creditworthiness of prospective debtors before credit is granted. The data used in this study comprise 10 independent variables and 1 dependent variable, collectibility (kol). The collectibility data contain 500 observations in the current debtor class and 26 in the non-current debtor class, which indicates class imbalance. This study therefore applies the random oversampling (ROS) technique to address the class imbalance and uses the K-Nearest Neighbor (KNN) method to classify debtors as current or non-current. ROS was chosen because it generally yields better results and does not discard information from the existing data. The results show that KNN with ROS performs better than KNN without ROS, reaching an accuracy of 84.91% on the test data. Applying ROS improves the model's ability to classify non-current debtors: its specificity increases by 25%. Without ROS, the KNN model fails to classify any non-current debtor correctly, which could lead the bank to make harmful decisions.
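The workflow summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the data are synthetic stand-ins that only mirror the reported class sizes (500 current, 26 non-current, 10 features), and the 10% stratified hold-out, k = 5, Euclidean distance, and the helper names (stratified_split, random_oversample, knn_predict, accuracy_and_specificity) are all assumptions made for illustration. Oversampling is applied to the training split only, which is also an assumption. Following the abstract, specificity is taken as the share of non-current debtors that are classified correctly.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data (NOT the study's bank data): label 1 = current, 0 = non-current.
X = np.vstack([rng.normal(0.0, 1.0, (500, 10)), rng.normal(1.0, 1.0, (26, 10))])
y = np.concatenate([np.ones(500, dtype=int), np.zeros(26, dtype=int)])

def stratified_split(y, test_frac, rng):
    """Hold out test_frac of each class so both classes appear in the test set."""
    test_parts = []
    for c in np.unique(y):
        idx = rng.permutation(np.flatnonzero(y == c))
        test_parts.append(idx[:max(1, round(test_frac * len(idx)))])
    test_idx = np.concatenate(test_parts)
    train_mask = np.ones(len(y), dtype=bool)
    train_mask[test_idx] = False
    return np.flatnonzero(train_mask), test_idx

def random_oversample(X, y, rng):
    """Duplicate randomly drawn rows of smaller classes until every class matches the largest."""
    classes, counts = np.unique(y, return_counts=True)
    parts = []
    for c, n in zip(classes, counts):
        idx = np.flatnonzero(y == c)
        extra = rng.choice(idx, size=counts.max() - n, replace=True)
        parts.append(np.concatenate([idx, extra]))
    order = rng.permutation(np.concatenate(parts))
    return X[order], y[order]

def knn_predict(X_train, y_train, X_test, k=5):
    """Majority vote among the k nearest training rows (Euclidean distance, binary 0/1 labels)."""
    dist = np.linalg.norm(X_test[:, None, :] - X_train[None, :, :], axis=2)
    nearest = np.argsort(dist, axis=1)[:, :k]
    return (y_train[nearest].mean(axis=1) > 0.5).astype(int)

def accuracy_and_specificity(y_true, y_pred, negative=0):
    """Accuracy, plus the fraction of negative-class (non-current) rows predicted correctly."""
    is_negative = y_true == negative
    return np.mean(y_true == y_pred), np.mean(y_pred[is_negative] == negative)

train_idx, test_idx = stratified_split(y, 0.10, rng)
X_train, y_train = X[train_idx], y[train_idx]
X_test, y_test = X[test_idx], y[test_idx]

# Balance the training split only; the test split keeps its original class ratio.
X_bal, y_bal = random_oversample(X_train, y_train, rng)

pred_plain = knn_predict(X_train, y_train, X_test)
pred_ros = knn_predict(X_bal, y_bal, X_test)

acc_plain, spec_plain = accuracy_and_specificity(y_test, pred_plain)
acc_ros, spec_ros = accuracy_and_specificity(y_test, pred_ros)
print(f"KNN without ROS: accuracy={acc_plain:.4f}, specificity={spec_plain:.4f}")
print(f"KNN with ROS:    accuracy={acc_ros:.4f}, specificity={spec_ros:.4f}")
```

Because the data are synthetic, the printed figures are not comparable to the 84.91% accuracy reported in the study. The sketch only shows where oversampling sits in the pipeline and how accuracy and specificity are computed.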
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Copyright (c) 2024 Jurnal Matematika Sains dan Teknologi