A Comparative Study of Feature Selection for Support Vector Machines in Credit Risk Assessment Classification

  • Desri Kristina Silalahi, Department of Mathematics Education, FIP, Universitas Pelita Harapan
  • Hendri Murfi
  • Yudi Satria

Abstract

Credit scoring is a system or method used by banks and other financial institutions to determine whether a debtor is eligible for a loan. One credit scoring method used to classify debtor characteristics is the Support Vector Machine (SVM). SVM generalizes well on classification problems with large amounts of data and can produce an optimal separating function between two groups of data from two different classes. The success of the SVM method depends in part on the feature selection process, which affects classification accuracy. Various feature selection methods have been proposed, because not all features contribute to the best classification results. The feature selection methods used in this study are Variance Threshold, Univariate Chi-Square, Recursive Feature Elimination (RFE), and Extra Trees Classifier (ETC). The study uses secondary data from the UCI Machine Learning Repository. Simulations comparing the accuracy of SVM with each feature selection method in credit risk scoring classification show that the Variance Threshold and Univariate Chi-Square methods can decrease accuracy, while the RFE and ETC methods can increase it. The RFE method gives the better accuracy.


Keywords:
Credit scoring, Credit risk, Feature selection, Support vector machine
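
The comparison described in the abstract can be sketched in Python with scikit-learn, whose feature selection classes carry the same names as the four methods listed. This is only an illustrative sketch under several assumptions: the abstract does not state which library, data set, kernel, or parameter values were used, so the synthetic data, the choice of k = 10 retained features, the default variance threshold, and the 5-fold cross-validation below are all hypothetical stand-ins rather than the authors' setup.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.feature_selection import (
    RFE, SelectFromModel, SelectKBest, VarianceThreshold, chi2,
)
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

# Synthetic two-class data as a stand-in for a credit data set
# (assumption: this is NOT the UCI data used in the study).
X, y = make_classification(
    n_samples=300, n_features=20, n_informative=5, random_state=0
)

# One selector per method named in the abstract; None is the baseline
# without feature selection. All parameter values are illustrative.
selectors = {
    "no selection": None,
    "variance threshold": VarianceThreshold(),  # default: drop constant features
    "chi-square": SelectKBest(chi2, k=10),
    "RFE": RFE(SVC(kernel="linear"), n_features_to_select=10),
    "extra trees": SelectFromModel(
        ExtraTreesClassifier(n_estimators=100, random_state=0)
    ),
}

results = {}
for name, selector in selectors.items():
    # Scale to [0, 1] first: chi2 requires non-negative feature values.
    steps = [MinMaxScaler()]
    if selector is not None:
        steps.append(selector)
    steps.append(SVC(kernel="rbf"))
    pipeline = make_pipeline(*steps)
    # Mean accuracy over 5 folds; selection is refit inside each fold.
    results[name] = cross_val_score(pipeline, X, y, cv=5).mean()

for name, accuracy in results.items():
    print(f"{name:20s} {accuracy:.3f}")
```

Placing the selector inside the pipeline means it is fitted only on the training part of each fold, which avoids leaking test information into the selection step. Accuracies on this synthetic data say nothing about the ranking reported in the study.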

Author Biography

Hendri Murfi


Published
2017-02-14
How to Cite
Silalahi, D. K., Murfi, H., & Satria, Y. (2017). Studi Perbandingan Pemilihan Fitur untuk Support Vector Machine pada Klasifikasi Penilaian Risiko Kredit. EduMatSains : Jurnal Pendidikan, Matematika Dan Sains, 1(2), 119-136. https://doi.org/10.33541/edumatsains.v1i2.238
Section
Articles