A Comparative Study of Feature Selection for Support Vector Machine in Credit Risk Scoring Classification
Abstract
Credit scoring is a system or method used by banks and other financial institutions to determine whether a debtor is eligible for a loan. One credit scoring method used to classify debtor characteristics is the Support Vector Machine (SVM). SVM has excellent generalization ability for classification problems on large amounts of data and can generate an optimal separating function to separate two groups of data from two different classes. The success of the SVM method depends in part on the feature selection process, which affects the level of classification accuracy. Various methods have been applied to feature selection, because not all features contribute to the best classification results. The feature selection methods used in this study are Variance Threshold, Univariate Chi-Square, Recursive Feature Elimination (RFE), and Extra Trees Classifier (ETC). The study uses secondary data from the UCI Machine Learning Repository. Based on simulations comparing the accuracy of SVM under each feature selection method in credit risk scoring classification, the Variance Threshold and Univariate Chi-Square methods can decrease accuracy, while the RFE and ETC methods can increase accuracy. The RFE method gives the best accuracy.
Keywords: Credit scoring, Credit risk, Feature selection, Support vector machine
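The comparison described in the abstract can be sketched as follows. This is a minimal illustration, not the study's actual setup: it assumes scikit-learn, uses a synthetic data set in place of the UCI credit data, and every setting (the RBF kernel, C, the variance threshold, the number of features kept, 5-fold cross-validation) is an assumption chosen only for illustration. The function name compare_selectors is hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.feature_selection import (
    RFE, SelectFromModel, SelectKBest, VarianceThreshold, chi2,
)
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC, LinearSVC

# Synthetic stand-in for a credit data set (assumption: NOT the UCI data).
X, y = make_classification(
    n_samples=300, n_features=20, n_informative=5, n_redundant=2,
    random_state=0,
)

# One selector per method named in the abstract; all settings are assumed.
selectors = {
    "none": None,  # baseline: SVM on all features
    "variance_threshold": VarianceThreshold(threshold=0.01),
    "chi_square": SelectKBest(chi2, k=10),
    # RFE needs an estimator exposing coef_, so a linear SVM ranks features.
    "rfe": RFE(LinearSVC(dual=False, max_iter=5000), n_features_to_select=10),
    # Keep features whose tree-based importance is above the mean importance.
    "extra_trees": SelectFromModel(
        ExtraTreesClassifier(n_estimators=100, random_state=0)
    ),
}


def compare_selectors(X, y, selectors, cv=5):
    """Return mean cross-validated SVM accuracy for each feature selector."""
    results = {}
    for name, selector in selectors.items():
        # Scale to [0, 1] first: chi2 requires non-negative feature values.
        steps = [MinMaxScaler()]
        if selector is not None:
            steps.append(selector)
        steps.append(SVC(kernel="rbf", C=1.0, gamma="scale"))
        pipeline = make_pipeline(*steps)
        scores = cross_val_score(pipeline, X, y, cv=cv, scoring="accuracy")
        results[name] = scores.mean()
    return results


results = compare_selectors(X, y, selectors)
for name, accuracy in results.items():
    print(f"{name}: {accuracy:.3f}")
```

Placing the selector inside the pipeline means it is refit on each training fold, so the held-out fold never influences which features are kept. The accuracies printed for synthetic data say nothing about the study's reported results.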