Analysis of Methods for Generating Classification Rules Applicable to Credit Risk
Keywords: classification rules, credit scoring, competitive neural networks, particle swarm optimization
Credit risk is the probability of loss arising from a borrower's failure to make the required payments on any type of debt. Financial institutions can reduce this risk by selecting their customers carefully. To do so, they apply classification methodologies that rank customers by risk, based on variables such as reputation, leverage, and income. Analyzing and processing these variables is time-consuming, partly because the data are heterogeneous. In this paper, we present an alternative method that operates on both nominal and numeric attributes and yields a predictive model built from a reduced set of classification rules aimed at reducing credit risk. With fewer rules, credit analysts need less time to reach a decision, which also results in better customer service. The proposed methodology was applied to two databases from the UCI repository and two real databases from Ecuadorian banks that grant various types of credit. The results obtained were satisfactory. Finally, we discuss our conclusions and suggest lines of future research.