Browsing by Author "Saada, Adam"

  • Saada, Adam (2018)
    Logistic regression has been the most common credit scoring model for several decades. The purpose of a credit scoring model is to distinguish good applicants from bad ones, so that consumer credit is granted to people who are likely to repay it. In Finland, household indebtedness has increased while wage growth has stagnated. In addition to mortgages, indebtedness has grown because of the rising number of consumer credit loans. Consumer credit usually takes the form of unsecured loans, which several financial institutions provide quickly and flexibly. Consumer credit is considered one of the major causes of default. Systemic risks have been avoided so far, but the growing number of customers and the fierce competition in the sector can bring new risks that should be anticipated, as insolvent customers cause losses for financial institutions. Developing and deploying new credit scoring models is one of the best ways to hedge against default risk. The prediction accuracy and performance of tree-based credit scoring models have been studied previously, and in many cases tree-based algorithms have performed better than traditional statistical models such as logistic regression. This master's thesis compares classical logistic regression with these tree-based algorithms. The best-known tree-based algorithms were chosen: random forest, Discrete AdaBoost, Real AdaBoost, LogitBoost, Gentle AdaBoost and Gradient Boosting. These methods use a decision tree as the base learner but differ in their iterative processes. The data, gathered from a medium-sized Finnish financial company, consist of customers' personal information and their payment behavior in sales finance. It is important to compare how the different models predict insolvency in the light of different test statistics. In this thesis, the best-performing models are logistic regression and the Gradient Boosting algorithm.
Based on this research, developing a credit scoring model on the Gradient Boosting algorithm is recommended. This algorithm highlights different explanatory variables than logistic regression does, and these variables can better explain the causes of insolvency. The results are robust and plausible, because the different tests lead to similar conclusions.
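
To illustrate what it means for a boosting method to use a tree as its base learner and to iterate over it, the following is a minimal sketch of Discrete AdaBoost with one-split trees (decision stumps). It is not the thesis's implementation: the ten-point data set and the helper names (`fit_stump`, `adaboost`, `training_errors`) are hypothetical and chosen only for illustration, and labels are assumed to be coded as +1 and -1.

```python
import math

def fit_stump(X, y, w):
    """Find the one-split tree (feature, threshold, polarity) with the lowest weighted error."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            for pol in (1, -1):
                pred = [pol if row[j] <= thr else -pol for row in X]
                err = sum(wi for wi, p, yi in zip(w, pred, y) if p != yi)
                if best is None or err < best[3]:
                    best = (j, thr, pol, err)
    return best

def stump_predict(stump, row):
    j, thr, pol, _ = stump
    return pol if row[j] <= thr else -pol

def adaboost(X, y, rounds):
    """Discrete AdaBoost: reweight the observations after each stump is fitted."""
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        stump = fit_stump(X, y, w)
        err = min(max(stump[3], 1e-10), 1 - 1e-10)  # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)     # weight of this stump in the vote
        ensemble.append((alpha, stump))
        # Misclassified observations gain weight, correctly classified ones lose weight.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for wi, row, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def predict(ensemble, row):
    score = sum(alpha * stump_predict(stump, row) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1

def training_errors(ensemble, X, y):
    return sum(predict(ensemble, row) != yi for row, yi in zip(X, y))

# Hypothetical toy data: one feature, labels +1 ("good") and -1 ("bad").
X = [[v] for v in range(10)]
y = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]

ensemble = adaboost(X, y, rounds=3)
print(training_errors(adaboost(X, y, rounds=1), X, y))
print(training_errors(ensemble, X, y))
```

On this toy set a single stump cannot separate the classes, while the weighted vote of three stumps can. The other boosting variants named in the abstract keep the same loop structure and differ mainly in how the stump weight and the observation weights are updated.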