Abstract
Credit risk is a serious economic threat to financial institutions and may result in irrecoverable consequences. Banks are exposed to a wide range of potential risks, from those tied to their budgetary and technological structure, to those relating to brand reputation, to those derived from the social and institutional environment. These risks are fundamentally distinct yet share a few crossing points that make them difficult to identify and isolate. Ideal control of the problem requires an accurate measurement approach. A practical and efficient solution for predicting risk in banks can be critical in reducing losses attributable to defective business procedures. Risk prediction tools can serve as instruments for assessing the evolution of credit operation risk and hence reduce the bank's exposure to insolvency. From the viewpoint of machine learning, the problem has typically been approached as a supervised classification problem. In the present study, the credit scoring problem is explored based on the data provided, using MATLAB software. The study compares Bayesian Networks and Artificial Neural Networks for predicting the recovered value of a credit operation. The implementation of the proposed intelligent systems incorporates tests and validation algorithms for the proposed model.
Keywords: Credit Risk, Algorithms, Neural Network, Bayesian, MATLAB
Introduction
A practical case is used to exhibit the applicability, efficiency, flexibility, and accuracy of data mining approaches in modeling ambiguous events related to the measurement of credit risk for financial institutions. Because of the level of technology associated with big data, growing computing power, and data availability, most lending institutions have been compelled to renew their business models (Davutyan & Ozar p.18). Credit risk forecasts, active loan processing, and monitoring of model reliability are vital for transparency and decision-making. In this paper, binary classifiers are built based on machine learning models to predict the probability of loan default, and the results are compared with those from an Artificial Neural Network regarding reliability and efficiency. The application of such algorithms has always raised numerous ethical questions (Thomas, Edelman & Crook p.11; Sun & Shenoy p.753; Stibor p.509).
Aside from the technical questions required to understand the algorithms, they call for reference to the many discussions regarding confidentiality issues raised by the use of personal data. Such problems continue to be addressed at conferences on artificial intelligence (Davis, Edelman & Gammerman p.51). The underlying concern has been the use of personal data and the fear that an algorithm may take decisional power away from a human being. These debates and questions are legitimate, and the paper focuses on the algorithms relevant to decision-making in the financial sector. An algorithm can be used to simplify a process, increase its fluidity, and quicken it (Steenackers & Goovaerts p.31; Spackman p.163; Smith & Warner p.161). Notwithstanding, algorithms are sets of code designed to attain defined objectives. In a recruitment process, for example, an algorithm can introduce discrimination against people based on their profiles. The same applies to loan provision, from an enterprise to a bank, where lending decisions are based on the algorithm used (Hand p.41). Therefore, it is critical to comprehend the underlying problems and establish ways to regulate the use of algorithms (Henley & Hand p.533).
The present article demonstrates that various algorithms can be used in parallel to address the issue at hand, which is loan provision (Hellwig p.742). There are multiple strategies for addressing the underlying objective: identifying the choice of features (or variables), the algorithm, and the criteria that provide a solution to the question. In the new Big Data and digital era, transparency is critical (Bradley p.43). The terms related to the field need to be ethical, clear, transparent, and well known. Strategies are necessary for training deep learning and machine learning algorithms on application data, and their use must be regulated to ensure accuracy. The focus of the paper is on credit risk scoring and the effect of distinct machine learning models on the identification of defaults by lenders (Sarkar & Sriram p.1475; Rosner p.11). Further, the stability of these models under different choices of variables or subsets is examined. Although the method banks use in deciding to award loans remains unclear, the application of classical linear models in the banking sector is well known. The transparent elastic approach is used as the benchmark, and its fit and decision rule are compared; a brief sketch of such a benchmark follows.
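The paper itself works in MATLAB and does not publish its benchmark code. As a hedged illustration, assuming the "transparent elastic approach" refers to an elastic-net penalized logistic regression, a minimal Python sketch of such a benchmark might look like this; the data here are synthetic placeholders, not the paper's dataset:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in: 200 applicants, 5 financial features, binary default label.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=200) > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Elastic-net penalty mixes L1 and L2 regularization; the 'saga' solver supports it.
clf = LogisticRegression(penalty="elasticnet", solver="saga",
                         l1_ratio=0.5, C=1.0, max_iter=5000)
clf.fit(X_train, y_train)
print("benchmark accuracy:", clf.score(X_test, y_test))
```

The fitted coefficients of such a linear benchmark remain directly readable, which is the transparency property the comparison relies on.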
1.1 Artificial Neural Network (ANN)
The study of Artificial Neural Networks can be traced back to Frank Rosenblatt (1958), who focused on the perceptron algorithm; those findings were later deployed in the development of smart automated systems and software (Hellwig p.415). The ANN has proved to be a promising and practical approach to outcome prediction, and the method received tremendous support following the development of machine learning. Odom and Sharda (1990) applied neural networks to the evaluation of credit risk. Initially, the network was based on the Hebb network, which aimed at improving the input vector, with the perceptron focused on increasing the accuracy of the model. The perceptron neural network was followed by back-propagation, developed by James McClelland and David E. Rumelhart. Backpropagation updates the network weights by propagating errors backward through the network, a process commonly known as neural processing. Interest later shifted toward deep learning, with Angelini et al. (2008) performing the first credit risk analysis that bank management used in the computation of capital requirements based on risk drivers. ANNs were likewise used in the calculation of the variables necessary to evaluate credit risk.
An Artificial Neural Network is a computational system or algorithm for studying problems, originally inspired by biology. ANNs entail processing algorithms that model the human brain (Hanley & McNeil p.29). The purpose of ANN research is to develop a computational system with relatively low computational cost and time. A range of tasks can be performed by ANNs, including classification, pattern matching, approximation, function optimization, data clustering, vector quantization, and so forth. Initially, ANNs were used to study the nervous system and the way the brain processes information. ANN properties include the following:
- The cycle or speed of implementation of ANN is in nanoseconds
- The processing time is rapid, and numerous operations can be performed simultaneously
- The complexity and size of an ANN are subject to the network design and the application in use
- The data in an ANN are stored in contiguous memory sites, which can be overloaded when the limit is exceeded
- Learning is the primary property of an ANN and comes in two forms: structure learning and parameter learning
Parameter learning updates the weights linked to the network, whereas structure learning concentrates on the network topology and checks whether there are any changes within the framework. Learning can be supervised or unsupervised, as well as reinforcement learning, which depends on critic information (Revsine, Collins & Johnson p.22; Raymond p.18; Quinlan p.90). Like supervised learning, reinforcement learning relies on an activation function for calculating the output. The function is applied to the total input to determine the total output of the network (Galindo & Tamayo p.133). There are numerous types of activation function, including the binary step function, the hyperbolic tangent function, the bipolar sigmoid function, and the identity function.
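To make the activation functions above concrete, here is a small illustrative sketch in Python/NumPy (rather than the paper's MATLAB) that applies each function to a neuron's total (net) input; the weights, inputs, and bias are made-up values:

```python
import numpy as np

def binary_step(x):
    # Outputs 1 when the net input reaches the threshold (here 0), else 0.
    return np.where(x >= 0, 1.0, 0.0)

def bipolar_sigmoid(x):
    # Smoothly squashes the net input into the interval (-1, 1).
    return (1 - np.exp(-x)) / (1 + np.exp(-x))

def identity(x):
    # Passes the net input through unchanged.
    return x

# Total (net) input of one neuron: weighted sum of inputs plus a bias.
weights = np.array([0.4, -0.2, 0.7])
inputs  = np.array([1.0, 0.5, -1.0])
bias    = 0.1
net = weights @ inputs + bias

for f in (binary_step, np.tanh, bipolar_sigmoid, identity):
    print(f.__name__, f(net))
```

Note that the bipolar sigmoid equals tanh(x/2), so the two differ only in the steepness of the squashing.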
1.2 Naive Bayesian
The Naive Bayesian approach was initially studied through the application of Bayes' theorem. The approach is a statistical classifier in which every record is allocated to a class. The Bayesian classifier works with joint conditional probability distributions, allowing class-conditional independencies between variables, with a graphical model used to depict the underlying relationships (Provost, Fawcett & Kohavi p.453; Pang, Wang & Bai p.71; Palepu, Healy & Bernard p.10). The random variables can be either continuous or discrete. The attributes within the data can be observed or hidden variables that formulate the relationship. Every arc in the directed acyclic graph represents a probabilistic dependence, and each variable is conditionally independent of its non-descendants. Bayes' theorem is given as follows:
P(A|B) = P(B|A) P(A) / P(B)
Here, for a sample A and a hypothesis h, the probability P(h|A) that h holds given A follows Bayes' theorem, as stated mathematically in the equation above, and the classification that maximizes it is considered the optimal one (Okan p.43; Ohlson p.109; Odom & Sharda p.19). When the topology of the network and the data are given for the many variables contained in a sample, training is straightforward: the variables are used to determine the entries of the conditional probability table (CPT). The approach is highly effective in terms of computational cost and is suited to problems where strong relationships between variables exist (Moonasar p.20; Mitchell p.81; Mileris p.1084; Merton p.470). The approach compares favorably with the Support Vector Machine (SVM) and is also applicable to medical diagnosis. Studies based on SVM have, in turn, proved promising and useful in credit risk assessment relative to other algorithms, for example PSO, neural networks, and other machine learning algorithms. Ahn et al., among others, have proposed several SVM-based examinations of credit risk assessment problems, building on foundations that reach back to McCulloch and Pitts. The outcomes have been promising, and their performance is comparable to that of genetic algorithms. Other notable methodologies integrate fuzzy logic with SVM: the fuzzy approach was tested using fuzzy sets, while Huang et al. focused on least-squares SVM. Moreover, Matoussi and Krichene used PSO to select the optimal parameters for SVM. An equivalent study by these scholars demonstrated that the SVM classifier could replace other algorithms in terms of speed, reliability, and complexity.
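As a minimal worked illustration of the Bayes rule above in the credit setting, consider updating a default probability after observing a late-payment flag; the prior and likelihood figures below are hypothetical, not taken from the paper's dataset:

```python
# Hypothetical prior and likelihoods for illustration only.
p_default = 0.10                 # P(A): prior probability of default
p_flag_given_default = 0.60      # P(B|A): late-payment flag among defaulters
p_flag_given_ok = 0.05           # P(B|not A): flag among non-defaulters

# Law of total probability gives P(B), the overall chance of seeing the flag.
p_flag = (p_flag_given_default * p_default
          + p_flag_given_ok * (1 - p_default))

# Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B).
p_default_given_flag = p_flag_given_default * p_default / p_flag
print(f"P(default | flag) = {p_default_given_flag:.3f}")  # ~0.571
```

A naive Bayes classifier repeats exactly this update over many attributes, multiplying the per-attribute likelihoods under the conditional-independence assumption.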
Method
2.1 Research Data
The section describes the dataset and the attributes used in the exploration to forecast credit risk. The dataset was collected from a financial institution, and an aggregate of 200 records was accessed and retrieved. As some attribute values were missing, the dataset required preprocessing: the missing attributes were replaced using the global mean approach. The dataset was processed further into a binary set comprising 1s and 2s, where the output represented a label rather than a value, with 1 denoting credit risk and 2 denoting security (Fawcett p.861). The raw output fell in the range between 0 and 1, and the outcome was marked 2 when the result ranged between 0 and 0.75; otherwise it was considered as...
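A minimal sketch of the preprocessing just described, in Python/NumPy rather than the paper's MATLAB; the records and scores are made-up placeholders, and the reading of the 0.75 cut-off follows the text above but is an assumption:

```python
import numpy as np

# Hypothetical raw records; np.nan marks the missing attributes.
data = np.array([[0.20, np.nan, 0.90],
                 [0.55, 0.40,   np.nan],
                 [np.nan, 0.80, 0.30]])

# Replace each missing attribute with the global mean of its column.
col_means = np.nanmean(data, axis=0)
rows, cols = np.where(np.isnan(data))
data[rows, cols] = col_means[cols]

# Hypothetical model outputs in [0, 1], one score per record.
scores = np.array([0.10, 0.62, 0.91])

# Label coding from the text: 2 (security) when the score falls in
# [0, 0.75); otherwise 1 (credit risk).
labels = np.where(scores < 0.75, 2, 1)
print(data)
print(labels)
```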