The problem I am examining is credit score rating. Credit scores matter because they can affect outcomes such as buying a house, taking out a loan, and employment. Advances in AI in the financial services industry offer a unique opportunity to improve fairness in credit scoring. Methods and algorithms based on machine learning and data science make scalable credit risk rating possible: data engineers can build in scalability to boost platform capacity, and data scientists can tune and fit models to improve credit scoring for both the system and individual users.
Variable name: Credit Score. Data type: numerical. Description: given a person's credit-related information, we will build a machine learning model that classifies the credit score. Scale: 300 to 850. Interpretation: higher values on the scale indicate lower risk and higher creditworthiness.
Variable name: Age. Data type: numerical. Description: used to examine the relationship between age and credit score. Scale: number of years. Range: positive integers. Interpretation: the older the person is, the higher their creditworthiness might be, owing to their job experience.
Variable name: Income. Data type: numerical. Description: the annual income per person. Scale: US dollars. Interpretation: a higher income might mean greater financial stability and therefore a higher credit score.
Variable names: Number of Bank Accounts and Number of Credit Cards. Data type: numerical. Description: the total number of bank accounts and credit cards a person has. Scale: count. Range: positive values. Interpretation: these counts could affect the credit rating, since more credit cards mean more money to pay back. Whether that is good or bad depends on how much the person is able to repay.
Variable name: Interest Rate. Data type: numerical. Description: the interest rate on the credit account. Scale: percentage. Range: positive. Interpretation: a higher interest rate means larger payments, which may lead to higher risk and a potentially higher cost.
Variable name: Number of Loans. Data type: numerical. Description: the number of loans a person has. Scale: count. Range: positive integers. Interpretation: a higher number of loans may be associated with a higher credit score, since it may indicate that the person is able to pay off loans within a period of time.
Variable name: Type of Loan. Data type: categorical. Description: the type of loan a person has. Scale: depends. Range: depends. Interpretation: the type of loan may indicate a person's financial stability.
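The scales and ranges listed above can be written down as simple checks on a single record. The following is a minimal sketch and not part of the original analysis: the `ApplicantRecord` class, its field names, and the `range_violations` function are hypothetical names introduced here for illustration. Treating zero as a valid value for income, counts, and interest rate is also an assumption, since the descriptions only say "positive".

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ApplicantRecord:
    """Hypothetical container for the numerical variables described above."""
    credit_score: int
    age: int
    annual_income: float   # US dollars
    num_bank_accounts: int
    num_credit_cards: int
    interest_rate: float   # percent
    num_loans: int


def range_violations(record: ApplicantRecord) -> List[str]:
    """Return the names of fields that fall outside the documented ranges."""
    checks = {
        "credit_score": 300 <= record.credit_score <= 850,
        "age": record.age > 0,
        # Assumption: zero is accepted for the fields below.
        "annual_income": record.annual_income >= 0,
        "num_bank_accounts": record.num_bank_accounts >= 0,
        "num_credit_cards": record.num_credit_cards >= 0,
        "interest_rate": record.interest_rate >= 0,
        "num_loans": record.num_loans >= 0,
    }
    return [name for name, ok in checks.items() if not ok]


# Made-up values, used only to show the call.
example = ApplicantRecord(900, 23, 19114.12, 3, 4, 3.0, -1)
print(range_violations(example))
```

A check like this only flags values that contradict the stated ranges. It says nothing about whether a value is plausible within its range.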
# Load the credit data into a DataFrame and preview the first five rows
import pandas as pd
pdDf = pd.read_csv('test.csv')
pdDf.head()
| | ID | Customer_ID | Month | Name | Age | SSN | Occupation | Annual_Income | Monthly_Inhand_Salary | Num_Bank_Accounts | ... | Num_Credit_Inquiries | Credit_Mix | Outstanding_Debt | Credit_Utilization_Ratio | Credit_History_Age | Payment_of_Min_Amount | Total_EMI_per_month | Amount_invested_monthly | Payment_Behaviour | Monthly_Balance |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 0x160a | CUS_0xd40 | September | Aaron Maashoh | 23 | 821-00-0265 | Scientist | 19114.12 | 1824.843333 | 3 | ... | 2022.0 | Good | 809.98 | 35.030402 | 22 Years and 9 Months | No | 49.574949 | 236.64268203272135 | Low_spent_Small_value_payments | 186.26670208571772 |
1 | 0x160b | CUS_0xd40 | October | Aaron Maashoh | 24 | 821-00-0265 | Scientist | 19114.12 | 1824.843333 | 3 | ... | 4.0 | Good | 809.98 | 33.053114 | 22 Years and 10 Months | No | 49.574949 | 21.465380264657146 | High_spent_Medium_value_payments | 361.44400385378196 |
2 | 0x160c | CUS_0xd40 | November | Aaron Maashoh | 24 | 821-00-0265 | Scientist | 19114.12 | 1824.843333 | 3 | ... | 4.0 | Good | 809.98 | 33.811894 | NaN | No | 49.574949 | 148.23393788500925 | Low_spent_Medium_value_payments | 264.67544623342997 |
3 | 0x160d | CUS_0xd40 | December | Aaron Maashoh | 24_ | 821-00-0265 | Scientist | 19114.12 | NaN | 3 | ... | 4.0 | Good | 809.98 | 32.430559 | 23 Years and 0 Months | No | 49.574949 | 39.08251089460281 | High_spent_Medium_value_payments | 343.82687322383634 |
4 | 0x1616 | CUS_0x21b1 | September | Rick Rothackerj | 28 | 004-07-5839 | _______ | 34847.84 | 3037.986667 | 2 | ... | 5.0 | Good | 605.03 | 25.926822 | 27 Years and 3 Months | No | 18.816215 | 39.684018417945296 | High_spent_Large_value_payments | 485.2984336755923 |
5 rows × 27 columns
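The preview shows several values that would need cleaning before modeling: an age stored as `24_`, an occupation stored as `_______`, a credit history written as text (`22 Years and 9 Months`), and some `NaN` entries. The helpers below are a rough sketch of one way to normalize such values, not the cleaning procedure used in this project. The function names are hypothetical, and treating underscore-only strings as missing is an assumption about what those placeholders mean.

```python
import math
import re
from typing import Optional


def _is_missing(raw: object) -> bool:
    """True for None and for float NaN (how pandas marks missing values)."""
    return raw is None or (isinstance(raw, float) and math.isnan(raw))


def clean_age(raw: object) -> Optional[int]:
    """Parse ages such as 23 or '24_'; return None if the value does not fit."""
    if _is_missing(raw):
        return None
    match = re.fullmatch(r"(\d+)_*", str(raw).strip())
    return int(match.group(1)) if match else None


def clean_occupation(raw: object) -> Optional[str]:
    """Treat empty or underscore-only strings as missing (an assumption)."""
    if _is_missing(raw):
        return None
    text = str(raw).strip()
    if text == "" or set(text) == {"_"}:
        return None
    return text


def history_to_months(raw: object) -> Optional[int]:
    """Convert text like '22 Years and 9 Months' to a total number of months."""
    if _is_missing(raw):
        return None
    match = re.fullmatch(r"(\d+) Years? and (\d+) Months?", str(raw).strip())
    if match is None:
        return None
    years, months = map(int, match.groups())
    return years * 12 + months
```

With pandas, each helper could be applied to its column, for example `pdDf['Age'].map(clean_age)`. Values that do not match the expected pattern come back as `None`, so they can be reviewed instead of being silently coerced.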
The credit risk data can be used to evaluate the creditworthiness of borrowers and make more informed decisions about credit applications, interest rates, and credit limits. This helps address credit rating problems by reducing the risk of default or non-payment on loans and credit accounts.