You have taken a role as a data scientist at a major microfinance institution. You will work as a credit analyst in charge of approving loans for customers. You quickly notice that the institution's old methods for classifying loan applications fail often: many approved loans are never repaid, causing serious financial problems for the institution and its clients. The CEO knows that change is necessary and has promised to devote resources to improving the process.
As a data scientist, you know that you can use machine learning to design a model that classifies loans more reliably. By leveraging features like credit history, income, employment length, and loan intent, you can predict which applicants are most likely to repay their loans. This improved classification system will not only reduce the institution's financial risks but also enable it to extend credit responsibly to underserved communities, fulfilling its mission of financial inclusion.
Goal: Build a model that accurately predicts loan approvals.
The dataset for this competition (both train and test) was generated from a deep learning model trained on the Loan Approval Prediction dataset.
id: Unique identifier
person_age: Person age
person_income: Person income
person_home_ownership: Person home ownership
person_emp_length: Person employment length
loan_intent: Loan intent
loan_grade: Loan grade
loan_amnt: Loan amount
loan_int_rate: Loan interest rate
loan_percent_income: Loan percent income
cb_person_default_on_file: CB person default on file
cb_person_cred_hist_length: CB person credit history length
loan_status: Loan status (target variable)
Data Source: Kaggle Playground Series S4E10
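The column list above can be turned into a small schema check before any modelling. The sketch below is illustrative only: the split into numeric and categorical columns is an assumption based on the column names, the helper `split_features_target` is hypothetical, and the two rows (including category values such as "RENT" or "EDUCATION") are made up for demonstration rather than taken from the dataset.

```python
import pandas as pd

# Column names come from the data dictionary above.
# Which columns are numeric and which are categorical is an assumption.
ID_COL = "id"
TARGET_COL = "loan_status"
NUMERIC_COLS = [
    "person_age", "person_income", "person_emp_length", "loan_amnt",
    "loan_int_rate", "loan_percent_income", "cb_person_cred_hist_length",
]
CATEGORICAL_COLS = [
    "person_home_ownership", "loan_intent", "loan_grade",
    "cb_person_default_on_file",
]


def split_features_target(df):
    """Verify that every documented column is present, then return (X, y)."""
    expected = [ID_COL, *NUMERIC_COLS, *CATEGORICAL_COLS, TARGET_COL]
    missing = [c for c in expected if c not in df.columns]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    X = df[NUMERIC_COLS + CATEGORICAL_COLS]  # the id column is dropped on purpose
    y = df[TARGET_COL]
    return X, y


# Two made-up rows, for illustration only (not real data).
toy = pd.DataFrame({
    "id": [0, 1],
    "person_age": [25, 40],
    "person_income": [30000, 85000],
    "person_home_ownership": ["RENT", "OWN"],
    "person_emp_length": [2.0, 10.0],
    "loan_intent": ["EDUCATION", "MEDICAL"],
    "loan_grade": ["B", "A"],
    "loan_amnt": [5000, 12000],
    "loan_int_rate": [11.5, 7.2],
    "loan_percent_income": [0.17, 0.14],
    "cb_person_default_on_file": ["N", "N"],
    "cb_person_cred_hist_length": [3, 12],
    "loan_status": [0, 1],
})

X, y = split_features_target(toy)
print(X.shape)     # (2, 11)
print(y.tolist())  # [0, 1]
```

With the real training file, the same helper could be applied to the frame returned by `pd.read_csv`; the file name is not given here, so it is left out.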
Your task is to build a classification model to predict loan approvals.
Python Libraries: Pandas, NumPy, scikit-learn, Matplotlib/Seaborn.
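As a starting point, the libraries listed above could be combined into a baseline classifier along the following lines. This is a minimal sketch, not a prescribed solution: the data is synthetic (generated with a made-up labelling rule so the example runs on its own), only a subset of the columns is used, logistic regression is just one reasonable first model, and ROC AUC is an assumed metric since the brief does not name one.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

numeric_cols = ["person_income", "loan_amnt", "loan_int_rate", "loan_percent_income"]
categorical_cols = ["person_home_ownership", "loan_grade"]

# Synthetic stand-in data (assumption): replace with the real training set.
rng = np.random.default_rng(0)
n = 500
income = rng.uniform(20_000, 120_000, n)
amount = rng.uniform(1_000, 30_000, n)
df = pd.DataFrame({
    "person_income": income,
    "loan_amnt": amount,
    "loan_int_rate": rng.uniform(5, 20, n),
    "loan_percent_income": amount / income,
    "person_home_ownership": rng.choice(["RENT", "OWN", "MORTGAGE"], n),
    "loan_grade": rng.choice(list("ABCD"), n),
})
# Made-up labelling rule so the toy target is learnable.
noise = rng.normal(0, 0.1, n)
df["loan_status"] = (df["loan_percent_income"] + noise > 0.3).astype(int)

X = df[numeric_cols + categorical_cols]
y = df["loan_status"]
X_train, X_valid, y_train, y_valid = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Scale numeric features and one-hot encode categorical ones.
preprocess = ColumnTransformer([
    ("num", StandardScaler(), numeric_cols),
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical_cols),
])
model = Pipeline([
    ("preprocess", preprocess),
    ("clf", LogisticRegression(max_iter=1000)),
])
model.fit(X_train, y_train)

# Score the held-out split with predicted probabilities of class 1.
valid_proba = model.predict_proba(X_valid)[:, 1]
auc = roc_auc_score(y_valid, valid_proba)
print(f"validation ROC AUC: {auc:.3f}")
```

Keeping preprocessing inside the `Pipeline` means the scaler and encoder are fitted on the training split only, which avoids leaking validation information into the model.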