Machine Learning · Classification

Credit Risk Prediction

End-to-end ML system for predicting borrower default risk using financial, demographic, and credit-history attributes. Binary classification with structured preprocessing, probability estimation, explainability signals, and decision-oriented risk bands for lending workflows.

Credit Risk · Binary Classification · Probability Scoring · Explainability · Case Study
0.941 ROC-AUC for ranking high-risk applicants
0.973 Precision for reducing false high-risk flags
0.820 F1 score balancing precision and recall

Overview

Why this project matters

Credit risk models convert borrower information into a measurable default probability. That makes lending decisions more consistent, scalable, and explainable than manual review alone.

Supports faster underwriting by screening applicants with a standardized risk signal.
Enables threshold-based decision rules such as auto-approve, review, or decline.
Connects classification output to explainability and portfolio-level risk monitoring.

Stack

Implementation profile

The project is presented as a production-oriented scoring workflow rather than a notebook-only experiment.

Python: Training, validation, inference logic
Scikit-learn: Pipelines, preprocessing, classifiers
FastAPI-ready: Live risk scoring interface

Business context

Credit risk in lending

Lending institutions need to estimate the probability that a borrower will default. A calibrated probability supports underwriting, pricing, manual review policies, and aggregation of portfolio-level risk.

Why probability matters

Risk probability supports decision thresholds, pricing by risk tier, and downstream portfolio analytics beyond a simple yes or no label.

Explainability

Top drivers and feature importance help internal stakeholders and regulators understand why an applicant was flagged as higher risk.

Use cases

Typical applications include applicant screening, portfolio stress testing, pricing logic, and prioritization for manual review.

Dataset

Borrower and loan attributes

The dataset combines demographic, loan, and credit-history features commonly seen in real scoring pipelines.

Target: loan_status where 0 means lower risk and 1 means higher default risk.
Demographic: person_age, person_income, person_emp_length, person_home_ownership.
Loan characteristics: loan_intent, loan_grade, loan_amnt, loan_int_rate, loan_percent_income.
Credit history: cb_person_default_on_file, cb_person_cred_hist_length.
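
The field list above can be pinned down as a typed record. This is a minimal sketch: the field names come from the dataset, but the types, the choice of which fields may be missing, and every sample value are assumptions made for illustration.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass(frozen=True)
class LoanApplication:
    """One borrower record. Names follow the dataset; types are assumed."""
    # Demographic
    person_age: int
    person_income: float
    person_emp_length: Optional[float]  # assumed nullable, imputed downstream
    person_home_ownership: str
    # Loan characteristics
    loan_intent: str
    loan_grade: str
    loan_amnt: float
    loan_int_rate: Optional[float]  # assumed nullable, imputed downstream
    loan_percent_income: float
    # Credit history
    cb_person_default_on_file: str
    cb_person_cred_hist_length: int


# The target loan_status is deliberately not a feature.
FEATURE_NAMES = [f.name for f in fields(LoanApplication)]

# Hypothetical applicant; every value here is made up.
example = LoanApplication(
    person_age=29, person_income=52000.0, person_emp_length=4.0,
    person_home_ownership="RENT", loan_intent="EDUCATION", loan_grade="B",
    loan_amnt=8000.0, loan_int_rate=11.2, loan_percent_income=0.15,
    cb_person_default_on_file="N", cb_person_cred_hist_length=6,
)
```
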
Target framing

Positive class definition

The positive class is the event we want to flag: higher credit risk or potential default. This matters because precision, recall, and threshold policy all depend on which class is treated as positive.

Task: Binary classification
Positive class: Higher risk = 1
Output: Probability + class
Use: Decision thresholds

Problem framing

From applicant data to default probability

The system predicts whether an applicant belongs to the higher-risk group, then turns that score into business-usable probability bands such as low, medium, or high risk.

01

Collect borrower features

Use structured demographic, loan, and credit-history inputs from the application and bureau-style signals.

02

Estimate higher-risk probability

Generate a calibrated probability in the range [0, 1] rather than only a raw class label.

03

Apply decision thresholds

Translate scores into operational actions such as approve, manual review, or decline.

04

Expose reasons

Provide top risk factors and importance signals to support explainability and communication.
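
Step 03 above can be sketched as a small mapping from probability to band and action. The cut-offs 0.20 and 0.60 and the names `risk_band`, `REVIEW_AT`, and `DECLINE_AT` are hypothetical; real thresholds would come from credit policy and a cost analysis, not from this page.

```python
from typing import Tuple

# Hypothetical cut-offs, chosen only to make the example concrete.
REVIEW_AT = 0.20
DECLINE_AT = 0.60


def risk_band(probability: float,
              review_at: float = REVIEW_AT,
              decline_at: float = DECLINE_AT) -> Tuple[str, str]:
    """Map a default probability to a (band, action) pair."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError(f"probability must be in [0, 1], got {probability}")
    if not review_at < decline_at:
        raise ValueError("review_at must be below decline_at")
    if probability < review_at:
        return "low", "approve"
    if probability < decline_at:
        return "medium", "manual_review"
    return "high", "decline"


for p in (0.05, 0.35, 0.90):
    print(p, risk_band(p))
```

Keeping the thresholds as parameters rather than constants buried in the function makes it easier to review a policy change separately from a model change.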

Data preprocessing

Structured and deterministic input pipeline

Reliable risk scoring depends on consistent feature handling. The preprocessing layer validates the input schema, imputes missing values, encodes categories, and scales numeric features.

Schema validation

Ensure required fields such as person_*, loan_*, and cb_* exist in the expected structure. The target loan_status is required for training data only, since it is not known at scoring time.

Missing values

Impute numeric features with median and categorical features with most-frequent values.

Categorical encoding

Use OneHotEncoder for person_home_ownership, loan_intent, loan_grade, and the default flag cb_person_default_on_file.

Scaling

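Apply StandardScaler to numeric features. Scaling mainly helps the logistic regression baseline converge; tree-based candidates are largely insensitive to it, but keeping the step gives every candidate the same pipeline.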
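
The four preprocessing steps above could be wired together with scikit-learn roughly as follows. This is a sketch, not the project's actual code: the helper names `validate_schema` and `build_preprocessor`, the `handle_unknown="ignore"` setting, and the four demo rows are all assumptions.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

NUMERIC = ["person_age", "person_income", "person_emp_length", "loan_amnt",
           "loan_int_rate", "loan_percent_income", "cb_person_cred_hist_length"]
CATEGORICAL = ["person_home_ownership", "loan_intent", "loan_grade",
               "cb_person_default_on_file"]


def validate_schema(frame: pd.DataFrame) -> None:
    """Fail fast if any expected feature column is absent."""
    missing = [c for c in NUMERIC + CATEGORICAL if c not in frame.columns]
    if missing:
        raise ValueError(f"missing required columns: {missing}")


def build_preprocessor() -> ColumnTransformer:
    numeric = Pipeline([
        ("impute", SimpleImputer(strategy="median")),
        ("scale", StandardScaler()),
    ])
    categorical = Pipeline([
        ("impute", SimpleImputer(strategy="most_frequent")),
        # Assumption: categories unseen in training encode as all zeros.
        ("encode", OneHotEncoder(handle_unknown="ignore")),
    ])
    return ColumnTransformer([
        ("num", numeric, NUMERIC),
        ("cat", categorical, CATEGORICAL),
    ])


# Four made-up rows, one with a missing employment length.
demo = pd.DataFrame({
    "person_age": [22, 35, 41, 28],
    "person_income": [35000, 80000, 60000, 45000],
    "person_emp_length": [float("nan"), 10.0, 6.0, 2.0],
    "person_home_ownership": ["RENT", "MORTGAGE", "OWN", "RENT"],
    "loan_intent": ["EDUCATION", "MEDICAL", "PERSONAL", "EDUCATION"],
    "loan_grade": ["B", "A", "C", "B"],
    "loan_amnt": [5000, 12000, 20000, 9000],
    "loan_int_rate": [11.0, 7.5, 13.5, 10.5],
    "loan_percent_income": [0.14, 0.15, 0.33, 0.20],
    "cb_person_default_on_file": ["N", "N", "Y", "N"],
    "cb_person_cred_hist_length": [3, 9, 12, 5],
})

validate_schema(demo)
features = build_preprocessor().fit_transform(demo)
print(features.shape)
```

Fitting the imputers, encoder, and scaler inside one transformer means the statistics learned on training data are reused unchanged at inference.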

Feature groups

Risk signal categories

The model combines multiple signal families so that default risk is not inferred from a single isolated feature.

Demographic

person_age, person_income, person_emp_length, person_home_ownership

Loan characteristics

loan_intent, loan_grade, loan_amnt, loan_int_rate, loan_percent_income

Credit history

cb_person_default_on_file, cb_person_cred_hist_length

Interpretation

Why these groups matter

Together these features approximate repayment capacity, loan burden, borrowing context, and historical credit behavior.

Income and percent-of-income reflect affordability and debt burden pressure.
Interest rate and grade often capture lender-side assessment of risk profile.
Previous default and credit history length provide behavior and track-record context.

Modeling strategy

Balancing interpretability and performance

A simple baseline such as logistic regression establishes a performance floor, while tree-based models often improve ROC-AUC and produce native feature-importance signals.

Candidate models include Logistic Regression, Random Forest, and Gradient Boosting.
Model selection is guided by ROC-AUC, precision, recall, F1, and score reliability.
The final pipeline is saved for inference so preprocessing and modeling remain consistent in production.
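
A comparison of the three candidates might look like the sketch below. It runs on synthetic data from `make_classification`, so the scores it prints say nothing about the real project. The class balance, hyperparameters, and the use of `joblib` for persistence are assumptions; the source only says the pipeline is saved.

```python
import os
import tempfile

import joblib
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the credit data (assumed ~20% positives).
X, y = make_classification(n_samples=2000, n_features=11, n_informative=6,
                           weights=[0.8, 0.2], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "gradient_boosting": GradientBoostingClassifier(random_state=0),
}

auc_by_model = {}
for name, model in candidates.items():
    model.fit(X_train, y_train)
    # Column 1 of predict_proba is P(class == 1), the higher-risk class.
    scores = model.predict_proba(X_test)[:, 1]
    auc_by_model[name] = roc_auc_score(y_test, scores)

best_name = max(auc_by_model, key=auc_by_model.get)

# Persist the winner and reload it, as an inference service would.
path = os.path.join(tempfile.mkdtemp(), "risk_model.joblib")
joblib.dump(candidates[best_name], path)
reloaded = joblib.load(path)
```

In the real workflow the saved object would be the full pipeline (preprocessing plus classifier), not the bare model used here on already-numeric synthetic features.
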
Model profile

Operational choices

Baseline: Logistic Regression
Candidates: RF and Gradient Boosting
Selection: ROC-AUC + reliability
Split: Stratified train/test
Output: Probability + risk band
Serving: Pipeline-based inference

Evaluation metrics

Classification quality from multiple perspectives

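Credit risk models should not be evaluated with a single metric. High precision reduces false alarms, high recall avoids missing risky applicants, and ROC-AUC measures ranking quality across thresholds. At the reported operating point the model favors precision (0.973) over recall (0.708), so about 29% of risky applicants are not flagged.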

ROC-AUC

0.941 Ranking quality across thresholds for imbalanced binary classification.

Precision

0.973 Of predicted high-risk cases, how many were actually risky.

Recall

0.708 Of all risky applicants, how many the model successfully flags.

F1 Score

0.820 Single summary metric balancing precision and recall.
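
The precision, recall, and F1 values above can be reproduced from the confusion-matrix counts reported in the ROC and confusion section of this page (915 true positives, 25 false positives, 378 false negatives, 4575 true negatives). The function name `classification_metrics` is illustrative.

```python
from typing import Dict


def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Precision, recall, and F1 for the positive (higher-risk) class."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


metrics = classification_metrics(tp=915, fp=25, fn=378, tn=4575)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
# precision: 0.973
# recall: 0.708
# f1: 0.820
```

True negatives do not enter any of the three formulas, which is why these metrics stay informative when the lower-risk class dominates.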

Feature importance

Top drivers of default risk

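Feature-importance analysis ranks the signals the model relies on most. Importance measures how much a feature contributes to predictions, not the direction of its effect. Values near zero or slightly negative, as for the two credit-history features, suggest no measurable contribution.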

loan_percent_income
9.36%
person_home_ownership
5.69%
loan_int_rate
5.64%
loan_intent
4.32%
person_income
4.18%
loan_grade
3.96%
person_emp_length
1.49%
loan_amnt
0.43%
person_age
0.39%
cb_person_cred_hist_length
-0.04%
cb_person_default_on_file
-0.01%
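
The source does not state which importance method produced the table above. Slightly negative values are typical of permutation importance, where a feature is shuffled and the resulting drop in score is recorded, so the sketch below illustrates that technique under that assumption. It uses synthetic data and placeholder feature names, so its numbers are unrelated to the table.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Synthetic data: 3 informative features, 3 pure-noise features.
X, y = make_classification(n_samples=1000, n_features=6, n_informative=3,
                           n_redundant=0, random_state=0)
feature_names = [f"feature_{i}" for i in range(X.shape[1])]  # placeholders

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# Mean drop in held-out ROC-AUC when each feature is shuffled.
result = permutation_importance(model, X_test, y_test, scoring="roc_auc",
                                n_repeats=10, random_state=0)

ranked = sorted(zip(feature_names, result.importances_mean),
                key=lambda pair: pair[1], reverse=True)
for name, drop in ranked:
    print(f"{name:<12} {drop:+.4f}")
```

A noise feature can score slightly below zero when shuffling it happens to improve the held-out score by chance, which is one plausible reading of the two negative rows above.
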
ROC and confusion

Threshold behavior and class outcomes

The ROC curve summarizes ranking power across thresholds, while the confusion matrix shows what happens once a specific decision threshold is chosen.
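
The difference between the two views can be shown with a few lines of plain Python. The eight labels and scores are invented toy data, and the function names are illustrative. `roc_points` sweeps every threshold, while `confusion_at` fixes one.

```python
from typing import List, Sequence, Tuple


def confusion_at(y_true: Sequence[int], scores: Sequence[float],
                 threshold: float) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) when scores >= threshold are flagged as class 1."""
    tp = fp = fn = tn = 0
    for label, score in zip(y_true, scores):
        flagged = score >= threshold
        if flagged and label == 1:
            tp += 1
        elif flagged:
            fp += 1
        elif label == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def roc_points(y_true: Sequence[int],
               scores: Sequence[float]) -> List[Tuple[float, float]]:
    """(FPR, TPR) pairs from the strictest threshold to the loosest."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp, fp, _, _ = confusion_at(y_true, scores, threshold)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points: Sequence[Tuple[float, float]]) -> float:
    """Trapezoidal area under the (FPR, TPR) curve."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Toy data, made up for illustration.
y_true = [0, 0, 1, 0, 1, 1, 0, 1]
scores = [0.05, 0.20, 0.35, 0.40, 0.60, 0.70, 0.80, 0.95]

print(auc(roc_points(y_true, scores)))   # 0.75
print(confusion_at(y_true, scores, 0.50))  # (3, 1, 1, 3)
print(confusion_at(y_true, scores, 0.75))  # (1, 1, 3, 3)
```

Raising the threshold from 0.50 to 0.75 leaves the AUC untouched but trades two true positives for two extra false negatives, which is the kind of shift a threshold policy has to weigh.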

ROC curve

[Figure: ROC curve, TPR vs FPR, AUC = 0.941. Series: model ROC and random baseline.]

Confusion matrix

True Negative: 4575
False Positive: 25
False Negative: 378
True Positive: 915

Business interpretation

Turning model output into lending decisions

A useful credit model does more than classify. It supports clear operational rules and stakeholder communication.

Risk bands

Use low, medium, and high-risk bands to route applications into auto-approval, manual review, or tighter underwriting.

Top factors

Feature-level explanations help justify decisions and show which borrower attributes drove the elevated risk score.

Portfolio use

Calibrated probabilities support portfolio aggregation, concentration monitoring, and scenario-based stress testing.

Live risk prediction tool

Estimate borrower default risk

The live tool takes borrower and loan features and simulates a production-style credit scoring request. The result includes a default probability, risk band, and top risk factors.

System architecture

How the risk scoring workflow operates

The case study is structured like a deployable ML system rather than a static report. The frontend consumes saved metrics and exposes a live scoring interface backed by the same preprocessing and model pipeline used in training.

1. Frontend case-study page: Displays business context, metrics, feature importance, ROC analysis, confusion matrix, and a live prediction form.
2. FastAPI backend: Exposes endpoints such as POST /api/ml/risk/predict and read-only metric endpoints for artifacts and charts.
3. Risk service: Loads the saved pipeline, validates incoming fields, runs inference, and returns probability plus top risk factors.
4. Artifacts and observability: Stores pipeline, metrics, feature importance, ROC data, confusion matrix, and metadata for portfolio dashboards and monitoring.
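
The risk service in step 3 can be sketched as a framework-free function that a FastAPI route for POST /api/ml/risk/predict could call. Everything below is illustrative: the response keys, the 0.5 class cut-off, and the two stub callables standing in for the saved pipeline and the explainer are assumptions, not the project's actual interface.

```python
from typing import Any, Callable, Dict, List

REQUIRED_FIELDS = [
    "person_age", "person_income", "person_emp_length",
    "person_home_ownership", "loan_intent", "loan_grade", "loan_amnt",
    "loan_int_rate", "loan_percent_income", "cb_person_default_on_file",
    "cb_person_cred_hist_length",
]


def score_application(
    payload: Dict[str, Any],
    predict_probability: Callable[[Dict[str, Any]], float],
    top_factors: Callable[[Dict[str, Any]], List[str]],
    high_risk_at: float = 0.5,  # hypothetical class cut-off
) -> Dict[str, Any]:
    """Validate one request, score it, and build a response dict."""
    missing = [name for name in REQUIRED_FIELDS if name not in payload]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    probability = float(predict_probability(payload))
    return {
        "default_probability": probability,
        "predicted_class": int(probability >= high_risk_at),
        "top_risk_factors": top_factors(payload)[:3],
    }


# Stand-ins for the saved pipeline and explainer, invented for this sketch.
def stub_probability(payload: Dict[str, Any]) -> float:
    return min(1.0, 2.0 * payload["loan_percent_income"])


def stub_factors(payload: Dict[str, Any]) -> List[str]:
    return ["loan_percent_income", "person_home_ownership", "loan_int_rate"]


# Made-up request body.
request = {
    "person_age": 29, "person_income": 52000, "person_emp_length": 4,
    "person_home_ownership": "RENT", "loan_intent": "EDUCATION",
    "loan_grade": "B", "loan_amnt": 8000, "loan_int_rate": 11.2,
    "loan_percent_income": 0.15, "cb_person_default_on_file": "N",
    "cb_person_cred_hist_length": 6,
}
response = score_application(request, stub_probability, stub_factors)
print(response)
```

Passing the model and explainer in as callables keeps the validation and response logic testable without loading a real artifact.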