app.py Code Explanation - Smart Credit Risk System

📚 Table of Contents

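1. Library Imports
2. Initialization & Model Loading
3. Routing
4. Prediction Logic
5. Feature Engineering
6. Prepare Data in DataFrame Format
7. Making Predictions: predict() vs predict_proba()
8. Format and Return the Result
9. Error Handling
10. Complete Flow Summary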

1. Library Imports

from flask import Flask, render_template, request
import joblib
import pandas as pd
import os
            

What Each Library Does:

🔵 Flask

Our web framework that helps us create a web server:

Flask: Creates the web application
render_template: Displays HTML pages to users
request: Gets data from forms that users submit

📦 joblib

Used to load our pre-trained Machine Learning model. Think of it as opening a saved file that contains all the "knowledge" our model learned during training.

🐼 pandas

A powerful library for organizing data. We use it to create a DataFrame (like an Excel spreadsheet) that our model can understand.

💻 os

Operating system library. We use it to build the file path to our model from the location of app.py, so loading does not depend on the directory the server is started from.

Why This Matters: These imports give us all the tools we need to create a web application that can receive user input, process it, and return predictions.

2. Initialization & Model Loading

# Create Flask application
app = Flask(__name__)

# Smart path loading (works anywhere)
current_dir = os.path.dirname(os.path.abspath(__file__))
model_path = os.path.join(current_dir, 'credit_risk_model.pkl')
model = joblib.load(model_path)
            

Step-by-Step Breakdown:

app = Flask(__name__): Creates our Flask web application. This is like starting a new website.
Finding the Model File:
- os.path.abspath(__file__): Gets the full path of the current file (app.py)
- os.path.dirname(...): Gets the folder where app.py is located
- os.path.join(...): Combines the folder path with the model filename
model = joblib.load(model_path): Loads the trained Random Forest model from the file.

💡 Why This Matters

We load the model once when the server starts, not every time someone makes a prediction
This makes our application much faster!
Building the path from __file__ instead of the current working directory means the model is found whether you run the app locally or deploy it to a server

Important Note: The model file (credit_risk_model.pkl) must be in the same folder as app.py for this to work.
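
A minimal sketch of the same path logic with an explicit existence check, so that a missing model file fails with a clear message at startup. The helper names resolve_model_path and require_file are hypothetical and not part of app.py; in app.py the first one would be called as resolve_model_path(__file__).

```python
import os

def resolve_model_path(script_file, filename="credit_risk_model.pkl"):
    """Build the absolute path of `filename` in the same folder as `script_file`."""
    script_dir = os.path.dirname(os.path.abspath(script_file))
    return os.path.join(script_dir, filename)

def require_file(path):
    """Return `path` unchanged, or raise a clear error if no file exists there."""
    if not os.path.isfile(path):
        raise FileNotFoundError(f"Model file not found: {path}")
    return path

# Illustrative call with a made-up folder name.
example_path = resolve_model_path(os.path.join("some_dir", "app.py"))
```

Calling require_file(...) just before joblib.load(...) would surface a wrong deployment layout immediately.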

3. Routing

Routes are like different "pages" or "endpoints" in our web application. Each route handles a specific type of request.

Route 1: Home Page

@app.route('/')
def home():
    return render_template('index.html')
            

What This Does:

@app.route('/'): This decorator tells Flask: "When someone visits the homepage, run this function"
render_template('index.html'): Finds and displays the HTML file that contains our loan application form

Why This Matters: This is how users see the form where they can enter loan applicant information.

Route 2: Prediction Endpoint

@app.route('/predict', methods=['POST'])
def predict():
    # ... prediction logic here ...
            

What This Does:

@app.route('/predict'): Creates a route at /predict
methods=['POST']: Only accepts POST requests (sent when users submit the form); a GET request to /predict gets a 405 Method Not Allowed response
This is where all the prediction magic happens!

Why This Matters: When a user clicks "Submit" on the form, their browser sends a POST request to /predict with all the form data. This function processes it and returns the prediction.
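
The routing behavior described above can be checked with Flask's built-in test client. This is a small sketch, not the real app: demo_app and the plain-string return values are stand-ins so that no template file is needed.

```python
from flask import Flask

demo_app = Flask(__name__)

@demo_app.route('/')
def home():
    # Stand-in for render_template('index.html')
    return "form page"

@demo_app.route('/predict', methods=['POST'])
def predict():
    # Stand-in for the real prediction logic
    return "prediction result"

client = demo_app.test_client()
home_status = client.get('/').status_code          # homepage accepts GET
get_status = client.get('/predict').status_code    # GET is not allowed here
post_status = client.post('/predict').status_code  # POST reaches the function
```

With these routes, get_status is 405 while home_status and post_status are 200.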

4. Prediction Logic - Step by Step

This is the heart of our application! Let's break it down step by step:

Step 1: Receive Data from the Form

person_age = float(request.form['person_age'])
person_income = float(request.form['person_income'])
person_emp_length = float(request.form['person_emp_length'])
loan_amnt = float(request.form['loan_amnt'])
loan_int_rate = float(request.form['loan_int_rate'])
cb_person_cred_hist_length = float(request.form['cb_person_cred_hist_length'])
            

What This Does:

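request.form: Holds the fields submitted with the HTML form, keyed by each input's name attribute
['person_age']: Looks up one field (the key must match the name attribute in the HTML) and raises an error if that field was not submitted
float(...): Converts the text input to a number and raises ValueError if the text is not numeric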

Why This Matters: We need to convert the form data (which comes as text) into numbers that our model can understand.
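
One way to make this step more forgiving is to convert all numeric fields in a loop and collect the ones that fail, instead of stopping at the first bad value. The sketch below assumes a dict-like form; parse_numeric_fields is a hypothetical helper, not something app.py defines.

```python
NUMERIC_FIELDS = [
    "person_age", "person_income", "person_emp_length",
    "loan_amnt", "loan_int_rate", "cb_person_cred_hist_length",
]

def parse_numeric_fields(form, names=NUMERIC_FIELDS):
    """Convert the named form fields to floats.

    Returns (values, errors): `values` maps field name to float, and
    `errors` lists the fields that were missing or not numeric.
    """
    values, errors = {}, []
    for name in names:
        raw = form.get(name)
        try:
            values[name] = float(raw)
        except (TypeError, ValueError):
            # TypeError: field missing (raw is None); ValueError: not a number.
            errors.append(name)
    return values, errors

# A plain dict stands in for request.form here.
form = {"person_age": "30", "person_income": "abc"}
values, errors = parse_numeric_fields(
    form, ["person_age", "person_income", "loan_amnt"]
)
```

Here values holds only person_age, and errors names the two fields the user would need to correct.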

Step 2: Receive Text Data (Dropdowns)

person_home_ownership = request.form.get('person_home_ownership', 'RENT')
loan_intent = request.form.get('loan_intent', 'PERSONAL')
loan_grade = request.form.get('loan_grade', 'A')
cb_person_default_on_file = request.form.get('cb_person_default_on_file', 'N')
            

What This Does:

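.get(): Returns the field's value, or a default when the field is absent from the submitted form
The second parameter (like 'RENT') is that default. It applies only when the key is missing; a field submitted as an empty string comes back as '' rather than the default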

Why This Matters: Prevents errors if a dropdown field is missing and provides sensible defaults.
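
Because .get() only falls back when the key is absent, an empty or unexpected dropdown value still passes through. A possible guard is sketched below; get_choice is a hypothetical helper, and the option set is illustrative only (the real list of categories comes from the training data).

```python
def get_choice(form, name, default, allowed):
    """Return a dropdown value, falling back to `default` when the field is
    missing, empty, or not one of the `allowed` options."""
    value = (form.get(name) or "").strip().upper()
    return value if value in allowed else default

# Illustrative option set only; not taken from the real model.
HOME_OWNERSHIP_OPTIONS = {"RENT", "OWN"}

form = {"person_home_ownership": ""}   # field present but empty
plain = form.get("person_home_ownership", "RENT")
checked = get_choice(form, "person_home_ownership", "RENT", HOME_OWNERSHIP_OPTIONS)
```

In this example plain is the empty string, while checked falls back to 'RENT'.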

5. Feature Engineering - The Smart Calculations ⭐

# Feature Engineering
if person_income > 0:
    loan_percent_income = loan_amnt / person_income
    interest_burden = (loan_amnt * (loan_int_rate / 100)) / person_income
else:
    loan_percent_income = 0
    interest_burden = 0
            

🔑 What is Feature Engineering?

Feature engineering means creating new, meaningful features from existing data. Instead of just using raw numbers, we calculate relationships that help the model make better predictions.

Understanding interest_burden

The Formula:

interest_burden = (loan_amnt × (loan_int_rate / 100)) / person_income

Breaking Down the Formula:

loan_int_rate / 100: Converts percentage to decimal (e.g., 5% becomes 0.05)
loan_amnt × (loan_int_rate / 100): Approximates one year of simple interest on the full loan amount
Divide by person_income: Expresses that interest as a fraction of annual income (multiply by 100 for a percentage)

📊 Real Example:

Loan amount: $10,000
Interest rate: 5%
Annual income: $50,000

Calculation:

Annual interest = $10,000 × (5 / 100) = $500
Interest burden = $500 / $50,000 = 0.01 (or 1% of income)

✅ A lower interest burden generally points to a lower risk of default, all else being equal.

Understanding loan_percent_income

loan_percent_income = loan_amnt / person_income

What It Shows: The loan amount as a fraction of annual income (0.5 means 50%).

Example: If someone wants a $25,000 loan and earns $50,000/year:

loan_percent_income = $25,000 / $50,000 = 0.5 (or 50%)

Why This Matters: A loan that's 50% of someone's annual income is riskier than one that's 10%.
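
The two formulas can be wrapped in one small function and checked against the worked example above ($10,000 loan, 5% rate, $50,000 income). This is a sketch; engineer_features is a hypothetical name, and the zero fallback mirrors the snippet at the top of this section.

```python
def engineer_features(loan_amnt, loan_int_rate, person_income):
    """Return (loan_percent_income, interest_burden) as fractions of income.

    `loan_int_rate` is a percentage (5 means 5%). Both features fall back
    to 0 when income is not positive, mirroring the snippet above.
    """
    if person_income <= 0:
        return 0, 0
    loan_percent_income = loan_amnt / person_income
    interest_burden = (loan_amnt * (loan_int_rate / 100)) / person_income
    return loan_percent_income, interest_burden

ratio, burden = engineer_features(10_000, 5, 50_000)
print(f"loan_percent_income={ratio:.2f}, interest_burden={burden:.2f}")
# prints: loan_percent_income=0.20, interest_burden=0.01
```

One design point worth noting: returning 0 for a non-positive income makes such an applicant look as if the loan carries no burden at all, so rejecting that input may be the safer choice in practice.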

6. Prepare Data in DataFrame Format

input_data = pd.DataFrame({
    'person_age': [person_age],
    'person_income': [person_income],
    'person_home_ownership': [person_home_ownership],
    'person_emp_length': [person_emp_length],
    'loan_intent': [loan_intent],
    'loan_grade': [loan_grade],
    'loan_amnt': [loan_amnt],
    'loan_int_rate': [loan_int_rate],
    'loan_percent_income': [loan_percent_income],
    'cb_person_default_on_file': [cb_person_default_on_file],
    'cb_person_cred_hist_length': [cb_person_cred_hist_length],
    'interest_burden': [interest_burden]
})
            

What This Does:

Creates a pandas DataFrame (like a table with one row)
Each column name must match exactly what the model expects
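The values are wrapped in [...] so that each column is a list with one element, which gives a one-row table; pandas raises an error if every value is a bare scalar and no index is given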

Why This Matters:

Our trained model expects data in this exact format
Column names must match what was used during training
This is like filling out a form that the model can read
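
Since a mismatched column name tends to fail only at prediction time, it can help to validate the row before it reaches the model. The sketch below assumes the twelve columns shown above are exactly what the model was trained on; build_input_frame and the sample applicant are hypothetical.

```python
import pandas as pd

# Column order copied from the DataFrame above; assumed to match training.
EXPECTED_COLUMNS = [
    "person_age", "person_income", "person_home_ownership",
    "person_emp_length", "loan_intent", "loan_grade", "loan_amnt",
    "loan_int_rate", "loan_percent_income", "cb_person_default_on_file",
    "cb_person_cred_hist_length", "interest_burden",
]

def build_input_frame(record):
    """Turn a dict of scalar values into a one-row DataFrame with the
    expected columns, raising if any are missing or unexpected."""
    missing = [c for c in EXPECTED_COLUMNS if c not in record]
    extra = [c for c in record if c not in EXPECTED_COLUMNS]
    if missing or extra:
        raise ValueError(f"missing={missing}, unexpected={extra}")
    # Wrapping the dict in a list makes it one row; columns= fixes the order.
    return pd.DataFrame([record], columns=EXPECTED_COLUMNS)

# Made-up applicant, for illustration only.
record = {
    "person_age": 30.0, "person_income": 50000.0,
    "person_home_ownership": "RENT", "person_emp_length": 5.0,
    "loan_intent": "PERSONAL", "loan_grade": "A", "loan_amnt": 10000.0,
    "loan_int_rate": 5.0, "loan_percent_income": 0.2,
    "cb_person_default_on_file": "N", "cb_person_cred_hist_length": 4.0,
    "interest_burden": 0.01,
}
frame = build_input_frame(record)
```

The resulting frame has one row and twelve columns in the listed order.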

7. Making Predictions - predict() vs predict_proba() ⭐

# Get the prediction (answer)
prediction = model.predict(input_data)[0]

# Calculate probabilities (confidence scores)
probabilities = model.predict_proba(input_data)[0]

prob_safe = probabilities[0] * 100   # Safety percentage
prob_risk = probabilities[1] * 100   # Risk percentage
            

model.predict() - The Answer

What It Does:

Returns the final prediction: 0 (SAFE) or 1 (RISKY)
[0] gets the first (and only) result from the array

Example Output: 0 means "This applicant is SAFE"

model.predict_proba() - The Confidence Score ⭐

What It Does:

Returns one row per input, with one probability per class, ordered as in model.classes_
For binary classification, each row holds two probabilities that sum to 1 (100%)

Example Output: [0.85, 0.15] means:

85% chance of being SAFE (class 0)
15% chance of being RISKY (class 1)

💡 Why This Matters:

predict() gives us the answer: "SAFE" or "RISKY"
predict_proba() gives the model's estimated probability for each class, which we read as its confidence
A 95% safety score means the model is very confident
A 55% safety score means the model is less certain

Real-World Example:

Prediction: SAFE
Safety Percentage: 92%
This means: "The model is 92% confident this applicant is safe"
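
The relationship between the two calls can be seen on a throwaway model. The sketch below trains a tiny Random Forest on toy data invented purely for illustration (it is not the credit risk model) and looks up each class's column through classes_ rather than assuming class 0 comes first.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Toy data for illustration only: one feature, label 1 for larger values.
X = np.array([[0.05], [0.10], [0.15], [0.60], [0.70], [0.80]])
y = np.array([0, 0, 0, 1, 1, 1])
toy_model = RandomForestClassifier(n_estimators=10, random_state=0).fit(X, y)

sample = np.array([[0.12]])
prediction = toy_model.predict(sample)[0]
probabilities = toy_model.predict_proba(sample)[0]  # one value per class

# Look up each class's column instead of assuming class 0 comes first.
classes = list(toy_model.classes_)
prob_safe = probabilities[classes.index(0)] * 100
prob_risk = probabilities[classes.index(1)] * 100
```

For a Random Forest, predict() returns the class with the highest averaged probability, so the two outputs always agree on which side the model favors.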

8. Format and Return the Result

if prediction == 1:
    # If risky
    res_text = f"⚠️ Risk! Probability of default: {prob_risk:.1f}%"
    color = "#e74c3c" # Red
else:
    # If safe
    res_text = f"✅ Safe customer (safety score: {prob_safe:.1f}%)"
    color = "#2ecc71" # Green

return render_template('index.html', prediction_text=res_text, color=color)
            

What This Does:

Checks if prediction is 1 (risky) or 0 (safe)
Creates a user-friendly message with the percentage
Sets a color (red for risky, green for safe)
Sends the result back to the HTML page

Why This Matters:

Users see a clear, colored result
The percentage helps them understand the model's confidence
The result appears on the same page (better user experience)
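
The formatting step can also live in a small pure function, which makes it easy to test without starting Flask. This is a sketch: format_result and its message wording are hypothetical, while the two color codes are the ones used above.

```python
def format_result(prediction, prob_safe, prob_risk):
    """Return (message, hex_color) for the result banner."""
    if prediction == 1:
        return f"Risky: default probability {prob_risk:.1f}%", "#e74c3c"  # red
    return f"Safe: safety score {prob_safe:.1f}%", "#2ecc71"              # green

text, color = format_result(0, 92.0, 8.0)
```

The route would then pass text and color straight to render_template.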

9. Error Handling

# Closes the try: block that wraps the prediction steps above
except Exception as e:
    print(f"Error occurred: {e}")
    return render_template('index.html',
                           prediction_text=f"An error occurred: {e}",
                           color="black")
            

What This Does:

Catches any errors that might occur
Prints the error to the console (for debugging)
Shows an error message on the page, including the exception text, instead of a generic server error page

Why This Matters:

Keeps one bad request from ending in a server error page
Helps developers debug issues
Provides a better user experience
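
A possible refinement, sketched below under the assumption that bad input shows up as KeyError or ValueError: handle those separately from unexpected failures, log the details, and show the user a generic message rather than the raw exception text. run_safely, demo_handler, and the message wording are hypothetical.

```python
import logging

logger = logging.getLogger("credit_risk_app")

def run_safely(handler, form):
    """Call `handler(form)` and map failures to a (message, color) pair."""
    try:
        return handler(form)
    except (KeyError, ValueError) as exc:
        # Expected problems: a missing field or a non-numeric value.
        logger.warning("Bad form input: %r", exc)
        return "Please check the form: a field is missing or not a number.", "black"
    except Exception:
        # Unexpected problems: log the full traceback, show a generic message.
        logger.exception("Prediction failed")
        return "Something went wrong. Please try again later.", "black"

def demo_handler(form):
    # Stand-in for the real prediction steps.
    return f"age={float(form['person_age']):.0f}", "#2ecc71"

ok = run_safely(demo_handler, {"person_age": "30"})
bad = run_safely(demo_handler, {"person_age": "thirty"})
```

The user sees what to fix, and the traceback stays in the server log.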

10. Complete Flow Summary

User visits homepage → Sees the loan application form
User fills out form → Enters applicant information
User clicks Submit → Browser sends POST request to /predict
Backend receives data → Extracts all form fields
Feature engineering → Calculates interest_burden and loan_percent_income
Data preparation → Organizes data into DataFrame format
Model prediction → Uses predict() and predict_proba()
Result formatting → Creates user-friendly message with percentage
Return to user → Displays result on the webpage

⚡ Total Time: Usually less than 1 second!
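
The whole flow can be exercised end to end with Flask's test client. The sketch below is deliberately simplified and makes several assumptions: StubModel is a made-up stand-in for the trained model, only two form fields are used, and an inline template replaces index.html.

```python
from flask import Flask, request, render_template_string

class StubModel:
    """Hypothetical stand-in for the real model: risk grows with loan/income."""
    def predict_proba(self, rows):
        out = []
        for row in rows:
            p_risk = min(1.0, row["loan_percent_income"])
            out.append([1.0 - p_risk, p_risk])
        return out

flow_app = Flask(__name__)
stub = StubModel()
PAGE = "<p style='color: {{ color }}'>{{ prediction_text }}</p>"

@flow_app.route("/predict", methods=["POST"])
def predict():
    # Receive data, engineer one feature, predict, format, return.
    income = float(request.form["person_income"])
    amount = float(request.form["loan_amnt"])
    ratio = amount / income if income > 0 else 0
    p_safe, p_risk = stub.predict_proba([{"loan_percent_income": ratio}])[0]
    if p_risk >= 0.5:
        text, color = f"RISKY ({p_risk * 100:.1f}%)", "#e74c3c"
    else:
        text, color = f"SAFE ({p_safe * 100:.1f}%)", "#2ecc71"
    return render_template_string(PAGE, prediction_text=text, color=color)

# Simulate the browser submitting the form.
response = flow_app.test_client().post(
    "/predict", data={"person_income": "50000", "loan_amnt": "10000"}
)
body = response.get_data(as_text=True)
```

For this input the loan is 20% of income, so the stub reports SAFE (80.0%) in green.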

🎯 Key Takeaways

Model Loading: Load once at startup for speed
Feature Engineering: Calculate meaningful relationships (like interest_burden)
Data Format: Must match exactly what the model expects
Two Predictions: Use both predict() (answer) and predict_proba() (confidence)
Error Handling: Always include try/except for production code
User Experience: Show clear, colored results with confidence percentages

💡 Tips for Understanding

Think of the model as a trained expert: It learned patterns from thousands of loan examples
Feature engineering is like asking smart questions: Instead of "What's their income?", we ask "How large are the loan and its yearly interest relative to income?"
Probabilities show confidence: 95% is very confident, 60% is uncertain
The DataFrame is like a form: The model reads it like a spreadsheet with one row

            
