
1. Library Imports

    from flask import Flask, render_template, request
    import joblib
    import pandas as pd
    import os

What Each Library Does:

🔵 Flask

Our web framework that helps us create a web server:

  • Flask: Creates the web application
  • render_template: Displays HTML pages to users
  • request: Gets data from forms that users submit

📦 joblib

Used to load our pre-trained Machine Learning model. Think of it as opening a saved file that contains all the "knowledge" our model learned during training.

🐼 pandas

A powerful library for organizing data. We use it to create a DataFrame (like an Excel spreadsheet) that our model can understand.

💻 os

Operating system library. We use it to build the path to the model file relative to app.py, so the model is found no matter which directory the server is started from.

Why This Matters: These imports give us all the tools we need to create a web application that can receive user input, process it, and return predictions.

2. Initialization & Model Loading

    # Create Flask application
    app = Flask(__name__)

    # Smart path loading (works anywhere)
    current_dir = os.path.dirname(os.path.abspath(__file__))
    model_path = os.path.join(current_dir, 'credit_risk_model.pkl')
    model = joblib.load(model_path)

Step-by-Step Breakdown:

  1. app = Flask(__name__): Creates our Flask web application. This is like starting a new website.
  2. Finding the Model File:
    • os.path.abspath(__file__): Gets the full path of the current file (app.py)
    • os.path.dirname(...): Gets the folder where app.py is located
    • os.path.join(...): Combines the folder path with the model filename
  3. model = joblib.load(model_path): Loads the trained Random Forest model from the file.

💡 Why This Matters

  • We load the model once when the server starts, not every time someone makes a prediction
  • This makes our application much faster!
  • Building the path from __file__ means the model is found regardless of the current working directory, whether you run the app locally or deploy it to a server

Important Note: The model file (credit_risk_model.pkl) must be in the same folder as app.py for this to work.
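The path logic above can be sketched as a small helper. This is only an illustration, not the app's actual code: the names resolve_model_path and require_file, and the early existence check, are assumptions added here so that a missing model file fails with a clear message.

```python
import os


def resolve_model_path(script_file, filename="credit_risk_model.pkl"):
    """Return the path of `filename` inside the folder that contains `script_file`."""
    script_dir = os.path.dirname(os.path.abspath(script_file))
    return os.path.join(script_dir, filename)


def require_file(path):
    """Raise a descriptive error if the file is not where we expect it."""
    if not os.path.isfile(path):
        raise FileNotFoundError(f"Model file not found: {path}")
    return path
```

Inside app.py, this could be used as joblib.load(require_file(resolve_model_path(__file__))).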

3. Routing

Routes are like different "pages" or "endpoints" in our web application. Each route handles a specific type of request.

Route 1: Home Page

    @app.route('/')
    def home():
        return render_template('index.html')

What This Does:

  • @app.route('/'): This decorator tells Flask: "When someone visits the homepage, run this function"
  • render_template('index.html'): Finds and displays the HTML file that contains our loan application form

Why This Matters: This is how users see the form where they can enter loan applicant information.

Route 2: Prediction Endpoint

    @app.route('/predict', methods=['POST'])
    def predict():
        # ... prediction logic here ...

What This Does:

  • @app.route('/predict'): Creates a route at /predict
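  • methods=['POST']: Only accepts POST requests (sent when users submit the form); any other method, such as a GET from typing the URL into the browser, receives a 405 Method Not Allowed response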
  • This is where all the prediction magic happens!

Why This Matters: When a user clicks "Submit" on the form, their browser sends a POST request to /predict with all the form data. This function processes it and returns the prediction.
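To see the two routes in action without a browser, Flask's built-in test client can send requests directly. The sketch below is a stand-in, not the real app: the route bodies return placeholder strings instead of rendering index.html or calling the model.

```python
from flask import Flask, request

app = Flask(__name__)


@app.route('/')
def home():
    # Placeholder for render_template('index.html')
    return 'form page placeholder'


@app.route('/predict', methods=['POST'])
def predict():
    # Placeholder for the prediction logic: just count the submitted fields
    return f"received {len(request.form)} fields"


client = app.test_client()

# A GET to a POST-only route is rejected by Flask before predict() runs
get_status = client.get('/predict').status_code

# A POST with form data reaches predict()
post_response = client.post('/predict', data={'person_age': '30'})
```

Here get_status is 405, while the POST succeeds and its body reports one received field.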

4. Prediction Logic - Step by Step

This is the heart of our application! Let's break it down step by step:

Step 1: Receive Data from the Form

    person_age = float(request.form['person_age'])
    person_income = float(request.form['person_income'])
    person_emp_length = float(request.form['person_emp_length'])
    loan_amnt = float(request.form['loan_amnt'])
    loan_int_rate = float(request.form['loan_int_rate'])
    cb_person_cred_hist_length = float(request.form['cb_person_cred_hist_length'])

What This Does:

  • request.form: Gets all the data from the HTML form
  • ['person_age']: Extracts the specific field (must match the name attribute in HTML); a missing field raises an error
  • float(...): Converts text input to a number; raises ValueError if the text is not numeric

Why This Matters: We need to convert the form data (which comes as text) into numbers that our model can understand.
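The six conversions above follow one pattern, so they can be sketched as a loop. This is an illustrative variant, not the app's code: the helper name parse_numeric_fields is hypothetical, and a plain dict stands in for request.form. It reports which field failed instead of raising a bare conversion error.

```python
NUMERIC_FIELDS = [
    "person_age",
    "person_income",
    "person_emp_length",
    "loan_amnt",
    "loan_int_rate",
    "cb_person_cred_hist_length",
]


def parse_numeric_fields(form):
    """Convert each numeric form field from text to float.

    Raises ValueError naming the offending field if it is missing or not numeric.
    """
    values = {}
    for name in NUMERIC_FIELDS:
        raw = form.get(name)
        if raw is None:
            raise ValueError(f"Missing field: {name}")
        try:
            values[name] = float(raw)
        except ValueError:
            raise ValueError(f"Field {name} is not a number: {raw!r}") from None
    return values
```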

Step 2: Receive Text Data (Dropdowns)

    person_home_ownership = request.form.get('person_home_ownership', 'RENT')
    loan_intent = request.form.get('loan_intent', 'PERSONAL')
    loan_grade = request.form.get('loan_grade', 'A')
    cb_person_default_on_file = request.form.get('cb_person_default_on_file', 'N')

What This Does:

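  • .get(): Safely gets the value, with a default if the field is missing
  • The second parameter (like 'RENT') is the default returned when the field is absent from the form. A field that is present but submitted empty comes back as an empty string, not the default

Why This Matters: Prevents errors if a dropdown field is missing and provides sensible defaults.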
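The difference between a missing field and an empty one is easy to overlook. The sketch below uses a plain dict in place of request.form (for missing keys, .get() behaves the same way) and a hypothetical helper, get_or_default, that also falls back when the submitted value is blank.

```python
def get_or_default(form, name, default):
    """Return the submitted value, falling back to `default` when the field
    is missing or was submitted as blank or whitespace-only text."""
    value = form.get(name, default)
    if value is None or not value.strip():
        return default
    return value.strip()


# loan_intent was submitted empty; person_home_ownership was not submitted at all
form = {"loan_intent": "", "loan_grade": "B"}

# Plain .get() keeps the empty string because the key exists
plain_get = form.get("loan_intent", "PERSONAL")

# The helper treats the blank value like a missing one
intent = get_or_default(form, "loan_intent", "PERSONAL")
ownership = get_or_default(form, "person_home_ownership", "RENT")
```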

5. Feature Engineering - The Smart Calculations ⭐

    # Feature Engineering
    if person_income > 0:
        loan_percent_income = loan_amnt / person_income
        interest_burden = (loan_amnt * (loan_int_rate / 100)) / person_income
    else:
        loan_percent_income = 0
        interest_burden = 0

🔑 What is Feature Engineering?

Feature engineering means creating new, meaningful features from existing data. Instead of just using raw numbers, we calculate relationships that help the model make better predictions.

Understanding interest_burden

The Formula:

interest_burden = (loan_amnt × (loan_int_rate / 100)) / person_income

Breaking Down the Formula:

  1. loan_int_rate / 100: Converts percentage to decimal (e.g., 5% becomes 0.05)
  2. loan_amnt × (loan_int_rate / 100): Approximates one year of interest on the full loan amount
  3. Divide by person_income: Shows what fraction of annual income goes to interest

📊 Real Example:

  • Loan amount: $10,000
  • Interest rate: 5%
  • Annual income: $50,000

Calculation:

  • Annual interest = $10,000 × (5 / 100) = $500
  • Interest burden = $500 / $50,000 = 0.01 (or 1% of income)

✅ A lower interest burden generally points to a lower risk of default!

Understanding loan_percent_income

loan_percent_income = loan_amnt / person_income

What It Shows: The loan amount as a fraction of annual income (0.5 means 50%).

Example: If someone wants a $25,000 loan and earns $50,000/year:

  • loan_percent_income = $25,000 / $50,000 = 0.5 (or 50%)

Why This Matters: A loan that's 50% of someone's annual income is riskier than one that's 10%.
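Both features can be wrapped in one function and checked against the worked examples above. The function name engineer_features is a hypothetical wrapper around the same formulas; the numbers are the ones used in the examples.

```python
def engineer_features(loan_amnt, loan_int_rate, person_income):
    """Return (loan_percent_income, interest_burden) for one applicant.

    loan_int_rate is a percentage (5 means 5%). A non-positive income
    yields zeros, mirroring the guard against division by zero.
    """
    if person_income > 0:
        loan_percent_income = loan_amnt / person_income
        interest_burden = (loan_amnt * (loan_int_rate / 100)) / person_income
    else:
        loan_percent_income = 0.0
        interest_burden = 0.0
    return loan_percent_income, interest_burden


# $10,000 loan at 5% with $50,000 income
lpi_small, burden_small = engineer_features(10_000, 5, 50_000)

# $25,000 loan at 5% with the same income
lpi_large, burden_large = engineer_features(25_000, 5, 50_000)
```

The first call gives an interest burden of about 0.01 (1% of income); the second gives a loan_percent_income of 0.5.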

6. Prepare Data in DataFrame Format

    input_data = pd.DataFrame({
        'person_age': [person_age],
        'person_income': [person_income],
        'person_home_ownership': [person_home_ownership],
        'person_emp_length': [person_emp_length],
        'loan_intent': [loan_intent],
        'loan_grade': [loan_grade],
        'loan_amnt': [loan_amnt],
        'loan_int_rate': [loan_int_rate],
        'loan_percent_income': [loan_percent_income],
        'cb_person_default_on_file': [cb_person_default_on_file],
        'cb_person_cred_hist_length': [cb_person_cred_hist_length],
        'interest_burden': [interest_burden]
    })

What This Does:

  • Creates a pandas DataFrame (like a table with one row)
  • Each column name must match exactly what the model expects
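  • The values are wrapped in [...] because each column must be list-like: a one-element list per column produces a one-row table, whereas passing only bare scalars raises a ValueError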

Why This Matters:

  • Our trained model expects data in this exact format
  • Column names must match what was used during training
  • This is like filling out a form that the model can read
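A mismatched column name is a common source of prediction errors, so it can help to compare the input against the expected names before calling the model. The sketch below is an assumption-laden illustration: EXPECTED_COLUMNS is copied from the DataFrame above and is assumed to match the training data, and check_columns is a hypothetical helper. With a real scikit-learn model fitted on a DataFrame, the trained names may also be available via its feature_names_in_ attribute.

```python
# Assumed to match the columns used during training (copied from the DataFrame above)
EXPECTED_COLUMNS = [
    "person_age",
    "person_income",
    "person_home_ownership",
    "person_emp_length",
    "loan_intent",
    "loan_grade",
    "loan_amnt",
    "loan_int_rate",
    "loan_percent_income",
    "cb_person_default_on_file",
    "cb_person_cred_hist_length",
    "interest_burden",
]


def check_columns(row, expected=EXPECTED_COLUMNS):
    """Return (missing, unexpected) column names for a one-row input dict."""
    missing = [name for name in expected if name not in row]
    unexpected = [name for name in row if name not in expected]
    return missing, unexpected


# A row with a typo: 'loan_amount' instead of 'loan_amnt'
row = {name: [0] for name in EXPECTED_COLUMNS if name != "loan_amnt"}
row["loan_amount"] = [10_000]
missing, unexpected = check_columns(row)
```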

7. Making Predictions - predict() vs predict_proba() ⭐

    # Get the prediction (answer)
    prediction = model.predict(input_data)[0]

    # Calculate probabilities (confidence scores)
    probabilities = model.predict_proba(input_data)[0]
    prob_safe = probabilities[0] * 100  # Safety percentage
    prob_risk = probabilities[1] * 100  # Risk percentage

model.predict() - The Answer

What It Does:

  • Returns the final prediction: 0 (SAFE) or 1 (RISKY)
  • [0] gets the first (and only) result from the array

Example Output: 0 means "This applicant is SAFE"

model.predict_proba() - The Confidence Score ⭐

What It Does:

  • Returns probabilities for each possible outcome
  • For binary classification, it returns two probabilities that sum to 1 (100%), ordered to match model.classes_ (here, class 0 first and class 1 second)

Example Output: [0.85, 0.15] means:

  • 85% chance of being SAFE (class 0)
  • 15% chance of being RISKY (class 1)

💡 Why This Matters:

  • predict() gives us the answer: "SAFE" or "RISKY"
  • predict_proba() tells us how confident the model is
  • A 95% safety score means the model is very confident
  • A 55% safety score means the model is less certain

Real-World Example:

  • Prediction: SAFE
  • Safety Percentage: 92%
  • This means: "The model estimates a 92% probability that this applicant is safe"
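The indices [0] and [1] work because the classes are ordered 0 then 1. A slightly more defensive sketch looks the position up from the class list instead of hard-coding it. Here plain lists stand in for model.classes_ and model.predict_proba(input_data)[0], and probability_of is a hypothetical helper.

```python
def probability_of(classes, probabilities, label):
    """Look up the probability for `label` using the class order, not a fixed index."""
    return probabilities[list(classes).index(label)]


# Stand-ins for model.classes_ and model.predict_proba(input_data)[0]
classes = [0, 1]
probabilities = [0.85, 0.15]

prob_safe = probability_of(classes, probabilities, 0) * 100  # class 0 = SAFE
prob_risk = probability_of(classes, probabilities, 1) * 100  # class 1 = RISKY
```

If the class order were ever reversed, the lookup would still return the right value for each label.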

8. Format and Return the Result

    if prediction == 1:  # If risky
        res_text = f"⚠️ Risk! Probability of default: {prob_risk:.1f}%"
        color = "#e74c3c"  # Red
    else:  # If safe
        res_text = f"✅ Safe customer (safety score: {prob_safe:.1f}%)"
        color = "#2ecc71"  # Green

    return render_template('index.html', prediction_text=res_text, color=color)

What This Does:

  • Checks if prediction is 1 (risky) or 0 (safe)
  • Creates a user-friendly message with the percentage
  • Sets a color (red for risky, green for safe)
  • Sends the result back to the HTML page

Why This Matters:

  • Users see a clear, colored result
  • The percentage helps them understand the model's confidence
  • The result appears on the same page (better user experience)
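Pulling the formatting into its own function makes it easy to test without running the web server. In this sketch the function name format_result is hypothetical and the message wording is illustrative English; the colors are the ones used above.

```python
def format_result(prediction, prob_safe, prob_risk):
    """Return (message, color) for a prediction of 1 (risky) or 0 (safe)."""
    if prediction == 1:
        return f"Risk! Probability of default: {prob_risk:.1f}%", "#e74c3c"  # Red
    return f"Safe customer (safety score: {prob_safe:.1f}%)", "#2ecc71"  # Green


risky_text, risky_color = format_result(1, prob_safe=8.0, prob_risk=92.04)
safe_text, safe_color = format_result(0, prob_safe=85.0, prob_risk=15.0)
```

The :.1f format rounds each percentage to one decimal place, so 92.04 is displayed as 92.0%.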

9. Error Handling

    # Closes the try: block that wraps the prediction steps above
    except Exception as e:
        print(f"Error occurred: {e}")
        return render_template('index.html', prediction_text=f"An error occurred: {e}", color="black")

What This Does:

  • Catches any errors that might occur
  • Prints the error to the console (for debugging)
  • Shows an error message on the page instead of a generic error page (note that it includes the raw exception text)

Why This Matters:

  • Prevents a failed request from ending in a generic server error page
  • Helps developers debug issues
  • Provides a better user experience
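One possible refinement, sketched here under assumptions, is to separate bad user input from unexpected failures and to log details rather than show them to the user. The names run_safely and logger, and the message texts, are hypothetical; handler stands for any function that takes the form data and returns a (message, color) pair.

```python
import logging

logger = logging.getLogger("credit_risk_app")


def run_safely(handler, form):
    """Call `handler(form)` and turn failures into a (message, color) pair."""
    try:
        return handler(form)
    except (KeyError, ValueError) as exc:
        # Missing or non-numeric input: the user can fix this themselves
        logger.warning("Invalid input: %s", exc)
        return "Please check the values you entered.", "black"
    except Exception:
        # Anything else: log the full traceback, show only a generic message
        logger.exception("Prediction failed")
        return "Something went wrong. Please try again later.", "black"


def parse_age(form):
    """Example handler: fails with ValueError on non-numeric input."""
    return f"Age is {float(form['person_age']):.0f}", "#2ecc71"


ok_result = run_safely(parse_age, {"person_age": "30"})
bad_result = run_safely(parse_age, {"person_age": "thirty"})
```

Keeping exception details in the log rather than on the page avoids exposing internal information to visitors.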

10. Complete Flow Summary

  1. User visits homepage → Sees the loan application form
  2. User fills out form → Enters applicant information
  3. User clicks Submit → Browser sends POST request to /predict
  4. Backend receives data → Extracts all form fields
  5. Feature engineering → Calculates interest_burden and loan_percent_income
  6. Data preparation → Organizes data into DataFrame format
  7. Model prediction → Uses predict() and predict_proba()
  8. Result formatting → Creates user-friendly message with percentage
  9. Return to user → Displays result on the webpage

⚡ Total Time: Usually less than 1 second!

🎯 Key Takeaways

  • Model Loading: Load once at startup for speed
  • Feature Engineering: Calculate meaningful relationships (like interest_burden)
  • Data Format: Must match exactly what the model expects
  • Two Predictions: Use both predict() (answer) and predict_proba() (confidence)
  • Error Handling: Always include try/except for production code
  • User Experience: Show clear, colored results with confidence percentages

💡 Tips for Understanding

  • Think of the model as a trained expert: It learned patterns from thousands of loan examples
  • Feature engineering is like asking smart questions: Instead of "What's their income?", we ask "What fraction of income goes to loan interest?"
  • Probabilities show confidence: 95% is very confident, 60% is uncertain
  • The DataFrame is like a form: The model reads it like a spreadsheet with one row