Build a Simple AI-Based Phishing Detector (Beginner Tutorial)

Train and test a lightweight phishing detector end to end with synthetic email data, clear validation, and safety guardrails.

What You’ll Build

A small TF-IDF + Logistic Regression text classifier for phishing vs benign emails.
Reproducible dataset generation to avoid leaking real PII.
Validation after each step plus cleanup.

Prerequisites

macOS or Linux with Python 3.12+.
pip available; ~200 MB free disk.
No email access needed; we generate synthetic samples.

Safety and Legal

Never train on real mailbox data without explicit approval and PII scrubbing.
Avoid storing raw emails; keep hashes or redacted text when possible.
Keep humans in the loop for blocking decisions; start with “quarantine + review.”
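
One way to keep "hashes or redacted text" instead of raw emails is sketched below. The regular expressions and the `redact` and `fingerprint` helpers are illustrative assumptions, not a complete PII scrubber; a real deployment would need a reviewed and much broader rule set.

```python
import hashlib
import re

# Illustrative patterns only (assumptions): real PII scrubbing needs broader rules.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
URL_RE = re.compile(r"https?://\S+")
DIGITS_RE = re.compile(r"\d{6,}")  # long digit runs such as account-like numbers


def redact(text: str) -> str:
    """Replace addresses, links, and long digit runs with placeholder tokens."""
    text = EMAIL_RE.sub("<EMAIL>", text)
    text = URL_RE.sub("<URL>", text)
    return DIGITS_RE.sub("<NUMBER>", text)


def fingerprint(text: str) -> str:
    """Return a SHA-256 hex digest so a message can be referenced without storing it."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


raw = "Hi bob@example.com, verify account 12345678 at http://fake-bank.com/login"
print(redact(raw))       # Hi <EMAIL>, verify account <NUMBER> at <URL>
print(fingerprint(raw))  # 64-character hex digest
```

A plain hash is only suitable for deduplication or cross-referencing; short or predictable messages can still be guessed from it, so treat the digest as sensitive too.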

Step 1) Create an isolated environment


python3 -m venv .venv-phish
source .venv-phish/bin/activate
pip install --upgrade pip
pip install pandas scikit-learn joblib

Validation: `pip show scikit-learn | grep Version` should be 1.5.x or newer.

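Common fix: If activation fails, confirm the environment was created (`ls .venv-phish/bin/activate`) and that you are running `source` from bash or zsh. The activate script is sourced rather than executed, so it does not need execute permission.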

Step 2) Generate a synthetic labeled dataset


cat > make_dataset.py <<'PY'
import pandas as pd

phish_samples = [
    ("Your account is locked. Verify immediately at http://fake-bank.com", 1),
    ("Urgent: update payroll info now or your pay is delayed", 1),
    ("Security alert: login from unknown device. Download the attached form", 1),
    ("Package held: pay customs fee via gift card", 1),
    ("Congrats, you won a prize! Click to claim", 1),
]

benign_samples = [
    ("Team meeting notes and next sprint goals", 0),
    ("Invoice attached for approved purchase order", 0),
    ("Reminder: security training scheduled next week", 0),
    ("Quarterly newsletter and product updates", 0),
    ("Welcome to the platform—getting started guide", 0),
]

df = pd.DataFrame(phish_samples + benign_samples, columns=["text", "label"])
df.to_csv("emails.csv", index=False)
print("Wrote emails.csv with", len(df), "rows")
PY

python make_dataset.py

Validation: `cat emails.csv` should show a `text,label` header followed by 10 rows.
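
The same check can be scripted. The sketch below uses only the standard library; `summarize_dataset` is a hypothetical helper name, and the in-memory sample stands in for `emails.csv` so the snippet runs on its own.

```python
import csv
import io
from collections import Counter


def summarize_dataset(csv_file) -> Counter:
    """Count rows per label; fail fast on a wrong header or an empty text field."""
    reader = csv.DictReader(csv_file)
    if reader.fieldnames != ["text", "label"]:
        raise ValueError(f"unexpected header: {reader.fieldnames}")
    counts = Counter()
    for row in reader:
        if not row["text"].strip():
            raise ValueError("empty text field")
        counts[int(row["label"])] += 1
    return counts


# In-memory stand-in for emails.csv; against the real file,
# pass open("emails.csv", newline="") instead.
sample = io.StringIO(
    "text,label\n"
    "Package held: pay customs fee via gift card,1\n"
    "Team meeting notes and next sprint goals,0\n"
)
print(summarize_dataset(sample))
```

On the file from this step, the counts should be five rows for each label.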

Step 3) Train and evaluate the classifier


cat > train_and_eval.py <<'PY'
import json
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.metrics import classification_report, confusion_matrix

df = pd.read_csv("emails.csv")
X_train, X_test, y_train, y_test = train_test_split(df["text"], df["label"], test_size=0.3, random_state=42, stratify=df["label"])

pipeline = Pipeline([
    ("tfidf", TfidfVectorizer(ngram_range=(1, 2), min_df=1)),
    ("clf", LogisticRegression(max_iter=400, class_weight="balanced")),
])

pipeline.fit(X_train, y_train)
preds = pipeline.predict(X_test)

report = classification_report(y_test, preds, target_names=["benign", "phish"], digits=3, output_dict=True)
cm = confusion_matrix(y_test, preds, labels=[0, 1])

# Estimator objects are not JSON-serializable, so store their repr via default=str.
with open("model.json", "w") as f:
    json.dump({"params": pipeline.get_params(deep=False)}, f, indent=2, default=str)

print("Confusion matrix [[TN, FP], [FN, TP]]:", cm.tolist())
print("Precision/Recall/F1:", json.dumps(report, indent=2))
PY

python train_and_eval.py

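Validation: The script should print a 2x2 confusion matrix and a per-class report. The 30% split holds out only three emails, so a fully correct run looks like `[[2,0],[0,1]]` and a single mistake moves precision or recall sharply. Treat the numbers as a smoke test rather than a measure of accuracy. If metrics are poor, add more samples or adjust `ngram_range`.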
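
Because a three-email test split is so noisy, stratified cross-validation gives a slightly steadier picture on the same data. This is a minimal sketch, not part of the original scripts: the ten texts mirror the Step 2 samples and are inlined so the block is self-contained, and the choice of five folds is an assumption that fits five examples per class.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import StratifiedKFold, cross_val_predict
from sklearn.pipeline import Pipeline

# Toy emails mirroring Step 2 (1 = phish, 0 = benign).
texts = [
    "Your account is locked. Verify immediately at http://fake-bank.com",
    "Urgent: update payroll info now or your pay is delayed",
    "Security alert: login from unknown device. Download the attached form",
    "Package held: pay customs fee via gift card",
    "Congrats, you won a prize! Click to claim",
    "Team meeting notes and next sprint goals",
    "Invoice attached for approved purchase order",
    "Reminder: security training scheduled next week",
    "Quarterly newsletter and product updates",
    "Welcome to the platform: getting started guide",
]
labels = [1] * 5 + [0] * 5

pipeline = Pipeline([
    ("tfidf", TfidfVectorizer(ngram_range=(1, 2), min_df=1)),
    ("clf", LogisticRegression(max_iter=400, class_weight="balanced")),
])

# Five stratified folds: each held-out fold contains one phishing and one benign email.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
preds = cross_val_predict(pipeline, texts, labels, cv=cv)

# Every email is predicted exactly once, by a model that never saw it in training.
cm = confusion_matrix(labels, preds, labels=[0, 1])
print("Cross-validated confusion matrix [[TN, FP], [FN, TP]]:", cm.tolist())
```

The matrix now covers all ten emails instead of three, though ten samples is still far too few to draw conclusions about real-world accuracy.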

Common fixes:

ValueError: empty vocabulary => ensure emails.csv is not empty and min_df ≤ sample size.
If class imbalance arises, keep class_weight="balanced" or add more phishing examples.
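
To see what `class_weight="balanced"` does, the sketch below computes the weights for a hypothetical imbalanced label set (8 benign, 2 phishing) using scikit-learn's `compute_class_weight`. The label counts are made up for illustration.

```python
import numpy as np
from sklearn.utils.class_weight import compute_class_weight

# Hypothetical imbalanced labels: 8 benign (0) and 2 phishing (1).
y = np.array([0] * 8 + [1] * 2)

# "balanced" assigns each class n_samples / (n_classes * count_of_that_class).
weights = compute_class_weight(class_weight="balanced", classes=np.array([0, 1]), y=y)
weight_by_class = {label: float(w) for label, w in zip([0, 1], weights)}
print(weight_by_class)  # {0: 0.625, 1: 2.5}
```

Each phishing example counts four times as much as a benign one in the loss here (2.5 versus 0.625), which offsets the 4:1 imbalance.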

Step 4) Add a simple scoring script with safety checks


cat > score_email.py <<'PY'
import sys
import joblib
from sklearn.pipeline import Pipeline

MODEL_PATH = "model.pkl"

def load_model():
    return joblib.load(MODEL_PATH)

def main():
    if len(sys.argv) < 2:
        print("Usage: python score_email.py 'email text'")
        sys.exit(1)
    text = sys.argv[1]
    model: Pipeline = load_model()
    proba = model.predict_proba([text])[0][1]
    print(f"phish_probability={proba:.3f}")
    if proba > 0.7:
        print("Action: quarantine and send to human review")

if __name__ == "__main__":
    main()
PY

Save the trained model:


pip install joblib
python - <<'PY'
import joblib
from sklearn.pipeline import Pipeline
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

df = pd.read_csv("emails.csv")
pipe = Pipeline([
    ("tfidf", TfidfVectorizer(ngram_range=(1, 2), min_df=1)),
    ("clf", LogisticRegression(max_iter=400, class_weight="balanced")),
])
pipe.fit(df["text"], df["label"])
joblib.dump(pipe, "model.pkl")
print("Saved model.pkl")
PY

python score_email.py "Please reset your password at http://fake.com/reset"

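Validation: Output should include a `phish_probability` line with a value above 0.5 for the phishing-like sample. With only ten training emails the score may stay below the 0.7 threshold, in which case no quarantine action is printed; the exact value depends on your scikit-learn version.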
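
The single 0.7 cut-off in `score_email.py` can be extended into a small triage function. In this sketch the 0.7 quarantine threshold comes from the script above, while the `triage` name, the 0.4 review threshold, and the three action labels are assumptions chosen for illustration.

```python
def triage(phish_probability: float,
           quarantine_at: float = 0.7,
           review_at: float = 0.4) -> str:
    """Map a phishing probability to an action; never auto-delete on score alone."""
    if not 0.0 <= phish_probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if phish_probability > quarantine_at:
        return "quarantine"  # hold the message and send it to human review
    if phish_probability >= review_at:
        return "review"      # deliver, but flag for an analyst
    return "deliver"


for score in (0.91, 0.55, 0.12):
    print(f"{score:.2f} -> {triage(score)}")
```

Keeping a middle "review" band means borderline scores reach an analyst instead of being silently delivered or blocked.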

Step 5) Add non-ML controls (defense in depth)

Enforce SPF/DKIM/DMARC on inbound mail; reject or quarantine failures.
Strip or rewrite links; sandbox attachments separately.
Log decisions and top contributing features for analyst review (use pipeline["tfidf"].get_feature_names_out() and model coefficients).
Rate-limit the scoring API to prevent request flooding or model abuse.

Cleanup


deactivate || true
rm -rf .venv-phish emails.csv make_dataset.py train_and_eval.py score_email.py model.pkl model.json

Validation: `ls .venv-phish` should fail with “No such file or directory”.

Quick Reference

Use synthetic/redacted data; keep humans in the decision loop.
Validate with precision/recall; watch false positives before blocking.
Pair ML with email-auth controls and attachment/link sandboxing.
Keep models versioned (model.pkl) and log every scored message.


FAQs

Can I use these labs in production?

No—treat them as educational. Adapt, review, and security-test before any production use.

How should I order or follow the lessons?

Lessons are listed in a consistent order on the Learn page. Start from the top and progress down, or jump to any topic and use Previous/Next to navigate.

What if I don't have test data or a lab?

Use synthetic data and local containers. Never point tools at networks or data you don't own or have written permission to test.

Can I share these materials?

Yes, but keep attribution and follow any licensing terms for included tools or datasets.
