Build a Simple AI-Based Phishing Detector (Beginner Tutorial)

Train a lightweight phishing classifier with text features, evaluate accuracy, and add anti-spoofing safeguards.

Train and test a lightweight phishing detector end to end with synthetic email data, clear validation, and safety guardrails.

What You’ll Build

  • A small TF-IDF + Logistic Regression text classifier for phishing vs benign emails.
  • Reproducible dataset generation to avoid leaking real PII.
  • Validation after each step plus cleanup.

Prerequisites

  • macOS or Linux with Python 3.12+.
  • pip available; ~200 MB free disk.
  • No email access needed; we generate synthetic samples.
  • Never train on real mailbox data without explicit approval and PII scrubbing.
  • Avoid storing raw emails; keep hashes or redacted text when possible.
  • Keep humans in the loop for blocking decisions; start with “quarantine + review.”
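
The storage advice above can be sketched in a few lines. This is a minimal, stdlib-only example, assuming simple regular expressions are good enough for a lab; `redact` and `fingerprint` are hypothetical helper names, and the patterns will miss many real-world PII formats.

```python
import hashlib
import re

# Deliberately simple patterns for a lab setting; real PII scrubbing needs more.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
URL_RE = re.compile(r"https?://\S+")

def redact(text: str) -> str:
    """Replace email addresses and URLs with placeholder tokens."""
    text = EMAIL_RE.sub("<EMAIL>", text)
    return URL_RE.sub("<URL>", text)

def fingerprint(text: str) -> str:
    """Return a SHA-256 hex digest so a message can be referenced without storing it."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

if __name__ == "__main__":
    raw = "Hi alice@example.com, verify at http://fake-bank.com/login now"
    print(redact(raw))
    print(fingerprint(raw)[:16])
```

An unkeyed hash of a short, predictable message can be guessed by brute force, so a keyed hash (for example `hmac` with a secret key) is a common hardening step if the digests leave the lab.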

Step 1) Create an isolated environment

python3 -m venv .venv-phish
source .venv-phish/bin/activate
pip install --upgrade pip
pip install pandas scikit-learn joblib
Validation: `pip show scikit-learn | grep Version` should be 1.5.x or newer.

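Common fix: If activation fails, make sure you are in bash or zsh and use `source` (the script is sourced, not executed, so it does not need execute permission). If .venv-phish/bin/activate is missing, recreate the environment with python3 -m venv .venv-phish.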

Step 2) Generate a synthetic labeled dataset

cat > make_dataset.py <<'PY'
import pandas as pd

phish_samples = [
    ("Your account is locked. Verify immediately at http://fake-bank.com", 1),
    ("Urgent: update payroll info now or your pay is delayed", 1),
    ("Security alert: login from unknown device. Download the attached form", 1),
    ("Package held: pay customs fee via gift card", 1),
    ("Congrats, you won a prize! Click to claim", 1),
]

benign_samples = [
    ("Team meeting notes and next sprint goals", 0),
    ("Invoice attached for approved purchase order", 0),
    ("Reminder: security training scheduled next week", 0),
    ("Quarterly newsletter and product updates", 0),
    ("Welcome to the platform—getting started guide", 0),
]

df = pd.DataFrame(phish_samples + benign_samples, columns=["text", "label"])
df.to_csv("emails.csv", index=False)
print("Wrote emails.csv with", len(df), "rows")
PY

python make_dataset.py
Validation: `cat emails.csv` should show the `text,label` header followed by 10 data rows.
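
Beyond eyeballing the file, the same check can be scripted. The sketch below is one possible approach using only the standard library; `summarize` is a hypothetical helper, and it assumes the `text,label` header written by make_dataset.py.

```python
import csv
import io
from collections import Counter

def summarize(csv_text: str) -> dict:
    """Count rows, labels, and duplicate texts in a text,label CSV."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    labels = Counter(row["label"] for row in rows)
    texts = Counter(row["text"] for row in rows)
    duplicates = sum(n - 1 for n in texts.values() if n > 1)
    return {"rows": len(rows), "labels": dict(labels), "duplicates": duplicates}

if __name__ == "__main__":
    # Tiny inline sample so the sketch runs without any files present.
    sample = 'text,label\n"Verify at http://fake-bank.com",1\nTeam meeting notes,0\n'
    print(summarize(sample))
    # For the file from this step: summarize(open("emails.csv").read())
```

For the dataset generated above you would expect 10 rows, five per label, and no duplicates.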

Step 3) Train and evaluate the classifier

cat > train_and_eval.py <<'PY'
import json
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.metrics import classification_report, confusion_matrix

df = pd.read_csv("emails.csv")
X_train, X_test, y_train, y_test = train_test_split(df["text"], df["label"], test_size=0.3, random_state=42, stratify=df["label"])

pipeline = Pipeline([
    ("tfidf", TfidfVectorizer(ngram_range=(1, 2), min_df=1)),
    ("clf", LogisticRegression(max_iter=400, class_weight="balanced")),
])

pipeline.fit(X_train, y_train)
preds = pipeline.predict(X_test)

report = classification_report(y_test, preds, target_names=["benign", "phish"], digits=3, output_dict=True)
cm = confusion_matrix(y_test, preds, labels=[0, 1])

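# Save the pipeline configuration for reference. Estimator objects are not
# JSON-serializable, so default=str stores their string form instead.
with open("model.json", "w") as f:
    json.dump({"params": pipeline.get_params(deep=False)}, f, indent=2, default=str)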

print("Confusion matrix [[TN, FP], [FN, TP]]:", cm.tolist())
print("Precision/Recall/F1:", json.dumps(report, indent=2))
PY

python train_and_eval.py
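Validation: The script should print a confusion matrix and a per-class report, and write model.json. With a 70/30 split of 10 rows the test set holds only three emails, so a result such as `[[2,0],[0,1]]` is possible, but a single misclassification shifts the metrics sharply. If metrics are poor, add more samples or adjust `ngram_range`.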
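
To see how the printed matrix relates to the report, the phish-class metrics can be recomputed by hand. This is an illustrative sketch, not part of the tutorial scripts; `phish_metrics` is a hypothetical helper that assumes the `[[TN, FP], [FN, TP]]` layout printed above.

```python
def phish_metrics(cm):
    """Compute precision, recall, and F1 for the phish class.

    cm uses the layout printed by train_and_eval.py: [[TN, FP], [FN, TP]].
    """
    (tn, fp), (fn, tp) = cm
    # Guard against division by zero when nothing was predicted or labeled phish.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}

if __name__ == "__main__":
    print(phish_metrics([[2, 0], [0, 1]]))  # the sample matrix from this step
    print(phish_metrics([[1, 1], [0, 1]]))  # hypothetical: one benign email flagged
```

In the second, hypothetical matrix a single false positive halves precision, which is why a three-email test set says little about real-world performance.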

Common fixes:

  • ValueError: empty vocabulary => ensure emails.csv is not empty and min_df ≤ sample size.
  • If class imbalance arises, keep class_weight="balanced" or add more phishing examples.

Step 4) Add a simple scoring script with safety checks

cat > score_email.py <<'PY'
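import os
import sys

import joblib
from sklearn.pipeline import Pipeline

MODEL_PATH = "model.pkl"
THRESHOLD = 0.7  # scores above this go to human review, not automatic blocking

def load_model() -> Pipeline:
    # Only load pickles you created yourself; unpickling untrusted files can run code.
    if not os.path.exists(MODEL_PATH):
        sys.exit(f"{MODEL_PATH} not found; save the trained model first")
    return joblib.load(MODEL_PATH)

def main():
    # Safety check: refuse missing or blank input instead of scoring it.
    if len(sys.argv) < 2 or not sys.argv[1].strip():
        print("Usage: python score_email.py 'email text'")
        sys.exit(1)
    text = sys.argv[1]
    model = load_model()
    # Column 1 of predict_proba is the probability of label 1 (phish).
    proba = model.predict_proba([text])[0][1]
    print(f"phish_probability={proba:.3f}")
    if proba > THRESHOLD:
        print("Action: quarantine and send to human review")
    else:
        print("Action: deliver")

if __name__ == "__main__":
    main()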
PY

Save the trained model, fitted on the full dataset (joblib was already installed in Step 1):
python - <<'PY'
import joblib
from sklearn.pipeline import Pipeline
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

df = pd.read_csv("emails.csv")
pipe = Pipeline([
    ("tfidf", TfidfVectorizer(ngram_range=(1, 2), min_df=1)),
    ("clf", LogisticRegression(max_iter=400, class_weight="balanced")),
])
pipe.fit(df["text"], df["label"])
joblib.dump(pipe, "model.pkl")
print("Saved model.pkl")
PY

python score_email.py "Please reset your password at http://fake.com/reset"
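Validation: Output should include a `phish_probability=` line for the phishing-like sample. The exact value depends on the 10-row training set and your scikit-learn version, and it may fall below the 0.7 threshold; a score above 0.5 means the model leans toward phishing.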
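
The guardrails in this step can also be factored into small helpers that are easy to unit-test. The sketch below is one way to do it, assuming the 0.7 cut-off from score_email.py; `prepare_text`, `decide`, and the `MAX_CHARS` cap are hypothetical and not part of the tutorial scripts.

```python
MAX_CHARS = 20_000          # assumed input cap; tune for your mail sizes
QUARANTINE_THRESHOLD = 0.7  # same cut-off as score_email.py

def prepare_text(raw: str) -> str:
    """Reject empty input and truncate oversized input before scoring."""
    text = raw.strip()
    if not text:
        raise ValueError("empty email text")
    return text[:MAX_CHARS]

def decide(proba: float, threshold: float = QUARANTINE_THRESHOLD) -> str:
    """Map a phishing probability to an action; the model never deletes mail."""
    if not 0.0 <= proba <= 1.0:
        raise ValueError(f"probability out of range: {proba}")
    return "quarantine_for_review" if proba > threshold else "deliver"

if __name__ == "__main__":
    print(prepare_text("  Please reset your password  "))
    print(decide(0.82))
    print(decide(0.41))
```

Keeping the decision in one function makes it straightforward to lower the threshold later, or to add a third "warn the user" band, without touching the model code.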

Step 5) Add non-ML controls (defense in depth)

  • Enforce SPF/DKIM/DMARC on inbound mail; reject or quarantine failures.
  • Strip or rewrite links; sandbox attachments separately.
  • Log decisions and top contributing features for analyst review (use pipeline["tfidf"].get_feature_names_out() and model coefficients).
  • Rate-limit the scoring API to prevent request flooding and model-probing abuse.

Cleanup

deactivate || true
rm -rf .venv-phish emails.csv make_dataset.py train_and_eval.py score_email.py model.pkl model.json
Validation: `ls .venv-phish` should fail with “No such file or directory”.

Quick Reference

  • Use synthetic/redacted data; keep humans in the decision loop.
  • Validate with precision/recall; watch false positives before blocking.
  • Pair ML with email-auth controls and attachment/link sandboxing.
  • Keep models versioned (model.pkl) and log every scored message.
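
The logging advice in the last bullet could look like the sketch below: one JSON line per scored message, storing a digest instead of the raw text. `audit_record` and the `"v1"` version label are hypothetical; how you name model versions is up to you.

```python
import hashlib
import json
from datetime import datetime, timezone

def audit_record(text: str, proba: float, action: str, model_version: str) -> str:
    """Build a JSON log line that identifies the message without storing it."""
    return json.dumps({
        "ts": datetime.now(timezone.utc).isoformat(),
        "message_sha256": hashlib.sha256(text.encode("utf-8")).hexdigest(),
        "model_version": model_version,  # hypothetical label, e.g. a tag or file hash
        "phish_probability": round(proba, 3),
        "action": action,
    }, sort_keys=True)

if __name__ == "__main__":
    line = audit_record("Please reset your password", 0.81234,
                        "quarantine_for_review", "v1")
    print(line)
```

Appending these lines to a file gives analysts a trail that links each decision to a specific model version.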

FAQs

Can I use these labs in production?

No—treat them as educational. Adapt, review, and security-test before any production use.

What if I lack test data or infra?

Use synthetic data and local/lab environments. Never target networks or data you don't own or have written permission to test.

Can I share these materials?

Yes, with attribution and respecting any licensing for referenced tools or datasets.