Build a Simple AI-Based Phishing Detector (Beginner Tutorial)
Train and evaluate a lightweight phishing classifier end to end with synthetic email data, TF-IDF text features, validation after each step, and anti-spoofing safeguards.
What You’ll Build
- A small TF-IDF + Logistic Regression text classifier for phishing vs benign emails.
- Reproducible dataset generation to avoid leaking real PII.
- Validation after each step plus cleanup.
Prerequisites
- macOS or Linux with Python 3.12+.
- pip available; ~200 MB of free disk space.
- No email access needed; we generate synthetic samples.
Safety and Legal
- Never train on real mailbox data without explicit approval and PII scrubbing.
- Avoid storing raw emails; keep hashes or redacted text when possible (see the sketch after this list).
- Keep humans in the loop for blocking decisions; start with “quarantine + review.”
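If you do need to persist samples, a minimal redaction-and-hashing sketch is below. The regex patterns, placeholder tokens, and SHA-256 choice are illustrative assumptions, not a complete PII scrubber.
python - <<'PY'
# Minimal sketch: mask obvious addresses/links and keep only a hash of the raw body.
import hashlib
import re

def redact(text: str) -> str:
    text = re.sub(r"[\w.+-]+@[\w-]+\.[\w.-]+", "<EMAIL>", text)  # mask email addresses
    return re.sub(r"https?://\S+", "<URL>", text)                # mask links

def fingerprint(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()      # store a hash for dedup, not the raw body

raw = "Contact jane.doe@example.com via http://fake-bank.com/login"
print(fingerprint(raw)[:12], "|", redact(raw))
PY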
Step 1) Create an isolated environment
python3 -m venv .venv-phish
source .venv-phish/bin/activate
pip install --upgrade pip
pip install pandas scikit-learn joblib
Common fix: if activation fails, make sure you are sourcing the script (source .venv-phish/bin/activate) rather than executing it directly, and that the virtual environment was created with the python3 shown above.
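To validate this step, a quick import check (a simple sketch; the version numbers printed will differ on your machine) confirms the environment is ready:
python - <<'PY'
# Environment sanity check: every package installed in Step 1 should import cleanly.
import joblib, pandas, sklearn
print("pandas", pandas.__version__, "| scikit-learn", sklearn.__version__, "| joblib", joblib.__version__)
PY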
Step 2) Generate a synthetic labeled dataset
cat > make_dataset.py <<'PY'
import pandas as pd
phish_samples = [
    ("Your account is locked. Verify immediately at http://fake-bank.com", 1),
    ("Urgent: update payroll info now or your pay is delayed", 1),
    ("Security alert: login from unknown device. Download the attached form", 1),
    ("Package held: pay customs fee via gift card", 1),
    ("Congrats, you won a prize! Click to claim", 1),
]
benign_samples = [
    ("Team meeting notes and next sprint goals", 0),
    ("Invoice attached for approved purchase order", 0),
    ("Reminder: security training scheduled next week", 0),
    ("Quarterly newsletter and product updates", 0),
    ("Welcome to the platform—getting started guide", 0),
]
df = pd.DataFrame(phish_samples + benign_samples, columns=["text", "label"])
df.to_csv("emails.csv", index=False)
print("Wrote emails.csv with", len(df), "rows")
PY
python make_dataset.py
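To validate this step before training, a quick sanity check on emails.csv (a sketch assuming the schema produced above) could be:
python - <<'PY'
# Sanity check for Step 2: expected columns, valid labels, and non-empty text.
import pandas as pd

df = pd.read_csv("emails.csv")
assert set(df.columns) == {"text", "label"}, "unexpected columns"
assert df["label"].isin([0, 1]).all(), "labels must be 0 (benign) or 1 (phish)"
assert df["text"].str.strip().str.len().gt(0).all(), "found empty email text"
print(df["label"].value_counts())  # expect 5 rows of each label
PY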
Step 3) Train and evaluate the classifier
cat > train_and_eval.py <<'PY'
import json
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.metrics import classification_report, confusion_matrix
df = pd.read_csv("emails.csv")
X_train, X_test, y_train, y_test = train_test_split(df["text"], df["label"], test_size=0.3, random_state=42, stratify=df["label"])
pipeline = Pipeline([
    ("tfidf", TfidfVectorizer(ngram_range=(1, 2), min_df=1)),
    ("clf", LogisticRegression(max_iter=400, class_weight="balanced")),
])
pipeline.fit(X_train, y_train)
preds = pipeline.predict(X_test)
report = classification_report(y_test, preds, target_names=["benign", "phish"], digits=3, output_dict=True)
cm = confusion_matrix(y_test, preds, labels=[0, 1])
with open("model.json", "w") as f:
json.dump({"params": pipeline.get_params(deep=False)}, f, indent=2)
print("Confusion matrix [[TN, FP], [FN, TP]]:", cm.tolist())
print("Precision/Recall/F1:", json.dumps(report, indent=2))
PY
python train_and_eval.py
Common fixes:
- ValueError: empty vocabulary => ensure emails.csv is not empty and min_df ≤ sample size.
- If class imbalance arises, keep class_weight="balanced" or add more phishing examples.
Step 4) Add a simple scoring script with safety checks
cat > score_email.py <<'PY'
import sys
import joblib
import pandas as pd
from sklearn.pipeline import Pipeline
MODEL_PATH = "model.pkl"
def load_model():
    return joblib.load(MODEL_PATH)

def main():
    if len(sys.argv) < 2:
        print("Usage: python score_email.py 'email text'")
        sys.exit(1)
    text = sys.argv[1]
    model: Pipeline = load_model()
    proba = model.predict_proba([text])[0][1]
    print(f"phish_probability={proba:.3f}")
    if proba > 0.7:
        print("Action: quarantine and send to human review")

if __name__ == "__main__":
    main()
PY
score_email.py loads a saved model, so first fit the pipeline on the full dataset and persist it as model.pkl with joblib, then score a sample email:
pip install joblib
python - <<'PY'
import joblib
from sklearn.pipeline import Pipeline
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
df = pd.read_csv("emails.csv")
pipe = Pipeline([
    ("tfidf", TfidfVectorizer(ngram_range=(1, 2), min_df=1)),
    ("clf", LogisticRegression(max_iter=400, class_weight="balanced")),
])
pipe.fit(df["text"], df["label"])
joblib.dump(pipe, "model.pkl")
print("Saved model.pkl")
PY
python score_email.py "Please reset your password at http://fake.com/reset"
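To score several messages at once, a small batch sketch is below; the 0.7 threshold mirrors score_email.py and is an assumption to tune against your own false-positive tolerance.
python - <<'PY'
# Batch-scoring sketch: route high-probability messages to quarantine for human review.
import joblib

model = joblib.load("model.pkl")
emails = [
    "Lunch menu for the all-hands next Friday",
    "Your mailbox is full. Re-validate your password here: http://fake.com/verify",
]
for text, proba in zip(emails, model.predict_proba(emails)[:, 1]):
    action = "quarantine + review" if proba > 0.7 else "deliver"
    print(f"{proba:.3f}  {action:<20}  {text[:48]}")
PY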
Step 5) Add non-ML controls (defense in depth)
- Enforce SPF/DKIM/DMARC on inbound mail; reject or quarantine failures.
- Strip or rewrite links; sandbox attachments separately.
- Log decisions and the top contributing features for analyst review (use pipeline["tfidf"].get_feature_names_out() and the model coefficients; a short sketch follows this list).
- Rate-limit the scoring API to prevent request flooding or model abuse.
Cleanup
deactivate || true
rm -rf .venv-phish emails.csv make_dataset.py train_and_eval.py score_email.py model.pkl model.json
Quick Reference
- Use synthetic/redacted data; keep humans in the decision loop.
- Validate with precision/recall; watch false positives before blocking.
- Pair ML with email-auth controls and attachment/link sandboxing.
- Keep models versioned (model.pkl) and log every scored message.