Model Metrics · v5 Production
PERFORMANCE
Calibrated Logistic Regression · 46 360 messages · 8 026 features · threshold = 0.47
Accuracy: 97.39% (+16.39 pp vs v1 baseline)
F1 Score: 97.30% (+13.3 pp vs v1 baseline)
AUC-ROC: 99.58% (+10.88 pp vs v1 baseline)
Precision: 97.47% (+2.47 pp vs target)
Recall: 97.12% (+2.12 pp vs target)
Scam Types: 17 (+4 vs v4)
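The calibrated model operates at a fixed threshold of 0.47. A minimal sketch of applying that operating point to calibrated scores — `verdict` and the sample probabilities are illustrative, not the production code:

```python
import numpy as np

THRESHOLD = 0.47  # production operating point from the dashboard

def verdict(calibrated_scores, threshold=THRESHOLD):
    """Map calibrated scam probabilities to binary labels (1 = scam)."""
    scores = np.asarray(calibrated_scores)
    return (scores >= threshold).astype(int)

# Hypothetical calibrated probabilities for three messages
print(verdict([0.03, 0.46, 0.91]))  # → [0 0 1]
```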
Signal Analysis · Live Variance
MODEL SIGNALS
Confidence separation, precision–recall tradeoff, and training convergence over time
Confidence Separation
Score deviation from decision threshold (t = 0.47)
Precision–Recall Balance
P − R gap across threshold sweep (zero = balanced)
Training Convergence
Accuracy gain Δ% per gradient step
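The first two signals above can be computed directly from calibrated scores and labels. This is a sketch under the stated definitions (deviation from t = 0.47; P − R gap at an operating point); the function names are ours, not the dashboard's:

```python
import numpy as np

THRESHOLD = 0.47

def confidence_separation(probs, threshold=THRESHOLD):
    """Mean absolute deviation of calibrated scores from the threshold."""
    return float(np.mean(np.abs(np.asarray(probs) - threshold)))

def precision_recall_gap(y_true, probs, threshold=THRESHOLD):
    """P - R at a given operating point; zero means balanced."""
    y_true = np.asarray(y_true)
    y_pred = (np.asarray(probs) >= threshold).astype(int)
    tp = int(np.sum((y_pred == 1) & (y_true == 1)))
    fp = int(np.sum((y_pred == 1) & (y_true == 0)))
    fn = int(np.sum((y_pred == 0) & (y_true == 1)))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision - recall
```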
Prediction Quality · Threshold Analysis
DIAGNOSTIC CURVES
How model behaviour shifts across operating points and confidence levels
Precision vs Recall
As threshold rises — precision climbs, recall drops
Confidence Distribution
% of messages per score bucket — scam vs legit
F1 Score by Channel
Detection quality across email, URL, SMS, Reddit
Dashed line = overall accuracy · All channels ≥ 99%
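The two curve families above — precision/recall across a threshold sweep, and the score-bucket histogram — can be sketched with plain NumPy; the helpers here are illustrative stand-ins for whatever the dashboard computes:

```python
import numpy as np

def threshold_sweep(y_true, probs, thresholds):
    """Precision and recall at each candidate threshold."""
    y_true = np.asarray(y_true)
    probs = np.asarray(probs)
    rows = []
    for t in thresholds:
        pred = probs >= t
        tp = np.sum(pred & (y_true == 1))
        fp = np.sum(pred & (y_true == 0))
        fn = np.sum(~pred & (y_true == 1))
        p = tp / (tp + fp) if tp + fp else 1.0
        r = tp / (tp + fn) if tp + fn else 0.0
        rows.append((float(t), float(p), float(r)))
    return rows

def score_buckets(probs, n_buckets=10):
    """% of messages per score bucket (confidence distribution)."""
    counts, _ = np.histogram(np.asarray(probs), bins=n_buckets, range=(0.0, 1.0))
    return 100.0 * counts / counts.sum()
```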
Classifier Quality · Test Set
ROC & CONFUSION
How well the model separates scam from legitimate messages at every threshold
ROC Curve
AUC = 0.9958 · Near-perfect separation
Confusion Matrix
Predictions on held-out test set (9 272 messages)
| | Predicted: Legit | Predicted: Scam |
|---|---|---|
| Actual: Legit | 4,675 · True Negative · 50.4% | 112 · False Positive · 1.2% |
| Actual: Scam | 129 · False Negative · 1.4% | 4,356 · True Positive · 47.0% |
Test set · 9,272 messages · threshold = 0.47
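The headline metrics follow arithmetically from the four confusion-matrix counts. Recomputing them reproduces the reported accuracy, recall, and F1 to within rounding (the dashboard's precision may be averaged slightly differently):

```python
# Held-out test set counts from the confusion matrix above
TN, FP, FN, TP = 4675, 112, 129, 4356

total = TN + FP + FN + TP          # 9,272 messages
accuracy = (TP + TN) / total       # ≈ 0.9740
precision = TP / (TP + FP)
recall = TP / (TP + FN)            # ≈ 0.9712
f1 = 2 * precision * recall / (precision + recall)

print(f"accuracy={accuracy:.2%} precision={precision:.2%} "
      f"recall={recall:.2%} f1={f1:.2%}")
```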
Classifier Benchmarking
MODEL COMPARISON
Three classifiers trained on the same feature set — Logistic Regression selected for production
| Model | Accuracy | Precision | Recall | F1 | AUC-ROC |
|---|---|---|---|---|---|
| Logistic Regression (PROD) | 97.39% | 97.47% | 97.12% | 97.30% | 99.58% |
| Random Forest | 97.09% | 97.01% | 96.97% | 96.99% | 99.32% |
| Decision Tree | 95.91% | 95.91% | 95.63% | 95.77% | 95.90% |
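A minimal sketch of the benchmarking setup: the same three scikit-learn classifier families fitted on one shared feature matrix and compared on held-out accuracy. The real model uses 8 026 features; a small synthetic dataset stands in here:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the 46 360-message feature matrix
X, y = make_classification(n_samples=2000, n_features=40, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Random Forest": RandomForestClassifier(random_state=0),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
}
scores = {name: accuracy_score(y_te, m.fit(X_tr, y_tr).predict(X_te))
          for name, m in models.items()}
```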
Channel Breakdown
PER-CHANNEL ACCURACY
Detection quality across the four communication channels in the dataset
Accuracy · Precision · Recall · F1 — by Channel
Variable Relationships
FEATURES & DATA
Which signals drive decisions and where the training data comes from
Feature Importance — Top 10
Relative weight of numerical features in the production model
Training Dataset Composition
46 360 messages across 8 data sources — scam vs legit split
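For a linear model, feature importance is typically read off the coefficient magnitudes. A sketch of ranking the top-k weights — the feature names and values below are hypothetical, chosen to mirror the pipeline's signal families:

```python
import numpy as np

def top_features(coef, feature_names, k=10):
    """Rank features by absolute logistic-regression weight."""
    coef = np.asarray(coef).ravel()
    order = np.argsort(-np.abs(coef))[:k]
    return [(feature_names[i], float(coef[i])) for i in order]

# Hypothetical weights for illustration
names = ["urgency_score", "url_risk", "phrase_hits", "faiss_sim", "len_chars"]
weights = [2.1, -0.3, 3.4, 1.8, 0.05]
print(top_features(weights, names, k=3))
# → [('phrase_hits', 3.4), ('urgency_score', 2.1), ('faiss_sim', 1.8)]
```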
Iterative Improvement · v1 → v5
MODEL EVOLUTION
How each pipeline upgrade compounded into a 16.4pp accuracy gain over the baseline
Accuracy & AUC-ROC Progression
Each version adds a new feature tier to the previous one
Coverage · 17 Scam Categories
SCAM TYPE DETECTION
Rule-based type classifier with regex patterns across all known scam vectors
Detection Confidence by Scam Type
Estimated detection rate (%) per category
Coverage Table
All 17 scam types with channel and detection rates
| Scam Type | Channel | Detection |
|---|---|---|
| Phishing | Email / URL | 98% |
| Credential Phishing | | 97% |
| Prize Fraud | SMS / Email | 99% |
| Bank Impersonation | SMS / Email | 97% |
| Job Scam | Email / SMS | 96% |
| Investment Scam | SMS / Email | 98% |
| Romance Scam | SMS | 95% |
| Advance Fee | | 98% |
| Delivery Scam | SMS | 99% |
| Social Media | SMS / Email | 97% |
| Emergency Scam | SMS | 98% |
| Threat Scam | | 97% |
| Pig Butchering (NEW) | SMS | 95% |
| QR Phishing (NEW) | SMS | 98% |
| Refund Scam (NEW) | Email / SMS | 98% |
| SIM Swap (NEW) | SMS | 98% |
| General Spam | All | 89% |
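The type classifier is rule-based: regex patterns per category, falling through to General Spam. A sketch with three illustrative patterns — the production classifier covers all 17 types with far richer regexes:

```python
import re

# Illustrative patterns only; not the production rule set
SCAM_PATTERNS = {
    "Prize Fraud": re.compile(r"\b(you(?:'ve| have) won|claim your prize)\b", re.I),
    "Delivery Scam": re.compile(r"\b(package|parcel) (?:is )?(?:held|pending)\b", re.I),
    "Bank Impersonation": re.compile(r"\baccount (?:has been )?(?:suspended|locked)\b", re.I),
}

def classify_scam_type(text):
    """Return the first matching scam type, else 'General Spam'."""
    for scam_type, pattern in SCAM_PATTERNS.items():
        if pattern.search(text):
            return scam_type
    return "General Spam"
```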
System Architecture
HOW IT WORKS
9-stage inference pipeline from raw text to calibrated verdict
1. Preprocess: Unicode · emoji · HTML · l33t
2. Tone Score: Urgency · Fear · Reward · Threat
3. URL Check: TLD · keywords · IP · lookalike
4. Phrase Match: 217 scam phrases (exact)
5. TF-IDF: 5 000 word + 3 000 char n-grams
6. FAISS: k=10 scam vector proximity
7. Inference: LR · 8 026 features
8. Calibrate: Isotonic regression
9. Verdict: SCAM / SUSPICIOUS / LEGIT
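The stages above can be sketched end to end as a single scoring function. Everything here is a toy stand-in: the phrase list, tone cues, scoring weights, and the SUSPICIOUS band are invented for illustration, and stages 5-8 (TF-IDF, FAISS, LR, isotonic calibration) are collapsed into one hand-rolled score:

```python
import re
import unicodedata

THRESHOLD = 0.47
SCAM_PHRASES = {"verify your account", "claim your prize"}  # real list: 217 phrases

def preprocess(text):
    """Stage 1: normalise Unicode and undo common l33t substitutions."""
    text = unicodedata.normalize("NFKC", text).lower()
    return text.translate(str.maketrans("013$", "oles"))

def tone_score(text):
    """Stage 2: crude urgency/reward signal from keyword hits."""
    cues = ("urgent", "now", "immediately", "free", "winner")
    return sum(word in text for word in cues) / len(cues)

def phrase_hits(text):
    """Stage 4: exact scam-phrase matches."""
    return sum(phrase in text for phrase in SCAM_PHRASES)

def score_message(text):
    """Toy stand-in for stages 5-8 (TF-IDF, FAISS, LR, calibration)."""
    t = preprocess(text)
    return 0.5 * tone_score(t) + 0.5 * min(phrase_hits(t), 1)

def verdict(text, threshold=THRESHOLD):
    """Stage 9: map the score to a verdict (SUSPICIOUS band is hypothetical)."""
    p = score_message(text)
    if p >= threshold:
        return "SCAM"
    if p >= threshold / 2:
        return "SUSPICIOUS"
    return "LEGIT"
```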