Skip to content

eKYC Encyclopedia

Model Drift & Retraining

Model Drift & Retraining¶

Definition¶

Model drift occurs when a production model's performance degrades over time due to changes in input data distribution, new attack types, or evolving user demographics.

Types of Drift¶

Type	Cause	Example in eKYC
Data drift	Input distribution changes	New phone models with different cameras
Concept drift	Relationship between input and label changes	New attack type not in training data
Label drift	Class distribution changes	Spoof ratio increases due to new tool availability
Upstream drift	Changes in preprocessing	SDK update changes image quality/format

Detection Methods¶

Method	How It Works
Score distribution monitoring	Track model output distribution — shift indicates drift
Performance monitoring	Track accuracy metrics (requires delayed ground truth)
Population Stability Index (PSI)	Compare current vs baseline feature distributions
Statistical tests	KS test, chi-squared test on feature/score distributions

Retraining Strategies¶

Strategy	When to Use
Scheduled	Retrain monthly/quarterly regardless
Trigger-based	Retrain when drift metric exceeds threshold
Continuous	Ongoing training on new data
Incremental	Fine-tune on new data only

Key Takeaways¶

Summary

Model drift is inevitable in eKYC — new phones, new attacks, new demographics
Score distribution monitoring is the fastest drift indicator (no labels needed)
Trigger-based retraining balances freshness with cost
Always validate retrained model on held-out test sets before deployment