Edge AI Deployment

Definition

Deploying ML models directly on user devices (phones, tablets, IoT hardware) so that eKYC processing runs in real time, preserves privacy, and remains available offline.


Deployment Targets

| Platform | Runtime | Model Format | GPU Access |
| --- | --- | --- | --- |
| iOS | CoreML | .mlmodel | Neural Engine, GPU |
| Android | TFLite / ONNX | .tflite / .onnx | NNAPI, GPU Delegate |
| Android | NCNN | .param + .bin | Vulkan GPU |
| Cross-platform | ONNX Runtime Mobile | .onnx | Platform-specific |
| Web | ONNX Runtime Web | .onnx | WebGL, WASM |

Conversion Pipeline

```mermaid
graph LR
    A[PyTorch Model] --> B[Export to ONNX]
    B --> C{Target Platform}
    C -->|iOS| D[CoreML Tools → .mlmodel]
    C -->|Android| E[TFLite Converter → .tflite]
    C -->|Cross-platform| F[ONNX Runtime → .onnx]
    C -->|Web| G[ONNX Web → WASM]
```

Key Takeaways

Summary

  • ONNX is the most portable format — convert once, deploy everywhere
  • CoreML provides best iOS performance via Neural Engine
  • Quantize to INT8 before mobile deployment — 2-4x faster, 4x smaller
  • Always benchmark on actual target devices — simulator performance differs from real hardware
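The INT8 takeaway follows from simple arithmetic: float32 weights cost 4 bytes each, int8 weights cost 1, hence the roughly 4x size reduction. A minimal sketch of symmetric affine quantization (pure Python, not any framework's implementation) shows the round trip and its bounded error:

```python
# Symmetric INT8 quantization sketch: map floats into [-127, 127] with a
# single scale factor, then recover approximate floats by multiplying back.
def quantize_int8(values):
    """Quantize a list of floats to int8 codes plus a shared scale."""
    scale = max(abs(v) for v in values) / 127.0 or 1.0
    q = [max(-127, min(127, round(v / scale))) for v in values]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float values from int8 codes."""
    return [qi * scale for qi in q]

weights = [0.52, -1.30, 0.07, 0.99]   # toy stand-in for model weights
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# Rounding error is bounded by one quantization step (the scale).
assert all(abs(a - b) <= scale for a, b in zip(weights, restored))
```

Production toolchains (e.g. ONNX Runtime's quantization utilities or TFLite's converter) add per-channel scales and calibration, but the core trade — 1 byte per weight in exchange for bounded rounding error — is exactly this.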