Pipeline architecture
Requests flow through an async queue backed by Redis, which buffers burst traffic so that it does not hit the translation backends directly. Each job records its source language, target locales, and completion telemetry for later analytics.
- Extensible worker adapters for HuggingFace, Ollama, and custom inference endpoints.
- Automatic retries with exponential backoff and configurable TTLs.
- Per-locale quality thresholds with optional human review hooks.
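
The retry behavior listed above can be sketched roughly as follows. This is a minimal illustration, not the project's actual implementation: `TranslationJob`, `backoff_delay`, `run_with_retries`, and all default values (base delay, cap, attempt count) are hypothetical names and numbers chosen for the example. The Redis queue and TTL handling are left out so the sketch runs with the standard library only.

```python
import random
import time
from dataclasses import dataclass, field


@dataclass
class TranslationJob:
    # Hypothetical job record; field names are illustrative, not the project's schema.
    source_lang: str
    target_locales: list
    attempts: int = 0
    telemetry: dict = field(default_factory=dict)


def backoff_delay(attempt, base=0.5, cap=30.0, jitter=True):
    """Seconds to wait before retrying after failed attempt number `attempt` (0-based)."""
    # The delay doubles with each attempt and is clamped to `cap`.
    delay = min(cap, base * (2 ** attempt))
    if jitter:
        # Randomize within [0, delay] so retrying workers do not wake in lockstep.
        delay = random.uniform(0, delay)
    return delay


def run_with_retries(job, translate, max_attempts=4, sleep=time.sleep, jitter=True):
    """Call translate(job) until it succeeds or max_attempts is reached."""
    last_error = None
    for attempt in range(max_attempts):
        job.attempts = attempt + 1
        try:
            result = translate(job)
            job.telemetry["status"] = "done"
            return result
        except Exception as exc:  # a real worker would catch narrower error types
            last_error = exc
            # Only sleep if another attempt will follow.
            if attempt < max_attempts - 1:
                sleep(backoff_delay(attempt, jitter=jitter))
    job.telemetry["status"] = "failed"
    raise RuntimeError(f"job failed after {max_attempts} attempts") from last_error
```

Here `translate` stands in for any worker adapter: a callable that takes the job and either returns translations or raises. Passing `sleep` as a parameter is a design choice made for this sketch so that the delays can be observed or skipped in tests.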