Four ML components power this system. Each has a model selection rationale, a monitoring strategy, a drift detection mechanism, a retraining trigger, and a documented response to the hardest pushbacks a GCP architect or ML engineer would raise in a design review.
"Kal mujhe chutti chahiye, casual leave" mixes Hindi and English in a single utterance. Whisper handles this natively. (3) Vendor dependency for a cost-critical component.BaseLLM in LangGraph). Swapping to a self-hosted Llama 3.1 70B on a dedicated Cloud Run GPU instance at that scale is a config change, not a rewrite. The migration trigger is defined: when monthly LLM cost exceeds the cost of a dedicated L4 GPU instance (~$180/month).text-embedding-004, and stored in Supabase pgvector. Every Leave Agent query retrieves the top-3 relevant chunks and constructs the prompt context dynamically.active_policy_version updated in Firestore · old chunks moved to archive tablegemini-1.5-flash-002) rather than the latest alias. Version upgrades are deliberate, tested events — not automatic.text-embedding-004 and the index was built with an earlier version, cosine similarity scores become unreliable — the index is effectively stale even though the policy document hasn't changed.