Platform & Infrastructure — The Autonomous HR

01 — GCP reference architecture

All services.
One region. asia-south1.

All GCP compute, storage, and managed services are deployed in asia-south1 (Mumbai), both for data residency compliance with India's DPDP Act 2023 and for low latency to the Indian workforce. External services (Meta WhatsApp API, Supabase, Exotel) are reached over HTTPS; no VPN peering is required.

External channels (outside GCP)

Meta WhatsApp: Cloud API · HTTPS webhooks →
Exotel / Plivo: India SIP · IVR gateway →
Supabase: pgvector · AWS ap-south-1

GCP Project: autohr-prod · asia-south1 (Mumbai)

Cloud Run · webhook-gateway: Inbound WA + IVR webhooks · HTTPS · signature-verified

→ Cloud Pub/Sub: INBOUND_MSG topic · event fabric

Cloud Run · stt-service: Whisper large-v3 · T4 GPU spot · min-instances: 1
Cloud Run · nllb-service: NLLB-200 3.3B · CPU · bundled with STT
→ Pub/Sub · STT_RESULT: transcript + language + confidence

Cloud Run · intent-router: Gemini Flash · stateless · scale-to-zero
→ Cloud Run · leave-agent: LangGraph · stateful · Firestore sessions
→ Cloud Run · hitl-manager: HITL orchestration · webhook listener

Firestore: employees · leave_requests · audit_log · sessions
Cloud Storage: policy PDFs · audio backups · model artifacts
Artifact Registry: container images · Whisper versions
Secret Manager: WA token · Exotel key · Supabase URL
Cloud Monitoring: alerting · dashboards · uptime checks
Cloud Build: CI/CD · container builds · tf plan/apply
Vertex AI: Gemini Flash · text-embedding-004

02 — Service catalogue

Every service.
Every configuration decision.

Twelve services. Each is deployed with a specific configuration chosen for cost, latency, or compliance reasons. Nothing is left at its default; every parameter is a deliberate decision.

Cloud Run
webhook-gateway

GCP managed

Ingress · Channel normalisation

Receives inbound WhatsApp webhooks and IVR audio streams. Validates Meta signature header. Publishes normalised event to Pub/Sub.

min-instances: 1 (always warm — inbound messages cannot wait)
max-instances: 10
cpu: 1 vCPU · memory: 512Mi
ingress: all (public HTTPS endpoint)
auth: unauthenticated invocations allowed (external webhook callers cannot present Google IAM tokens) + Meta signature verification on every request

Cloud Run
stt-service

OSS on GCP

Speech-to-text · Whisper large-v3

Transcribes audio. Detects language. Returns confidence score. GPU T4 spot — 83% cheaper than on-demand. Fallback to Google STT on spot unavailability.

min-instances: 1 warm (eliminates cold-start on calls)
gpu: nvidia-tesla-t4 · spot
cpu: 4 vCPU · memory: 16Gi
timeout: 30s per request
ingress: internal-only (Pub/Sub push subscription)

Cloud Run
intent-router

Gemini Flash

Orchestrator · Intent classification

Stateless. Classifies intent, extracts entities, routes to specialist agent via Pub/Sub. Fast path — must complete in under 1 second.

min-instances: 0 (scale-to-zero · warm in 800ms)
max-instances: 20
cpu: 1 vCPU · memory: 512Mi
concurrency: 80 requests per instance
ingress: internal-only
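
The parameters listed for intent-router could be expressed in Terraform roughly as follows. This is a sketch, not the repository's module: the service account reference `intent_router_sa` and the image path are assumptions modelled on the naming used elsewhere in this document.

```hcl
# Sketch only. intent_router_sa and the image path are assumed names.
resource "google_cloud_run_v2_service" "intent_router" {
  name     = "intent-router"
  location = var.region
  ingress  = "INGRESS_TRAFFIC_INTERNAL_ONLY"

  template {
    service_account = google_service_account.intent_router_sa.email

    # Up to 80 concurrent requests share one instance
    max_instance_request_concurrency = 80

    scaling {
      min_instance_count = 0 # scale-to-zero
      max_instance_count = 20
    }

    containers {
      image = "${var.region}-docker.pkg.dev/${var.project_id}/autohr-services/intent-router:${var.image_tag}"

      resources {
        limits = {
          cpu    = "1"
          memory = "512Mi"
        }
      }
    }
  }
}
```

High concurrency suits this service because each request spends most of its time waiting on the model call rather than using CPU.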

Cloud Run
leave-agent

LangGraph

Specialist agent · Leave lifecycle

Stateful LangGraph agent. Session state persisted in Firestore. Handles leave application, balance query, approval, denial, HITL escalation. Longest-running agent — average 4s per interaction.

min-instances: 0 (scale-to-zero)
max-instances: 10
cpu: 2 vCPU · memory: 1Gi
timeout: 60s (covers full state machine execution)
ingress: internal-only

Firestore
(Native mode)

GCP managed

Primary database · All entities

Stores employee records, leave ledger, HITL queue, sessions, and audit log. Append-only security rules on /audit_log — no service account holds update/delete permissions on this collection.

mode: Native (not Datastore mode)
location: asia-south1 (single-region · DPDP compliant)
backup: daily managed backup · 7-day retention
security rules: audit_log collection append-only
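
The database mode, location, and backup settings above map onto two Terraform resources. The following is a minimal sketch; the resource labels `default` and `daily` are illustrative choices, not names taken from the repository.

```hcl
# Sketch only. Resource labels are illustrative.
resource "google_firestore_database" "default" {
  project     = var.project_id
  name        = "(default)"
  location_id = "asia-south1"
  type        = "FIRESTORE_NATIVE" # Native mode, not Datastore mode
}

resource "google_firestore_backup_schedule" "daily" {
  project   = var.project_id
  database  = google_firestore_database.default.name
  retention = "604800s" # 7 days

  daily_recurrence {}
}
```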

Cloud Pub/Sub

GCP managed

Event bus · Service decoupling

Five topics: INBOUND_MSG, STT_RESULT, AGENT_EVENT, HITL_ESCALATION, NOTIFICATION_OUT. All subscriptions use push delivery to Cloud Run endpoints. Dead-letter topics on all critical subscriptions.

delivery: push (not pull · lower latency)
ack deadline: 60s (covers worst-case agent execution)
dead-letter: enabled · max delivery attempts: 5
retention: 7 days (for replay on failures)

Secret Manager

GCP managed

Credentials · Zero secrets in code

Stores all credentials: WhatsApp API token, Exotel API key, Supabase connection string, Gemini API key. Each Cloud Run service accesses only the secrets it needs via its service account IAM binding.

rotation: automated 90-day rotation on WA token
access: per-service-account binding · principle of least privilege
audit: all accesses logged to Cloud Audit Logs
versions: previous version retained 30 days on rotation
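
A rotation schedule in Secret Manager does not change the secret value by itself: it publishes a rotation notification to a Pub/Sub topic, and a subscriber is expected to obtain a new token and add it as a new secret version. A sketch of the 90-day schedule under that assumption follows. The secret ID, topic name, and the `wa_token_next_rotation` variable are hypothetical.

```hcl
# Sketch only. Secret ID, topic name, and variable are hypothetical.
variable "wa_token_next_rotation" {
  type        = string
  description = "RFC 3339 timestamp of the first scheduled rotation"
}

resource "google_pubsub_topic" "secret_rotation" {
  name = "autohr-secret-rotation"
}

resource "google_secret_manager_secret" "wa_token" {
  secret_id = "wa-token"

  # Keep the secret payload in the Mumbai region only
  replication {
    user_managed {
      replicas {
        location = "asia-south1"
      }
    }
  }

  # Rotation notifications are published to this topic
  topics {
    name = google_pubsub_topic.secret_rotation.id
  }

  rotation {
    rotation_period    = "7776000s" # 90 days
    next_rotation_time = var.wa_token_next_rotation
  }
}
```

The Secret Manager service agent also needs permission to publish to the topic, and the subscriber that performs the actual token refresh is outside the scope of this sketch.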

Vertex AI
Gemini Flash

GCP managed

LLM inference · Intent + reasoning

Accessed via Vertex AI SDK from Leave Agent and Intent Router. Model version pinned — not using 'latest' alias. Region: asia-south1 for data residency. Quota: 60 QPM on Gemini Flash adequate for SMB scale.

model: gemini-1.5-flash-002 (pinned)
region: asia-south1 (data residency)
quota: 60 QPM · 1M TPM (adequate for ≤ 500 emp)
safety: harm_block_threshold: BLOCK_NONE (HR context)

Cloud Storage

GCP managed

Object storage · PDFs · artifacts

Three buckets: policy-documents (HR PDFs, versioned), audio-archive (voice recordings, 90-day TTL), model-artifacts (Whisper container layers). All buckets: asia-south1, uniform bucket-level access.

policy-documents: versioning enabled · lifecycle: archive after 1yr
audio-archive: TTL 90 days (DPDP minimum retention)
model-artifacts: versioning enabled · no TTL
access: uniform bucket-level (no per-object ACLs)
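
The retention and versioning settings for two of the buckets could look like this in Terraform. The bucket names are assumptions (bucket names must be globally unique, so a project prefix is used here for illustration).

```hcl
# Sketch only. Bucket names are assumed.
resource "google_storage_bucket" "audio_archive" {
  name                        = "${var.project_id}-audio-archive"
  location                    = "asia-south1"
  uniform_bucket_level_access = true

  # Delete voice recordings 90 days after creation
  lifecycle_rule {
    condition {
      age = 90
    }
    action {
      type = "Delete"
    }
  }
}

resource "google_storage_bucket" "policy_documents" {
  name                        = "${var.project_id}-policy-documents"
  location                    = "asia-south1"
  uniform_bucket_level_access = true

  versioning {
    enabled = true
  }

  # Move policy PDFs to the Archive storage class after one year
  lifecycle_rule {
    condition {
      age = 365
    }
    action {
      type          = "SetStorageClass"
      storage_class = "ARCHIVE"
    }
  }
}
```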

Artifact Registry

GCP managed

Container registry · Image versioning

All Cloud Run container images stored here. Two repositories: autohr-services (application containers) and autohr-models (Whisper fine-tuned images with model weights baked in). Vulnerability scanning enabled on push.

format: Docker
location: asia-south1
scanning: automated vulnerability scan on push
retention: keep last 10 versions per service · auto-clean older
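
One way to express the "keep last 10 versions" rule is a pair of cleanup policies on the repository: a keep policy for the most recent versions and a delete policy for everything else. This is a sketch under that assumption; the policy IDs are illustrative.

```hcl
# Sketch only. Policy IDs are illustrative.
resource "google_artifact_registry_repository" "services" {
  location      = "asia-south1"
  repository_id = "autohr-services"
  format        = "DOCKER"

  # Keep the 10 most recent versions of each image
  cleanup_policies {
    id     = "keep-last-10"
    action = "KEEP"

    most_recent_versions {
      keep_count = 10
    }
  }

  # Delete anything not protected by the keep policy
  cleanup_policies {
    id     = "delete-older"
    action = "DELETE"

    condition {
      tag_state = "ANY"
    }
  }
}
```

Keep policies take precedence over delete policies, so the ten newest versions survive even though the delete condition matches every image.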

Supabase
(pgvector)

External managed

Vector store · Policy RAG

Postgres + pgvector extension. Hosts HR Policy PDF embeddings. AWS ap-south-1 (Mumbai) — same Indian jurisdiction as GCP asia-south1. Connection string stored in Secret Manager. HNSW index for sub-10ms retrieval.

tier: Free (500MB · adequate for SMB policy docs)
region: aws ap-south-1 (IN jurisdiction · DPDP compliant)
index: HNSW · ef_construction: 128 · m: 16
migration trigger: > 400MB used → Supabase Pro ($25/mo)

Cloud Build
+ Cloud Deploy

GCP managed

CI/CD · Terraform pipeline

Cloud Build runs on every push to main: lint, test, container build, push to Artifact Registry. Cloud Deploy manages progressive delivery to staging then production. Terraform plan/apply runs in Cloud Build with remote state in GCS.

triggers: push to main · PR to main (plan only)
tf state: GCS bucket · versioned · state locking
approval: manual gate between staging → production
rollback: Cloud Run revision rollback · < 30s
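
Remote state in GCS is configured through a backend block. A minimal sketch follows; the bucket name and prefix are hypothetical, and the bucket must exist (with object versioning enabled) before `terraform init` runs.

```hcl
# Sketch only. Bucket name and prefix are hypothetical.
terraform {
  backend "gcs" {
    bucket = "autohr-prod-tf-state"
    prefix = "env/prod"
  }
}
```

The gcs backend handles state locking itself, so no separate lock table is needed. Backend blocks cannot reference variables, which is why the values are literal.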

03 — Terraform IaC

Infrastructure as code.
No console. No exceptions.

Every GCP resource in this system is defined in Terraform. The repo contains a complete /terraform directory that can provision the entire infrastructure from a cold GCP project. Select modules shown below — full source in the GitHub repo.

Cloud Run — Leave Agent terraform/modules/cloud_run/leave_agent.tf

resource "google_cloud_run_v2_service" "leave_agent" {
  name     = "leave-agent"
  location = var.region # asia-south1
  ingress  = "INGRESS_TRAFFIC_INTERNAL_ONLY"

  template {
    service_account = google_service_account.leave_agent_sa.email

    scaling {
      min_instance_count = 0
      max_instance_count = 10
    }

    containers {
      image = "${var.region}-docker.pkg.dev/${var.project_id}/autohr-services/leave-agent:${var.image_tag}"

      resources {
        limits = {
          cpu    = "2"
          memory = "1Gi"
        }
      }

      # Secrets from Secret Manager: no env var credentials
      env {
        name = "GEMINI_API_KEY"
        value_source {
          secret_key_ref {
            secret  = google_secret_manager_secret.gemini_key.secret_id
            version = "latest"
          }
        }
      }

      env {
        name  = "GEMINI_MODEL"
        value = "gemini-1.5-flash-002" # pinned, not latest
      }

      env {
        name  = "RAG_CONFIDENCE_THRESHOLD"
        value = "0.80"
      }

      startup_probe {
        http_get {
          path = "/health"
        }
        initial_delay_seconds = 10
        failure_threshold     = 3
      }
    }
  }
}

Firestore Security Rules terraform/modules/firestore/security_rules.tf

# Firestore security rules: audit_log is append-only.
# The rules are not stored as a Firestore document. The rules file
# (firestore.rules) is deployed via firebase-tools in CI/CD.
#
# Note: security rules are evaluated for requests made through the
# web/mobile client SDKs. Server client libraries that authenticate
# with IAM service-account credentials are not subject to them, so
# service-account access to audit_log must also be restricted via IAM.

/*
rules_version = '2';
service cloud.firestore {
  match /databases/{database}/documents {

    // Audit log: authenticated callers may CREATE and READ only
    // UPDATE and DELETE are denied for ALL identities
    match /audit_log/{entry} {
      allow create: if request.auth != null;
      allow read:   if request.auth != null;
      allow update: if false; // immutable
      allow delete: if false; // immutable
    }

    // Leave requests: agents may create and read
    // Workers may read their own records only
    match /leave_requests/{id} {
      allow create: if request.auth != null;
      allow read:   if request.auth.uid == resource.data.employee_id
                    || request.auth.token.role == "agent";
      allow update: if request.auth.token.role == "agent";
      allow delete: if false;
    }

    // Sessions: 24h TTL enforced via scheduled Cloud Function
    match /sessions/{id} {
      allow read, write: if request.auth.token.role == "agent";
    }
  }
}
*/

IAM — Service Account bindings terraform/modules/iam/service_accounts.tf

# Each service gets its own SA with minimum permissions
resource "google_service_account" "leave_agent_sa" {
  account_id   = "leave-agent-sa"
  display_name = "Leave Agent Service Account"
}

# Firestore: read + write (not admin)
resource "google_project_iam_member" "leave_agent_firestore" {
  project = var.project_id
  role    = "roles/datastore.user"
  member  = "serviceAccount:${google_service_account.leave_agent_sa.email}"
}

# Secret Manager: access to leave-agent secrets only
resource "google_secret_manager_secret_iam_member" "leave_agent_gemini" {
  secret_id = google_secret_manager_secret.gemini_key.id
  role      = "roles/secretmanager.secretAccessor"
  member    = "serviceAccount:${google_service_account.leave_agent_sa.email}"
}

# Pub/Sub: publish to AGENT_EVENT topic only
resource "google_pubsub_topic_iam_member" "leave_agent_pubsub" {
  topic  = google_pubsub_topic.agent_event.name
  role   = "roles/pubsub.publisher"
  member = "serviceAccount:${google_service_account.leave_agent_sa.email}"
}

# Vertex AI: invoke Gemini models only
resource "google_project_iam_member" "leave_agent_vertex" {
  project = var.project_id
  role    = "roles/aiplatform.user"
  member  = "serviceAccount:${google_service_account.leave_agent_sa.email}"
}

# NOTE: STT service account does NOT get Vertex AI role
# Pub/Sub service account does NOT get Firestore role
# No cross-domain access: enforced at IAM layer

Cloud Run — STT Service (GPU) terraform/modules/cloud_run/stt_service.tf

resource "google_cloud_run_v2_service" "stt_service" {
  name     = "stt-service"
  location = var.region
  ingress  = "INGRESS_TRAFFIC_INTERNAL_ONLY"

  template {
    service_account = google_service_account.stt_sa.email

    scaling {
      min_instance_count = 1 # warm: eliminates GPU cold start
      max_instance_count = 5
    }

    node_selector {
      accelerator = "nvidia-tesla-t4"
    }

    containers {
      image = "${var.region}-docker.pkg.dev/${var.project_id}/autohr-models/whisper-large-v3:${var.whisper_tag}"

      resources {
        limits = {
          cpu              = "4"
          memory           = "16Gi"
          "nvidia.com/gpu" = "1"
        }
        startup_cpu_boost = true
      }

      env {
        name  = "WHISPER_MODEL"
        value = "large-v3"
      }

      env {
        name  = "FALLBACK_TO_GOOGLE_STT"
        value = "true" # automatic fallback on error
      }

      env {
        name  = "MIN_CONFIDENCE_THRESHOLD"
        value = "0.85"
      }
    }
  }
}

Pub/Sub topics + subscriptions terraform/modules/pubsub/topics.tf

# Five topics, one per pipeline stage
locals {
  topics = [
    "inbound-msg",
    "stt-result",
    "agent-event",
    "hitl-escalation",
    "notification-out",
  ]
}

resource "google_pubsub_topic" "topics" {
  for_each = toset(local.topics)
  name     = "autohr-${each.key}"

  # Retain messages 7 days for replay on failure
  message_retention_duration = "604800s"
}

# Dead-letter topic for all critical subscriptions
resource "google_pubsub_topic" "dead_letter" {
  name = "autohr-dead-letter"
}

# Push subscription: inbound-msg → stt-service
resource "google_pubsub_subscription" "inbound_to_stt" {
  name  = "inbound-to-stt"
  topic = google_pubsub_topic.topics["inbound-msg"].name

  push_config {
    push_endpoint = "${google_cloud_run_v2_service.stt_service.uri}/transcribe"

    oidc_token {
      service_account_email = google_service_account.pubsub_invoker_sa.email
    }
  }

  ack_deadline_seconds = 60

  dead_letter_policy {
    dead_letter_topic     = google_pubsub_topic.dead_letter.id
    max_delivery_attempts = 5
  }
}
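
Dead-lettering and authenticated push both depend on IAM bindings that the subscription itself does not create. The sketch below shows the bindings that are typically needed, assuming the resource names used in topics.tf and stt_service.tf; the `current` data source and the `pubsub_agent` local are names introduced here for illustration.

```hcl
# Sketch only. "current" and "pubsub_agent" are illustrative names.
data "google_project" "current" {
  project_id = var.project_id
}

locals {
  # Google-managed Pub/Sub service agent for this project
  pubsub_agent = "serviceAccount:service-${data.google_project.current.number}@gcp-sa-pubsub.iam.gserviceaccount.com"
}

# The service agent forwards undeliverable messages to the dead-letter topic
resource "google_pubsub_topic_iam_member" "dead_letter_publisher" {
  topic  = google_pubsub_topic.dead_letter.name
  role   = "roles/pubsub.publisher"
  member = local.pubsub_agent
}

# The service agent acknowledges forwarded messages on the source subscription
resource "google_pubsub_subscription_iam_member" "dead_letter_subscriber" {
  subscription = google_pubsub_subscription.inbound_to_stt.name
  role         = "roles/pubsub.subscriber"
  member       = local.pubsub_agent
}

# The push identity must be allowed to invoke the internal STT service
resource "google_cloud_run_v2_service_iam_member" "stt_invoker" {
  name     = google_cloud_run_v2_service.stt_service.name
  location = google_cloud_run_v2_service.stt_service.location
  role     = "roles/run.invoker"
  member   = "serviceAccount:${google_service_account.pubsub_invoker_sa.email}"
}
```

Without the first two bindings, messages that exhaust their five delivery attempts are not moved to the dead-letter topic.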

Cloud Monitoring — Alerting policies terraform/modules/monitoring/alerts.tf

# Alert: STT P95 latency > 6s
resource "google_monitoring_alert_policy" "stt_latency" {
  display_name = "STT P95 latency exceeded"
  combiner     = "OR"

  conditions {
    display_name = "STT P95 > 6000ms"

    condition_threshold {
      # Scoped to stt-service so other Cloud Run services do not trigger it
      filter          = "resource.type=\"cloud_run_revision\" AND resource.labels.service_name=\"stt-service\" AND metric.type=\"run.googleapis.com/request_latencies\""
      comparison      = "COMPARISON_GT"
      threshold_value = 6000
      duration        = "300s"

      aggregations {
        alignment_period   = "300s"
        per_series_aligner = "ALIGN_PERCENTILE_95"
      }
    }
  }

  notification_channels = [google_monitoring_notification_channel.oncall.name]
}

# Budget alerts: notify when monthly spend reaches ₹1500 and ₹2000
resource "google_billing_budget" "monthly_budget" {
  billing_account = var.billing_account_id
  display_name    = "AutoHR monthly budget"

  budget_filter {
    projects = ["projects/${var.project_id}"]
  }

  amount {
    specified_amount {
      currency_code = "INR"
      units         = "2000" # budget amount; budgets alert but do not cap spend
    }
  }

  threshold_rules {
    threshold_percent = 0.75 # alert at ₹1500
  }

  threshold_rules {
    threshold_percent = 1.0 # alert at ₹2000
  }
}
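
The alert policy refers to a notification channel named `oncall` that is not shown in this excerpt. A minimal email channel might be defined as follows; the `oncall_email` variable and the display name are assumptions.

```hcl
# Sketch only. Variable and display name are assumed.
variable "oncall_email" {
  type        = string
  description = "Address that receives alert notifications"
}

resource "google_monitoring_notification_channel" "oncall" {
  display_name = "AutoHR on-call"
  type         = "email"

  labels = {
    email_address = var.oncall_email
  }
}
```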

04 — Security model

Threat model for
an SMB HR system.

The threat model is that of a 52-person textile business, not a regulated enterprise. The security priorities reflect this: protect employee PII, prevent unauthorised leave manipulation, and ensure the audit log cannot be tampered with. Zero-trust networking and hardware security modules are out of scope at this stage.

Control 01 · Identity

Mobile number as employee identity

WhatsApp sender number verified by Meta API on every inbound message. IVR caller ID verified by Exotel. No password, no token, no app login required. The phone number IS the credential — already verified by the telco and by Meta's onboarding process.

Threat mitigated: employee impersonation via web portal
Not mitigated: SIM swap — accepted risk at SMB threat level

Control 02 · Authorisation

Per-service IAM — minimum permissions

Each Cloud Run service has a dedicated Service Account with permissions scoped to exactly the resources it needs. The STT service cannot read Firestore. The Leave Agent cannot access grievance records. Lateral movement is prevented at the IAM layer — a compromised container can only access its own domain.

Enforced in Terraform service_accounts.tf
Verified: gcloud projects get-iam-policy autohr-prod --format=json (shows the role bindings held by each service account)

Control 03 · Secrets

Zero credentials in code

All secrets (WhatsApp API token, Exotel key, Supabase connection string, Gemini API key) stored in Secret Manager. No credential appears in source code, Dockerfiles, environment variable defaults, or CI/CD configuration. Secrets are injected at Cloud Run startup via IAM-authenticated Secret Manager API calls.

Enforced: pre-commit hook scans for credential patterns
Rotation: automated 90-day rotation on WA token via Secret Manager

Control 04 · Data integrity

Append-only audit log

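Firestore security rules deny update and delete on /audit_log for every identity that reaches Firestore through the client SDKs. Server client libraries that authenticate with IAM service-account credentials are not evaluated against security rules, so the same restriction must also be enforced through IAM for service accounts. No past record can be modified. In a labour dispute, the complete, unmodified history of every HR decision is available. This is the system's legal defence, and it is enforced at the database and IAM layers, not in application code.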

rules_version = '2'
allow update: if false; // immutable
allow delete: if false; // immutable

Control 05 · Transport

HTTPS everywhere — Google-managed TLS

All Cloud Run endpoints are HTTPS with Google-managed TLS certificates. Pub/Sub uses TLS transport. Supabase connection uses TLS with certificate pinning in the connection string. No plaintext communication on any path.

Cloud Run: TLS termination at Google Front End
Pub/Sub: TLS 1.3 enforced on all subscriptions

Control 06 · Compliance

DPDP Act 2023 — data residency

All GCP resources run in asia-south1 (Mumbai). Supabase runs on AWS ap-south-1 (Mumbai). Vertex AI inference is pinned to asia-south1. No personal data leaves Indian jurisdiction. Audio recordings are retained for at most 90 days, the retention period this design adopts for DPDP compliance. Employee records are retained per labour law requirements (employment duration + 3 years).

Terraform: provider region = "asia-south1"
Org policy: constraints/gcp.resourceLocations enforced
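
The resource-location constraint can itself be managed in Terraform. A sketch at project level follows, assuming the policy is set on the project rather than inherited from an organisation or folder; the resource label is illustrative.

```hcl
# Sketch only. Assumes a project-level policy; label is illustrative.
resource "google_project_organization_policy" "resource_locations" {
  project    = var.project_id
  constraint = "constraints/gcp.resourceLocations"

  list_policy {
    allow {
      # Value group covering the Mumbai region and its zones
      values = ["in:asia-south1-locations"]
    }
  }
}
```

With this in place, an attempt to create a location-bound resource outside Mumbai is rejected by the API, independent of what the Terraform provider region is set to.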

05 — CI/CD pipeline

From commit to production.
Automated. Gated. Reversible.

Every change to the system — application code, Terraform configuration, or Whisper model weights — goes through the same pipeline. No console deployments. No manual Cloud Run deploys. The pipeline is the only path to production.

Code push to feature branch

Developer pushes to feature/ branch · pre-commit hooks run: credential scan, linting, unit tests · Cloud Build trigger: PR validation pipeline

Cloud Build · pre-commit

Pull request — Terraform plan

terraform plan runs against staging state · diff posted as PR comment · no apply until merge · manual code review required

Terraform · Cloud Build trigger

Merge to main — container build

Cloud Build builds container image · docker build with layer caching · vulnerability scan via Artifact Registry · image pushed with git_sha tag

Cloud Build · Artifact Registry

Staging deployment — automated

terraform apply to staging workspace · Cloud Run revision deployed · integration tests run against staging endpoints · Pub/Sub end-to-end smoke test

Cloud Deploy · staging

Manual promotion gate

Cloud Deploy promotion requires manual approval · staging test results reviewed · Terraform plan against production state reviewed · one-click promote or reject

Cloud Deploy · approval gate

Production deployment — traffic split

New Cloud Run revision deployed with 10% traffic · error rate monitored for 15 minutes · automated full promotion or rollback based on error threshold

Cloud Run · traffic splitting
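
For illustration, a 10% split between a stable revision and the newest revision can be written directly on a Cloud Run service. This is a sketch of the traffic shape only, not the pipeline's mechanism: the `stable_revision` variable and the `canary_example` label are hypothetical, and in the pipeline described here the split would be driven by the deployment tooling rather than hand-edited.

```hcl
# Sketch only. stable_revision and canary_example are hypothetical.
variable "stable_revision" {
  type        = string
  description = "Name of the revision currently serving production"
}

resource "google_cloud_run_v2_service" "canary_example" {
  name     = "leave-agent"
  location = var.region
  ingress  = "INGRESS_TRAFFIC_INTERNAL_ONLY"

  template {
    containers {
      image = "${var.region}-docker.pkg.dev/${var.project_id}/autohr-services/leave-agent:${var.image_tag}"
    }
  }

  # 90% stays on the known-good revision
  traffic {
    type     = "TRAFFIC_TARGET_ALLOCATION_TYPE_REVISION"
    revision = var.stable_revision
    percent  = 90
  }

  # 10% goes to the revision created by this deployment
  traffic {
    type    = "TRAFFIC_TARGET_ALLOCATION_TYPE_LATEST"
    percent = 10
  }
}
```

Promotion then amounts to moving the latest-revision share to 100%, and rollback to moving it to 0%.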

Rollback — 30 seconds

Previous Cloud Run revision retained for 30 days · instant rollback via gcloud run services update-traffic · Terraform state rolled back in separate operation

Cloud Run revisions

Service account	Firestore	Pub/Sub	Secret Manager	Vertex AI	Cloud Storage	Artifact Registry
webhook-gateway-sa	—	publish: inbound-msg	WA token (read)	—	—	—
stt-service-sa	—	publish: stt-result	—	—	audio-archive (write)	—
intent-router-sa	read: employees	publish: agent-event	Gemini key (read)	aiplatform.user	—	—
leave-agent-sa	read/write: leave_requests, employees, sessions · create: audit_log · read: audit_log	publish: hitl-escalation, notification-out	Gemini key · Supabase URL (read)	aiplatform.user	—	—
hitl-manager-sa	read/write: hitl_queue · create: audit_log	publish: notification-out	WA token (read)	—	—	—
notification-sa	—	subscribe: notification-out	WA token (read)	—	—	—
rag-indexer-sa	read/write: policy_versions	—	Supabase URL · Gemini key (read)	aiplatform.user	policy-documents (read)	—
cicd-cloudbuild-sa	—	—	—	—	tf-state bucket (read/write)	reader + writer
