Cost Methodology — The Autonomous HR

01 — Component by component

Six cost components.
Four are zero.

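The architecture is designed so that most components fall within free tiers, or cost only cents, at SMB scale. WhatsApp messaging and the data services are $0.00, and LLM inference and speech-to-text together come to about $0.06 a month. Real spend concentrates on two line items: Cloud Run compute (~$5.20) and telephony (~$1.30). The zero-cost components stay structurally zero at under 200 employees.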

Calculation: LLM inference

Interactions / month: 500
Avg tokens per interaction (input + output): ~1,000 tokens
Total tokens / month: ~500,000 tokens
Gemini 1.5 Flash input price: $0.075 / 1M tokens
Gemini 1.5 Flash output price: $0.30 / 1M tokens
Blended avg (70% input / 30% output): ~$0.143 / 1M tokens
Monthly cost: ~$0.04 (all 500,000 tokens at the $0.075 input rate; ~$0.07 at the blended rate)
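The LLM arithmetic above can be reproduced with a short Python sketch. The constant and function names are illustrative only and are not part of the system. Run as written, it gives about $0.07 a month at the blended rate and about $0.04 when every token is priced at the input rate.

```python
# Figures taken from the calculation above; all names are illustrative.
INTERACTIONS_PER_MONTH = 500
TOKENS_PER_INTERACTION = 1_000
INPUT_PRICE_PER_M = 0.075    # USD per 1M input tokens
OUTPUT_PRICE_PER_M = 0.30    # USD per 1M output tokens
INPUT_SHARE = 0.70           # assumed 70% input / 30% output split


def blended_price_per_m(input_share: float) -> float:
    """Weighted average USD price per 1M tokens."""
    return input_share * INPUT_PRICE_PER_M + (1 - input_share) * OUTPUT_PRICE_PER_M


def monthly_llm_cost(price_per_m: float) -> float:
    """Monthly USD cost for the stated token volume at a given price."""
    total_tokens = INTERACTIONS_PER_MONTH * TOKENS_PER_INTERACTION
    return total_tokens / 1_000_000 * price_per_m


blended = blended_price_per_m(INPUT_SHARE)
print(f"blended rate:         ${blended:.4f} / 1M tokens")
print(f"cost at blended rate: ${monthly_llm_cost(blended):.4f}")
print(f"cost at input rate:   ${monthly_llm_cost(INPUT_PRICE_PER_M):.4f}")
```

Either way the line item stays well under ten cents a month, so the choice of rate does not move the total materially.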

Why Gemini 1.5 Flash

Gemini 1.5 Flash is retained as a managed GCP service, rather than replaced with a self-hosted open-source model, because latency matters here. A worker waiting on a phone call for a leave confirmation cannot tolerate an 8-second cold start on a CPU-only container. Gemini Flash on Vertex AI delivers P99 latency under 2 seconds for these workloads.

At $0.075/1M input tokens, it is the cheapest capable hosted LLM available from any major provider as of Q1 2026. Gemini 2.5 Flash-Lite ($0.10/1M input tokens) and GPT-4o-mini ($0.15/1M input tokens) are both more expensive at equivalent quality for this task type.

Source: Google Cloud — Vertex AI Pricing · MetaCTO — Gemini Pricing Guide 2026

Calculation: Speech-to-text (Whisper)

Voice calls / month: 50 calls
Average call duration: 45 seconds
Total audio / month: 37.5 minutes
Whisper on T4 spot GPU (~$0.0003/min): ~$0.01
Cloud Run GPU spot surcharge (amortised): ~$0.01
Comparison, Google STT v2 at $0.006/15s (150 billing increments): ~$0.90
Monthly cost (Whisper): ~$0.02

Why Whisper over Google STT

OpenAI Whisper large-v3 achieves word error rates below 8% across 99 languages, with particular strength in Indic languages, code-switching (Hinglish, Tenglish), and regional accents that Google STT v2 handles less consistently for the dialects spoken in South and Southeast Asian manufacturing centres.

Deployed as a containerised Cloud Run service with a minimum-instance warm configuration, the transcription service avoids cold starts. On failure, it falls back automatically to Google STT, keeping uptime at managed-service levels.

Cost saving vs Google STT: ~98% at this interaction volume.
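The transcription comparison can be checked with a minimal sketch using only the figures in the table. The names are illustrative, and rounding each call up to whole 15-second increments is an assumption about how the Google STT rate is billed. At the stated $0.006 per 15 seconds, the Google STT figure works out to about $0.90 a month, which puts the saving at roughly 98%.

```python
import math

# Figures taken from the table above; all names are illustrative.
CALLS_PER_MONTH = 50
CALL_SECONDS = 45
WHISPER_USD_PER_MIN = 0.0003      # self-hosted Whisper, per audio minute
GPU_SURCHARGE_USD = 0.01          # amortised monthly GPU surcharge
GOOGLE_STT_USD_PER_INCREMENT = 0.006
GOOGLE_STT_INCREMENT_SECONDS = 15


def whisper_monthly_cost() -> float:
    """Per-minute transcription cost plus the amortised surcharge."""
    minutes = CALLS_PER_MONTH * CALL_SECONDS / 60
    return minutes * WHISPER_USD_PER_MIN + GPU_SURCHARGE_USD


def google_stt_monthly_cost() -> float:
    """Assumption: each call is billed in whole 15-second increments."""
    increments = math.ceil(CALL_SECONDS / GOOGLE_STT_INCREMENT_SECONDS)
    return CALLS_PER_MONTH * increments * GOOGLE_STT_USD_PER_INCREMENT


whisper = whisper_monthly_cost()
google = google_stt_monthly_cost()
saving = 1 - whisper / google
print(f"Whisper: ${whisper:.3f}  Google STT: ${google:.2f}  saving: {saving:.0%}")
```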

Source: Google Cloud Run Pricing · CloudChipr — Cloud Run Pricing Guide 2025

Calculation: Compute (Cloud Run)

CPU rate (Tier 1 · asia-south1): $0.000024 / vCPU-sec
Memory rate: $0.0000025 / GiB-sec
Free tier (monthly): 180K vCPU-sec · 2M requests
Estimated active compute / month: ~220K vCPU-sec
Billable after free tier: ~40K vCPU-sec
Whisper min-instance warm keep (1 vCPU): ~$3.20 / month
Remaining active compute cost: ~$2.00
Monthly cost: ~$5.20
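The free-tier deduction above can be sketched as follows. Names are illustrative. The CPU charge on the billable 40K vCPU-seconds comes to about $0.96; memory GiB-seconds are not itemised in the table, so the ~$2.00 "remaining active compute" figure is taken as given rather than derived.

```python
# Figures taken from the table above; all names are illustrative.
CPU_USD_PER_VCPU_SECOND = 0.000024   # Tier 1, asia-south1
FREE_VCPU_SECONDS = 180_000          # monthly free tier
ACTIVE_VCPU_SECONDS = 220_000        # estimated monthly active compute
WARM_INSTANCE_USD = 3.20             # Whisper minimum instance, as stated
REMAINING_ACTIVE_USD = 2.00          # as stated; not derived here


def billable_vcpu_seconds(used: int, free: int = FREE_VCPU_SECONDS) -> int:
    """vCPU-seconds charged after the free tier is exhausted."""
    return max(0, used - free)


billable = billable_vcpu_seconds(ACTIVE_VCPU_SECONDS)
cpu_charge = billable * CPU_USD_PER_VCPU_SECOND
monthly_total = WARM_INSTANCE_USD + REMAINING_ACTIVE_USD

print(f"billable vCPU-seconds: {billable}")
print(f"CPU charge only:       ${cpu_charge:.2f}")
print(f"monthly Cloud Run:     ${monthly_total:.2f}")
```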

Scale-to-zero architecture

Cloud Run is the single largest cost item, and intentionally so. Every service except the warm Whisper instance scales to zero when idle: a 50-person retail business does not generate HR interactions at 3am, so cost tracks usage.

The largest sub-cost is the Whisper minimum instance: keeping one warm container avoids a 3–8 second GPU cold start when a worker calls. This is a deliberate design trade: $3.20/month buys consistent sub-4-second transcription response rather than occasional cold-start delays that break the voice UX.

Cloud Run's free tier (180,000 vCPU-seconds/month) absorbs about 82% of the estimated 220,000 vCPU-seconds of monthly active compute, leaving roughly 40,000 vCPU-seconds billable. At under 200 employees, a significant share of each month's compute may remain within the free tier.

Source: Google Cloud — Cloud Run Pricing · ProsperOps — Cloud Run Cost Optimisation

Calculation: Telephony (SIP voice)

Voice calls / month: 50 calls
Average call duration: 45 seconds
Total minutes / month: 37.5 minutes
Exotel India SIP rate (approx): ~₹0.30–0.40 / min
Equivalent in USD: ~$0.004 / min
Comparison, Twilio Voice (US billing): $0.013 / min, about 3× the Exotel rate
Monthly cost: ~$1.30 (per-minute usage alone is ~$0.15 at 37.5 min × ~$0.004)
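A minimal sketch of the per-minute usage portion of the telephony cost, using the rates in the table. Names are illustrative, and only metered minutes are modelled; any fixed platform or number charges are outside this sketch.

```python
# Figures taken from the table above; all names are illustrative.
CALLS_PER_MONTH = 50
CALL_SECONDS = 45
EXOTEL_USD_PER_MIN = 0.004   # approximate USD equivalent of ₹0.30–0.40
TWILIO_USD_PER_MIN = 0.013


def monthly_minutes() -> float:
    """Total call minutes per month."""
    return CALLS_PER_MONTH * CALL_SECONDS / 60


def usage_cost(usd_per_min: float) -> float:
    """Metered per-minute cost only."""
    return monthly_minutes() * usd_per_min


exotel = usage_cost(EXOTEL_USD_PER_MIN)
twilio = usage_cost(TWILIO_USD_PER_MIN)
print(f"minutes: {monthly_minutes()}  Exotel: ${exotel:.2f}  Twilio: ${twilio:.2f}")
print(f"Twilio / Exotel rate ratio: {TWILIO_USD_PER_MIN / EXOTEL_USD_PER_MIN:.2f}x")
```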

Why a local India SIP provider

Exotel and Plivo both operate India-local SIP infrastructure with data centres in Mumbai and Chennai. Using a local PSTN interconnect instead of routing through a US-billed provider like Twilio eliminates the international leg cost and reduces call latency.

Exotel's credit-based pricing (1 credit = ₹1) and startup tiers make it practical for a 50-person business without a contract commitment. For deployments outside India, the same architecture swaps the India SIP provider for the equivalent regional operator; the IVR layer is fully provider-agnostic.

Note: Exotel does not publish per-minute rates on its public pricing page; the ₹0.30–0.40/min figure is derived from their credit-based plan structure and third-party comparisons. Exact rates require a sales quote and vary by call type and volume tier.

Source: Exotel — Plans & Pricing · CloudTalk — Exotel Pricing Guide 2026

Calculation: WhatsApp messaging

Total interactions / month: 500
Nature of interactions: User-initiated service
User-initiated service messages (Meta fee): $0.00 (free)
Utility templates in 24h service window: $0.00 (free, from Jul 2025)
Business-initiated utility (India rate): $0.0107 / msg if outside window
Estimated business-initiated msgs / month: ~0 (workers initiate)
Monthly cost: $0.00

Why WhatsApp is structurally free for this use case

Meta's July 2025 pricing shift from conversation-based to per-message billing was a significant change — but one that favours this architecture specifically.

The HR system is almost entirely worker-initiated: the employee sends the message, the system replies. Under Meta's model, user-initiated service conversations carry zero Meta fee. Utility template replies sent within the 24-hour service window are also free from July 2025 onwards.

The only scenario incurring a charge would be a proactive HITL escalation message to the owner: a business-initiated utility template at ~$0.011 per message in India, or ~$0.05 in Europe. At fewer than 50 escalations/month, this cost is negligible, at most $0.55/month for India.

This is not a loophole; it is the intended model. Meta exempts user-initiated service conversations, which is what these worker-initiated HR queries are, from fees to encourage business adoption.
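The fee rules described above can be expressed as a small decision function. This is a simplified sketch: the `Message` type, its field names, and the two message kinds are hypothetical, and only the India utility rate from the table is modelled.

```python
from dataclasses import dataclass

UTILITY_USD_INDIA = 0.0107   # business-initiated utility template, India


@dataclass
class Message:
    kind: str                # "service" (reply to a worker) or "utility" (template)
    in_service_window: bool  # inside the 24-hour customer service window?


def meta_fee(msg: Message) -> float:
    """Per-message Meta fee under the simplified rules described above."""
    if msg.kind == "service":
        return 0.0
    if msg.kind == "utility":
        return 0.0 if msg.in_service_window else UTILITY_USD_INDIA
    raise ValueError(f"unmodelled message kind: {msg.kind}")


# 500 worker-initiated interactions, each answered inside the window.
typical_month = sum(meta_fee(Message("service", True)) for _ in range(500))

# Worst case: 50 proactive escalations sent outside any service window.
worst_case = 50 * meta_fee(Message("utility", False))

print(f"typical month: ${typical_month:.2f}  worst case: ${worst_case:.3f}")
```

At the unrounded $0.0107 rate the worst case is about $0.54, in line with the $0.55 figure obtained from the rounded $0.011 rate.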

Source: Spur — WhatsApp Business API Pricing 2026 · Latenode — WhatsApp API Pricing Guide 2025 · SleekFlow — WhatsApp Pricing Reference

Free tier limits vs. our usage

Firestore free reads / day: 50,000 · we use ~600
Firestore free writes / day: 20,000 · we use ~170
Pub/Sub free message volume / month: 10 GB · we use ~0.1 MB
Cloud Functions free invocations / month: 2,000,000 · we use ~1,000
Supabase free DB storage: 500 MB · policy doc ~1–2 MB
Supabase free pgvector queries: Unlimited on free tier
Monthly cost: $0.00

Structural free tier fit

A 50-person business generating 500 HR interactions per month sits well below the free tier thresholds of every data service in the stack: nearly two orders of magnitude below for Firestore reads, and further below for everything else.

Firestore's free tier alone covers 50,000 reads per day, roughly 83× our estimated daily read volume. Pub/Sub's 10 GB free monthly allowance covers our ~100 KB of event payloads with more than 99.99% headroom.

Supabase's free tier (500MB Postgres, including pgvector) is sufficient to index any HR policy document a small business would produce — a 20-page policy PDF produces roughly 500–800 chunks at 256 tokens each, consuming under 2MB total. The RAG store is structurally free at SMB scale.

These costs begin to appear at 500+ employees and 50,000+ interactions/month — well beyond the target segment for this architecture phase.
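The headroom claims above can be checked by dividing each free-tier limit by the estimated usage. This sketch uses the figures from the free-tier table; the dictionary keys are illustrative labels, decimal units (1 GB = 10^9 bytes) are assumed, and the upper 2 MB estimate is used for the policy store.

```python
# (free-tier limit, estimated usage) per service, from the table above.
FREE_TIER_USAGE = {
    "Firestore reads / day": (50_000, 600),
    "Firestore writes / day": (20_000, 170),
    "Pub/Sub bytes / month": (10 * 10**9, 100_000),
    "Cloud Functions invocations / month": (2_000_000, 1_000),
    "Supabase storage (MB)": (500, 2),
}


def headroom_ratio(limit: float, usage: float) -> float:
    """How many times the estimated usage fits inside the free tier."""
    return limit / usage


ratios = {name: headroom_ratio(*pair) for name, pair in FREE_TIER_USAGE.items()}
for name, ratio in sorted(ratios.items(), key=lambda item: item[1]):
    print(f"{name:<40} {ratio:>10,.0f}x")

tightest = min(ratios, key=ratios.get)
```

Firestore reads are the tightest fit, so that is the first limit to watch as interaction volume grows.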

Source: Google Cloud — Firestore Pricing · Google Cloud — Pub/Sub Pricing · Supabase — Pricing

Solution	Monthly cost · 50 employees	Employee channel	Multi-language	Autonomous decisions	Deskless-native
The Autonomous HR	$6.56	WhatsApp + Voice IVR	✓ 200 languages	✓ RAG-governed	✓ Designed for it
GreytHR SMB	$60–100	Web portal + mobile app	✗ English only	✗ Human HR required	✗ Requires smartphone app
Keka Starter	$100–150	Web + mobile app	✗ English only	✗ HR admin required	✗ Desktop-first UX
Darwinbox	$300+	Web + mobile app	Partial · limited Hindi	Partial · workflow rules	✗ Enterprise-grade complexity
Workday HCM	$800–1,500	Web + mobile app	Yes · enterprise	Partial · rules engine	✗ Requires trained HR staff
Part-time HR assistant	$400–800	WhatsApp (unstructured)	Depends on person	✗ Manual, inconsistent	Varies
Manual (paper + WhatsApp)	$0 + HR staff time	WhatsApp (unstructured)	Any	✗ None	Partially

Methodology notes & caveats

All pricing sourced from vendor documentation as of Q1 2026. Cloud pricing changes frequently; treat these figures as directional estimates rather than contractual quotes. Each component links to its primary source above.

Exotel voice rates are not published as a standard per-minute rate. The ₹0.30–0.40/min figure is derived from Exotel's credit-based plan structure (exotel.com/pricing) and third-party pricing analyses. Exact rates require a direct sales quote and vary by call type, destination, and volume tier.

WhatsApp pricing changed materially on 1 July 2025 when Meta shifted from conversation-based to per-message billing. The $0.00 cost for this use case reflects that worker-initiated service messages and utility templates sent within the 24-hour customer service window carry no Meta fee under the new model. Source: Spur — WhatsApp Business API Pricing 2026.

Gemini 1.5 Flash pricing ($0.075/1M input, $0.30/1M output) is sourced from Google Cloud Vertex AI Pricing. Note that Google has released Gemini 2.5 Flash and 3.x models since this architecture was designed; newer Flash variants may offer better price-performance as they reach stable GA status on Vertex AI.

Token estimates (1,000 avg tokens/interaction) are conservative approximations. Simple leave balance queries may consume 200–400 tokens. Complex policy RAG lookups with context may consume 2,000–4,000 tokens. The 1,000 token average assumes a mixed workload typical of a 50-person business.

Currency note: All costs are shown in USD for international comparability. For India deployments, GCP asia-south1 pricing and Exotel INR rates apply. The $6.56/month baseline is approximately ₹550/month at current exchange rates, less than the cost of one day of a part-time contract HR assistant.
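The headline figures follow from summing the six component costs. A minimal sketch, in which the component labels are illustrative and the exchange rate of ₹84 per USD is an assumption chosen only to show the conversion:

```python
# Monthly cost per component in USD, as stated in each calculation.
COMPONENTS = {
    "LLM inference": 0.04,
    "Speech-to-text": 0.02,
    "Cloud Run compute": 5.20,
    "Telephony": 1.30,
    "WhatsApp messaging": 0.00,
    "Data services": 0.00,
}
EMPLOYEES = 50
INR_PER_USD = 84.0   # assumed exchange rate, not taken from the source

total_usd = round(sum(COMPONENTS.values()), 2)
per_employee = total_usd / EMPLOYEES
total_inr = total_usd * INR_PER_USD
zero_cost = [name for name, cost in COMPONENTS.items() if cost == 0]

print(f"total:        ${total_usd:.2f} / month")
print(f"per employee: ${per_employee:.2f} / month")
print(f"in INR:       ~₹{total_inr:.0f} / month")
print(f"zero-cost components: {', '.join(zero_cost)}")
```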

$0.13/employee/month.
Here is every number.

