A–Z

Every term.
Plain English first.

Every technical term used across this portfolio, defined in plain English followed by the precise technical definition. Useful for non-technical stakeholders reviewing the architecture — and for ensuring consistency of usage across all 11 pages.

A
A2A Protocol Agent
A way for AI agents to talk to each other.
Agent-to-Agent communication protocol. Defines the message format, contract, and routing mechanism by which specialist agents in a multi-agent system exchange structured data. In this system, A2A happens via Cloud Pub/Sub topics — each agent publishes events that other agents subscribe to.
ADK (Agent Development Kit) Infra
Google's toolkit for building AI agents on GCP.
Google's open-source framework for building multi-agent systems on Google Cloud. Provides agent lifecycle management, tool calling abstractions, and native GCP service bindings. This system uses LangGraph instead of ADK — see ADR-003 for the rationale.
→ See: ADR-003
ADR (Architecture Decision Record) Process
A document recording why a major technical decision was made.
A lightweight document format for recording significant architecture decisions. Each ADR captures: the context that required a decision, the decision made, alternatives considered, consequences (positive and negative), and responses to anticipated pushbacks. This portfolio contains 8 ADRs plus 5 MLOps decision records.
Artifact Registry Infra
GCP's container image storage service.
Google Cloud's managed container registry. Stores Docker images for all Cloud Run services. Provides vulnerability scanning on push and version retention policies. Images are tagged with the git SHA at build time — providing a deterministic link between source code and deployed container.
Audit Log HR
A permanent, tamper-proof record of every HR decision the system has ever made.
An append-only Firestore collection (/audit_log) where every agent action, policy lookup, HITL escalation, and decision outcome is recorded before any outbound notification is sent. Firestore security rules deny update and delete on this collection for all identities — making it immutable at the infrastructure layer. Retained 7 years for labour law compliance.
C
Canary Deployment Infra
Releasing a new version to a small percentage of users first to check for problems.
A deployment strategy where a new Cloud Run revision receives a small percentage of live traffic (10% in this system) while the previous version handles the rest. Error rates are monitored for a defined period; if they exceed baseline + 2%, the deployment is automatically rolled back. Full promotion to 100% is triggered manually after canary success.
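The rollback guard described above reduces to a simple decision function. A minimal sketch, assuming the "baseline + 2%" threshold means two percentage points of error rate (an interpretation); function and variable names are illustrative:

```python
def canary_decision(canary_error_rate: float, baseline_error_rate: float) -> str:
    """Canary gate: fail the rollout if the canary revision's error rate
    exceeds the baseline by more than 2 percentage points."""
    if canary_error_rate > baseline_error_rate + 0.02:
        return "rollback"   # shift 100% of traffic back to the previous revision
    return "hold"           # keep the 10/90 split until manual promotion

print(canary_decision(0.05, 0.01))  # rollback
print(canary_decision(0.02, 0.01))  # hold
```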
Casual Leave (CL) HR
Short-notice leave for personal matters — typically 1 or 2 days.
A leave type with an annual entitlement of 12 days, credited on 1 April each year. Subject to a maximum of 2 consecutive days without Managing Director approval. AutoHR approval conditions defined in §4.2.1 of the HR Policy PDF. Does not carry over or encash.
Cloud Run Infra
GCP's serverless compute service that runs containers and charges only when they're processing requests.
Google Cloud's fully managed container execution environment. Scales to zero when idle (no idle cost) and scales automatically under load. Supports GPU instances for ML workloads (used for Whisper STT). All compute in this system runs on Cloud Run — no GKE cluster. 99.95% SLA. Free tier: 180,000 vCPU-seconds/month.
Confidence Score ML
A number between 0 and 1 that tells the system how certain it is about its answer.
A scalar value (0.0–1.0) produced by the Policy RAG retrieval step representing the cosine similarity between the query embedding and the retrieved policy chunk embedding. Used as the primary HITL trigger: if confidence < 0.80, the Leave Agent transitions unconditionally to HITL_PENDING regardless of other conditions. This threshold is a graph edge condition, not a prompt instruction.
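The edge condition above can be sketched as a plain routing function — the point being that the threshold is code, not prose in a prompt. Function and state-key names are illustrative, not from the actual codebase:

```python
HITL_THRESHOLD = 0.80

def route_after_rag(state: dict) -> str:
    """Graph edge condition, not a prompt instruction: below-threshold
    confidence routes to HITL_PENDING unconditionally."""
    if state["rag_confidence"] < HITL_THRESHOLD:
        return "HITL_PENDING"   # human review, regardless of any other condition
    return "EVALUATING"         # continue the automated evaluation path

print(route_after_rag({"rag_confidence": 0.72}))  # HITL_PENDING
print(route_after_rag({"rag_confidence": 0.91}))  # EVALUATING
```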
D
Deskless Workforce Business
Workers who do their jobs without a desk, laptop, or company email — the majority of the world's workforce.
2.7 billion workers globally (80% of the total workforce) employed in industries including manufacturing, agriculture, retail, construction, healthcare, and hospitality. They have no access to company email, limited computer access during the workday, and have been systematically excluded from enterprise HR software. Source: Emergence Capital / Deskless Workforce Report.
DPDP Act 2023 Compliance
India's new data privacy law that governs how personal data must be handled.
The Digital Personal Data Protection Act, 2023. India's primary data protection legislation. Requires explicit consent for processing personal data, restricts cross-border data transfers, and mandates data retention limits. This system is compliant: all GCP resources in asia-south1 (Mumbai), Supabase on AWS ap-south-1 (Mumbai), and explicit consent collected during AutoHR onboarding.
Drift (Model) ML
When an AI model's accuracy gradually gets worse because the real world has changed.
The degradation of ML model performance over time due to changes in the input data distribution, the environment, or the model's external dependencies. This system monitors four drift types: acoustic drift (Whisper STT), policy drift (RAG document update), LLM version drift (Gemini Flash release), and embedding model drift (text-embedding-004 version change). Each has a distinct detection mechanism and automated response.
E
Earned Leave (EL) HR
Leave that builds up over time the longer you work — can be saved up and paid out when you leave.
Leave that accrues at 1.25 days per month of completed service (15 days/year). Begins accruing after the 90-day probation period. Can be carried over up to 30 days maximum. EL exceeding 30 days at year-end is automatically encashed at the basic daily rate. Applications require 5 working days' advance notice. See HR Policy §4.4.
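The accrual, carry-over, and encashment rules above reduce to simple arithmetic. A sketch that ignores the 90-day probation gate and the 5-day notice rule for brevity; names and figures are illustrative:

```python
def el_year_end(months_of_service: int, opening_balance: float = 0.0):
    """Earned Leave at year-end: accrue 1.25 days/month, carry over
    at most 30 days, encash the excess at the basic daily rate."""
    accrued = opening_balance + 1.25 * months_of_service
    carried = min(accrued, 30.0)
    encashed = max(accrued - 30.0, 0.0)
    return carried, encashed

# A full year (15 days) on top of a 22-day opening balance: carry 30, encash 7.
print(el_year_end(12, opening_balance=22.0))  # (30.0, 7.0)
```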
Embedding ML
Converting text into a list of numbers that captures its meaning — so similar texts produce similar number lists.
A dense vector representation of text produced by a neural network (text-embedding-004 in this system). Vectors are 768-dimensional. Semantically similar texts produce vectors with high cosine similarity. Used in the RAG pipeline to index HR policy chunks and to find the most relevant chunk for any worker query. Stored in Supabase pgvector with HNSW indexing for sub-10ms retrieval.
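Cosine similarity between two embedding vectors can be computed directly. A stdlib sketch over toy low-dimensional vectors (production vectors are 768-dimensional and the search runs in pgvector, not Python):

```python
import math

def cosine_similarity(a: list, b: list) -> float:
    """Dot product of the vectors divided by the product of their magnitudes."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

print(round(cosine_similarity([1.0, 0.0], [1.0, 0.0]), 3))  # 1.0 — same direction
print(round(cosine_similarity([1.0, 0.0], [0.0, 1.0]), 3))  # 0.0 — orthogonal
```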
ESIC Compliance
India's employee health insurance scheme — mandatory for eligible workers.
Employees' State Insurance Corporation. A statutory social security scheme under the ESI Act, 1948 providing medical, sickness, maternity, and disability benefits. Applicable to employees earning below the wage ceiling. Employee contribution: 0.75% of gross salary. Employer contribution: 3.25%. Registration required within 10 days of joining.
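The contribution split above is a fixed percentage of gross salary. A sketch with an illustrative salary figure; the eligibility check against the wage ceiling is assumed to happen elsewhere:

```python
def esic_contributions(gross_monthly: float) -> tuple:
    """ESIC: employee contributes 0.75% of gross, employer contributes 3.25%."""
    employee = round(gross_monthly * 0.0075, 2)
    employer = round(gross_monthly * 0.0325, 2)
    return employee, employer

print(esic_contributions(18000.0))  # (135.0, 585.0)
```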
F
Firestore Infra
GCP's serverless database — no server to manage, pays only for what you use.
Google Cloud Firestore in Native mode. A serverless NoSQL document database. Used as the primary operational database: employee records, leave requests, HITL queue, conversation sessions, and the audit log. Free tier: 50,000 reads/day and 20,000 writes/day — comfortably covers SMB workloads. Append-only security rules on /audit_log enforced at database layer.
Fine-tuning (LoRA) ML
Training an existing AI model to be better at a specific task, without retraining the whole thing.
Low-Rank Adaptation — a parameter-efficient fine-tuning technique that trains a small set of adapter weights rather than the full model. Used to adapt Whisper large-v3 for regional dialect accuracy. Training targets the query and value projection matrices (q_proj, v_proj). LoRA rank: 16, alpha: 32. Training runs on Cloud Run GPU in approximately 40 minutes when triggered by drift detection.
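The hyperparameters above would map onto a PEFT-style configuration roughly like the following — written as a plain dict so the sketch stands alone; with Hugging Face PEFT the equivalent would be `LoraConfig(r=16, lora_alpha=32, target_modules=["q_proj", "v_proj"])`:

```python
LORA_CONFIG = {
    "base_model": "openai/whisper-large-v3",
    "r": 16,                                  # low-rank dimension of the adapters
    "lora_alpha": 32,                         # scaling factor; alpha / r = 2.0
    "target_modules": ["q_proj", "v_proj"],   # attention query/value projections
}

def effective_scale(cfg: dict) -> float:
    """LoRA scales the adapter update by alpha / r before it is
    added to the frozen base weights."""
    return cfg["lora_alpha"] / cfg["r"]

print(effective_scale(LORA_CONFIG))  # 2.0
```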
G
Gemini 1.5 Flash ML
Google's fast, affordable AI model used for intent classification and agent reasoning in this system.
A production-grade LLM from Google, available via Vertex AI. Input: $0.075/1M tokens. Output: $0.30/1M tokens — the most cost-effective capable hosted LLM as of Q1 2026. Used for intent routing and Leave Agent reasoning. Model version pinned at gemini-1.5-flash-002 (not 'latest'). Region: asia-south1 for data residency.
Guardrail Agent
A hard limit on what an AI agent is allowed to do — enforced by the infrastructure, not just instructions.
An architectural constraint preventing an agent from taking certain actions, enforced at the infrastructure layer rather than via prompt instructions. Examples: the audit log is append-only via Firestore security rules (not a prompt), the HITL trigger on confidence < 0.80 is a state machine guard condition (not a prompt), per-agent IAM prevents cross-domain access. Prompt-only guardrails can be argued around by the LLM; infrastructure guardrails cannot.
H
HITL (Human-in-the-Loop) Agent
A designed checkpoint where a human must review and decide before the system continues.
A first-class architectural state in the Leave Agent state machine (not a fallback). Triggered by four conditions: RAG confidence < 0.80, all leave balances exhausted, statutory coverage threshold breach, or DB read failure. The HITL Manager composes a structured brief and sends it to the owner via WhatsApp. 4-hour re-escalation, 24-hour auto-deny. Every HITL event is written to the audit log.
HNSW (Hierarchical Navigable Small World) ML
An index structure that makes searching through millions of vectors fast.
An approximate nearest neighbour algorithm used to index vector embeddings in pgvector. HNSW builds a layered graph structure enabling sub-10ms similarity search at SMB scale (under 5,000 vectors). Configuration: ef_construction=128, m=16. Alternative to IVF-Flat which requires a training step. Preferred for the policy RAG store because it supports online inserts (no re-training on policy update).
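Under the configuration above, the pgvector DDL and retrieval query would look roughly like this. The `policy_chunks` table and `embedding`/`content` column names are assumptions; the statements are held as plain strings rather than run against a live database:

```python
# Build an HNSW index over the policy-chunk embeddings (cosine distance).
CREATE_HNSW_INDEX = """
CREATE INDEX ON policy_chunks
USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 128);
"""

# Top-3 retrieval: <=> is pgvector's cosine-distance operator,
# so similarity = 1 - distance.
TOP_3_QUERY = """
SELECT content, 1 - (embedding <=> %(query_vec)s) AS cosine_similarity
FROM policy_chunks
ORDER BY embedding <=> %(query_vec)s
LIMIT 3;
"""
```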
I
IAM (Identity and Access Management) Infra
GCP's system for controlling who (or what service) can access which resources.
GCP's access control system. Each Cloud Run service in this system has a dedicated Service Account with minimum permissions scoped to exactly the resources it needs. The STT service cannot read Firestore. The Leave Agent cannot access grievance records. All IAM bindings are defined in Terraform — no manual console grants. The absence of a permission is as deliberate as its presence.
→ See: IAM Matrix
IVR (Interactive Voice Response) Channel
A phone system that lets workers call a number and speak naturally to apply for leave or ask questions.
A telephony gateway that accepts inbound calls, streams audio to the Whisper STT service via WebSocket, and returns TTS audio responses to the caller. Implemented via Exotel (India) or Plivo. The IVR provides feature parity with the WhatsApp channel — all agent flows support both channels identically. Acts as a fallback for workers who prefer calling over messaging.
→ See: ADR-001
L
LangGraph Agent
The framework used to build the AI agents in this system — it defines exactly what each agent can do and when it must ask a human.
An open-source Python framework by LangChain Inc. for building stateful, graph-based agent workflows. Provides explicit state machine primitives: nodes (states), edges (transitions), conditional routing, and persistent state via checkpointers (Firestore in this system). Used because it enforces HITL as a graph edge condition rather than a prompt instruction — a critical distinction for employment decision systems.
→ See: ADR-003
LOP (Loss of Pay) HR
A salary deduction for days absent without approved leave.
A payroll deduction applied for: absence without leave approval, leave taken beyond sanctioned entitlement, and accumulated late arrivals (per §2.3). Daily rate: Gross Monthly Salary ÷ Working Days in Month. Calculated automatically by the Payroll Agent from attendance records. Half-day LOP = 50% of daily rate.
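The deduction formula above in code, with illustrative figures; a half-day LOP is passed as 0.5:

```python
def lop_deduction(gross_monthly: float, working_days: int, lop_days: float) -> float:
    """LOP: daily rate = gross monthly salary / working days in the month.
    Half-day LOP (lop_days=0.5) therefore deducts 50% of the daily rate."""
    daily_rate = gross_monthly / working_days
    return round(daily_rate * lop_days, 2)

# One full day plus one half-day at a 1,000/day rate:
print(lop_deduction(26000.0, 26, 1.5))  # 1500.0
```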
M
MLOps ML
The engineering practices for keeping AI models working correctly in production over time.
Machine Learning Operations — the discipline of deploying, monitoring, and maintaining ML models in production. Covers model versioning, drift detection, retraining pipelines, A/B testing, rollback mechanisms, and observability. In this system: model registry in Firestore, drift detection via Cloud Monitoring, LoRA retraining on Cloud Run GPU, canary deployment via Cloud Run traffic splitting.
N
NLLB-200 ML
Meta's open-source translation model that supports 200 languages, including all major Indian languages.
No Language Left Behind — a 3.3B parameter multilingual translation model released by Meta AI. Purpose-built for low-resource languages underrepresented in standard training corpora. Used to normalise code-switched speech (Hinglish, Tenglish) to structured English intent before LLM processing. Bundled in the STT Cloud Run container — zero marginal cost. Replaces Google Cloud Translation API.
O
Orchestrator (Agent) Agent
The agent that receives every message first and decides which specialist agent should handle it.
The Intent Router — a stateless Gemini Flash agent that is the sole entry point for all inbound interactions. Classifies intent across 6 categories (leave request, balance query, policy question, grievance, onboarding, general). Extracts entities. Routes to the appropriate specialist agent via Pub/Sub event publish. Never makes decisions — only routes.
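A sketch of the routing step. The intent-to-agent mapping and the event shape are assumptions; only the six intent categories and the Pub/Sub publish to a topic come from the design above:

```python
# Illustrative mapping from the six intent categories to specialist agents.
INTENT_TO_AGENT = {
    "leave_request":   "leave_agent",
    "balance_query":   "leave_agent",
    "policy_question": "policy_agent",
    "grievance":       "grievance_agent",
    "onboarding":      "onboarding_agent",
    "general":         "policy_agent",
}

def build_agent_event(intent: str, entities: dict) -> dict:
    """Compose the event the Orchestrator would publish to the AGENT_EVENT
    topic. The router never decides outcomes — it only selects a target."""
    target = INTENT_TO_AGENT.get(intent, INTENT_TO_AGENT["general"])
    return {"topic": "AGENT_EVENT", "target_agent": target,
            "intent": intent, "entities": entities}

print(build_agent_event("leave_request", {"leave_type": "CL", "days": 1}))
```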
P
pgvector ML
A Postgres extension that lets the database store and search through AI vectors.
An open-source extension for PostgreSQL that adds vector data types and similarity search operators. Used via Supabase to store HR Policy PDF embeddings (768-dimensional vectors from text-embedding-004) and perform cosine similarity search at query time. HNSW index enables sub-10ms retrieval. Supabase free tier (500MB) is sufficient for any SMB policy document.
→ See: RAG Pipeline
Pub/Sub Infra
GCP's messaging system — services publish messages to topics, and other services receive them without needing to talk to each other directly.
Google Cloud Pub/Sub — a managed asynchronous messaging service. This system uses 5 topics: INBOUND_MSG, STT_RESULT, AGENT_EVENT, HITL_ESCALATION, NOTIFICATION_OUT. Push subscriptions deliver messages to Cloud Run services. 7-day message retention. Dead-letter topics on all critical subscriptions. Free tier covers entire SMB message volume. The event fabric that decouples all services.
→ See: ADR-004
R
RAG (Retrieval-Augmented Generation) ML
Making an AI answer questions based on a specific document, rather than just its general training.
A technique that augments LLM generation with information retrieved from a document store at inference time. In this system: the HR Policy PDF is chunked (256 tokens, 32 overlap), embedded, and stored in pgvector. At query time, the worker's question is embedded and the most relevant policy chunks (top-3 by cosine similarity) are inserted into the LLM prompt as context. The LLM reasons against the retrieved chunks — not its training data.
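The chunking step above (256-token windows, 32-token overlap) can be sketched as a sliding window; plain integer lists stand in for real tokenizer output:

```python
def chunk_tokens(tokens: list, size: int = 256, overlap: int = 32) -> list:
    """Split a token sequence into overlapping windows: each chunk shares
    its first `overlap` tokens with the tail of the previous chunk."""
    step = size - overlap
    return [tokens[i:i + size] for i in range(0, max(len(tokens) - overlap, 1), step)]

chunks = chunk_tokens(list(range(500)))
print(len(chunks))                        # 3 chunks for a 500-token document
print(chunks[0][-32:] == chunks[1][:32])  # True — 32-token overlap between neighbours
```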
Rathi Textiles Pvt. Ltd. Business
The fictional anchor client used to ground every architecture decision in a real business context.
A composite fictional profile representing the archetype of the Indian small retail/manufacturing employer. Nagpur, Maharashtra. 52 employees across 3 retail outlets and 1 weaving unit. ₹2.8 Cr annual revenue. Owner: Priya Rathi. 6 languages spoken on the floor. Zero company email addresses. Previously attempted Keka — abandoned in 6 weeks. All architecture decisions in this portfolio are grounded in Rathi Textiles' specific constraints.
S
Scale-to-zero Infra
The system shuts down when nobody is using it and starts up again when needed — so you pay nothing when it's idle.
A Cloud Run billing behaviour where instances are terminated when there are no active requests, incurring zero compute cost during idle periods. A 50-person business generates HR interactions in bursts (morning check-ins, lunch-time leave requests) rather than continuously. Scale-to-zero aligns cost with actual usage — the most important infrastructure decision for SMB deployments. Cold starts are mitigated by setting min-instances: 1 on the STT service.
→ See: ADR-005
Secret Manager Infra
GCP's secure vault for storing API keys and passwords — so they never appear in code.
Google Cloud Secret Manager. All credentials (WhatsApp API token, Exotel key, Supabase connection string, Gemini API key) are stored here and injected into Cloud Run services at runtime via IAM-authenticated API calls. No credential appears in source code, Dockerfiles, environment variable defaults, or CI/CD configuration. Automated 90-day rotation on WhatsApp token.
State Machine Agent
A formal description of every possible state an agent can be in and every condition that moves it from one state to another.
A formal model where an agent exists in exactly one of a finite set of states at any time, transitions between states are triggered by events and evaluated by guard conditions, and entry/exit actions are defined for each state. The Leave Agent has 7 states: IDLE, FETCHING, RAG_LOOKUP, EVALUATING, APPROVED, DENIED, HITL_PENDING. Implemented in LangGraph. Every transition is logged to the audit trail.
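A minimal sketch of the seven states and a few transitions. Event names and the transition table are illustrative; the real graph and its guard conditions live in LangGraph:

```python
STATES = {"IDLE", "FETCHING", "RAG_LOOKUP", "EVALUATING",
          "APPROVED", "DENIED", "HITL_PENDING"}

# Illustrative transition table: (current state, event) -> next state.
TRANSITIONS = {
    ("IDLE", "request_received"):       "FETCHING",
    ("FETCHING", "balances_loaded"):    "RAG_LOOKUP",
    ("RAG_LOOKUP", "high_confidence"):  "EVALUATING",
    ("RAG_LOOKUP", "low_confidence"):   "HITL_PENDING",
    ("EVALUATING", "policy_satisfied"): "APPROVED",
    ("EVALUATING", "policy_violated"):  "DENIED",
}

def step(state: str, event: str, audit_log: list) -> str:
    """Take exactly one transition and record it — mirroring the rule
    that every transition is written to the audit trail."""
    nxt = TRANSITIONS[(state, event)]
    audit_log.append((state, event, nxt))
    return nxt

log = []
s = step("IDLE", "request_received", log)
s = step(s, "balances_loaded", log)
s = step(s, "low_confidence", log)
print(s)         # HITL_PENDING
print(len(log))  # 3
```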
STT (Speech-to-Text) ML
Converting a worker's spoken voice note into written text the system can understand.
The process of transcribing audio to text. Implemented via Whisper large-v3 (OSS) on Cloud Run with a spot NVIDIA T4 GPU. Produces transcript, detected language, and confidence score. P99 latency with min-instance warm: <4 seconds. Fallback to Google Cloud STT v2 on service error. 83% cheaper than Google STT v2 at SMB interaction volumes.
T
Terraform Infra
Code that describes exactly what cloud infrastructure should exist — so it can be built automatically and repeatably.
An open-source Infrastructure as Code (IaC) tool by HashiCorp. Every GCP resource in this system is defined in .tf files: Cloud Run services, Firestore security rules, IAM bindings, Pub/Sub topics, Secret Manager secrets, Cloud Monitoring alerts, and billing budgets. State stored in GCS. Plan runs on PR; apply on merge to main after manual promotion gate.
→ See: Terraform IaC
TOGAF ADM Architecture
A formal method for designing enterprise architecture in phases — used to ensure nothing important is missed.
The Open Group Architecture Framework Architecture Development Method. A structured approach to enterprise architecture covering: Phase A (Vision), Phase B (Business Architecture), Phase C (Information Systems Architecture), Phase D (Technology Architecture), Phase E (Migration Planning). This portfolio documents Phases A–E for The Autonomous HR, grounding every technology decision in business requirements.
V
Vertex AI Infra
GCP's managed AI platform — provides access to Gemini models and embedding APIs with enterprise SLAs.
Google Cloud's unified ML platform. Used in this system for: Gemini 1.5 Flash inference (intent routing, agent reasoning), text-embedding-004 (HR policy chunk embeddings). Region locked to asia-south1 for DPDP Act 2023 data residency. Model version pinned (gemini-1.5-flash-002) — not using 'latest' alias to prevent unexpected behaviour changes.
W
WER (Word Error Rate) ML
A measure of how many words the speech-to-text system gets wrong — lower is better.
The primary evaluation metric for STT models. Calculated as: (Substitutions + Deletions + Insertions) / Total Words in Reference. Expressed as a percentage. Whisper large-v3 achieves <8% WER across 99 languages. The retraining trigger fires when rolling 7-day mean confidence drops below 0.85, which correlates with WER regression. The fine-tuning job rolls back if new WER exceeds baseline + 5%.
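The metric and the rollback guard in code. The "+5%" regression bound is read here as five percentage points, which is an interpretation; names are illustrative:

```python
def wer(substitutions: int, deletions: int, insertions: int, ref_words: int) -> float:
    """Word Error Rate as a percentage: (S + D + I) / N * 100."""
    return 100.0 * (substitutions + deletions + insertions) / ref_words

def should_rollback(new_wer: float, baseline_wer: float) -> bool:
    """Fine-tuning gate: discard the new adapter if WER regresses
    past baseline + 5 percentage points."""
    return new_wer > baseline_wer + 5.0

print(wer(3, 1, 2, 100))           # 6.0
print(should_rollback(14.1, 8.0))  # True  — 14.1 > 13.0
print(should_rollback(9.5, 8.0))   # False
```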
WhatsApp Business API Channel
Meta's official way for businesses to send and receive WhatsApp messages at scale.
Meta's Cloud API for enterprise WhatsApp messaging. Enables programmatic send/receive of text, audio, documents, and template messages. As of July 2025, uses per-message pricing (replacing conversation-based). User-initiated service messages and utility templates within the 24-hour service window: $0.00. Business-initiated HITL alerts (India): ~$0.011/message. ~$0.00 total monthly cost at SMB scale.
Whisper large-v3 ML
OpenAI's open-source voice recognition model — used to transcribe workers' voice notes in any language.
OpenAI's open-source multilingual ASR model. 1.55B parameters. Achieves WER <8% across 99 languages. Deployed as a containerised Cloud Run service on NVIDIA T4 spot GPU. Min-instances: 1 (warm) to eliminate cold start on IVR calls. Automatic fallback to Google STT v2 on service error. 83% cheaper than Google STT v2 at SMB interaction volumes. Fine-tuned via LoRA when drift is detected.