The Autonomous Author / Page 07 — Glossary

Glossary of terms and concepts.

Every term used across the nine pages of this portfolio — defined precisely, tagged by domain, and cross-referenced to the page where it first appears. Where a term's general industry meaning differs from its use here, the definition follows this architecture's usage.

A
Agent Agent
A discrete, autonomous software component that receives a typed input schema, makes one or more LLM API calls under a defined system prompt, and returns a typed output schema plus an XAI reasoning card. In this pipeline, each agent has single responsibility — it performs exactly one function and hands off to the next agent via shared pipeline state. Agents are pure async functions: (input) → (output + XAICard). They have no side effects and do not communicate with each other directly.
First defined: PG 03 — Phase D · Full spec: PG 04 — Agent Design
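The agent contract above — a pure async function from typed input to typed output plus a reasoning card — can be sketched as follows. This is a minimal illustration; the function and field names are invented here, not the project's actual identifiers.

```javascript
// Sketch of the agent contract: (input) → (output + XAICard).
// Pure async function, no side effects, no direct agent-to-agent calls.
async function exampleAgent(input) {
  // Single responsibility: derive one typed output from the typed input.
  const output = { summary: `Processed: ${input.text}` };

  // Every agent also emits a structured XAI reasoning card.
  const xaiCard = {
    understood: `Received text of ${input.text.length} chars`,
    decided: "Produced a summary field",
    why: "Single-responsibility: this agent only summarises",
    uncertainties: [],
    confidence: 0.9,
  };

  return { output, xaiCard };
}
```

The orchestrator, not the agent, decides what happens to the result next.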
Agile mode Concept
A pipeline workflow configuration for Persona 1 (feature release writer) where the session is delta-aware — the writer can indicate that the output is an update to an existing document, and the Draft Agent produces a patch (changed sections only) rather than a full document. Contrasted with Waterfall mode. Selected at session start and stored in the DocBrief.
First referenced: PG 02 — BR-05
Ambiguity Detector Agent A-04
The P2-exclusive agent that receives the DraftDocument as an external artefact and critiques it for four categories of defect: vague quantifiers (undefined measurements), undefined terms (proper nouns or concepts used without prior definition), missing error states (functional requirements without paired failure paths), and implicit assumptions (conditions assumed true without specifying the failure case). Produces an AmbiguityReport. Operates in critique mode — its system prompt explicitly states it is a critic, not an author, to avoid the known failure mode of LLM self-critique rationalisation.
AmbiguityReport Data
The typed output of the Ambiguity Detector. Contains arrays of Flag objects grouped by category (vague_terms, undefined_terms, missing_error_states, implicit_assumptions), each with location (section + excerpt), severity (HIGH/MED/LOW), and a one-line suggested rewrite. Persisted in IndexedDB as part of the DocumentSession. Rendered as a distinct panel in the Review UI, separate from the ComplianceReport.
Architecture Decision Record (ADR) Concept
In this portfolio, ADRs are presented as Rebuttals & Pushbacks sections on the Agent Design, MLOps, and Infrastructure pages. Each documents: the challenge raised against a decision, the tempting alternative, why the alternative was rejected, and the trade-off accepted. This format makes architectural reasoning persistent and auditable — not just the decision, but the path not taken and why.
Format defined: PG 03 — Rebuttals
C
Compliance Agent Agent A-05
The pipeline agent responsible for checking the DraftDocument against the 80-rule Google Developer Style Guide compliance set encoded in rules.json. Executes three check types in sequence: PATTERN (client-side regex, fast, deterministic), STRUCTURAL (section presence checks), and SEMANTIC (LLM call for nuanced violations). Every violation in the output ComplianceReport carries a rule_id, excerpt, and fix_suggestion. No violation appears without all three fields. Runs on every session regardless of persona or workflow mode.
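Of the three check types, PATTERN is the simplest to illustrate: a client-side regex pass where every violation carries all three required fields. The rule and regex below are invented for illustration, not taken from rules.json.

```javascript
// Illustrative PATTERN check: fast, deterministic, client-side regex.
const patternRule = {
  id: "GDS-001",
  check_type: "PATTERN",
  // Example rule: flag first-person pronouns (the guide prefers "you").
  regex: /\b(we|our)\b/gi,
  fix_suggestion: 'Rewrite in second person ("you").',
};

function runPatternCheck(rule, text) {
  const violations = [];
  let match;
  while ((match = rule.regex.exec(text)) !== null) {
    violations.push({
      rule_id: rule.id,        // no violation ships without all three fields
      excerpt: match[0],
      fix_suggestion: rule.fix_suggestion,
    });
  }
  return violations;
}
```

STRUCTURAL and SEMANTIC checks follow the same output shape but use section-presence logic and an LLM call respectively.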
ComplianceReport Data
The typed output of the Compliance Agent. Contains a violations array (each with rule_id, category, excerpt, severity, fix_suggestion), total violation_count, the rules_version of the rules.json used, and checked_at timestamp. Persisted in IndexedDB. Rendered in the Compliance Viewer panel in the Review UI, sorted by severity. The rules_version field enables session history audit — past reports can be compared against newer rule versions.
Confidence score Concept
A self-reported float (0.0–1.0) included in every agent's output, representing the agent's assessment of output quality given available context. Not a statistical confidence interval — it is the LLM's structured self-assessment, which has known calibration limitations. Combined with the uncertainties[] field in the XAI card, it serves as a triage signal for Maya's review: low-confidence outputs should receive more scrutiny. Confidence scores below 0.7 trigger a yellow indicator in the XAI panel. The calibration limitation is disclosed in this Glossary and in the Review UI tooltip.
First defined: PG 04 — XAI Layer · Calibration: PG 04 — Pushback 02
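The triage behaviour described above can be sketched as a small predicate. The 0.7 threshold comes from the entry; the function name is illustrative.

```javascript
// Triage signal: low self-reported confidence (< 0.7) or any flagged
// uncertainty means the output should receive more review scrutiny.
function needsExtraScrutiny(xaiCard) {
  return xaiCard.confidence < 0.7 || xaiCard.uncertainties.length > 0;
}
```

Because the score is an LLM self-assessment with known calibration limits, it gates attention, never approval.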
Content Security Policy (CSP) Infrastructure
An HTTP response header that controls what resources the browser is permitted to load and what network connections it is permitted to make. In The Autonomous Author, the CSP enforces: default-src 'self' (no external resources by default), connect-src https://api.groq.com (API calls to Groq only), script-src 'self' (no inline scripts, no eval()). This is a technical control — not a code-review promise — that structurally prevents XSS attacks from exfiltrating data to third-party servers.
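Assembled into a single response header, the directives named above would look roughly like this (the deployed site's exact header may differ):

```http
Content-Security-Policy: default-src 'self'; connect-src https://api.groq.com; script-src 'self'
```

Any attempt by injected script to fetch or beacon to another origin is blocked by the browser itself.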
ContextPack Data
The typed output of the Research Agent. Contains context_text (assembled context from available inputs), gaps (array of string descriptions of unknown context items), clarifications (array of Q&A pairs from Maya's optional answers), and confidence (float). Consumed by the Draft Agent as its primary context source. Gap flags in ContextPack correspond to [REQUIRES INPUT:] placeholders in the DraftDocument in P2 mode.
D
DDLC (Documentation Development Lifecycle) Concept
The end-to-end process of producing a technical document — from the initial signal (ticket, feature brief, or intent statement) through intake, research, drafting, review, compliance checking, and publication. The Autonomous Author automates the intake-through-draft portion of the DDLC and provides a structured human review gate before the publication stage. The DDLC is the documentation equivalent of the SDLC (Software Development Lifecycle).
First referenced: PG 01 — The Problem
DDLC-ADM Concept
Documentation Development Lifecycle Architecture Development Method. A methodology framework created for this portfolio, adapting TOGAF's Architecture Development Method phases (Preliminary, A through F, Requirements Management) to the documentation domain. Used on Page 03 to provide a structured, traceable framework for the design process — replacing TOGAF's enterprise architecture scope with a documentation pipeline scope.
Defined and applied: PG 03 — Design & Architecture
DDD (Document-Driven Development) Concept
A software development practice where technical specifications and design documents are written before any code is produced. The document is the contract — developers build what the spec says, not what they infer the intent to be. In this pipeline, Persona 2 (the DDD Author) operates in this mode. DDD documents require higher rigour than feature docs because ambiguity in the spec becomes ambiguity in the code. The Ambiguity Detector is the pipeline component specifically designed for DDD quality enforcement.
First introduced: PG 01 — The Problem · Persona defined: PG 02 — Maya Profile
DocBrief Data
The typed output of the Intake Agent. The foundational data object for a pipeline session. Contains: persona (P1|P2), doc_type (enum), feature_name, audience, workflow_mode (agile|waterfall), existing_doc_id (optional, delta mode), and raw_input (the original writer input). Every downstream agent reads from DocBrief. DocBrief is immutable once produced — downstream agents cannot modify it.
Schema: PG 03 — Phase C · Produced by: PG 04 — A-01
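The DocBrief's stated immutability can be enforced mechanically at construction time. A minimal sketch, assuming `Object.freeze` as the mechanism (the field names follow the entry; the constructor name is invented):

```javascript
// Illustrative DocBrief construction; freezing makes the object
// read-only so downstream agents cannot modify it.
function makeDocBrief(rawInput, fields) {
  return Object.freeze({
    persona: fields.persona,               // "P1" | "P2"
    doc_type: fields.doc_type,
    feature_name: fields.feature_name,
    audience: fields.audience,
    workflow_mode: fields.workflow_mode,   // "agile" | "waterfall"
    existing_doc_id: fields.existing_doc_id ?? null, // delta mode only
    raw_input: rawInput,                   // original writer input, preserved
  });
}
```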
DraftDocument Data
The typed output of the Draft Agent. Contains: content_md (the full Markdown draft), doc_type, placeholders (array of Placeholder objects at [REQUIRES INPUT:] locations), version, and word_count. In P2 mode, the placeholders array is populated for every context gap identified by the Research Agent. The DraftDocument is the primary artefact passed to both the Ambiguity Detector (P2) and the Compliance Agent.
Drift detection MLOps
In the context of this LLM-native pipeline, drift detection is the monitoring of output quality degradation over time — not the monitoring of statistical distribution shift in training data (which does not apply, since no model is trained). Three drift sources are monitored: LLM model version changes (detected via API response headers), output quality regression (detected via session log aggregation in IndexedDB), and input distribution shift (detected via unknown doc_type values in the Intake Agent output). Each source has a defined detection mechanism and response protocol.
E
Evaluation harness MLOps
The test suite for the pipeline. Consists of 50 labelled fixture pairs (input → expected output), scored against 8 quality dimensions: schema conformance, compliance detection rate, ambiguity detection rate, placeholder insertion rate, XAI card completeness, false positive rate, confidence calibration, and pipeline latency (p95). All 8 dimensions must pass their threshold for a prompt change to be eligible for merge. The eval harness is the primary gate between a prompt change and production deployment. It runs as GitHub Actions Job 3 on every PR.
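The all-dimensions-must-pass merge rule can be sketched as a single predicate over the 8 dimensions. Dimension names follow the entry; the function shape is illustrative.

```javascript
// The 8 eval dimensions named in the harness description.
const GATES = [
  "schema_conformance", "compliance_detection_rate",
  "ambiguity_detection_rate", "placeholder_insertion_rate",
  "xai_card_completeness", "false_positive_rate",
  "confidence_calibration", "pipeline_latency_p95",
];

// A prompt change is merge-eligible only if EVERY dimension
// met its threshold — a single failure blocks the change.
function evalGatePasses(results) {
  return GATES.every((dim) => results[dim] === true);
}
```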
export_ready flag Data
A boolean field in the active PipelineState (sessionStorage) that controls whether the Export Panel is enabled in the Review UI. Initialised to FALSE by the Review Prep Agent. Set to TRUE only when Maya has explicitly actioned (accepted, edited, or ignored-with-reason) every HIGH-severity item in both the ComplianceReport and the AmbiguityReport (P2). This flag is the technical enforcement of the human gate principle (AR-02, P-02) — it cannot be bypassed by the writer without actively dismissing each HIGH item.
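The gate condition above reduces to a check over review items: every HIGH-severity item must carry an explicit action. A minimal sketch with invented item shapes:

```javascript
// export_ready becomes true only when every HIGH-severity item has been
// actioned (accepted, edited, or ignored with a reason). Lower-severity
// items never block export.
function computeExportReady(items) {
  // items: [{ severity: "HIGH"|"MED"|"LOW", action: string|null }]
  return items
    .filter((item) => item.severity === "HIGH")
    .every((item) => item.action !== null);
}
```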
G
Google Developer Style Guide Concept
The open-source style guide published by Google for technical documentation written for developer audiences. Used as the compliance standard for the Compliance Agent. The guide covers voice (second person), tense (present), punctuation, heading capitalisation, code formatting, terminology, and accessibility. Not all rules are machine-checkable — the 80 rules encoded in rules.json represent the high-impact, binary-verifiable subset. The remaining guide content requires human editorial judgment and is not automated by this pipeline.
Groq Infrastructure
The inference API provider used by The Autonomous Author. Selected for its free tier, inference speed (300+ tokens/second on Llama 3.1 70B), and reliability. The writer provides their own Groq API key, which is stored encrypted in localStorage. Groq is the only external service that receives document content (during active pipeline sessions only). The pipeline is designed with a model-swap abstraction layer — Together AI is the documented fallback if Groq changes its free tier. Selection rationale: ADR-002.
H
Human gate Concept
The mandatory review and approval checkpoint between the pipeline's agent-generated output and the writer's export action. Implemented as a UI state machine node: the Export Panel is disabled until Maya has actioned all HIGH-severity items in the Review UI. The human gate is non-bypassable — it is enforced by the export_ready flag in PipelineState, not by a UI warning or a user preference. This is a binding architecture principle (P-02, AR-02) and is documented in every agent spec that produces output for human review. The gate exists because The Autonomous Author's value proposition is augmenting the writer, not replacing their authorship.
Principle: PG 01 — Philosophy II · Implementation: PG 04 — A-06
I
IndexedDB Infrastructure
The browser-native transactional object-store database used for persistent session storage. The Autonomous Author uses a single IndexedDB database (taa_sessions) to store the complete DocumentSession record for every pipeline run — including agent outputs, XAI cards, compliance reports, ambiguity reports, review decisions, and the approved draft. Data persists across page refreshes and browser restarts until the writer explicitly clears it. Scoped to the page origin — no other website or tab can access the TAA IndexedDB store.
Intake Agent Agent A-01
The first agent in the pipeline. Receives Maya's raw input (any format — Jira ticket, PR description, feature brief, free text) and produces the structured DocBrief that all downstream agents consume. The only agent that handles raw, unstructured writer input. Operates at temperature 0.1 for near-deterministic extraction. Detects P2 indicators (imperative language, "the system shall" patterns) and can suggest persona selection before confirming with Maya. Produces the session's foundational data contract.
L
LangGraph-pattern Concept
A deterministic pipeline architecture pattern inspired by LangGraph (a Python library for stateful multi-agent systems) but implemented in vanilla JavaScript for the browser. Defines a directed graph of agent nodes where each node has a typed input, a typed output, and a single connection to the next node. The graph is evaluated in a fixed sequence — not dynamically by the LLM. No agent decides which agent fires next. Contrasted with AutoGPT-style autonomous loops, where the LLM selects the next tool or agent. The LangGraph-pattern is chosen because it is deterministic, testable, and XAI-compatible. ADR-005.
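The fixed-sequence evaluation described above can be sketched as a plain loop over an ordered list of agent functions. This is an illustration of the pattern, not the project's orchestrator code.

```javascript
// Fixed-order graph evaluation: the sequence is declared up front and
// never chosen by the LLM. Typed state flows node-to-node; each agent's
// XAI card is collected as it completes.
async function runPipeline(agents, initialInput) {
  let state = initialInput;
  const xaiCards = [];
  for (const agent of agents) {   // deterministic order, no dynamic routing
    const { output, xaiCard } = await agent(state);
    xaiCards.push(xaiCard);
    state = output;               // typed hand-off to the next node
  }
  return { finalOutput: state, xaiCards };
}
```

Contrast with an autonomous loop, where the model's own output would decide which function runs next — making the run non-deterministic and hard to test.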
localStorage Infrastructure
Browser-native synchronous key-value storage used for persistent user settings in The Autonomous Author. Stores: the AES-256 encrypted Groq API key, UI preferences (default persona, workflow mode, theme), the expected Groq model version (for drift detection), and the settings_version for schema migration. Distinguished from IndexedDB (session data) and sessionStorage (active pipeline state) by its persistence (survives browser restarts) and synchronous API. The API key is encrypted before write using the Web Crypto API with a device-derived key.
M
MLOps (in this context) MLOps
The operational discipline applied to The Autonomous Author's LLM-native pipeline. Since no models are trained, MLOps concerns are reframed: system prompts are treated as models (versioned, tested, governed), rules.json is treated as training data (authored, validated, versioned), the evaluation harness is treated as the test suite, output quality monitoring is treated as drift detection, and the model upgrade protocol is treated as the promotion gate. The concern mapping is exact even when the implementation differs from traditional MLOps. Contested in Rebuttals & Pushbacks — see PG 05.
Model upgrade protocol MLOps
The defined process for evaluating and adopting a new Groq model version. Triggered when Groq announces a new model or when a model version mismatch is detected in the API response header. Protocol: (1) run full 50-fixture eval harness against the new model in parallel with the current model, (2) compare all 8 dimensions to baseline — new model must match or exceed on ALL dimensions, (3) if all pass → update pipeline-config.json and deploy, (4) if partial pass → prompt tuning and re-evaluation, (5) if fail → hold, document failure dimensions, monitor next release. Three outcomes: PROMOTE, PROMPT TUNING, HOLD.
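The three-outcome decision can be sketched as a function of how many of the 8 dimensions matched or exceeded baseline. The protocol leaves the boundary between "partial pass" and "fail" to judgment; this sketch treats zero passing dimensions as HOLD, which is an assumption.

```javascript
// Model upgrade decision: PROMOTE only on a clean sweep of all
// dimensions; otherwise tune prompts or hold.
function upgradeDecision(dimensionsPassed, total = 8) {
  if (dimensionsPassed === total) return "PROMOTE";
  if (dimensionsPassed > 0) return "PROMPT_TUNING";
  return "HOLD";
}
```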
MoSCoW prioritisation Concept
A requirements prioritisation framework used on the Client Brief page. Must Have = binding architectural constraints, non-negotiable. Should Have = high-value requirements addressed in H1–H2 roadmap. Could Have = desirable features for future iterations. Won't Have (this iteration) = explicitly out of scope. Every requirement in the requirements catalogue is tagged with a MoSCoW priority and a pipeline component that addresses it.
P
Persona 1 (P1) Concept
The feature release writer persona. A technical writer who documents new features for a public or internal developer audience, working downstream of code in an Agile sprint cadence. In the pipeline, P1 mode configures the Draft Agent to produce a feature doc structure (Overview, Prerequisites, Steps, Code example, Parameters, Error codes, Related links) in second-person, present-tense, active-voice. P1 sessions do not invoke the Ambiguity Detector. The pipeline target for P1 is 15 minutes from intake to review-ready draft.
Persona 2 (P2) Concept
The DDD author persona. A technical writer who produces system specification documents that function as the contract for downstream code development, working in a Waterfall cadence. In the pipeline, P2 mode configures the Draft Agent to produce a DDD spec structure (Purpose, Scope, Definitions, Actors, Functional requirements in "the system shall" form, Non-functional requirements, Error states, Open questions, Requirements traceability table) in third-person imperative voice, temperature 0.1. P2 sessions invoke the Ambiguity Detector after the Draft Agent. The pipeline target for P2 is 20 minutes from intake to review-ready draft.
Pipeline Orchestrator Infrastructure
The central coordination module (pipeline.js) that sequences agent execution, manages the PipelineState in sessionStorage, collects XAI cards from each agent and emits them to the PipelineMonitor UI component, verifies prompt SHA hashes on init, and activates the human gate when the ReviewBundle is ready. The orchestrator does not perform any LLM calls itself — it delegates to agent modules. Implements the LangGraph-pattern: a fixed-order graph of async function calls with typed data flowing between nodes.
Architecture: PG 06 — Full Stack
Placeholder Concept
A structured marker inserted by the Draft Agent in P2 mode wherever the ContextPack contains a gap_flag and context is insufficient to populate the field. Format: [REQUIRES INPUT: reason] where reason is a one-sentence description of what is needed. Placeholders are surfaced as required action items in the Review UI — the export_ready flag remains FALSE until all placeholders are resolved or explicitly deferred. The placeholder-over-inference principle (P-10, AR-08) prevents the pipeline from producing plausible-but-incorrect DDD content that could mislead downstream developers.
Principle: PG 03 — P-10 · Implementation: PG 04 — A-03 Draft Agent P2
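Because the marker format is fixed, resolving placeholders in the Review UI starts with extracting them. A minimal sketch, assuming the bracket format given above (the regex itself is illustrative):

```javascript
// Extract every [REQUIRES INPUT: reason] marker from a draft, with its
// one-sentence reason and position in the text.
const PLACEHOLDER_RE = /\[REQUIRES INPUT: ([^\]]+)\]/g;

function findPlaceholders(markdown) {
  return [...markdown.matchAll(PLACEHOLDER_RE)].map((m) => ({
    reason: m[1],
    index: m.index,
  }));
}
```

A non-empty result would keep export_ready FALSE until each entry is resolved or explicitly deferred.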
Prompt governance MLOps
The operational discipline applied to system prompt files in the /prompts/ directory. Each prompt file is version-controlled with a semver tag and a SHA-256 hash. Changes follow a PR-gated workflow: propose change → run eval harness → pass all 8 gates → PR review → merge → deploy. SHA hashes are regenerated by GitHub Actions on every merge and written to prompt-manifest.json. At runtime, the Pipeline Orchestrator verifies each prompt's SHA against the manifest — a mismatch prevents pipeline startup. Treated as equivalent to model training governance in traditional MLOps.
R
Research Agent Agent A-02
The second agent in the pipeline. Receives DocBrief and produces ContextPack. Primary capability: gap detection — identifying unknown proper nouns, missing architectural context, and unresolved ambiguities in the brief before drafting begins. Surfaces up to 3 clarification questions to Maya (non-blocking — the pipeline continues regardless of whether Maya answers). Gap flags in ContextPack become [REQUIRES INPUT:] placeholders in the DraftDocument in P2 mode. Operates at temperature 0.2 for structured extraction with moderate creativity in question formulation.
Review Prep Agent Agent A-06
The final pipeline agent. Receives all upstream outputs (DraftDocument, ComplianceReport, AmbiguityReport if P2, all XAI cards) and assembles the ReviewBundle presented to Maya in the Review UI. Does not generate new content — it organises, prioritises (HIGH severity first), and annotates (inline violation markers on draft text). Sets export_ready to FALSE. Its XAI card summarises the entire pipeline session. Activates the human gate on completion.
ReviewBundle Data
The typed output of the Review Prep Agent. The single data object that drives the entire Review UI. Contains: draft (Markdown with inline annotations), compliance_report (sorted by severity), ambiguity_report (P2 only), xai_cards (all cards in chronological order), and ready_for_review (always true — indicates pipeline completion). The export_ready flag lives in PipelineState (sessionStorage), not in ReviewBundle — it is mutable by Maya's review actions; ReviewBundle is immutable.
rules.json MLOps
The versioned static JSON file containing the 80 Google Developer Style Guide compliance rules used by the Compliance Agent. Each rule object contains: id, category, rule_text, check_type (PATTERN/STRUCTURAL/SEMANTIC), regex pattern (if PATTERN), severity, fix_template, style_guide_ref, positive_fixture, and negative_fixture. Treated as the "training data" equivalent in the MLOps framing — it encodes the pipeline's compliance intelligence. Governed by its own lifecycle: authoring → schema validation → fixture testing → false positive rate check → version bump → deploy. Current version: 1.5.3.
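The fixture-testing step of that lifecycle has a simple invariant for PATTERN rules: the regex must match its positive fixture and must not match its negative fixture. The rule below is invented for illustration, not an actual entry in rules.json.

```javascript
// An illustrative PATTERN rule carrying both fixtures.
const rule = {
  id: "GDS-042",
  check_type: "PATTERN",
  pattern: "\\bclick on\\b",              // guide prefers "click"
  severity: "LOW",
  positive_fixture: "Click on the Save button.",   // should trigger
  negative_fixture: "Click the Save button.",      // should not trigger
};

// Fixture check from the rules.json lifecycle: a rule ships only if it
// fires on its positive fixture and stays silent on its negative one.
function fixturesPass(r) {
  const re = new RegExp(r.pattern, "i");
  return re.test(r.positive_fixture) && !re.test(r.negative_fixture);
}
```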
S
sessionStorage Infrastructure
Browser-native synchronous key-value storage used for the active PipelineState. Tab-scoped and cleared when the tab is closed. Contains the live pipeline execution state: current agent, intermediate agent outputs (DocBrief, ContextPack, DraftDocument), accumulated XAI cards, ReviewBundle, and the export_ready flag. Cleared on tab close to prevent stale state from a previous session contaminating a new session. Distinct from IndexedDB (persistent session records) and localStorage (user settings).
Single Responsibility Principle (applied to agents) Concept
Architecture Principle P-04 and Architecture Requirement AR-06. Each agent in the pipeline has exactly one job, one input contract, and one output schema. No agent performs two pipeline functions. This constraint is enforced as a design rule — collapsing agents to reduce API calls is explicitly rejected (ADR-003). Benefits: each agent is independently testable in isolation, independently replaceable without affecting other agents, and independently explainable in its XAI card. The eval harness tests each agent independently using its fixture subset.
System prompt MLOps
The primary instruction document sent to the Groq API with every agent call, defining the agent's role, behaviour, output format requirements, and known failure modes to avoid. In the MLOps framing, system prompts are treated as models — they encode the pipeline's intelligence and are governed accordingly: versioned, SHA-hashed, PR-gated, eval-tested. Each agent has one system prompt file in the /prompts/ directory. Breaking changes (changes that alter the output schema) require a major version bump and downstream agent compatibility checks.
X
XAI (Explainable AI) Concept
As applied in this portfolio: the principle that every AI decision in the pipeline must be visible, legible, and challengeable by the writer before it influences the published document. Implemented as structured reasoning cards emitted by each agent, confidence scores on every output, and the uncertainties[] field that surfaces specific items the agent was not confident about. In the context of The Autonomous Author, XAI is not a post-hoc explanation layer — it is designed into each agent's output schema before the agent is implemented. Architecture Principle P-01 and Architecture Requirement AR-01.
Principle: PG 01 — Philosophy II · Implementation: PG 04 — XAI Layer
XAI Reasoning Card Data
The structured transparency artefact emitted by every agent after it completes, before the next agent fires. Contains four fields: understood (what the agent interpreted from its input), decided (what output it produced and why), why (the reasoning and rule citations behind the decision), and uncertainties (specific items the agent flagged as uncertain, requiring Maya's attention). Also contains confidence (0.0–1.0) in the header. XAI cards are displayed in the Pipeline Monitor in real time as each agent completes, and are stored in the ReviewBundle for Maya's session-level audit.
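Since "XAI card completeness" is one of the eval harness dimensions, a completeness check over the shape described above is a natural sketch. The validator name is illustrative.

```javascript
// Completeness check over the four card fields plus the confidence
// header: non-empty strings, an uncertainties array, confidence in [0, 1].
function isCompleteXaiCard(card) {
  return (
    typeof card.understood === "string" && card.understood.length > 0 &&
    typeof card.decided === "string" && card.decided.length > 0 &&
    typeof card.why === "string" && card.why.length > 0 &&
    Array.isArray(card.uncertainties) &&
    typeof card.confidence === "number" &&
    card.confidence >= 0 && card.confidence <= 1
  );
}
```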
Z
Zero-server guarantee Infrastructure
The architectural property that The Autonomous Author operates no server-side infrastructure that receives, processes, or stores document content. All pipeline execution happens in the writer's browser. Document content is sent only to Groq's API (writer's own key, active session only). No TAA-operated server exists to receive data — this is a structural guarantee, not a privacy policy promise. Enforced by the combination of: GitHub Pages static hosting (no server-side execution), client-side JS pipeline (no server round-trips), and CSP that blocks connections to all origins except Groq. Architecture Principle P-05 and Architecture Requirement AR-03.