Select a scenario
Runs against the live HuggingFace Spaces backend. If the backend is unavailable, the pipeline animation runs in simulated mode with no retrieval result.
Active scenario
Happy path — procedure found
Technician Marcus is standing next to the Haas VF-2SS CNC on Line 3. The machine threw an E-04 fault at 07:14. He holds his phone to his mouth and speaks the query. VaultRAG normalises the voice input, retrieves the matching procedure from the indexed manual, and returns a cited step-by-step answer in under 8 seconds.
Guardrail Pipeline 00.00s
Voice Input
Web Speech API captures query from device microphone
G1 · Query Normaliser
Denoise transcription, reformat as structured query
G2 · Scope Guard
Check query relevance against document corpus
ChromaDB Retrieval
Semantic search · top-k=3 · cosine similarity
G3 · Confidence Threshold
Validate similarity scores ≥ 0.70
G4 · Safety Flag
Scan chunks for LOTO / hazard keywords
Llama 3.2 Generation
Structured prompt · step-by-step · citation mandatory
G5 · Citation Enforcer
Validate source reference present in output
Response Delivered
Cited answer returned to technician's phone
Voice Input — Web Speech API
Select a scenario and press Run simulation
ChromaDB Retrieval Results
Guardrail Pipeline Status
G1Query NormaliserWaiting
G2Scope GuardWaiting
G3Confidence ThresholdWaiting
G4Safety FlagWaiting
G5Citation EnforcerWaiting
Response
Pipeline reference

Nine steps through the guardrail pipeline

From voice input to cited response, the pipeline executes nine steps in sequence. The simulator above traces each step with timing. The steps below describe the mechanism at each stage.

01
Technician speaks

One-handed. One button. Web Speech API captures audio from the device microphone, no install required.

Channel: Browser mic
02
G1 normalises

Raw transcription ("uh E zero four error on the Haas") is cleaned and reformulated into a structured query by the LLM with a strict system prompt.

< 0.8s
03
G2 checks scope

Top-1 similarity pre-check against the corpus. If nothing is remotely relevant, refuse immediately before spending retrieval budget.

< 0.3s
04
ChromaDB retrieves

Top-3 chunks by cosine similarity. nomic-embed-text embeds the query. Procedural sections returned with metadata: doc title, section, page range.

< 1.2s
05
G3 validates confidence

Best chunk similarity must be ≥ 0.70. Below threshold: refuse with explanation. Never generate a response the corpus cannot support.

< 0.1s
06
G4 flags safety

Scans retrieved chunks for LOTO, lockout, high voltage, pressure vessel, hazmat keywords. If triggered, mandatory safety prefix prepended to response.

< 0.2s
07
Llama 3.2 generates

Structured prompt enforces step-by-step format, maximum 5 steps, citation required. Response generated entirely locally — no external API call.

< 4.5s
08
G5 enforces citation

Validates source reference is present in output. If missing, one retry with stricter prompt. If still absent, response is blocked and refused.

< 0.2s
09
Answer delivered

Cited, structured procedure appears on the technician's phone. Section number, page range, document title. Traceable to source.

Total: < 8s (target)