Deeper · 03.C
Every call, audited.
The AI is not allowed to drift. A Vercel cron pulls every transcript every 15 minutes and runs assertions.
01
The cron
/api/elevenlabs-quality-scan runs every 15 minutes on Vercel. It pulls all calls completed in the last hour, downloads their transcripts, and runs assertions.
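A minimal sketch of the selection step, assuming a hypothetical `CallRecord` shape and a hypothetical store of already-scanned call IDs (neither is taken from the real ElevenLabs payload). Because the scan runs every 15 minutes but looks back a full hour, the same call would otherwise fall inside several consecutive windows, so the sketch skips IDs it has seen before.

```typescript
// Hypothetical shape of a call record; the real ElevenLabs payload may differ.
interface CallRecord {
  id: string;
  status: "in_progress" | "completed" | "failed";
  endedAtMs: number; // Unix epoch milliseconds
}

// "The last hour", expressed in milliseconds.
const LOOKBACK_MS = 60 * 60 * 1000;

// Return completed calls that ended inside the lookback window and have not
// been scanned yet. `alreadyScanned` is an assumed set of processed call IDs.
function selectCallsToScan(
  calls: CallRecord[],
  nowMs: number,
  alreadyScanned: Set<string>,
): CallRecord[] {
  const cutoff = nowMs - LOOKBACK_MS;
  return calls.filter(
    (c) =>
      c.status === "completed" &&
      c.endedAtMs >= cutoff &&
      c.endedAtMs <= nowMs &&
      !alreadyScanned.has(c.id),
  );
}
```

Whether the real scan deduplicates like this or simply re-asserts each transcript on every run is not stated above; re-running is harmless as long as the assertions are idempotent.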
02
The assertions
- 01 dispatch_lead was called. A call without dispatch_lead is a wasted lead. Hard fail.
- 02 No date-of-birth requested. Privacy red line. The receptionist must never ask for DOB.
- 03 No specific clinic address given. If the agent reveals the provider's address, the customer can bypass the dispatch and go to the provider directly. Hard fail.
- 04 No "None" / placeholder output after goodbye. Smell test that the prompt is producing real content end-to-end.
- 05 Correct service routing. A plomberie call must dispatch to plumbing, not dental. Routing matrix check.
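The five assertions above could be sketched roughly as follows. Everything here is an assumption made for illustration: the `Transcript` shape, the `vertical` argument on `dispatch_lead`, the `expectedVertical` field standing in for the routing matrix, and both regular expressions, which are crude heuristics rather than production-grade detectors.

```typescript
// All types below are hypothetical; adapt them to the real transcript payload.
type Vertical = "dental" | "plumbing";

interface Turn { role: "agent" | "user"; text: string; }
interface ToolCall { name: string; args: { vertical?: Vertical }; }
interface Transcript {
  turns: Turn[];
  toolCalls: ToolCall[];
  expectedVertical: Vertical; // assumed to come from the routing matrix
}
interface AssertionResult { id: string; passed: boolean; }

// Heuristic patterns, illustrative only; real checks would need tuning.
const DOB_PATTERN = /\b(date of birth|birth ?date|dob|when were you born)\b/i;
const ADDRESS_PATTERN =
  /\b\d{1,4}\s+(\w+\s+){0,3}(street|st|avenue|ave|road|rd|rue|boulevard|blvd)\b/i;

function runAssertions(t: Transcript): AssertionResult[] {
  // Only agent turns are checked: callers may say anything.
  const agent = t.turns.filter((x) => x.role === "agent").map((x) => x.text);
  const dispatch = t.toolCalls.find((c) => c.name === "dispatch_lead");
  // An empty or literal "None" final agent turn counts as placeholder output.
  const lastAgent = (agent[agent.length - 1] ?? "").trim();
  return [
    { id: "01-dispatch-lead-called", passed: dispatch !== undefined },
    { id: "02-no-dob-requested", passed: !agent.some((s) => DOB_PATTERN.test(s)) },
    { id: "03-no-address-given", passed: !agent.some((s) => ADDRESS_PATTERN.test(s)) },
    {
      id: "04-no-none-output",
      passed: lastAgent !== "" && lastAgent.toLowerCase() !== "none",
    },
    {
      id: "05-correct-routing",
      passed: dispatch !== undefined && dispatch.args.vertical === t.expectedVertical,
    },
  ];
}
```

Returning one result per assertion, rather than stopping at the first failure, keeps the report complete when a single call breaks several rules at once.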
03
Mandatory pre-deploy simulations
Any change to an ElevenLabs agent prompt, tool, voice or model must pass three simulation scenarios before going live:
- Dental emergency at 2am.
- Off-topic caller (asking the wrong question).
- Plomberie routing (multi-vertical confusion test).
All three must call dispatch_lead; none may ask for a date of birth, give a provider address, or end on "None". If any scenario fails, the prompt is rolled back. A/B testing exists but is reserved for non-urgent prompt tweaks, never for fixes or provider-data requirements.
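The gate described above can be pictured as a small pure function. This is a sketch under stated assumptions: the `ScenarioResult` fields, the scenario identifiers, and the `predeployGate` name are all hypothetical, and how the simulations are actually executed is out of scope here.

```typescript
// Hypothetical summary of one simulated call; field names are assumptions.
interface ScenarioResult {
  scenario: string;
  dispatchLeadCalled: boolean;
  askedForDob: boolean;
  gaveAddress: boolean;
  endedOnNone: boolean;
}

interface GateDecision { deploy: boolean; failures: string[]; }

// Illustrative identifiers for the three mandatory scenarios.
const REQUIRED_SCENARIOS = [
  "dental-emergency-2am",
  "off-topic-caller",
  "plomberie-routing",
];

// Deploy only if every required scenario ran and broke none of the rules.
function predeployGate(results: ScenarioResult[]): GateDecision {
  const failures: string[] = [];
  for (const name of REQUIRED_SCENARIOS) {
    const r = results.find((x) => x.scenario === name);
    if (r === undefined) {
      // A scenario that was never run counts as a failure, not a pass.
      failures.push(`${name}: not run`);
      continue;
    }
    if (!r.dispatchLeadCalled) failures.push(`${name}: dispatch_lead not called`);
    if (r.askedForDob) failures.push(`${name}: asked for date of birth`);
    if (r.gaveAddress) failures.push(`${name}: gave provider address`);
    if (r.endedOnNone) failures.push(`${name}: ended on "None"`);
  }
  return { deploy: failures.length === 0, failures };
}
```

A caller would roll back the prompt whenever `deploy` is false and surface `failures` in the deploy log.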
Cost of a real test call: ~$0.04. Cost of a missed lead: €100 plus the partnership. Skipping the simulation is mathematically irrational.
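As a rough worked version of that arithmetic, assume one test call per scenario, treat the dollar and the euro as roughly at par, and ignore the partnership cost entirely (all three are simplifying assumptions, not figures from the text). The break-even probability p* is the chance of losing a lead at which skipping the simulations costs as much as running them:

```latex
\[
C_{\text{sim}} \approx 3 \times 0.04 = 0.12,
\qquad
p^{*} = \frac{C_{\text{sim}}}{C_{\text{lead}}} \approx \frac{0.12}{100} = 0.0012 = 0.12\%
\]
```

Under these assumptions, running the three simulations is the cheaper choice whenever an untested change has more than about a 0.12% chance of costing a single lead, and counting the partnership only lowers that threshold.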