Engineering
How we shipped sub-second whisper latency on ElevenLabs Speech Engine.
Sub-second whisper latency was not a single optimization; it was twelve. We cut median end-to-end latency from 1.4 seconds to 412 milliseconds across eighty live calls, and the path there ran through chunked diarization, streaming token boundaries, and a brutal opinion about what counts as "ready to speak."
Dani Park · May 22, 2026