Together AI debuts unified voice agent infrastructure with Deepgram and Cartesia integrations, targeting enterprise deployments with end-to-end latency under 700ms.

Together AI Launches Voice Agent Platform With Sub-700ms Latency

2026/03/13 09:57

Lawrence Jengar Mar 13, 2026 01:57

Together AI rolled out a unified voice agent platform that keeps speech-to-text, language models, and text-to-speech processing on the same infrastructure cluster. The $3.3 billion AI cloud startup claims the setup delivers end-to-end latency under 700 milliseconds—fast enough for natural conversation flow.

The platform integrates natively with Deepgram for transcription and Cartesia for voice synthesis, both running on Together's co-located servers rather than bouncing audio across multiple cloud providers.

Why Co-Location Matters for Voice

Most production voice systems stitch together separate vendors for each pipeline stage. Audio hits one provider for transcription, routes to another for the LLM response, then bounces to a third for speech synthesis. Each handoff adds network latency and failure points.

Together's pitch: keep everything in the same datacenter. The company reports sub-500ms latency in optimal conditions, though 700ms is its stated ceiling for end-to-end processing.
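To make the handoff argument concrete, here is a minimal latency-budget sketch. Only the 700ms ceiling comes from the company's claim; every per-stage and per-hop number below is a hypothetical placeholder chosen for illustration, not a measurement from Together AI, Deepgram, or Cartesia.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Stage:
    name: str
    compute_ms: float  # hypothetical processing time, not a measured value


def end_to_end_ms(stages: Sequence[Stage], hop_ms: float, client_rtt_ms: float) -> float:
    """Total latency = stage compute + one handoff between each pair of
    adjacent stages + the caller's own round trip to the pipeline."""
    handoffs = max(len(stages) - 1, 0)
    return sum(s.compute_ms for s in stages) + handoffs * hop_ms + client_rtt_ms


# Illustrative numbers only (assumptions, not vendor data).
PIPELINE = [Stage("stt", 150.0), Stage("llm", 250.0), Stage("tts", 120.0)]
BUDGET_MS = 700.0      # the ceiling quoted in the article
CLIENT_RTT_MS = 80.0   # assumed identical in both scenarios

multi_vendor = end_to_end_ms(PIPELINE, hop_ms=60.0, client_rtt_ms=CLIENT_RTT_MS)
co_located = end_to_end_ms(PIPELINE, hop_ms=2.0, client_rtt_ms=CLIENT_RTT_MS)

print(f"multi-vendor: {multi_vendor:.0f} ms, within budget: {multi_vendor <= BUDGET_MS}")
print(f"co-located:   {co_located:.0f} ms, within budget: {co_located <= BUDGET_MS}")
```

With these assumed inputs, identical stage compute fits the budget only when the inter-stage hops are cheap. Real figures depend on model choice, region, and load.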

"Voice agents live or die by latency, and every network hop between providers is a place where the experience breaks down," said Abe Pursell, Deepgram's VP of Partnerships.

Model Flexibility Without the Patchwork

The platform supports Whisper Large v3, Minimax Speech 2.6 Turbo, Rime Arcana, and Kokoro alongside Together's full LLM catalog. Developers can swap components without rebuilding integrations—useful for teams testing different voice characteristics or transcription accuracy for specific use cases.

Cartesia brings its Sonic-3 and Sonic-2 TTS models to the platform. Deepgram contributes Nova-3 and Nova-3 Multilingual for transcription, Flux for conversational STT, and Aura-2 for synthesis.

Unlike opaque speech-to-speech systems, Together's modular approach preserves access to intermediate transcripts and response text. Teams can inspect, modify, and route data mid-stream—a requirement for many enterprise compliance workflows.
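The sketch below illustrates the general idea of a modular pipeline whose intermediate text can be inspected and modified. It is not Together AI's API: `run_turn`, the hook signature, and the three stub stages are hypothetical names invented for this example, and the stubs stand in for real STT, LLM, and TTS calls.

```python
import re
from typing import Callable, List, Sequence, Tuple

Transcriber = Callable[[bytes], str]
Responder = Callable[[str], str]
Synthesizer = Callable[[str], bytes]
TextHook = Callable[[str, str], str]  # (stage_name, text) -> possibly modified text


def run_turn(audio: bytes, stt: Transcriber, llm: Responder, tts: Synthesizer,
             hooks: Sequence[TextHook] = ()) -> bytes:
    """Run one conversational turn, passing each intermediate text through hooks."""
    transcript = stt(audio)
    for hook in hooks:
        transcript = hook("transcript", transcript)
    reply = llm(transcript)
    for hook in hooks:
        reply = hook("reply", reply)
    return tts(reply)


# Stub stages (assumptions): swap in real provider clients here.
def fake_stt(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_llm(text: str) -> str:
    return f"You said: {text}"


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")


audit_log: List[Tuple[str, str]] = []


def redact_digits(stage: str, text: str) -> str:
    """Mask digits before the text moves on to the next stage."""
    return re.sub(r"\d", "#", text)


def audit(stage: str, text: str) -> str:
    """Record what each stage produced, after redaction."""
    audit_log.append((stage, text))
    return text


out = run_turn(b"my card is 1234", fake_stt, fake_llm, fake_tts,
               hooks=[redact_digits, audit])
print(out)
print(audit_log)
```

Because each stage is a plain callable, replacing one component leaves the others untouched, and the hooks see the transcript and reply text that a single speech-to-speech model would not expose.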

Enterprise Requirements and Production Use

The platform targets regulated industries with zero data retention options, SOC 2 Type II certification, HIPAA compliance, and dedicated data residency. Decagon, which runs customer support voice agents handling billing inquiries and technical troubleshooting, already operates on the stack.

Together AI raised $305 million in February 2025 at a $3.3 billion valuation, with reports suggesting the company is now in talks to raise at $7.5 billion. The company has surpassed 450,000 developers and crossed $100 million in annualized revenue.

The voice platform launch represents Together's expansion beyond its core LLM inference business into the growing voice AI market, where latency and reliability remain persistent pain points for production deployments.
