voice-ai

47 items

CASE·OpenAI Blog·5/7/2026

Parloa builds service agents customers want to talk to

Parloa leverages OpenAI models to power scalable, voice-driven AI customer service agents. This enables enterprises to design, simulate, and deploy reliable, real-time interactions.

customer-experience AI customer service OpenAI Enterprise AI

RESEARCH·DEV.to AI·7d ago

VibeVoice Technical Report

This technical report details the architecture, methodologies, and performance metrics of the VibeVoice system. It provides an in-depth analysis of its underlying technology and implementation.

AI system speech technology audio processing technical report

DOC·DEV.to AI·14d ago

🎤 Building a Real-Time Voice AI Assistant Using Open Source Tools

This project details the creation of a real-time Voice AI assistant using entirely open-source tools and APIs, focusing on building a full voice conversation pipeline. The author emphasizes understanding the underlying mechanics, addressing challenges like latency to make conversations feel natural, and provides a free-to-build solution.

Open Source AI assistant tutorial real-time AI

ARTICLE·DEV.to AI·4/26/2026

Building Voice AI for Vertical SaaS: Lessons from Vet Clinics and Auto Shops

The article details lessons learned from developing Voice AI receptionists for vertical SaaS in veterinary clinics and auto repair shops. It argues that vertical-specific NLU and intent classifiers are needed to handle high call volumes and complex triage requirements, and it addresses the revenue lost to missed calls.

AI applications customer service Vertical SaaS NLU

DOC·DEV.to AI·22d ago

I Built a Voice AI Tutor in 200 Lines of Code (and Zero Backend)

This article demonstrates how to build a voice AI tutor in just 200 lines of code, with no backend. It explains the core architecture of voice AI: converting audio to text, sending it to an AI brain, and turning the reply back into audio.

learning Speech-to-Text Text-to-Speech browser AI
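
The three-stage loop described in the tutor item above (audio to text, text to an AI reply, reply back to audio) can be sketched as follows. This is a minimal illustration, not the article's code: `VoiceTurnPipeline` and the `fake_*` stages are hypothetical names, and the stand-in stages pass text through as bytes so the example runs without any speech or model service.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class VoiceTurnPipeline:
    """One conversational turn: speech-to-text, then model reply, then text-to-speech."""
    transcribe: Callable[[bytes], str]   # speech-to-text stage
    respond: Callable[[str], str]        # "AI brain" stage
    synthesize: Callable[[str], bytes]   # text-to-speech stage

    def run_turn(self, audio_in: bytes) -> bytes:
        text = self.transcribe(audio_in)
        reply = self.respond(text)
        return self.synthesize(reply)


# Hypothetical stand-in stages: real ones would call STT, LLM, and TTS services.
def fake_stt(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_llm(prompt: str) -> str:
    return f"You asked: {prompt}"


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")


pipeline = VoiceTurnPipeline(fake_stt, fake_llm, fake_tts)
reply_audio = pipeline.run_turn(b"what is a derivative?")
print(reply_audio)
```

Keeping each stage behind a plain callable means any one of them can be swapped (for example, a browser speech API for a hosted one) without touching the turn logic.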

ARTICLE·DEV.to AI·4/25/2026

I Built an AI That Makes Real Phone Calls — Here's the Architecture

The author built an AI agent called CallPilot that makes real phone calls, such as ordering food, at very low cost. The article details the architecture behind this open-source tool, including the WebSocket bridge and the RAG layer.

real-time communication AI architecture automation AI agents

ARTICLE·DEV.to AI·20d ago

Voice AI metrics no one writes about but every production team tracks

This article highlights the critical importance of end-to-end latency in voice AI systems, noting that delays beyond a quarter-second severely degrade user experience. It proposes a comprehensive method for tracking latency from microphone driver to TTS engine, accounting for network jitter and I/O overhead beyond just model inference time.

Performance Metrics production user experience latency
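
The end-to-end tracking idea in the item above can be sketched as per-stage timestamps measured against a quarter-second budget. This is an illustrative sketch under assumptions: `TurnLatency`, the stage names, and the sample timings are hypothetical and not taken from the article, and a fake clock replaces `time.perf_counter` so the numbers are reproducible.

```python
import time
from typing import Callable, Dict, List, Tuple

LATENCY_BUDGET_S = 0.25  # quarter-second threshold mentioned in the summary


class TurnLatency:
    """Records a timestamp per pipeline stage for one conversational turn."""

    def __init__(self, clock: Callable[[], float] = time.perf_counter) -> None:
        self._clock = clock
        self._marks: List[Tuple[str, float]] = []

    def mark(self, stage: str) -> None:
        self._marks.append((stage, self._clock()))

    def stage_durations(self) -> Dict[str, float]:
        # Time attributed to a stage = gap between the previous mark and its own.
        pairs = zip(self._marks, self._marks[1:])
        return {name: t - prev_t for (_, prev_t), (name, t) in pairs}

    def end_to_end(self) -> float:
        if len(self._marks) < 2:
            return 0.0
        return self._marks[-1][1] - self._marks[0][1]

    def over_budget(self) -> bool:
        return self.end_to_end() > LATENCY_BUDGET_S


# Fake clock with made-up timings (seconds) so the run is deterministic.
ticks = iter([0.00, 0.03, 0.11, 0.23, 0.31])
turn = TurnLatency(clock=lambda: next(ticks))
for stage in ["mic_capture", "network_in", "stt_done", "llm_done", "tts_first_audio"]:
    turn.mark(stage)

print(turn.stage_durations())
print(turn.end_to_end(), turn.over_budget())
```

Marking stages outside the model call (capture, network, first audio out) is what surfaces jitter and I/O overhead that inference timing alone would hide.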

ARTICLE·DEV.to AI·5/7/2026

Voice AI for jobsite estimating: a developer perspective

The main challenge in developing voice AI for jobsite estimating is not the technology but the user experience on the jobsite. This article details the technical and UX decisions one company made to fit voice interfaces to blue-collar workers, with the aim of helping others avoid common mistakes.

UX/UI developer guide Speech Recognition voice-ai

DOC·DEV.to AI·5/4/2026

Voice AI for Jobsite Estimating: A Developer's Practical Guide

This article provides a practical guide for developers on deploying voice AI solutions for construction jobsite estimating. It details technical and UX lessons learned from over 50 real jobsites, highlighting how voice AI addresses the unique challenges of field workers compared to traditional SaaS.

Estimating construction UX developer guide

ARTICLE·DEV.to AI·4/15/2026

voice- Agent model

This article details the creation of a modern, responsive Voice-Controlled AI Agent, capable of understanding context and performing complex technical tasks. It outlines the architecture, which includes leveraging the Groq LPU Inference Engine and Whisper Large V3 for ultra-fast Speech-to-Text transcription.

Whisper AI agent Groq LPU Speech-to-Text

DOC·DEV.to AI·4/26/2026

GPT-4o Voice on a Real Phone Line: Connecting OpenAI Realtime API to Actual Calls

This post details the architecture and code to integrate OpenAI's GPT-4o Realtime API with actual phone lines. It addresses the challenges of connecting WebSockets (OpenAI) to RTP/SIP (phone networks) and introduces VoIPBin as an audio translation solution.

telephony real-time GPT-4o API Integration

ARTICLE·DEV.to AI·4/13/2026

🎤 Building a Voice AI Assistant using STT, LLM, and Gradio

This article details the construction of a Voice AI Assistant, integrating speech-to-text (AssemblyAI) and a local LLM (Ollama) for intent detection. The system allows users to perform actions such as creating files and generating code through spoken commands.

AI assistant Gradio STT LLM

ARTICLE·DEV.to AI·4/12/2026

Building a Voice-Controlled AI Agent with Groq and Streamlit

The article describes building a local voice-controlled AI agent that performs tasks on a computer without requiring a GPU. It relies on the Groq API for fast inference, which keeps latency low enough for a voice agent to feel responsive.

Groq AI agent Streamlit LLM

CASE·DEV.to AI·4/27/2026

We Built a Voice AI Receptionist in 8 Weeks — Every Decision We Made and Why

The author's team built a voice AI receptionist for healthcare clinics in just 8 weeks, going from concept to a production system handling live patient calls 24/7. The system processes thousands of calls monthly, addressing the issue of missed after-hours calls and freeing up staff time.

customer-service-automation AI implementation healthcare AI voice-ai

CASE·OpenAI Blog·5/6/2026

Uber uses OpenAI to help people earn smarter and book faster

Uber is utilizing OpenAI to power AI assistants and voice features, aiming to help drivers earn smarter and riders book faster. This enhances the efficiency of its global real-time marketplace.

Uber OpenAI AI Assistants Ride-sharing

NEWS·OpenAI Blog·5/7/2026

Advancing voice intelligence with new models in the API

OpenAI is introducing new realtime voice models in its API that can reason, translate, and transcribe speech. These advancements enable more natural and intelligent voice experiences.

OpenAI Translation API Speech-to-Text

CASE·OpenAI Blog·5/4/2026

How OpenAI delivers low-latency voice AI at scale

OpenAI rebuilt its WebRTC stack to power real-time voice AI with low latency, global scale, and seamless conversational turn-taking. This technical endeavor focused on optimizing their infrastructure for highly responsive AI interactions.

OpenAI low-latency WebRTC real-time AI

ARTICLE·Analytics Vidhya·4/30/2026

Grok Voice Think Fast 1.0: Build Voice AI Agents That Actually Think

xAI launched Grok Voice Think Fast 1.0, an innovative voice assistant that enables rational, uninterrupted spoken dialogues. This model quickly became the top performer on the τ-voice Bench leaderboard in April 2026.

Grok xAI Benchmarking AI agents

DOC·Google for Developers (YouTube)·4/29/2026

Building Voice Agents with Gemini Live API and Agora’s Conversational AI

This video covers building interactive voice agents with the Gemini Live API and Agora's Conversational AI technology. It serves as a practical guide for developers interested in creating advanced voice solutions.

development Agora Gemini API Conversational AI

ARTICLE·DEV.to AI·28d ago

Digitizing construction quotes: how voice AI is changing tradespeople's daily work

Small construction businesses lose hours manually preparing quotes due to inadequate digital tools on-site. Voice AI can optimize this process, allowing artisans to dictate quotes directly on the construction site in real-time.

construction productivity Digitalization AI