CASE STUDY

Voice AI Bot for HR Screening & Workflow Automation

Industry: Human Resources / Recruitment

Region: International

Timeline: Full-cycle engagement

Team: Trembit dedicated engineering team

Trembit Services: Software Development, R&D, WebRTC Development

Voice AI: Speech-to-Text, Text-to-Speech

LLM: OpenAI and Gemini APIs

Language: Python

The Problem

A staffing company processing thousands of applications per month was bottlenecked at the initial screening stage. Every candidate who passed the resume filter needed a fifteen-minute phone call in which a recruiter asked standardized qualifying questions, took notes, and decided whether to advance the candidate. The questions and criteria were well defined, but each recruiter could handle only about twenty calls per day. During hiring surges, qualified candidates waited days for a screening call, and the best ones accepted offers elsewhere before a recruiter reached them.

The company needed a system that could make these screening calls autonomously: conducting scripted voice conversations, understanding spoken responses, summarizing answers, rating candidates against criteria, and delivering a prioritized list so human recruiters could focus on the candidates who needed personal attention. Text-based screening had already failed because candidates expected a phone call, and response rates were far lower than for calls.

Why Building a Voice AI Screening Bot for Recruitment Is Hard

Automating phone-based candidate screening combines real-time voice processing with conversational AI and domain-specific evaluation — each with challenges that multiply when they must work together in a live call:

Natural-sounding voice interaction that does not feel robotic: candidates form an impression within seconds, and a synthetic voice, mechanical pacing, or awkward pauses cause them to disengage, pushing screening completion rates below those of a text chatbot
Accurate speech-to-text in real-world phone conditions — candidates call from cars, coffee shops, and construction sites with varied accents and audio quality, and the system must extract the content of answers regardless
Conversational flow management in voice — the bot must detect when the candidate has finished speaking (not just paused), handle interruptions, ask follow-ups when answers are incomplete, and recover from off-script moments without feeling like an interrogation
Consistent candidate evaluation from unstructured spoken responses — two candidates can give equivalent answers in completely different words; the AI must evaluate substance against criteria, not pattern-match keywords, to assess everyone fairly
Compliance and candidate experience in automated calls — the system must identify itself as AI, handle opt-outs immediately, respect calling-time restrictions across time zones, and record only with consent
Integration with existing HR workflows — screening results must flow into the ATS as structured data recruiters can filter and act on without listening to every recording

What We Did

Voice Conversation Engine

Built the voice conversation engine that conducts screening calls end-to-end — initiating outbound calls, greeting candidates with a natural voice, identifying itself as an AI assistant, and navigating the script with contextual branching
Implemented real-time speech-to-text using ASR optimized for phone-quality audio, handling compression artifacts and background noise, with accent-adaptive processing for varied accents
Developed natural text-to-speech and turn-taking with silence detection — distinguishing a thinking pause from a finished answer, handling interruptions, and recovering when audio drops
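
As a rough illustration of the silence-detection step, the sketch below classifies the gap after the last speech frame by its duration. It is a baseline under stated assumptions, not the production engine: the frame length, the energy threshold, and both duration cut-offs are hypothetical values, and the audio is assumed to arrive as frames of 16-bit PCM samples.

```python
import math
from typing import Iterable, Sequence

# All four constants are illustrative assumptions, not values from the project.
FRAME_MS = 20             # duration of one audio frame
SILENCE_RMS = 500.0       # frames below this energy count as silence
THINKING_PAUSE_MS = 700   # gaps at least this long look like a thinking pause
END_OF_TURN_MS = 1500     # gaps at least this long end the candidate's turn


def frame_rms(samples: Sequence[int]) -> float:
    """Root-mean-square energy of one frame of 16-bit PCM samples."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def classify_gap(frames: Iterable[Sequence[int]]) -> str:
    """Classify the trailing silence that follows the last speech frame."""
    silent_ms = 0
    heard_speech = False
    for frame in frames:
        if frame_rms(frame) >= SILENCE_RMS:
            heard_speech = True
            silent_ms = 0          # speech resets the silence counter
        elif heard_speech:
            silent_ms += FRAME_MS  # count silence only after speech has started
    if not heard_speech:
        return "no_speech"
    if silent_ms >= END_OF_TURN_MS:
        return "end_of_turn"
    if silent_ms >= THINKING_PAUSE_MS:
        return "thinking_pause"
    return "speaking"
```

With these thresholds, 800 ms of silence after speech is reported as a thinking pause and 1,600 ms as the end of the turn. Duration alone is a blunt signal, which is why a real system would likely combine it with other cues.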

Screening Script & Conversational AI

Designed the dynamic screening script engine — configurable question sets per role with conditional branching where follow-up questions depend on previous answers
Implemented conversational AI using OpenAI and Gemini APIs — processing transcribed responses in real time, deciding whether an answer needs a follow-up, and handling off-script questions with prepared responses
Developed answer completeness detection and multi-language support — generating natural follow-up prompts for vague responses and switching the conversation to a supported language when needed
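
A configurable question set with conditional branching could be represented roughly as follows. This is a minimal sketch: the `Question` class, the answer labels ("yes", "no", "unclear"), and the warehouse script are hypothetical, and the labels are assumed to come from an upstream step in which the LLM classifies the transcribed answer.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Question:
    id: str
    prompt: str
    # Answer label (assumed to be produced upstream by the LLM) -> next question id.
    branches: dict[str, str] = field(default_factory=dict)
    default_next: Optional[str] = None  # None ends the script


def next_question(script: dict[str, Question], current_id: str, label: str) -> Optional[str]:
    """Pick the follow-up for a classified answer, falling back to the default path."""
    question = script[current_id]
    return question.branches.get(label, question.default_next)


def walk(script: dict[str, Question], start_id: str, labels: dict[str, str]) -> list[str]:
    """Return the question ids asked for a given set of classified answers."""
    asked: list[str] = []
    current: Optional[str] = start_id
    while current is not None and current not in asked:  # guard against cycles
        asked.append(current)
        current = next_question(script, current, labels.get(current, "unclear"))
    return asked


# Hypothetical question set for a warehouse role.
WAREHOUSE_SCRIPT = {
    q.id: q
    for q in [
        Question("certified", "Do you hold a forklift certification?",
                 branches={"yes": "cert_expiry", "no": "willing_to_train"},
                 default_next="availability"),
        Question("cert_expiry", "When does your certification expire?",
                 default_next="availability"),
        Question("willing_to_train", "Would you complete certification training?",
                 default_next="availability"),
        Question("availability", "When could you start?"),
    ]
}
```

Keeping the script as plain data means a new role only needs a new question set, not new code. An answer that matches no branch simply follows the default path.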

Response Analysis & Candidate Rating

Built the response summarization engine — condensing each candidate's answers into structured summaries capturing experience, skills, availability, and salary expectations
Implemented candidate scoring against role-specific criteria with weighted ratings, calibrated so that equivalent answers receive equivalent scores regardless of phrasing
Built sentiment and engagement analysis capturing soft signals (enthusiasm, confidence) as supplementary data points for recruiters
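
Weighted scoring against role-specific criteria can be sketched as a weighted average normalized to a 0-100 scale. The criterion names, the weights, and the 0-5 rating scale below are illustrative assumptions. The per-criterion ratings are assumed to come from the LLM evaluation step.

```python
from dataclasses import dataclass

MAX_RATING = 5.0  # assumed rating scale: each criterion is rated 0..5


@dataclass(frozen=True)
class Criterion:
    name: str
    weight: float


def composite_score(criteria: list[Criterion], ratings: dict[str, float]) -> float:
    """Weighted average of per-criterion ratings, scaled to 0..100.

    A criterion with no rating counts as 0, so missing answers lower the score.
    """
    total_weight = sum(c.weight for c in criteria)
    if total_weight <= 0:
        raise ValueError("criteria must have a positive total weight")
    weighted = sum(c.weight * ratings.get(c.name, 0.0) for c in criteria)
    return round(100 * weighted / (MAX_RATING * total_weight), 1)


# Hypothetical criteria for one role.
ROLE_CRITERIA = [
    Criterion("experience", 3.0),
    Criterion("availability", 2.0),
    Criterion("communication", 1.0),
]
```

For example, ratings of 4, 5, and 3 on the three criteria give (3·4 + 2·5 + 1·3) / (5·6) = 25/30, a composite score of 83.3.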

HR Dashboard, Prioritization & Deployment

Built the HR prioritization dashboard — screened candidates ranked by composite score, filterable by role and qualifications, with one-click access to summary, transcript, recording, and scoring breakdown
Implemented automated prioritization into tiers (advance, review, does not meet criteria) and ATS integration pushing summaries, scores, and recordings into the existing workflow
Built scheduling and compliance management plus call-quality monitoring — permitted-hours calling, consent recording, audit trail, and alerts when metrics drift
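
The permitted-hours rule can be illustrated with a small check that converts the current time into the candidate's local time before dialing. The 09:00-20:00 window and the function name are assumptions for this sketch. Actual calling-time rules depend on the jurisdiction.

```python
from datetime import datetime, time, tzinfo

# Assumed calling window in the candidate's local time.
CALL_WINDOW_START = time(9, 0)
CALL_WINDOW_END = time(20, 0)


def may_call(now_utc: datetime, candidate_tz: tzinfo, has_opted_out: bool) -> bool:
    """Return True only if the candidate has not opted out and it is within
    the permitted window in the candidate's own time zone."""
    if has_opted_out:
        return False  # an opt-out always wins, whatever the hour
    if now_utc.tzinfo is None:
        raise ValueError("now_utc must be timezone-aware")
    local = now_utc.astimezone(candidate_tz)
    return CALL_WINDOW_START <= local.time() < CALL_WINDOW_END
```

Any `tzinfo` works here, including a `zoneinfo.ZoneInfo` built from the candidate's IANA zone name, which also accounts for daylight saving time.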


Key Results

10x throughput: Hundreds of candidates screened per day, versus twenty per recruiter

Same-day: Candidates contacted within hours of application, not days

Standardized: Every candidate assessed against identical criteria with calibrated scoring

80%+ time saved: Reduction in initial screening hours, freeing recruiters for high-value interviews

Professional: Natural voice interaction with opt-out, rescheduling, and consent management

In Their Words

Trembit built us a voice bot that screens candidates the way our best recruiters do — it asks the right questions, listens to the answers, and tells us who to talk to next. We went from a three-day screening backlog to same-day turnaround, and we are not losing top candidates to competitors anymore.

VP of Talent Acquisition, staffing company

Their proactive team gets things done as if it were their own project.

Trembit client

What We Learned

Turn-taking in voice AI is harder than understanding the words

Speech recognition accuracy matters, but conversational rhythm makes or breaks the candidate experience. Early versions had accurate transcription but awkward timing — speaking before the candidate finished or leaving uncomfortable silences. We built a multi-signal turn-taking model combining audio energy, speech cadence, syntactic completeness, and prosody to predict the right moment to respond. Call completion rates improved immediately.
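
One way to picture a multi-signal approach is a weighted combination of normalized cues compared against a threshold. The sketch below is an assumption-laden illustration, not the model described above: the field names, weights, and threshold are hypothetical, and each signal is assumed to be computed upstream and scaled to the 0-1 range.

```python
from dataclasses import dataclass

# Illustrative weights and threshold; a real system would tune or learn these.
WEIGHTS = {"silence": 0.35, "cadence": 0.15, "syntax": 0.30, "prosody": 0.20}
SILENCE_SATURATION_MS = 1500
RESPOND_THRESHOLD = 0.6


@dataclass(frozen=True)
class TurnSignals:
    silence_ms: int               # trailing low-energy audio
    cadence_slowdown: float       # 0..1, how much the speech rate dropped
    syntactic_completeness: float # 0..1, does the transcript end a full clause
    falling_pitch: float          # 0..1, prosodic end-of-utterance cue


def end_of_turn_score(signals: TurnSignals) -> float:
    """Combine the four cues into a single 0..1 end-of-turn score."""
    silence = min(signals.silence_ms / SILENCE_SATURATION_MS, 1.0)
    return (
        WEIGHTS["silence"] * silence
        + WEIGHTS["cadence"] * signals.cadence_slowdown
        + WEIGHTS["syntax"] * signals.syntactic_completeness
        + WEIGHTS["prosody"] * signals.falling_pitch
    )


def should_respond(signals: TurnSignals) -> bool:
    """Respond only when the combined evidence crosses the threshold."""
    return end_of_turn_score(signals) >= RESPOND_THRESHOLD
```

With these weights, a long silence on its own scores 0.35 and does not trigger a response, so a candidate who stops mid-sentence to think is not interrupted. A shorter pause after a complete sentence with falling pitch does trigger one.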

Candidate rating consistency requires evaluation against criteria, not comparison between candidates

The initial approach ranked candidates relative to the pool — so good answers scored low in a strong pool and mediocre answers scored high in a weak one. We switched to absolute criteria evaluation, scoring each response against role requirements independently. The rating a candidate received on Monday was the same on Friday, and HR teams trusted scores because they were stable and explainable.
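
The difference between the two approaches shows up in a toy example. The scores, the pools, and the bar of 70 below are invented purely for illustration.

```python
def percentile_rank(score: float, pool: list[float]) -> float:
    """Relative rating: share of the pool (in percent) scoring strictly below."""
    if not pool:
        raise ValueError("pool must not be empty")
    return 100 * sum(1 for s in pool if s < score) / len(pool)


def meets_bar(score: float, bar: float = 70.0) -> bool:
    """Absolute rating: compare against a fixed role requirement."""
    return score >= bar


# Hypothetical data: the same candidate score judged against two pools.
CANDIDATE_SCORE = 78.0
STRONG_POOL = [82.0, 85.0, 88.0, 90.0, 91.0]
WEAK_POOL = [40.0, 45.0, 52.0, 55.0, 60.0]
```

The same score of 78 lands at the 0th percentile in the strong pool and the 100th in the weak one, while the absolute check returns the same answer regardless of who else applied that week.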

The highest-impact feature is the structured summary, not the call itself

If a recruiter has to listen to the recording to evaluate a candidate, the bot has not saved time — it has shifted the work. The summarization engine distills each answer into two or three sentences, flags qualifications, and notes concerns in a format recruiters scan in thirty seconds. Recruiters told us they made better decisions faster from the summaries than from their own handwritten notes during live calls.
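
A structured summary of this kind might be modeled as a small record that serializes to JSON for the ATS. The field names and the payload shape are hypothetical. No specific ATS API is implied.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class AnswerSummary:
    question_id: str
    summary: str  # two or three sentences distilled from the spoken answer


@dataclass
class ScreeningSummary:
    candidate_id: str
    role: str
    composite_score: float
    tier: str  # "advance", "review", or "does not meet criteria"
    answers: list[AnswerSummary] = field(default_factory=list)
    qualifications: list[str] = field(default_factory=list)  # flagged matches
    concerns: list[str] = field(default_factory=list)        # points to probe


def to_ats_payload(summary: ScreeningSummary) -> str:
    """Serialize the summary, including nested answers, as a JSON string."""
    return json.dumps(asdict(summary), ensure_ascii=False, sort_keys=True)
```

Because every field is typed and named, recruiters can filter on `tier` or `qualifications` in the ATS instead of opening the transcript.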


Need a Voice AI Bot?

Book a 30-minute architecture session — we'll discuss your HR automation requirements and the infrastructure decisions that matter most. No pitch deck. Just engineering clarity.