Voice Verification catches identity fraud in AI-led interviews by comparing the voice on the AI Call against a voiceprint built from the candidate's earlier qualifying video answers. The result appears as a badge on the candidate profile alongside TalkScore — giving your team a clear, structured identity-confidence signal on every completed interview.
Note: This feature requires a Company Admin to enable Voice Verification for your tenant. Owner access is required to configure which questions capture the voiceprint. Contact your Account Manager to request activation.
The Problem We Designed Around
Most candidates completing AI assessments today are doing it on a mobile phone. In markets like the Philippines, that number is close to 90%.
The cheating methods that most proctoring tools are built to catch — screen switching, opening a second tab, toggling to another app — require a desktop. On mobile, there is nothing to switch to. The signal those tools generate in a mobile-first context is not meaningful.
The method that actually works today costs almost nothing: two phones. The candidate holds one up to the camera. A second person answers the AI Call on the other. No screen switch happens. No visual anomaly appears. The recording looks clean.
There was a second constraint we did not want to ignore. A full-session video interview carries roughly 200 MB of data. For a candidate completing an assessment on a mobile data plan, that is a real cost — and it shows up directly in completion rates. Asking every candidate to absorb that overhead to catch a threat the tool is not actually detecting is not a trade-off worth making.
These were the two things we designed around: catch the cheat that is actually happening, without making it harder — or more expensive — for candidates to complete the assessment.
A Better Approach: Voice Verification
Voice Verification is built around the cheating method that is actually happening. Multi-speaker diarization analyses every AI Call recording for the number of distinct voices present. If three or more are detected — the candidate, the AI agent, and a third speaker — the system hears the handed-off interview and flags it, regardless of what the camera shows. No screen needs to be watched. No visual cue needs to appear.
Enrollment uses the qualifying video the candidate already records as part of your campaign. There is no separate proctoring session, no additional step for the candidate, and no extra bandwidth overhead. The AI Call itself is voice-only. In areas where connectivity is poor, the assessment can also run over the phone. The architecture was chosen specifically so that protecting interview integrity does not come at the candidate's expense.
Detecting the handed-off interview — three or more distinct voices on a call trigger an automatic mismatch, regardless of what appears on camera.
Removing the bandwidth burden — short qualifying video for enrollment plus a voice-only AI Call uses a fraction of a full video interview session's data footprint.
Catching cheap voice clone attempts as a side effect — a low-quality instant voice clone still carries detectable differences that the verification model picks up when compared against the enrolled voiceprint.
Surfacing results where your team already works — the three verification attributes plug into existing filters, exports, and rules engines with no integration changes required.
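The three-or-more-voices rule described above can be sketched in a few lines. This is only an illustration: the `Segment` shape, the speaker labels, and the function names are hypothetical, and the actual diarization output is not documented here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One diarized stretch of speech (hypothetical shape, not the vendor's format)."""
    speaker: str    # label assigned by the diarizer, e.g. "spk_2"
    start_s: float  # segment start, in seconds
    end_s: float    # segment end, in seconds


# A normal AI Call has two voices: the candidate and the AI agent.
MAX_EXPECTED_SPEAKERS = 2


def count_speakers(segments):
    """Number of distinct speaker labels across all segments."""
    return len({seg.speaker for seg in segments})


def is_handed_off(segments):
    """True when three or more distinct voices are present on the recording."""
    return count_speakers(segments) > MAX_EXPECTED_SPEAKERS


clean_call = [
    Segment("agent", 0.0, 4.2),
    Segment("candidate", 4.5, 11.0),
    Segment("agent", 11.3, 14.0),
]
handed_off_call = clean_call + [Segment("spk_2", 14.2, 20.0)]

print(is_handed_off(clean_call))       # two voices: not flagged
print(is_handed_off(handed_off_call))  # third voice: flagged
```

The check depends only on the audio, which is why the outcome is the same regardless of what the camera shows.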
What About Screen Capture?
It is a natural instinct to want both — voice verification and screen capture running alongside each other. We considered it and chose not to ship screen capture as part of this feature.
The fraud pattern screen capture is meant to deter — someone else answering on behalf of the candidate — is already caught by multi-speaker diarization. If a third voice appears on the AI Call recording, the system flags it regardless of what the camera shows. There is no additional signal that screen capture adds for this scenario.
What screen capture does add is cost: bandwidth overhead, friction at the point of assessment, and a privacy concern for candidates. We chose not to ship it because it would ask more of every candidate without giving your team anything that Voice Verification does not already provide.
Voice Verification and AI-Generated Voices
Voice Verification can flag low-quality voice cloning attempts when the AI Call voice does not sufficiently match the enrolled voiceprint. Comparisons between a real voice and its Text-to-Speech-generated version have resulted in a “no match” outcome under the same verification process.
Voice Verification compares the voice on the AI Call against the enrolled voiceprint captured from the candidate’s earlier qualifying answers. Differences introduced by synthetic or altered speech may reduce the confidence score and lead to a mismatch result.
More advanced synthetic voices or higher-quality cloning techniques may still pass if they closely resemble the enrolled voiceprint. This is because the system is designed to verify voice consistency between the AI Call and the original enrollment samples, not to detect synthetic speech as such.
How It Works
Once Voice Verification is enabled for your tenant, the three steps below run automatically for every candidate who has a voiceprint enrolled.
Candidate answers a flagged question. When a candidate responds to an audio or video question with Capture voiceprint enabled, their voice is captured silently in the background. The candidate sees no change in their experience.
AI Call is completed. After the candidate finishes their AI Call, the full recording is automatically sent for analysis using multi-speaker diarization.
Result appears on the candidate profile. A badge appears in the candidate profile header, showing the verification outcome and confidence score.
Reading the badge
Badge | What it means |
✅ Voice verified · N% | High-confidence match — voice on the AI Call matches the enrolled voiceprint (≥85% confidence). No action needed. |
🟡 Voice review · N% | Borderline match — recruiter judgment required (55–84% confidence). |
🔴 Voice mismatch · N% | Different voice detected or low confidence (<55%). Flag for follow-up. |
🔴 Voice mismatch | Vendor failure — no confidence score returned. Flag for follow-up. |
Note: "Voice verified" means the voice on the AI Call matched the candidate's earlier qualifying answers above the confidence threshold. It does not mean a government ID was verified.
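The thresholds in the badge table can be expressed as a small lookup. This is a minimal sketch, not the product's implementation: the function name is hypothetical, and it assumes that fractional scores between 84 and 85 fall into the review band and that the displayed percentage is rounded down.

```python
from typing import Optional

VERIFIED_MIN = 85.0  # "Voice verified" at 85% and above
REVIEW_MIN = 55.0    # "Voice review" from 55% up to the verified threshold


def badge_for(confidence: Optional[float]) -> str:
    """Map a confidence score (0 to 100) to the badge text shown on the profile."""
    if confidence is None:
        # Vendor failure: no confidence score was returned.
        return "Voice mismatch"
    if not 0 <= confidence <= 100:
        raise ValueError(f"confidence must be between 0 and 100, got {confidence}")

    pct = f"{int(confidence)}%"  # assumption: display is rounded down
    if confidence >= VERIFIED_MIN:
        return f"Voice verified · {pct}"
    if confidence >= REVIEW_MIN:
        return f"Voice review · {pct}"
    return f"Voice mismatch · {pct}"


for score in (92, 70, 40, None):
    print(score, "->", badge_for(score))
```

Keeping the two thresholds as named constants makes the band edges explicit, which matters for scores that land exactly on 55 or 85.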
How It Looks for the Candidate
Nothing changes for the candidate. Whether the feature is on or off, the application flow looks identical — no extra prompt, no additional disclosure, no new step.
How It Looks in Talkpush
Flag the questions that capture the voiceprint
Once Voice Verification is enabled, you can choose which audio or video questions will be used to build the voiceprint.
Open the Templates Tab and go to Questions.
Edit an audio or video question.
Tick the Capture voiceprint checkbox.
Click Save.
Note: The Capture voiceprint checkbox only appears on audio and video question types. Text, dropdown, and other non-speech questions cannot have voiceprint enrollment enabled.
Review results on the candidate profile
Once a candidate completes an AI Call and a voiceprint exists for their application, the badge populates automatically.
The following attributes are also generated and available in your existing campaign filters, exports, and rules engines. No integration changes are required.
Attribute | Definition |
Voice Verification | The outcome of the verification check, mirroring the badge states described under Reading the badge. |
Voice Verification Confidence | A percentage score (0–100) indicating how closely the AI Call voice matched the enrolled voiceprint. Higher means stronger match. |
Voice Verification At | The date and time the verification result was generated, recorded when the AI Call analysis completed. |
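As an illustration of using these attributes outside the profile view, the sketch below scans an export for candidates who need follow-up. Several things here are assumptions: that the export is a CSV, that its column headers match the attribute names above, that a `Candidate` column exists, and that the confidence cell is blank when no score was returned. The outcome strings in the sample data are placeholders, not documented values.

```python
import csv
import io

OUTCOME_COLUMN = "Voice Verification"
CONFIDENCE_COLUMN = "Voice Verification Confidence"


def needs_follow_up(row, verified_min=85.0):
    """True when a verification ran but did not reach the verified threshold."""
    if not (row.get(OUTCOME_COLUMN) or "").strip():
        return False  # no verification result, e.g. no voiceprint was enrolled
    raw = (row.get(CONFIDENCE_COLUMN) or "").strip()
    if not raw:
        return True  # outcome recorded without a score, e.g. vendor failure
    return float(raw) < verified_min


# Hypothetical export content, for illustration only.
SAMPLE_EXPORT = """Candidate,Voice Verification,Voice Verification Confidence,Voice Verification At
Ana,verified,93,2024-05-01T10:00:00Z
Ben,review,61,2024-05-01T10:05:00Z
Cal,mismatch,,2024-05-01T10:10:00Z
Dee,,,
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_EXPORT)))
flagged = [row["Candidate"] for row in rows if needs_follow_up(row)]
print(flagged)
```

The function keys off the confidence score rather than the outcome string, so it does not depend on the exact wording of the outcome values.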
Frequently Asked Questions
Q: What happens if voiceprint enrollment fails on a candidate's answer?
A: A short error message appears under that specific answer in the candidate profile. The AI Call pipeline is never blocked: if no voiceprint exists when the call completes, the call finishes normally and no badge is shown.
Q: Do existing integrations need to be updated?
A: No. If you receive the AI Call webhook (push_ai_call), the three verification values appear automatically in the application.others block once the feature is enabled, with no changes required on your end.
Q: Is any legal review needed before enabling this feature?
A: Voiceprints are biometric data in several jurisdictions. If you operate in the EU, UK, or other regions with biometric data regulations, consult your Account Manager before enabling Voice Verification.
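For teams that consume the push_ai_call webhook, a tolerant reader along these lines would pick up the new values without breaking on payloads that lack them. This is a sketch under stated assumptions: the overall payload shape and the three key names are hypothetical. The source states only that three verification values appear in the application.others block.

```python
import json

# Hypothetical key names; check a real payload for the actual ones.
VERIFICATION_KEYS = (
    "voice_verification",
    "voice_verification_confidence",
    "voice_verification_at",
)


def extract_verification(payload: dict) -> dict:
    """Return whichever verification values are present in application.others."""
    application = payload.get("application") or {}
    others = application.get("others") or {}
    return {key: others[key] for key in VERIFICATION_KEYS if key in others}


# Illustrative payload only; the real webhook body may be shaped differently.
sample = json.loads("""
{
  "event": "push_ai_call",
  "application": {
    "others": {
      "voice_verification": "verified",
      "voice_verification_confidence": 91,
      "voice_verification_at": "2024-05-01T10:00:00Z"
    }
  }
}
""")

print(extract_verification(sample))
print(extract_verification({"application": {"others": {}}}))  # feature off: {}
```

Because missing keys simply yield an empty result, the same handler works before and after the feature is enabled.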