The Insights tab is your quality control center — it checks whether the interviews themselves were good. The Reports tab is where you review individual candidates. This article walks through both.
Insights Tab
The Insights tab contains four sub-tabs: Health, Feedback, Quality, and Alerts. Together, they give you a complete picture of agent behavior, scoring accuracy, and systemic issues.
Health tab
Health gives you the big-picture view of quality across all your AI agents. Three headline metrics sit at the top:
Metric | What it tells you |
With Issues (%) | Percentage of conversations that triggered at least one quality flag. One conversation can trigger multiple flags. |
With Bias (%) | Percentage of conversations where a bias or fairness flag was detected. Tracked separately because bias issues carry higher risk. |
Agents with Issues | How many active agents had at least one flag. If most agents are flagged, the problem is likely systemic. If only 1–2, it's specific to those agents. |
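The three headline metrics are simple aggregations over flagged conversations, which is worth keeping in mind when a conversation carries multiple flags. A minimal sketch of the arithmetic, assuming an illustrative record shape (the field names and flag labels below are not the actual export schema):

```python
# Hypothetical conversation records; field names and flag labels are illustrative only.
conversations = [
    {"agent": "CSR Screener", "flags": ["Hallucination", "Repetition"]},
    {"agent": "CSR Screener", "flags": []},
    {"agent": "Sales Screener", "flags": ["Gender Bias"]},
    {"agent": "Night Shift Screener", "flags": []},
]

# Illustrative subset of the nine bias types that are tracked separately.
BIAS_FLAGS = {"Gender Bias", "Racial Bias", "Age Bias"}

total = len(conversations)
# A conversation with several flags still counts once toward "With Issues."
with_issues = sum(1 for c in conversations if c["flags"])
with_bias = sum(1 for c in conversations if BIAS_FLAGS.intersection(c["flags"]))
agents_with_issues = len({c["agent"] for c in conversations if c["flags"]})

print(f"With Issues: {with_issues / total:.0%}")
print(f"With Bias: {with_bias / total:.0%}")
print(f"Agents with Issues: {agents_with_issues}")
```

Note that "With Issues" counts conversations, not flags, so it can stay flat even while the total flag count rises.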
Below these, a ranked list shows every flag type with its count and category. The full flag taxonomy is at the end of this article — skip down to Flag taxonomy reference when you need to look one up.
Flags are assigned one of three severity levels:
Severity | What it means | What to do |
🚨 Critical | Requires immediate action. | Contact your Talkpush representative immediately. Do not dismiss without investigation. |
⚠️ Warning | Needs monitoring and investigation. | Review flagged candidate reports and monitor for recurrence. |
ℹ️ Info | A new or infrequent flag has appeared. | Monitor for recurrence. Escalate if the flag appears across multiple candidates. |
Important: Bias flags are always Critical severity. The nine bias types monitored are gender, racial, age, socioeconomic, disability, cultural/nationality, language proficiency, accent/dialect, and interview format bias. Even a single bias flag requires professional review. Do not assume it is a false positive. Contact your Talkpush representative immediately.
When you see an alert — do's and don'ts
✅ Do | ❌ Don't |
Read alert details and review flagged reports | Dismiss Critical alerts without investigation |
Note alert type, severity, affected agent, time period | Assume bias flags are false positives |
Contact your Talkpush representative for Critical alerts | Tell candidates about quality flags or alerts |
Report recurring Warning or Info alerts | Attempt to change agent configuration yourself — contact your representative instead |
Below the flags, an Agent Performance table shows per-agent calls, completion rate, average score, std. deviation, average duration, and last call time. Use this to identify which specific agents are generating the most issues.
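If you pull per-call scores out of the platform, the average score and standard deviation shown in the Agent Performance table are easy to recompute and sanity-check. A sketch with made-up numbers (a very low standard deviation over many calls can hint at score compression):

```python
import statistics

# Hypothetical per-call overall scores grouped by agent; all values are illustrative.
calls = {
    "CSR Screener": [3.2, 4.0, 2.8, 3.6],
    "Sales Screener": [4.5, 4.4, 4.6, 4.5],
}

# Mean and population standard deviation per agent.
agent_stats = {
    agent: (statistics.mean(scores), statistics.pstdev(scores))
    for agent, scores in calls.items()
}

for agent, (avg, sd) in agent_stats.items():
    print(f"{agent}: avg={avg:.2f}, std dev={sd:.2f}")
```

Here the second agent's near-zero spread is the kind of pattern the Agent Performance table helps you spot.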
Feedback tab
Feedback captures human reviews of AI-generated scores. When a recruiter or QA reviewer looks at a candidate report and disagrees with (or confirms) the score, they can submit a Score Opinion.
Each entry shows the reviewer name, assessment, candidate, label (e.g., "Score Opinion"), comment (e.g., "overscored: Professionalism and Courtesy"), date, and status (New or Reviewed). Click Report to jump directly to the candidate's full assessment.
Why this matters: Human feedback provides the ground truth for AI scoring. If reviewers consistently flag a particular skill as overscored, that's a strong signal to tighten the rubric criteria. If reviews consistently confirm scores, you have validation that the rubric is working well. Share any patterns you spot with your Talkpush representative.
Quality tab
The Quality tab is a searchable, filterable log of every individual quality flag raised across all conversations. While Health shows aggregated counts, Quality lets you drill into specifics.
Filters available: time period, assessment (specific AI agent), campaign, category (Agent, Scoring, Technical, Candidate), and issue type (Hallucination, Repetition, etc.). Each row shows the date, candidate name and ID, campaign, category, issue type, and a plain-language explanation of what happened.
Use it to:
- Investigate a specific flag type — filter by Hallucination to see all instances. Read the explanations to understand the pattern (placeholder values read aloud? fabricated job details?).
- Audit a specific agent — filter by assessment to see all flags for one agent. Repeated flags point to a systemic configuration issue.
- Track improvement after a fix — filter by date to confirm a flag stopped appearing after a system prompt or rubric change.
Alerts tab
Alerts surfaces automated findings across Agent, Scoring, Health, and Quality categories. Alerts are generated automatically based on configurable thresholds — for example, when hallucination rates exceed a set percentage, or when bias flags appear above a defined frequency.
To receive immediate email notifications for Critical and Warning flags, go to Settings → Notifications and enable Critical Flag Alerts.
Reports Tab
The Reports tab shows every candidate who has been through a TalkScore AI interview. You can search and filter the list, export to CSV, and click into any candidate to see their complete assessment.
Quick stats
Above the table, summary cards give you an at-a-glance read of the candidates matching your filters:
Card | What it shows |
Total Reports | Count of reports matching your filters. |
Avg Score | Mean overall score across listed candidates. |
Score 4–5 | Count and percentage of high-performing candidates. |
Score 0–2 | Count of candidates who scored low (also shows how many scored exactly 3). |
Browsing the report list
Filter / Control | What it does |
Live Data toggle | Updates the list in real time as new interviews come in. Useful for monitoring live campaigns. |
Time period | Filter by Last 7 days, Last 30 days, or a custom date range. |
Status filter | Show only completed calls, or filter by score range or CEFR level. |
Assessment / Campaign filters | Narrow results to a specific AI agent or recruitment campaign. |
Test Calls toggle | Show or hide internal test calls. Off by default so only real candidate data shows. |
Export CSV | Download the full report list as a spreadsheet for offline analysis or sharing. |
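The exported CSV works with any spreadsheet tool, and a few lines of code can reproduce the summary cards offline. A sketch using Python's standard library (the column names here are assumptions, not the actual export schema):

```python
import csv
import io

# Illustrative export contents; real column names may differ.
exported = """candidate_id,name,overall_score
C-1001,Ana Reyes,4.2
C-1002,Liu Wei,1.8
C-1003,Sam Ortiz,3.0
"""

rows = list(csv.DictReader(io.StringIO(exported)))
scores = [float(r["overall_score"]) for r in rows]

avg_score = sum(scores) / len(scores)
high = sum(1 for s in scores if s >= 4)        # mirrors the "Score 4-5" card
low = sum(1 for s in scores if s <= 2)         # mirrors the "Score 0-2" card
exactly_three = sum(1 for s in scores if s == 3)

print(f"Total: {len(scores)}, Avg: {avg_score:.1f}, "
      f"4-5: {high}, 0-2: {low}, exactly 3: {exactly_three}")
```

In practice you would read the downloaded file with `open(...)` instead of the inline string.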
Reading a candidate report
Click any candidate row to open their full assessment. The report is organized into sections:
Section | What it contains |
Header | Name, assessment, overall score, date, duration, completion status, candidate ID, email, phone number, assessment agent. |
Interview Recording | Full MP3 audio player. Listen to the actual conversation alongside the transcript. |
Transcript | Complete conversation with speaker labels (agent name and candidate initials), timestamps, and turn count. This is the ground truth — always check it when reviewing or questioning a score. |
Per-Dimension Scores | Each soft skill scored 0–5 with a paragraph of AI reasoning that cites specific transcript evidence. The AI doesn't just assign a number — it explains its reasoning with direct references to what the candidate said. |
Data Extraction | Structured information automatically pulled from the conversation: candidate feedback summary, eligibility answers (work authorization, age, drug screen consent), rehire status, onsite training preference, and whether the candidate rejected the AI interview at any point. |
Sentiment Analysis | Overall sentiment summary describing how engagement evolved, per-question sentiment scores (1–5 per question showing tone shifts), and an overall sentiment shift (more positive, more negative, or neutral). Sentiment can surface concerns that scores miss. |
Candidate Questions | Questions the candidate asked during the interview, with timestamp, stage (e.g., "closing"), exact question text, and context. Often a signal of engagement. |
Agent Quality Analysis | AI review of the agent's behavior in this specific call — overall assessment plus each individual flag with severity, timestamp, exact transcript quote, and a plain-language explanation. |
Interpreting the TalkScore
The overall TalkScore is the average of individual soft skill dimension scores on a 0–5 scale. Score thresholds vary by client and role — confirm with your Talkpush representative if you're unsure what constitutes a pass for your specific assessment.
Score Range | General interpretation |
4–5 | Strong match. Candidate performed well across configured criteria. |
3–4 | Partial match. Review the per-dimension breakdown and transcript before deciding. |
0–3 | Did not meet criteria. Check for call quality issues (very short call, technical problems) before concluding. |
Note: Do not rely on the overall score alone. The per-dimension breakdown, sentiment analysis, Agent Quality Analysis, and transcript together give a much richer picture of each candidate.
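As a sketch of the arithmetic: the overall score is a plain mean of the per-dimension scores, and the general interpretation bands reduce to simple cutoffs. The cutoffs below only mirror the general table; actual pass thresholds vary by client and role.

```python
def talkscore(dimension_scores):
    """Overall TalkScore: mean of per-dimension soft skill scores (0-5 scale)."""
    return sum(dimension_scores.values()) / len(dimension_scores)

def interpret(score):
    # Illustrative cutoffs from the general interpretation table;
    # real thresholds are configured per client and role.
    if score >= 4:
        return "Strong match"
    if score >= 3:
        return "Partial match: review dimensions and transcript"
    return "Did not meet criteria: check call quality first"

# Illustrative dimension names and scores.
dims = {"Professionalism": 4, "Communication": 5, "Empathy": 3, "Problem Solving": 4}
overall = talkscore(dims)
print(overall, "->", interpret(overall))
```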
CEFR in reports
If your assessment includes language evaluation, each candidate receives a CEFR level from A1 (beginner) to C2 (near-native). Each report includes per-dimension language scores for Grammar, Fluency, Vocabulary, Pronunciation, and Comprehension (each on a 0–10 scale), plus a plain-language explanation of the candidate's English proficiency. CEFR scores are evaluated independently from soft skill scores.
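For reference, a sketch of how the five language sub-scores can be averaged when eyeballing a report. Note that the CEFR level itself is assigned by the system's language evaluation, not derived from a simple average; the numbers below are illustrative.

```python
LANG_DIMENSIONS = ("Grammar", "Fluency", "Vocabulary", "Pronunciation", "Comprehension")

# Illustrative per-dimension scores on the 0-10 scale described above.
language_scores = {
    "Grammar": 7, "Fluency": 6, "Vocabulary": 8, "Pronunciation": 6, "Comprehension": 7,
}

avg_language = sum(language_scores[d] for d in LANG_DIMENSIONS) / len(LANG_DIMENSIONS)
print(f"Average language sub-score: {avg_language:.1f}/10")
```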
Submitting score feedback
If you believe a score is incorrect after reviewing the full report, use the Feedback feature to submit a Score Opinion. Note which dimension seems wrong and why. Feedback appears in Insights → Feedback and helps the Talkpush team identify patterns for rubric refinement.
Common workflows
Investigating a hallucination spike
1. Go to Insights → Health and note the hallucination count.
2. Switch to the Quality tab and filter by Issue Type: Hallucination.
3. Read the explanations for the most recent flags. Look for the pattern:
   - Placeholder values read aloud (e.g., "This is a None, None Remote CSR role") — the agent's data fields aren't populated correctly.
   - Fabricated details (e.g., inventing a salary or schedule) — the agent's configuration needs stricter guardrails.
   - Misquoted job details — the agent's configuration contains outdated information.
4. Share your findings with your Talkpush representative, including the affected agent name, time period, and the pattern you identified.
5. Monitor the Quality log over the next few days to confirm hallucination flags stop appearing after the fix.
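If you have transcript text exported, a quick pre-screen for placeholder values read aloud can speed up triage. A sketch; the placeholder tokens (`None`, `N/A`, `{{...}}`) are assumptions about what an unpopulated field might look like, not a documented list:

```python
import re

# Tokens that often indicate an unpopulated data field read aloud (assumed, not exhaustive).
PLACEHOLDER = re.compile(r"\bNone\b|\bN/A\b|\{\{[^}]*\}\}")

# Illustrative transcript lines.
transcript_lines = [
    "Agent: This is a None, None Remote CSR role.",
    "Candidate: Sorry, could you repeat the location?",
    "Agent: The schedule is {{shift_pattern}}.",
]

suspects = [ln for ln in transcript_lines if PLACEHOLDER.search(ln)]
for ln in suspects:
    print("possible placeholder:", ln)
```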
Weekly quality review
1. Go to Insights → Health, set the time filter to "Last 7 days."
2. Check the three headline metrics. Is "With Issues" trending up or down?
3. Look at the top 3 flag types. Are they the same as last week, or are new issues emerging?
4. Go to Feedback and check for new Score Opinions. Any patterns in what's being overscored or underscored?
5. If any agent has a disproportionate number of flags, open its assessment to review the configuration.
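Week-over-week comparison of flag counts is easy to script once you note the counts down each review. A minimal sketch with made-up numbers:

```python
# Made-up weekly flag counts noted from Insights -> Health.
last_week = {"Hallucination": 12, "Repetition": 5, "Abrupt End": 3}
this_week = {"Hallucination": 4, "Repetition": 6, "Leading Questions": 2}

report = {}
for flag in sorted(set(last_week) | set(this_week)):
    now, before = this_week.get(flag, 0), last_week.get(flag, 0)
    if flag not in last_week:
        trend = "new"          # an emerging issue worth watching
    elif now > before:
        trend = "up"
    elif now < before:
        trend = "down"
    else:
        trend = "flat"
    report[flag] = (now, trend)
    print(f"{flag}: {now} ({trend})")
```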
Reviewing a flagged candidate
1. Navigate from Insights → Quality (or from a notification email) to the candidate's report.
2. Read the Agent Quality Analysis section first to understand what was flagged.
3. Check the timestamp on the flag and find that moment in the Transcript.
4. Listen to the Recording at that timestamp to hear the actual exchange.
5. Review the Per-Dimension Scores — did the quality issue affect scoring? (e.g., if the agent hallucinated job details, the candidate may have responded based on wrong information.)
6. Decide whether the candidate needs to be re-interviewed or the score manually adjusted.
Using reports for rubric refinement
1. Go to Reports and sort or filter to find candidates who scored at the extremes (0–2 or 5).
2. Open 3–5 reports from each extreme.
3. Read the AI reasoning for each per-dimension score. Does it make sense? Is a "5" truly excellent, or is the bar too low?
4. Compare candidates who scored 4 vs. 5 — can you tell the difference from the transcript? If not, the rubric criteria for those levels may need to be more specific.
5. Share findings with your Talkpush representative — they can refine the scoring rubric based on the patterns you've identified.
Flag taxonomy reference
All quality flags fall into three categories. Use this as a lookup when you see a flag and want to understand exactly what it means.
Agent Quality flags (12)
Issues with the AI agent's behavior during the conversation.
Flag | What it means |
Hallucination | The agent stated something factually incorrect or read placeholder values (like "None") as real job details. The most serious agent flag. |
Repetition | The agent repeated the same question or phrase multiple times. |
Intent Misunderstood | The agent misinterpreted the candidate's response and replied with something irrelevant. |
Abrupt End | The conversation ended suddenly without proper closing. |
Vague Answers | The agent accepted vague candidate responses without probing for detail. |
Technical Error | A system issue occurred during the interview (audio, connection, processing). |
Question Ignored | The agent skipped or didn't address a question the candidate asked. |
User Confusion | The candidate appeared confused by the agent's instructions or questions. |
Off-Topic Deviation | The agent strayed from the interview script to discuss irrelevant topics. |
Leading Questions | The agent asked questions that suggested the desired answer. |
Incomplete Evaluation | The agent ended the interview without covering all required assessment areas. |
Unprofessional Tone | The agent used informal, rude, or unprofessional language. |
Scoring flags (3)
Contradictions or gaps in scoring data.
Flag | What it means |
Score Completion Mismatch | High score on an incomplete call, or low score on a completed one. |
High Score Incomplete | High overall score despite the call not being completed. |
Missing Score Completed | A completed interview has no scores — the scoring pipeline may have failed. |
Candidate Behavior flags (2)
Flag | What it means |
Requested Escalation | The candidate asked to speak to a human recruiter. |
Used Profanity | The candidate used inappropriate language. |
Specific alert types
Alert type | What it means |
Bias flag spike | Multiple bias flags detected in a short period (e.g., 6 incidents in 7 days). Always Critical — contact your Talkpush representative immediately. |
Score compression | High percentage of candidates receiving the same or very similar scores. A calibration issue, not a candidate quality issue. |
Score completion mismatch | Scores were generated for calls that did not fully complete, or completed calls are missing scores. |
Agent loop / Repetition | The agent got stuck repeating the same question or phrase. |
Lack of empathy | The agent failed to acknowledge or respond appropriately to a candidate's emotional state. |
Poor tone / Unprofessional tone | The agent used language that was informal, dismissive, or inappropriate for a professional interview. |
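As a rough illustration of what a score compression check can look like (the half-point bucketing and 60% threshold here are assumptions for the sketch, not the product's actual detection logic):

```python
from collections import Counter

def most_common_bucket_share(scores, bucket=0.5):
    """Fraction of candidates whose score falls in the single most common bucket."""
    counts = Counter(round(s / bucket) * bucket for s in scores)
    return counts.most_common(1)[0][1] / len(scores)

# Illustrative overall scores: most candidates clustered near 3.5.
scores = [3.5, 3.5, 3.6, 3.4, 3.5, 3.5, 4.8, 1.2]
share = most_common_bucket_share(scores)
if share > 0.6:  # assumed alert threshold for the sketch
    print(f"possible score compression: {share:.0%} of candidates share one bucket")
```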
Need to troubleshoot an issue?
For step-by-step help with missing reports, score concerns, Critical alerts, score compression, exporting data, and other common questions, see the FAQ and Troubleshooting article.