Assessment questions that reveal genuine thinking

How to write questions that separate authentic candidates from AI-generated responses. 10 frameworks ChatGPT struggles with.

When 75% of candidates are using AI to write their applications, the assessment question you ask becomes your most important hiring decision. Ask the wrong question, and you get polished but hollow responses that all sound the same. Ask the right question, and you can spot genuine thinking in seconds.

The good news: AI is surprisingly bad at certain types of questions. Not because it lacks knowledge, but because these questions require something AI fundamentally cannot provide: personal experience, genuine opinion, and authentic reasoning about specific situations.

Why most assessment questions fail

Here's the problem. Most assessment questions are some variation of the same generic behavioral prompts.

These questions have two fatal flaws. First, they're so common that AI has been trained on millions of "perfect" answers. Second, they're vague enough that a generic response can sound plausible for any candidate.

92% of AI-generated answers pass basic screening

Generic behavioral questions get generic AI answers that are nearly impossible to distinguish from authentic responses without additional signals.

The solution isn't to make questions harder. It's to make them more specific, more personal, and more grounded in the actual work you need done.

10 question frameworks that work

Certain question types consistently reveal genuine thinking. Here are 10 frameworks you can adapt for any role.

1. The specific constraint question

Force candidates to work within real-world limitations. AI tends to give idealized answers; genuine candidates know that reality is messy.

Example question
"You have a budget of £500 and 2 weeks to increase signups by 20%. What would you do, and what would you explicitly NOT do?"

Why it works: AI struggles with trade-off reasoning. Authentic candidates reveal their priorities and real-world experience through what they choose to exclude.

2. The "what would you change" question

Ask candidates to critique something specific about your product, process, or industry. Genuine candidates have opinions; AI hedges.

Example question
"Look at our homepage for 2 minutes. What's one thing you would change and why? What's one thing you wouldn't touch?"

Why it works: Requires genuine opinion and reasoning. Even if AI can describe what it sees, it doesn't have real preferences or design instincts. Candidates who care will have specific, defensible views.

3. The failure reflection question

Ask about something that went wrong and what they learned. Authentic failure stories have specific, uncomfortable details that AI tends to smooth over.

Example question
"Tell me about a professional decision you made that turned out to be wrong. What specifically did you misjudge, and how did you realize it?"

Why it works: AI gives sanitized failure stories with neat lessons. Real failures are messier, more specific, and often reveal more about judgment than successes do.

4. The unpopular opinion question

Ask for a view that goes against conventional wisdom in your industry. Genuine candidates have developed their own perspectives through experience.

Example question
"What's something most people in [industry] believe that you think is wrong? Why do you disagree?"

Why it works: AI is trained to reflect consensus views. Authentic candidates who've done the work often have contrarian insights they can defend with real experience.

5. The micro-detail question

Zoom in on a tiny, specific aspect of the work. People who've actually done the job know the small things; people who haven't (including AI) speak in generalities.

Example question
"Walk me through exactly how you would name the files and organize the folder structure for a design project with 3 stakeholders and 5 revision rounds."

Why it works: This level of operational detail is rarely documented online. Only people who've dealt with real project chaos have opinions about folder naming.

6. The honest assessment question

Ask candidates to evaluate their own fit, including areas where they might struggle. Self-awareness is hard to fake.

Example question
"Based on our job description, what part of this role do you think you'd be weakest at? How would you address that gap?"

Why it works: AI gives false modesty. Genuine candidates who've read the job description carefully will identify real gaps and have realistic plans to address them.

7. The disagreement question

Present a decision or approach and ask them to argue against it. The ability to steelman opposing views reveals depth of understanding.

Example question
"We're considering moving to a fully remote model. Make the strongest argument against this decision, even if you personally support remote work."

Why it works: AI tends to take the socially acceptable position. Genuine candidates can articulate nuanced counterarguments because they've actually thought through the trade-offs.

8. The "show your work" question

Ask not just for an answer, but for the reasoning process that led there. Authentic thinking leaves a trail; AI often skips straight to conclusions.

Example question
"Here's a dataset of our last 6 months of customer support tickets. What patterns do you notice? Walk me through how you identified them."

Why it works: Requires actual analysis, not just pattern-matching to common answers. The reasoning process reveals genuine capability.

9. The personal context question

Connect the question to their specific background or previous experience. This makes generic answers obviously wrong.

Example question
"Looking at your experience at [previous company], what's one process you helped improve there that you think would work well here? What would you need to change about it?"

Why it works: Requires genuine knowledge of both their history and your company. AI can't know what actually happened at their previous job.

10. The priority question

Give a list of tasks and ask what they'd do first, last, and not at all. Resource allocation reveals values and judgment.

Example question
"You're starting this role on Monday. Here are 8 things on your plate. Rank them in order of how you'd tackle them, and explain your top and bottom choice."

Why it works: There's no objectively correct order. The answer reveals how candidates think about impact, urgency, and stakeholder management.

The behavioral signals matter too

Even the best question can be gamed with enough effort. That's why the "how" matters as much as the "what". When you combine thoughtful questions with behavioral signals like time spent, editing patterns, and paste detection, you get a much clearer picture of authenticity.

A candidate who spends 12 minutes crafting a response, making multiple edits and revisions, is showing you something different from a candidate who pastes a polished paragraph in 30 seconds.
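To make the contrast concrete, here is a minimal sketch of how those three signals could be summarized from a log of response events. The event kinds ("type", "delete", "paste"), the ResponseEvent fields, and the summarize_signals helper are all hypothetical names chosen for illustration; this is not a description of how any particular product measures effort.

```python
from dataclasses import dataclass


@dataclass
class ResponseEvent:
    """One hypothetical editor event captured while a candidate answers."""
    kind: str          # assumed event kinds: "type", "delete", or "paste"
    chars: int         # number of characters affected by the event
    at_seconds: float  # seconds since the candidate opened the question


def summarize_signals(events):
    """Summarize time spent, editing activity, and paste share."""
    if not events:
        return {"seconds_spent": 0.0, "edit_count": 0, "paste_share": 0.0}

    typed = sum(e.chars for e in events if e.kind == "type")
    pasted = sum(e.chars for e in events if e.kind == "paste")
    total_added = typed + pasted

    return {
        # Time from opening the question to the last recorded event.
        "seconds_spent": max(e.at_seconds for e in events),
        # Deletions stand in for "edits and revisions" in this sketch.
        "edit_count": sum(1 for e in events if e.kind == "delete"),
        # Fraction of added text that arrived via paste rather than typing.
        "paste_share": pasted / total_added if total_added else 0.0,
    }


# A response typed and revised over 12 minutes.
crafted = [
    ResponseEvent("type", 400, 200.0),
    ResponseEvent("delete", 100, 400.0),
    ResponseEvent("type", 300, 720.0),
]

# A polished paragraph pasted after 30 seconds.
pasted_in = [ResponseEvent("paste", 600, 30.0)]

print(summarize_signals(crafted))
print(summarize_signals(pasted_in))
```

The summary deliberately stops short of a verdict. A high paste share or a short session is a prompt to look more closely at the answer, not proof that it was AI-generated; some candidates draft elsewhere and paste their own work.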

The goal isn't to trick candidates or make things harder. It's to give authentic candidates a chance to show who they really are, instead of losing them in a sea of identical AI-generated responses.

Putting it together

The best assessments combine two or three of these frameworks. A specific constraint question followed by a reflection on trade-offs. An opinion question paired with a request to argue the other side.

The key is to make the question impossible to answer well without genuine thought. Not because you want to exclude people, but because you want to find the ones who've actually done the thinking.

In a world where anyone can generate a "perfect" answer in seconds, the candidates who take time to give a real answer are exactly the ones you want to hire.

Ready to see authentic candidates?

FirstLook helps you identify genuine responses with questions that matter and signals that reveal effort.
