Truth Score series · Updated June 2026

How to reduce AI hallucinations

You can't eliminate hallucinations, but you can cut them by 70% or more. Here are the techniques that work, ranked by measured impact, with when to use each.

Quick answer. Biggest lever: retrieval-augmented generation (RAG), ~71% reduction. Then self-consistency checking (~65%), ensemble checking (30–50%) and prompt mitigation (~22 percentage points). Layer RAG with a human review step and you turn an unreliable generalist into a dependable tool.

Techniques ranked by impact

| Technique | Reduction | Effort | When to use |
| --- | --- | --- | --- |
| RAG (retrieval) | ~71% | Medium | Any factual or document-grounded task |
| Self-consistency checking | ~65% | Medium | High-stakes single answers |
| Ensemble (multi-model) | 30–50% | High | Critical decisions worth the cost |
| Prompt mitigation | ~22 pp | Low | Every prompt (cheap baseline) |
| Fine-tuning on domain data | Varies | High | Narrow, repeated, specialised tasks |

1. RAG — ground the model in your sources

Retrieval-augmented generation connects the model to a store of your verified documents, so it retrieves real facts instead of inventing them. It is the single most effective technique and the foundation of reliable business AI. If you do one thing, do this.
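To make the idea concrete, here is a minimal sketch of the retrieve-then-prompt step. It assumes a tiny in-memory list of verified documents and ranks them by plain word overlap. Production systems usually use embedding-based search instead, so treat the scoring as a stand-in. The names `DOCUMENTS`, `retrieve` and `build_grounded_prompt`, and the sample documents themselves, are hypothetical and exist only for illustration.

```python
import re

# Hypothetical store of verified documents; replace with your own sources.
DOCUMENTS = [
    "Refunds are accepted within 30 days of purchase with a receipt.",
    "Support is available Monday to Friday, 9am to 5pm.",
    "Standard shipping takes 3 to 5 business days.",
]


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question: str, documents: list[str], k: int = 2) -> list[str]:
    """Return up to k documents that share the most words with the question.

    Word overlap is a deliberately simple stand-in for real vector search.
    Documents with no overlap at all are dropped.
    """
    question_words = tokenize(question)
    scored = [(len(question_words & tokenize(doc)), doc) for doc in documents]
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:k]]


def build_grounded_prompt(question: str, documents: list[str], k: int = 2) -> str:
    """Build a prompt that tells the model to answer only from retrieved sources."""
    sources = retrieve(question, documents, k)
    if sources:
        context = "\n".join(f"[{i}] {doc}" for i, doc in enumerate(sources, 1))
    else:
        context = "(no relevant sources found)"
    return (
        "Answer using only the sources below. "
        "If the sources do not contain the answer, say you do not know.\n\n"
        f"Sources:\n{context}\n\nQuestion: {question}"
    )
```

The string returned by `build_grounded_prompt` is what you would send to the model in place of the bare question. The explicit "say you do not know" instruction matters: without it, a model may fall back on invention when retrieval finds nothing.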

2. Self-consistency checking

Ask the model the same question several ways (or several times) and compare. Agreement signals reliability; divergence flags a likely hallucination. Effective for individual high-stakes answers.
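One way to sketch that comparison is a simple majority vote over normalised answers. This version assumes you already have some function that sends a prompt to a model and returns text. The `ask` parameter stands in for it, and the `min_agreement` threshold of 0.6 is an arbitrary illustrative value, not a recommended setting.

```python
from collections import Counter
from typing import Callable


def normalize(answer: str) -> str:
    """Lowercase, trim, drop a trailing full stop and collapse whitespace."""
    return " ".join(answer.lower().strip().rstrip(".").split())


def self_consistency(
    ask: Callable[[str], str],
    prompts: list[str],
    min_agreement: float = 0.6,  # illustrative threshold, tune for your task
) -> dict:
    """Ask every prompt variant, then report the most common answer.

    `ask` is a hypothetical stand-in for your model call. The result is
    flagged when the share of answers matching the winner is too low.
    """
    if not prompts:
        raise ValueError("need at least one prompt")
    answers = [normalize(ask(prompt)) for prompt in prompts]
    top_answer, count = Counter(answers).most_common(1)[0]
    agreement = count / len(answers)
    return {
        "answer": top_answer,
        "agreement": agreement,
        "flagged": agreement < min_agreement,
    }
```

Pass several rephrasings of the same question in `prompts`, or the same prompt repeated if your model samples with some randomness. Exact string matching only suits short factual answers; longer free-text answers would need a looser comparison.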

3. Ensemble checking

Run the query across more than one model and compare outputs. Disagreement surfaces fabrication. It costs more, because each additional model adds its own token cost, so reserve it for genuinely critical decisions.
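A rough sketch of the comparison step follows. It assumes each model is wrapped in a plain function that takes a question and returns text. The `models` mapping and its keys are hypothetical, and the check simply asks whether every model gave the same normalised answer.

```python
from typing import Callable


def ensemble_check(question: str, models: dict[str, Callable[[str], str]]) -> dict:
    """Send one question to several models and group them by answer.

    `models` maps an illustrative model name to a hypothetical function
    that calls that model. Each entry costs one extra model call.
    """
    if len(models) < 2:
        raise ValueError("an ensemble needs at least two models")

    # Group model names by their normalised answer.
    groups: dict[str, list[str]] = {}
    for name, ask in models.items():
        answer = " ".join(ask(question).lower().split())
        groups.setdefault(answer, []).append(name)

    return {
        "groups": groups,                 # answer -> models that gave it
        "unanimous": len(groups) == 1,    # True only if all models agree
        "calls": len(models),             # how many model calls were made
    }
```

When `unanimous` is false, the `groups` entry shows which models dissented, which is usually the place to start a manual check.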

4. Prompt mitigation — the cheap baseline

Instruct the model to say when it is unsure and to show its reasoning. A simple line like "If you are not certain, say so rather than guessing" cuts hallucination by around 22 percentage points and costs nothing. Apply it everywhere.
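Applying it everywhere is easiest with a small wrapper around every prompt you send. The sketch below is one possible shape. The uncertainty line is the one quoted above, while the wording of the reasoning line and the function name `with_mitigation` are assumptions made for illustration.

```python
# Quoted from the guidance above.
UNCERTAINTY_LINE = "If you are not certain, say so rather than guessing."
# Assumed wording for the "show its reasoning" instruction.
REASONING_LINE = "Show your reasoning before giving the final answer."


def with_mitigation(prompt: str) -> str:
    """Append the mitigation lines to a prompt, skipping any already present."""
    base = prompt.strip()
    additions = [
        line for line in (UNCERTAINTY_LINE, REASONING_LINE) if line not in base
    ]
    return "\n\n".join([base, *additions])
```

Because the function skips lines that are already present, it is safe to call more than once on the same prompt.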

5. Fine-tuning

For narrow, repeated tasks, fine-tuning on your own verified data can reduce error meaningfully. The effort is higher, so it is worth it only for stable, high-volume use cases.

The non-negotiable: human review

No technique reaches zero. For anything consequential, keep a human in the loop. This is the heart of basic governance and the reason it matters to know what AI can't do.

Picking a reliable base model first? Start from the Truth Score — a higher-scoring model plus RAG is the strongest combination.