Quick answer. Hallucination rates are highest for legal (58–88%) and medical (43–64%) tasks when no mitigation is in place, and lowest for summarising supplied text (under 1.5%). The same model can be safe for one task in your business and dangerous for another.
Risk by sector
| Sector / task | Hallucination rate | Required mitigation |
|---|---|---|
| Legal queries & citations | 58–88% | Verify every source; human lawyer owns output |
| Medical case summaries | 43–64% | Clinical review mandatory; never autonomous |
| Finance & figures | High without tools | Calculator/code tool + human sign-off |
| General factual Q&A | 15–33% | RAG + spot checks |
| Customer support answers | Moderate | RAG against verified help centre |
| Summarising supplied text | <1.5% | Light review — safest use |
Why some sectors are so much worse
Tasks that require recalling specific facts (case law, drug interactions, exact figures) hit the model's weakest point — it generates plausible specifics it doesn't actually know. Tasks grounded in supplied text (summarising, extracting) keep the model anchored to what's in front of it, so error rates collapse. The lesson: ground the model wherever you can.
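One cheap way to act on this is to check whether the specifics in a model's output actually appear in the text it was given. The sketch below is an illustration only, not a complete fact-checker: `unsupported_figures` is a hypothetical helper, and it assumes figures are written the same way in both texts. It flags any number in the output that does not occur in the source.

```python
import re

# Matches integers, decimals, and comma-grouped figures such as 1,250 or 3.5
NUMBER = re.compile(r"\d+(?:[.,]\d+)*")

def unsupported_figures(source: str, output: str) -> list[str]:
    """Return figures that appear in the output but not in the source."""
    known = set(NUMBER.findall(source))
    flagged = []
    for figure in NUMBER.findall(output):
        if figure not in known and figure not in flagged:
            flagged.append(figure)
    return flagged

source = "Revenue rose 12% to 4.2 million in 2023, up from 3.75 million."
summary = "Revenue grew 12% to 4.2 million in 2023 from 3.5 million."
print(unsupported_figures(source, summary))  # ['3.5']
```

A flagged figure is not proof of a hallucination, only a prompt for a human to look. An empty list is not proof of accuracy either, since the check ignores names, dates written in words, and claims with no numbers in them.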
What each sector should do
- Legal: AI drafts and summarises; a qualified human verifies every citation and owns the advice. Never file unchecked output.
- Healthcare: documentation and admin support only; clinical decisions stay with professionals. Confirm that any tool handling patient data is HIPAA-compliant; see the privacy checklist.
- Finance: wire in a calculator/code tool for any maths; human sign-off on numbers that matter.
- Customer service: ground answers in your verified knowledge base with RAG; escalate edge cases to humans.
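For the finance point, a calculator tool can be as small as a function that does the arithmetic itself, so the model never has to. The sketch below is a minimal example under stated assumptions: `calculate` is a hypothetical name, it supports only `+`, `-`, `*`, `/` and parentheses, and how you expose it to your model depends on your provider's tool-calling interface, which is not shown here.

```python
import ast
import operator

# Only these arithmetic operators are allowed; anything else is rejected.
_BINARY = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY = {ast.UAdd: operator.pos, ast.USub: operator.neg}

def calculate(expression: str) -> float:
    """Evaluate a plain arithmetic expression without using eval()."""
    def walk(node):
        # Plain numbers only (booleans and strings are rejected)
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
            return _BINARY[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY:
            return _UNARY[type(node.op)](walk(node.operand))
        raise ValueError("unsupported expression")
    return walk(ast.parse(expression, mode="eval").body)

print(calculate("(1250 + 750) * 12"))  # 24000
```

Parsing the expression and walking the tree, rather than calling `eval()`, means function calls and names are refused outright. The tool removes arithmetic slips but not wrong inputs, so human sign-off on numbers that matter still applies.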
The universal fix
Across every sector, retrieval-augmented generation (RAG) is the single biggest lever: grounding the model in your own verified sources cuts hallucinations by around 71%. For the full list, see how to reduce hallucinations.
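In outline, RAG has two steps: retrieve the passages most relevant to the question, then instruct the model to answer from those passages only. The sketch below is a toy illustration, not a production setup. The word-overlap scoring stands in for the embedding search a real system would normally use, and `retrieve`, `grounded_prompt` and the sample help-centre entries are all made up for the example.

```python
import re

def tokens(text: str) -> set[str]:
    """Lowercase word set used for a crude overlap score."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(question: str, passages: list[str], k: int = 2) -> list[str]:
    """Return up to k passages sharing the most words with the question."""
    q = tokens(question)
    scored = [(len(q & tokens(p)), p) for p in passages]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    # Drop passages with no overlap at all rather than padding the context
    return [p for score, p in ranked[:k] if score > 0]

def grounded_prompt(question: str, passages: list[str]) -> str:
    """Build a prompt that confines the answer to the retrieved passages."""
    context = "\n".join(f"[{i}] {p}" for i, p in enumerate(passages, start=1))
    return (
        "Answer using only the sources below. "
        "If they do not contain the answer, say so.\n\n"
        f"Sources:\n{context}\n\nQuestion: {question}"
    )

# Hypothetical verified knowledge base
help_centre = [
    "Refunds are issued within 14 days of a return being received.",
    "Standard delivery takes 3 to 5 working days.",
    "Accounts can be closed from the settings page.",
]
question = "How long do refunds take?"
print(grounded_prompt(question, retrieve(question, help_centre)))
```

When nothing relevant is retrieved, the source list is empty and the instruction tells the model to say so, which is a natural point to escalate to a human instead of letting the model guess.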
Choosing a model for a high-risk sector? Favour models with a high Truth Score, and use the match engine with safety weighted high.