Three extraction architectures tested on a 7-page NY WC FNOL with a dual-employer layout trap. The approach marketed for its audit trail hallucinated the claimant's name and date of injury.
A controlled comparison of three extraction architectures on a single high-complexity WC FNOL reveals that the approach marketed for its audit trail hallucinated the claimant's first name, last name, and date of injury. The zero-LLM approach extracted every deterministic field correctly and is the only architecture defensible under regulatory examination.
| Metric | Approach A: Groq · llama-3.3-70b · T5 | Approach B: LangExtract · Gemini · T5 | Approach C: Docling + Regex · T0+T2 |
|---|---|---|---|
| External API calls | 1 | 7 | 0 |
| Latency | 4.3s | 42.7s | 8.2s |
| Fields extracted (of 36) | 36 | 36 | 29 (Cat 1–3 only) |
| Cat 1–2 deterministic fields | 19 / 19 | 15 / 19 | 19 / 19 |
| Hallucinations | 0 | 3 ▲ | 0 |
| Document-provenance grounded | 0 values | 101 intervals † | 29 / 29 |
| Deterministic (zero variance) | No | No | Yes |
| Cost per document | ~$0.04–0.06 | ~$0.28–0.42 | ~$0.00015 |
| PHI-safe for data posture B/C | No | No | Yes |
▲ Approach B grounding intervals point to real text from wrong entities; see Finding 1.

† 29/36 fields for Approach C = Cat 1–3; the 7 missing Cat 4 inference fields were left empty, not hallucinated.
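To put the cost row in context, the per-document figures can be scaled to the 79,908 annual documents quoted for the POC. This is a rough sketch, not a forecast: it assumes the cost measured on this single 7-page document holds across the whole volume, which the benchmark does not establish. The names `ANNUAL_DOCS`, `COST_PER_DOC`, and `annual_cost` are illustrative.

```python
ANNUAL_DOCS = 79_908  # annual volume quoted for the POC

# (low, high) cost per document in USD, copied from the comparison table.
COST_PER_DOC = {
    "A: Groq": (0.04, 0.06),
    "B: LangExtract": (0.28, 0.42),
    "C: Docling + Regex": (0.00015, 0.00015),
}


def annual_cost(per_doc, volume=ANNUAL_DOCS):
    """Scale a (low, high) per-document cost range to an annual range."""
    low, high = per_doc
    return (low * volume, high * volume)


for name, per_doc in COST_PER_DOC.items():
    low, high = annual_cost(per_doc)
    print(f"{name}: ${low:,.2f} to ${high:,.2f} per year")
```

Under that assumption, Approach A lands at roughly $3,200 to $4,800 per year, Approach B at roughly $22,400 to $33,600, and Approach C at about $12.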
LangExtract returned a character-interval citation for every extracted value — the feature distinguishing its audit architecture from a standard LLM call. Three of those intervals pointed to real text at the cited position. The text belonged to the wrong entity or the wrong date context.
A grounding interval proves a string exists in the document. It does not prove the string is the correct value for the correct field. Under NYDFS Part 216 examination, this distinction is the difference between passing and failing provenance review.
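The gap between "the string exists" and "the string belongs to this field" can be shown with two separate checks. The sketch below is illustrative only: the `Citation` class, the toy document text, and the section spans are assumptions, not LangExtract's API or the benchmark document.

```python
from dataclasses import dataclass


@dataclass
class Citation:
    field: str
    value: str
    start: int  # character offset, inclusive
    end: int    # character offset, exclusive


def interval_matches(doc: str, c: Citation) -> bool:
    """Grounding check: the cited span contains the extracted value verbatim."""
    return doc[c.start:c.end] == c.value


def in_expected_section(c: Citation, section_spans: dict, expected: dict) -> bool:
    """Attribution check: the cited span lies inside the field's assigned section."""
    lo, hi = section_spans[expected[c.field]]
    return lo <= c.start and c.end <= hi


# Hypothetical miniature document with the same trap as the benchmark form.
doc = (
    "Employer Information\n"
    "Shipper: Franklin Logistics Inc\n"
    "Employee Information\n"
    "First Name: Terrence\n"
)
employee_start = doc.index("Employee Information")
section_spans = {
    "employer": (0, employee_start),
    "employee": (employee_start, len(doc)),
}
expected = {"first_name": "employee"}

start = doc.index("Franklin")
bad = Citation("first_name", "Franklin", start, start + len("Franklin"))

print(interval_matches(doc, bad))                          # True
print(in_expected_section(bad, section_spans, expected))   # False
```

The citation passes the grounding check and fails the attribution check, which is the failure pattern described for the three hallucinated fields.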
| Field | Extracted | Cited source text | Actual source |
|---|---|---|---|
| first_name | "Franklin" | "Franklin Logistics Inc…" | Third-party shipper — not the claimant |
| last_name | "Mr." | "Dear Mr. Johnson," | Broker salutation — not a surname |
| date_of_injury | "March 4" | "filed March 4 by prior counsel" | Attorney filing date — injury was March 18 |
| Field | Ground truth | A: Groq | B: LangExtract | C: Docling+Regex |
|---|---|---|---|---|
| policy_number | AP-2026-WC-9214 | ✓ | ✓ | ✓ |
| claim_number | WCH250721001 | ✓ | ✓ | ✓ |
| employer_fein | 47-2381094 | ✓ | ✓ | ✓ |
| naics_code | 561320 | ✓ | ✓ | ✓ |
| date_of_injury | 2026-03-18 | ✓ | ✗ "March 4" | ✓ |
| date_reported | 2026-03-28 | ✓ | ✓ | ✓ |
| date_of_birth | 1988-07-15 | ✓ | ✓ | ✓ |
| ssn_last4 | 4721 | ✓ | ✓ | ✓ |
| hourly_rate | $28.50 | ✓ | ✓ | ✓ |
| avg_weekly_wage | $1,140.00 | ✓ | ✓ | ✓ |
| reporting_delay_days | 10 | ✓ | ✗ | ✓ |
| attorney_contact_date | 2026-03-21 | ✓ | ✓ | ✓ |
| first_name | Terrence | ✓ | ✗ "Franklin" | ✓ |
| last_name | Jackson | ✓ | ✗ "Mr." | ✓ |
| employer_name | Apex Staffing Solutions | ✓ | ✓ | ✓ |
| body_part_primary | Lower back / lumbar | ✓ | ✓ | ✓ |
| injury_mechanism | Lifting / exertion | ✓ | ✓ | ✓ |
| occupation_class | Warehouse / labor | ✓ | ✓ | ✓ |
| state_of_injury | NY | ✓ | ✓ | ✓ |
| Score (Category 1 & 2) | | 19 / 19 | 15 / 19 | 19 / 19 |
Rows marked ✗ with a quoted value are fields where Approach B returned a grounded hallucination. Approach C's Cat 4 fields (claim type, RTW status, attorney flag, same body part, delay flag) were left empty by design, not hallucinated.
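The reporting_delay_days row shows how one wrong date can propagate into a derived field. The source does not say how that field is computed or what value Approach B returned, so the sketch below assumes it is simply date_reported minus date_of_injury, which reproduces the ground-truth value of 10.

```python
from datetime import date


def reporting_delay_days(date_of_injury: date, date_reported: date) -> int:
    """Assumed derivation: whole days between injury and report."""
    return (date_reported - date_of_injury).days


date_reported = date(2026, 3, 28)

# Ground-truth injury date from the table.
print(reporting_delay_days(date(2026, 3, 18), date_reported))  # 10

# Attorney filing date mistaken for the injury date.
print(reporting_delay_days(date(2026, 3, 4), date_reported))   # 24
```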
The document contains three business entities — Apex Staffing, Excel Manufacturing, and Franklin Logistics — before the Employee Information section that contains the claimant's name. Approach B scanned the full document; Approach C partitioned it.
Each regex pattern runs only against its assigned section pool. The first_name pattern sees only text under the Employee Information header. "Franklin" exists only in the Employer Information pool. The two pools never intersect. The attribution error is not a probability to manage — it is a structural impossibility.
import re

# Header patterns that open each section pool (case-insensitive).
SECTION_MAP = {
    "employer": re.compile(
        r"employer\s+information", re.I),
    "employee": re.compile(
        r"(?:injured\s+)?employee\s+information", re.I),
    "injury": re.compile(
        r"(?:injury|incident)\s+(?:information|details)", re.I),
}
# first_name runs in "employee" pool only.
# "Franklin" exists in "employer" pool only.
# No overlap. Attribution error is impossible.
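One way the section map could drive extraction is sketched below. The header patterns are the ones above and the first_name pattern is the one quoted in the examination table. The `partition` and `extract` helpers, the `FIELD_PATTERNS` assignment, the rule that a pool runs from its header to the next header, and the sample text are all assumptions made for illustration.

```python
import re

SECTION_MAP = {
    "employer": re.compile(r"employer\s+information", re.I),
    "employee": re.compile(r"(?:injured\s+)?employee\s+information", re.I),
    "injury": re.compile(r"(?:injury|incident)\s+(?:information|details)", re.I),
}

# Hypothetical assignment of each field to exactly one pool.
FIELD_PATTERNS = {
    "first_name": ("employee", re.compile(r"First\s+Name[.:\s]+([A-Z][a-z]+)\b")),
}


def partition(text):
    """Split text into pools; each pool runs from its header to the next header."""
    hits = []
    for name, pattern in SECTION_MAP.items():
        m = pattern.search(text)  # simplification: first header match only
        if m:
            hits.append((m.start(), name))
    hits.sort()
    pools = {}
    for i, (start, name) in enumerate(hits):
        end = hits[i + 1][0] if i + 1 < len(hits) else len(text)
        pools[name] = text[start:end]
    return pools


def extract(text):
    """Run each field pattern against its assigned pool and nothing else."""
    pools = partition(text)
    out = {}
    for field, (pool, pattern) in FIELD_PATTERNS.items():
        m = pattern.search(pools.get(pool, ""))
        out[field] = m.group(1) if m else None
    return out


# Hypothetical sample text, not the benchmark document.
text = (
    "Employer Information\n"
    "Employer: Apex Staffing Solutions\n"
    "Shipper: Franklin Logistics Inc\n"
    "Employee Information\n"
    "First Name: Terrence\n"
    "Last Name: Jackson\n"
    "Injury Details\n"
    "Date of Injury: 2026-03-18\n"
)

print(extract(text))  # {'first_name': 'Terrence'}
```

Because the first_name pattern never receives the employer pool, "Franklin" cannot be returned for that field even if a pattern were loose enough to match it.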
| Architecture | Answer to "where did this value come from?" | Examination result |
|---|---|---|
| A — Groq | "The language model extracted policy number AP-2026-WC-9214 from the document. Confidence: high." | No provenance |
| B — LangExtract | "The value 'Franklin' was extracted from characters 412–419, which reads 'Franklin'." | Misleading — wrong entity |
| C — Docling+Regex | "First name matched by First\s+Name[.:\s]+([A-Z][a-z]+)\b in the Employee Information section. Deterministic. Reproducible on every run." | Passes examination |
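The Approach C answer in the table amounts to a small record stored next to each value. A minimal sketch of such a record follows. The `Provenance` class and `extract_with_provenance` helper are hypothetical names, and the section text is a made-up sample. Only the regex comes from the table.

```python
import re
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class Provenance:
    field: str
    value: str
    pattern: str   # regex source that produced the match
    section: str   # pool the pattern was run against
    span: tuple    # (start, end) offsets of the value within the section text


FIRST_NAME = re.compile(r"First\s+Name[.:\s]+([A-Z][a-z]+)\b")


def extract_with_provenance(field, pattern, section, section_text):
    """Return the matched value with everything needed to reproduce the match."""
    m = pattern.search(section_text)
    if m is None:
        return None
    return Provenance(field, m.group(1), pattern.pattern, section, m.span(1))


# Hypothetical section text standing in for the Employee Information pool.
section_text = "Employee Information\nFirst Name: Terrence\n"
record = extract_with_provenance("first_name", FIRST_NAME, "employee", section_text)
print(asdict(record))
```

Re-running the same pattern on the same section text yields an identical record, which is what makes the answer reproducible on demand.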
This benchmark is anchored to an active enterprise POC covering 79,908 annual documents across three use cases. Phase 1 recommendation: run Approach C on 50 real labeled documents before any GPU or LLM infrastructure investment.
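A Phase 1 run on labeled documents needs little more than a per-field scorer. The sketch below assumes each document's predictions and labels are plain dicts keyed by field name, and that an empty or missing field counts as a miss. The function name and data shapes are illustrative, not part of the POC.

```python
def per_field_accuracy(predictions, labels):
    """Fraction of labeled documents on which each field matched exactly."""
    totals, correct = {}, {}
    for pred, gold in zip(predictions, labels):
        for field, expected in gold.items():
            totals[field] = totals.get(field, 0) + 1
            if pred.get(field) == expected:
                correct[field] = correct.get(field, 0) + 1
    return {field: correct.get(field, 0) / totals[field] for field in totals}


# One labeled document using ground-truth values from the comparison table,
# scored against a hypothetical extractor output with one wrong field.
labels = [{"first_name": "Terrence", "last_name": "Jackson",
           "date_of_injury": "2026-03-18"}]
predictions = [{"first_name": "Terrence", "last_name": "Jackson",
                "date_of_injury": "2026-03-04"}]

print(per_field_accuracy(predictions, labels))
# {'first_name': 1.0, 'last_name': 1.0, 'date_of_injury': 0.0}
```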