What AI in Insurance Underwriting Gets Wrong Without Good Data

In 2026, AI in insurance underwriting is no longer a futuristic boardroom slide. It is in submission triage, fleet schedule extraction, eligibility checks, renewal reviews, pricing support, and every other corner where underwriters used to live on coffee and copy-paste.
Here is my hot take after a decade in insurance: the biggest risk is not AI making underwriting decisions. The real risk is insurers asking AI to make clean decisions from data they would never let a junior underwriter rely on.
We all know the temptation. A vendor says the system can read PDFs, score risk, enrich records, route submissions, and shave days off turnaround time. Lovely. But if the loss runs are inconsistent, the VINs are mistyped, the garaging address is stale, and the prior carrier data lives in a spreadsheet called Final_Final_v7, the AI is not underwriting. It is guessing with better posture.
I once watched an underwriter spend half a morning reconciling the same driver across three documents. William J. Carter, Bill Carter, and W Carter. Same person, probably. Different records, absolutely. That is the kind of tiny mess that looks harmless until it is multiplied across thousands of submissions. AI does not make that mess disappear by magic. It either fixes it, flags it, or quietly bakes it into the decision.
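For the curious, here is roughly what that reconciliation looks like when a machine attempts it. This is a minimal sketch with illustrative names and a toy nickname table, not production entity resolution, which would also lean on dates of birth, license numbers, and fuzzy matching.

```python
# Minimal sketch: collapsing driver-name variants to a candidate match key.
# The nickname table and matching rule are illustrative assumptions.

def name_key(raw: str) -> tuple[str, str]:
    """Reduce a name to (first-initial, surname) for rough matching."""
    parts = [p.strip(".").lower() for p in raw.split() if p.strip(".")]
    first, last = parts[0], parts[-1]
    nicknames = {"bill": "william", "will": "william", "billy": "william"}
    first = nicknames.get(first, first)  # "Bill" matches "William"
    return (first[0], last)

records = ["William J. Carter", "Bill Carter", "W Carter"]
print({name_key(r) for r in records})  # {('w', 'carter')} -- one candidate
```

Note the word candidate. The key says these records probably belong together; a human, or a stronger identifier, still confirms it.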
The underwriter's time problem is real
Let us start with the reason AI in insurance underwriting is getting so much attention. Underwriters are drowning in administrative work. McKinsey has estimated that underwriters can spend up to 60 percent of their time on non-core tasks rather than actual risk assessment.
That sounds about right. If you have worked in commercial auto, small business, specialty, or MGA underwriting, you have seen the routine: open the broker email, download the attachments, re-key the schedule, check the VINs, chase missing loss runs, compare prior coverage, hunt for driver details, paste notes into the system, and then, finally, think about the risk.
No wonder carriers and MGAs want automation. Accenture has reported heavy investment by insurance CEOs in AI for claims and underwriting. The appetite is there. The business case is there. The backlog is definitely there.
But there is a trap. If we automate bad intake, we simply move the mess faster. It is like putting a jet engine on a shopping cart. Impressive for three seconds, then everyone has questions.
What AI gets wrong when the data is weak
AI in insurance underwriting tends to fail in predictable ways when the data foundation is poor. The problems are rarely dramatic at first. You do not usually get a giant red alert saying, "Warning: this risk score is built on nonsense." You get a tidy recommendation that looks reasonable enough to pass through a busy desk.
That is what makes it dangerous.
It mistakes missing information for low risk
The most common data problem is not wrong data. It is missing data that looks normal.
A blank loss history might mean the insured has no losses. It might also mean the loss run was not attached, the broker uploaded the wrong file, the carrier name changed, the system failed to parse a scanned PDF, or the loss history is hiding in an email thread from Tuesday.
A smart underwriting workflow should treat missing information as a risk signal, not a shrug. Unknown is not the same as clean. If a commercial auto submission has no recent MVR information, no driver dates of birth, and partial VINs, the system should not reward the submission for being light on bad news. It should pause, ask for the right data, or route the file to an underwriter with the reason clearly stated.
That sounds basic. It is also where many underwriting automation projects stumble.
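The principle fits in a few lines. A minimal sketch, with made-up field names and routing labels, of what "unknown is not clean" looks like in code:

```python
# Minimal sketch: treat missing data as a signal, not a shrug.
# Field names and routing labels are illustrative assumptions.

REQUIRED_FIELDS = ["loss_runs", "driver_dobs", "full_vins", "recent_mvrs"]

def triage(submission: dict) -> tuple[str, list[str]]:
    """Route a submission; never score a file with silent gaps."""
    missing = [f for f in REQUIRED_FIELDS if not submission.get(f)]
    if missing:
        # Do not reward the file for being light on bad news.
        return ("refer_to_underwriter", missing)
    return ("proceed_to_scoring", [])

status, gaps = triage({"loss_runs": None, "driver_dobs": ["1988-04-02"]})
print(status, gaps)  # refer_to_underwriter ['loss_runs', 'full_vins', 'recent_mvrs']
```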
It learns operational habits instead of risk
Here is one that does not get discussed enough. Poor data can cause underwriting AI to learn your workflows, not your risks.
Say Broker A sends beautiful submissions with neat schedules, consistent fields, and complete loss runs. Broker B sends profitable accounts, but the submissions arrive as messy PDFs, sideways scans, and email notes that require translation from broker dialect into English.
If your system is not built to normalize messy data properly, it may start treating Broker A's submissions as better risks because they are easier to process. That is not risk selection. That is a formatting bias wearing a tie.
I have seen good business get delayed because the paperwork was ugly, and mediocre business fly through because the template looked clean. Humans do this too, by the way. We just tend not to admit it out loud.
It produces false precision
Nothing makes me more suspicious than a very specific score built from very questionable data.
An AI system might return a risk score of 82.4. That feels scientific. But if the garaging address is from two renewals ago, the vehicle use is self-reported, the mileage is unverified, and the loss run was parsed incorrectly, that decimal point is doing a lot of emotional labor.
False precision is especially dangerous for MGAs and carriers trying to grow quickly. It can make teams overconfident in eligibility decisions, pricing adequacy, tier placement, and referral routing. The score looks objective, so fewer people challenge it.
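One antidote is refusing to show a score without its data quality attached. A minimal sketch, with invented confidence values and an invented band formula:

```python
# Minimal sketch: a score should carry its own uncertainty.
# Confidence values and band width are invented for illustration.

def qualified_score(score: float, field_confidence: dict[str, float]) -> str:
    """Attach a data-confidence band instead of a naked decimal."""
    avg_conf = sum(field_confidence.values()) / len(field_confidence)
    band = (1.0 - avg_conf) * 30  # lower confidence -> wider band
    return f"{score:.0f} +/- {band:.0f} (data confidence {avg_conf:.0%})"

print(qualified_score(82.4, {
    "garaging_address": 0.4,  # two renewals old
    "vehicle_use": 0.5,       # self-reported
    "mileage": 0.3,           # unverified
    "loss_runs": 0.6,         # parser flagged issues
}))  # 82 +/- 16 (data confidence 45%)
```

A range invites a question. A decimal invites a rubber stamp.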
Good underwriting data does not eliminate judgment. It gives judgment a fighting chance.
It breaks underwriter trust
Underwriters are not anti-automation. Most of them are anti-bad automation, which is a very different thing.
If the system flags a clean account three times for the wrong reason, or recommends appetite decline because it misunderstood a loss description, underwriters will start ignoring it. Once that happens, good luck getting trust back. The first bad recommendation gets muttered about at the coffee machine for weeks.
This is why data quality is not a back-office technical concern. It is an adoption issue. If underwriters cannot see why a recommendation was made, what data was used, and where the uncertainty sits, they will work around the system. Quietly. Efficiently. With impressive creativity.
It turns compliance into archaeology
Every insurer wants faster underwriting until a regulator, reinsurer, internal audit team, or claims committee asks why a decision was made.
If the answer is somewhere in an email, an overwritten spreadsheet, an API response no one logged, and a PDF that has since been renamed, you do not have an underwriting audit trail. You have an archaeological dig.
Good AI in insurance underwriting needs traceability. Which source supplied the data? When was it pulled? Was it verified? What rule or workflow used it? Who approved the exception? If you cannot answer those questions, the problem is not only operational. It is governance.
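Traceability is not exotic. It is a record written down at the moment the data arrives. A minimal sketch of the shape, with field names that are assumptions rather than a prescribed schema:

```python
# Minimal sketch: provenance attached to every important data point.
# Field names are illustrative, not a prescribed schema.
from dataclasses import dataclass, field
from datetime import datetime

@dataclass
class DataPoint:
    name: str                        # e.g. "garaging_address"
    value: str
    source: str                      # which system or document supplied it
    pulled_at: datetime              # when it was retrieved
    verified: bool                   # did a validation rule pass?
    used_by: list[str] = field(default_factory=list)  # rules that consumed it
    approved_by: str | None = None   # who signed off on any exception

point = DataPoint(
    name="garaging_address",
    value="1200 Industrial Pkwy, Columbus, OH",
    source="broker_submission_pdf_page_3",
    pulled_at=datetime(2026, 1, 14, 9, 32),
    verified=True,
    used_by=["eligibility_rule_12", "territory_rating"],
)
```

If every important field carries a record like that, the archaeological dig becomes a database query.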
Good data is boring, which is why it wins
The best underwriting data is not glamorous. It is complete, current, normalized, traceable, and relevant. That may not sound like a keynote speech, but it is what separates useful automation from expensive theater.
Good underwriting data usually has five traits:
- Completeness: The key fields needed to assess the risk are present, including loss history, driver details, vehicle data, coverage terms, prior insurance, and relevant business characteristics.
- Consistency: Names, addresses, VINs, class codes, dates, limits, and loss descriptions follow formats the system can compare and validate (see the VIN check sketch below).
- Freshness: The data reflects the current risk, not a stale renewal file or a broker's best guess from last year.
- Traceability: Every important data point can be traced back to its source, timestamp, and validation status.
- Decision relevance: The data actually helps underwrite the risk, rather than cluttering the file with noise.
That last point matters. More data does not automatically mean better underwriting. I have seen files enriched with so many external fields that the underwriter needed a second cup of coffee just to find the three signals that mattered.
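Consistency, in particular, is the trait machines check best. As a concrete example, the VINs on North American vehicles carry a check digit (ISO 3779) that a few lines of code can verify before a mistyped VIN ever reaches rating. A minimal sketch:

```python
# Minimal sketch: VIN check-digit validation (ISO 3779, North American VINs).
# Letters I, O, and Q are not valid VIN characters, so they are absent below.
TRANSLITERATION = {
    **{str(d): d for d in range(10)},
    "A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6, "G": 7, "H": 8,
    "J": 1, "K": 2, "L": 3, "M": 4, "N": 5, "P": 7, "R": 9,
    "S": 2, "T": 3, "U": 4, "V": 5, "W": 6, "X": 7, "Y": 8, "Z": 9,
}
WEIGHTS = [8, 7, 6, 5, 4, 3, 2, 10, 0, 9, 8, 7, 6, 5, 4, 3, 2]

def vin_check_digit_ok(vin: str) -> bool:
    """True if the VIN is 17 valid characters and position 9 matches
    the ISO 3779 checksum (remainder 10 is written as 'X')."""
    vin = vin.strip().upper()
    if len(vin) != 17 or any(c not in TRANSLITERATION for c in vin):
        return False
    remainder = sum(TRANSLITERATION[c] * w for c, w in zip(vin, WEIGHTS)) % 11
    return vin[8] == ("X" if remainder == 10 else str(remainder))

print(vin_check_digit_ok("1M8GDM9AXKP042788"))  # True (standard worked example)
```

A failed check does not prove fraud. It proves the field should not be trusted as-is, which is exactly the kind of flag the rest of this article is about.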
If you want a deeper look at the foundations, we have written before about the broader underwriting data quality problem. The short version is simple: you cannot automate what you cannot trust.
External data can help, if it is governed
Data enrichment is one of the best ways to improve underwriting accuracy. Vehicle history, MVRs, property characteristics, hazard data, court records, prior coverage, business records, and claims history can all sharpen risk selection when they are used correctly.
But enrichment without governance is just a bigger pile.
I like external data when it answers a specific underwriting question. Is the VIN valid? Is the garaging address consistent with the exposure? Does the driver history match the application? Is the property in a high-risk hazard zone? Has the applicant had repeated coverage lapses? Those are useful checks.
The problems begin when enriched data is treated as automatically true, automatically relevant, or automatically allowed for every use case. Insurers need clear rules for source reliability, permissible use, refresh frequency, and exception handling.
Consider injury-related history or claims information that may flow into renewal analysis, litigation review, or portfolio monitoring where legally permitted and relevant. A vague note like "chiro visit in NYC" is not very useful on its own. The file needs context: provider identity, treatment dates, diagnosis details where appropriate, source, and verification status. If a record points to an integrated pain-relief provider such as Move Well MD, the system should not flatten that into generic medical activity. Context is what keeps data from becoming a blunt instrument.
The same principle applies across underwriting. Enrichment should clarify the risk. It should not create a black box full of unverified hints.
For more on this topic, our guide to how data enrichment improves underwriting accuracy covers the practical side of connecting external data to underwriting decisions.
The data warehouse is the control room
Here is another opinion I will defend at any industry dinner: workflow automation without a strong data warehouse is plumbing without a water meter.
You may move submissions faster. You may reduce manual touchpoints. You may even improve quote turnaround. But if the key data points disappear into the workflow and never become usable intelligence, you are missing the bigger prize.
A unified data warehouse gives underwriting leaders the ability to see what is actually happening. Which submissions are being declined? Which brokers have the highest missing-data rates? Which eligibility rules create the most referrals? Which classes are drifting outside appetite? Which renewal segments are producing leakage? Which third-party data sources are worth the cost?
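Those questions become answerable the moment the data lands in one place. A minimal sketch of one of them, missing-data rates by broker, using pandas and invented column names:

```python
# Minimal sketch: missing-data rate by broker from a unified warehouse.
# Table contents and column names are illustrative assumptions.
import pandas as pd

submissions = pd.DataFrame({
    "broker":    ["A", "A", "B", "B", "B", "C"],
    "loss_runs": [1, 1, 0, 1, 0, 1],  # 1 = present, 0 = missing
    "full_vins": [1, 1, 1, 0, 0, 1],
})

missing_rate = 1 - submissions.groupby("broker")[["loss_runs", "full_vins"]].mean()
print(missing_rate.round(2))
#         loss_runs  full_vins
# broker
# A            0.00       0.00
# B            0.67       0.67
# C            0.00       0.00
```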
That is where AI in insurance underwriting becomes more than a faster intake tool. It becomes a management discipline.
This is one reason Inaza is built around more than workflow automation. The platform is designed to capture key data points from underwriting, claims, customer service, and operations, then support dashboards and analytics that help insurers see their business more clearly. Pre-built API templates can enrich workflows with sources such as Verisk, LexisNexis, HazardHub, and others. Industry benchmarks can also help teams compare performance against market context, which is particularly useful for portfolio narratives, reinsurance conversations, and renewal strategy.
In plain English, automation should not only do the task. It should leave behind useful evidence.
The claims feedback loop is where good underwriting gets smarter
Underwriting and claims are often treated like neighbors who wave politely but never have dinner. That is a mistake.
Claims data is one of the richest sources of underwriting insight. FNOL patterns, cause of loss, repair costs, injury severity, litigation indicators, fraud flags, reserve movement, and settlement timing can all tell underwriters whether pricing, eligibility, and appetite are aligned with reality.
The stakes are not small. The Coalition Against Insurance Fraud estimates that insurance fraud costs the United States more than $300 billion per year. Fraud is usually discussed as a claims problem, but the underwriting file often contains early signals. Repeated address inconsistencies, suspicious prior coverage gaps, odd vehicle usage, mismatched business descriptions, and unexplained loss patterns can all appear before a claim ever lands.
When claims and underwriting data remain disconnected, AI misses those patterns. Worse, it may keep writing similar business because the feedback never reaches the front door.
A good underwriting system should learn from downstream outcomes. Not in a vague, futuristic sense. In a practical sense. Did the account produce frequency losses? Did severity exceed expectations? Were certain referral reasons predictive? Did missing documentation correlate with bad outcomes? Did specific brokers or classes show persistent leakage?
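Closing that loop can start small. A minimal sketch, with invented policies, premiums, and losses, that joins underwriting-time data quality to claims outcomes:

```python
# Minimal sketch: did missing documentation at bind predict loss outcomes?
# Policies, premiums, and losses are invented for illustration.
import pandas as pd

book = pd.DataFrame({
    "policy":         ["P1", "P2", "P3", "P4"],
    "docs_missing":   [True, False, True, False],
    "earned_premium": [10_000, 12_000, 9_000, 11_000],
    "incurred_loss":  [9_500, 4_000, 11_000, 3_500],
})

grouped = book.groupby("docs_missing")[["incurred_loss", "earned_premium"]].sum()
grouped["loss_ratio"] = grouped["incurred_loss"] / grouped["earned_premium"]
print(grouped["loss_ratio"].round(2))
# docs_missing
# False    0.33
# True     1.08
```

If a pattern like that holds across a real book, missing documentation stops being an intake annoyance and becomes an underwriting signal.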
That is the kind of loop insurers need if they want AI to support profitable growth rather than faster quote volume alone.
How to fix the data problem before scaling underwriting AI
The fix is not to pause every AI project until the data is perfect. Perfect data is a myth, like a claim file with all attachments labeled correctly on the first try.
The better approach is to make data quality part of the workflow itself.
Start at intake. Underwriting automation should capture submissions from emails, PDFs, spreadsheets, portals, and APIs, then structure the information before it reaches the rating or decisioning layer. If the system cannot confidently read a field, it should say so. If a value conflicts with another source, it should flag the conflict. If a required field is missing, it should request it or route the case.
Treat unknowns properly. A missing MVR, unclear business use, partial VIN, or stale loss run should not slip through as neutral. Unknowns should have a status, an owner, and a resolution path.
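Here is what "a status, an owner, and a resolution path" can mean in practice, as a minimal sketch with invented labels:

```python
# Minimal sketch: an unknown is a tracked item, not a blank cell.
# Statuses, owners, and field names are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class OpenItem:
    field: str          # e.g. "mvr_report"
    status: str         # "missing" | "conflicting" | "low_confidence"
    owner: str          # who is responsible for resolving it
    next_step: str      # the resolution path
    blocks_quote: bool  # can the file proceed without it?

items = [
    OpenItem("mvr_report", "missing", "broker", "request via portal", True),
    OpenItem("vin_3", "low_confidence", "intake_team", "re-extract page 4", False),
]
print([i.field for i in items if i.blocks_quote])  # ['mvr_report']
```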
Keep underwriters in the loop where judgment matters. The goal is not to remove underwriters from complex decisions. The goal is to remove the manual archaeology so they can spend more time assessing risk. The best systems explain what they found, what they could not verify, and why a recommendation was made.
Measure data quality like you measure underwriting performance. Track missing-field rates, re-keying volume, validation failures, referral reasons, enrichment hit rates, quote turnaround, bind ratios, loss ratio by data confidence, and premium leakage indicators. If no one measures data quality, it becomes everyone’s problem and no one’s job.
Finally, do not confuse a pilot with a platform. A small tool that handles one document type can be useful, but underwriting needs connected data across intake, enrichment, rules, referrals, bind, renewal, and claims feedback. Otherwise, you end up with five clever tools and the same old spreadsheet in the middle.
The real promise of AI in insurance underwriting
The real promise is not that AI will become the smartest underwriter in the room. I have met too many sharp underwriters to buy that line.
The real promise is that AI can give underwriters cleaner files, faster context, better consistency, fewer manual checks, and a clearer view of portfolio performance. That is enough. In fact, that is plenty.
But the insurers that win will not be the ones with the flashiest demo. They will be the ones with the cleanest data flows, the strongest validation, the clearest audit trails, and the discipline to connect underwriting decisions to claims outcomes.
AI in insurance underwriting does not fail because the technology lacks ambition. It fails because the data foundation is treated like a side project.
And in underwriting, side projects have a funny way of becoming loss ratio problems.
Frequently Asked Questions
Why does good data matter for AI in insurance underwriting?
Good data gives AI the context needed to support accurate decisions. Without complete, current, and traceable data, underwriting AI may misread risk, produce false precision, route submissions incorrectly, or create audit problems.
Can AI fix poor underwriting data by itself?
AI can help structure, validate, and enrich poor data, but it still needs workflow rules, source controls, human oversight, and feedback loops. The system should flag uncertainty rather than pretend every field is reliable.
What underwriting data should insurers prioritize first?
Start with the fields that most affect eligibility, pricing, and referral decisions. For many P&C insurers and MGAs, that includes loss runs, vehicle or property data, driver or operator details, prior coverage, location, class codes, limits, deductibles, and claims history.
How can MGAs start using AI without disrupting their teams?
Begin with a targeted workflow such as submission intake, loss run extraction, fleet schedule validation, or eligibility checks. The best early projects reduce re-keying, improve data quality, and keep underwriters in control of exceptions.
What is the biggest mistake insurers make with underwriting AI?
The biggest mistake is scaling automation before fixing data visibility. If the organization cannot see where data comes from, how it was validated, and how it influenced a decision, the automation will be hard to trust and harder to govern.
Make underwriting AI worth trusting
Inaza helps insurers, MGAs, and brokers automate underwriting, claims, customer service, and operations while capturing the data needed for better decisions and better business intelligence.
With customizable workflows, support for all file types, seamless system integration, a unified data warehouse, real-time analytics dashboards, 250+ workflow templates, API templates for data enrichment, and industry benchmarks, Inaza is designed to make automation practical without forcing teams into a painful rebuild.
If you want AI underwriting that your team can actually trust, start with the data. Then make the workflow do the heavy lifting.
Explore Inaza to see how connected insurance automation can help your underwriting team move faster without flying blind.


