AI for Data Extraction in (Re)Insurance: Build or Buy?


Summary:

  • Efficient data extraction is crucial for (re)insurance as unstructured data can hinder key functions like underwriting and claims management.
  • GenAI significantly outperforms traditional NLP models by handling broad ranges of unstructured data with minimal training, making it a valuable tool for (re)insurers.
  • (Re)insurers must decide between buying off-the-shelf solutions, building in-house, or blending approaches, considering factors like flexibility, integration, and cost.
  • The chosen data extraction approach should align with a (re)insurer’s broader AI strategy, positioning them to innovate and maintain a competitive edge in the AI-driven landscape.

In the (re)insurance industry, efficiently extracting and utilising various data sources is a strategic necessity, as data underpins critical business functions like underwriting and claims management. However, data points are often trapped in unstructured formats, leading to high operational costs and limiting the information available for decision-making.

Traditional methods of extracting and processing unstructured data have often fallen short. Most natural language processing (NLP) models require extensive data and specialist skills to train, a particular problem in insurance because of the specificity of the documents handled.

Generative artificial intelligence (GenAI) is transforming how (re)insurers handle unstructured data. New large language models (LLMs) already outperform traditional NLP solutions straight out of the box without specific training and offer even greater potential.

We believe this development calls for reassessing data extraction capabilities. However, with many new vendors entering the market, solutions can be hard to evaluate and distinguish. More importantly, improving data extraction is a key first step in enabling the development of advanced AI workflows and realizing a promising GenAI journey in insurance.

Key criteria for an effective data extraction solution

An effective extraction solution must meet several criteria:

Flexibility: It should handle a wide range of inputs and outputs, from simple emails to complex Excel files with exposure lists and loss runs. The solution must classify these inputs and apply appropriate strategies while extracting fields as required by (re)insurer systems.

Accuracy and learning: As 100% accuracy is unrealistic, the solution should enable human review and learn from corrections to improve over time. The usability of these review workflows will affect end-user acceptance.

Transparency, security, and compliance: The solution must offer clear visibility into its decision-making process and extraction strategies, ensure data traceability, and adhere to high standards of data security.

Integration: It should seamlessly integrate with existing (re)insurance systems like policy administration platforms, claims systems, and underwriting tools, and support automated workflows based on extracted data.

Cost: Total cost of ownership, including licensing, development, maintenance, training, and integration with core systems, should be carefully evaluated.
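The accuracy and learning criterion above can be illustrated with a minimal sketch of confidence-based review routing. Everything here is an assumption for illustration: the `ExtractedField` type, the 0.85 threshold, and the field names are hypothetical, and a real solution would feed the correction log back into prompts, rules, or model fine-tuning rather than just storing it.

```python
from dataclasses import dataclass

# Hypothetical cut-off; a real value would be tuned per field and document type.
REVIEW_THRESHOLD = 0.85


@dataclass
class ExtractedField:
    name: str
    value: str
    confidence: float  # score reported by the extraction model, 0.0 to 1.0


def route(fields, threshold=REVIEW_THRESHOLD):
    """Split fields into auto-accepted ones and ones needing human review."""
    accepted = [f for f in fields if f.confidence >= threshold]
    needs_review = [f for f in fields if f.confidence < threshold]
    return accepted, needs_review


def apply_correction(extracted, corrected_value, corrections_log):
    """Record a reviewer's correction so it can later be used for improvement."""
    corrections_log.append(
        {
            "field": extracted.name,
            "extracted": extracted.value,
            "corrected": corrected_value,
        }
    )
    # A human-confirmed value is treated as fully trusted.
    return ExtractedField(extracted.name, corrected_value, 1.0)


fields = [
    ExtractedField("insured_name", "Acme Ltd", 0.97),
    ExtractedField("inception_date", "2024-13-01", 0.42),
]
accepted, needs_review = route(fields)
```

In this sketch only the low-confidence date would reach a reviewer, which keeps the review workload proportional to the model's uncertainty rather than to document volume.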

Sidebar: Technical aspects of NLP vs LLM

Figure 1: Technical aspects of NLP vs LLM

Buying AI solutions: Speed and efficiency

Off-the-shelf AI solutions offer a quick implementation path for (re)insurance companies, especially those that have not yet invested in AI capabilities or lack confidence in them. These solutions outsource early development challenges and can produce quick wins. The market is evolving rapidly: players that originally relied on classical NLP techniques are now scrambling to adopt newer GenAI models, while new players are aggressively entering the field. Some vendors are focusing on insurance-specific functionality, aiming for comprehensive, tailored solutions. However, the lack of transparency in underlying technologies makes it difficult for insurers to assess true capabilities and limitations. The main challenge lies in selecting the right solution, a process often approached unsystematically. This leaves insurers dependent on vendor performance and roadmaps, which may not align with their specific needs. Given the breadth of the field, the risk of betting on the wrong horse, especially this early in the race, is significant.

Figure 2: Pros and cons of buying AI solutions

Building AI solutions: A tailored approach

At the opposite end of the spectrum, building an in-house AI solution for data extraction offers the potential for a highly customised tool that aligns closely with a (re)insurer's specific needs and long-term strategic objectives. This approach provides full control over development and integration with legacy systems but comes with significant challenges. These include substantial financial investment, complex management of AI development, and a high risk of project failure. Despite these obstacles, the long-term benefits can be transformative. Developing proprietary capabilities and AI expertise positions insurers to gain a competitive edge in data utilization, decision-making, and innovation across their entire organization. The skills and knowledge acquired become invaluable assets, enabling insurers to tackle diverse challenges and drive enterprise-wide efficiency.

Figure 3: Pros and cons of building AI solutions

Orchestrating AI solutions: A strategic middle ground

Between the extremes of buying off-the-shelf and building from scratch lies a strategic middle ground: AI orchestration. This approach involves creating a centralized system that integrates and manages various pre-built AI models to create tailored extraction strategies. It allows insurers to maintain control over the architecture and data flow while leveraging multiple AI models' capabilities. The orchestration layer acts as a conductor, ensuring that the different AI components work together harmoniously to address the specific needs of the (re)insurer.

For example, an insurer might use a pre-built model for document classification, another for named entity recognition, and a third for relationship extraction. The orchestration layer would coordinate these models, passing the output of one as the input to the next, to create a comprehensive data extraction pipeline tailored to the insurer's specific requirements.
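The pipeline described above can be sketched in a few lines. This is an illustrative outline only: the three step functions are simple rule-based stand-ins for pre-built models, and the `Document` type, the `orchestrate` function, and the sample text are all hypothetical names introduced for this example.

```python
import re
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Document:
    text: str
    doc_type: str = "unknown"
    entities: dict = field(default_factory=dict)
    relations: list = field(default_factory=list)


Step = Callable[[Document], Document]


def classify(doc: Document) -> Document:
    # Stand-in for a pre-built document classification model.
    doc.doc_type = "claim_notice" if "claim" in doc.text.lower() else "submission"
    return doc


def extract_entities(doc: Document) -> Document:
    # Stand-in for a pre-built named entity recognition model.
    for label in ("Insured", "Policy"):
        match = re.search(rf"{label}:\s*([^;]+)", doc.text)
        if match:
            doc.entities[label.lower()] = match.group(1).strip()
    return doc


def extract_relations(doc: Document) -> Document:
    # Stand-in for a pre-built relationship extraction model.
    if all(key in doc.entities for key in ("insured", "policy")):
        doc.relations.append(
            (doc.entities["insured"], "holds", doc.entities["policy"])
        )
    return doc


def orchestrate(doc: Document, steps: list[Step]) -> Document:
    # The orchestration layer: each step receives the previous step's output.
    for step in steps:
        doc = step(doc)
    return doc


PIPELINE = [classify, extract_entities, extract_relations]
result = orchestrate(
    Document("Claim notice; Insured: Acme Ltd; Policy: P-123"), PIPELINE
)
```

Because the pipeline is just an ordered list of steps, any stand-in could be swapped for a vendor model or an LLM call without changing the orchestration code, which is the main appeal of this approach.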

This approach offers flexibility and customization while leveraging existing IT capabilities, without the need for extensive initial AI expertise. Though it doesn't provide the full range of in-house capabilities that building from scratch does, AI orchestration creates a framework for integrating extraction capabilities into core systems. This framework can be expanded as needed, all while adhering to critical data governance, security, and compliance requirements. Ultimately, AI orchestration enables (re)insurers to create powerful, tailored AI solutions that drive efficiency and competitive advantage in their rapidly evolving industry landscape.

Figure 4: Pros and cons of orchestrating AI solutions

(Re)insurance use cases

Figure 5: (Re)insurance use cases

By implementing these AI data extraction solutions, (re)insurance companies can:

  • Significantly reduce time spent on manual data entry and validation
  • Improve accuracy in pricing, risk assessment, and claims processing
  • Enhance overall operational efficiency
  • Allow skilled professionals to focus on high-value tasks and decision-making

As the (re)insurance industry continues to digitise, effective data extraction becomes crucial for managing diverse digital channels, including broker/placement platforms, portals, email systems, and third-party data sources.

Strategic decision-making for AI implementation

GenAI presents a new opportunity in (re)insurance, making the choice of extraction approach both urgent and strategic. This development should prompt (re)insurers to reassess their data extraction capabilities and consider whether to buy, build, or orchestrate within their broader AI strategy.

The decision between off-the-shelf solutions, in-house development, or orchestration is more than just a way to handle immediate data extraction needs. It’s a crucial step in a promising GenAI journey, where creating tailored AI workflows will be a major competitive advantage. The chosen approach to data extraction will lay the groundwork for this capability.

By evaluating factors such as flexibility, continuous improvement, compliance, system integration, and total cost of ownership, (re)insurers can select a solution that aligns with their long-term AI goals. This strategic choice will address current challenges and boost their ability to innovate and compete in an increasingly AI-driven insurance market.

