Building a model trained specifically to handle insurance safely is only the beginning. To be ready for customer use in a regulated financial environment, you need a safety harness around it: a continuous system of monitoring, explainability, and protection that keeps every interaction compliant, auditable, and trustworthy.
1. Continuous monitoring and benchmarking
Insurance language is regulated language. That means every response must be demonstrably fair, clear, and not misleading. We have built an evaluation framework that continuously tests our models against thousands of real insurance questions to detect failure modes such as missing disclaimers or language that risks crossing the advice boundary.
This framework powers the Compliance Risk Index, our benchmark for model safety. It measures performance across four dimensions: evidence and retrieval, policy logic, communication standards, and the detection of vulnerability or complaints. Each dimension is scored at scale, giving us a measurable view of how our models perform over time and against the generalist models available from other providers.
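As a rough illustration, per-dimension scores can be aggregated across an evaluation set along the following lines. The dimension names mirror the ones above, but the checker functions and the 0–1 scoring scale are illustrative assumptions, not our production framework:

```python
from statistics import mean

# Dimensions of the Compliance Risk Index (names from the text; scoring is illustrative).
DIMENSIONS = [
    "evidence_and_retrieval",
    "policy_logic",
    "communication_standards",
    "vulnerability_and_complaints",
]

def score_response(response: str, checks: dict) -> dict:
    """Run one model response through per-dimension checks.

    `checks` maps each dimension to a callable returning a score in [0, 1];
    in practice these would be automated evaluators or rubric-based graders.
    """
    return {dim: checks[dim](response) for dim in DIMENSIONS}

def compliance_risk_index(scored_responses: list[dict]) -> dict:
    """Aggregate per-response scores into a per-dimension benchmark view."""
    return {dim: mean(r[dim] for r in scored_responses) for dim in DIMENSIONS}
```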
2. Explainability and traceability
We believe all model outputs should be explainable. That means that we must be able to attribute model responses to ground truth source material.
There are two broad approaches to model attribution.
- Generate a response, check it for similarity against source material afterwards, and assume that similarity equals attribution.
- While generating a response, observe which internal activations fire, and trace those activations back to the source material.
Nearly everyone uses the first approach because it is simpler and works with closed-source, black-box models. It is not reliable, however: similarity does not always imply attribution.
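For contrast, a minimal sketch of that first approach could look like the following. Lexical overlap stands in for similarity here purely for illustration; real post-hoc systems typically use embedding similarity, but the underlying assumption is the same:

```python
def token_set(text: str) -> set[str]:
    return set(text.lower().split())

def jaccard(a: str, b: str) -> float:
    """Crude lexical similarity between a response sentence and a source chunk."""
    sa, sb = token_set(a), token_set(b)
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def post_hoc_attribution(response_sentence: str, source_chunks: list[str],
                         threshold: float = 0.4) -> list[str]:
    """Approach 1: generate first, then treat any sufficiently similar source
    chunk as the 'attribution'. The weakness is baked into the assumption:
    a chunk can look similar without being what the model actually relied on."""
    return [c for c in source_chunks if jaccard(response_sentence, c) >= threshold]
```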
We believe having robust explainability is a key requirement for the safe application of large language models in insurance, and as such have invested in the second approach.
Because we built the model, we have access to its internals during inference, and can link node activations to exact text within policy wordings. This lets us provide industry-leading levels of explainability and traceability.
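In highly simplified form, activation-level tracing can be thought of as reading attribution off the model's own computation rather than inferring it afterwards. The sketch below is a toy illustration under that assumption, using averaged attention over the in-context policy text; a production system works with richer internal signals than raw attention:

```python
import numpy as np

def trace_to_source(attention: np.ndarray, source_tokens: list[str],
                    top_k: int = 5) -> list[tuple[str, float]]:
    """Toy illustration of activation-level tracing, assuming access to model internals.

    `attention` has shape (generated_tokens, source_tokens): how strongly each
    generated token attended to each token of the in-context policy wording.
    We aggregate over the generated tokens and surface the source tokens that
    most influenced the response.
    """
    influence = attention.mean(axis=0)            # average influence per source token
    top = np.argsort(influence)[::-1][:top_k]     # most influential source positions
    return [(source_tokens[i], float(influence[i])) for i in top]
```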
3. Retrieval accuracy at scale
To answer customer queries with correct and relevant information, our models must be able to draw from a dynamic knowledge base with thousands of policy documents and supporting materials.
Traditional retrieval-augmented generation (RAG) systems rely on vector embeddings. However, these embeddings introduce brittleness:
- Chunking strategies must be retuned as documents change
- Embedding models can have blind spots for domain-specific terminology
- Retrieval quality degrades when new policy types are added
Rather than splitting optimisation between a retrieval pipeline and the model, we put all the capability into the model itself. This means we can teach the model to issue precise text-pattern searches over raw policy documents, compose multiple searches to handle complex queries, and, importantly, self-correct when initial searches don't yield useful results.
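A minimal sketch of that search-and-self-correct loop is below. The search tool itself is a plain regex scan over raw documents; `model.propose_pattern` and `model.refine_pattern` are hypothetical stand-ins for the trained model's tool calls, not a real API:

```python
import re

def pattern_search(documents: dict[str, str], pattern: str, window: int = 120) -> list[dict]:
    """Search raw policy documents with a text pattern and return surrounding snippets."""
    hits = []
    for doc_id, text in documents.items():
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            start, end = max(0, match.start() - window), match.end() + window
            hits.append({"doc": doc_id, "snippet": text[start:end]})
    return hits

def answer_with_search(model, question: str, documents: dict[str, str],
                       max_attempts: int = 3) -> list[dict]:
    """Sketch of the loop: the model proposes a pattern, inspects the hits,
    and proposes a revised pattern if nothing useful comes back."""
    pattern = model.propose_pattern(question)
    for _ in range(max_attempts):
        hits = pattern_search(documents, pattern)
        if hits:
            return hits
        pattern = model.refine_pattern(question, failed_pattern=pattern)
    return []
```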
This approach delivers higher retrieval accuracy than embedding-based approaches, even as the number of documents increases.
This means that when customers ask questions, they can be confident that the responses are accurate, even when comparing many policies at once.
4. Hallucination detection and output guardrails
Generative models can sometimes fill gaps with plausible but unsubstantiated information. In insurance, unsubstantiated and potentially false information can lead to customer harm. We need to detect it, and ultimately prevent any hallucinated content from reaching the consumer.
We have trained a secondary model with the sole purpose of detecting hallucinations. It confirms that every piece of information in the output message is present in the source material the model referenced when generating the response.
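In simplified form, the guardrail works like the sketch below. Claim extraction and the support check are deliberately naive stand-ins (sentence splitting and word overlap); the production check is the dedicated trained model described above:

```python
def extract_claims(response: str) -> list[str]:
    """Naive claim extraction: treat each sentence as a claim."""
    return [s.strip() for s in response.split(".") if s.strip()]

def is_supported(claim: str, sources: list[str]) -> bool:
    """Stand-in support check: simple word overlap against the referenced sources.
    The real check judges whether the claim is actually entailed by the sources."""
    claim_words = set(claim.lower().split())
    return any(
        len(claim_words & set(src.lower().split())) / max(len(claim_words), 1) > 0.6
        for src in sources
    )

def guardrail(response: str, sources: list[str]) -> dict:
    """Flag any claim not grounded in the source material the model used."""
    unsupported = [c for c in extract_claims(response) if not is_supported(c, sources)]
    return {"ok": not unsupported, "unsupported_claims": unsupported}
```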
If hallucinated content is detected, the output is flagged and the message can be regenerated or the conversation escalated. These controls keep every response grounded in verifiable, in-context evidence, so consumers can trust the model's outputs.
The result: safe, confident deployment
Together, these mechanisms form the safety harness around our models. They make Open General Insurance Intelligence not just powerful, but safe to deploy directly to customers through products like Insurance Companion, delivering compliant, trustworthy experiences at scale.



