Insurance: Part 1 – Predictive Analytics in Insurance





25th June 2018


Data Science


By: Natasha Mashanovich, Senior Data Scientist at World Programming, UK

Insurance is “a promise to provide compensation in the future if certain events take place during a specified time period” (source: ). Unlike many other products, whose cost is known before the product is sold, insurance is a very different ‘beast’: the ultimate cost of an insurance policy is unknown at the time of purchase. Hence, selling an insurance product carries great financial risk.

In its simplest mathematical form, a product price is defined as the sum of cost and profit. The primary aim, and biggest challenge, in the insurance sector is the accurate estimation of the product cost. Over the years, insurers have developed a plethora of tools, methodologies and mathematical models to calculate the cost. The big data revolution – along with advances in data processing, predictive analytics and artificial intelligence – has made this effort more achievable. Nevertheless, the fact that in 2015 the UK motor insurance market made an underwriting profit for the first time since 1994 shows that insurance is an extremely challenging business sector (source: ).
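The pricing identity above can be sketched in a few lines of Python. This is a minimal illustration rather than any insurer's actual method: it assumes the cost is the expected claim cost per policy-year (claim frequency times average claim severity) plus a flat expense loading, and that profit is a fixed margin on that cost. The names `Experience`, `pure_premium` and `price`, and all the figures, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Experience:
    """Historical experience for a group of similar policies (made-up data)."""
    exposure: float          # policy-years in force
    claim_count: int         # number of claims reported
    total_claim_cost: float  # total cost of those claims


def pure_premium(exp: Experience) -> float:
    """Expected claim cost per policy-year = frequency x severity."""
    if exp.claim_count == 0:
        return 0.0
    frequency = exp.claim_count / exp.exposure         # claims per policy-year
    severity = exp.total_claim_cost / exp.claim_count  # average cost per claim
    return frequency * severity


def price(exp: Experience, expenses_per_policy: float, profit_margin: float) -> float:
    """Price = cost + profit, with profit assumed to be a fixed margin on cost."""
    cost = pure_premium(exp) + expenses_per_policy
    profit = cost * profit_margin
    return cost + profit


# Hypothetical motor book: 10,000 policy-years, 800 claims costing 2,000,000 in total.
book = Experience(exposure=10_000, claim_count=800, total_claim_cost=2_000_000)
print(round(pure_premium(book), 2))
print(round(price(book, expenses_per_policy=50.0, profit_margin=0.05), 2))
```

With these invented figures the expected claim cost is 200 per policy-year and the resulting price is 262.50. The difficulty described above is that frequency and severity are only estimates when the policy is sold.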

The key facts stated in the latest UK Insurance & Long-term Savings annual report from the Association of British Insurers (ABI) confirm the importance of the insurance industry for the UK’s economic strength. The UK insurance industry is the largest in Europe and the fourth largest in the world, with total premium income of £300 billion generated in 2016. There are over 900 authorised general insurers in the UK with more than 300,000 employees. The value of premiums written is constantly growing, with motor and contents insurance being the largest products. Over 75% of the UK’s households have had motor and/or contents insurance. Despite total revenue measured in tens of billions, fine margins and fraudulent claims totalling £800 million led to a £200 million underwriting loss in motor insurance.

Figure 1. UK Insurance Key Facts (source:, 2017)

The paramount objective in the insurance market is, therefore, to set adequate, fair and competitive premiums. With a customer-centric approach, an insurance pricing system should be easy to understand, provide stable rates over time, be agile to economic drifts, and include loss control that would ultimately provide affordable rates. These are very challenging and opposing requirements that place a large financial burden on the insurers.

To be able to provide premiums, insurers try to answer many unknowns throughout the customer journey (Figure 2) such as: how risky is a customer; should they receive a discount offer; how much discount to offer; how to acquire more customers; how to retain existing customers; what is the likelihood of a customer making a claim and would it be possible to predict the total claim amount; can we identify fraudulent customers; how to encourage customers to purchase other products, and so on.

Figure 2. Customer Interactions throughout the Customer Journey

Calculation of adequate, fair, and competitive insurance policies is the key to answering these questions and ensuring a long-term customer relationship. Hence, insurance pricing – often referred to as ratemaking – is the key driver in the insurance industry and the art of data science in this industry. The two most important insurance concepts responsible for adequate, fair and competitive insurance policies are Pricing and Claims. These concepts, supported by Fraud Detection, are the key analytics elements that contribute to rapid advances in insurance technology innovations (Figure 3).

Figure 3. Insurance Analytics Framework

Table 1 illustrates how Data Science can be utilised across these three insurance concepts and assist in dealing with various business challenges at different stages in the customer lifetime cycle.

| Segment | Challenges | Analytics solution | Typical modelling approach | Business benefits |
|---|---|---|---|---|
| PRICING | The ultimate cost of an insurance policy is not known at the time of sale | Customer-level ratemaking (risk-based pricing) | Generalised linear models (e.g. the GENMOD procedure in the SAS programming language) | Adequate and fair pricing so that Premium = Loss + Profit |
| | Understanding competitive market and its dynamics | Market-based pricing models including: conversion, demand and retention models | Propensity models | Expanding the customer base, competitive advantage |
| | How valuable are my customers? | Customer lifetime value | Survival analysis, segmentation, propensity models | Optimal marketing campaigns |
| | What is a customer pricing tolerance? | Price elasticity | Optimisation | Maximising profit |
| CLAIMS | Reduce high operational/IT cost and maintain customer satisfaction | Claims management framework including first notification of loss | Holistic approach utilising: propensity and regression models including bodily injury, claim cost, write-off | Real-time decision making, monetisation |
| FRAUD DETECTION | Application and claims fraud detection | Fraud detection framework | Holistic approach utilising: propensity models, anomaly detection, fraud rules, black lists, link analysis | Minimising loss |


Table 1. Leveraging Data Science for InsurTech
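To make the first row of Table 1 more concrete, here is a minimal sketch of risk-based ratemaking with a generalised linear model. It is written in plain Python rather than with the SAS GENMOD procedure named in the table, and it is illustrative only: a Poisson claim-frequency model with a log link and an exposure offset, fitted by Newton-Raphson. The helper names `solve` and `fit_poisson_glm`, the single rating factor (a "young driver" flag) and the data are all hypothetical.

```python
import math


def solve(a, b):
    """Solve the small linear system a @ x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_poisson_glm(rows, claims, exposure, iterations=25):
    """Fit log(E[claims]) = log(exposure) + rows @ beta by Newton-Raphson."""
    p = len(rows[0])
    beta = [0.0] * p
    for _ in range(iterations):
        # Expected claim counts under the current coefficients.
        mu = [e * math.exp(sum(b * x for b, x in zip(beta, r)))
              for r, e in zip(rows, exposure)]
        # Gradient of the Poisson log-likelihood: X^T (y - mu).
        gradient = [sum(r[j] * (y - m) for r, y, m in zip(rows, claims, mu))
                    for j in range(p)]
        # Fisher information: X^T diag(mu) X.
        information = [[sum(r[j] * r[k] * m for r, m in zip(rows, mu))
                        for k in range(p)] for j in range(p)]
        step = solve(information, gradient)
        beta = [b + s for b, s in zip(beta, step)]
    return beta


# Hypothetical aggregated data: one row per rating cell.
# Columns: intercept, young-driver flag.
rows = [[1.0, 0.0], [1.0, 1.0]]
claims = [50, 60]            # observed claim counts per cell
exposure = [1000.0, 500.0]   # policy-years per cell

beta = fit_poisson_glm(rows, claims, exposure)
base_frequency = math.exp(beta[0])      # claims per policy-year, base group
young_relativity = math.exp(beta[1])    # multiplicative loading for young drivers
print(round(base_frequency, 4), round(young_relativity, 4))
```

Because this toy model has one parameter per rating cell, the fitted values simply reproduce the observed rates (a base frequency of 0.05 and a relativity of 2.4). A real ratemaking model would have many rating factors and usually a separate severity model, and would be fitted with an established statistics package rather than hand-written code.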

Successful development, implementation and utilisation of insurance predictive models is heavily dependent on the selected analytics platform, which must satisfy an extensive range of requirements including ETL (extract, transform, load) capabilities; data manipulation, preparation and visualisation; model building and validation; model deployment; testing; production; and monitoring. Insurers often opt for a mixture of commercial and open-source tools to justify the cost of implementation. However, careful consideration is necessary, as this can often lead to a suboptimal solution because the integration process can be time- and resource-consuming.

Figure 4. WPS Analytics Platform for Insurance
