Blog

x is not f(x): Insurance edition

I have recently been reading Nassim Taleb’s new book, Statistical Consequences of Fat Tails, which is freely available on arXiv:

https://arxiv.org/abs/2001.10488

Note that if the book interests you, you are welcome to join the Global Technical Incerto Reading Club, which I host together with James Sharpe (https://twitter.com/Sharpe_Actuary?s=20). Next week (2 March), Nassim will speak to the reading club about the book, with some focus on Chapter 3; if you are interested in the talk, you can sign up here:

Session 1 – Introduction by Nassim Taleb

Tuesday, Mar 2, 2021, 5:00 PM – Online event

We are delighted that Nassim Taleb will speak at the next meeting of the reading club. Details to follow.

Meetup page: Reading Club: Meetup 1

In preparation for the talk I read through Chapter 3, which summarizes some of the key themes of the book. In this post I discuss some thoughts on how those themes pop up within the insurance world, with a focus on the idea that exposure to a risk needs to be treated differently from the underlying risk itself.

x is not f(x)

The idea, as I understood it, is that one can spend much time and effort trying to forecast the behavior of a random variable x, whereas what has a practical effect on you is not the random variable itself but how you have ‘shaped’ your exposure to the risk, expressed as f(x).

In what follows, I call the random variable x the ‘risk’ and the manner in which the risk affects you the ‘exposure’, f(x). The key point is that whereas it can be difficult, if not impossible, to gain knowledge about a risk (often because the most impactful risks have awkward statistical properties), changing the impact of the risk on you (or your P&L/company) is much easier. The idea is expressed beautifully in the book by an anecdote about a trader (page 37):

“In Fooled by Randomness (2001/2005), the character is asked which was more probable that a given market would go higher or lower by the end of the month. Higher, he said, much more probable. But then it was revealed that he was making trades that benefit if that particular market goes down.”

If we consider that the insurance industry has survived exposure to heavy-tailed risks for centuries, and that insurance companies usually cause system-wide problems only when they stray into taking large financial risks (as opposed to insurance risks), then it is reasonable to expect the industry to be a good example of implementing these principles in practice.

Shaping exposures within insurance

The idea of focusing on how one is exposed to a risk, as opposed to trying to forecast the risk itself, appears almost everywhere within the insurance industry once you start looking for it. In almost every case I can think of, insurers do not accept the full exposure to risks as they stand, but rather use contractual terms and conditions to ensure that the full impact of a risk does not land on their P&L.

Some obvious examples are the applications of limits within general insurance. Instead of taking on the full consequences of a risk, insurance policies usually have a maximum payout for each occurrence of a risk, and perhaps also for the total impact of the risks over the course of the policy term. Limits ensure that the tail risk of an insurance policy is capped (technically, the resulting losses are right censored) and that the maximum loss on an insurance book is bounded. See the sections below for some more discussion of the implications of policy limits.

Limits act to shape the exposure to insurance claim ‘severity’, one of the two classic components of insurance losses. The other is ‘frequency’, which refers to the possibility that more claims than expected arise within an insurance portfolio. A common response to reduce the exposure to frequency losses is to include excesses within policies, which require policyholders to pay for losses below a certain amount. Since insurance risks generate many smaller losses, which nonetheless carry a fixed cost to administer and settle, shaping the frequency exposure in this manner is also key.
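To make the shape of the resulting exposure concrete, here is a minimal sketch in R of the payout function f(x) implied by an excess and a per-occurrence limit; the figures are purely illustrative and not taken from any particular policy:

# Minimal sketch (illustrative figures only) of how an excess and a
# per-occurrence limit shape an insurer's exposure f(x) to a raw loss x
insurer_payout <- function(x, excess = 10e3, limit = 1e6) {
  # nothing is paid below the excess; payouts are capped (right censored) at the limit
  pmin(pmax(x - excess, 0), limit)
}

insurer_payout(c(5e3, 50e3, 2e6))
# returns 0, 40000 and 1000000: the small loss is retained by the policyholder,
# and the payout on the large loss is capped at the limit

However badly behaved the raw loss x is in the tail, the insurer's f(x) is bounded above by the limit.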

Now we consider some less obvious examples. Whereas the examples above are implemented in a strictly contractual manner, a key process through which insurers shape their risk exposure is underwriting. Some risks are simply too heavy-tailed for insurers to have much appetite to write them; for example, almost every property policy I have seen excludes losses resulting from war and nuclear risks. Within liability insurance, anything with a United States liability exposure is usually considered too risky for insurers outside the US to cover, due to the extreme claim awards in the US compared to other jurisdictions. Many insurers will only approach aviation risks or highly volatile manufacturing operations (e.g. chemicals) with extreme care.

A different approach is to write policies only for a slice of a risk that is more appealing. For example, within the cyber insurance market, most insurers do not offer coverage for the full exposure of a cyber loss, but only provide limited support, e.g. helping to recover lost data or covering the costs of crisis-management practitioners. This leads some people to complain that the cyber risk market is “inefficient” or “dysfunctional”, since it is difficult to find cover for the actual cyber risks faced; on the other hand, given the extreme losses that cyber risk can potentially cause, and the limited data on loss experience, this criticism is somewhat akin to “lecturing birds on how to fly”.

The final layer in shaping exposure for an insurer is reinsurance – getting risk off one’s P&L by passing it on to another insurer. Among the many forms of reinsurance are policies that produce an option-like exposure, where one can pass risk above a fixed level of losses to the counterparty for a fixed premium (excess of loss). Other forms share risks between insurer and reinsurer in agreed proportions (quota share).
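As a rough sketch of how these covers reshape the retained position, with hypothetical attachment, layer and cession figures and one possible ordering of the two covers:

# Rough sketch of shaping a gross annual loss S with an excess-of-loss treaty
# followed by a quota share; all figures are hypothetical
net_retained <- function(S, attachment = 5e6, layer = 20e6, cession = 0.3) {
  xol_recovery <- pmin(pmax(S - attachment, 0), layer)  # losses above the attachment, capped at the layer size
  retained <- S - xol_recovery
  (1 - cession) * retained                              # cede a fixed proportion of what remains
}

net_retained(c(2e6, 12e6, 40e6))
# returns 1.4e6, 3.5e6 and 14e6: the tail of the gross loss is substantially flattened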

After applying all of these risk “shaping” approaches to define f(x), an insurer is hopefully suitably protected from the risks it does not want to take.

Other implications

One of the implications of adding limits to an insurance policy is that the subsequent analysis of losses generated by these policies needs to use special methods. A popular approach for analyzing the severity of losses within general insurance is to fit one distribution to the smaller and more frequent attritional losses, and another distribution to the extreme losses, with the latter often motivated by extreme value theory (see the introductory session on EVT here). However, this approach ignores the fact that each loss has an upper bound determined by the limits on the policy generating it. Also, since these extreme losses follow a very heavy-tailed distribution, naïve estimators of their statistical properties are likely to be biased. To solve this problem in other domains, Taleb and his collaborators introduce an approach – called “shadow moments” – which works by first transforming the data to a new, unbounded domain, parameterizing EVT distributions in that domain, and then translating the implications of these models back to the original bounded domain. Two works demonstrating this, in the context of war and pandemic casualties respectively, are Cirillo & Taleb (2016, 2020). The shadow moment approach seems to have substantial applicability in insurance modelling.
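As a very rough sketch of the idea in R (not the authors' implementation): the simulated losses, bounds and threshold below are purely illustrative, the "dual" transformation is the one described in Cirillo & Taleb (2016), and I translate back to the bounded domain by simulation rather than via their closed-form shadow mean.

# Rough sketch of the shadow-moment idea for losses bounded above (e.g. by a policy limit)
set.seed(1)
L <- 0; H <- 10e6                                  # lower and upper bounds on the loss
x <- 1e5 * (runif(20000)^(-1 / 1.5) - 1)           # toy heavy-tailed losses
x <- x[x < H]                                      # keep losses strictly below the bound

# 1. Map the bounded losses to an unbounded "dual" domain
phi     <- function(x) L - H * log((H - x) / (H - L))
phi_inv <- function(y) H - (H - L) * exp(-(y - L) / H)
y <- phi(x)

# 2. Fit a generalized Pareto distribution to dual-domain exceedances by maximum likelihood
u <- quantile(y, 0.95)                             # illustrative threshold
z <- y[y > u] - u
gpd_nll <- function(par, z) {
  xi <- par[1]; beta <- par[2]
  if (beta <= 0 || any(1 + xi * z / beta <= 0)) return(Inf)
  length(z) * log(beta) + (1 + 1 / xi) * sum(log(1 + xi * z / beta))
}
fit <- optim(c(0.5, sd(z)), gpd_nll, z = z)
xi <- fit$par[1]; beta <- fit$par[2]

# 3. Translate back: simulate dual-domain exceedances from the fitted GPD and
#    map them to the original bounded domain to estimate a "shadow" tail mean
p <- runif(1e5)
y_sim <- u + beta / xi * ((1 - p)^(-xi) - 1)       # GPD quantile function
c(naive  = mean(x[x > phi_inv(u)]),                # empirical tail mean in the bounded domain
  shadow = mean(phi_inv(y_sim)))

The interesting comparison is between the naive empirical tail mean and the EVT-based shadow estimate, which respects both the heavy tail and the upper bound.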

In the next post we will talk about how actuaries apply valuation approaches to measure f(x).

Bibliography

Cirillo, P., & Taleb, N. N. (2016). On the statistical properties and tail risk of violent conflicts. Physica A: Statistical Mechanics and Its Applications, 452, 29–45. https://doi.org/10.1016/j.physa.2016.01.050

Cirillo, P., & Taleb, N. N. (2020, June 1). Tail risk of contagious diseases. Nature Physics, Vol. 16, pp. 606–613. https://doi.org/10.1038/s41567-020-0921-x

The Actuary and IBNR Techniques

Some exhibits from the talk: the box plot shows that our method can select a performant variant of the chain ladder method on out-of-sample data.

Yesterday, we presented our new paper at the virtual GIRO event. The paper provides a method for the optimal selection of IBNR techniques using a machine learning philosophy, and can be found here:

https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3697256

The slides from the presentation are below.

Nagging Predictors

We are excited to release a new paper on aggregating neural network predictions with a focus on actuarial work. The paper examines the stability of neural network predictors at a portfolio and individual policy level and shows how the variability of these predictions can act as a data-driven metric for assessing which policies the network struggles to fit. Finally, we also discuss calibrating a meta-network on the basis of this metric to predict the aggregated results of the networks.

Here is one image from the paper which looks at the effect of “informing” the meta-network about how variable each individual policy is.

If you would like to read the paper, it can be found here:

https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3627163

HMD – Weekly Data

The Human Mortality Database (link) is one of the best publicly available sources of demographic data. In addition to their regular reporting across about 40 countries, the curators have now added a special time series of weekly death data for 13 countries to enable the tracking of COVID-19 deaths and their effect on weekly mortality rates. In their usual fashion, the HMD have provided the data in an easy-to-use csv file which can be downloaded from the website.

Rob Hyndman (whose work on time series forecasting I have learned much from over the years and whose R packages and textbook I use/mention in this blog) posted today about this new data source on his excellent blog (https://robjhyndman.com/hyndsight/excess-deaths/). He shows plots of the excess deaths for all 13 countries.

I was wondering how one might derive a prediction interval for these weekly mortality rates, and to what extent the COVID-19 mortality rates would lie outside a reasonable interval. What follows is a rough analysis using time series methods. A quick inspection of the rates for the UK shows strong seasonal features, as shown in the image below.

Plot of UK total mortality rates (all ages and both sexes), 2010 to 2020

Interestingly, the data dip and then recover around week 13 each year, probably due to reporting lags around the Easter holidays. Other similar effects can be seen in the data. To produce reasonable forecasts, one approach would be to model these seasonal effects explicitly.

Thanks to the excellent forecast package in R, this is really easy. First, we show a seasonal and trend decomposition using Loess (STL) of these data (excluding 2020). For details on this technique, see this link.

STL decomposition of the UK total weekly death data

The plot shows quite a strong seasonal pattern, often characterized by dips and recoveries in the data. Overall, the trend component seems to have increased since 2010, which is a little puzzling at first glance, since mortality rates in the UK improved for most of the period 2010-2019; the increase is probably due to an aging population.

One could now use the STL forecasting function (forecast::stlf) to produce forecasts. Here, I have instead chosen to fit a seasonal ARIMA model (link). The model specification selected by the forecast::auto.arima function is an ARIMA(2,0,1)(1,1,0)[52] model.
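For reference, a minimal sketch of these two steps, assuming the UK weekly rates for 2010-2019 have already been extracted from the HMD csv into a numeric vector uk_rates (a hypothetical name):

library(forecast)

# Assumption: uk_rates holds the UK weekly total mortality rates for 2010-2019
uk_ts <- ts(uk_rates, start = c(2010, 1), frequency = 52)

# Seasonal and trend decomposition using Loess (the STL plot shown above)
plot(stl(uk_ts, s.window = "periodic"))

# Automatically selected seasonal ARIMA model
fit <- auto.arima(uk_ts)
summary(fit)   # the post reports an ARIMA(2,0,1)(1,1,0)[52] specification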

Finally, we are ready to forecast! In this application, I have used prediction intervals at the 95% and 99.5% levels. Plotting the 2020 data against these intervals produces the following figure.
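Continuing the sketch above, with fit the seasonal ARIMA model and the observed 2020 rates assumed to be in a vector uk_rates_2020, the intervals and the comparison can be produced roughly as follows:

# Forecast a year ahead with 95% and 99.5% prediction intervals
fc <- forecast(fit, h = 52, level = c(95, 99.5))

# Plot the forecast fan and overlay the observed 2020 weekly rates
plot(fc)
lines(ts(uk_rates_2020, start = c(2020, 1), frequency = 52), col = "red")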

It can be seen that the 2020 weekly mortality rates fall dramatically outside even a 99.5% interval. It is probably not too surprising that an uncertainty interval calibrated on only 10 years of data is too narrow, but the extent to which this has occurred is dramatic!

This analysis is very rudimentary and could be improved in several ways: obviously the distributional assumptions should be amended to allow for larger shocks, and more advanced forecasting methods would allow information to be shared across countries.

The code can be found on my GitHub here:

https://github.com/RonRichman/covid_weekly



2020 Hachemeister Prize

I am very grateful to the Casualty Actuarial Society’s committee that awarded my 2018 paper “AI in Actuarial Science” the 2020 Hachemeister Prize. I hope that the methods discussed in the paper eventually make an impact on P&C actuarial practice! If you want to read the paper, please find it here. One of my favorite images from the paper is pasted below and shows the results of a convolutional autoencoder fit to telematics data.

Convolutional autoencoder fit to velocity-acceleration (v-a) telematics heatmaps generated using the simulation machine kindly provided by Mario Wüthrich at this link.

Discrimination-Free Insurance Pricing

We are excited to share a new paper on “Discrimination-Free Insurance Pricing”, written by Mathias Lindholm, Andreas Tsanakas and Mario Wüthrich, with a small contribution from myself. In this paper, we present a general method for removing direct and indirect discrimination from the types of models in common use in the insurance sector (GLMs), as well as from more advanced machine learning methods.

One of my favorite plots from the paper shows a comparison of prices produced using a neural net with our method applied to the results.

We would like to hear your feedback and the paper can be downloaded here:

https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3520676

IDSC Africa – Call for Abstracts

The call for abstracts is open for the first Insurance Data Science Conference – Africa, to be held at the Sandton Convention Centre on 20 April 2020 and jointly organized by the Actuarial Society of South Africa, QED Actuaries and Consultants, and the University of the Witwatersrand. The invited keynote speakers are Mario Wüthrich (RiskLab, ETH Zürich) and Marjorie Ngwenya (ALU School of Insurance, past President of the Institute and Faculty of Actuaries).

We are inviting abstracts in the areas of insurance analytics, machine learning, artificial intelligence, and actuarial science. Please send your abstract by 29 February 2020 to: submissions@insurancedatascience.org.za

Your submission should include:

  • Name and affiliation of the speaker
  • Email address of the speaker
  • Title of the presentation
  • Abstract of 10 to 20 lines and not more than 5 references
  • Format: pdf file and either a Word file or LaTeX file
  • If submitting LaTeX, please test your submission file, e.g. via https://latexbase.com/

The submitted abstracts will be evaluated and speakers will be selected by the scientific committee. Conference fees will be waived for selected speakers.

For more information visit the conference web site: https://insurancedatascience.org.za. Registration for the conference will open in mid-February 2020.

Note: The international version of this conference has been held in European cities for the past 7 years and returns to London in 2020. For more information, please visit: https://www.insurancedatascience.org/

IDSC – Africa

One of the most inspiring events I have attended was the Insurance Data Science Conference held in Zurich this year.

I am very excited to announce that on 20 April 2020 we will hold an affiliated event in South Africa, which will be organized by the Actuarial Society of South Africa, QED Actuaries & Consultants, and the University of the Witwatersrand. The conference website is here:

The call for abstracts is open and we look forward to receiving your submission.

Finally, the 2020 event in Europe will be held in London at the Cass Business School. The call for abstracts has just gone out and the website is here:

Insurance Data Science Conference