Extending Development Patterns with Tails

Getting Started#

This tutorial focuses on extending development patterns beyond the end of the triangle with tails.

Make sure your packages are up to date. For more information on how to update your packages, visit Keeping Packages Updated.

# Black linter, optional
%load_ext lab_black

import pandas as pd
import numpy as np
import chainladder as cl
import matplotlib.pyplot as plt

print("pandas: " + pd.__version__)
print("numpy: " + np.__version__)
print("chainladder: " + cl.__version__)

pandas: 2.1.4
numpy: 1.24.3
chainladder: 0.8.18

Disclaimer#

Note that many of the examples shown might not be applicable in a real-world scenario and are only meant to demonstrate some of the functionality included in the package. The user should always follow all applicable laws, the Code of Professional Conduct, and applicable Actuarial Standards of Practice, and should exercise their best actuarial judgment.

Basic Tail Fitting#

Tails are another class of transformers. Like the Development estimator, they come with fit, transform, and fit_transform methods. Also like the Development estimator, a tail can be defined in the absence of data, or when you believe development will continue beyond your latest evaluation period.

quarterly = cl.load_sample("quarterly")
quarterly["paid"]

	3	6	9	12	15	18	21	24	27	30	...	108	111	114	117	120	123	126	129	132	135
1995	3.00	24.00	65.00	141.00	273.00	418.00	550.00	692.00	814.00	876.00	...	1,099.00	1,099.00	1,100.00	1,098.00	1,098.00	1,098.00	1,099.00	1,099.00	1,100.00	1,100.00
1996	1.00	16.00	54.00	135.00	260.00	398.00	594.00	758.00	871.00	964.00	...	1,296.00	1,296.00	1,297.00	1,298.00	1,298.00	1,298.00
1997	1.00	17.00	55.00	166.00	296.00	442.00	587.00	701.00	811.00	891.00	...	1,197.00	1,198.00
1998	1.00	11.00	40.00	93.00	185.00	343.00	474.00	643.00	744.00	831.00	...
1999	1.00	14.00	47.00	113.00	225.00	379.00	570.00	715.00	832.00	955.00	...
2000	1.00	6.00	28.00	100.00	194.00	297.00	415.00	521.00	616.00	697.00	...
2001	1.00	7.00	37.00	128.00	271.00	427.00	579.00	722.00	838.00	937.00	...
2002	1.00	10.00	45.00	110.00	236.00	442.00	668.00	890.00	1,078.00	1,198.00	...
2003	1.00	9.00	31.00	94.00	192.00	299.00	408.00	792.00	873.00	949.00	...
2004	4.00	16.00	49.00	170.00	289.00	442.00	601.00	793.00	948.00		...
2005	1.00	7.00	36.00	97.00	183.00						...
2006	1.00										...

Upon fitting data, we get updated ldf_ and cdf_ attributes that extend beyond the length of the triangle. Notice how the tail includes extra development periods (age 147) beyond the end of the triangle (age 135) at which point an age-to-ultimate tail factor is applied.

tail = cl.TailCurve()
tail.fit(quarterly)

print("Triangle latest", quarterly.development.max())
tail.fit(quarterly).ldf_["paid"]

Triangle latest 135

	3-6	6-9	9-12	12-15	15-18	18-21	21-24	24-27	27-30	30-33	...	120-123	123-126	126-129	129-132	132-135	135-138	138-141	141-144	144-147	147-150
(All)	8.5625	3.5547	2.7659	1.9332	1.6055	1.4011	1.3270	1.1658	1.1098	1.0780	...	1.0000	1.0009	1.0000	1.0009	1.0000	1.0001	1.0001	1.0001	1.0001	1.0003

These extra twelve months (age 135 to age 147) of development patterns are included because it is typical to track IBNR run-off over a one-year time horizon from the valuation date. The extension is currently fixed at one year and cannot be lengthened. However, a subsequent version of chainladder will look to address this issue.

Curve Fitting#

Curve fitting takes selected development patterns and extrapolates them using either an exponential or inverse_power fit. In most cases, the inverse_power produces a thicker (more conservative) tail.

exp = cl.TailCurve(curve="exponential").fit(quarterly["paid"])
exp.tail_

	135-Ult
(All)	1.00065

inv = cl.TailCurve(curve="inverse_power").fit(quarterly["paid"])
inv.tail_

	135-Ult
(All)	1.021283
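The difference between the two curves can be sketched without chainladder. The block below is a minimal illustration, assuming the common formulation in which log(ldf - 1) is regressed on age for the exponential curve and on log(age) for the inverse power curve. The function names, the made-up factors, and the choice to label each factor by the starting age of its interval are all assumptions for illustration; chainladder's own implementation may differ in weighting and in how it treats factors at or below 1.0.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def sketch_tail(ages, ldfs, curve, extrap_periods=100):
    """Fit log(ldf - 1) against age (exponential) or log(age) (inverse power),
    then multiply the extrapolated factors into a single tail factor."""
    to_x = math.log if curve == "inverse_power" else float
    # Only factors above 1.0 are usable, since log(ldf - 1) is undefined otherwise.
    pairs = [(to_x(a), math.log(f - 1.0)) for a, f in zip(ages, ldfs) if f > 1.0]
    intercept, slope = fit_line([x for x, _ in pairs], [y for _, y in pairs])
    step = ages[-1] - ages[-2]  # spacing of the development ages
    tail = 1.0
    for k in range(1, extrap_periods + 1):
        age = ages[-1] + k * step
        tail *= 1.0 + math.exp(intercept + slope * to_x(age))
    return tail


# Made-up annual factors, not taken from the quarterly sample triangle.
ages = [12, 24, 36, 48, 60]
ldfs = [1.5, 1.2, 1.1, 1.05, 1.025]
exp_tail = sketch_tail(ages, ldfs, "exponential")
inv_tail = sketch_tail(ages, ldfs, "inverse_power")
print(exp_tail, inv_tail)
```

On these illustrative factors the inverse power curve decays more slowly than the exponential curve, so its extrapolated factors stay above 1.0 for longer and the resulting tail is larger.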

When fitting a tail, by default, all of the data will be used; however, we can specify which period of development patterns we want to begin including in the curve fitting process with fit_period.

By default, patterns are generated for 100 periods beyond the end of the triangle. Alternatively, extrap_periods specifies how far beyond the triangle the tail is projected before the age-to-age factor drops to 1.0.

Note that even though we can extrapolate the curve many years beyond the end of the triangle for computational purposes, the resulting development factors compress every ldf_ beyond one year into a single age-to-ultimate factor.
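The compression step can be pictured as a simple product. The sketch below is illustrative only: compress_tail is a hypothetical helper, not part of chainladder, and the factors are made up. It keeps the first year of extrapolated quarterly factors and folds everything after that into one age-to-ultimate factor.

```python
import math


def compress_tail(extrapolated_ldfs, periods_kept):
    """Keep the first `periods_kept` factors and fold the rest into one
    age-to-ultimate factor appended at the end."""
    kept = list(extrapolated_ldfs[:periods_kept])
    remainder = math.prod(extrapolated_ldfs[periods_kept:])
    return kept + [remainder]


# Made-up quarterly factors extrapolated 8 periods beyond the triangle.
extrapolated = [1.0004, 1.0003, 1.0003, 1.0002, 1.0002, 1.0001, 1.0001, 1.0001]

# Four quarters make up the one-year horizon; the rest becomes the tail.
compressed = compress_tail(extrapolated, periods_kept=4)
print(compressed)
```

The overall development to ultimate is unchanged by the compression, because the product of the compressed factors equals the product of all the extrapolated factors.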

quarterly["incurred"]

	3	6	9	12	15	18	21	24	27	30	...	108	111	114	117	120	123	126	129	132	135
1995	44.00	96.00	194.00	420.00	621.00	715.00	748.00	906.00	950.00	973.00	...	1,098.00	1,099.00	1,103.00	1,100.00	1,098.00	1,100.00	1,100.00	1,098.00	1,101.00	1,100.00
1996	42.00	136.00	202.00	365.00	541.00	651.00	817.00	988.00	1,052.00	1,122.00	...	1,300.00	1,300.00	1,302.00	1,300.00	1,303.00	1,300.00
1997	17.00	43.00	135.00	380.00	530.00	714.00	813.00	945.00	966.00	1,008.00	...	1,203.00	1,200.00
1998	10.00	43.00	107.00	238.00	393.00	574.00	732.00	894.00	935.00	967.00	...
1999	13.00	41.00	109.00	306.00	481.00	657.00	821.00	1,007.00	1,021.00	1,141.00	...
2000	2.00	29.00	88.00	254.00	380.00	501.00	615.00	735.00	788.00	842.00	...
2001	4.00	25.00	151.00	333.00	777.00	663.00	856.00	988.00	1,063.00	1,167.00	...
2002	2.00	34.00	115.00	290.00	472.00	809.00	1,054.00	1,543.00	1,617.00	1,505.00	...
2003	3.00	19.00	90.00	692.00	597.00	929.00	883.00	1,117.00	1,092.00	1,176.00	...
2004	4.00	38.00	138.00	371.00	583.00	756.00	902.00	1,111.00	1,212.00		...
2005	21.00	79.00	115.00	299.00	422.00						...
2006	13.00										...

cl.TailCurve(fit_period=(12, None), extrap_periods=50).fit(quarterly).ldf_["incurred"]

	3-6	6-9	9-12	12-15	15-18	18-21	21-24	24-27	27-30	30-33	...	120-123	123-126	126-129	129-132	132-135	135-138	138-141	141-144	144-147	147-150
(All)	3.5988	2.4768	2.7341	1.4683	1.2966	1.1825	1.2418	1.0451	1.0440	1.0365	...	0.9996	1.0000	0.9982	1.0027	0.9991	1.0003	1.0003	1.0002	1.0002	1.0012

cl.TailCurve(fit_period=(1, None), extrap_periods=50).fit(quarterly).ldf_["incurred"]

	3-6	6-9	9-12	12-15	15-18	18-21	21-24	24-27	27-30	30-33	...	120-123	123-126	126-129	129-132	132-135	135-138	138-141	141-144	144-147	147-150
(All)	3.5988	2.4768	2.7341	1.4683	1.2966	1.1825	1.2418	1.0451	1.0440	1.0365	...	0.9996	1.0000	0.9982	1.0027	0.9991	1.0002	1.0002	1.0001	1.0001	1.0006

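In the first example, fit_period=(12, None) excludes the earliest development patterns from the curve fit, and extrap_periods=50 lets the tail extrapolation run 50 quarters beyond the end of the triangle. The second example starts the fit earlier with fit_period=(1, None), which changes the extrapolated factors. Note that both fit_period and extrap_periods follow the development_grain of the triangle being fit.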

Chaining Multiple Transformers#

It is very common to estimate the development factors first and then apply a tail curve to extend the development pattern. chainladder transformers take Triangle objects as inputs, and their transform method also returns Triangle objects. To chain multiple transformers together, we must invoke the transform method on each transformer, similar to how sklearn approaches its own transformers.

print("First attempt:")
try:
    cl.TailCurve().fit(cl.Development().fit(quarterly))
    print("This passes.")
except Exception:
    print("This fails because we did not transform the triangle")

print("\nSecond attempt:")
try:
    cl.TailCurve().fit(cl.Development().fit_transform(quarterly))
    print("This passes because we transformed the triangle")
except Exception:
    print("This fails.")

First attempt:
This fails because we did not transform the triangle

Second attempt:
This passes because we transformed the triangle

We can also invoke the methods without chaining the operations together.

dev = cl.Development().fit_transform(quarterly)
tail = cl.TailCurve().fit(dev)
tail.cdf_["paid"]

	3-Ult	6-Ult	9-Ult	12-Ult	15-Ult	18-Ult	21-Ult	24-Ult	27-Ult	30-Ult	...	120-Ult	123-Ult	126-Ult	129-Ult	132-Ult	135-Ult	138-Ult	141-Ult	144-Ult	147-Ult
(All)	946.34	110.52	31.09	11.24	5.81	3.62	2.58	1.95	1.67	1.51	...	1.00	1.00	1.00	1.00	1.00	1.00	1.00	1.00	1.00	1.00
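The relationship between age-to-age factors and age-to-ultimate factors is a cumulative product taken from the tail backwards. The sketch below shows that arithmetic on made-up factors; ldf_to_cdf is a hypothetical helper for illustration, not a chainladder function.

```python
def ldf_to_cdf(ldfs):
    """Convert age-to-age factors (last entry is the tail factor) into
    age-to-ultimate factors by multiplying from the tail backwards."""
    cdfs = []
    running = 1.0
    for factor in reversed(ldfs):
        running *= factor
        cdfs.append(running)
    return list(reversed(cdfs))


# Made-up factors; the final 1.02 plays the role of the tail factor.
illustrative_ldfs = [2.0, 1.5, 1.1, 1.02]
cdfs = ldf_to_cdf(illustrative_ldfs)
print([round(c, 4) for c in cdfs])
```

Each age-to-ultimate factor is the product of its own age-to-age factor and every later one, which is why a small change in the tail factor flows through to every earlier age.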

Chaining multiple transformers together is a very common pattern in chainladder. As in sklearn, which inspired this design, we can create an overall estimator known as a Pipeline that combines multiple transformers and optional predictors in one estimator.

sequence = [
    ("simple_dev", cl.Development(average="simple")),
    ("inverse_power_tail", cl.TailCurve(curve="inverse_power")),
]

pipe = cl.Pipeline(steps=sequence).fit(quarterly)

Pipeline keeps a reference to each step in its named_steps attribute.

print(pipe.named_steps.simple_dev)
print(pipe.named_steps.inverse_power_tail)

Development(average='simple')
TailCurve(curve='inverse_power')

The Pipeline estimator is almost an exact replica of the sklearn Pipeline, and the sklearn documentation is very comprehensive. To learn more about Pipeline, refer to the sklearn docs.
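As a rough mental model, fitting a pipeline calls fit_transform on every step except the last and then fits the last step on the transformed data. The sketch below uses hypothetical toy classes (MiniPipeline, Scale, Total) on a plain list to show that flow. It is a simplification for illustration, not the chainladder or sklearn implementation.

```python
class Scale:
    """Toy transformer: multiplies every value by a fixed factor."""

    def __init__(self, factor):
        self.factor = factor

    def fit(self, X):
        self.n_seen_ = len(X)  # record something learned during fit
        return self

    def transform(self, X):
        return [x * self.factor for x in X]

    def fit_transform(self, X):
        return self.fit(X).transform(X)


class Total:
    """Toy final estimator: stores the sum of the data it was fitted on."""

    def fit(self, X):
        self.total_ = sum(X)
        return self


class MiniPipeline:
    """Hypothetical stripped-down pipeline for illustration only."""

    def __init__(self, steps):
        self.steps = steps
        self.named_steps = dict(steps)

    def fit(self, X):
        data = X
        # Every step except the last is fitted, then used to transform the data.
        for _, step in self.steps[:-1]:
            data = step.fit_transform(data)
        # The final step is fitted on the output of the earlier steps.
        self.steps[-1][1].fit(data)
        return self


mini = MiniPipeline([("double", Scale(2)), ("total", Total())]).fit([1, 2, 3])
print(mini.named_steps["total"].total_)
```

The final step only ever sees transformed data, which mirrors how the tail estimator in the pipeline above receives a triangle that already carries development patterns.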

With a Triangle transformed to include development patterns and tails, we are now ready to start fitting our suite of IBNR models.