Extending Development Patterns with Tails

Getting Started

This tutorial focuses on extending development patterns beyond the end of the triangle by fitting a tail.

Make sure your packages are up to date. For more information on how to update your packages, visit Keeping Packages Updated.

# Black linter, optional
%load_ext lab_black

import pandas as pd
import numpy as np
import chainladder as cl
import matplotlib.pyplot as plt

print("pandas: " + pd.__version__)
print("numpy: " + np.__version__)
print("chainladder: " + cl.__version__)
pandas: 2.1.4
numpy: 1.24.3
chainladder: 0.8.18

Disclaimer

Note that many of the examples shown might not be applicable in a real-world scenario and are only meant to demonstrate some of the functionality included in the package. The user should always follow all applicable laws, the Code of Professional Conduct, and applicable Actuarial Standards of Practice, and exercise their best actuarial judgement.

Basic Tail Fitting

Tails are another class of transformers. Similar to the Development estimator, they come with fit, transform and fit_transform methods. Also, as with the Development estimator, you can define a tail in the absence of data or if you believe development will continue beyond your latest evaluation period.

quarterly = cl.load_sample("quarterly")
quarterly["paid"]
3 6 9 12 15 18 21 24 27 30 ... 108 111 114 117 120 123 126 129 132 135
1995 3.00 24.00 65.00 141.00 273.00 418.00 550.00 692.00 814.00 876.00 ... 1,099.00 1,099.00 1,100.00 1,098.00 1,098.00 1,098.00 1,099.00 1,099.00 1,100.00 1,100.00
1996 1.00 16.00 54.00 135.00 260.00 398.00 594.00 758.00 871.00 964.00 ... 1,296.00 1,296.00 1,297.00 1,298.00 1,298.00 1,298.00
1997 1.00 17.00 55.00 166.00 296.00 442.00 587.00 701.00 811.00 891.00 ... 1,197.00 1,198.00
1998 1.00 11.00 40.00 93.00 185.00 343.00 474.00 643.00 744.00 831.00 ...
1999 1.00 14.00 47.00 113.00 225.00 379.00 570.00 715.00 832.00 955.00 ...
2000 1.00 6.00 28.00 100.00 194.00 297.00 415.00 521.00 616.00 697.00 ...
2001 1.00 7.00 37.00 128.00 271.00 427.00 579.00 722.00 838.00 937.00 ...
2002 1.00 10.00 45.00 110.00 236.00 442.00 668.00 890.00 1,078.00 1,198.00 ...
2003 1.00 9.00 31.00 94.00 192.00 299.00 408.00 792.00 873.00 949.00 ...
2004 4.00 16.00 49.00 170.00 289.00 442.00 601.00 793.00 948.00 ...
2005 1.00 7.00 36.00 97.00 183.00 ...
2006 1.00 ...

Upon fitting data, we get updated ldf_ and cdf_ attributes that extend beyond the length of the triangle. Notice how the tail includes extra development periods (age 147) beyond the end of the triangle (age 135) at which point an age-to-ultimate tail factor is applied.

# Fit once and reuse the fitted estimator
tail = cl.TailCurve().fit(quarterly)

print("Triangle latest", quarterly.development.max())
tail.ldf_["paid"]
Triangle latest 135
3-6 6-9 9-12 12-15 15-18 18-21 21-24 24-27 27-30 30-33 ... 120-123 123-126 126-129 129-132 132-135 135-138 138-141 141-144 144-147 147-150
(All) 8.5625 3.5547 2.7659 1.9332 1.6055 1.4011 1.3270 1.1658 1.1098 1.0780 ... 1.0000 1.0009 1.0000 1.0009 1.0000 1.0001 1.0001 1.0001 1.0001 1.0003

These extra twelve months (147 - 135, or one year) of development patterns are included because it is typical to track IBNR run-off over a one-year time horizon from the valuation date. The extension is currently fixed at one year and cannot be lengthened. However, a subsequent version of chainladder will look to address this limitation.
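The link between the individual tail factors and a single age-to-ultimate factor can be checked by hand. The sketch below multiplies the five tail factors printed above for the paid triangle, treating the last one as the age-to-ultimate factor applied at age 147. The values are copied from the rounded output, so the result is only approximate, and the variable names are illustrative rather than part of chainladder.

```python
import math

# Tail factors copied from the rounded ldf_ output above (ages 135 onward).
# The last entry is taken to be the age-to-ultimate factor applied at age 147.
tail_ldfs = [1.0001, 1.0001, 1.0001, 1.0001, 1.0003]

# An age-to-ultimate factor is the product of all later age-to-age factors.
cdf_135_to_ult = math.prod(tail_ldfs)

print(round(cdf_135_to_ult, 4))  # 1.0007
```

Because the inputs are rounded to four decimals, the product should be read as a rough consistency check rather than an exact reproduction of the fitted tail.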

Curve Fitting

Curve fitting takes selected development patterns and extrapolates them using either an exponential or inverse_power fit. In most cases, the inverse_power produces a thicker (more conservative) tail.

exp = cl.TailCurve(curve="exponential").fit(quarterly["paid"])
exp.tail_
135-Ult
(All) 1.00065
inv = cl.TailCurve(curve="inverse_power").fit(quarterly["paid"])
inv.tail_
135-Ult
(All) 1.021283
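To see why the two curve families can give such different tails, here is a minimal sketch of the log-linear regressions commonly associated with them: the exponential form treats ln(ldf - 1) as linear in the development period, while the inverse power form treats it as linear in the logarithm of the period. The factors below are hypothetical, the helper fit_line is written for this illustration, and it is an assumption that TailCurve follows this formulation in detail (it may, for example, apply weights or other adjustments).

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Hypothetical age-to-age factors for development periods 1..10 (not sample data).
ldfs = [2.0, 1.5, 1.25, 1.12, 1.06, 1.03, 1.015, 1.008, 1.004, 1.002]
periods = list(range(1, len(ldfs) + 1))
log_excess = [math.log(f - 1.0) for f in ldfs]

# Exponential: ln(ldf - 1) is linear in the period.
exp_a, exp_b = fit_line(periods, log_excess)
# Inverse power: ln(ldf - 1) is linear in ln(period).
inv_a, inv_b = fit_line([math.log(t) for t in periods], log_excess)

# Extrapolate 100 periods past the data and multiply the factors together.
future = range(len(ldfs) + 1, len(ldfs) + 101)
exp_tail = math.prod(1.0 + math.exp(exp_a + exp_b * t) for t in future)
inv_tail = math.prod(1.0 + math.exp(inv_a + inv_b * math.log(t)) for t in future)

print(f"exponential tail:   {exp_tail:.4f}")
print(f"inverse power tail: {inv_tail:.4f}")
```

On these made-up factors the inverse power curve decays more slowly than the exponential one, so its extrapolated factors stay above 1.0 for longer and the resulting tail is larger.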

When fitting a tail, by default, all of the data will be used; however, we can specify which period of development patterns we want to begin including in the curve fitting process with fit_period.

Patterns will also be generated for 100 periods beyond the end of the triangle by default, or we can use extrap_periods to specify how far beyond the triangle to project the tail before the age-to-age factor drops to 1.0.

Note that even though we can extrapolate the curve many years beyond the end of the triangle for computational purposes, the resultant development factors will compress all ldf_ beyond one year into a single age-to-ultimate factor.

quarterly["incurred"]
3 6 9 12 15 18 21 24 27 30 ... 108 111 114 117 120 123 126 129 132 135
1995 44.00 96.00 194.00 420.00 621.00 715.00 748.00 906.00 950.00 973.00 ... 1,098.00 1,099.00 1,103.00 1,100.00 1,098.00 1,100.00 1,100.00 1,098.00 1,101.00 1,100.00
1996 42.00 136.00 202.00 365.00 541.00 651.00 817.00 988.00 1,052.00 1,122.00 ... 1,300.00 1,300.00 1,302.00 1,300.00 1,303.00 1,300.00
1997 17.00 43.00 135.00 380.00 530.00 714.00 813.00 945.00 966.00 1,008.00 ... 1,203.00 1,200.00
1998 10.00 43.00 107.00 238.00 393.00 574.00 732.00 894.00 935.00 967.00 ...
1999 13.00 41.00 109.00 306.00 481.00 657.00 821.00 1,007.00 1,021.00 1,141.00 ...
2000 2.00 29.00 88.00 254.00 380.00 501.00 615.00 735.00 788.00 842.00 ...
2001 4.00 25.00 151.00 333.00 777.00 663.00 856.00 988.00 1,063.00 1,167.00 ...
2002 2.00 34.00 115.00 290.00 472.00 809.00 1,054.00 1,543.00 1,617.00 1,505.00 ...
2003 3.00 19.00 90.00 692.00 597.00 929.00 883.00 1,117.00 1,092.00 1,176.00 ...
2004 4.00 38.00 138.00 371.00 583.00 756.00 902.00 1,111.00 1,212.00 ...
2005 21.00 79.00 115.00 299.00 422.00 ...
2006 13.00 ...
cl.TailCurve(fit_period=(12, None), extrap_periods=50).fit(quarterly).ldf_["incurred"]
3-6 6-9 9-12 12-15 15-18 18-21 21-24 24-27 27-30 30-33 ... 120-123 123-126 126-129 129-132 132-135 135-138 138-141 141-144 144-147 147-150
(All) 3.5988 2.4768 2.7341 1.4683 1.2966 1.1825 1.2418 1.0451 1.0440 1.0365 ... 0.9996 1.0000 0.9982 1.0027 0.9991 1.0003 1.0003 1.0002 1.0002 1.0012
cl.TailCurve(fit_period=(1, None), extrap_periods=50).fit(quarterly).ldf_["incurred"]
3-6 6-9 9-12 12-15 15-18 18-21 21-24 24-27 27-30 30-33 ... 120-123 123-126 126-129 129-132 132-135 135-138 138-141 141-144 144-147 147-150
(All) 3.5988 2.4768 2.7341 1.4683 1.2966 1.1825 1.2418 1.0451 1.0440 1.0365 ... 0.9996 1.0000 0.9982 1.0027 0.9991 1.0002 1.0002 1.0001 1.0001 1.0006

In this example, we ignore the first five development patterns for curve fitting, and we allow our tail extrapolation to go 50 quarters beyond the end of the triangle. Note that both fit_period and extrap_periods follow the development_grain of the triangle being fit.
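The compression of extrapolated factors into a single age-to-ultimate factor can be sketched as follows. The curve parameters a and b are made up for illustration and do not come from the sample data; the point is only that keeping one year of quarterly factors and collapsing the remainder into one product leaves the overall tail unchanged.

```python
import math

# Hypothetical fitted exponential curve: ldf(t) = 1 + exp(a + b * t),
# where t counts quarterly development periods. These values are made up.
a, b = -3.0, -0.15
last_period = 45        # 135 months / 3 months per quarter
extrap_periods = 50     # how far past the triangle the curve is evaluated
periods_per_year = 4    # quarterly grain

# Age-to-age factors for every extrapolated period.
extrapolated = [
    1.0 + math.exp(a + b * t)
    for t in range(last_period + 1, last_period + extrap_periods + 1)
]

# Keep one year of individual factors; collapse the rest into one factor.
first_year = extrapolated[:periods_per_year]
age_to_ultimate = math.prod(extrapolated[periods_per_year:])

# The overall tail is the same whether or not the later factors are collapsed.
total_tail = math.prod(extrapolated)
print(len(first_year), math.isclose(math.prod(first_year) * age_to_ultimate, total_tail))
```

In this sketch, raising extrap_periods only lengthens the list that is collapsed into age_to_ultimate; the number of individually reported factors stays at one year's worth.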

Chaining Multiple Transformers

It is very common to estimate development factors first and then apply a tail curve to extend the development pattern. chainladder transformers take Triangle objects as inputs, and their transform method also returns a Triangle object, whereas fit returns the fitted estimator. To chain multiple transformers together, we must therefore invoke the transform method on each transformer, similar to how sklearn approaches its own transformers.

print("First attempt:")
try:
    # fit returns the fitted estimator, not a Triangle
    cl.TailCurve().fit(cl.Development().fit(quarterly))
    print("This passes.")
except Exception:
    print("This fails because we did not transform the triangle")

print("\nSecond attempt:")
try:
    # fit_transform returns a Triangle that the next transformer can consume
    cl.TailCurve().fit(cl.Development().fit_transform(quarterly))
    print("This passes because we transformed the triangle")
except Exception:
    print("This fails.")
First attempt:
This fails because we did not transform the triangle

Second attempt:
This passes because we transformed the triangle
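The reason the first attempt fails can be illustrated with a toy transformer that follows the same fit/transform convention. ToyDevelopment is a hypothetical stand-in written for this sketch, with a plain list playing the role of a Triangle; it is not how chainladder is implemented.

```python
class ToyDevelopment:
    """Toy stand-in for a transformer; not chainladder's implementation."""

    def fit(self, X):
        if not isinstance(X, list):
            raise TypeError("fit expects data (a list here), not an estimator")
        self.n_ = len(X)
        return self  # fit returns the fitted estimator, not data

    def transform(self, X):
        return list(X)  # transform returns data that the next step can consume

    def fit_transform(self, X):
        return self.fit(X).transform(X)


data = [1.0, 2.0, 3.0]  # placeholder for a Triangle

try:
    # The inner fit returns an estimator, which the outer fit cannot use as data.
    ToyDevelopment().fit(ToyDevelopment().fit(data))
    first_attempt = "passes"
except TypeError:
    first_attempt = "fails"

# The inner fit_transform returns data, so the outer fit succeeds.
ToyDevelopment().fit(ToyDevelopment().fit_transform(data))
second_attempt = "passes"

print(first_attempt, second_attempt)  # fails passes
```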

We can also invoke the methods without chaining the operations together.

dev = cl.Development().fit_transform(quarterly)
tail = cl.TailCurve().fit(dev)
tail.cdf_["paid"]
3-Ult 6-Ult 9-Ult 12-Ult 15-Ult 18-Ult 21-Ult 24-Ult 27-Ult 30-Ult ... 120-Ult 123-Ult 126-Ult 129-Ult 132-Ult 135-Ult 138-Ult 141-Ult 144-Ult 147-Ult
(All) 946.34 110.52 31.09 11.24 5.81 3.62 2.58 1.95 1.67 1.51 ... 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00

Chaining multiple transformers together is a very common pattern in chainladder. As in sklearn, which inspired chainladder's design, we can create an overall estimator known as a Pipeline that combines multiple transformers and optional predictors in one estimator.

sequence = [
    ("simple_dev", cl.Development(average="simple")),
    ("inverse_power_tail", cl.TailCurve(curve="inverse_power")),
]

pipe = cl.Pipeline(steps=sequence).fit(quarterly)

Pipeline keeps references to each step in its named_steps attribute.

print(pipe.named_steps.simple_dev)
print(pipe.named_steps.inverse_power_tail)
Development(average='simple')
TailCurve(curve='inverse_power')
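Conceptually, a pipeline fits each step on the transformed output of the previous step and keeps the steps accessible by name. The sketch below shows that idea with toy classes (AddOne, Double and ToyPipeline are all hypothetical names invented here); it is a simplified illustration and not the sklearn or chainladder implementation.

```python
class AddOne:
    """Toy transformer: adds 1 to every value."""

    def fit(self, X):
        return self

    def transform(self, X):
        return [x + 1 for x in X]


class Double:
    """Toy transformer: doubles every value."""

    def fit(self, X):
        return self

    def transform(self, X):
        return [2 * x for x in X]


class ToyPipeline:
    """Illustrative only; not the sklearn or chainladder implementation."""

    def __init__(self, steps):
        self.steps = steps
        # A plain dict here; the real named_steps also allows attribute access.
        self.named_steps = dict(steps)

    def fit(self, X):
        # Fit each step on the output of the previous step's transform.
        for _, step in self.steps[:-1]:
            X = step.fit(X).transform(X)
        self.steps[-1][1].fit(X)
        return self

    def transform(self, X):
        for _, step in self.steps:
            X = step.transform(X)
        return X


pipe = ToyPipeline([("add_one", AddOne()), ("double", Double())]).fit([1, 2])
result = pipe.transform([1, 2])
print(result)  # [4, 6]
```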

The Pipeline estimator is almost an exact replica of the sklearn Pipeline, and the sklearn docs are very comprehensive. To learn more about Pipeline, refer to the sklearn documentation.

With a Triangle transformed to include development patterns and tails, we are now ready to start fitting our suite of IBNR models.