Extending Development Patterns with Tails#
Getting Started#
This tutorial focuses on extending development patterns beyond the end of the triangle with tail factors.
Make sure your packages are up to date. For more information on how to update your packages, visit Keeping Packages Updated.
# Black linter, optional
%load_ext lab_black
import pandas as pd
import numpy as np
import chainladder as cl
import matplotlib.pyplot as plt
print("pandas: " + pd.__version__)
print("numpy: " + np.__version__)
print("chainladder: " + cl.__version__)
pandas: 2.1.4
numpy: 1.24.3
chainladder: 0.8.18
Disclaimer#
Note that many of the examples shown might not be applicable in a real-world scenario and are only meant to demonstrate some of the functionality included in the package. The user should always follow all applicable laws, the Code of Professional Conduct, and applicable Actuarial Standards of Practice, and exercise their best actuarial judgement.
Basic Tail Fitting#
Tails are another class of transformers. Similar to the Development estimator, they come with fit, transform and fit_transform methods. Also, like our Development estimator, you can define a tail in the absence of data or if you believe development will continue beyond your latest evaluation period.
quarterly = cl.load_sample("quarterly")
quarterly["paid"]
3 | 6 | 9 | 12 | 15 | 18 | 21 | 24 | 27 | 30 | ... | 108 | 111 | 114 | 117 | 120 | 123 | 126 | 129 | 132 | 135 | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1995 | 3.00 | 24.00 | 65.00 | 141.00 | 273.00 | 418.00 | 550.00 | 692.00 | 814.00 | 876.00 | ... | 1,099.00 | 1,099.00 | 1,100.00 | 1,098.00 | 1,098.00 | 1,098.00 | 1,099.00 | 1,099.00 | 1,100.00 | 1,100.00 |
1996 | 1.00 | 16.00 | 54.00 | 135.00 | 260.00 | 398.00 | 594.00 | 758.00 | 871.00 | 964.00 | ... | 1,296.00 | 1,296.00 | 1,297.00 | 1,298.00 | 1,298.00 | 1,298.00 | ||||
1997 | 1.00 | 17.00 | 55.00 | 166.00 | 296.00 | 442.00 | 587.00 | 701.00 | 811.00 | 891.00 | ... | 1,197.00 | 1,198.00 | ||||||||
1998 | 1.00 | 11.00 | 40.00 | 93.00 | 185.00 | 343.00 | 474.00 | 643.00 | 744.00 | 831.00 | ... | ||||||||||
1999 | 1.00 | 14.00 | 47.00 | 113.00 | 225.00 | 379.00 | 570.00 | 715.00 | 832.00 | 955.00 | ... | ||||||||||
2000 | 1.00 | 6.00 | 28.00 | 100.00 | 194.00 | 297.00 | 415.00 | 521.00 | 616.00 | 697.00 | ... | ||||||||||
2001 | 1.00 | 7.00 | 37.00 | 128.00 | 271.00 | 427.00 | 579.00 | 722.00 | 838.00 | 937.00 | ... | ||||||||||
2002 | 1.00 | 10.00 | 45.00 | 110.00 | 236.00 | 442.00 | 668.00 | 890.00 | 1,078.00 | 1,198.00 | ... | ||||||||||
2003 | 1.00 | 9.00 | 31.00 | 94.00 | 192.00 | 299.00 | 408.00 | 792.00 | 873.00 | 949.00 | ... | ||||||||||
2004 | 4.00 | 16.00 | 49.00 | 170.00 | 289.00 | 442.00 | 601.00 | 793.00 | 948.00 | ... | |||||||||||
2005 | 1.00 | 7.00 | 36.00 | 97.00 | 183.00 | ... | |||||||||||||||
2006 | 1.00 | ... |
Upon fitting data, we get updated ldf_ and cdf_ attributes that extend beyond the length of the triangle. Notice how the tail includes extra development periods (up to age 147) beyond the end of the triangle (age 135), at which point an age-to-ultimate tail factor is applied.
tail = cl.TailCurve()
tail.fit(quarterly)
print("Triangle latest", quarterly.development.max())
tail.ldf_["paid"]
Triangle latest 135
3-6 | 6-9 | 9-12 | 12-15 | 15-18 | 18-21 | 21-24 | 24-27 | 27-30 | 30-33 | ... | 120-123 | 123-126 | 126-129 | 129-132 | 132-135 | 135-138 | 138-141 | 141-144 | 144-147 | 147-150 | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
(All) | 8.5625 | 3.5547 | 2.7659 | 1.9332 | 1.6055 | 1.4011 | 1.3270 | 1.1658 | 1.1098 | 1.0780 | ... | 1.0000 | 1.0009 | 1.0000 | 1.0009 | 1.0000 | 1.0001 | 1.0001 | 1.0001 | 1.0001 | 1.0003 |
These extra twelve months (147 - 135, or one year) of development patterns are included because it is typical to track IBNR run-off over a one-year time horizon from the valuation date. The extension is currently fixed at one year and cannot be lengthened. However, a subsequent version of chainladder will look to address this issue.
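To make the relationship between ldf_ and cdf_ concrete, the sketch below rebuilds age-to-ultimate factors from a list of age-to-age factors by multiplying from the right. The numbers are the last five rounded paid LDFs from the output above (135-138 through the final tail entry). The helper name to_cdf is hypothetical and is not part of chainladder.

```python
from math import prod

# Last five rounded paid LDFs from the table above: four quarterly
# age-to-age factors (135-138 ... 144-147) followed by the tail factor.
ldfs = [1.0001, 1.0001, 1.0001, 1.0001, 1.0003]


def to_cdf(ldfs):
    """Return age-to-ultimate factors: each entry is the product of
    the age-to-age factor at that position and all later ones."""
    return [prod(ldfs[i:]) for i in range(len(ldfs))]


cdfs = to_cdf(ldfs)
for ldf, cdf in zip(ldfs, cdfs):
    print(f"ldf={ldf:.4f}  cdf={cdf:.6f}")
```

Because every factor here is at least 1.0, the resulting age-to-ultimate factors shrink toward the tail factor as the age increases.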
Curve Fitting#
Curve fitting takes selected development patterns and extrapolates them using either an exponential or inverse_power fit. In most cases, the inverse_power fit produces a thicker (more conservative) tail.
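As a rough illustration of the idea, the sketch below fits both curve shapes to a handful of made-up age-to-age factors using ordinary least squares on log(ldf - 1), then multiplies the extrapolated factors into a tail. The functional forms used here (1 + exp(a + b * age) for the exponential curve and 1 + exp(a) * age ** b for the inverse power curve) are an assumption about how such fits are commonly parameterized. The data and the helper names ols and fit_tail are hypothetical, and this is not chainladder's implementation.

```python
from math import exp, log, prod

# Hypothetical ages (in months) and selected age-to-age factors.
# Every factor must exceed 1.0 so that log(ldf - 1) is defined.
ages = [12, 24, 36, 48, 60]
ldfs = [1.5, 1.2, 1.08, 1.03, 1.012]


def ols(x, y):
    """Simple least-squares line y = intercept + slope * x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def fit_tail(ages, ldfs, curve, extrap_periods=100):
    """Regress log(ldf - 1) on age (exponential) or log(age)
    (inverse power), extrapolate, and multiply into a tail factor."""
    step = ages[-1] - ages[-2]
    transform = (lambda a: a) if curve == "exponential" else log
    y = [log(f - 1) for f in ldfs]
    intercept, slope = ols([transform(a) for a in ages], y)
    future_ages = [ages[-1] + step * k for k in range(1, extrap_periods + 1)]
    return prod(1 + exp(intercept + slope * transform(a)) for a in future_ages)


exp_tail = fit_tail(ages, ldfs, "exponential")
inv_tail = fit_tail(ages, ldfs, "inverse_power")
print(f"exponential tail:   {exp_tail:.5f}")
print(f"inverse power tail: {inv_tail:.5f}")
```

On this made-up data the inverse power fit decays more slowly than the exponential fit, so its tail factor comes out larger, which matches the "thicker tail" behaviour described above.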
exp = cl.TailCurve(curve="exponential").fit(quarterly["paid"])
exp.tail_
135-Ult | |
---|---|
(All) | 1.00065 |
inv = cl.TailCurve(curve="inverse_power").fit(quarterly["paid"])
inv.tail_
135-Ult | |
---|---|
(All) | 1.021283 |
When fitting a tail, all of the data is used by default; however, we can use fit_period to specify the development period from which patterns are included in the curve fit.
Patterns are also generated for 100 periods beyond the end of the triangle by default, or we can use extrap_periods to specify how far beyond the triangle to project the tail factor before dropping the age-to-age factor down to 1.0.
Note that even though we can extrapolate the curve many years beyond the end of the triangle for computational purposes, the resultant development factors compress all ldf_ beyond one year into a single age-to-ultimate factor.
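The compression described above can be sketched in a few lines. Assuming a quarterly grain, the first four extrapolated factors (one year) stay as separate age-to-age factors and everything after them is multiplied into a single age-to-ultimate factor. The decaying factors below are made up for illustration, and the variable names are hypothetical rather than anything exposed by chainladder.

```python
from math import isclose, prod

# Hypothetical extrapolated quarterly age-to-age factors, decaying toward 1.0.
extrap_periods = 50
extrapolated = [1 + 0.0004 * 0.9**k for k in range(extrap_periods)]

periods_per_year = 4  # quarterly development grain

# Keep one year of factors at the triangle's grain ...
one_year = extrapolated[:periods_per_year]
# ... and collapse everything beyond that into one age-to-ultimate factor.
beyond_one_year = prod(extrapolated[periods_per_year:])

reported = one_year + [beyond_one_year]
print([round(f, 6) for f in reported])

# The overall tail factor is unchanged by the compression.
assert isclose(prod(reported), prod(extrapolated))
```

The compression only changes how the factors are reported, not the overall tail factor.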
quarterly["incurred"]
3 | 6 | 9 | 12 | 15 | 18 | 21 | 24 | 27 | 30 | ... | 108 | 111 | 114 | 117 | 120 | 123 | 126 | 129 | 132 | 135 | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1995 | 44.00 | 96.00 | 194.00 | 420.00 | 621.00 | 715.00 | 748.00 | 906.00 | 950.00 | 973.00 | ... | 1,098.00 | 1,099.00 | 1,103.00 | 1,100.00 | 1,098.00 | 1,100.00 | 1,100.00 | 1,098.00 | 1,101.00 | 1,100.00 |
1996 | 42.00 | 136.00 | 202.00 | 365.00 | 541.00 | 651.00 | 817.00 | 988.00 | 1,052.00 | 1,122.00 | ... | 1,300.00 | 1,300.00 | 1,302.00 | 1,300.00 | 1,303.00 | 1,300.00 | ||||
1997 | 17.00 | 43.00 | 135.00 | 380.00 | 530.00 | 714.00 | 813.00 | 945.00 | 966.00 | 1,008.00 | ... | 1,203.00 | 1,200.00 | ||||||||
1998 | 10.00 | 43.00 | 107.00 | 238.00 | 393.00 | 574.00 | 732.00 | 894.00 | 935.00 | 967.00 | ... | ||||||||||
1999 | 13.00 | 41.00 | 109.00 | 306.00 | 481.00 | 657.00 | 821.00 | 1,007.00 | 1,021.00 | 1,141.00 | ... | ||||||||||
2000 | 2.00 | 29.00 | 88.00 | 254.00 | 380.00 | 501.00 | 615.00 | 735.00 | 788.00 | 842.00 | ... | ||||||||||
2001 | 4.00 | 25.00 | 151.00 | 333.00 | 777.00 | 663.00 | 856.00 | 988.00 | 1,063.00 | 1,167.00 | ... | ||||||||||
2002 | 2.00 | 34.00 | 115.00 | 290.00 | 472.00 | 809.00 | 1,054.00 | 1,543.00 | 1,617.00 | 1,505.00 | ... | ||||||||||
2003 | 3.00 | 19.00 | 90.00 | 692.00 | 597.00 | 929.00 | 883.00 | 1,117.00 | 1,092.00 | 1,176.00 | ... | ||||||||||
2004 | 4.00 | 38.00 | 138.00 | 371.00 | 583.00 | 756.00 | 902.00 | 1,111.00 | 1,212.00 | ... | |||||||||||
2005 | 21.00 | 79.00 | 115.00 | 299.00 | 422.00 | ... | |||||||||||||||
2006 | 13.00 | ... |
cl.TailCurve(fit_period=(12, None), extrap_periods=50).fit(quarterly).ldf_["incurred"]
3-6 | 6-9 | 9-12 | 12-15 | 15-18 | 18-21 | 21-24 | 24-27 | 27-30 | 30-33 | ... | 120-123 | 123-126 | 126-129 | 129-132 | 132-135 | 135-138 | 138-141 | 141-144 | 144-147 | 147-150 | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
(All) | 3.5988 | 2.4768 | 2.7341 | 1.4683 | 1.2966 | 1.1825 | 1.2418 | 1.0451 | 1.0440 | 1.0365 | ... | 0.9996 | 1.0000 | 0.9982 | 1.0027 | 0.9991 | 1.0003 | 1.0003 | 1.0002 | 1.0002 | 1.0012 |
cl.TailCurve(fit_period=(1, None), extrap_periods=50).fit(quarterly).ldf_["incurred"]
3-6 | 6-9 | 9-12 | 12-15 | 15-18 | 18-21 | 21-24 | 24-27 | 27-30 | 30-33 | ... | 120-123 | 123-126 | 126-129 | 129-132 | 132-135 | 135-138 | 138-141 | 141-144 | 144-147 | 147-150 | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
(All) | 3.5988 | 2.4768 | 2.7341 | 1.4683 | 1.2966 | 1.1825 | 1.2418 | 1.0451 | 1.0440 | 1.0365 | ... | 0.9996 | 1.0000 | 0.9982 | 1.0027 | 0.9991 | 1.0002 | 1.0002 | 1.0001 | 1.0001 | 1.0006 |
In this example, we ignore the first five development patterns for curve fitting, and we allow our tail extrapolation to go 50 quarters beyond the end of the triangle. Note that both fit_period and extrap_periods follow the development_grain of the triangle being fit.
Chaining Multiple Transformers#
It is very common to estimate the development factors first and then apply a tail curve to extend the development pattern. chainladder transformers take Triangle objects as inputs, and their transform method also returns Triangle objects. To chain multiple transformers together, we must invoke the transform method on each transformer, similar to how sklearn approaches its own transformers.
print("First attempt:")
try:
    # fit() returns the fitted estimator, not a Triangle
    cl.TailCurve().fit(cl.Development().fit(quarterly))
    print("This passes.")
except Exception:
    print("This fails because we did not transform the triangle")
print("\nSecond attempt:")
try:
    # fit_transform() returns a Triangle that TailCurve can fit
    cl.TailCurve().fit(cl.Development().fit_transform(quarterly))
    print("This passes because we transformed the triangle")
except Exception:
    print("This fails.")
First attempt:
This fails because we did not transform the triangle
Second attempt:
This passes because we transformed the triangle
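The reason the first attempt fails can be seen with a toy transformer that mimics the sklearn convention, where fit returns the fitted estimator itself and only transform (or fit_transform) returns data. The Doubler class below is purely hypothetical and has nothing to do with chainladder's internals.

```python
class Doubler:
    """Toy transformer following the sklearn-style contract."""

    def fit(self, X):
        # Learn something from X, then return the estimator, not the data.
        self.n_seen_ = len(X)
        return self

    def transform(self, X):
        # Return transformed data that the next step can consume.
        return [2 * x for x in X]

    def fit_transform(self, X):
        return self.fit(X).transform(X)


data = [1, 2, 3]

fitted = Doubler().fit(data)  # an estimator: not valid input for a next step
transformed = Doubler().fit_transform(data)  # data: safe to pass along

print(type(fitted).__name__)  # Doubler
print(transformed)  # [2, 4, 6]

# Chaining works because each step hands data to the next one.
chained = Doubler().fit_transform(Doubler().fit_transform(data))
print(chained)  # [4, 8, 12]
```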
We can also invoke the methods without chaining the operations together.
dev = cl.Development().fit_transform(quarterly)
tail = cl.TailCurve().fit(dev)
tail.cdf_["paid"]
3-Ult | 6-Ult | 9-Ult | 12-Ult | 15-Ult | 18-Ult | 21-Ult | 24-Ult | 27-Ult | 30-Ult | ... | 120-Ult | 123-Ult | 126-Ult | 129-Ult | 132-Ult | 135-Ult | 138-Ult | 141-Ult | 144-Ult | 147-Ult | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
(All) | 946.34 | 110.52 | 31.09 | 11.24 | 5.81 | 3.62 | 2.58 | 1.95 | 1.67 | 1.51 | ... | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 |
Chaining multiple transformers together is a very common pattern in chainladder. As in sklearn, its inspiration, we can create an overall estimator known as a Pipeline that combines multiple transformers and optional predictors in one estimator.
sequence = [
    ("simple_dev", cl.Development(average="simple")),
    ("inverse_power_tail", cl.TailCurve(curve="inverse_power")),
]
pipe = cl.Pipeline(steps=sequence).fit(quarterly)
Pipeline keeps references to each step in its named_steps attribute.
print(pipe.named_steps.simple_dev)
print(pipe.named_steps.inverse_power_tail)
Development(average='simple')
TailCurve(curve='inverse_power')
The Pipeline estimator is almost an exact replica of the sklearn Pipeline, and the docs for sklearn are very comprehensive. To learn more about Pipeline, refer to their docs.
With a Triangle transformed to include development patterns and tails, we are now ready to start fitting our suite of IBNR models.