Development Tutorial#
Getting Started#
This tutorial focuses on selecting the development factors.
Be sure your packages are up to date. For more information on how to update your packages, visit Keeping Packages Updated.
# Black linter, optional
%load_ext lab_black
import pandas as pd
import numpy as np
import chainladder as cl
print("pandas: " + pd.__version__)
print("numpy: " + np.__version__)
print("chainladder: " + cl.__version__)
pandas: 2.1.4
numpy: 1.24.3
chainladder: 0.8.18
Disclaimer#
Note that many of the examples shown may not be applicable in a real-world scenario and are only meant to demonstrate some of the functionality included in the package. The user should always follow all applicable laws, the Code of Professional Conduct, applicable Actuarial Standards of Practice, and exercise their best actuarial judgment.
Testing for Violation of Chain Ladder’s Assumptions#
The chain ladder method is based on strong assumptions of independence across origin periods and across valuation periods. Mack developed tests to verify whether these assumptions hold, and these tests have been implemented in the chainladder
package.
Before the chain ladder model can be used, we should verify that the data satisfies the underlying assumptions, using tests at the desired confidence level. If the assumptions are violated, we should consider whether ultimates can be estimated using other models.
There are two main tests that we need to perform:
The valuation_correlation
test: this tests the assumption of independence of accident years. In fact, it tests for correlation across calendar periods (diagonals) and, by extension, origin periods (rows). An additional parameter, total
, can be passed, depending on whether we want to calculate the valuation correlation in total across all origins (True) or for each origin separately (False). The test uses a Z-statistic.
The development_correlation
test: this tests the chain ladder assumption that subsequent development factors (columns) are not correlated. The test uses a T-statistic.
raa = cl.load_sample("raa")
print(
"Are valuation years correlated? Or, are the origins correlated?",
raa.valuation_correlation(p_critical=0.1, total=True).z_critical.values,
)
print(
"Are development periods correlated?",
raa.development_correlation(p_critical=0.5).t_critical.values,
)
Are valuation years correlated? Or, are the origins correlated? [[False]]
Are development periods correlated? [[ True]]
/home/docs/checkouts/readthedocs.org/user_builds/chainladder-python/conda/latest/lib/python3.11/site-packages/numpy/lib/nanfunctions.py:1217: RuntimeWarning: All-NaN slice encountered
return function_base._ureduce(a, func=_nanmedian, keepdims=keepdims,
The above tests show that the raa
triangle is independent in both cases, suggesting there is no evidence that the chain ladder model is an inappropriate method for developing the ultimate amounts. We suggest reviewing Mack’s papers to ensure a proper understanding of the methodology and of the choice of p_critical
.
Mack also demonstrated that we can test for valuation years’ correlation. To test for each valuation year’s correlation individually, we set total
to False
.
raa.valuation_correlation(p_critical=0.1, total=False).z_critical
/home/docs/checkouts/readthedocs.org/user_builds/chainladder-python/conda/latest/lib/python3.11/site-packages/numpy/lib/nanfunctions.py:1217: RuntimeWarning: All-NaN slice encountered
return function_base._ureduce(a, func=_nanmedian, keepdims=keepdims,
1982 | 1983 | 1984 | 1985 | 1986 | 1987 | 1988 | 1989 | 1990 | |
---|---|---|---|---|---|---|---|---|---|
1981 | False | False | False | False | False | False | False | False | False |
Note that the tests are run on all 4 dimensions of the triangle.
Estimator Basics#
All development methods follow the sklearn
estimator API. These estimators have a few properties that are worth getting used to.
We instantiate the estimator with our choice of assumptions. If we don’t opt for any, defaults are chosen for us.
At this point we’ve chosen an estimator and assumptions (even if default), but we have not yet shown our estimator a Triangle
. It is merely a set of instructions on how to fit development patterns; no patterns exist yet.
All estimators have a fit
method and you can pass a triangle to your estimator. Let’s fit
a Triangle
in a Development
estimator. Let’s also assign the estimator to a variable so we can reference attributes about it.
genins = cl.load_sample("genins")
dev = cl.Development().fit(genins)
Now that we have fit
a Development
estimator, it has many additional properties that didn’t exist before fitting. For example,
we can view the ldf_:
dev.ldf_
12-24 | 24-36 | 36-48 | 48-60 | 60-72 | 72-84 | 84-96 | 96-108 | 108-120 | |
---|---|---|---|---|---|---|---|---|---|
(All) | 3.4906 | 1.7473 | 1.4574 | 1.1739 | 1.1038 | 1.0863 | 1.0539 | 1.0766 | 1.0177 |
We can view the cdf_:
dev.cdf_
12-Ult | 24-Ult | 36-Ult | 48-Ult | 60-Ult | 72-Ult | 84-Ult | 96-Ult | 108-Ult | |
---|---|---|---|---|---|---|---|---|---|
(All) | 14.4466 | 4.1387 | 2.3686 | 1.6252 | 1.3845 | 1.2543 | 1.1547 | 1.0956 | 1.0177 |
We can also convert between LDFs and CDFs using incr_to_cum() and cum_to_incr() similar to triangles.
dev.ldf_.incr_to_cum()
12-Ult | 24-Ult | 36-Ult | 48-Ult | 60-Ult | 72-Ult | 84-Ult | 96-Ult | 108-Ult | |
---|---|---|---|---|---|---|---|---|---|
(All) | 14.4466 | 4.1387 | 2.3686 | 1.6252 | 1.3845 | 1.2543 | 1.1547 | 1.0956 | 1.0177 |
dev.cdf_.cum_to_incr()
12-24 | 24-36 | 36-48 | 48-60 | 60-72 | 72-84 | 84-96 | 96-108 | 108-120 | |
---|---|---|---|---|---|---|---|---|---|
(All) | 3.4906 | 1.7473 | 1.4574 | 1.1739 | 1.1038 | 1.0863 | 1.0539 | 1.0766 | 1.0177 |
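As a hand-check of the relationship between the two (using the factor values printed above, so small rounding differences are expected), each CDF is simply the cumulative product of the LDFs from that age through the end of the triangle:

```python
# LDFs for genins as printed above (rounded to 6 decimals).
ldfs = [3.490607, 1.747333, 1.457413, 1.173852, 1.103824,
        1.086269, 1.053874, 1.076555, 1.017725]

# Multiply from the oldest age backwards: 108-Ult first, then 96-Ult, etc.
cdfs = []
running = 1.0
for f in reversed(ldfs):
    running *= f
    cdfs.append(running)
cdfs = cdfs[::-1]  # reorder to 12-Ult ... 108-Ult

print([round(c, 4) for c in cdfs])  # matches dev.cdf_ above up to rounding
```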
Notice these attributes have a trailing underscore (_
). This is scikit-learn’s API convention, as its documentation states, “attributes that have been estimated from the data must always have a name ending with trailing underscore, for example the coefficients of some regression estimator would be stored in a coef_
attribute after fit
has been called.” In summary, the trailing underscore in class attributes is a scikit-learn convention denoting that the attributes have been estimated from the data, i.e., that they are fitted attributes.
print("Assumption parameter (no underscore):", dev.average)
print("Estimated parameter (underscore):\n", dev.ldf_)
Assumption parameter (no underscore): volume
Estimated parameter (underscore):
12-24 24-36 36-48 48-60 60-72 72-84 84-96 96-108 108-120
(All) 3.490607 1.747333 1.457413 1.173852 1.103824 1.086269 1.053874 1.076555 1.017725
Development Averaging#
Now that we have a grounding in triangle manipulation and the basics of estimators, we can start getting more creative with customizing our development factors.
The basic Development
estimator uses a weighted regression through the origin for estimating parameters. Mack showed that using weighted regressions allows for:
volume-weighted average development patterns
simple average development factors
OLS regression estimates of development factors, where the regression equation is Y = mX + 0
While he posited this framework to suggest the MackChainladder
stochastic method, it is an elegant form even for deterministic development pattern selection.
genins = cl.load_sample("genins")
genins
12 | 24 | 36 | 48 | 60 | 72 | 84 | 96 | 108 | 120 | |
---|---|---|---|---|---|---|---|---|---|---|
2001 | 357,848 | 1,124,788 | 1,735,330 | 2,218,270 | 2,745,596 | 3,319,994 | 3,466,336 | 3,606,286 | 3,833,515 | 3,901,463 |
2002 | 352,118 | 1,236,139 | 2,170,033 | 3,353,322 | 3,799,067 | 4,120,063 | 4,647,867 | 4,914,039 | 5,339,085 | |
2003 | 290,507 | 1,292,306 | 2,218,525 | 3,235,179 | 3,985,995 | 4,132,918 | 4,628,910 | 4,909,315 | ||
2004 | 310,608 | 1,418,858 | 2,195,047 | 3,757,447 | 4,029,929 | 4,381,982 | 4,588,268 | |||
2005 | 443,160 | 1,136,350 | 2,128,333 | 2,897,821 | 3,402,672 | 3,873,311 | ||||
2006 | 396,132 | 1,333,217 | 2,180,715 | 2,985,752 | 3,691,712 | |||||
2007 | 440,832 | 1,288,463 | 2,419,861 | 3,483,130 | ||||||
2008 | 359,480 | 1,421,128 | 2,864,498 | |||||||
2009 | 376,686 | 1,363,294 | ||||||||
2010 | 344,014 |
We can also print the age_to_age
factors.
genins.age_to_age
12-24 | 24-36 | 36-48 | 48-60 | 60-72 | 72-84 | 84-96 | 96-108 | 108-120 | |
---|---|---|---|---|---|---|---|---|---|
2001 | 3.1432 | 1.5428 | 1.2783 | 1.2377 | 1.2092 | 1.0441 | 1.0404 | 1.0630 | 1.0177 |
2002 | 3.5106 | 1.7555 | 1.5453 | 1.1329 | 1.0845 | 1.1281 | 1.0573 | 1.0865 | |
2003 | 4.4485 | 1.7167 | 1.4583 | 1.2321 | 1.0369 | 1.1200 | 1.0606 | ||
2004 | 4.5680 | 1.5471 | 1.7118 | 1.0725 | 1.0874 | 1.0471 | |||
2005 | 2.5642 | 1.8730 | 1.3615 | 1.1742 | 1.1383 | ||||
2006 | 3.3656 | 1.6357 | 1.3692 | 1.2364 | |||||
2007 | 2.9228 | 1.8781 | 1.4394 | ||||||
2008 | 3.9533 | 2.0157 | |||||||
2009 | 3.6192 |
And color-code with heatmap()
.
genins.age_to_age.heatmap()
12-24 | 24-36 | 36-48 | 48-60 | 60-72 | 72-84 | 84-96 | 96-108 | 108-120 | |
---|---|---|---|---|---|---|---|---|---|
2001 | 3.1432 | 1.5428 | 1.2783 | 1.2377 | 1.2092 | 1.0441 | 1.0404 | 1.0630 | 1.0177 |
2002 | 3.5106 | 1.7555 | 1.5453 | 1.1329 | 1.0845 | 1.1281 | 1.0573 | 1.0865 | |
2003 | 4.4485 | 1.7167 | 1.4583 | 1.2321 | 1.0369 | 1.1200 | 1.0606 | ||
2004 | 4.5680 | 1.5471 | 1.7118 | 1.0725 | 1.0874 | 1.0471 | |||
2005 | 2.5642 | 1.8730 | 1.3615 | 1.1742 | 1.1383 | ||||
2006 | 3.3656 | 1.6357 | 1.3692 | 1.2364 | |||||
2007 | 2.9228 | 1.8781 | 1.4394 | ||||||
2008 | 3.9533 | 2.0157 | |||||||
2009 | 3.6192 |
vol = cl.Development(average="volume").fit(genins).ldf_
vol
12-24 | 24-36 | 36-48 | 48-60 | 60-72 | 72-84 | 84-96 | 96-108 | 108-120 | |
---|---|---|---|---|---|---|---|---|---|
(All) | 3.4906 | 1.7473 | 1.4574 | 1.1739 | 1.1038 | 1.0863 | 1.0539 | 1.0766 | 1.0177 |
sim = cl.Development(average="simple").fit(genins).ldf_
sim
12-24 | 24-36 | 36-48 | 48-60 | 60-72 | 72-84 | 84-96 | 96-108 | 108-120 | |
---|---|---|---|---|---|---|---|---|---|
(All) | 3.5661 | 1.7456 | 1.4520 | 1.1810 | 1.1112 | 1.0848 | 1.0527 | 1.0748 | 1.0177 |
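To see what these averages mean, here is a minimal hand-check of the 12-24 factor against the genins values shown above: the volume-weighted factor is the ratio of the column sums, while the simple average is the unweighted mean of the individual link ratios. Mack’s OLS-through-the-origin slope is included for comparison; like the other two, it is just another weighted average of the same link ratios.

```python
# Losses at ages 12 and 24 for origins 2001-2009, from the genins triangle above.
age_12 = [357848, 352118, 290507, 310608, 443160, 396132, 440832, 359480, 376686]
age_24 = [1124788, 1236139, 1292306, 1418858, 1136350, 1333217, 1288463, 1421128, 1363294]

# Volume-weighted: ratio of the column sums.
volume_ldf = sum(age_24) / sum(age_12)

# Simple: unweighted mean of the individual link ratios.
ratios = [n / p for n, p in zip(age_24, age_12)]
simple_ldf = sum(ratios) / len(ratios)

# OLS through the origin: m = sum(X*Y) / sum(X^2), which weights each
# link ratio by X^2, so it must lie between the smallest and largest ratio.
ols_ldf = sum(p * n for p, n in zip(age_12, age_24)) / sum(p * p for p in age_12)

print(round(volume_ldf, 4))  # 3.4906
print(round(simple_ldf, 4))  # 3.5661
```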
In most cases, estimator attributes are Triangle
s themselves and can be manipulated just like raw triangles.
print("LDF Type: ", type(vol))
print("Difference between volume and simple average:")
vol - sim
LDF Type: <class 'chainladder.core.triangle.Triangle'>
Difference between volume and simple average:
12-24 | 24-36 | 36-48 | 48-60 | 60-72 | 72-84 | 84-96 | 96-108 | 108-120 | |
---|---|---|---|---|---|---|---|---|---|
(All) | -0.0755 | 0.0018 | 0.0055 | -0.0071 | -0.0074 | 0.0015 | 0.0011 | 0.0018 |
We can specify how the LDFs are averaged independently for each age-to-age period. For example, we can use volume
averaging on the first pattern, simple
the second, regression
the third, and then repeat the cycle three times for the 9 age-to-age factors that we need. Note that the array of selected methods must have the same length as the number of age-to-age factors.
cl.Development(average=["volume", "simple", "regression"] * 3).fit(genins).ldf_
12-24 | 24-36 | 36-48 | 48-60 | 60-72 | 72-84 | 84-96 | 96-108 | 108-120 | |
---|---|---|---|---|---|---|---|---|---|
(All) | 3.4906 | 1.7456 | 1.4619 | 1.1739 | 1.1112 | 1.0873 | 1.0539 | 1.0748 | 1.0177 |
As another example, we can use volume
-weighting for the first factor, simple
-weighting for the next 5 factors, and volume
-weighting for the last 3 factors.
cl.Development(average=["volume"] + ["simple"] * 5 + ["volume"] * 3).fit(genins).ldf_
12-24 | 24-36 | 36-48 | 48-60 | 60-72 | 72-84 | 84-96 | 96-108 | 108-120 | |
---|---|---|---|---|---|---|---|---|---|
(All) | 3.4906 | 1.7456 | 1.4520 | 1.1810 | 1.1112 | 1.0848 | 1.0539 | 1.0766 | 1.0177 |
Averaging Period#
Development
comes with an n_periods
parameter that allows you to select the latest n
origin periods for fitting your development patterns. n_periods=-1
is used to indicate the usage of all available periods, which is also the default if the parameter is not specified. The units of n_periods
follow the origin_grain
of the underlying triangle.
cl.Development().fit(genins).ldf_
12-24 | 24-36 | 36-48 | 48-60 | 60-72 | 72-84 | 84-96 | 96-108 | 108-120 | |
---|---|---|---|---|---|---|---|---|---|
(All) | 3.4906 | 1.7473 | 1.4574 | 1.1739 | 1.1038 | 1.0863 | 1.0539 | 1.0766 | 1.0177 |
cl.Development(n_periods=-1).fit(genins).ldf_
12-24 | 24-36 | 36-48 | 48-60 | 60-72 | 72-84 | 84-96 | 96-108 | 108-120 | |
---|---|---|---|---|---|---|---|---|---|
(All) | 3.4906 | 1.7473 | 1.4574 | 1.1739 | 1.1038 | 1.0863 | 1.0539 | 1.0766 | 1.0177 |
cl.Development(n_periods=3).fit(genins).ldf_
12-24 | 24-36 | 36-48 | 48-60 | 60-72 | 72-84 | 84-96 | 96-108 | 108-120 | |
---|---|---|---|---|---|---|---|---|---|
(All) | 3.4604 | 1.8465 | 1.3920 | 1.1539 | 1.0849 | 1.0974 | 1.0539 | 1.0766 | 1.0177 |
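As a hand-check of what n_periods=3 does, the 12-24 factor above can be reproduced by volume-weighting only the latest three observed link ratios (origins 2007-2009 in the genins triangle):

```python
# Latest three origins with both ages 12 and 24 observed (2007-2009).
age_12 = [440832, 359480, 376686]
age_24 = [1288463, 1421128, 1363294]

ldf_12_24 = sum(age_24) / sum(age_12)
print(round(ldf_12_24, 4))  # 3.4604, matching the 12-24 factor above
```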
Much like average
, n_periods
can also be set for each age-to-age individually.
cl.Development(n_periods=[8, 2, 6, 5, -1, 2, -1, -1, 5]).fit(genins).ldf_
12-24 | 24-36 | 36-48 | 48-60 | 60-72 | 72-84 | 84-96 | 96-108 | 108-120 | |
---|---|---|---|---|---|---|---|---|---|
(All) | 3.5325 | 1.9502 | 1.4808 | 1.1651 | 1.1038 | 1.0825 | 1.0539 | 1.0766 | 1.0177 |
Note that if we provide an n_periods
greater than what is available for any particular age-to-age period, all available periods will be used instead.
cl.Development(n_periods=[1, 2, 3, 4, 5, 6, 7, 8, 9]).fit(
genins
).ldf_ == cl.Development(n_periods=[1, 2, 3, 4, 5, 4, 3, 2, 1]).fit(genins).ldf_
True
Discarding Problematic Link Ratios#
Even with n_periods
, there are situations where you might want to be more surgical in your selections. For example, you could have a valuation period with bad data and wish to omit the entire diagonal from your averaging.
cl.Development(drop_valuation="2004").fit(genins).ldf_
12-24 | 24-36 | 36-48 | 48-60 | 60-72 | 72-84 | 84-96 | 96-108 | 108-120 | |
---|---|---|---|---|---|---|---|---|---|
(All) | 3.3797 | 1.7517 | 1.4426 | 1.1651 | 1.1038 | 1.0863 | 1.0539 | 1.0766 | 1.0177 |
We can also do Olympic averaging (i.e., excluding the high and low factor from each period).
cl.Development(drop_high=True, drop_low=True).fit(genins).ldf_
/home/docs/checkouts/readthedocs.org/user_builds/chainladder-python/conda/latest/lib/python3.11/site-packages/chainladder/development/base.py:523: UserWarning: Some exclusions have been ignored. At least 1 (use preserve = ...) link ratio(s) is required for development estimation.
warnings.warn(warning)
/home/docs/checkouts/readthedocs.org/user_builds/chainladder-python/conda/latest/lib/python3.11/site-packages/chainladder/development/base.py:204: UserWarning: Some exclusions have been ignored. At least 1 (use preserve = ...) link ratio(s) is required for development estimation.
warnings.warn(warning)
12-24 | 24-36 | 36-48 | 48-60 | 60-72 | 72-84 | 84-96 | 96-108 | 108-120 | |
---|---|---|---|---|---|---|---|---|---|
(All) | 3.5201 | 1.7277 | 1.4351 | 1.1930 | 1.1018 | 1.0825 | 1.0573 | 1.0766 | 1.0177 |
The parameter also accepts integers. For example, we can drop the highest 3 factors from each period.
cl.Development(drop_high=3).fit(genins).ldf_
/home/docs/checkouts/readthedocs.org/user_builds/chainladder-python/conda/latest/lib/python3.11/site-packages/chainladder/development/base.py:523: UserWarning: Some exclusions have been ignored. At least 1 (use preserve = ...) link ratio(s) is required for development estimation.
warnings.warn(warning)
/home/docs/checkouts/readthedocs.org/user_builds/chainladder-python/conda/latest/lib/python3.11/site-packages/chainladder/development/base.py:204: UserWarning: Some exclusions have been ignored. At least 1 (use preserve = ...) link ratio(s) is required for development estimation.
warnings.warn(warning)
12-24 | 24-36 | 36-48 | 48-60 | 60-72 | 72-84 | 84-96 | 96-108 | 108-120 | |
---|---|---|---|---|---|---|---|---|---|
(All) | 3.1614 | 1.6392 | 1.3687 | 1.1222 | 1.0601 | 1.0441 | 1.0539 | 1.0766 | 1.0177 |
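As a hand-check for the 12-24 column, drop_high=3 removes the three largest link ratios (4.5680, 4.4485, and 3.9533, from origins 2004, 2003, and 2008) and volume-weights the remaining origins:

```python
# Origins 2001, 2002, 2005, 2006, 2007, and 2009 remain after dropping
# the three highest 12-24 link ratios.
age_12 = [357848, 352118, 443160, 396132, 440832, 376686]
age_24 = [1124788, 1236139, 1136350, 1333217, 1288463, 1363294]

ldf_12_24 = sum(age_24) / sum(age_12)
print(round(ldf_12_24, 4))  # 3.1614, matching the 12-24 factor above
```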
There’s also a preserve
parameter, which lets us specify the minimum number of LDFs required for the calculation. If this minimum is not met, drop_high
and drop_low
for that age will be ignored. This is especially useful in the tail, where the data is thin.
cl.Development(drop_high=3, drop_low=2, preserve=2).fit(genins).ldf_
/home/docs/checkouts/readthedocs.org/user_builds/chainladder-python/conda/latest/lib/python3.11/site-packages/chainladder/development/base.py:523: UserWarning: Some exclusions have been ignored. At least 2 link ratio(s) is required for development estimation.
warnings.warn(warning)
/home/docs/checkouts/readthedocs.org/user_builds/chainladder-python/conda/latest/lib/python3.11/site-packages/chainladder/development/base.py:204: UserWarning: Some exclusions have been ignored. At least 2 link ratio(s) is required for development estimation.
warnings.warn(warning)
12-24 | 24-36 | 36-48 | 48-60 | 60-72 | 72-84 | 84-96 | 96-108 | 108-120 | |
---|---|---|---|---|---|---|---|---|---|
(All) | 3.4108 | 1.7012 | 1.4061 | 1.1739 | 1.1038 | 1.0863 | 1.0539 | 1.0766 | 1.0177 |
We can also use an array of booleans or integers.
cl.Development(drop_high=[True, True, False, True], drop_low=[1, 2, 0, 3]).fit(
genins
).ldf_
12-24 | 24-36 | 36-48 | 48-60 | 60-72 | 72-84 | 84-96 | 96-108 | 108-120 | |
---|---|---|---|---|---|---|---|---|---|
(All) | 3.5201 | 1.7685 | 1.4574 | 1.2342 | 1.1038 | 1.0863 | 1.0539 | 1.0766 | 1.0177 |
Or maybe there is just a single outlier link ratio that you don’t think is indicative of future development. For these cases, you can specify the intersection of the origin period and the development age of the denominator of the link ratio to drop
.
genins.age_to_age.heatmap()
12-24 | 24-36 | 36-48 | 48-60 | 60-72 | 72-84 | 84-96 | 96-108 | 108-120 | |
---|---|---|---|---|---|---|---|---|---|
2001 | 3.1432 | 1.5428 | 1.2783 | 1.2377 | 1.2092 | 1.0441 | 1.0404 | 1.0630 | 1.0177 |
2002 | 3.5106 | 1.7555 | 1.5453 | 1.1329 | 1.0845 | 1.1281 | 1.0573 | 1.0865 | |
2003 | 4.4485 | 1.7167 | 1.4583 | 1.2321 | 1.0369 | 1.1200 | 1.0606 | ||
2004 | 4.5680 | 1.5471 | 1.7118 | 1.0725 | 1.0874 | 1.0471 | |||
2005 | 2.5642 | 1.8730 | 1.3615 | 1.1742 | 1.1383 | ||||
2006 | 3.3656 | 1.6357 | 1.3692 | 1.2364 | |||||
2007 | 2.9228 | 1.8781 | 1.4394 | ||||||
2008 | 3.9533 | 2.0157 | |||||||
2009 | 3.6192 |
Let’s say we believe the 4.5680 factor from origin 2004 between ages 12 and 24 should be dropped; we can use drop=('2004', 12)
.
cl.Development(drop=("2004", 12)).fit(genins).ldf_
12-24 | 24-36 | 36-48 | 48-60 | 60-72 | 72-84 | 84-96 | 96-108 | 108-120 | |
---|---|---|---|---|---|---|---|---|---|
(All) | 3.3797 | 1.7473 | 1.4574 | 1.1739 | 1.1038 | 1.0863 | 1.0539 | 1.0766 | 1.0177 |
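As a hand-check, dropping the 2004 link ratio means the 12-24 factor becomes the volume-weighted average over the remaining eight origins:

```python
# All origins at ages 12 and 24 except 2004 (310608 -> 1418858 dropped).
age_12 = [357848, 352118, 290507, 443160, 396132, 440832, 359480, 376686]
age_24 = [1124788, 1236139, 1292306, 1136350, 1333217, 1288463, 1421128, 1363294]

ldf_12_24 = sum(age_24) / sum(age_12)
print(round(ldf_12_24, 4))  # 3.3797, matching the 12-24 factor above
```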
If there is more than one outlier, you can pass a list of tuples to the drop
argument.
cl.Development(drop=[("2004", 12), ("2008", 24)]).fit(genins).ldf_
12-24 | 24-36 | 36-48 | 48-60 | 60-72 | 72-84 | 84-96 | 96-108 | 108-120 | |
---|---|---|---|---|---|---|---|---|---|
(All) | 3.3797 | 1.7041 | 1.4574 | 1.1739 | 1.1038 | 1.0863 | 1.0539 | 1.0766 | 1.0177 |
Transformers#
In sklearn
, there are two types of estimators: transformers and predictors. A transformer transforms the input data (X) in some way, while a predictor predicts new values (Y) from the input data X.
Development
is a transformer: the returned object is a means to create development patterns, which are used to estimate ultimates, but it is not itself a reserving model (predictor).
Transformers come with the transform
and fit_transform
methods. These return a Triangle
object, but augment it with additional information for use in a subsequent IBNR model (a predictor). drop_high
(and drop_low
) can also take an array of booleans indicating, for each LDF calculation, whether the highest (lowest) factor should be dropped.
transformed_triangle = cl.Development(drop_high=[True] * 4 + [False] * 5).fit_transform(
genins
)
transformed_triangle
12 | 24 | 36 | 48 | 60 | 72 | 84 | 96 | 108 | 120 | |
---|---|---|---|---|---|---|---|---|---|---|
2001 | 357,848 | 1,124,788 | 1,735,330 | 2,218,270 | 2,745,596 | 3,319,994 | 3,466,336 | 3,606,286 | 3,833,515 | 3,901,463 |
2002 | 352,118 | 1,236,139 | 2,170,033 | 3,353,322 | 3,799,067 | 4,120,063 | 4,647,867 | 4,914,039 | 5,339,085 | |
2003 | 290,507 | 1,292,306 | 2,218,525 | 3,235,179 | 3,985,995 | 4,132,918 | 4,628,910 | 4,909,315 | ||
2004 | 310,608 | 1,418,858 | 2,195,047 | 3,757,447 | 4,029,929 | 4,381,982 | 4,588,268 | |||
2005 | 443,160 | 1,136,350 | 2,128,333 | 2,897,821 | 3,402,672 | 3,873,311 | ||||
2006 | 396,132 | 1,333,217 | 2,180,715 | 2,985,752 | 3,691,712 | |||||
2007 | 440,832 | 1,288,463 | 2,419,861 | 3,483,130 | ||||||
2008 | 359,480 | 1,421,128 | 2,864,498 | |||||||
2009 | 376,686 | 1,363,294 | ||||||||
2010 | 344,014 |
transformed_triangle.link_ratio
12-24 | 24-36 | 36-48 | 48-60 | 60-72 | 72-84 | 84-96 | 96-108 | 108-120 | |
---|---|---|---|---|---|---|---|---|---|
2001 | 3.1432 | 1.5428 | 1.2783 | 1.2092 | 1.0441 | 1.0404 | 1.0630 | 1.0177 | |
2002 | 3.5106 | 1.7555 | 1.5453 | 1.1329 | 1.0845 | 1.1281 | 1.0573 | 1.0865 | |
2003 | 4.4485 | 1.7167 | 1.4583 | 1.2321 | 1.0369 | 1.1200 | 1.0606 | ||
2004 | 1.5471 | 1.0725 | 1.0874 | 1.0471 | |||||
2005 | 2.5642 | 1.8730 | 1.3615 | 1.1742 | 1.1383 | ||||
2006 | 3.3656 | 1.6357 | 1.3692 | 1.2364 | |||||
2007 | 2.9228 | 1.8781 | 1.4394 | ||||||
2008 | 3.9533 | ||||||||
2009 | 3.6192 |
Our transformed triangle behaves like our original genins
triangle. However, notice that the link ratios exclude any dropped values we specified.
transformed_triangle.link_ratio.heatmap()
12-24 | 24-36 | 36-48 | 48-60 | 60-72 | 72-84 | 84-96 | 96-108 | 108-120 | |
---|---|---|---|---|---|---|---|---|---|
2001 | 3.1432 | 1.5428 | 1.2783 | 1.2092 | 1.0441 | 1.0404 | 1.0630 | 1.0177 | |
2002 | 3.5106 | 1.7555 | 1.5453 | 1.1329 | 1.0845 | 1.1281 | 1.0573 | 1.0865 | |
2003 | 4.4485 | 1.7167 | 1.4583 | 1.2321 | 1.0369 | 1.1200 | 1.0606 | ||
2004 | 1.5471 | 1.0725 | 1.0874 | 1.0471 | |||||
2005 | 2.5642 | 1.8730 | 1.3615 | 1.1742 | 1.1383 | ||||
2006 | 3.3656 | 1.6357 | 1.3692 | 1.2364 | |||||
2007 | 2.9228 | 1.8781 | 1.4394 | ||||||
2008 | 3.9533 | ||||||||
2009 | 3.6192 |
print(type(transformed_triangle))
transformed_triangle.latest_diagonal
<class 'chainladder.core.triangle.Triangle'>
2010 | |
---|---|
2001 | 3,901,463 |
2002 | 5,339,085 |
2003 | 4,909,315 |
2004 | 4,588,268 |
2005 | 3,873,311 |
2006 | 3,691,712 |
2007 | 3,483,130 |
2008 | 2,864,498 |
2009 | 1,363,294 |
2010 | 344,014 |
However, it has other attributes that make it IBNR model-ready.
transformed_triangle.cdf_
12-Ult | 24-Ult | 36-Ult | 48-Ult | 60-Ult | 72-Ult | 84-Ult | 96-Ult | 108-Ult | |
---|---|---|---|---|---|---|---|---|---|
(All) | 13.1367 | 3.8870 | 2.2809 | 1.6131 | 1.3845 | 1.2543 | 1.1547 | 1.0956 | 1.0177 |
fit_transform()
is equivalent to calling fit
and transform
in succession on the same triangle. Again, this should feel very familiar to the sklearn
practitioner.
cl.Development().fit_transform(genins) == cl.Development().fit(genins).transform(genins)
True
The reason you might want to use fit
and transform
separately is when you want to apply development patterns to a different triangle. For example, we can:
1. Extract the commercial auto triangles from the clrd dataset
2. Summarize to an industry level and fit a Development object
3. transform the individual company triangles with the industry development patterns
clrd = cl.load_sample("clrd")
comauto = clrd[clrd["LOB"] == "comauto"]["CumPaidLoss"]
comauto_industry = comauto.sum()
industry_dev = cl.Development().fit(comauto_industry)
industry_dev.transform(comauto)
Triangle Summary | |
---|---|
Valuation: | 1997-12 |
Grain: | OYDY |
Shape: | (157, 1, 10, 10) |
Index: | [GRNAME, LOB] |
Columns: | [CumPaidLoss] |
Working with Multidimensional Triangles#
Several (though not all) of the estimators in chainladder
can be fit to several triangles simultaneously. While this can be a convenient shorthand, all these estimators use the same assumptions across every triangle.
clrd = cl.load_sample("clrd").groupby("LOB").sum()["CumPaidLoss"]
print("Fitting to " + str(len(clrd.index)) + " industries simultaneously.")
cl.Development().fit_transform(clrd).cdf_
Fitting to 6 industries simultaneously.
Triangle Summary | |
---|---|
Valuation: | 2261-12 |
Grain: | OYDY |
Shape: | (6, 1, 1, 9) |
Index: | [LOB] |
Columns: | [CumPaidLoss] |
For greater control, you can slice individual triangles out and fit separate patterns to each.
print(cl.Development(average="simple").fit(clrd.loc["wkcomp"]))
print(cl.Development(n_periods=4).fit(clrd.loc["ppauto"]))
print(cl.Development(average="regression", n_periods=6).fit(clrd.loc["comauto"]))
Development(average='simple')
Development(n_periods=4)
Development(average='regression', n_periods=6)