Tail Estimators#
Unobserved loss experience beyond the edge of a Triangle can be substantial and is a necessary part of an actuary’s analysis. This is particularly so for long tailed lines more common in commercial insurance or excess layers of loss.
With all tail estimation, we are extrapolating beyond the known data which comes with its own challenges. It tends to be more difficult to validate assumptions when performing tail estimation. Additionally, many of the techniques carry a high degree of sensitivity to the assumptions you will choose when conducting analysis.
Nevertheless, it is a necessary part of the actuary’s toolkit in estimating reserve liabilities.
Basics and Commonalities#
As with the Development family of estimators, the Tail module provides a variety of tail transformers that allow for the extrapolation of development patterns beyond the end of the triangle. Being transformers, we can expect that each estimator has a fit and a transform method. It is also expected that the transform method will give us back our Triangle with additional estimated parameters for incorporating our tail review in our IBNR estimates.
Tail#
Every tail estimator produced a tail_ attribute which represents the point estimate of the tail of the Triangle.
Tip
Recall that the trailing underscore_ implies that these parameters are only available once we’ve fit our estimator.
We will demonstrate this property using the TailConstant estimator which we explore further below.
triangle = cl.load_sample('genins')
tail = cl.TailConstant(1.10).fit_transform(triangle)
tail.tail_
| 120-Ult | |
|---|---|
| (All) | 1.1 |
Run Off#
In addition to point estimates, tail estimators support run-off analysis. The
ldf_ and cdf_ attribute splits the tail point estimate into enough
development patterns to allow for tail run-off to be examined for at least another
valuation year by default. For an annual development_grain grain, the development pattterns
include two additional patterns.
tail.cdf_
| 12-Ult | 24-Ult | 36-Ult | 48-Ult | 60-Ult | 72-Ult | 84-Ult | 96-Ult | 108-Ult | 120-Ult | 132-Ult | |
|---|---|---|---|---|---|---|---|---|---|---|---|
| (All) | 15.8912 | 4.5526 | 2.6054 | 1.7877 | 1.5229 | 1.3797 | 1.2701 | 1.2052 | 1.1195 | 1.1000 | 1.0514 |
Note
When fitting a tail estimator without first specifying a Development estimator first, chainladder assumes a volume-weighted Development. To override this, you should declare transform your triangle with a development transformer first.
For quarterly grain, there are five additional development patterns and for monthly there are thirteen.
triangle = cl.load_sample('quarterly')['paid']
tail = cl.TailCurve().fit(triangle)
# Slice just the tail entries
tail.cdf_[~tail.ldf_.development.isin(triangle.link_ratio.development)]
/home/docs/checkouts/readthedocs.org/user_builds/chainladder-python/envs/experimental/lib/python3.11/site-packages/chainladder/core/base.py:403: UserWarning: Could not infer format, so each element will be parsed individually, falling back to `dateutil`. To ensure parsing is consistent and as-expected, please specify a format.
datetime_mapping = dict(zip(datetime_arg, pd.to_datetime(**date_inference)))
| 135-Ult | 138-Ult | 141-Ult | 144-Ult | 147-Ult | |
|---|---|---|---|---|---|
| (All) | 1.0006 | 1.0005 | 1.0004 | 1.0004 | 1.0003 |
Note
The point estimate tail_ of the tail is the CDF for 135-Ult. The remaining CDFs simply represent the run-off of this tail as specified by the Tail estimator in use.
Attachment Age#
By default, tail estimators attach to the oldest development age of the Triangle.
In practice, the last several known development factors of a Triangle can be
unreliable and attaching the tail earlier and using it as a smoothing mechanism
is preferred. All tail estimators have an attachment_age parameter that
allows you to select the development age to which the tail will attach.
triangle = cl.load_sample('genins')
unsmoothed = cl.TailCurve().fit(triangle).ldf_
smoothed = cl.TailCurve(attachment_age=24).fit(triangle).ldf_
In this way, we can smooth over any volatile development patterns.
Projection period#
Regardless of where you attach a tail, there will be incremental patterns to at
least one year past the end of the Triangle to support run-off analysis.
This accommodates views of run-off for the oldest origin periods in your triangle for at least another year. As actuarial reviews typically occur no less than annually, this should be sufficient for examining run-off performance between actuarial valuations.
Cases arise where modeling run-off on a longer time horizon is desired. For these cases, it is possible to modify the projection_period (in months) of all Tail Esimtators by specifying the number of months you wish to extend your patterns.
# Extend tail run-off 4 years past end of Triangle.
tail = cl.TailCurve(projection_period=12*4).fit(triangle)
tail.ldf_[~tail.ldf_.development.isin(triangle.link_ratio.development)]
| 120-132 | 132-144 | 144-156 | 156-168 | 168-180 | |
|---|---|---|---|---|---|
| (All) | 1.0119 | 1.0071 | 1.0042 | 1.0025 | 1.0036 |
In this example, we see a higher development pattern for the last period than the ones prior. This is because the true tail run-off exceeds four years. To be clear, the projection_period has no effect on the actual estimated tail, it simply provides a longer time horizon for measuring run-off.
TailConstant#
TailConstant allows you to input a tail factor as a constant. This is
useful when relying on tail selections from an external source like industry data.
The tail factor supplied applies to all individual triangles contained within
the Triangle object. If this is not the desired outcome, slicing individual
triangles and applying TailConstant separately to each can be done.
Decay#
For run-off analysis, you can control the decay of your tail. An exponential
decay parameter is also available to facilitate the run off analysis described
above.
tail = cl.TailConstant(tail=1.05, decay=0.95).fit_transform(triangle)
tail.ldf_
| 12-24 | 24-36 | 36-48 | 48-60 | 60-72 | 72-84 | 84-96 | 96-108 | 108-120 | 120-132 | 132-144 | |
|---|---|---|---|---|---|---|---|---|---|---|---|
| (All) | 3.4906 | 1.7473 | 1.4574 | 1.1739 | 1.1038 | 1.0863 | 1.0539 | 1.0766 | 1.0177 | 1.0024 | 1.0474 |
As we can see in the example, the 5% tail in the example is split between the
amount to run-off over the subsequent calendar period 132-144, and the
remainder, 144-Ult. The split is controlled by the decay parameter. We
can always reference our tail_ point estimate.
tail.tail_
| 120-Ult | |
|---|---|
| (All) | 1.05 |
Examples#
[DevelopmentConstant Callable]
TailCurve#
TailCurve allows for extrapolating a tail factor using curve fitting.
Currently, exponential decay of LDFs and inverse power curve are supported.
clrd = cl.load_sample('clrd').groupby('LOB').sum()['CumPaidLoss']
cdf_ip = cl.TailCurve(curve='inverse_power').fit(clrd).tail_
cdf_xp = cl.TailCurve(curve='exponential').fit(clrd).tail_
Regression parameters#
Underlying the curve fit is an OLS regression which generates both a slope_
and intercept_ term.
For the exponential curve fit with slope, \(\beta\) and intercept \(\alpha\), the tail factor
is:
For the inverse_power curve, the tail factor is:
Where \(X\) is your selected link ratios and \(N\) is the number of years you wish to extrapolate the tail_
Deriving the tail_ factor manually:
triangle = cl.load_sample('genins')
tail = cl.TailCurve('exponential').fit(triangle)
tail.tail_
| 120-Ult | |
|---|---|
| (All) | 1.029499 |
A manual calculation using (1) above:
np.prod(
(1+np.exp(
np.arange(triangle.shape[-1],
triangle.shape[-1] + tail.extrap_periods) *
tail.slope_.values + tail.intercept_.values)
)
)
np.float64(1.0294991710529175)
For completeness, let’s also confirm (2):
tail = cl.TailCurve('inverse_power').fit(triangle)
tail.tail_
| 120-Ult | |
|---|---|
| (All) | 1.29243 |
np.prod(
1 + np.exp(tail.intercept_.values) *
(
np.arange(
triangle.shape[-1],
triangle.shape[-1] + tail.extrap_periods
) ** tail.slope_.values)
)
np.float64(1.292430311543694)
Extrapolation Period#
From these formulas, the actuary has control over the parameter, \(N\) which
represents how far out how far beyond the edge of Results for the inverse_power
curve are sensitive to this parameter as it tends to converge slowly to its
asymptotic value. This parameter can be controlled using the extrap_periods
hyperparamter.
Tip
Exponential decay is generally rapid and is less sensitive to the extrap_period
argument than the inverse power fit.
This example demonstrates the extrap_periods functionality of the TailCurve estimator. The estimator defaults to extrapolating out 100 periods. However, we can see that the “Inverse Power” curve fit doesn’t converge to its asymptotic value even with 100 periods whereas the “exponential” converges within 10 periods.
Fit period#
The default behavior of TailCurve is to include all cdf_ patterns from the Triangle in extrapolating the tail. Often, the data assumptions of linearity will be violated when using all
selected patterns. In those cases, you can use fit_period=(start_age, end_age) for fitting to a contiguous set of patterns.
dev = cl.Development().fit_transform(cl.load_sample('quarterly')['paid'])
fit_all = cl.TailCurve().fit(dev)
exclude = cl.TailCurve(fit_period=(36, None)).fit(dev)
/home/docs/checkouts/readthedocs.org/user_builds/chainladder-python/envs/experimental/lib/python3.11/site-packages/chainladder/core/base.py:403: UserWarning: Could not infer format, so each element will be parsed individually, falling back to `dateutil`. To ensure parsing is consistent and as-expected, please specify a format.
datetime_mapping = dict(zip(datetime_arg, pd.to_datetime(**date_inference)))
Warning
The nature of curve fitting with log and exp transforms is inappropriate for development factors less
than 1.0. The default behavior of TailCurve is to ignore these observations when determining
the OLS regression parameters. You can force errors by setting the errors argument to “raise”.
dev = cl.TailCurve(errors='raise')
# This will fail because of LDFs < 1 and our choice to 'raise' errors
dev.fit(cl.load_sample('quarterly')['paid'])
You can also use a list to of booleans to specify which values you want to include. This allows for finer grain control over excluding outliers from your analysis. The following two estimators are equivalent.
(cl.TailCurve(fit_period=[False] * 11 + [True] * 33).fit(dev).cdf_ ==
cl.TailCurve(fit_period=(36, None)).fit(dev).cdf_)
np.True_
Examples#
[Attachment Age Smoothing]
[TailCurve Extrapolation]
[TailCurve Basics]
TailBondy#
Most people are familiar with the Bondy tail know it as a basic method that assumes the
tail_ is just a repeat of the last available ldf_.
Indeed, without any hyperparameter selection, this is how TailBondy works.
triangle = cl.load_sample('genins')
dev = cl.Development().fit_transform(triangle)
dev.ldf_.iloc[..., -1]
| 108-120 | |
|---|---|
| (All) | 1.0177 |
tail = cl.TailBondy().fit(triangle)
tail.tail_
| 120-Ult | |
|---|---|
| (All) | 1.017725 |
This simple relationship has a more generalized form. If we define the ldf_ at time \(n\) as \(f(n-1)\)
and assume that \(f(n) = f(n-1)^{B}\), it follows that the tail \(F(n)\) can be defined as:
where \(B\) is some constant, and if \(B\) is \(\frac{1}{2}\), then we obtain Bondy’s original tail formulation.
Rather than using the only the last known development factor or setting \(B=\frac{1}{2}\), we
can stablize things using more ldf_ using the earliest_age parameter. We can then minimize
the difference in the the assumed relationship \(f(n) = f(n-1)^{B}\) and our data to estimate \(B\).
This is often refered to as the Generalized Bondy method.
triangle = cl.load_sample('tail_sample')['paid']
dev = cl.Development(average='simple').fit_transform(triangle)
# Estimate the Bondy Tail
tail = cl.TailBondy(earliest_age=12).fit(dev)
tail.tail_
| 120-Ult | |
|---|---|
| (All) | 1.027763 |
Using our formulation (3) above, we can manually estimate our tail using our earliest ldf_
and our Bondy constant b_.
# Get last fitted LDF of the model
last_fitted_ldf = (tail.earliest_ldf_ ** (tail.b_ ** 8))
# Calculate the tail using the Bondy formula above
last_fitted_ldf ** (tail.b_ / (1-tail.b_))
| paid | |
|---|---|
| Total | |
| Total | 1.027756 |
TailClark#
TailClark is a continuation of the ClarkLDF model. Familiarity
with ClarkLDF will aid in understanding this Estimator.
The growth curve approach used by Clark produces development patterns for any
age including ages beyond the edge of the Triangle.
An example completing Clark’s model:
genins = cl.load_sample('genins')
dev = cl.ClarkLDF()
tail = cl.TailClark()
tail.fit(dev.fit_transform(genins)).ldf_
| 12-24 | 24-36 | 36-48 | 48-60 | 60-72 | 72-84 | 84-96 | 96-108 | 108-120 | 120-132 | 132-144 | |
|---|---|---|---|---|---|---|---|---|---|---|---|
| (All) | 4.0951 | 1.7205 | 1.3424 | 1.2005 | 1.1306 | 1.0910 | 1.0665 | 1.0504 | 1.0393 | 1.0313 | 1.2540 |
Coupled with attachment_age one could use any Development estimator and replicate the ClarkLDF estimator.
genins = cl.load_sample('genins')
tail = cl.TailClark(attachment_age=12)
tail.fit(cl.Development().fit_transform(genins)).ldf_
| 12-24 | 24-36 | 36-48 | 48-60 | 60-72 | 72-84 | 84-96 | 96-108 | 108-120 | 120-132 | 132-144 | |
|---|---|---|---|---|---|---|---|---|---|---|---|
| (All) | 4.0951 | 1.7205 | 1.3424 | 1.2005 | 1.1306 | 1.0910 | 1.0665 | 1.0504 | 1.0393 | 1.0313 | 1.2540 |
Truncated patterns#
Clark warns of the dangers of extrapolating too far beyond the edge of a Triangle,
particularly with a heavy tailed distribution. In these cases, it is suggested that
a suitable cut-off age or truncation_age be established where losses are considered
fully developed. This is very similar to the extrap_periods parameter of TailCurve,
however, it is expressed as an age (in months) as opposed to a period length.
tail = cl.TailClark(truncation_age=245)
tail.fit(dev.fit_transform(genins)).cdf_
| 12-Ult | 24-Ult | 36-Ult | 48-Ult | 60-Ult | 72-Ult | 84-Ult | 96-Ult | 108-Ult | 120-Ult | 132-Ult | |
|---|---|---|---|---|---|---|---|---|---|---|---|
| (All) | 19.2089 | 4.6907 | 2.7264 | 2.0309 | 1.6917 | 1.4963 | 1.3715 | 1.2860 | 1.2243 | 1.1780 | 1.1423 |