PatsyFormula
- class chainladder.PatsyFormula(formula=None)
A sklearn-style Transformer for patsy formulas.
PatsyFormula allows for R-style formula preprocessing of the
design_matrix of a machine learning algorithm. It’s particularly useful with the DevelopmentML and TweedieGLM estimators.
- Parameters:
- formula: str
A string representation of the regression model X features.
- Attributes:
- design_info_:
The patsy instructions for generating the design_matrix, X.
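To make the formula syntax concrete, the sketch below calls patsy directly on a small hand-made frame rather than on a triangle. The frame `cells` and its values are invented for illustration, and the snippet only shows how patsy expands a formula; it is not a description of how chainladder prepares its data internally.

```python
import pandas as pd
from patsy import dmatrix

# Hypothetical long-format frame: one row per (origin, development) cell.
cells = pd.DataFrame(
    {
        "origin": [2001, 2001, 2001, 2002, 2002, 2003],
        "development": [12, 24, 36, 12, 24, 12],
    }
)

# C(...) treats a numeric column as categorical. With patsy's default
# treatment coding, the first level is absorbed into the intercept.
dev_only = dmatrix("C(development)", cells)
dev_and_origin = dmatrix("C(development) + C(origin)", cells)

# design_info records the column layout that patsy derived from the formula.
print(dev_only.design_info.column_names)
print(dev_only.shape, dev_and_origin.shape)
```

Each additional categorical term adds one column per level beyond its first, which is why a formula with both development and origin terms yields a wider matrix than a development-only formula.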
Examples
If a development-only Poisson GLM produces residuals that vary systematically by accident year, adding C(origin) to the formula introduces origin-level intercepts and reduces that structure. The expanded model matrix has more columns (indicator columns for the origin levels in addition to those for the development periods), which PatsyFormula builds from the same R-style string.

import chainladder as cl

genins = cl.load_sample("genins")
# Development-only model versus development plus origin effects
by_dev = cl.TweedieGLM(design_matrix="C(development)").fit(genins)
by_both = cl.TweedieGLM(
    design_matrix="C(development) + C(origin)"
).fit(genins)
# Number of fitted coefficients in each model
print(len(by_dev.coef_))
print(len(by_both.coef_))
# Implied development factors from each model
print(by_dev.ldf_.values[0, 0, 0, :].round(4))
print(by_both.ldf_.values[0, 0, 0, :].round(4))
10
19
[3.5085 1.7436 1.4379 1.1656 1.0991 1.0832 1.0511 1.0693 1.0135]
[3.491 1.7474 1.4574 1.1739 1.1038 1.0863 1.0539 1.0766 1.0177]
When TweedieGLM is not flexible enough (for example, when you need a non-Tweedie model or a continuous origin term), build a custom DevelopmentML pipeline and use PatsyFormula as the preprocessing step with the same formula syntax.

import chainladder as cl
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from chainladder.utils.utility_functions import PatsyFormula

genins = cl.load_sample("genins")
col = genins.columns[0]
# PatsyFormula builds the design matrix; LinearRegression fits on it
dev_only = cl.DevelopmentML(
    Pipeline(
        [
            ("design_matrix", PatsyFormula("C(development)")),
            ("model", LinearRegression(fit_intercept=False)),
        ]
    ),
    y_ml=col,
    fit_incrementals=False,
).fit(genins)
print(dev_only.ldf_.values[0, 0, 0, :].round(4))
[3.515 1.735 1.3993 1.152 1.0988 1.0926 1.0332 1.0245 0.8507]
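As a rough illustration of what a formula-based preprocessing step does inside a scikit-learn pipeline, the sketch below defines a hypothetical `FormulaTransformer` and fits it on a toy frame. The class name, the frame `train`, and its values are assumptions made for this example; this is not chainladder's implementation of PatsyFormula, only a minimal transformer that stores patsy's design information at fit time and reuses it at transform time. The formula uses `origin` without `C(...)`, so origin enters as a continuous term.

```python
import numpy as np
import pandas as pd
from patsy import build_design_matrices, dmatrix
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline


class FormulaTransformer(TransformerMixin, BaseEstimator):
    """Hypothetical stand-in for a formula preprocessing step."""

    def __init__(self, formula=None):
        self.formula = formula

    def fit(self, X, y=None):
        # Learn the column layout (levels, coding) from the training frame.
        self.design_info_ = dmatrix(self.formula, X).design_info
        return self

    def transform(self, X):
        # Rebuild the same columns for any frame with the same variables.
        matrix = build_design_matrices([self.design_info_], X)[0]
        return np.asarray(matrix)


# Toy data, invented for illustration only.
train = pd.DataFrame(
    {
        "development": [12, 24, 36, 12, 24, 12],
        "origin": [2001, 2001, 2001, 2002, 2002, 2003],
        "y": [100.0, 60.0, 20.0, 110.0, 66.0, 120.0],
    }
)

pipe = Pipeline(
    [
        ("design_matrix", FormulaTransformer("C(development) + origin")),
        # The formula already supplies an intercept column.
        ("model", LinearRegression(fit_intercept=False)),
    ]
)
pipe.fit(train, train["y"])

# Predict a cell that was not in the training frame.
new_cell = pd.DataFrame({"development": [36], "origin": [2002]})
prediction = pipe.predict(new_cell)
new_matrix = pipe.named_steps["design_matrix"].transform(new_cell)
```

Storing the design information at fit time is what lets the transform step produce the same columns, in the same order, for new rows.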
Inherited Methods
| fit_transform | Fit to data, then transform it. |
| get_metadata_routing | Get metadata routing of this object. |
| get_params | Get parameters for this estimator. |
| set_output | Set output container. |
| set_params | Set the parameters of this estimator. |