docs/source/gam.rst
.. currentmodule:: statsmodels.gam.api
.. _gam:
Generalized Additive Models allow for penalized estimation of smooth terms in generalized linear models.
See Module Reference_ for commands and arguments.
The following illustrates a Gaussian and a Poisson regression where categorical variables are treated as linear terms and the effect of two explanatory variables is captured by penalized B-splines. The data is from the automobile dataset https://archive.ics.uci.edu/ml/datasets/automobile We can load a dataframe with selected columns from the unit test module.
.. ipython:: python
import statsmodels.api as sm
from statsmodels.gam.api import GLMGam, BSplines
# import data
from statsmodels.gam.tests.test_penalized import df_autos
# create spline basis for weight and hp
x_spline = df_autos[['weight', 'hp']]
bs = BSplines(x_spline, df=[12, 10], degree=[3, 3])
# penalization weight
alpha = np.array([21833888.8, 6460.38479])
gam_bs = GLMGam.from_formula('city_mpg ~ fuel + drive', data=df_autos,
smoother=bs, alpha=alpha)
res_bs = gam_bs.fit()
print(res_bs.summary())
# plot smooth components
res_bs.plot_partial(0, cpr=True)
res_bs.plot_partial(1, cpr=True)
alpha = np.array([8283989284.5829611, 14628207.58927821])
gam_bs = GLMGam.from_formula('city_mpg ~ fuel + drive', data=df_autos,
smoother=bs, alpha=alpha,
family=sm.families.Poisson())
res_bs = gam_bs.fit()
print(res_bs.summary())
# Optimal penalization weights alpha can be obtained through generalized
# cross-validation or k-fold cross-validation.
# The alpha above are from the unit tests against the R mgcv package.
gam_bs.select_penweight()[0]
gam_bs.select_penweight_kfold()[0]
References ^^^^^^^^^^
.. module:: statsmodels.gam.generalized_additive_model :synopsis: Generalized Additive Models .. currentmodule:: statsmodels.gam.generalized_additive_model
Model Class ^^^^^^^^^^^
.. autosummary:: :toctree: generated/
GLMGam LogitGam
Results Classes ^^^^^^^^^^^^^^^
.. autosummary:: :toctree: generated/
GLMGamResults
Smooth Basis Functions ^^^^^^^^^^^^^^^^^^^^^^
.. module:: statsmodels.gam.smooth_basis :synopsis: Classes for Spline and other Smooth Basis Function
.. currentmodule:: statsmodels.gam.smooth_basis
Currently there is verified support for two spline bases
.. autosummary:: :toctree: generated/
BSplines CyclicCubicSplines
statsmodels.gam.smooth_basis includes additional splines and a (global)
polynomial smoother basis but those have not been verified yet.
Families and Link Functions ^^^^^^^^^^^^^^^^^^^^^^^^^^^
The distribution families in GLMGam are the same as for GLM and so are
the corresponding link functions.
Current unit tests only cover Gaussian and Poisson, and GLMGam might not
work for all options that are available in GLM.