.. module:: statsmodels.base.optimizer
.. currentmodule:: statsmodels.base.optimizer

Optimization
============

statsmodels uses three types of algorithms for the estimation of the parameters of a model.

  1. Basic linear models such as :ref:`WLS and OLS <regression>` are directly estimated using appropriate linear algebra.
  2. :ref:`RLM <rlm>` and :ref:`GLM <glm>` use iteratively re-weighted least squares. However, you can optionally select one of the scipy optimizers discussed below.
  3. For all other models, we use `optimizers <https://docs.scipy.org/doc/scipy/reference/optimize.html>`_ from `scipy <https://docs.scipy.org/doc/scipy/reference/index.html>`_.
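
A minimal sketch of these three cases on synthetic data (the data and
variable names below are illustrative only, not part of statsmodels):

.. code-block:: python

   import numpy as np
   import statsmodels.api as sm

   rng = np.random.default_rng(0)
   X = sm.add_constant(rng.normal(size=(200, 2)))

   # 1. Linear model: estimated directly via linear algebra; no optimizer.
   y_lin = X @ [1.0, 2.0, -1.0] + rng.normal(size=200)
   ols_res = sm.OLS(y_lin, X).fit()

   # 2. GLM: iteratively re-weighted least squares by default, but a scipy
   #    optimizer can be selected through the `method` argument.
   y_pois = rng.poisson(np.exp(X @ [0.5, 0.3, -0.2]))
   glm_irls = sm.GLM(y_pois, X, family=sm.families.Poisson()).fit()
   glm_bfgs = sm.GLM(y_pois, X, family=sm.families.Poisson()).fit(method="bfgs")

   # 3. Other likelihood models always go through a scipy-style optimizer;
   #    see the Logit example below.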

Where practical, certain models allow for the optional selection of a scipy optimizer. A particular scipy optimizer might be the default or an option. Depending on the model and the data, choosing an appropriate scipy optimizer can help avoid local minima, fit the model in less time, or fit the model with less memory.
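
For example, the ``method`` keyword of ``fit`` selects the optimizer. A
sketch on hypothetical data:

.. code-block:: python

   import numpy as np
   import statsmodels.api as sm

   rng = np.random.default_rng(1)
   X = sm.add_constant(rng.normal(size=(500, 3)))
   y = (X @ [0.3, 1.0, -0.5, 0.8] + rng.logistic(size=500) > 0).astype(float)

   model = sm.Logit(y, X)
   res_newton = model.fit()                       # default: Newton-Raphson
   res_bfgs = model.fit(method="bfgs", disp=0)    # gradient-based alternative
   res_lbfgs = model.fit(method="lbfgs", disp=0)  # limited-memory variant

   # On a well-behaved problem the optimizers should agree closely.
   print(np.allclose(res_newton.params, res_bfgs.params, atol=1e-4))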

statsmodels supports the following optimizers, along with the keyword arguments associated with each optimizer; a usage sketch follows the list:

  • newton - Newton-Raphson iteration. While not directly from scipy, we consider it an optimizer because only the score and hessian are required.

    tol : float
        Relative error in params acceptable for convergence.

  • nm - Nelder-Mead simplex optimization, scipy's fmin.

    xtol : float
        Relative error in params acceptable for convergence.
    ftol : float
        Relative error in loglike(params) acceptable for convergence.
    maxfun : int
        Maximum number of function evaluations to make.

  • bfgs - Broyden–Fletcher–Goldfarb–Shanno optimization, scipy's fmin_bfgs.

    gtol : float
        Stop when norm of gradient is less than gtol.
    norm : float
        Order of norm (np.inf is max, -np.inf is min)
    epsilon : float
        If fprime is approximated, use this value for the step
        size. Only relevant if LikelihoodModel.score is None.
    
  • lbfgs - A more memory-efficient (limited memory) implementation of bfgs. Scipy's fmin_l_bfgs_b.

    m : int
        The maximum number of variable metric corrections used to
        define the limited memory matrix. (The limited memory BFGS
        method does not store the full hessian but uses this many
        terms in an approximation to it.)
    pgtol : float
        The iteration will stop when
        ``max{|proj g_i | i = 1, ..., n} <= pgtol`` where pg_i is
        the i-th component of the projected gradient.
    factr : float
        The iteration stops when
        ``(f^k - f^{k+1})/max{|f^k|,|f^{k+1}|,1} <= factr * eps``,
        where eps is the machine precision, which is automatically
        generated by the code. Typical values for factr are: 1e12
        for low accuracy; 1e7 for moderate accuracy; 10.0 for
        extremely high accuracy. See Notes for relationship to
        ftol, which is exposed (instead of factr) by the
        scipy.optimize.minimize interface to L-BFGS-B.
    maxfun : int
        Maximum number of function evaluations to make.
    epsilon : float
        Step size used when approx_grad is True, for numerically
        calculating the gradient
    approx_grad : bool
        Whether to approximate the gradient numerically (in which
        case func returns only the function value).
    
  • cg - Conjugate gradient optimization. Scipy's fmin_cg.

    gtol : float
        Stop when norm of gradient is less than gtol.
    norm : float
        Order of norm (np.inf is max, -np.inf is min)
    epsilon : float
        If fprime is approximated, use this value for the step
        size. Can be scalar or vector.  Only relevant if
        LikelihoodModel.score is None.
    
  • ncg - Newton conjugate gradient. Scipy's fmin_ncg.

    fhess_p : callable f'(x, \*args)
        Function which computes the Hessian of f times an arbitrary
        vector, p.  Should only be supplied if
        LikelihoodModel.hessian is None.
    avextol : float
        Stop when the average relative error in the minimizer
        falls below this amount.
    epsilon : float or ndarray
        If fhess is approximated, use this value for the step size.
        Only relevant if LikelihoodModel.hessian is None.
    
  • powell - Powell's method. Scipy's fmin_powell.

    xtol : float
        Line-search error tolerance.
    ftol : float
        Relative error in loglike(params) acceptable for
        convergence.
    maxfun : int
        Maximum number of function evaluations to make.
    start_direc : ndarray
        Initial direction set.
    
  • basinhopping - Basin hopping. This is part of scipy's basinhopping tools.

    niter : integer
        The number of basin hopping iterations.
    niter_success : integer
        Stop the run if the global minimum candidate remains the
        same for this number of iterations.
    T : float
        The "temperature" parameter for the accept or reject
        criterion. Higher "temperatures" mean that larger jumps
        in function value will be accepted. For best results
        `T` should be comparable to the separation (in function
        value) between local minima.
    stepsize : float
        Initial step size for use in the random displacement.
    interval : integer
        The interval for how often to update the `stepsize`.
    minimizer : dict
        Extra keyword arguments to be passed to the minimizer
        `scipy.optimize.minimize()`, for example 'method' - the
        minimization method (e.g. 'L-BFGS-B'), or 'tol' - the
        tolerance for termination. Other arguments are mapped from
        explicit arguments of `fit`:
        - `args` <- `fargs`
        - `jac` <- `score`
        - `hess` <- `hess`
    
  • minimize - Allows the use of any scipy optimizer.

    min_method : str, optional
        Name of minimization method to use. Any method specific arguments
        can be passed directly. For a list of methods and their arguments,
        see documentation of `scipy.optimize.minimize`. If no method is
        specified, then BFGS is used.
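
As referenced above, a sketch of how these optimizer-specific keywords are
passed straight through ``fit`` (hypothetical data; the precise meaning of
each argument is documented by scipy):

.. code-block:: python

   import numpy as np
   import statsmodels.api as sm

   rng = np.random.default_rng(2)
   X = sm.add_constant(rng.normal(size=(300, 2)))
   y = (X @ [0.2, 1.0, -0.7] + rng.logistic(size=300) > 0).astype(float)
   model = sm.Logit(y, X)

   # Keywords specific to the chosen optimizer go directly to fit():
   res_newton = model.fit(method="newton", tol=1e-10)
   res_lbfgs = model.fit(method="lbfgs", m=20, pgtol=1e-8, disp=0)
   res_bh = model.fit(method="basinhopping", niter=5, disp=0)

   # `minimize` exposes any scipy.optimize.minimize method by name:
   res_tnc = model.fit(method="minimize", min_method="TNC", disp=0)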

Model Class
-----------

Generally, there is no need for an end-user to directly call these functions and classes. However, we provide the class because the different optimization techniques have unique keyword arguments that may be useful to the user.

.. autosummary::
   :toctree: generated/

   Optimizer
   _fit_newton
   _fit_bfgs
   _fit_lbfgs
   _fit_nm
   _fit_cg
   _fit_ncg
   _fit_powell
   _fit_basinhopping