Back to Scikit Learn

Version 0.13

doc/whats_new/v0.13.rst

1.8.013.8 KB
Original Source

.. include:: _contributors.rst

.. currentmodule:: sklearn

============ Version 0.13

.. _changes_0_13_1:

Version 0.13.1

February 23, 2013

The 0.13.1 release only fixes some bugs and does not add any new functionality.

Changelog

  • Fixed a testing error caused by the function cross_validation.train_test_split being interpreted as a test by Yaroslav Halchenko_.

  • Fixed a bug in the reassignment of small clusters in the :class:cluster.MiniBatchKMeans by Gael Varoquaux_.

  • Fixed default value of gamma in :class:decomposition.KernelPCA by Lars Buitinck_.

  • Updated joblib to 0.7.0d by Gael Varoquaux_.

  • Fixed scaling of the deviance in :class:ensemble.GradientBoostingClassifier by Peter Prettenhofer_.

  • Better tie-breaking in :class:multiclass.OneVsOneClassifier by Andreas Müller_.

  • Other small improvements to tests and documentation.

People

List of contributors for release 0.13.1 by number of commits.

  • 16 Lars Buitinck_
  • 12 Andreas Müller_
  • 8 Gael Varoquaux_
  • 5 Robert Marchman
  • 3 Peter Prettenhofer_
  • 2 Hrishikesh Huilgolkar
  • 1 Bastiaan van den Berg
  • 1 Diego Molla
  • 1 Gilles Louppe_
  • 1 Mathieu Blondel_
  • 1 Nelle Varoquaux_
  • 1 Rafael Cunha de Almeida
  • 1 Rolando Espinoza La fuente
  • 1 Vlad Niculae_
  • 1 Yaroslav Halchenko_

.. _changes_0_13:

Version 0.13

January 21, 2013

New Estimator Classes

  • :class:dummy.DummyClassifier and :class:dummy.DummyRegressor, two data-independent predictors by Mathieu Blondel. Useful to sanity-check your estimators. See :ref:dummy_estimators in the user guide. Multioutput support added by Arnaud Joly.

  • :class:decomposition.FactorAnalysis, a transformer implementing the classical factor analysis, by Christian Osendorfer_ and Alexandre Gramfort_. See :ref:FA in the user guide.

  • :class:feature_extraction.FeatureHasher, a transformer implementing the "hashing trick" for fast, low-memory feature extraction from string fields by Lars Buitinck_ and :class:feature_extraction.text.HashingVectorizer for text documents by Olivier Grisel_ See :ref:feature_hashing and :ref:hashing_vectorizer for the documentation and sample usage.

  • :class:pipeline.FeatureUnion, a transformer that concatenates results of several other transformers by Andreas Müller_. See :ref:feature_union in the user guide.

  • :class:random_projection.GaussianRandomProjection, :class:random_projection.SparseRandomProjection and the function :func:random_projection.johnson_lindenstrauss_min_dim. The first two are transformers implementing Gaussian and sparse random projection matrix by Olivier Grisel_ and Arnaud Joly_. See :ref:random_projection in the user guide.

  • :class:kernel_approximation.Nystroem, a transformer for approximating arbitrary kernels by Andreas Müller_. See :ref:nystroem_kernel_approx in the user guide.

  • :class:preprocessing.OneHotEncoder, a transformer that computes binary encodings of categorical features by Andreas Müller_. See :ref:preprocessing_categorical_features in the user guide.

  • :class:linear_model.PassiveAggressiveClassifier and :class:linear_model.PassiveAggressiveRegressor, predictors implementing an efficient stochastic optimization for linear models by Rob Zinkov_ and Mathieu Blondel_. See :ref:passive_aggressive in the user guide.

  • :class:ensemble.RandomTreesEmbedding, a transformer for creating high-dimensional sparse representations using ensembles of totally random trees by Andreas Müller_. See :ref:random_trees_embedding in the user guide.

  • :class:manifold.SpectralEmbedding and function :func:manifold.spectral_embedding, implementing the "laplacian eigenmaps" transformation for non-linear dimensionality reduction by Wei Li. See :ref:spectral_embedding in the user guide.

  • :class:isotonic.IsotonicRegression by Fabian Pedregosa, Alexandre Gramfort and Nelle Varoquaux_,

Changelog

  • :func:metrics.zero_one_loss (formerly metrics.zero_one) now has an option for normalized output that reports the fraction of misclassifications, rather than the raw number of misclassifications. By Kyle Beauchamp.

  • :class:tree.DecisionTreeClassifier and all derived ensemble models now support sample weighting, by Noel Dawe_ and Gilles Louppe_.

  • Speedup improvement when using bootstrap samples in forests of randomized trees, by Peter Prettenhofer_ and Gilles Louppe_.

  • Partial dependence plots for :ref:gradient_boosting in ensemble.partial_dependence.partial_dependence by Peter Prettenhofer_. See :ref:sphx_glr_auto_examples_inspection_plot_partial_dependence.py for an example.

  • The table of contents on the website has now been made expandable by Jaques Grobler_.

  • :class:feature_selection.SelectPercentile now breaks ties deterministically instead of returning all equally ranked features.

  • :class:feature_selection.SelectKBest and :class:feature_selection.SelectPercentile are more numerically stable since they use scores, rather than p-values, to rank results. This means that they might sometimes select different features than they did previously.

  • Ridge regression and ridge classification fitting with sparse_cg solver no longer has quadratic memory complexity, by Lars Buitinck_ and Fabian Pedregosa_.

  • Ridge regression and ridge classification now support a new fast solver called lsqr, by Mathieu Blondel_.

  • Speed up of :func:metrics.precision_recall_curve by Conrad Lee.

  • Added support for reading/writing svmlight files with pairwise preference attribute (qid in svmlight file format) in :func:datasets.dump_svmlight_file and :func:datasets.load_svmlight_file by Fabian Pedregosa_.

  • Faster and more robust :func:metrics.confusion_matrix and :ref:clustering_evaluation by Wei Li.

  • cross_validation.cross_val_score now works with precomputed kernels and affinity matrices, by Andreas Müller_.

  • LARS algorithm made more numerically stable with heuristics to drop regressors too correlated as well as to stop the path when numerical noise becomes predominant, by Gael Varoquaux_.

  • Faster implementation of :func:metrics.precision_recall_curve by Conrad Lee.

  • New kernel metrics.chi2_kernel by Andreas Müller_, often used in computer vision applications.

  • Fix of longstanding bug in :class:naive_bayes.BernoulliNB fixed by Shaun Jackman.

  • Implemented predict_proba in :class:multiclass.OneVsRestClassifier, by Andrew Winterman.

  • Improve consistency in gradient boosting: estimators :class:ensemble.GradientBoostingRegressor and :class:ensemble.GradientBoostingClassifier use the estimator :class:tree.DecisionTreeRegressor instead of the tree._tree.Tree data structure by Arnaud Joly_.

  • Fixed a floating point exception in the :ref:decision trees <tree> module, by Seberg.

  • Fix :func:metrics.roc_curve fails when y_true has only one class by Wei Li.

  • Add the :func:metrics.mean_absolute_error function which computes the mean absolute error. The :func:metrics.mean_squared_error, :func:metrics.mean_absolute_error and :func:metrics.r2_score metrics support multioutput by Arnaud Joly_.

  • Fixed class_weight support in :class:svm.LinearSVC and :class:linear_model.LogisticRegression by Andreas Müller_. The meaning of class_weight was reversed as erroneously higher weight meant less positives of a given class in earlier releases.

  • Improve narrative documentation and consistency in :mod:sklearn.metrics for regression and classification metrics by Arnaud Joly_.

  • Fixed a bug in :class:sklearn.svm.SVC when using csr-matrices with unsorted indices by Xinfan Meng and Andreas Müller_.

  • :class:cluster.MiniBatchKMeans: Add random reassignment of cluster centers with little observations attached to them, by Gael Varoquaux_.

API changes summary

  • Renamed all occurrences of n_atoms to n_components for consistency. This applies to :class:decomposition.DictionaryLearning, :class:decomposition.MiniBatchDictionaryLearning, :func:decomposition.dict_learning, :func:decomposition.dict_learning_online.

  • Renamed all occurrences of max_iters to max_iter for consistency. This applies to semi_supervised.LabelPropagation and semi_supervised.label_propagation.LabelSpreading.

  • Renamed all occurrences of learn_rate to learning_rate for consistency in ensemble.BaseGradientBoosting and :class:ensemble.GradientBoostingRegressor.

  • The module sklearn.linear_model.sparse is gone. Sparse matrix support was already integrated into the "regular" linear models.

  • sklearn.metrics.mean_square_error, which incorrectly returned the accumulated error, was removed. Use :func:metrics.mean_squared_error instead.

  • Passing class_weight parameters to fit methods is no longer supported. Pass them to estimator constructors instead.

  • GMMs no longer have decode and rvs methods. Use the score, predict or sample methods instead.

  • The solver fit option in Ridge regression and classification is now deprecated and will be removed in v0.14. Use the constructor option instead.

  • feature_extraction.text.DictVectorizer now returns sparse matrices in the CSR format, instead of COO.

  • Renamed k in cross_validation.KFold and cross_validation.StratifiedKFold to n_folds, renamed n_bootstraps to n_iter in cross_validation.Bootstrap.

  • Renamed all occurrences of n_iterations to n_iter for consistency. This applies to cross_validation.ShuffleSplit, cross_validation.StratifiedShuffleSplit, :func:utils.extmath.randomized_range_finder and :func:utils.extmath.randomized_svd.

  • Replaced rho in :class:linear_model.ElasticNet and :class:linear_model.SGDClassifier by l1_ratio. The rho parameter had different meanings; l1_ratio was introduced to avoid confusion. It has the same meaning as previously rho in :class:linear_model.ElasticNet and (1-rho) in :class:linear_model.SGDClassifier.

  • :class:linear_model.LassoLars and :class:linear_model.Lars now store a list of paths in the case of multiple targets, rather than an array of paths.

  • The attribute gmm of hmm.GMMHMM was renamed to gmm_ to adhere more strictly with the API.

  • cluster.spectral_embedding was moved to :func:manifold.spectral_embedding.

  • Renamed eig_tol in :func:manifold.spectral_embedding, :class:cluster.SpectralClustering to eigen_tol, renamed mode to eigen_solver.

  • Renamed mode in :func:manifold.spectral_embedding and :class:cluster.SpectralClustering to eigen_solver.

  • classes_ and n_classes_ attributes of :class:tree.DecisionTreeClassifier and all derived ensemble models are now flat in case of single output problems and nested in case of multi-output problems.

  • The estimators_ attribute of :class:ensemble.GradientBoostingRegressor and :class:ensemble.GradientBoostingClassifier is now an array of :class:tree.DecisionTreeRegressor.

  • Renamed chunk_size to batch_size in :class:decomposition.MiniBatchDictionaryLearning and :class:decomposition.MiniBatchSparsePCA for consistency.

  • :class:svm.SVC and :class:svm.NuSVC now provide a classes_ attribute and support arbitrary dtypes for labels y. Also, the dtype returned by predict now reflects the dtype of y during fit (used to be np.float).

  • Changed default test_size in cross_validation.train_test_split to None, added possibility to infer test_size from train_size in cross_validation.ShuffleSplit and cross_validation.StratifiedShuffleSplit.

  • Renamed function sklearn.metrics.zero_one to sklearn.metrics.zero_one_loss. Be aware that the default behavior in sklearn.metrics.zero_one_loss is different from sklearn.metrics.zero_one: normalize=False is changed to normalize=True.

  • Renamed function metrics.zero_one_score to :func:metrics.accuracy_score.

  • :func:datasets.make_circles now has the same number of inner and outer points.

  • In the Naive Bayes classifiers, the class_prior parameter was moved from fit to __init__.

People

List of contributors for release 0.13 by number of commits.

  • 364 Andreas Müller_
  • 143 Arnaud Joly_
  • 137 Peter Prettenhofer_
  • 131 Gael Varoquaux_
  • 117 Mathieu Blondel_
  • 108 Lars Buitinck_
  • 106 Wei Li
  • 101 Olivier Grisel_
  • 65 Vlad Niculae_
  • 54 Gilles Louppe_
  • 40 Jaques Grobler_
  • 38 Alexandre Gramfort_
  • 30 Rob Zinkov_
  • 19 Aymeric Masurelle
  • 18 Andrew Winterman
  • 17 Fabian Pedregosa_
  • 17 Nelle Varoquaux
  • 16 Christian Osendorfer_
  • 14 Daniel Nouri_
  • 13 :user:Virgile Fritsch <VirgileFritsch>
  • 13 syhw
  • 12 Satrajit Ghosh_
  • 10 Corey Lynch
  • 10 Kyle Beauchamp
  • 9 Brian Cheung
  • 9 Immanuel Bayer
  • 9 mr.Shu
  • 8 Conrad Lee
  • 8 James Bergstra_
  • 7 Tadej Janež
  • 6 Brian Cajes
  • 6 Jake Vanderplas_
  • 6 Michael
  • 6 Noel Dawe
  • 6 Tiago Nunes
  • 6 cow
  • 5 Anze
  • 5 Shiqiao Du
  • 4 Christian Jauvin
  • 4 Jacques Kvam
  • 4 Richard T. Guy
  • 4 Robert Layton_
  • 3 Alexandre Abraham
  • 3 Doug Coleman
  • 3 Scott Dickerson
  • 2 ApproximateIdentity
  • 2 John Benediktsson
  • 2 Mark Veronda
  • 2 Matti Lyra
  • 2 Mikhail Korobov
  • 2 Xinfan Meng
  • 1 Alejandro Weinstein
  • 1 Alexandre Passos_
  • 1 Christoph Deil
  • 1 Eugene Nizhibitsky
  • 1 Kenneth C. Arnold
  • 1 Luis Pedro Coelho
  • 1 Miroslav Batchkarov
  • 1 Pavel
  • 1 Sebastian Berg
  • 1 Shaun Jackman
  • 1 Subhodeep Moitra
  • 1 bob
  • 1 dengemann
  • 1 emanuele
  • 1 x006