.. include:: _contributors.rst
.. currentmodule:: sklearn
.. _changes_0_13_1:

Version 0.13.1
==============

**February 23, 2013**

The 0.13.1 release only fixes some bugs and does not add any new
functionality.

Changelog
---------
- Fixed a testing error caused by the function
  ``cross_validation.train_test_split`` being interpreted as a test
  by `Yaroslav Halchenko`_.

- Fixed a bug in the reassignment of small clusters in
  :class:`cluster.MiniBatchKMeans` by `Gael Varoquaux`_.

- Fixed default value of ``gamma`` in :class:`decomposition.KernelPCA`
  by `Lars Buitinck`_.

- Updated joblib to ``0.7.0d`` by `Gael Varoquaux`_.

- Fixed scaling of the deviance in
  :class:`ensemble.GradientBoostingClassifier` by `Peter Prettenhofer`_.

- Better tie-breaking in :class:`multiclass.OneVsOneClassifier`
  by `Andreas Müller`_.

- Other small improvements to tests and documentation.
People
------

List of contributors for release 0.13.1 by number of commits.

- `Lars Buitinck`_
- `Andreas Müller`_
- `Gael Varoquaux`_
- `Peter Prettenhofer`_
- `Gilles Louppe`_
- `Mathieu Blondel`_
- `Nelle Varoquaux`_
- `Vlad Niculae`_
- `Yaroslav Halchenko`_

.. _changes_0_13:
Version 0.13
============

**January 21, 2013**

New Estimator Classes
---------------------
- :class:`dummy.DummyClassifier` and :class:`dummy.DummyRegressor`, two
  data-independent predictors by `Mathieu Blondel`_. Useful to
  sanity-check your estimators. See :ref:`dummy_estimators` in the user
  guide. Multioutput support added by `Arnaud Joly`_.
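A minimal sketch of such a sanity check (the toy data here is made up for illustration):

```python
from sklearn.dummy import DummyClassifier

# Toy data (made up for illustration); DummyClassifier ignores the
# input features and predicts from the class distribution of y alone.
X = [[0], [1], [2], [3]]
y = [1, 1, 1, 0]

clf = DummyClassifier(strategy="most_frequent").fit(X, y)
pred = clf.predict([[5], [6]])  # always the majority class, 1
```

Comparing a real estimator against such a baseline quickly reveals whether it learns anything beyond the class distribution.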
- :class:`decomposition.FactorAnalysis`, a transformer implementing the
  classical factor analysis, by `Christian Osendorfer`_ and
  `Alexandre Gramfort`_. See :ref:`FA` in the user guide.

- :class:`feature_extraction.FeatureHasher`, a transformer implementing
  the "hashing trick" for fast, low-memory feature extraction from
  string fields by `Lars Buitinck`_, and
  :class:`feature_extraction.text.HashingVectorizer` for text documents
  by `Olivier Grisel`_. See :ref:`feature_hashing` and
  :ref:`hashing_vectorizer` for the documentation and sample usage.
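A rough sketch of the hashing trick (``n_features`` is deliberately tiny here for illustration):

```python
from sklearn.feature_extraction import FeatureHasher

# Map string-keyed features straight to a fixed number of columns via
# a hash function, so no vocabulary has to be kept in memory.
hasher = FeatureHasher(n_features=8, input_type="dict")
X = hasher.transform([{"cat": 1, "dog": 2}, {"cat": 3}])
# X is a sparse matrix with shape (2, 8)
```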
- :class:`pipeline.FeatureUnion`, a transformer that concatenates
  results of several other transformers by `Andreas Müller`_. See
  :ref:`feature_union` in the user guide.
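A small sketch of concatenating two transformers (the data and component choices are made up for illustration):

```python
from sklearn.decomposition import PCA
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.pipeline import FeatureUnion

# Made-up data; the union concatenates one PCA component with the
# single best feature according to an ANOVA F-test.
X = [[0., 1., 2.], [2., 3., 4.], [4., 5., 6.], [6., 7., 8.]]
y = [0, 0, 1, 1]

union = FeatureUnion([("pca", PCA(n_components=1)),
                      ("kbest", SelectKBest(f_classif, k=1))])
Xt = union.fit_transform(X, y)  # shape (4, 2)
```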
- :class:`random_projection.GaussianRandomProjection`,
  :class:`random_projection.SparseRandomProjection` and the function
  :func:`random_projection.johnson_lindenstrauss_min_dim`. The first two
  are transformers implementing Gaussian and sparse random projection
  matrices by `Olivier Grisel`_ and `Arnaud Joly`_.
  See :ref:`random_projection` in the user guide.
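A sketch of both pieces together (dimensions and ``eps`` are arbitrary illustrative choices):

```python
import numpy as np
from sklearn.random_projection import (GaussianRandomProjection,
                                       johnson_lindenstrauss_min_dim)

# The Johnson-Lindenstrauss lemma gives a data-independent lower bound
# on the number of components needed to preserve pairwise distances
# within a relative error of eps.
n_comp = johnson_lindenstrauss_min_dim(n_samples=10000, eps=0.3)

# Project made-up high-dimensional data down to 20 components.
X = np.random.RandomState(0).rand(50, 1000)
Xt = GaussianRandomProjection(n_components=20,
                              random_state=0).fit_transform(X)
```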
- :class:`kernel_approximation.Nystroem`, a transformer for
  approximating arbitrary kernels by `Andreas Müller`_. See
  :ref:`nystroem_kernel_approx` in the user guide.
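A minimal sketch, using an RBF kernel on made-up data (``gamma`` and ``n_components`` are illustrative choices):

```python
import numpy as np
from sklearn.kernel_approximation import Nystroem

# Approximate an RBF kernel feature map from a 10-sample subset; a
# linear model fit on X_features approximates the kernelized model.
X = np.random.RandomState(0).rand(30, 5)
feature_map = Nystroem(kernel="rbf", gamma=0.5, n_components=10,
                       random_state=0)
X_features = feature_map.fit_transform(X)  # shape (30, 10)
```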
- :class:`preprocessing.OneHotEncoder`, a transformer that computes
  binary encodings of categorical features by `Andreas Müller`_. See
  :ref:`preprocessing_categorical_features` in the user guide.
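A small sketch of the encoding (integer-coded categories, as in the 0.13 API; later releases also accept strings):

```python
from sklearn.preprocessing import OneHotEncoder

# Each of the three integer categories gets its own binary column.
enc = OneHotEncoder()
X = enc.fit_transform([[0], [1], [2], [1]])  # sparse output
dense = X.toarray()
# dense == [[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 1, 0]]
```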
- :class:`linear_model.PassiveAggressiveClassifier` and
  :class:`linear_model.PassiveAggressiveRegressor`, predictors
  implementing an efficient stochastic optimization for linear models
  by `Rob Zinkov`_ and `Mathieu Blondel`_. See
  :ref:`passive_aggressive` in the user guide.

- :class:`ensemble.RandomTreesEmbedding`, a transformer for creating
  high-dimensional sparse representations using ensembles of totally
  random trees by `Andreas Müller`_. See :ref:`random_trees_embedding`
  in the user guide.

- :class:`manifold.SpectralEmbedding` and function
  :func:`manifold.spectral_embedding`, implementing the "Laplacian
  eigenmaps" transformation for non-linear dimensionality reduction by
  Wei Li. See :ref:`spectral_embedding` in the user guide.

- :class:`isotonic.IsotonicRegression` by `Fabian Pedregosa`_,
  `Alexandre Gramfort`_ and `Nelle Varoquaux`_.
Changelog
---------

- :func:`metrics.zero_one_loss` (formerly ``metrics.zero_one``) now has
  an option for normalized output that reports the fraction of
  misclassifications, rather than the raw number of misclassifications.
  By Kyle Beauchamp.
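A quick sketch of the two output modes on made-up labels:

```python
from sklearn.metrics import zero_one_loss

y_true = [0, 1, 1, 1]
y_pred = [0, 0, 1, 1]

# One of four predictions is wrong.
frac = zero_one_loss(y_true, y_pred)                    # 0.25
count = zero_one_loss(y_true, y_pred, normalize=False)  # 1
```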
- :class:`tree.DecisionTreeClassifier` and all derived ensemble models
  now support sample weighting, by `Noel Dawe`_ and `Gilles Louppe`_.

- Speedup improvement when using bootstrap samples in forests of
  randomized trees, by `Peter Prettenhofer`_ and `Gilles Louppe`_.

- Partial dependence plots for :ref:`gradient_boosting` in
  ``ensemble.partial_dependence.partial_dependence`` by
  `Peter Prettenhofer`_. See
  :ref:`sphx_glr_auto_examples_inspection_plot_partial_dependence.py`
  for an example.

- The table of contents on the website has now been made expandable by
  `Jaques Grobler`_.

- :class:`feature_selection.SelectPercentile` now breaks ties
  deterministically instead of returning all equally ranked features.

- :class:`feature_selection.SelectKBest` and
  :class:`feature_selection.SelectPercentile` are more numerically
  stable since they use scores, rather than p-values, to rank results.
  This means that they might sometimes select different features than
  they did previously.
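A minimal sketch of score-based univariate selection (the dataset and ``k`` are illustrative choices):

```python
from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, chi2

# Keep the 2 features with the highest chi-squared scores.
X, y = load_iris(return_X_y=True)
X_new = SelectKBest(chi2, k=2).fit_transform(X, y)  # shape (150, 2)
```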
- Ridge regression and ridge classification fitting with the
  ``sparse_cg`` solver no longer has quadratic memory complexity, by
  `Lars Buitinck`_ and `Fabian Pedregosa`_.

- Ridge regression and ridge classification now support a new fast
  solver called ``lsqr``, by `Mathieu Blondel`_.

- Speed up of :func:`metrics.precision_recall_curve` by Conrad Lee.

- Added support for reading/writing svmlight files with pairwise
  preference attribute (``qid`` in svmlight file format) in
  :func:`datasets.dump_svmlight_file` and
  :func:`datasets.load_svmlight_file` by `Fabian Pedregosa`_.

- Faster and more robust :func:`metrics.confusion_matrix` and
  :ref:`clustering_evaluation` by Wei Li.

- ``cross_validation.cross_val_score`` now works with precomputed
  kernels and affinity matrices, by `Andreas Müller`_.

- LARS algorithm made more numerically stable with heuristics to drop
  regressors that are too correlated as well as to stop the path when
  numerical noise becomes predominant, by `Gael Varoquaux`_.
- New kernel :func:`metrics.pairwise.chi2_kernel` by `Andreas Müller`_,
  often used in computer vision applications.
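A small sketch on made-up histogram-like data (``gamma`` is an illustrative choice):

```python
from sklearn.metrics.pairwise import chi2_kernel

# The chi-squared kernel expects non-negative features, e.g. histograms.
X = [[0.5, 0.5], [1.0, 0.0]]
K = chi2_kernel(X, gamma=1.0)
# K has shape (2, 2); the diagonal is 1, since the chi2 distance of a
# sample to itself is zero.
```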
- Fixed a long-standing bug in :class:`naive_bayes.BernoulliNB` by
  Shaun Jackman.

- Implemented ``predict_proba`` in
  :class:`multiclass.OneVsRestClassifier`, by Andrew Winterman.

- Improved consistency in gradient boosting: the estimators
  :class:`ensemble.GradientBoostingRegressor` and
  :class:`ensemble.GradientBoostingClassifier` use the estimator
  :class:`tree.DecisionTreeRegressor` instead of the
  ``tree._tree.Tree`` data structure, by `Arnaud Joly`_.

- Fixed a floating point exception in the :ref:`decision trees <tree>`
  module, by Seberg.

- Fixed :func:`metrics.roc_curve` failing when ``y_true`` has only one
  class, by Wei Li.
- Added the :func:`metrics.mean_absolute_error` function, which
  computes the mean absolute error. The :func:`metrics.mean_squared_error`,
  :func:`metrics.mean_absolute_error` and :func:`metrics.r2_score`
  metrics support multioutput, by `Arnaud Joly`_.
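A quick sketch of the multioutput behavior on made-up targets:

```python
from sklearn.metrics import mean_absolute_error

# Two outputs per sample; the per-entry absolute errors are
# 0, 1, 0, 0, so the uniformly averaged error is 0.25.
y_true = [[0, 1], [1, 1]]
y_pred = [[0, 0], [1, 1]]
mae = mean_absolute_error(y_true, y_pred)  # 0.25
```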
- Fixed ``class_weight`` support in :class:`svm.LinearSVC` and
  :class:`linear_model.LogisticRegression` by `Andreas Müller`_. The
  meaning of ``class_weight`` was reversed: in earlier releases, a
  higher weight erroneously meant fewer positives for a given class.

- Improved narrative documentation and consistency in
  :mod:`sklearn.metrics` for regression and classification metrics,
  by `Arnaud Joly`_.

- Fixed a bug in :class:`sklearn.svm.SVC` when using CSR matrices with
  unsorted indices, by Xinfan Meng and `Andreas Müller`_.

- :class:`cluster.MiniBatchKMeans`: added random reassignment of
  cluster centers with few observations attached to them, by
  `Gael Varoquaux`_.
API changes summary
-------------------

- Renamed all occurrences of ``n_atoms`` to ``n_components`` for
  consistency. This applies to
  :class:`decomposition.DictionaryLearning`,
  :class:`decomposition.MiniBatchDictionaryLearning`,
  :func:`decomposition.dict_learning` and
  :func:`decomposition.dict_learning_online`.

- Renamed all occurrences of ``max_iters`` to ``max_iter`` for
  consistency. This applies to ``semi_supervised.LabelPropagation`` and
  ``semi_supervised.label_propagation.LabelSpreading``.

- Renamed all occurrences of ``learn_rate`` to ``learning_rate`` for
  consistency in ``ensemble.BaseGradientBoosting`` and
  :class:`ensemble.GradientBoostingRegressor`.
- The module ``sklearn.linear_model.sparse`` is gone. Sparse matrix
  support was already integrated into the "regular" linear models.

- ``sklearn.metrics.mean_square_error``, which incorrectly returned the
  accumulated error, was removed. Use
  :func:`metrics.mean_squared_error` instead.

- Passing ``class_weight`` parameters to ``fit`` methods is no longer
  supported. Pass them to estimator constructors instead.

- GMMs no longer have ``decode`` and ``rvs`` methods. Use the
  ``score``, ``predict`` or ``sample`` methods instead.

- The ``solver`` fit option in ridge regression and classification is
  now deprecated and will be removed in v0.14. Use the constructor
  option instead.

- ``feature_extraction.text.DictVectorizer`` now returns sparse
  matrices in the CSR format, instead of COO.

- Renamed ``k`` in ``cross_validation.KFold`` and
  ``cross_validation.StratifiedKFold`` to ``n_folds``; renamed
  ``n_bootstraps`` to ``n_iter`` in ``cross_validation.Bootstrap``.

- Renamed all occurrences of ``n_iterations`` to ``n_iter`` for
  consistency. This applies to ``cross_validation.ShuffleSplit``,
  ``cross_validation.StratifiedShuffleSplit``,
  :func:`utils.extmath.randomized_range_finder` and
  :func:`utils.extmath.randomized_svd`.
- Replaced ``rho`` in :class:`linear_model.ElasticNet` and
  :class:`linear_model.SGDClassifier` by ``l1_ratio``. The ``rho``
  parameter had different meanings; ``l1_ratio`` was introduced to
  avoid confusion. It has the same meaning as previously ``rho`` in
  :class:`linear_model.ElasticNet` and ``(1-rho)`` in
  :class:`linear_model.SGDClassifier`.

- :class:`linear_model.LassoLars` and :class:`linear_model.Lars` now
  store a list of paths in the case of multiple targets, rather than
  an array of paths.

- The attribute ``gmm`` of ``hmm.GMMHMM`` was renamed to ``gmm_``
  to adhere more strictly to the API.

- ``cluster.spectral_embedding`` was moved to
  :func:`manifold.spectral_embedding`.

- Renamed ``eig_tol`` in :func:`manifold.spectral_embedding` and
  :class:`cluster.SpectralClustering` to ``eigen_tol``.

- Renamed ``mode`` in :func:`manifold.spectral_embedding` and
  :class:`cluster.SpectralClustering` to ``eigen_solver``.
- ``classes_`` and ``n_classes_`` attributes of
  :class:`tree.DecisionTreeClassifier` and all derived ensemble models
  are now flat in case of single output problems and nested in case of
  multi-output problems.

- The ``estimators_`` attribute of
  :class:`ensemble.GradientBoostingRegressor` and
  :class:`ensemble.GradientBoostingClassifier` is now an array of
  :class:`tree.DecisionTreeRegressor`.

- Renamed ``chunk_size`` to ``batch_size`` in
  :class:`decomposition.MiniBatchDictionaryLearning` and
  :class:`decomposition.MiniBatchSparsePCA` for consistency.

- :class:`svm.SVC` and :class:`svm.NuSVC` now provide a ``classes_``
  attribute and support arbitrary dtypes for labels ``y``. Also, the
  dtype returned by ``predict`` now reflects the dtype of ``y`` during
  ``fit`` (used to be ``np.float``).
- Changed default ``test_size`` in ``cross_validation.train_test_split``
  to None; added the possibility to infer ``test_size`` from
  ``train_size`` in ``cross_validation.ShuffleSplit`` and
  ``cross_validation.StratifiedShuffleSplit``.
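A quick sketch of an explicit ``test_size`` (note: in current scikit-learn releases this function lives in ``sklearn.model_selection``, which the example below uses):

```python
from sklearn.model_selection import train_test_split

# With test_size=0.3, 10 samples split into 7 train and 3 test; when
# both test_size and train_size are None, a default fraction is used.
X = list(range(10))
X_train, X_test = train_test_split(X, test_size=0.3, random_state=0)
```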
- Renamed the function ``sklearn.metrics.zero_one`` to
  :func:`sklearn.metrics.zero_one_loss`. Be aware that the default
  behavior of ``zero_one_loss`` is different from ``zero_one``:
  ``normalize=False`` is changed to ``normalize=True``.

- Renamed the function ``metrics.zero_one_score`` to
  :func:`metrics.accuracy_score`.
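The two renamed metrics are complements of each other, as this small sketch on made-up labels shows:

```python
from sklearn.metrics import accuracy_score, zero_one_loss

y_true = [0, 1, 1, 1]
y_pred = [0, 0, 1, 1]

# accuracy_score is 1 minus the normalized zero-one loss.
acc = accuracy_score(y_true, y_pred)   # 0.75
loss = zero_one_loss(y_true, y_pred)   # 0.25
```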
- :func:`datasets.make_circles` now has the same number of inner and
  outer points.

- In the Naive Bayes classifiers, the ``class_prior`` parameter was
  moved from ``fit`` to ``__init__``.
People
------

List of contributors for release 0.13 by number of commits.

- `Andreas Müller`_
- `Arnaud Joly`_
- `Peter Prettenhofer`_
- `Gael Varoquaux`_
- `Mathieu Blondel`_
- `Lars Buitinck`_
- `Olivier Grisel`_
- `Vlad Niculae`_
- `Gilles Louppe`_
- `Jaques Grobler`_
- `Alexandre Gramfort`_
- `Rob Zinkov`_
- `Fabian Pedregosa`_
- `Christian Osendorfer`_
- `Daniel Nouri`_
- `Virgile Fritsch <VirgileFritsch>`_
- `Satrajit Ghosh`_
- `James Bergstra`_
- `Jake Vanderplas`_
- `Robert Layton`_
- `Alexandre Passos`_