doc/neps/nep-0017-split-out-maskedarray.rst
.. _NEP17:
:Author: Stéfan van der Walt [email protected] :Status: Rejected :Type: Standards Track :Created: 2018-03-22 :Resolution: https://mail.python.org/pipermail/numpy-discussion/2018-May/078026.html
This NEP proposes removing MaskedArray functionality from NumPy, and publishing it as a stand-alone package.
MaskedArrays are a sub-class of the NumPy ndarray that adds
masking capabilities, i.e. the ability to ignore or hide certain array
values during computation.
While historically convenient to distribute this class inside of NumPy, improved packaging has made it possible to distribute it separately without difficulty.
Motivations for this move include:
ndarray object, and the essential utilities needed to manipulate
such arrays.ndarrays,
often cause complications when being used with other packages.
Fixing these issues is outside the scope of NumPy development.This NEP proposes a deprecation pathway through which MaskedArrays would still be accessible to users, but no longer as part of the core package.
Currently, a MaskedArray is created as follows::
from numpy import ma ma.array([1, 2, 3], mask=[True, False, True])
This will return an array where the values 1 and 3 are masked (no
longer visible to operations such as np.sum).
We propose refactoring the np.ma subpackage into a new
pip-installable library called maskedarray [2]_, which would be used
in a similar fashion::
import maskedarray as ma ma.array([1, 2, 3], mask=[True, False, True])
For two releases of NumPy, maskedarray would become a NumPy
dependency, and would expose MaskedArrays under the existing name,
np.ma. If imported as np.ma, a NumpyDeprecationWarning will
be raised, describing the impending deprecation with instructions on
how to modify code to use maskedarray.
After two releases, np.ma will be removed entirely. In order to obtain
np.ma, a user will install it via pip install or via their package
manager. Subsequently, importing maskedarray on a version of NumPy that
includes it integrally will raise an ImportError.
Documentation
NumPy's internal documentation refers explicitly to MaskedArrays in
certain places, e.g. `ndarray.concatenate`:
> When one or more of the arrays to be concatenated is a MaskedArray,
> this function will return a MaskedArray object instead of an ndarray,
> but the input masks are *not* preserved. In cases where a MaskedArray
> is expected as input, use the ma.concatenate function from the masked
> array module instead.
Such documentation will be removed, since the expectation is that
users of `maskedarray` will use methods from that package to operate
on MaskedArrays.
Other appearances
~~~~~~~~~~~~~~~~~
Explicit MaskedArray support will be removed from:
- `numpygenfromtext`
- `numpy.libmerge_arrays`, `numpy.lib.stack_arrays`
Backward compatibility
----------------------
For two releases of NumPy, apart from a deprecation notice, there will
be no user visible changes. Thereafter, `np.ma` will no longer be
available (instead, MaskedArrays will live in the `maskedarray`
package).
Note also that new PEPs on array-like objects may eventually provide
better support for MaskedArrays than is currently available.
Alternatives
------------
After a lively discussion on the mailing list:
- There is support (and active interest in) making a better *new* masked array
class.
- The new class should be a consumer of the external NumPy API with no special
status (unlike today where there are hacks across the codebase to support it)
- `MaskedArray` will stay where it is, at least until the new masked array
class materializes and has been tried in the wild.
References and footnotes
------------------------
.. [1] Subclassing ndarray,
https://docs.scipy.org/doc/numpy/user/basics.subclassing.html
.. [2] PyPI: maskedarray, https://pypi.org/project/maskedarray/
Copyright
---------
This document has been placed in the public domain.