doc/source/whatsnew/v0.9.1.rst
.. _whatsnew_0901:
{{ header }}
This is a bug fix release from 0.9.0 and includes several new features and enhancements along with a large number of bug fixes. The new features include by-column sort order for DataFrame and Series, improved NA handling for the rank method, masking functions for DataFrame, and intraday time-series filtering for DataFrame.
New features
- ``Series.sort``, ``DataFrame.sort``, and ``DataFrame.sort_index`` can now be
specified in a per-column manner to support multiple sort orders (:issue:`928`)
.. code-block:: ipython
In [2]: df = pd.DataFrame(np.random.randint(0, 2, (6, 3)),
...: columns=['A', 'B', 'C'])
In [3]: df.sort(['A', 'B'], ascending=[1, 0])
Out[3]:
A B C
3 0 1 1
4 0 1 1
2 0 0 1
0 1 0 0
1 1 0 0
5 1 0 0
- ``DataFrame.rank`` now supports additional argument values for the
``na_option`` parameter so missing values can be assigned either the largest
or the smallest rank (:issue:`1508`, :issue:`2159`)
.. ipython:: python
df = pd.DataFrame(np.random.randn(6, 3), columns=['A', 'B', 'C'])
df.loc[2:4] = np.nan
df.rank()
df.rank(na_option='top')
df.rank(na_option='bottom')
- DataFrame has new ``where`` and ``mask`` methods to select values according to a
given boolean mask (:issue:`2109`, :issue:`2151`)
DataFrame currently supports slicing via a boolean vector the same length as the DataFrame (inside the ``[]``).
The returned DataFrame has the same number of columns as the original, but is sliced on its index.
.. ipython:: python
df = pd.DataFrame(np.random.randn(5, 3), columns=['A', 'B', 'C'])
df
df[df['A'] > 0]
If a DataFrame is sliced with a DataFrame based boolean condition (with the same size as the original DataFrame),
then a DataFrame the same size (index and columns) as the original is returned, with
elements that do not meet the boolean condition as ``NaN``. This is accomplished via
the new method ``DataFrame.where``. In addition, ``where`` takes an optional ``other`` argument for replacement.
.. ipython:: python
df[df > 0]
df.where(df > 0)
df.where(df > 0, -df)
Furthermore, ``where`` now aligns the input boolean condition (ndarray or DataFrame), such that partial selection
with setting is possible. This is analogous to partial setting via ``.ix`` (but on the contents rather than the axis labels)
.. ipython:: python
df2 = df.copy()
df2[df2[1:4] > 0] = 3
df2
``DataFrame.mask`` is the inverse boolean operation of ``where``.
.. ipython:: python
df.mask(df <= 0)
- Enable referencing of Excel columns by their column names (:issue:`1936`)
.. code-block:: ipython
In [1]: xl = pd.ExcelFile('data/test.xls')
In [2]: xl.parse('Sheet1', index_col=0, parse_dates=True,
parse_cols='A:D')
- Added option to disable pandas-style tick locators and formatters
using ``series.plot(x_compat=True)`` or ``pandas.plot_params['x_compat'] =
True`` (:issue:`2205`)
- Existing TimeSeries methods ``at_time`` and ``between_time`` were added to
DataFrame (:issue:`2149`)
- DataFrame.dot can now accept ndarrays (:issue:`2042`)
- DataFrame.drop now supports non-unique indexes (:issue:`2101`)
- Panel.shift now supports negative periods (:issue:`2164`)
- DataFrame now support unary ~ operator (:issue:`2110`)
API changes
~~~~~~~~~~~
- Upsampling data with a PeriodIndex will result in a higher frequency
TimeSeries that spans the original time window
.. code-block:: ipython
In [1]: prng = pd.period_range('2012Q1', periods=2, freq='Q')
In [2]: s = pd.Series(np.random.randn(len(prng)), prng)
In [4]: s.resample('M')
Out[4]:
2012-01 -1.471992
2012-02 NaN
2012-03 NaN
2012-04 -0.493593
2012-05 NaN
2012-06 NaN
Freq: M, dtype: float64
- Period.end_time now returns the last nanosecond in the time interval
(:issue:`2124`, :issue:`2125`, :issue:`1764`)
.. ipython:: python
p = pd.Period('2012')
p.end_time
- File parsers no longer coerce to float or bool for columns that have custom
converters specified (:issue:`2184`)
.. ipython:: python
import io
data = ('A,B,C\n'
'00001,001,5\n'
'00002,002,6')
pd.read_csv(io.StringIO(data), converters={'A': lambda x: x.strip()})
See the :ref:`full release notes
<release>` or issue tracker
on GitHub for a complete list.
.. _whatsnew_0.9.1.contributors:
Contributors
.. contributors:: v0.9.0..v0.9.1