Pandas Cookbook: Recipes for Scientific Computing, Time Series Analysis and Data Visualization using Python

Author: Theodore Petrou

Comments

by anonymous   2018-03-19

Updated:

pd.TimeGrouper() was formally deprecated in pandas v0.21.0 in favor of pd.Grouper().

The best use of pd.Grouper() is within groupby() when you're also grouping on non-datetime columns. If you only need to group by a frequency, use resample().

For example, say you have:

>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame({'a': np.random.choice(['x', 'y'], size=50),
...                    'b': np.random.rand(50)},
...                   index=pd.date_range('2010', periods=50))

You could do:

>>> df.groupby(pd.Grouper(freq='M')).sum()
                  b
2010-01-31  18.5123
2010-02-28   7.7670

But Grouper() is unnecessary here because you're grouping only on the frequency of the index. Instead you could do:

>>> df.resample('M').sum()
                  b
2010-01-31  18.5123
2010-02-28   7.7670
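To check that the two spellings really agree, here is a minimal sketch rather than anything from the original answer. It assumes a deterministic stand-in frame (b is 1.0 on every row, a alternates 'x' and 'y') so the sums are predictable, and it uses the 'MS' (month-start) alias instead of 'M' because recent pandas releases deprecate 'M' in favor of 'ME'. The names by_grouper and by_resample are made up for illustration.

```python
import numpy as np
import pandas as pd

# Deterministic stand-in for the random frame above (an assumption for this
# sketch): b is 1.0 every day, so a monthly sum equals the number of days.
idx = pd.date_range('2010-01-01', periods=50, freq='D')
df = pd.DataFrame({'a': np.tile(['x', 'y'], 25),
                   'b': np.ones(50)},
                  index=idx)

# 'MS' labels each bin by the first day of the month; it is used here only
# so the snippet runs unchanged across pandas versions.
by_grouper = df.groupby(pd.Grouper(freq='MS'))['b'].sum()
by_resample = df.resample('MS')['b'].sum()

# Compare labels and values separately rather than relying on exact
# float equality.
same = (by_grouper.index.equals(by_resample.index)
        and np.allclose(by_grouper.values, by_resample.values))

print(by_resample)
print('identical result:', same)
```

Selecting ['b'] before summing keeps the string column out of the aggregation, which matters on newer pandas where sum() no longer silently drops non-numeric columns.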

Conversely, here's a case where Grouper() would be useful:

>>> df.groupby([pd.Grouper(freq='M'), 'a']).sum()
                   b
           a        
2010-01-31 x  8.9452
           y  9.5671
2010-02-28 x  4.2522
           y  3.5148
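The mixed grouping can be sketched the same way. This is an illustration under assumptions, not the answer's own code: the frame is a deterministic stand-in (b is 1.0 on every row, a alternates 'x' and 'y'), 'MS' replaces 'M' for the reason of alias deprecation in newer pandas, and the column name 'when' plus the variable names are hypothetical. The second half shows the key= argument of pd.Grouper, which should cover the case where the timestamps sit in a column rather than the index.

```python
import numpy as np
import pandas as pd

# Deterministic stand-in frame (an assumption for this sketch).
idx = pd.date_range('2010-01-01', periods=50, freq='D')
df = pd.DataFrame({'a': np.tile(['x', 'y'], 25),
                   'b': np.ones(50)},
                  index=idx)

# Group on calendar month (taken from the index) and on column 'a' together.
monthly_by_a = df.groupby([pd.Grouper(freq='MS'), 'a'])['b'].sum()

# Same grouping when the timestamps live in a column: move the index into a
# hypothetical 'when' column and point Grouper at it with key=.
df_col = df.rename_axis('when').reset_index()
monthly_by_a_col = (
    df_col.groupby([pd.Grouper(key='when', freq='MS'), 'a'])['b'].sum()
)

print(monthly_by_a)
```

resample() alone cannot express this, since it bins only on time; combining a frequency with another key is where Grouper() earns its place.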

For some more detail, take a look at Chapter 7 of Ted Petrou's Pandas Cookbook.

by anonymous   2018-02-18
@DipanjanSaha I would recommend https://www.amazon.com/dp/1784393878/?tag=stackoverflow17-20