Pandas aggregate count distinct
Post date: Sep 17, 2014 2:54:21 PM
>>> df
         date  duration user_id
0  2013-04-01        30    0001
1  2013-04-01        15    0001
2  2013-04-01        20    0002
3  2013-04-02        15    0002
4  2013-04-02        30    0002
>>> df.groupby("date").agg({"duration": np.sum, "user_id": pd.Series.nunique})
            duration  user_id
date
2013-04-01        65        2
2013-04-02        45        1
Alternatively, pass a lambda that calls nunique() on each group's user_id values:
>>> df.groupby("date").agg({"duration": np.sum, "user_id": lambda x: x.nunique()})
            duration  user_id
date
2013-04-01        65        2
2013-04-02        45        1
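For reference, here is a self-contained sketch of the same aggregation that rebuilds the sample frame and uses the string aggregator names "sum" and "nunique" instead of callables. It assumes a reasonably recent pandas (named aggregation needs 0.25 or later), and the output column names total_duration and distinct_users are made up for illustration. Recent pandas releases may also emit a warning when NumPy callables such as np.sum are passed to agg, which the string names avoid.

```python
import pandas as pd

# Sample data mirroring the frame shown above; user_id is stored as a string
# so the leading zeros are preserved.
df = pd.DataFrame({
    "date": ["2013-04-01", "2013-04-01", "2013-04-01", "2013-04-02", "2013-04-02"],
    "duration": [30, 15, 20, 15, 30],
    "user_id": ["0001", "0001", "0002", "0002", "0002"],
})

# String aggregator names: "sum" totals duration, "nunique" counts distinct users.
by_name = df.groupby("date").agg({"duration": "sum", "user_id": "nunique"})

# Named aggregation lets the result columns be renamed in the same call.
# total_duration and distinct_users are illustrative names, not from the post.
named = df.groupby("date").agg(
    total_duration=("duration", "sum"),
    distinct_users=("user_id", "nunique"),
)

print(by_name)
print(named)
```

Named aggregation is handy when the result feeds a report, since "user_id" is a misleading label for a column that now holds a count.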
Source: a Stack Overflow answer (url).