python - How to calculate average days between events by category 1 and category 2
I have a table of shoplifting events by store and product. I'm trying to use Python to calculate the average number of days between shoplifting events for each product. The table looks like this:
product  store  shoplifting date  times shoplifted
1        A      8/28/2016         6
2        A      8/28/2016         6
3        A      8/28/2016         6
2        B      8/22/2016         3
1        B      8/22/2016         3
3        B      8/22/2016         3
1        C      8/18/2016         2
3        C      8/18/2016         2
4        C      8/18/2016         2
1        A      8/18/2016         5
3        A      8/18/2016         5
1        B      8/16/2016         2
1        A      8/14/2016         4
4        C      8/13/2016         1
3        A      8/12/2016         4
2        A      8/12/2016         4
Product 1 was stolen from store A on 8/28, 8/18, and 8/14 (10 days and 4 days between thefts) and from store B on 8/22 and 8/16 (8 days), for an average of (10 + 4 + 8) / 3 = 7.33 days. For product 1, the expected result would be:
product  days between shoplifting
1        7.33
The "times shoplifted" column is the cumulative number of times the store has been shoplifted. It increases with each shoplifting event. So, for instance, on 8/28/2016, store A was shoplifted of items 1, 2, and 3. That was the 6th time the store had been shoplifted from.
I am trying to calculate the average number of days between shoplifting events for each product. I've been writing a lot of loops and it's getting quite messy, so I'd like a cleaner way to do it. I'm not familiar with pandas, but I believe it has some handy time-processing abilities. How would I solve this problem in pandas? Or is there a better way?
I'd sort the dataframe by shoplifting date first. Then, for each group, diff will give the time deltas, and mean will average them.
    df.sort_values('shoplifting date').groupby('product')['shoplifting date'].apply(lambda x: x.diff().mean()).dropna()

    product
    1      0 days
    3      0 days
    582   10 days
    650    4 days
    Name: shoplifting date, dtype: timedelta64[ns]
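The snippet above groups by product only, so gaps are measured across all stores at once. The question measures gaps within each store and then averages them per product. Below is a minimal sketch of that variant, not a definitive implementation. It assumes the rows are as transcribed from the question's table, that the store left blank in the table is "A", and that the date column has to be parsed with pd.to_datetime first. The names rows, gaps, and avg_days are made up for this example.

```python
import pandas as pd

# Rows transcribed from the question's table: (product, store, date).
# Assumption: the store missing from some rows in the table is "A".
rows = [
    (1, "A", "8/28/2016"), (2, "A", "8/28/2016"), (3, "A", "8/28/2016"),
    (2, "B", "8/22/2016"), (1, "B", "8/22/2016"), (3, "B", "8/22/2016"),
    (1, "C", "8/18/2016"), (3, "C", "8/18/2016"), (4, "C", "8/18/2016"),
    (1, "A", "8/18/2016"), (3, "A", "8/18/2016"),
    (1, "B", "8/16/2016"),
    (1, "A", "8/14/2016"),
    (4, "C", "8/13/2016"),
    (3, "A", "8/12/2016"), (2, "A", "8/12/2016"),
]
df = pd.DataFrame(rows, columns=["product", "store", "shoplifting date"])

# diff() needs real datetimes, not strings.
df["shoplifting date"] = pd.to_datetime(df["shoplifting date"], format="%m/%d/%Y")

# Sort chronologically so each diff is "this theft minus the previous one".
df = df.sort_values("shoplifting date")

# Gap since the previous theft of the same product at the same store.
# The first theft in each (product, store) pair has no predecessor, so it is NaT.
gaps = df.groupby(["product", "store"])["shoplifting date"].diff()

# Average the gaps per product, in days; mean() skips the missing values.
avg_days = gaps.dt.days.groupby(df["product"]).mean().dropna()
print(avg_days.round(2))
```

With the table as transcribed, product 1 comes out at about 6.67 days rather than 7.33, because 8/16 to 8/22 is 6 days, not 8.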