python - Keep element data when extracting sessions


Similarly to the top Wikipedia sessions example, I have the following test data:

import json

edits = [
    json.dumps({'timestamp': 0, 'username': 'user1', 'action': 'a'}),
    json.dumps({'timestamp': 1, 'username': 'user1', 'action': 'b'}),
    json.dumps({'timestamp': 20, 'username': 'user1', 'action': 'a'}),
    json.dumps({'timestamp': 132, 'username': 'user2', 'action': 'a'}),
    json.dumps({'timestamp': 500, 'username': 'user2', 'action': 'b'}),
    json.dumps({'timestamp': 3601, 'username': 'user2', 'action': 'b'}),
    json.dumps({'timestamp': 3602, 'username': 'user2', 'action': 'a'}),
    json.dumps({'timestamp': 8004, 'username': 'user2', 'action': 'a'}),
    json.dumps({'timestamp': 9320, 'username': 'user1', 'action': 'b'}),
]

I want to split the dataset into sessions per username and, for each user session, count the user's actions. For the previous dataset, with a 1 hour max gap (3600 seconds), I want the following result:

expected = [
    'user1 : [0.0, 3620.0), a: 2, b: 1',
    'user2 : [132.0, 7202.0), a: 2, b: 2',
    'user2 : [8004.0, 11604.0), a: 1, b: 0',
    'user1 : [9320.0, 12920.0), a: 0, b: 1',
]
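To make the expected result concrete, here is a minimal pure-Python sketch of the session logic, without Beam. The function name `sessionize` and its parameters are hypothetical. It assumes that a session ends `gap` seconds after its last event (half-open interval) and that an event arriving `gap` seconds or more after the previous one starts a new session.

```python
import json
from collections import Counter, defaultdict


def sessionize(edits, gap=3600, actions=('a', 'b')):
    """Group JSON-encoded edits into per-user sessions and count actions."""
    # Bucket the decoded edits by username.
    by_user = defaultdict(list)
    for raw in edits:
        edit = json.loads(raw)
        by_user[edit['username']].append(edit)

    sessions = []  # (start, end, user, counts)
    for user, user_edits in by_user.items():
        user_edits.sort(key=lambda e: e['timestamp'])
        start = last = None
        counts = Counter()
        for edit in user_edits:
            ts = edit['timestamp']
            # Close the current session when the gap is reached or exceeded.
            if start is not None and ts - last >= gap:
                sessions.append((start, last + gap, user, dict(counts)))
                start, counts = None, Counter()
            if start is None:
                start = ts
            last = ts
            counts[edit['action']] += 1
        if start is not None:
            sessions.append((start, last + gap, user, dict(counts)))

    # Order by session start, then format like the expected strings.
    sessions.sort(key=lambda s: (s[0], s[1], s[2]))
    return [
        '%s : [%s, %s), %s' % (
            user, float(start), float(end),
            ', '.join('%s: %d' % (a, counts.get(a, 0)) for a in actions))
        for start, end, user, counts in sessions
    ]
```

Calling `sessionize(edits)` on the test data above should reproduce the `expected` list, which is a handy reference when checking the output of a real pipeline.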

Contrary to the Wikipedia sessions example, I need to keep the complete element data, not only the key, in order to use it within a custom combiner function.

You should be able to write a CombineFn that counts the number of actions of each type, using a dictionary of counts as the accumulator. Then you can apply session windows to a collection keyed by user ID and use that combiner on it.

See the Beam programming guide section on CombineFns for ideas on how to write one.
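A minimal sketch of such a combiner is shown here. The class name `CountActionsFn` is hypothetical, and it assumes each value in the keyed collection is an already-parsed edit dict with an `'action'` field. The fallback base class is only there so the accumulator logic can be exercised without Beam installed; in a pipeline the class needs to subclass `beam.CombineFn`.

```python
try:
    import apache_beam as beam
    _CombineFnBase = beam.CombineFn
except ImportError:  # lets the accumulator logic run without Beam
    _CombineFnBase = object


class CountActionsFn(_CombineFnBase):
    """Counts actions per type, using a dict of counts as the accumulator."""

    def create_accumulator(self):
        # Start with no actions seen.
        return {}

    def add_input(self, accumulator, element):
        # Assumption: element is a parsed edit dict, e.g. {'action': 'a', ...}.
        action = element['action']
        accumulator[action] = accumulator.get(action, 0) + 1
        return accumulator

    def merge_accumulators(self, accumulators):
        # Sum the per-action counts of all partial accumulators.
        merged = {}
        for acc in accumulators:
            for action, count in acc.items():
                merged[action] = merged.get(action, 0) + count
        return merged

    def extract_output(self, accumulator):
        # The dict of counts is already the desired result.
        return accumulator
```

In a pipeline, one way to wire this up would be to map each edit to `beam.window.TimestampedValue((edit['username'], edit), edit['timestamp'])`, apply `beam.WindowInto(beam.window.Sessions(3600))`, and then `beam.CombinePerKey(CountActionsFn())`. Because the full edit dict travels as the value, the combiner sees the complete element data rather than only the key. Action types that never occur in a session are simply absent from the dict, so a formatting step would need something like `counts.get('b', 0)` to print the zero counts.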

