python 3.x - Collate a list of tuples into a dictionary of tuples of lists, where the length of the tuples is unknown?


[('a',), ('b',), ('a',)]  

produces

{'a': (), 'b': ()}

[('a', 1.0), ('b', 2.0), ('a', 3.0)]  

produces

{'a': ([1.0, 3.0],), 'b': ([2.0],)} 

[('a', 1.0, 0.1), ('b', 2.0, 0.2), ('a', 1.0, 0.3)] 

produces

{'a': ([1.0, 1.0], [0.1, 0.3]), 'b': ([2.0], [0.2])} 

[('a', 1.0, 0.1, 7), ('b', 2.0, 0.2, 8), ('a', 1.0, 0.3, 9)]  

produces

{'a': ([1.0, 1.0], [0.1, 0.3], [7, 9]), 'b': ([2.0], [0.2], [8])} 

I am new to Python. This is what I came up with:

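from collections import defaultdict

def collate(list_of_tuples):
    # Nothing to collate: return an empty mapping
    if len(list_of_tuples) == 0 or len(list_of_tuples[0]) == 0:
        return defaultdict(tuple)
    # One empty list per measurement type, sized from the first tuple
    d = defaultdict(lambda: tuple([] for _ in range(len(list_of_tuples[0]) - 1)))
    for t in list_of_tuples:
        d[t[0]]  # touch the key so it exists even with zero measurements
        for i, v in enumerate(t):
            if i > 0:
                d[t[0]][i - 1].append(v)
    return d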

In case you want to know the context, the list of tuples represents measurements. The first item in each tuple is the identification of the thing being measured. The subsequent items are different types of measurements of that thing. The things are measured in random order, each an unknown number of times. The function collates each thing's measurements for further processing. As the application evolves, different types of measurements will be added. When the number of types of measurements in the client code changes, I want the collate function not to have to change.

You can use itertools.groupby to group the items first using the letters, and then collect the measurements belonging to the same id using zip(*...) before adding them to the corresponding dictionary key:

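from itertools import groupby, islice
import operator

def collate(lst, f=operator.itemgetter(0)):
    d = {}
    # groupby only merges adjacent items, so sort by id first
    for k, g in groupby(sorted(lst, key=f), f):
        d[k] = ()
        # zip(*g) transposes the group; skip the first column (the ids)
        for v in islice(zip(*g), 1, None):
            d[k] += (list(v),)
    return d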
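To make the grouping and transposition steps easier to follow, here is a small walkthrough of the intermediate values for one id. It is only an illustrative sketch; the names rows, key, groups, columns and measurements are made up for this example and are not part of the answer's function.

```python
from itertools import groupby, islice
import operator

rows = [('a', 1.0, 0.1), ('b', 2.0, 0.2), ('a', 1.0, 0.3)]
key = operator.itemgetter(0)

# groupby only merges adjacent items, hence the sort by id first.
groups = {k: list(g) for k, g in groupby(sorted(rows, key=key), key)}
print(groups['a'])   # [('a', 1.0, 0.1), ('a', 1.0, 0.3)]

# zip(*group) transposes rows into columns; the first column holds the ids.
columns = list(zip(*groups['a']))
print(columns)       # [('a', 'a'), (1.0, 1.0), (0.1, 0.3)]

# Dropping the id column leaves one list per measurement type.
measurements = tuple(list(c) for c in islice(columns, 1, None))
print(measurements)  # ([1.0, 1.0], [0.1, 0.3])
```

Because the transposition works on whatever columns are present, nothing here depends on how many measurement types each tuple carries.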

Tests:

lst = [('a',), ('b',), ('a',)]
print(collate(lst))
# {'a': (), 'b': ()}

lst = [('a', 1.0), ('b', 2.0), ('a', 3.0)]
print(collate(lst))
# {'a': ([1.0, 3.0],), 'b': ([2.0],)}

lst = [('a', 1.0, 0.1, 7), ('b', 2.0, 0.2, 8), ('a', 1.0, 0.3, 9)]
print(collate(lst))
# {'a': ([1.0, 1.0], [0.1, 0.3], [7, 9]), 'b': ([2.0], [0.2], [8])}

I have avoided using defaultdict since, in the case of 0 measurements (i.e. [('a',), ('b',), ('a',)]), you would still need to explicitly set the key value; this defeats the purpose of that collection.
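As a side note on setting the key explicitly: if sorting is not wanted, a plain dict with dict.setdefault can do the same job in a single pass. This is an alternative sketch, not the approach above; the name collate_single_pass is hypothetical, and it assumes every tuple has at least one item (the id) and that all tuples for a given id have the same length.

```python
def collate_single_pass(rows):
    """Hypothetical single-pass variant using a plain dict (no sorting)."""
    d = {}
    for ident, *values in rows:
        # setdefault creates the key on first sight, even when there are
        # zero measurements, and returns the existing tuple of lists otherwise.
        columns = d.setdefault(ident, tuple([] for _ in values))
        for col, v in zip(columns, values):
            col.append(v)
    return d

print(collate_single_pass([('a',), ('b',), ('a',)]))
print(collate_single_pass([('a', 1.0, 0.1), ('b', 2.0, 0.2), ('a', 1.0, 0.3)]))
```

Keys appear in first-seen order here rather than sorted order, which may or may not matter for the later processing.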

In case you need to handle missing measurements, replace zip with itertools.zip_longest, and pass an explicit fillvalue to replace the default None.
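A minimal sketch of that zip_longest variant might look like the following. The function name collate_padded and the choice of 0.0 as the fill value in the example are assumptions made for illustration only.

```python
from itertools import groupby, islice, zip_longest
import operator

def collate_padded(lst, f=operator.itemgetter(0), fillvalue=None):
    """Hypothetical variant of collate that pads short tuples with fillvalue."""
    d = {}
    for k, g in groupby(sorted(lst, key=f), f):
        # zip_longest runs until the longest tuple in the group is exhausted,
        # filling the gaps left by shorter tuples with fillvalue.
        columns = islice(zip_longest(*g, fillvalue=fillvalue), 1, None)
        d[k] = tuple(list(col) for col in columns)
    return d

rows = [('a', 1.0, 0.1), ('b', 2.0, 0.2), ('a', 3.0)]
result = collate_padded(rows, fillvalue=0.0)
print(result)  # {'a': ([1.0, 3.0], [0.1, 0.0]), 'b': ([2.0], [0.2])}
```

Note that the padding happens per group: if every tuple for a given id is missing the same trailing measurement, that id simply ends up with fewer lists.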

