Python 3.x - Collate a list of tuples into a dictionary of tuples of lists, where the length of the tuples is unknown?
[('a',), ('b',), ('a',)]
produces
{'a': (), 'b': ()}
[('a', 1.0), ('b', 2.0), ('a', 3.0)]
produces
{'a': ([1.0, 3.0],), 'b': ([2.0],)}
[('a', 1.0, 0.1), ('b', 2.0, 0.2), ('a', 1.0, 0.3)]
produces
{'a': ([1.0, 1.0], [0.1, 0.3]), 'b': ([2.0], [0.2])}
[('a', 1.0, 0.1, 7), ('b', 2.0, 0.2, 8), ('a', 1.0, 0.3, 9)]
produces
{'a': ([1.0, 1.0], [0.1, 0.3], [7, 9]), 'b': ([2.0], [0.2], [8])}
I am new to Python, and this is what I came up with.
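    from collections import defaultdict

    def collate(list_of_tuples):
        if len(list_of_tuples) == 0 or len(list_of_tuples[0]) == 0:
            return defaultdict(tuple)
        # One empty list per measurement column (everything after the id).
        d = defaultdict(lambda: tuple([] for _ in range(len(list_of_tuples[0]) - 1)))
        for t in list_of_tuples:
            d[t[0]]  # touch the key so it exists even with no measurements
            for i, v in enumerate(t):
                if i > 0:
                    d[t[0]][i - 1].append(v)
        return d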
In case you want to know the context, the list of tuples represents measurements. The first item in each tuple is the identification of the thing being measured. The subsequent items are different types of measurements of that thing. Things are measured in random order, each an unknown number of times. The function collates each thing's measurements for further processing. As the application evolves, different types of measurements will be added. When the number of types of measurements in the client code changes, I want the collate function to not have to change.
You can use itertools.groupby to group the items first using the letters, and then collect the measurements belonging to the same id using zip(*...) before adding them to the corresponding dictionary key:
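    from itertools import groupby, islice
    import operator

    def collate(lst, f=operator.itemgetter(0)):
        d = {}
        # groupby only merges adjacent items, so sort by the id first.
        for k, g in groupby(sorted(lst, key=f), f):
            d[k] = ()
            # zip(*g) transposes the group; skip column 0, which is the id.
            for v in islice(zip(*g), 1, None):
                d[k] += (list(v),)
        return d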
Tests:
    lst = [('a',), ('b',), ('a',)]
    print(collate(lst))
    # {'a': (), 'b': ()}

    lst = [('a', 1.0), ('b', 2.0), ('a', 3.0)]
    print(collate(lst))
    # {'a': ([1.0, 3.0],), 'b': ([2.0],)}

    lst = [('a', 1.0, 0.1, 7), ('b', 2.0, 0.2, 8), ('a', 1.0, 0.3, 9)]
    print(collate(lst))
    # {'a': ([1.0, 1.0], [0.1, 0.3], [7, 9]), 'b': ([2.0], [0.2], [8])}
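To see what each step of the groupby approach contributes, the intermediate values can be inspected one at a time. This is only an illustrative sketch; the names rows, ordered, groups, columns_a and measurements_a are made up for the example and are not part of the answer's code.

```python
from itertools import groupby, islice
from operator import itemgetter

rows = [('a', 1.0, 0.1), ('b', 2.0, 0.2), ('a', 1.0, 0.3)]
key = itemgetter(0)

# sorted() is stable, so rows with the same id keep their original order.
ordered = sorted(rows, key=key)

# groupby only merges adjacent rows, which is why the sort comes first.
groups = {k: list(g) for k, g in groupby(ordered, key)}

# zip(*group) transposes rows into columns; column 0 is the id itself.
columns_a = list(zip(*groups['a']))

# islice(..., 1, None) skips the id column and keeps the measurements.
measurements_a = tuple(list(c) for c in islice(zip(*groups['a']), 1, None))

print(groups)
print(columns_a)
print(measurements_a)
```

Because the sort is stable, the measurements for each id end up in the same relative order in which they appeared in the input.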
I have avoided using defaultdict since in the case of 0 measurements (i.e. [('a',), ('b',), ('a',)]) you still need to explicitly set the key's value; that defeats the purpose of the collection.
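For comparison, here is a rough sketch of what a defaultdict version might look like. The helper name collate_defaultdict is hypothetical, and the sketch assumes every tuple is non-empty and has the same length as the first one. Note that the key has to be looked up explicitly so that ids with no measurements still get an entry, which is the drawback mentioned above.

```python
from collections import defaultdict

def collate_defaultdict(rows):
    # Hypothetical helper; assumes all tuples share the first tuple's length.
    width = len(rows[0]) - 1 if rows else 0
    d = defaultdict(lambda: tuple([] for _ in range(width)))
    for key, *values in rows:
        # The lookup itself creates the entry, even when width == 0.
        slots = d[key]
        for slot, value in zip(slots, values):
            slot.append(value)
    return d

print(dict(collate_defaultdict([('a',), ('b',), ('a',)])))
print(dict(collate_defaultdict([('a', 1.0), ('b', 2.0), ('a', 3.0)])))
```

Without the d[key] lookup, the zero-measurement input would produce an empty dictionary, because nothing else would ever touch the keys.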
In case you need to handle missing measurements, replace zip with itertools.zip_longest, and pass an explicit fillvalue to replace the default None.
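A minimal sketch of that variant could look like the following. The function name collate_padded and its fillvalue parameter are assumptions made for illustration. Padding happens within each id's group only, so different ids can still end up with different numbers of lists, and a short tuple is always padded at its end.

```python
from itertools import groupby, islice, zip_longest
from operator import itemgetter

def collate_padded(lst, f=itemgetter(0), fillvalue=None):
    # Hypothetical variant of collate that tolerates tuples of unequal length.
    d = {}
    for k, g in groupby(sorted(lst, key=f), f):
        # zip_longest pads the shorter tuples in this group with fillvalue.
        columns = zip_longest(*g, fillvalue=fillvalue)
        # Skip column 0 (the id) and keep the measurement columns.
        d[k] = tuple(list(v) for v in islice(columns, 1, None))
    return d

lst = [('a', 1.0, 0.1), ('b', 2.0), ('a', 3.0)]
print(collate_padded(lst))
print(collate_padded(lst, fillvalue=0.0))
```

Here the second 'a' tuple lacks its last measurement, so that slot is filled with None by default or with 0.0 when fillvalue=0.0 is passed.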