Faking a tuple class in python -
i need have new type, mytuple
, can create objects this: obj = mytuple((1,2,3))
, such as:
obj
behaves native tuple (also in performance)isinstance(obj, tuple)
returnsfalse
.
the reason behind need use tuples indexes in pandas, when pandas detects values of index tuples, uses multiindexes instead, don't want.
thus, following not work:
class mytuple(tuple): pass
this fulfills first requirement not second one, if use mytuple
objects indexes, pandas still creates multiindexes them.
another solution use composition instead of inheritance, implementing sequence
abc , having true tuple object attribute, providing wrapper methods around it:
from collections.abc import sequence class mytuple(sequence): def __init__(self, initlist=none): self.data = () # true tuple stored in object if initlist not none: if type(initlist) == type(self.data): self.data = initlist elif isinstance(initlist, mytuple): self.data = initlist.data else: self.data = tuple(initlist) def __getitem__(self, i): return self.data[i] def __len__(self): return len(self.data) def __hash__(self): return hash(self.data) def __repr__(self): return repr(self.data) def __eq__(self, other): return self.data == other def __iter__(self): yield self.data.__iter__()
this type fulfills second requirement (isinstance(obj, tuple)
returns false
), , provides same interface true tuple
(you can access elements via indexes, can compare tuples, can use dictionary keys, etc). syntactically , semantically solution me.
however not true tuple
in terms of performance. in application have perform tons of comparisons betweens these objects (and of these objects true tuple
s), method mytuple.__eq__()
called tons of times. introduces performance penalty. using mytuple
instead of true tuples, program multiplies 6 runtime.
then, need first attempt (a class inherits tuple
), later can "lie" being tuple, if asked via isinstance()
(because how pandas finds out if tuple , should create multiindex).
i read python's datamodel , __instancecheck__()
methods, think not useful here, because should implement methods in tuple
, instead of mytuple
, not possible.
perhaps tricks metaclasses it, not understand concept see relationship problem.
can achieve goals somehow?
class mytuple(object): def __init__(self, iterable): self.data = tuple(iterable) def __getitem__(self, i): return tuple.__getitem__(self.data, i) t = mytuple((1, 2, 3)) print(t[1]) print(isinstance(t, tuple))
other methods analogously.
still not true tuple
performancewise, closest can think of... probably.
Comments
Post a Comment