Class instantiation performance

I'm writing a GFF3 file parser. I have a class which looks like

    class GFF3Feature(object):
        def __init__(self, seqid, source, type, start, end,
                     score, strand, phase, attributes):
            self.seqid = seqid
            self.source = source
            self.type = type
            self.start = start
            self.end = end
            self.score = score
            self.strand = strand
            self.phase = phase
            self.attributes = attributes

I need to create a lot of these. What's the fastest way to instantiate them? I know a few ways. My constraint is that the normal (public) interface must have a constructor which requires all parameters and does not accept any others. Here are the ones I tested:

    # The normal way
    class Feature1(object):
        def __init__(self, seqid, source, type, start, end,
                     score, strand, phase, attributes):
            self.seqid = seqid
            self.source = source
            self.type = type
            self.start = start
            self.end = end
            self.score = score
            self.strand = strand
            self.phase = phase
            self.attributes = attributes

    # Copy from locals. Learned this one from a posting by Alex Martelli.
    class Feature2(object):
        def __init__(self, seqid, source, type, start, end,
                     score, strand, phase, attributes):
            self.__dict__.update(locals())
            del self.self

    # Bypass normal creation and assume the input is correct
    class Feature3(object):
        def __init__(self, **kwargs):
            self.__dict__.update(kwargs)
            self.__class__ = Feature1

    # __slots__ might give better performance
    class Feature4(object):
        __slots__ = ["seqid", "source", "type", "start", "end",
                     "score", "strand", "phase", "attributes"]
        def __init__(self, seqid, source, type, start, end,
                     score, strand, phase, attributes):
            self.seqid = seqid
            self.source = source
            self.type = type
            self.start = start
            self.end = end
            self.score = score
            self.strand = strand
            self.phase = phase
            self.attributes = attributes

A bit of timing harness

    import time

    # average and stddev
    def stats(data):
        sum = 0.0
        sum2 = 0.0
        for x in data:
            sum += x
            sum2 += x*x
        N = len(data)
        return sum/N, ((sum2-sum*sum/N)/(N-1))**0.5

    def create(cls, x):
        for _ in x:
            a = cls(seqid="qwe", source="wert", type="exon",
                    start=100, end=1000, score=0.1, strand="+",
                    phase=0, attributes={})
        return a

    def main():
        clses = (Feature1, Feature2, Feature3, Feature4)
        tot_times = [[] for cls in clses]
        x = range(40000)
        for i in range(7):
            for cls, tot_time in zip(clses, tot_times):
                t1 = time.time()
                a = create(cls, x)
                t2 = time.time()
                assert a.end == 1000
                tot_time.append(t2-t1)

        for cls, tot_time in zip(clses, tot_times):
            avg, stddev = stats(tot_time)
            tot_time.sort()
            print("%s min= %s avg= %s stddev= %s" %
                  (cls.__name__, min(tot_time), avg, stddev))

    if __name__ == "__main__":
        main()

The results

    Feature1 min= 1.03674292564 avg= 1.09673094749 stddev= 0.0599208818078
    Feature2 min= 1.18326091766 avg= 1.28605253356 stddev= 0.13197632388
    Feature3 min= 0.890069961548 avg= 0.955237865448 stddev= 0.0523913440787
    Feature4 min= 0.908952951431 avg= 0.954915148871 stddev= 0.0310191453362

Looks like my best solution is Feature3, which uses a private interface to build the instance, then does an in-place change to the public class.
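To make the Feature3 trick a little more concrete, here is a minimal sketch of building an instance through an unchecked private class and then re-labelling it as the public one. The names PublicFeature and _FastFeature are hypothetical, and the field list is cut down to three to keep it short. It also shows the cost of the shortcut: nothing checks the keyword names.

```python
class PublicFeature(object):
    """Hypothetical public class: the constructor requires every field."""
    def __init__(self, seqid, start, end):
        self.seqid = seqid
        self.start = start
        self.end = end


class _FastFeature(object):
    """Hypothetical private builder: copies keyword arguments unchecked."""
    def __init__(self, **kwargs):
        self.__dict__.update(kwargs)
        # Re-label the instance so callers see the public class.
        self.__class__ = PublicFeature


good = _FastFeature(seqid="qwe", start=100, end=1000)
print(type(good).__name__, good.end)

# Nothing validates the keywords, so a misspelled field slips through
# and the resulting "PublicFeature" has no .end attribute at all.
bad = _FastFeature(seqid="qwe", start=100, ennd=1000)
print(isinstance(bad, PublicFeature), hasattr(bad, "end"))
```

That is why this only makes sense behind a private interface, where the caller (here, the parser itself) can be trusted to pass exactly the right keywords.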

That would be wrong though. Python classes have extra overhead with keyword arguments vs. positional ones. I'll change the code a bit to redo the Feature1, Feature2 and Feature4 tests with positional arguments. (The Feature3 approach requires keyword arguments.)

    def create(cls, x):
        for _ in x:
            a = cls("qwe", "wert", "exon", 100, 1000, 0.1, "+", 0, {})
        return a

The times are:

    Feature1 min= 0.485749006271 avg= 0.498355899538 stddev= 0.00987560013671
    Feature2 min= 0.613602876663 avg= 0.632926872798 stddev= 0.0336558923515
    Feature4 min= 0.344656944275 avg= 0.351382800511 stddev= 0.00944194461424

Calling Python with positional parameters is faster (by a factor of 2!) than with keyword arguments.
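The same positional-versus-keyword comparison can be sketched with the standard timeit module instead of a hand-rolled loop. This is only an illustration under some assumptions: the three-field Feature class and the time_call helper are made up for the example, and the globals argument to timeit.Timer needs Python 3.5 or later. The ratio you get will depend on the interpreter version, so don't expect the factor of 2 measured above to carry over.

```python
import timeit


class Feature(object):
    # Hypothetical three-field stand-in for the nine-field classes.
    def __init__(self, seqid, start, end):
        self.seqid = seqid
        self.start = start
        self.end = end


def time_call(stmt, number=100000, repeat=3):
    """Return the best time in seconds over `repeat` runs of `number` calls."""
    timer = timeit.Timer(stmt, globals={"Feature": Feature})
    return min(timer.repeat(repeat=repeat, number=number))


positional = time_call('Feature("qwe", 100, 1000)')
keyword = time_call('Feature(seqid="qwe", start=100, end=1000)')
print("positional: %.4f s  keyword: %.4f s" % (positional, keyword))
```

Taking the minimum over the repeats follows the usual timeit advice: the slower runs mostly measure interference from the rest of the system.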

The fastest approach uses __slots__. Guido once wrote:

__slots__ is a terrible hack with nasty, hard-to-fathom side effects that should only be used by programmers at grandmaster and wizard levels. Unfortunately it has gained an enormous undeserved popularity amongst the novices and apprentices, who should know better than to use this magic incantation casually.

Am I a grandmaster? I've never written a metaclass. My view has been to shy away from cute code and stay with obviously readable code.

There is very little in the way of guidance for when to use __slots__. There is guidance on when not to use it, e.g., to imply that a class is final/closed or to catch accidental typos in instance assignment. Here's another quote, this one from Alex Martelli:

__slots__ is, essentially, an optimization (in terms of memory consumption) intended for classes that are likely to be created in very large numbers: you give up the normal flexibility of being able to add arbitrary attributes to an instance, and what you gain in return is that you save one dictionary per instance (dozens and dozens of bytes per instance -- that will, of course, not be all that relevant unless the number of instances that are around is at the very least in the tens of thousands).

Alexander Schmolck said:

I'd recommend against using __slots__ (unless really needed for optimization), because it cripples the usefulness and generality of your classes to your and also *to other people's* code (weakrefs, pickling, reflection -- all compromised). The first two you could fix (but are unlikely to, unless you *yourself* need to use weakrefs or pickling of instances of those classes), the last one you can't.

The fact that someone's code using __slots__ makes other peoples code less useful (because things that work fine for instances without slots suddenly screw up) makes me think introducing __slots__ was a rather bad idea to start with.

Going back to Alex:

So my advice remains: consider __slots__ for classes which may need to be instantiated a HUGE number of times, if the restrictions that __slots__ imposes can be lived with. Like for all optimizations, you'll probably want to measure things, at some level, BEFORE you optimize.
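Two of the restrictions mentioned in these quotes are easy to see directly. The following is a small sketch, assuming Python 3; the classes Plain and Slotted and the raises helper are hypothetical names made up for the illustration. It covers only attribute assignment and weak references, not pickling.

```python
import weakref


class Plain(object):
    def __init__(self, start, end):
        self.start = start
        self.end = end


class Slotted(object):
    __slots__ = ["start", "end"]

    def __init__(self, start, end):
        self.start = start
        self.end = end


def raises(exc_type, func, *args):
    """Return True if func(*args) raises exc_type, otherwise False."""
    try:
        func(*args)
    except exc_type:
        return True
    return False


plain = Plain(100, 1000)
slotted = Slotted(100, 1000)

# The slotted instance has no per-instance dictionary ...
print(hasattr(plain, "__dict__"), hasattr(slotted, "__dict__"))

# ... so it cannot pick up attributes that are not listed in __slots__.
print(raises(AttributeError, setattr, plain, "note", "x"),
      raises(AttributeError, setattr, slotted, "note", "x"))

# Weak references also fail unless "__weakref__" is added to __slots__.
print(raises(TypeError, weakref.ref, plain),
      raises(TypeError, weakref.ref, slotted))
```

Each line prints a pair of booleans, first for the plain instance and then for the slotted one.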

My view is that this is a simple data container class. There is no reason anyone will inherit from it. It's more work to inherit from this class than it is to make a new class with the same duck-type interface, such that no one can tell the difference. The problems with __slots__ should not arise. The 30% performance boost, at least in micro timings, is significant. I implemented the code, and using __slots__ speeds up my overall timings by about 5%.

I think in this case it's worthwhile.

Andrew Dalke is an independent consultant focusing on software development for computational chemistry and biology.