Addendum to [an Upcoming Post]

In [an upcoming post that will be linked here once available], I will make a comment towards the beginning about how Python “written for performance” is “Python that is very straightforward and does not use many of its features”. It may seem odd to post an addendum prior to the post it is an addendum too, but, let’s just say, I’ve learned a bit about how the Internet reads things over the years, and I both want this out of the main flow of that post, yet, available immediately when I post it.

Since that post is not about Python, it did not fit the flow to dig into it, but I expect this will raise some eyebrows so I wanted to substantiate what I mean by that.

Consider:

class X: pass

value = X()
print("Setting an attribute directly on an object(): ", 
      end = "")

def setval():
    global value
    # Deliberately using an interned constant for 
    # maximum speed:
    # https://www.codesansar.com/python-programming/integer-interning.htm
    value.x = 1

print(timeit.timeit(setval, number=1000000))

This uses the timeit Python standard library module to time how long it takes to set an attribute on a very simple instance a million times.

Your machine may vary, but the machine I am running this on prints 0.032. (And many more digits, since it’s a float, but that’s good enough.)

That is a very simple case. I’m not going to go running around Python trying to see if there’s some incrementally faster similar operation, because I think all Python programmers can agree this is a very basic operation that one performs in Python frequently.

Now consider:

values = {}
def recorder(func):
    def wrapper(self, value):
        values[value] = True
        return func(self, value)
    return wrapper

class Superclass:
      @recorder
      def _set_prop(self, value):
          self._prop = value

      def _get_prop(self):
          return self._prop

      prop = property(fget = _get_prop, fset = _set_prop)

class Subclass1(Superclass): pass
class Subclass2(Subclass1): pass
class Subclass3(Subclass2): pass

val2 = Subclass3()
def setval2():
    global val2
    val2.prop = 1

print ("Setting an attribute on a third-level subclass with a decorated property: ", end = "")
print(timeit.timeit(setval2, number=1000000))

Instead of setting a property on an object directly, this sets a property:

  • On a subclass…
  • of a subclass…
  • of a subclass…
  • that has a property setter…
  • which has been wrapped with a decorator that does additional work.

For the same number of iterations, the result on my computer is 0.147, or 4.6 time the original time.

Python has a lot of convenient features, but if you are doing “high performance” Python it’s important to understand that you shouldn’t use the timings for “how long it takes to directly set an attribute” if you are not in fact directly setting an attribute. Python makes it easy and convenient to stack up a lot of features on top of each other without considering that you may be adding a lot of little multiplicative factors as you use those features.

As in the discussion in A Definition of Magic, this is neither a good nor a bad thing on its own. The goodness or badness depends on whether you get adequate benefits from the costs, which requires exactly the sort of details I’m completely glossing over here. I am not saying this is good or bad, I’m just saying that if you are programming in Python and have some reason to care about the performance of some code it is important to use the correct cost model.

By contrast, the Go code

package main

import "testing"

type IntStruct struct {
	I int
}

func BenchmarkMillionSets(b *testing.B) {
	for range b.N {
		is := IntStruct{}
		for range 1000000 {
			is.I = 1
		}
	}
}

runs in 248 microseconds on this system, or in the same format as the previous numbers, 0.000248 seconds.

Before anyone objects, this is definitely a microbenchmark, and one extremely subject to all the foibles of that genre. You certainly would not be justified in dividing 0.032 by 0.000248 and declaring Go is faster than Python by 130x. The general gap is smaller than that. For running the “same code” Python would generally be more like 20-40x slower as a very rough rule of thumb.

However, what this post is substantiating is that it is very easy to not be “running the same code” and not realize how much code your Python interpreter is actually executing even if your statement superficially looks like x.y = 1. Python may “generally” only be slower than Go by 20-40x but it is very easy in Python to specifically write code in Python that is hundreds or thousands of times slower for the “same thing”.