In an ideal world, you’d be right and some of the inconsistencies you found would be wrong. However, CPython has optimized some scenarios, specifically function locals. These optimizations, together with how the compiler and evaluation loop interact and historical precedent, lead to the confusion.
Python translates code to bytecodes, and those are then interpreted by an interpreter loop. The ‘regular’ opcode for accessing a name is LOAD_NAME, which looks up a variable name as you would in a dictionary. LOAD_NAME will first look up a name as a local, and if that fails, looks for a global (and then a built-in). LOAD_NAME throws a NameError exception when the name is not found.
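As a rough illustration (written for Python 3, where print is a function; exact opcode listings differ between CPython versions), you can compile a one-line module and inspect it with the dis module. The names module_code, module_ops and lookup_error are only there for this sketch:

```python
import dis

# Compile a single statement as module-level (non-function) code.
module_code = compile("print(x)", "<example>", "exec")

# Collect the opcode names the compiler emitted for it.
module_ops = [ins.opname for ins in dis.get_instructions(module_code)]

# Run it with an empty namespace: `print` is found among the built-ins,
# but `x` is found nowhere, so the lookup fails with NameError.
lookup_error = None
try:
    exec(module_code, {})
except NameError as exc:
    lookup_error = exc
```

Printing module_ops should show LOAD_NAME being used for both print and x.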
For nested scopes, looking up names outside of the current scope is implemented using closures; if a name is not assigned to but is available in an enclosing (not global) scope, then such values are handled as a closure. This is needed because a parent scope can hold different values for a given name at different times; two calls to a parent function can lead to different closure values. So Python has LOAD_CLOSURE, MAKE_CLOSURE and LOAD_DEREF opcodes for that situation; the first two opcodes are used in loading and creating a closure for a nested scope, and LOAD_DEREF will load the closed-over value when the nested scope needs it.
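A small sketch of that situation, with a hypothetical make_adder() function that is not part of your examples: two calls to the parent produce two closures holding different values for the same name, and the nested function reads that name with LOAD_DEREF. The opcodes used to build the closure differ between CPython versions, so only LOAD_DEREF is inspected here.

```python
import dis

def make_adder(n):
    # `n` is local to make_adder but only read inside add(),
    # so add() reaches it through a closure cell.
    def add(value):
        return value + n
    return add

# Two calls to the parent give two independent closure values.
add_two = make_adder(2)
add_ten = make_adder(10)

# Opcode names used by the nested function.
add_ops = {ins.opname for ins in dis.get_instructions(add_two)}

# The closed-over value lives in a cell attached to the function object.
cell_value = add_two.__closure__[0].cell_contents
```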
Now, LOAD_NAME is relatively slow; it will consult two dictionaries, which means it has to hash the key first and run a few equality tests (if the name wasn’t interned). If the name isn’t local, then it has to do this again for a global. For functions that can potentially be called tens of thousands of times, this can get tedious fast. So function locals have special opcodes. Loading a local name is implemented by LOAD_FAST, which looks up local variables by index in a special local names array. This is much faster, but it does require that the compiler first determines whether a name is a local and not a global. To still be able to look up global names, another opcode, LOAD_GLOBAL, is used. The compiler explicitly optimizes for this case to generate the special opcodes. LOAD_FAST will throw an UnboundLocalError exception when there is not yet a value for the name.
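To see the split between locals and globals, here is a sketch using a made-up clamp() function and a module-level limit (both are assumptions for illustration). Newer CPython releases sometimes emit variants such as LOAD_FAST_CHECK instead of plain LOAD_FAST, so the check below matches on the prefix rather than the exact name.

```python
import dis

limit = 10  # a module-level (global) name

def clamp(value):
    # `value` and `result` are locals; `min` and `limit` are not
    # assigned in this function, so the compiler treats them as globals.
    result = min(value, limit)
    return result

clamp_ops = {ins.opname for ins in dis.get_instructions(clamp)}

# Locals are loaded by index with the LOAD_FAST family of opcodes.
uses_fast = any(name.startswith("LOAD_FAST") for name in clamp_ops)
# Globals (and built-ins) are loaded with LOAD_GLOBAL.
uses_global = "LOAD_GLOBAL" in clamp_ops
# The generic dictionary-based LOAD_NAME is not used in a function body.
uses_name = "LOAD_NAME" in clamp_ops

# The indexed local slots are visible on the code object.
local_slots = clamp.__code__.co_varnames
```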
Class definition bodies on the other hand, although they are treated much like a function, do not get this optimization step. Class definitions are not meant to be called all that often; most modules create classes once, when imported. Class scopes don’t count when nesting either, so the rules are simpler. As a result, class definition bodies do not act like functions when you start mixing scopes up a little.
So, for non-function scopes, LOAD_NAME and LOAD_DEREF are used for locals and globals, and for closures, respectively. For functions, LOAD_FAST, LOAD_GLOBAL and LOAD_DEREF are used instead.
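You can check the difference between a class body and a method by compiling a small module and walking its nested code objects. The SOURCE string and the opnames_by_scope() helper below are hypothetical, written only for this sketch:

```python
import dis
import types

SOURCE = '''
x = "global"
class A:
    y = x            # class body: not an optimized function scope
    def f(self):
        return x     # function body: optimized
'''

def opnames_by_scope(code, found=None):
    """Map each code object's name to the set of opcode names it uses."""
    if found is None:
        found = {}
    found[code.co_name] = {ins.opname for ins in dis.get_instructions(code)}
    # Nested class and function bodies are stored as constants.
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            opnames_by_scope(const, found)
    return found

ops = opnames_by_scope(compile(SOURCE, "<example>", "exec"))
```

Here ops["A"] holds the opcodes for the class body and ops["f"] those for the method; the former looks up x with LOAD_NAME, the latter with LOAD_GLOBAL.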
Note that class bodies are executed as soon as Python executes the class line! So in example 1, class B inside class A is executed as soon as class A is executed, which is when you import the module. In example 2, C is not executed until f() is called, not before.
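A quick way to convince yourself of the execution order is to record when each class body runs. The events list and the class and function names here are stand-ins for this sketch, not your original code:

```python
events = []

class A:
    events.append("body of A")
    class B:
        # Runs while the body of A is being executed.
        events.append("body of A.B")

def f():
    class C:
        # Runs only when f() is actually called.
        events.append("body of C")
    return C

events_before_call = list(events)
f()
events_after_call = list(events)
```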
Let’s walk through your examples:
1. You have nested a class A.B in a class A. Class bodies do not form nested scopes, so even though the A.B class body is executed when class A is executed, the compiler will use LOAD_NAME to look up x. A.B().f() is a function (bound to the B() instance as a method), so it uses LOAD_GLOBAL to load x. We’ll ignore attribute access here; that’s a very well defined name pattern.

2. Here f().C.z is at class scope, so the function f().C().g() will skip the C scope and look at the f() scope instead, using LOAD_DEREF.

3. Here var was determined to be a local by the compiler because you assign to it within the scope. Functions are optimized, so LOAD_FAST is used to look up the local and an exception is thrown.

4. Now things get a little weird. class A is executed at class scope, so LOAD_NAME is being used. A.x was deleted from the locals dictionary for the scope, so the second access to x results in the global x being found instead; LOAD_NAME looked for a local first and didn’t find it there, falling back to the global lookup.

   Yes, this appears inconsistent with the documentation. Python-the-language and CPython-the-implementation are clashing a little here. You are, however, pushing the boundaries of what is possible and practical in a dynamic language; checking if x should have been a local in LOAD_NAME would be possible but takes precious execution time for a corner case that most developers will never run into.

5. Now you are confusing the compiler. You used x = x in the class scope, and thus you are setting a local from a name outside of the scope. The compiler finds x is a local here (you assign to it), so it never considers that it could also be a scoped name. The compiler uses LOAD_NAME for all references to x in this scope, because this is not an optimized function body.

   When executing the class definition, x = x first requires you to look up x, so it uses LOAD_NAME to do so. No local x is defined yet, so LOAD_NAME doesn’t find a local and the global x is found. The resulting value is stored as a local, which happens to be named x as well. print x uses LOAD_NAME again, and now finds the new local x value.

6. Here you did not confuse the compiler. You are creating a local y; x is not local, so the compiler recognizes it as a scoped name from parent function f2().myfunc(). x is looked up with LOAD_DEREF from the closure, and stored in y.
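Cases 4, 5 and 6 can be reproduced with a condensed sketch. This is a Python 3 reconstruction based on the descriptions above, not your exact code; the attribute names after_del and seen, and the function name f3, are assumptions added so the results can be inspected:

```python
x = "xtop"

# Case 4: a class-level name is deleted, then looked up again.
class A:
    x = "xclass"
    del x
    after_del = x   # LOAD_NAME finds no local, falls back to the global

# Case 5: `x = x` in a class body nested in a function.
def f2():
    x = "xlocal"
    class C:
        x = x       # x is assigned here, so it is a class local;
                    # the lookup skips f2's x and finds the global
        seen = x    # now finds the class-local x
    return C.seen

# Case 6: the class body only reads x, so it is taken from the closure.
def f3():
    x = "xlocal"
    class C:
        y = x
    return C.y
```

With these definitions, A.after_del and f2() should both give "xtop", while f3() gives "xlocal".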
You could see the confusion between 5 and 6 as a bug, albeit one that is not worth fixing in my opinion. It was certainly filed as such; see issue 532860 in the Python bug tracker, which has been there for over 10 years now.
The compiler could check for a scoped name x even when x is also a local, for that first assignment in example 5. Or LOAD_NAME could check if the name is really meant to be a local, and throw an UnboundLocalError if no local was found, at a cost in performance. Had this been in a function scope, LOAD_FAST would have been used for example 5, and an UnboundLocalError would be thrown immediately.
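For comparison, here is the same x = x assignment placed in a function body; the function name f and the error variable are just for this sketch:

```python
x = "xtop"

def f():
    # x is assigned in this function, so the compiler marks it as a
    # fast local; reading it before the assignment completes fails.
    x = x
    return x

error = None
try:
    f()
except UnboundLocalError as exc:
    error = exc
```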
However, as the referenced bug shows, for historical reasons the behaviour is retained. There probably is code out there today that’ll break were this bug fixed.