Python tips & tricks
Recently I read the book Learning Python, 5th Edition by Mark Lutz. Here is a list of the insights I found most interesting.
-
set generation:
{x for x in [1,2]} set(x for x in [1,2]) assert set(x for x in [1,2]) == {x for x in [1,2]}
-
dict generation:
{x:x**2 for x in [1,2]} dict((x, x**2) for x in [1,2]) assert {x:x**2 for x in [1,2]} == dict((x, x**2) for x in [1,2])
-
division of integers
In python 3 division of integers returns float
>>> 1 / 2 0.5 >>> - 1 / 2 -0.5
In python 2 division of integers rounds down to the floor, it does not truncate toward zero
>>> 1 / 2 # 0.5 round floor -> 0 0 >>> - 1 / 2 # -0.5 round floor -> -1 (not 0) -1
In python 2 and 3 the // operator is floor division
>>> 1 // 2 0 >>> - 1 // 2 -1 >>> 13 // 2.0 6.0
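A minimal python 3 sketch of the difference between flooring and truncating. The variable names are only illustrative.

```python
import math

# Floor division rounds toward negative infinity.
floored = -1 // 2

# Truncation rounds toward zero.
truncated = math.trunc(-1 / 2)

# divmod agrees with floor division: a == q * b + r
q, r = divmod(-7, 2)
```

Here floored is -1 while truncated is 0, and divmod(-7, 2) gives (-4, 1).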
-
is - checks that variables point to the same object (same address), == - checks that variables have equal values
-
python 3: [1, 'spam'].sort() raises TypeError (int and str cannot be ordered)
-
python 3: dict().keys() returns a view object linked to the dict (not a list, and not a one-shot iterator). It is a set-like object, so we can apply set operations to it (union and so on)
>>> dict(a=1, b=2).keys() dict_keys(['b', 'a']) >>> dict(a=1, b=2).keys() | {'c', 'd'} {'b', 'd', 'a', 'c'}
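A small python 3 sketch of what "linked to dict" means in practice. The names d, keys, live and common are only illustrative.

```python
d = {"a": 1, "b": 2}
keys = d.keys()              # a view, not a copy

d["c"] = 3                   # the view reflects later changes to the dict
live = sorted(keys)

common = keys & {"a", "z"}   # set operations on a view return a plain set
```

After the insert, live contains 'a', 'b' and 'c', and common is the ordinary set {'a'}.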
-
frozenset - immutable set, hashable, can be used as key in dict
>>> fz = frozenset([1,2]) >>> fz.add(3) AttributeError: 'frozenset' object has no attribute 'add' >>> {fz: 5} {frozenset([1, 2]): 5}
-
lists support comparison operators: ==, <, >, <=, >=. List comparison is similar to string comparison (element by element, left to right). In py3 the compared elements must be of mutually orderable types
>>> [1, 2] == [1, 2] True >>> [2, 2] > [1, 2] True >>> [1] > ['sh'] # python2 False >>> [1] > ['sh'] # python3 TypeError: unorderable types: int() > str()
-
dict compare
python 2 and 3
>>> dict(a=1) == dict(a=1) True
python 2 only
>>> dict(a=3) > dict(a=2) True >>> dict(a=3) > dict(a=2, b=1) False
-
list + string and list + tuple are forbidden, but list += string is allowed (+= accepts any iterable, like extend)
>>> L = [] >>> L + 'spam' TypeError: can only concatenate list (not "str") to list >>> L = [] >>> L += 'spam' >>> L ['s', 'p', 'a', 'm']
-
L += a is faster than L = L + a.
-
L += [1,2] is in place modification! (new list is not created)
>>> L = [] >>> id(L) 4368997048 >>> L += [1,2] >>> id(L) 4368997048 >>> L = L + [1,2] >>> id(L) 4368996976
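The practical consequence shows up with aliases. A minimal sketch, with illustrative names only:

```python
a = [1]
alias = a                      # second name for the same list

a += [2]                       # in place: alias sees the change
same_after_iadd = alias is a

snapshot = list(alias)         # copy of the shared list at this point

a = a + [3]                    # builds a new list and rebinds only a
same_after_add = alias is a
```

After the last line alias still holds [1, 2], while a is a new list [1, 2, 3].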
-
'spam'[0][0][0] can go on forever: every indexing step returns the single-character string 's' again
-
variables unpack in python 3
>>> a, *b = 'spam' >>> a 's' >>> b ['p', 'a', 'm'] >>> *a, b = 'spam' >>> a ['s', 'p', 'a'] >>> b 'm' >>> a, *b, c = 'spam' >>> a 's' >>> b ['p', 'a'] >>> c 'm'
-
python 2: True = 0 is allowed, but not in python 3
python 2
>>> True = 0 >>> True 0
python 3
>>> True = 0 SyntaxError: can't assign to keyword
-
sys.stdout = open('temp.txt', 'w') - all subsequent prints go to the file temp.txt
-
and, or return one of their operand objects, not True/False
-
while has an else clause (it runs when the loop ends without break)
-
python 3: ... (Ellipsis) can be used as a placeholder statement, like pass
-
reversed works with sequences (for example lists), not with generators
>>> reversed([1,2,3]) <list_reverseiterator object at 0x10127c550> >>> reversed((x for x in [1,2,3])) TypeError: argument to reversed() must be a sequence
-
zip iterates until the shortest sequence is exhausted
>>> [x for x in zip([1,2,3], [4,5])] [(1, 4), (2, 5)]
-
python 2: map(None, s1, s2) is similar to zip, but iterates until the longest sequence is exhausted, inserting None for elements without a pair.
python 2
>>> map(None, [1,2,3], [4,5]) [(1, 4), (2, 5), (3, None)] >>> map(None, [1,2], [4,5,6]) [(1, 4), (2, 5), (None, 6)]
python 3
>>> list(map(None, [1,2,3], [4,5])) TypeError: 'NoneType' object is not callable
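In python 3 the padding behaviour is available from the standard library instead. A short sketch using itertools.zip_longest (the variable names are illustrative):

```python
from itertools import zip_longest

# Pads the shorter input with None by default.
pairs = list(zip_longest([1, 2, 3], [4, 5]))

# A different padding value can be chosen with fillvalue.
padded = list(zip_longest([1, 2], [4, 5, 6], fillvalue=0))
```

pairs matches the python 2 map(None, ...) result shown above, and padded uses 0 instead of None for the missing element.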
-
map can take more than one iterable (similar to zip)
python 2
>>> map(lambda x, y: (x, y), [1,2], [3,4]) [(1, 3), (2, 4)] >>> map(lambda x, y: (x, y), [1,2], [3,4,5]) [(1, 3), (2, 4), (None, 5)]
python 3
>>> list(map(lambda x, y: (x, y), [1,2], [3,4])) [(1, 3), (2, 4)] >>> list(map(lambda x, y: (x, y), [1,2], [3,4,5])) [(1, 3), (2, 4)]
-
nested list comprehensions
>>> [x+y for x in 'abc' for y in 'lmn']
['al', 'am', 'an', 'bl', 'bm', 'bn', 'cl', 'cm', 'cn']
# flatten a list of lists
>>> csv = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
>>> [col for row in csv for col in row]
[1, 2, 3, 4, 5, 6, 7, 8, 9]
-
sorted returns list (not generator) in py2 and py3
>>> sorted(x for x in [2,1,3]) [1, 2, 3]
-
*args accepts any iterable, not only a list
-
unzip: zip(*zip(a,b))
>>> zip(*zip([1,2],[3,4])) [(1, 2), (3, 4)]
-
py3: map returns an iterator, it can be iterated only once
>>> m = map(lambda x: x, [1,2,3]) >>> [x for x in m] [1, 2, 3] >>> [x for x in m] []
-
py3: range is not a simple generator, it supports len() and index access
>>> r = range(10) >>> r range(0, 10) >>> len(r) 10 >>> r[3] 3
-
generator allows only single scan
-
circular imports work! But only for import without from
-
try has else, it is executed when no exception happened
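A small sketch of how the clauses interact. parse_int and calls are hypothetical names made up for this example.

```python
calls = []

def parse_int(text):
    """Hypothetical helper: return (value, status) for a string."""
    try:
        value = int(text)
    except ValueError:
        return None, "invalid"
    else:
        # Runs only when the try body raised nothing.
        return value, "ok"
    finally:
        # Runs in every case, before the function actually returns.
        calls.append(text)

good = parse_int("42")
bad = parse_int("x")
```

good is (42, 'ok'), bad is (None, 'invalid'), and the finally block recorded both inputs.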
-
with is similar to finally: the exit action runs whether or not an exception occurred
-
except (name1, name2) - except clauses are checked from top to bottom, names inside the tuple from left to right
-
except Exception: vs except: - the first doesn't catch system-exit exceptions (for example KeyboardInterrupt, SystemExit, GeneratorExit)
-
set().remove(x) - removes x or KeyError, set().discard(x) - removes x or nothing
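A minimal sketch of the difference (names are illustrative):

```python
s = {1, 2}

s.discard(99)        # missing element: silently does nothing

try:
    s.remove(99)     # missing element: raises KeyError
    raised = False
except KeyError:
    raised = True
```

The set is unchanged in both cases, only remove reports the missing element.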
-
py3.3+ accepts u"", U"" string literals for backwards compatibility with py2
-
default encoding is in sys module sys.getdefaultencoding()
python 2
>>> sys.getdefaultencoding() 'ascii'
python 3
>>> sys.getdefaultencoding() 'utf-8'
-
[c for c in sorted([1,2,3], key=lambda c: -c)] - the two variables named c will not conflict here
-
in py2 the loop variable of a list comprehension leaks: it can overwrite an outer variable and is accessible afterwards, in py3 it does not
python 2
>>> x = 1 >>> [x for x in range(3)] [0, 1, 2] >>> x 2 # creates new var >>> [y for y in range(3)] [0, 1, 2] >>> y 2
python 3
>>> x = 1
>>> [x for x in range(3)]
[0, 1, 2]
>>> x
1
# no new var
>>> [y for y in range(3)]
[0, 1, 2]
>>> y
NameError: name 'y' is not defined
-
py3 has the nonlocal statement. It is used to rebind a variable of an enclosing def block (in py2 such a variable can be read, but not rebound)
def f():
    x = 2  # local for f
    def g():
        nonlocal x  # python3 only
        x = 3  # rebinds x of f (without nonlocal it would be local for g)
    g()
    print(x)

>>> f()  # python3 only
3
>>> f()  # with commented nonlocal
2
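A typical use of nonlocal is a closure that keeps state between calls. A minimal python 3 sketch, where make_counter and increment are hypothetical names:

```python
def make_counter():
    count = 0                  # lives in the enclosing scope

    def increment():
        nonlocal count         # rebind the enclosing variable, not a new local
        count += 1
        return count

    return increment

counter = make_counter()
first, second = counter(), counter()
```

Each call to counter() continues from the previous value, so first is 1 and second is 2.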
-
LEGB rule (local, enclosing, global, builtin) or LNGB (N=nonlocal) - order of variable search in python
-
py3: exception variable (as name) is removed after the except block is executed (even if the variable was declared before the try block)
python 2
>>> x = 1 >>> try: ... 1/0 ... except Exception as x: ... pass >>> x ZeroDivisionError('integer division or modulo by zero',)
python 3
>>> x = 1 >>> try: ... 1/0 ... except Exception as x: ... pass >>> x NameError: name 'x' is not defined
-
override builtin and undo override
>>> open = 99 >>> open 99 >>> del open >>> open <built-in function open>
-
py2 fun:
__builtins__.True = False
-
lambda can take default arguments
-
nonlocal functionality can be replaced by mutable object or function attribute
def f(): x = [1] def g(): print x[0] x.append(2) g() print x >>> f() 1 [1, 2] def f(): x = 1 def g(): print g.x g.x = 2 g.x = x g() print g.x >>> f() 1 2
-
py3 keyword only arguments
def f(*args, name): print("args", args) print("name", name) >>> f(1, 2) TypeError: f() missing 1 required keyword-only argument: 'name' >>> f(1, 2, name=3) args (1, 2) name 3 def f(*args, name=3): print("args", args) print("name", name) >>> f(1, 2) args (1, 2) name 3
-
in py3 there is starred unpacking in assignment, it produces a list. Extra positional arguments collected by *b in a function definition arrive as a tuple
python 2 and 3
def f(a, *b): print(b) >>> f(1, *[2, 3]) (2, 3)
python 3
>>> a, *b = [1, 2, 3] >>> print(b) [2, 3] >>> a, *b = (1, 2, 3) >>> print(b) [2, 3]
-
add list to the beginning of existing list: L[:0] = [1, 2, 3]
-
get and set maximum recursion limit
>>> import sys
>>> sys.getrecursionlimit()  # 1000
>>> sys.setrecursionlimit(10000)
>>> help(sys.setrecursionlimit)
-
function arguments
>>> def f(a): ... b = 1 ... >>> f.__name__ 'f' >>> f.__code__.co_varnames ('a', 'b') >>> f.__code__.co_argcount 1
-
in py3 we can add annotations to function arguments. This information is saved in func.__annotations__. Nothing is done automatically with annotations, but we can work with them manually (for example, to check the type and range of an argument from a decorator)
>>> def func(a: 'spam', b: (1, 10), c: float):
...     return a + b + c
>>> func.__annotations__
{'b': (1, 10), 'c': <class 'float'>, 'a': 'spam'}
# default values
>>> def func(a: 'spam'=4, b: (1, 10)=5, c: float=0.1):
...     return a + b + c
-
it is impossible to use = (assignment) in lambda, but it is possible to use setattr, __dict__
-
operator module in std lib
import operator as op
reduce(op.add, [2, 4, 6])  # same as reduce(lambda x, y: x+y, [2, 4, 6])
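In python 3 reduce is no longer a builtin and has to be imported from functools. A short sketch (variable names are illustrative):

```python
from functools import reduce   # builtin in python 2, moved in python 3
import operator as op

total = reduce(op.add, [2, 4, 6])
same = reduce(lambda x, y: x + y, [2, 4, 6])
product = reduce(op.mul, [2, 4, 6], 1)   # third argument is the initial value
```

total and same are both 12, product is 48.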
-
KISS: Keep It Simple [Sir/Stupid]
-
comprehension vs map in general (better test on your system)
map(lambda x: x ..) is slower than [x for x ..]
[ord(x) for x ..] is slower than map(ord, ..)
map(lambda x: L.append(x+10), range(10)) is even slower than for x in range(10): L.append(x+10)
-
unpacking in lambda differs in py2 and py3
python 2
>>> map(lambda (a, b, c): a, [(0,1,2), (3,4,5)]) [0, 3]
python 3
>>> list(map(lambda (a, b, c): a, [(0,1,2), (3,4,5)])) SyntaxError: invalid syntax >>> list(map(lambda a, b, c: a, [(0,1,2), (3,4,5)])) TypeError: <lambda>() missing 2 required positional arguments: 'b' and 'c' >>> list(map(lambda row: row[0], [(0,1,2), (3,4,5)])) [0, 3]
-
many builtin functions accept generator expressions, no additional parentheses are needed
>>> "".join(str(x) for x in [1, 2]) '12' >>> sorted(str(x) for x in [1, 2]) ['1', '2']
but when other arguments are passed too, () is needed
>>> sorted(str(x) for x in [1, 2], reverse=True) SyntaxError: Generator expression must be parenthesized if not sole argument >>> sorted((str(x) for x in [1, 2]), reverse=True) ['2', '1']
-
py3: yield from iterator (following functions are the same)
def f():
    for i in range(5):
        yield i

def g():
    yield from range(5)
-
move first list element to the end
L = L[1:] + L[:1]
(to put the last element to the beginning instead: L = L[-1:] + L[:-1])
-
zip for single list
>>> zip([1,2,3]) [(1,), (2,), (3,)]
-
map and zip are similar
map(lambda x,y: (x,y), S1, S2) == zip(S1, S2)
-
python -m script_name - runs a module (a module is a .py file) that can be found on the current search path. The module can be placed somewhere in the site-packages folder, but it is run as main (__name__ = '__main__'). If script_name is a package (folder with __init__.py), then the file __main__.py will be launched. If there is no such file, it is an error. Some modules are smart and accept arguments from the command line, for example timeit:
python -m timeit '"-".join(str(n) for n in range(100))'
-
there is no direct way to use a global and a local variable with the same name simultaneously. We can play with __main__.my_global_var
# OK
X = 99
def f():
    print(X)

>>> f()
99

# ERROR
def f():
    print(X)  # <- error
    X = 99

>>> f()
UnboundLocalError: local variable 'X' referenced before assignment

# global everywhere
def f():
    global X
    print(X)
    X = 88

# hack with main
def f():
    import __main__
    print(__main__.X)
    X = 88
-
square root performance
math.sqrt(x)  # fastest
x ** .5       # fast
pow(x, .5)    # slow
-
py3.2+ creates the folder __pycache__ to save bytecode for different python versions there and to reuse it in the future. There are no *.pyc files outside this folder now.
-
.pyc for the main script (__name__ = '__main__') is not created, only for imported modules
-
import search order (look at sys.path):
- home of program (in some versions also the current dir, from where the program is launched)
- PYTHONPATH
- std lib dir
- content of any .pth file (if exists)
- site-packages dir
-
sys.path can be changed at runtime, this will affect the whole program
-
python -O creates slightly optimized bytecode .pyo instead of .pyc, it is ~5% faster. This flag also removes all asserts from the code and changes the value of the variable __debug__
# main.py
print __debug__
assert True == False

# python main.py
True
AssertionError

# python -O main.py
False
-
in py2 we can do from some_module import * inside a function, but with a warning. In py3 it is an error
# python 2
def f():
    from urllib import *
    print('after import')

>>> f()
SyntaxWarning: import * only allowed at module level
after import

# python 3
>>> f()
SyntaxError: import * only allowed at module level
-
reload doesn't update objects that were loaded with from: after from x import y, y will not be reloaded by reload(x)
-
reload doesn't update C modules
-
py3: inside a package, the package folder is not in sys.path. If a module in a package needs to import another module from the same package, a relative import must be used: from . import smth. However, if the module is launched as the main program (__main__), then the package folder is in sys.path.
-
py2: from __future__ import absolute_import makes imports in py2 work the same as in py3. It makes it very easy to import the module string from the standard library in the following case:
mypkg
├── __init__.py
├── main.py    # import string from std here?
└── string.py
-
relative import is forbidden outside the package
# test.py
from . import a

# python 2
python test.py
ValueError: Attempted relative import in non-package

# python 3
python test.py
SystemError: Parent module '' not loaded, cannot perform relative import
-
cons of relative import:
- a module with relative imports can't be used as a script (__main__). Solution: use absolute imports with the package name at the beginning
- it follows from the previous point: we can't launch tests that are executed when running the module as the main program
-
in py3.3+ there are namespace packages. They don't have __init__.py. Two (or more) namespace packages with the same name can be placed in different locations in sys.path. Modules from those packages will be aggregated under the same package name. If modules have the same name, the first one found in sys.path will be taken. A namespace package always has lower priority than a regular package (with __init__.py). When a regular package is found, all found namespace packages with that name are discarded and the regular package is used instead. The namespace package import process is slow.
# collect modules in namespace package
current_dir
└── mypkg
    └── mymod1.py
site-packages
└── mypkg
    └── mymod2.py

>>> import mypkg.mymod1
>>> import mypkg.mymod2

# redefine module in namespace package
current_dir
└── mypkg
    ├── mymod1.py
    └── mymod2.py
site-packages
└── mypkg
    └── mymod2.py

>>> import mypkg.mymod1
>>> import mypkg.mymod2  # current_dir.mypkg.mymod2

# regular package is used
current_dir
└── mypkg
    └── mymod1.py
site-packages
└── mypkg
    └── mymod2.py
another-packages
└── mypkg
    └── mymod1.py

>>> import sys
>>> sys.path.append('another-packages')
>>> import mypkg.mymod1  # another-packages.mypkg.mymod1
>>> import mypkg.mymod2
ImportError: No module named 'mypkg.mymod2'
-
In py2 and py3 new-style classes (inherited from object), when an operator is applied, the corresponding magic methods are searched in the class, ignoring the instance (__getattr__, __getattribute__ are not invoked). But on a direct call of a magic method the instance is not ignored (__getattr__, __getattribute__ are invoked).
class A(object):
    def __repr__(self):
        return "class level repr"
    def normal_method(self):
        return "class level normal method"

def instance_repr():
    return "instance level repr"

def instance_normal_method():
    return "instance level normal method"

a = A()
print(a)                  # class level repr
print(a.normal_method())  # class level normal method
a.__repr__ = instance_repr
a.normal_method = instance_normal_method
print(a)                  # class level repr
print(a.normal_method())  # instance level normal method
print(a.__repr__())       # instance level repr
-
ZODB - object database for python objects, supports ACID-compatible transactions (including savepoints)
-
slice object:
L[2:4] == L[slice(2,4)]
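Because a slice is an ordinary object, it can be named and reused. A small sketch (the names middle, evens and bounds are illustrative):

```python
L = list(range(10))

middle = slice(2, 4)
evens = slice(None, None, 2)     # same as [::2]

a = L[middle]
b = L[evens]

# indices() normalizes a slice for a sequence of the given length
bounds = slice(-3, None).indices(len(L))
```

a is [2, 3], b is every second element, and bounds is the (start, stop, step) triple (7, 10, 1).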
-
iteration context (for, comprehensions, …) will try __iter__ first, then fall back to __getitem__
class Gen(object):
    def __getitem__(self, index):
        if index > 5:
            raise StopIteration()
        return index

for x in Gen():
    print x,

# output
0 1 2 3 4 5
-
for calls __iter__(). Then it calls the method returned_object.__next__() (in py2 .next()) until StopIteration. It is possible to use yield in __iter__(): yield smth, then there is no need to define __next__.
-
__call__ is invoked when parentheses () are applied to an instance, not to the class
class A(object):
    def __call__(self):
        print("call")

a = A()  # nothing
a()      # print call
-
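__eq__ = True doesn't mean that __ne__ = False (in py2 they are independent; in py3 the default __ne__ inverts the result of __eq__ unless __ne__ is defined explicitly)
-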
boolean context, in order of lookup:
- __bool__ (__nonzero__ in py2)
- __len__
- True
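The lookup order can be checked with three tiny python 3 classes. The class names are hypothetical, made up for this sketch.

```python
class OnlyLen:
    def __len__(self):
        return 0          # used because __bool__ is missing

class Both:
    def __bool__(self):
        return True       # wins over __len__
    def __len__(self):
        return 0

class Neither:
    pass                  # no hooks: instances are always true

results = (bool(OnlyLen()), bool(Both()), bool(Neither()))
```

results is (False, True, True).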
-
OOP patterns
- inheritance - “is a”
- composition - “has a” (container stores other objects)
- delegation - special case of composition, when only one object is stored. The wrapper implements the same interface, but adds some intermediate steps.
-
class attributes (including methods) that start with double underscores __, but don't end with them, have special behaviour. They do not overlap with same-named attributes in child classes. In __dict__ they are stored as _ClassName__attrname.
class A(object):
    __x = 1
    def show_a(self):
        print self.__x

class B(A):
    def show_b(self):
        print self.__x

>>> a = A()
>>> a.show_a()
1
>>> b = B()
>>> b.show_a()
1
>>> b.show_b()
AttributeError: 'B' object has no attribute '_B__x'

class B(A):
    __x = 2
    def show_b(self):
        print self.__x

>>> b = B()
>>> b.show_a()
1
>>> b.show_b()
2
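The mangled names can be inspected directly. A minimal python 3 sketch with illustrative names:

```python
class A:
    __x = 1                      # stored as _A__x

class B(A):
    __x = 2                      # stored as _B__x, does not clash with A

in_a = "_A__x" in vars(A)
in_b = "_B__x" in vars(B)
values = (B._A__x, B._B__x)
```

Both mangled names exist side by side, so values is (1, 2).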
-
in py3 we can omit the self argument of a method defined in a class and use that method only from the class (not from an instance) - it will behave as a static method. But not in py2.
class A(object):
    def f():
        print("f")

# python 2
>>> A.f()
TypeError: unbound method f() must be called with A instance as first argument (got nothing instead)

# python 3
>>> A.f()
f
>>> a = A()
>>> a.f()
TypeError: f() takes 0 positional arguments but 1 was given
-
bound method:
class A(object):
    def f(self):
        pass

a = A()
print(a.f.__self__)  # that is where self is saved
-
attribute search in classic (old-style) and new-style classes:
- classic. DFLR: Depth First, Left to Right
- new-style. Diamond pattern, L-R, D-F; MRO (more complex than just LRDF)
MRO guards a class, from which >= 2 other classes are subclassed, from being searched twice. So the class will be searched only once.
# python 2 old-style
class A: attr = 1
class B(A): pass
class C(A): attr = 2
class D(B,C): pass

>>> x = D()
>>> print(x.attr)  # x, D, B, A
1

# python 2 new-style
class A(object): attr = 1
class B(A): pass
class C(A): attr = 2
class D(B,C): pass

>>> x = D()
>>> print(x.attr)  # x, D, B, C
2

# scheme
A   A
|   |
B   C
 \ /
  |
  D
  |
  X
Check search order in new-style (mro algorithm):
>>> D.__mro__ (<class '__main__.D'>, <class '__main__.B'>, <class '__main__.C'>, <class '__main__.A'>, <type 'object'>) >>> D.mro() # same as list(D.__mro__) [<class '__main__.D'>, <class '__main__.B'>, <class '__main__.C'>, <class '__main__.A'>, <type 'object'>]
-
format() calls the method __format__. If it does not exist, then TypeError in py2.
python 2
>>> print('{0}'.format(object))
<type 'object'>
>>> print('{0}'.format(object.__reduce__))
TypeError: Type method_descriptor doesn't define __format__
# call __str__ explicitly
>>> print('{0!s}'.format(object.__reduce__))
<method '__reduce__' of 'object' objects>
python 3.4
>>> print('{0}'.format(object.__reduce__)) <method '__reduce__' of 'object' objects>
python 2 & 3
class A(object): def __format__(self, *args): return "A.__format__" def __str__(self): return "A.__str__" >>> a = A() >>> "{0}".format(a) 'A.__format__' >>> print(a) A.__str__ >>> '%s' % a 'A.__str__'
-
__dict__ doesn't contain “virtual” attributes:
- new-style properties (@property)
- slots
- descriptors
- dynamic attrs computed with tools like __getattr__
-
MRO - method resolution order
-
diamond pattern - special case of multiple inheritance, when 2 or more classes are children of the same class (object). This pattern is used in python.
-
proxy object, returned by super(), doesn't work with operators:
python 3
class A(list): def get_some(self): return super()[0] >>> a = A([1, 2]) >>> a.get_some() TypeError: 'super' object is not subscriptable class A(list): def get_some(self): return super().__getitem__(0) >>> a = A([1,2]) >>> a.get_some() 1
python 2
class A(list): def get_some(self): return super(A, self)[0] >>> a = A([1,2]) >>> a.get_some() TypeError: 'super' object has no attribute '__getitem__' class A(list): def get_some(self): return super(A, self).__getitem__(0) >>> a = A([1,2]) >>> a.get_some() 1
-
super()
-
super() pros:
- if the superclass needs to be changed at runtime, we can't do it without super:
C.__bases__ = (Y, )
-
calls the sequence of inherited methods in a multiple inheritance class, in MRO order.
If we try to do it without super, we can call the method of some class twice.
class A(object): def __init__(self): print("A") class B(A): def __init__(self): print("B") super(B, self).__init__() class C(A): def __init__(self): print("C") super(C, self).__init__() class D(B, C): pass >>> d = D() B C A # A only once >>> B.mro() [<class '__main__.B'>, <class '__main__.A'>, <type 'object'>] >>> D.mro() [<class '__main__.D'>, <class '__main__.B'>, <class '__main__.C'>, <class '__main__.A'>, <type 'object'>]
Sequence of methods
class B(object): def __init__(self): print("B") # for B super is C here, by MRO order super(B, self).__init__() class C(object): def __init__(self): print("C") # it is ok here to call super().__init__ # because object also has __init__ super(C, self).__init__() class D(B, C): pass >>> d = D() B C
- super will search for the attribute in the MRO hierarchy. It will search all classes. So, for example, if the hierarchy for super is A, C and A doesn't have the attribute whereas C has, then C.method will be used without error.
-
super() cons (or features):
- when super is used, all methods in the sequence must accept the same arguments
- super().m - all classes must have the method m and call super().m, except the last one, which must not call super.
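Both rules can be seen in a small cooperative chain. This is a python 3 sketch; the classes, the setup method and the calls list are hypothetical names for illustration.

```python
calls = []

class Base:
    def setup(self, **kwargs):
        calls.append("Base")          # last in the chain: no super() call

class Left(Base):
    def setup(self, **kwargs):
        calls.append("Left")
        super().setup(**kwargs)       # next class in the MRO, here Right

class Right(Base):
    def setup(self, **kwargs):
        calls.append("Right")
        super().setup(**kwargs)

class Child(Left, Right):
    pass

Child().setup(name="demo")
```

Every setup takes the same **kwargs signature, and only Base (the end of the chain) skips the super() call. The resulting call order follows Child's MRO: Left, Right, Base.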
-
-
inherit method from exact class:
class A(B, C):
    other = C.other  # not B other
-
finally block will be called even if an exception happened in an except or else block
-
exception - always an instance, even with raise ExceptionClass (without ()). The instance will be created automatically (without arguments):
raise Exception  # == raise Exception()
raise            # reraise caught exception
-
py2, look for builtin exceptions:
import exceptions help(exceptions)
-
the downside of reading bytes from a file and decoding them manually: if we read by chunks, a nasty case can happen when one byte of a multi-byte character falls into the first chunk and another byte of the same character into the second chunk. So it is better to use codecs.open in py2.
-
When a file name is given as unicode, python will automatically encode and decode it to/from bytes. When the file name is given as bytes, no encoding happens. Default encoding for file names:
>>> sys.getfilesystemencoding() 'utf-8'
-
descriptor - a class that implements at least one of the following methods
__get__
__set__
__delete__
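A minimal sketch of a data descriptor that implements __get__ and __set__, where __set__ raises to keep the attribute read-only. ReadOnly and Config are hypothetical names for this example.

```python
class ReadOnly:
    """Hypothetical data descriptor that refuses assignment."""
    def __init__(self, value):
        self.value = value

    def __get__(self, instance, owner):
        return self.value

    def __set__(self, instance, value):
        raise AttributeError("read-only attribute")

class Config:
    version = ReadOnly(3)

cfg = Config()
try:
    cfg.version = 4
    blocked = False
except AttributeError:
    blocked = True
```

The assignment is rejected and cfg.version still returns 3.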
-
If a descriptor doesn't implement __set__, it doesn't mean that the corresponding attribute is read-only. The attribute will simply be overwritten in the instance. To avoid it, implement __set__ so that it raises an exception.
-
decorators can be combined, they will be called from bottom to top:
@A
@B
@C
def f():
    pass

# same as
f = A(B(C(f)))
-
decorator can accept arguments
@dec(a, b)
def f():
    pass

# same as
f = dec(a, b)(f)

# implementation:
def dec(a, b):
    def actual_dec(f):
        return f
    return actual_dec
So decorator can include 3 levels of callables:
- callable to accept decorator args
- callable to serve as decorator
- callable to handle calls to the original function
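The three levels listed above can be sketched as follows. repeat and greet are hypothetical names; functools.wraps is used only to keep the original function name.

```python
import functools

def repeat(times):                      # level 1: accepts decorator arguments
    def decorator(func):                # level 2: the actual decorator
        @functools.wraps(func)
        def wrapper(*args, **kwargs):   # level 3: handles calls to func
            return [func(*args, **kwargs) for _ in range(times)]
        return wrapper
    return decorator

@repeat(3)
def greet(name):
    return "hi " + name

result = greet("bob")
```

result is a list with three copies of 'hi bob'.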
-
during class creation, two methods of class type are called:
type.__new__(type_class, class_name, super_classes, attr_dict)
type.__init__(class, class_name, super_classes, attr_dict)

# python 3
class Eggs:
    pass

class Spam(Eggs):
    data = 1
    def method(self, arg):
        pass

# same as
Eggs = type('Eggs', (), ...)  # in () object will be added automatically in python 3
Spam = type('Spam', (Eggs, ), {'data': 1, 'method': method, '__module__': '__main__'})
-
Set metaclass
python 2
class Spam(object):
    __metaclass__ = Meta
Inheriting from object is not mandatory, but if it is not present and __metaclass__ is used, the result will be new-style anyway, and object will be present in __bases__. But it is better to use object explicitly, as there can be problems, for example with inheritance.
python 3
class Spam(Eggs, metaclass=Meta):
    pass
the attribute __metaclass__ is just ignored
-
A metaclass does not have to be a class itself. It just must return a class. A function can also be a metaclass:
def meta_func(class_name, bases, attr_dict):
    return type(class_name, bases, attr_dict)

# python 2
class Spam(object):
    __metaclass__ = meta_func
-
Regular classes also have the method __new__. But it doesn't create a class, it is invoked at instance creation (takes the class as input argument). After it returns an instance of the class, __init__ is called on that instance.
-
Magic methods of metaclass and class:
class Meta(type):
    pass
on creation of class Class (class Class(metaclass=Meta): ...) the following methods are called:
Meta.__new__
Meta.__init__
on creation of an instance of class Class (instance = Class(...)) the following methods are called:
Meta.__call__ calls Class.__new__ and then Class.__init__
on calling an instance of class Class (instance()) the following method is called:
Class.__call__
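The call order can be traced with a small python 3 sketch. The events list is a hypothetical helper that just records which hook ran.

```python
events = []

class Meta(type):
    def __new__(mcls, name, bases, namespace):
        events.append("Meta.__new__")
        return super().__new__(mcls, name, bases, namespace)

    def __init__(cls, name, bases, namespace):
        events.append("Meta.__init__")
        super().__init__(name, bases, namespace)

    def __call__(cls, *args, **kwargs):
        events.append("Meta.__call__")
        # type.__call__ runs Class.__new__ and then Class.__init__
        return super().__call__(*args, **kwargs)

class Class(metaclass=Meta):
    def __new__(cls):
        events.append("Class.__new__")
        return super().__new__(cls)

    def __init__(self):
        events.append("Class.__init__")

    def __call__(self):
        events.append("Class.__call__")

instance = Class()   # instance creation
instance()           # calling the instance
```

events ends up listing the hooks in exactly the order described above: the two Meta hooks at class creation, then Meta.__call__, Class.__new__, Class.__init__ at instance creation, then Class.__call__.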
-
It is not mandatory to subclass a metaclass from type. We can use a simple class with a __new__ method as metaclass. But in that case the methods __init__ and __call__ will not be called:
class MySimpleMetaClass(object):
    def __new__(cls, *args, **kwargs):
        new_class = type.__new__(type, *args, **kwargs)
        return new_class
    def __init__(new_class, *args, **kwargs):
        print("__init__ won't be called...")
    def __call__(*args, **kwargs):
        print("__call__ won't be called...")
-
The metaclass of a class will be invoked for all its subclasses. When __new__ of the metaclass is called for the parent class, bases will contain (<type 'object'>,), and for a subclass - the parent class.
-
Metaclass attributes are inherited by class, not by instances of class.
python 2 (python 3 has some syntax differences)
class MyMetaClass(type):
    attr = 2
    def __new__(*args, **kwargs):
        return type.__new__(*args, **kwargs)
    def toast(*args, **kwargs):
        print(args, kwargs)

class A(object):
    __metaclass__ = MyMetaClass
Metaclass is included in search sequence of class attributes
>>> A.toast() ((<class '__main__.A'>,), {})
Interestingly, the method from the metaclass is bound, although it is called from the class, not from an instance. In fact a class is an instance of its metaclass:
>>> A.toast <bound method MyMetaClass.toast of <class '__main__.A'>>
But metaclass is not present in instance attribute search sequence
>>> a = A() >>> a.toast() AttributeError: 'A' object has no attribute 'toast'
If some superclass has attribute with same name, as in metaclass, it has higher priority (no matter how deep superclass is)
class B(object): attr = 1 class C(B): __metaclass__ = MyMetaClass >>> C.attr 1 # MyMetaClass.attr = 2 is ignored
Instance attributes are searched in its __dict__, next in the __dict__ of every class in __class__.__mro__. Class attributes are searched also in __class__.__mro__, but it is a different class: from the instance it will be __class__.__class__.__mro__.
>>> inst = C()
>>> inst.__class__  -> <class '__main__.C'>
>>> C.__bases__     -> (<class '__main__.B'>,)
>>> C.__class__     -> <class '__main__.MyMetaClass'>
Instances inherit attributes from all superclasses. Classes - from superclasses and metaclasses. Metaclasses - from super-metaclasses (and probably from meta-metaclasses).
Data descriptors (those that define __set__) bring some changes to the attribute search order for instances. For a class instance, a data descriptor will have higher priority in the search, even if it is declared in a superclass:
class DataDescriptor(object):
    def __get__(self, instance, owner):
        print("DataDescriptor.__get__")
        return 5
    def __set__(self, instance, value):
        print("DataDescriptor.__set__", value)

class B(object):
    attr = DataDescriptor()

class C(B):
    pass

>>> c = C()
>>> c.__dict__['attr'] = 88
>>> c.attr
DataDescriptor.__get__
5
>>> c.attr = 8
('DataDescriptor.__set__', 8)
The descriptor was called, in spite of an attribute with the same name being present in c.__dict__. The attribute doesn't hide the descriptor of the superclass. Such behaviour will not happen in case of a non-data descriptor:
class SimpleDescriptor(object):
    def __get__(self, instance, owner):
        print("SimpleDescriptor.__get__")
        return 5

class B(object):
    attr = SimpleDescriptor()

class C(B):
    pass

>>> c = C()
>>> c.attr
SimpleDescriptor.__get__
5
>>> c.__dict__['attr'] = 88
>>> c.attr
88
Also, for builtin operators that call magic methods implicitly, the search order is special. It ignores instance.__dict__, the search goes to the __dict__ of the classes from __mro__.
-
when a builtin operator is applied to a class object itself, the magic methods that are called implicitly are searched in its metaclass, ignoring the class (and all its superclasses)
python 2 (in python 3 syntax differs a little bit)
class MyMetaClass(type):
    def __new__(*args, **kwargs):
        return type.__new__(*args, **kwargs)
    def __str__(cls):
        return "__str__ from meta"

class A(object):
    __metaclass__ = MyMetaClass
    def __str__(self):
        return "__str__ from class A"
Method MyMetaClass.__str__ will be called, not A.__str__
>>> print A
__str__ from meta
And here method object.__str__ will be called:
>>> print MyMetaClass
<class '__main__.MyMetaClass'>
-
Author Mark Lutz is a little upset that python has become too complicated nowadays. It has more than one obvious way to do some things:
- str.format and %
- with and try/finally
It goes against the import this Zen.