descr
Thanks @Hanaasagi for pointing out the errors #19 of this article
contents
- related file
- how does attribute access work in python?
- method_descriptor
- how to change the behavior of attribute access?
- read more
related file
- cpython/Objects/descrobject.c
- cpython/Include/descrobject.h
- cpython/Objects/object.c
- cpython/Include/object.h
- cpython/Objects/typeobject.c
- cpython/Include/cpython/object.h
how does attribute access work in python
built in types
let’s see an example first before we look into how descriptor object implements
print(type(str.center)) # <class 'method_descriptor'>
print(type("str".center)) # <class 'builtin_function_or_method'>
what is type method_descriptor ? why will str.center returns a method_descriptor object, but "str".center returns a builtin_function_or_method ? how does attribute access work in python ?
instance attribute access
this is the defination of inspect.ismethoddescriptor and inspect.isdatadescriptor
def ismethoddescriptor(object):
if isclass(object) or ismethod(object) or isfunction(object):
# mutual exclusion
return False
tp = type(object)
return hasattr(tp, "__get__") and not hasattr(tp, "__set__")
def isdatadescriptor(object):
"""Return true if the object is a data descriptor.
Data descriptors have both a __get__ and a __set__ attribute. Examples are
properties (defined in Python) and getsets and members (defined in C).
Typically, data descriptors will also have __name__ and __doc__ attributes
(properties, getsets, and members have both of these attributes), but this
is not guaranteed."""
if isclass(object) or ismethod(object) or isfunction(object):
# mutual exclusion
return False
tp = type(object)
return hasattr(tp, "__set__") and hasattr(tp, "__get__")
according to the comments, we know that data descriptor has both __set__ and __get__ attribute defined while method descriptor has only __get__ defined
if we dis the print(type(str.center))
./python.exe -m dis test.py
1 0 LOAD_NAME 0 (print)
2 LOAD_NAME 1 (type)
4 LOAD_NAME 2 (str)
6 LOAD_ATTR 3 (center)
8 CALL_FUNCTION 1
10 CALL_FUNCTION 1
12 POP_TOP
14 LOAD_CONST 0 (None)
16 RETURN_VALUE
we can see that the core opcode is LOAD_ATTR, follow the LOAD_ATTR to the Python/ceval.c, we can find the definition
case TARGET(LOAD_ATTR): {
PyObject *name = GETITEM(names, oparg);
PyObject *owner = TOP();
PyObject *res = PyObject_GetAttr(owner, name);
Py_DECREF(owner);
SET_TOP(res);
if (res == NULL)
goto error;
DISPATCH();
}
// PyObject_GetAttr is defined in Objects/object.c
PyObject *
PyObject_GetAttr(PyObject *v, PyObject *name)
{
PyTypeObject *tp = Py_TYPE(v);
if (!PyUnicode_Check(name)) {
PyErr_Format(PyExc_TypeError,
"attribute name must be string, not '%.200s'",
name->ob_type->tp_name);
return NULL;
}
/* first call the tp_getattro function in C level */
if (tp->tp_getattro != NULL)
return (*tp->tp_getattro)(v, name);
/* if there's not a tp_getattro C function, try to call the tp_getattr function */
if (tp->tp_getattr != NULL) {
const char *name_str = PyUnicode_AsUTF8(name);
if (name_str == NULL)
return NULL;
return (*tp->tp_getattr)(v, (char *)name_str);
}
/* If there's not a tp_getattr C function either, raise an exception */
PyErr_Format(PyExc_AttributeError,
"'%.50s' object has no attribute '%U'",
tp->tp_name, name);
return NULL;
}
tp_getattro is
an optional pointer to the get-attribute function.
it accepts two parameters, PyObject *o, PyObject *attr_name
tp_getattr is
an optional pointer to the get-attribute-string function. This field is deprecated. When it is defined, it should point to a function that acts the same as the tp_getattro function, but taking a C string instead of a Python string object to give the attribute name.
it accepts two parameters, PyObject *o, char *attr_name
The only difference between them is the second parameter, which is of type PyObject * in tp_getattro and type char * in tp_getattr
From the above PyObject_GetAttr function, we can learn that tp_getattro have a higher priority than tp_getattr
we can see how type str is defined in Objects/unicodeobject.c
PyTypeObject PyUnicode_Type = {
PyVarObject_HEAD_INIT(&PyType_Type, 0)
"str", /* tp_name */
sizeof(PyUnicodeObject), /* tp_basicsize */
0, /* tp_itemsize */
/* Slots */
(destructor)unicode_dealloc, /* tp_dealloc */
0, /* tp_print */
0, /* tp_getattr */
0, /* tp_setattr */
0, /* tp_reserved */
unicode_repr, /* tp_repr */
&unicode_as_number, /* tp_as_number */
&unicode_as_sequence, /* tp_as_sequence */
&unicode_as_mapping, /* tp_as_mapping */
(hashfunc) unicode_hash, /* tp_hash*/
0, /* tp_call*/
(reprfunc) unicode_str, /* tp_str */
PyObject_GenericGetAttr, /* tp_getattro */
0, /* tp_setattro */
...
it’s using the widely used c function PyObject_GenericGetAttr in cpython as it’s tp_getattro, which is defined in Objects/object.c
PyObject *
PyObject_GenericGetAttr(PyObject *obj, PyObject *name)
{
return _PyObject_GenericGetAttrWithDict(obj, name, NULL, 0);
}
PyObject *
_PyObject_GenericGetAttrWithDict(PyObject *obj, PyObject *name,
PyObject *dict, int suppress)
{
/* Make sure the logic of _PyObject_GetMethod is in sync with
this method.
When suppress=1, this function suppress AttributeError.
*/
PyTypeObject *tp = Py_TYPE(obj);
PyObject *descr = NULL;
PyObject *res = NULL;
descrgetfunc f;
Py_ssize_t dictoffset;
PyObject **dictptr;
if (!PyUnicode_Check(name)){
PyErr_Format(PyExc_TypeError,
"attribute name must be string, not '%.200s'",
name->ob_type->tp_name);
return NULL;
}
Py_INCREF(name);
if (tp->tp_dict == NULL) {
if (PyType_Ready(tp) < 0)
goto done;
}
/* look for a name through the MRO */
descr = _PyType_Lookup(tp, name);
f = NULL;
if (descr != NULL) {
/* the name in MRO can be found */
Py_INCREF(descr);
/* get the tp_descr_get field of descr object, which is defined as __get__ in python level */
f = descr->ob_type->tp_descr_get;
/* check if the object is data descriptor, PyDescr_IsData is defined as #define PyDescr_IsData(d) (Py_TYPE(d)->tp_descr_set != NULL) */
if (f != NULL && PyDescr_IsData(descr)) {
/* check success, means the object define both __get__ and __set__ */
/* if it's data descriptor, try to call the __get__ function and stores the result in variable res */
res = f(descr, obj, (PyObject *)obj->ob_type);
if (res == NULL && suppress &&
PyErr_ExceptionMatches(PyExc_AttributeError)) {
PyErr_Clear();
}
/* finish */
goto done;
}
}
/* down here, the descr is NULL(means not able to find the name through the MRO) or descr is not NULL, but without __set__ defined */
if (dict == NULL) {
/* find the __dict__ variables inside the object */
/* Inline _PyObject_GetDictPtr */
dictoffset = tp->tp_dictoffset;
if (dictoffset != 0) {
if (dictoffset < 0) {
Py_ssize_t tsize;
size_t size;
tsize = ((PyVarObject *)obj)->ob_size;
if (tsize < 0)
tsize = -tsize;
size = _PyObject_VAR_SIZE(tp, tsize);
_PyObject_ASSERT(obj, size <= PY_SSIZE_T_MAX);
dictoffset += (Py_ssize_t)size;
_PyObject_ASSERT(obj, dictoffset > 0);
_PyObject_ASSERT(obj, dictoffset % SIZEOF_VOID_P == 0);
}
dictptr = (PyObject **) ((char *)obj + dictoffset);
dict = *dictptr;
}
}
if (dict != NULL) {
/* if the __dict__ is not NULL, try to find whether the name is stored inside the __dict__ */
Py_INCREF(dict);
res = PyDict_GetItem(dict, name);
if (res != NULL) {
Py_INCREF(res);
Py_DECREF(dict);
goto done;
}
Py_DECREF(dict);
}
/* down here, either __dict__ is NULL or name is not stored inside the __dict__ or __set__ not defined in descr */
if (f != NULL) {
/* __get__ is defined, but __set__ is not defined */
/* try to call the __get__ function */
res = f(descr, obj, (PyObject *)Py_TYPE(obj));
if (res == NULL && suppress &&
PyErr_ExceptionMatches(PyExc_AttributeError)) {
PyErr_Clear();
}
goto done;
}
/* down here, the descr doesn't define __get__ method and can't find name inside __dict__*/
if (descr != NULL) {
/* if name is found in MRO, return the found object */
res = descr;
descr = NULL;
goto done;
}
if (!suppress) {
PyErr_Format(PyExc_AttributeError,
"'%.50s' object has no attribute '%U'",
tp->tp_name, name);
}
done:
Py_XDECREF(descr);
Py_DECREF(name);
return res;
}
we can draw the process according to the code above

until now, I made a mistake, the tp_getattro in PyUnicode_Type will not be called in str.center, otherwise, it will be called in "str".center, you can tell the differences in the following codes
>>> type("str")
<class 'str'>
>>> "str".center # it calls the tp_getattro in PyUnicode_Type
<built-in method center of str object at 0x10360f500>
>>> type("str".center)
<class 'builtin_function_or_method'>
>>> type(str)
<class 'type'>
>>> str.center # it calls the tp_getattro in a type named PyType_Type
<method 'center' of 'str' objects>
>>> type(str.center)
<class 'method_descriptor'>
so, the procedure above describes the attribute access of "str".center
class attribute access
let’s find the definition of <class 'type'> and how exactly str.center works (mostly same as "str".center)
for the type <class 'type'>, the LOAD_ATTR calls the type_getattro
PyTypeObject PyType_Type = {
PyVarObject_HEAD_INIT(&PyType_Type, 0)
"type", /* tp_name */
sizeof(PyHeapTypeObject), /* tp_basicsize */
sizeof(PyMemberDef), /* tp_itemsize */
(destructor)type_dealloc, /* tp_dealloc */
0, /* tp_print */
0, /* tp_getattr */
0, /* tp_setattr */
0, /* tp_reserved */
(reprfunc)type_repr, /* tp_repr */
0, /* tp_as_number */
0, /* tp_as_sequence */
0, /* tp_as_mapping */
0, /* tp_hash */
(ternaryfunc)type_call, /* tp_call */
0, /* tp_str */
(getattrofunc)type_getattro, /* tp_getattro */
(setattrofunc)type_setattro, /* tp_setattro */
0, /* tp_as_buffer */
...
}
static PyObject *
type_getattro(PyTypeObject *type, PyObject *name)
{
/*
logic mostly same as _PyObject_GenericGetAttrWithDict,
for thoes who are interested, read Objects/typeobject.c directly
*/
}

from the above pictures and codes, we can know that even if the tp_getattro function of str and "str" are different, they both calls the same tp_descr_get (alias of __get__) defined in a type named method_descriptor
/* in cpython/Objects/descrobject.c */
static PyObject *
method_get(PyMethodDescrObject *descr, PyObject *obj, PyObject *type)
{
PyObject *res;
/* descr_check checks whether the descriptor was found on the target object itself (or a base) */
if (descr_check((PyDescrObject *)descr, obj, &res))
/* str.center goes into this branch, returns the type of PyMethodDescrObject */
return res;
/* while "str".center goes into this branch, returns a type of PyCFunction */
return PyCFunction_NewEx(descr->d_method, obj, NULL);
}
now, we have the answers of
- why will
str.centerreturns a method_descriptor object, but"str".centerreturns a builtin_function_or_method ? - how does attribute access work in python ?
self defined types
Assume you’ve read the following two parts
self defined types instance attribute access
From the above analyzation, we can learn that the key point is tp_getattro defined in type(instance), tp_getattro function of those built-in types is pre defined in C, while user defined type is created on the fly, some function will be attached to the newly created type in the tp_getattro field
class A(object):
def __getattr__(self, item):
pass
class B(object):
def __getattribute__(self, item):
pass
class C(object):
pass
What is in the tp_getattro field of class A, class B and class C ?
The following codes are snippests of the creation of a type
/* cpython/Objects/typeobject.c */
static slotdef slotdefs[] = {
/* ... */
TPSLOT("__getattribute__", tp_getattro, slot_tp_getattr_hook,
wrap_binaryfunc,
"__getattribute__($self, name, /)\n--\n\nReturn getattr(self, name)."),
TPSLOT("__getattr__", tp_getattro, slot_tp_getattr_hook, NULL, ""),
TPSLOT("__setattr__", tp_setattro, slot_tp_setattro, wrap_setattr,
/* ... */
}
static PyObject *
type_new(PyTypeObject *metatype, PyObject *args, PyObject *kwds)
{
/* ... */
/* Put the proper slots in place */
fixup_slot_dispatchers(type);
/* ... */
}
/* Store the proper functions in the slot dispatches at class (type)
definition time, based upon which operations the class overrides in its
dict. */
static void
fixup_slot_dispatchers(PyTypeObject *type)
{
slotdef *p;
init_slotdefs();
for (p = slotdefs; p->name; )
p = update_one_slot(type, p);
}
static slotdef *
update_one_slot(PyTypeObject *type, slotdef *p)
{
PyObject *descr;
PyWrapperDescrObject *d;
void *generic = NULL, *specific = NULL;
int use_generic = 0;
int offset = p->offset;
int error;
void **ptr = slotptr(type, offset);
if (ptr == NULL) {
do {
++p;
} while (p->offset == offset);
return p;
}
assert(!PyErr_Occurred());
do {
descr = find_name_in_mro(type, p->name_strobj, &error);
if (descr == NULL) {
if (error == -1) {
PyErr_Clear();
}
if (ptr == (void**)&type->tp_iternext) {
specific = (void *)_PyObject_NextNotImplemented;
}
continue;
}
if (Py_TYPE(descr) == &PyWrapperDescr_Type &&
((PyWrapperDescrObject *)descr)->d_base->name_strobj == p->name_strobj) {
void **tptr = resolve_slotdups(type, p->name_strobj);
if (tptr == NULL || tptr == ptr)
generic = p->function;
d = (PyWrapperDescrObject *)descr;
if (d->d_base->wrapper == p->wrapper &&
PyType_IsSubtype(type, PyDescr_TYPE(d)))
{
if (specific == NULL ||
specific == d->d_wrapped)
specific = d->d_wrapped;
else
use_generic = 1;
}
}
else if (Py_TYPE(descr) == &PyCFunction_Type &&
PyCFunction_GET_FUNCTION(descr) ==
(PyCFunction)(void(*)(void))tp_new_wrapper &&
ptr == (void**)&type->tp_new)
{
/* ... */
specific = (void *)type->tp_new;
/* ... */
}
else if (descr == Py_None &&
ptr == (void**)&type->tp_hash) {
/* ... */
specific = (void *)PyObject_HashNotImplemented;
/* ... */
}
else {
use_generic = 1;
generic = p->function;
}
} while ((++p)->offset == offset);
if (specific && !use_generic)
*ptr = specific;
else
*ptr = generic;
return p;
}
We can learn that for every pre defined slot attribute in the slotdefs, update_one_slot will install the proper function/object to the newly created type
__getattribute__ and __getattr__ have same offset in every newly created type(you can learn from slotdefs structure)
For class A, when installing __getattribute__, offset is 144, the descr is <slot wrapper '__getattribute__' of 'object' objects>, PyType_IsSubtype(type, PyDescr_TYPE(d)) is True, so specific will be d->d_wrapped, which is PyObject_GenericGetAttr. In the next while loop, when installing __getattr__, offset is also 144, this time descr will be <function A.__getattr__ at 0x1013260c0>, the final else inside the while loop will set generic to p->function, which is slot_tp_getattr_hook, now the while loop terminate and offset 144 stores slot_tp_getattr_hook
For class B, when installing __getattribute__, offset is 144, the descr is <function B.__getattribute__ at 0x103ad6140>, the final else inside the while loop will set generic to p->function, which is slot_tp_getattr_hook, In the next while loop, when installing __getattr__, offset is also 144, this time descr is a null pointer, so the continue statement will terminate the while loop, offset 144 stores slot_tp_getattr_hook
For class C, when installing __getattribute__, offset is 144, the descr is <slot wrapper '__getattribute__' of 'object' objects>, PyType_IsSubtype(type, PyDescr_TYPE(d)) is True, so specific will be d->d_wrapped, which is PyObject_GenericGetAttr, In the next while loop, when installing __getattr__, offset is also 144, this time descr is a null pointer, so the continue statement will terminate the while loop, offset 144 stores PyObject_GenericGetAttr
slot_tp_getattr_hook is defined as
/* python/Objects/typeobject.c */
static PyObject *
slot_tp_getattro(PyObject *self, PyObject *name)
{
PyObject *stack[1] = {name};
return call_method(self, &PyId___getattribute__, stack, 1);
}
static PyObject *
slot_tp_getattr_hook(PyObject *self, PyObject *name)
{
PyTypeObject *tp = Py_TYPE(self);
PyObject *getattr, *getattribute, *res;
_Py_IDENTIFIER(__getattr__);
getattr = _PyType_LookupId(tp, &PyId___getattr__);
if (getattr == NULL) {
/* No __getattr__ hook: use a simpler dispatcher */
tp->tp_getattro = slot_tp_getattro;
return slot_tp_getattro(self, name);
}
Py_INCREF(getattr);
getattribute = _PyType_LookupId(tp, &PyId___getattribute__);
if (getattribute == NULL ||
(Py_TYPE(getattribute) == &PyWrapperDescr_Type &&
((PyWrapperDescrObject *)getattribute)->d_wrapped ==
(void *)PyObject_GenericGetAttr))
res = PyObject_GenericGetAttr(self, name);
else {
Py_INCREF(getattribute);
res = call_attribute(self, getattribute, name);
Py_DECREF(getattribute);
}
if (res == NULL && PyErr_ExceptionMatches(PyExc_AttributeError)) {
PyErr_Clear();
res = call_attribute(self, getattr, name);
}
Py_DECREF(getattr);
return res;
}
For slot_tp_getattr_hook
-
If there’s no
___getattr__method,slot_tp_getattr_hookwill only call___getattribute__directly -
If there defines
__getattr__,slot_tp_getattr_hookwill call___getattribute__, if there’s no result and occursPyExc_AttributeError, try to call__getattr__
If you override any of __getattribute__ or __getattr__ , tp_getattro of the newly created class will be slot_tp_getattr_hook
If none of __getattribute__ or __getattr__ are override, tp_getattro of the newly created class will be PyObject_GenericGetAttr(same as built-in types)

self defined types class attribute access
Because type(newly_created_class) will always return <class 'type'>, and the tp_getattro of <class 'type'> is pre defined in C and not able to be customized, the behaviour of attribute accessing is same as class attribute access
method_descriptor
let’s find out the answer of
- what is type method_descriptor ?
it’s defined in Include/descrobject.h
memory layout

we can see that when you try to access "str".center and str.center, they both shares the same PyMethodDescrObject, which is a wrapper of the PyMethodDef object
for those who are interested in PyMethodDef, please refer to method
>>> str.center
descr->d_common->d_type: 0x10fdefc10, descr->d_common->d_type.tp_name: str, repr(d_name): 'center', repr(d_qualname): NULL, PyMethodDef: 0x10fdf0370
>>> "str".center
descr->d_common->d_type: 0x10fdefc10, descr->d_common->d_type.tp_name: str, repr(d_name): 'center', repr(d_qualname): NULL, PyMethodDef: 0x10fdf0370
there exists various descriptor type
PyMemberDescrObject: wrapper of PyMemberDef
PyGetSetDescrObject: wrapper of PyGetSetDef
how to change the behavior of attribute access
we know that when you try to access the attribute of an object, the python virtual machine will
- execute the opcode
LOAD_ATTR LOAD_ATTRwill try to calltp_getattrofunction of the object, if success go to 4- raise an exception
- return what’s returned
the default tp_getattro will be installed in the creation process of the newly created type, different behaviour depends on what methods are override by user, generally, the default tp_getattro is PyObject_GenericGetAttr which implements the descriptor protocol(we learned above from the source code)
when we define a python object, if we need to change the behavior of attribute access
we are not able to change the behavior of opcode LOAD_ATTR, it’s written in C
instead, we can provide our own __getattribute__ and __getattr__ to change the function installed in the tp_getattro slot of the newly created type
notice, provide your own __getattribute__ may violate the descriptor protocol, I will not recommend you to do that(usually we only need to define our own __getattr__)
class A(object):
def __getattribute__(self, item):
print("in __getattribute__", item)
if item in ("noA", "noB"):
raise AttributeError
return "__getattribute__" + str(item)
def __getattr__(self, item):
print("in __getattr__", item)
if item == "noB":
raise AttributeError
return "__getattr__" + str(item)
>>> a = A()
>>> a.x
in __getattribute__ x
'__getattribute__x'
>>> a.noA
in __getattribute__ noA
in __getattr__ noA
'__getattr__noA'
>>> a.noB
in __getattribute__ noB
in __getattr__ noB
Traceback (most recent call last):
File "<input>", line 1, in <module>
File "<input>", line 11, in __getattr__
AttributeError