bytes

contents

related file

memory layout

The memory layout of PyBytesObject resembles the memory layout of the tuple object and of the int object, but it is simpler than either of them.
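One way to see how simple the layout is, without reading the C source, is to measure object sizes from Python. The sketch below assumes CPython; `header`, `per_item` and `sizes` are names chosen for illustration, and the absolute numbers depend on the platform and build, so only the differences are meaningful.

```python
import sys

# Size of the fixed part of a bytes object as reported by the type.
# The exact value is build-dependent, so it is not hard-coded here.
header = bytes.__basicsize__

# Size of one element of the variable part: a single byte.
per_item = bytes.__itemsize__

# Total size in memory for payloads of 0, 1 and 10 bytes.
sizes = {n: sys.getsizeof(b"x" * n) for n in (0, 1, 10)}

for n, size in sizes.items():
    print(f"{n:2d} payload bytes -> {size} bytes in memory")
```

Each extra payload byte costs exactly one extra byte of memory, which is what a fixed header followed by a flat character array would predict.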

example

empty bytes

A bytes object is immutable: whenever you need to modify one, you have to create a new object instead, which keeps the implementation simple.

s = b""
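The immutability described above can be checked directly. This is a minimal sketch; the variable names are illustrative only.

```python
s = b"abc"

# In-place modification is rejected.
try:
    s[0] = 65
except TypeError as exc:
    error = str(exc)

# "Modifying" really means building a new object; the original is untouched.
t = s + b"d"
print(error)
print(s, t)
```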

ascii characters

Let's initialize a bytes object with ASCII characters.

s = b"abcdefg123"
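For ASCII input, every character occupies exactly one byte, and indexing the object yields the integer value of that byte. A short sketch (variable names are illustrative):

```python
s = b"abcdefg123"

length = len(s)            # one byte per ASCII character
first = s[0]               # indexing returns an int, not a 1-byte bytes object
codes = list(s[:3])        # the first three byte values

print(length, first, codes)
```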

nonascii characters

s = "我是帅哥".encode("utf8")
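For non-ASCII text, the bytes object stores the encoded form, so its length counts bytes rather than characters. The sketch below uses the same string as above; each of these four characters takes three bytes in UTF-8.

```python
text = "我是帅哥"
s = text.encode("utf8")

char_count = len(text)   # number of characters in the str
byte_count = len(s)      # number of bytes stored in the bytes object

print(char_count, byte_count)
print(s.decode("utf8") == text)
```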

summary

ob_shash

The field ob_shash stores the hash value of the bytes object; the value -1 means the hash has not been computed yet.

The first time the hash value is computed, it is cached in the ob_shash field.

The cached hash value avoids recalculation and speeds up dictionary lookups.
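The caching pattern can be modelled in pure Python. This is a toy sketch of the idea, not CPython's actual code: the class `CachedHashBytes`, its `_shash` attribute and its `computations` counter are hypothetical names. Using -1 as the "not computed" marker is safe because CPython's `hash()` never returns -1.

```python
class CachedHashBytes:
    """Toy model of the ob_shash caching pattern (not CPython's code)."""

    def __init__(self, data):
        self._data = bytes(data)
        self._shash = -1          # -1 means "not computed yet"
        self.computations = 0     # counts real hash computations

    def __hash__(self):
        if self._shash == -1:
            self.computations += 1
            self._shash = hash(self._data)
        return self._shash

    def __eq__(self, other):
        return isinstance(other, CachedHashBytes) and self._data == other._data


key = CachedHashBytes(b"abcdefg123")
table = {key: "value"}   # the insert computes the hash once
found = table[key]       # the lookup reuses the cached value
print(found, key.computations)
```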

ob_size

The field ob_size is present in every PyVarObject. PyBytesObject uses it to store the length of the data, which keeps len() at O(1) time complexity and makes it possible to track the size of binary data that may contain null bytes, where scanning for a terminator would give the wrong answer.
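The difference between a stored length and a terminator scan can be shown with ctypes. In this sketch, `ctypes.c_char_p(...).value` reads the buffer the way a C string function would, stopping at the first null byte; the variable names are illustrative.

```python
import ctypes

data = b"ab\x00cd"

# len() reads the stored size, so the embedded null byte is counted.
python_length = len(data)

# A C-style read stops at the first null byte and loses the rest.
c_view = ctypes.c_char_p(data).value

print(python_length, c_view)
```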

summary

PyBytesObject is a Python wrapper around a C-style null-terminated string, with ob_shash caching the hash value and ob_size storing the length of the data.

The implementation of PyBytesObject resembles the embstr encoding in Redis.
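The trailing null byte can be observed from Python on CPython through the C API function `PyBytes_AsString`, reached here via `ctypes.pythonapi`. This is a sketch that assumes a CPython interpreter; reading one byte past `len(s)` is only safe because the bytes buffer is null-terminated.

```python
import ctypes

# PyBytes_AsString returns a pointer to the internal character buffer.
PyBytes_AsString = ctypes.pythonapi.PyBytes_AsString
PyBytes_AsString.argtypes = [ctypes.py_object]
PyBytes_AsString.restype = ctypes.c_void_p

s = b"abcdefg123"
buffer_address = PyBytes_AsString(s)

# Read the payload plus one extra byte: the terminator.
raw = ctypes.string_at(buffer_address, len(s) + 1)
print(raw)
```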

```shell script
redis-cli
127.0.0.1:6379> set a "hello"
OK
127.0.0.1:6379> object encoding a
"embstr"
```