Skip to the content.

C++ extension

contents

overview

We’ve written C extension before to help performance penalty

Now we want to take a further step to develop a complex module in C++ and expose to python API

The C++ compiler is compatible with C and I want to use the power of STL in C++11(or newer) standard

example

# setup.py
from distutils.core import setup, Extension

my_module = Extension('m_example', sources=['./example.cpp'], extra_compile_args=['-std=c++11'])

setup(name='m_example',
      version='1.0',
      description='my module to use C++ STL',
      ext_modules=[my_module])

and the example.cpp get elements from the Python array and stores all integer to a C++ vectors and use std::sort from <algorithm> to sort the vector, finally returns the first element as Python object to the caller

Run the example

cd Cpython-Internals/Extension/CPP/example
python3 setup.py build
mv build/lib.macosx-10.15-x86_64-3.8/m_example.cpython-38-darwin.so ./
zpoint@zpoints-MacBook-Pro example % python3
Python 3.8.4 (default, Jul 14 2020, 02:58:48)
[Clang 11.0.3 (clang-1103.0.32.62)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import m_example
>>> m_example.example([6,4,3])
3
>>>

integrate with NumPy

Sometimes we need to work with NumPy, let’s begin with an example

We take a two dimension numpy array, a one dimension numpy array as input and a double value, do some computation and write the result back to the one dimension array

The numpy array should have dtype float64

two_dim is 3*N array

one_dim is N array

import numpy as np
def compute(two_dim: np.array, one_dim: np.array, val: np.float) -> None:
	# do some computation and store result to one_dim
	for index in range(len(one_dim)):
		one_dim[index] = (two_dim[0][index] + two_dim[1][index] + two_dim[2][index]) * val 
	# ...

We are going to write the above function as extension module in C++

example.cpp

cd Cpython-Internals/Extension/CPP/m_numpy
python3 setup.py build
mv build/lib.macosx-10.15-x86_64-3.8/m_example.cpython-38-darwin.so ./
zpoint@zpoints-MacBook-Pro m_numpy % python3
Python 3.8.4 (default, Jul 14 2020, 02:58:48) 
[Clang 11.0.3 (clang-1103.0.32.62)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
>>> import m_example
>>> one_dim = np.zeros([2], dtype=np.float)
>>> two_dim = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]], dtype=np.float)
>>> m_example.example(two_dim, one_dim, 0.5)
>>> one_dim
array([4.5, 6. ])

bypass the GIL

What if we want to schedule a parallel task to utilize the run time of our task ?

two_dim becomes X * 3 * N array

one_dim is X * N array

import numpy as np
def compute(two_dim: np.array, one_dim: np.array, val: np.float) -> None:
	# do some computation and store result to one_dim
	for task_index in range(len(one_dim)):
		curr_one_dim = one_dim[task_index]
		curr_two_dim = two_dim[task_index]
	for index in range(len(one_dim)):
		curr_one_dim[index] = (curr_two_dim[0][index] + curr_two_dim[1][index] + curr_two_dim[2][index]) * val
	# ...

I want to seperate these tasks to several different threads, and let os schedule them to work together

We’ve learned GIL before, we know that the interpreter in different thread will accquire the mutex before executing every python byte code, so as long as our code in different thread not executing by the python interpreter, everything will be find

We use the std::future from C++ STL to schedule our thread according to the environment variable CONCURRENCY_NUM

cd Cpython-Internals/Extension/CPP/m_parallel
python3 setup.py build
mv build/lib.macosx-10.15-x86_64-3.8/m_example.cpython-38-darwin.so ./
zpoint@zpoints-MacBook-Pro m_parallel % python3
Python 3.8.4 (default, Jul 14 2020, 02:58:48) 
[Clang 11.0.3 (clang-1103.0.32.62)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> import numpy as np
>>> import m_example
>>> os.putenv("CONCURRENCY_NUM", "2")
>>> one_dim = np.zeros([4, 2], dtype=np.float)
>>> two_dim = np.array([[[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]], [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]], [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]], [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]], dtype=np.float)
>>> m_example.example(two_dim, one_dim, 0.5)
>>> print(one_dim)
[[4.5 6. ]
 [4.5 6. ]
 [4.5 6. ]
 [4.5 6. ]]

read more