C++ extension
overview
We've written a C extension before to reduce the performance penalty of pure Python code
Now we want to go a step further: develop a more complex module in C++ and expose it as a Python API
The Python C API can be used from C++, and I want to use the power of the STL in the C++11 (or newer) standard
example
# setup.py
# Note: distutils was removed in Python 3.12; on newer versions,
# setuptools provides the same setup and Extension names.
from distutils.core import setup, Extension

my_module = Extension('m_example',
                      sources=['./example.cpp'],
                      extra_compile_args=['-std=c++11'])

setup(name='m_example',
      version='1.0',
      description='my module to use C++ STL',
      ext_modules=[my_module])
example.cpp reads the elements of the Python list, stores the integers in a C++ std::vector, sorts the vector with std::sort from <algorithm>, and finally returns the first (smallest) element to the caller as a Python object
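The source of example.cpp is not reproduced here, so the following is only a minimal sketch of the pure C++ part of that flow. The function name smallest_after_sort is hypothetical, and the Python C API glue that converts list items to and from PyObject* is left out on purpose.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper (not taken from example.cpp): sort a copy of the
// integers and return the first element, which is the smallest one.
long smallest_after_sort(std::vector<long> values) {
    assert(!values.empty());                  // need at least one integer
    std::sort(values.begin(), values.end());  // ascending order
    return values.front();                    // first element after sorting
}
```

Calling smallest_after_sort({6, 4, 3}) yields 3, which matches the interpreter session in this section. In the real module the result would still have to be wrapped in a Python integer before being returned.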
Run the example
cd Cpython-Internals/Extension/CPP/example
python3 setup.py build
mv build/lib.macosx-10.15-x86_64-3.8/m_example.cpython-38-darwin.so ./
zpoint@zpoints-MacBook-Pro example % python3
Python 3.8.4 (default, Jul 14 2020, 02:58:48)
[Clang 11.0.3 (clang-1103.0.32.62)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import m_example
>>> m_example.example([6,4,3])
3
>>>
integrate with NumPy
Sometimes we need to work with NumPy. Let's begin with an example
We take a two-dimensional NumPy array, a one-dimensional NumPy array, and a double value as input, do some computation, and write the result back to the one-dimensional array
The NumPy arrays should have dtype float64
two_dim is a 3 * N array
one_dim is an array of length N
import numpy as np

def compute(two_dim: np.ndarray, one_dim: np.ndarray, val: float) -> None:
    # do some computation and store the result in one_dim
    for index in range(len(one_dim)):
        one_dim[index] = (two_dim[0][index] + two_dim[1][index] + two_dim[2][index]) * val
    # ...
We are going to write the above function as an extension module in C++
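As a rough sketch of what the C++ side of that loop could look like, the kernel below works on raw double buffers. It assumes both arrays are C-contiguous (row-major) float64, so row r of two_dim starts at offset r * n. The name compute_rows is hypothetical, and fetching the buffers from the NumPy objects (for instance with PyArray_DATA from NumPy's C API) is not shown.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical kernel (an assumption, not the module's actual source).
// two_dim: 3 * n doubles in row-major order; one_dim: n doubles (output).
void compute_rows(const double *two_dim, double *one_dim,
                  std::size_t n, double val) {
    assert(two_dim != nullptr && one_dim != nullptr);
    const double *row0 = two_dim;          // two_dim[0][...]
    const double *row1 = two_dim + n;      // two_dim[1][...]
    const double *row2 = two_dim + 2 * n;  // two_dim[2][...]
    for (std::size_t i = 0; i < n; ++i) {
        one_dim[i] = (row0[i] + row1[i] + row2[i]) * val;
    }
}
```

The result is written in place, just like the Python version, so nothing needs to be returned. A non-contiguous array (a slice or a transposed view, for example) would break the offset arithmetic, so a real module should check the layout or make a contiguous copy first.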
cd Cpython-Internals/Extension/CPP/m_numpy
python3 setup.py build
mv build/lib.macosx-10.15-x86_64-3.8/m_example.cpython-38-darwin.so ./
zpoint@zpoints-MacBook-Pro m_numpy % python3
Python 3.8.4 (default, Jul 14 2020, 02:58:48)
[Clang 11.0.3 (clang-1103.0.32.62)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
>>> import m_example
>>> one_dim = np.zeros([2], dtype=np.float64)
>>> two_dim = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]], dtype=np.float64)
>>> m_example.example(two_dim, one_dim, 0.5)
>>> one_dim
array([4.5, 6. ])
bypass the GIL
What if we want to run the tasks in parallel to shorten the total run time?
two_dim becomes an X * 3 * N array
one_dim becomes an X * N array
import numpy as np

def compute(two_dim: np.ndarray, one_dim: np.ndarray, val: float) -> None:
    # do some computation and store the result in one_dim
    for task_index in range(len(one_dim)):
        curr_one_dim = one_dim[task_index]
        curr_two_dim = two_dim[task_index]
        # iterate over the N elements of the current task, not over the X tasks
        for index in range(len(curr_one_dim)):
            curr_one_dim[index] = (curr_two_dim[0][index] + curr_two_dim[1][index] + curr_two_dim[2][index]) * val
    # ...
I want to separate these tasks into several threads and let the OS schedule them to run concurrently
We've learned about the GIL before: a thread must hold this mutex while the interpreter executes Python bytecode, so as long as the code in our worker threads is not executed by the Python interpreter and does not touch Python objects, everything will be fine
We use std::future from the C++ standard library (header <future>) to schedule our threads; the number of threads is taken from the environment variable CONCURRENCY_NUM
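One way this scheduling could look is sketched below, using std::async to produce the futures. The function names, the default of one worker when CONCURRENCY_NUM is unset or invalid, and the round-robin split of tasks are all assumptions for illustration, not the module's actual source. The workers only touch raw double buffers, never Python objects, which is what makes it safe to run them without the GIL.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <future>
#include <vector>

// Hypothetical helper: read the worker count from CONCURRENCY_NUM,
// falling back to 1 when the variable is missing or not a positive number.
std::size_t concurrency_from_env() {
    const char *raw = std::getenv("CONCURRENCY_NUM");
    long parsed = raw ? std::strtol(raw, nullptr, 10) : 1;
    return parsed > 0 ? static_cast<std::size_t>(parsed) : 1;
}

// two_dim: x * 3 * n doubles, one_dim: x * n doubles, both row-major.
void compute_parallel(const double *two_dim, double *one_dim,
                      std::size_t x, std::size_t n, double val,
                      std::size_t workers) {
    assert(workers > 0);
    // Never start more workers than there are tasks.
    workers = std::min(workers, std::max<std::size_t>(x, 1));
    std::vector<std::future<void>> pending;
    for (std::size_t w = 0; w < workers; ++w) {
        // Worker w handles tasks w, w + workers, w + 2 * workers, ...
        // so no two workers ever write to the same output row.
        pending.push_back(std::async(std::launch::async, [=] {
            for (std::size_t t = w; t < x; t += workers) {
                const double *src = two_dim + t * 3 * n;
                double *dst = one_dim + t * n;
                for (std::size_t i = 0; i < n; ++i) {
                    dst[i] = (src[i] + src[n + i] + src[2 * n + i]) * val;
                }
            }
        }));
    }
    for (auto &f : pending) {
        f.get();  // wait for the worker; rethrows if it threw
    }
}
```

A caller would pass concurrency_from_env() as the workers argument. In a real extension the calling thread could also release the GIL around compute_parallel (for example with Py_BEGIN_ALLOW_THREADS and Py_END_ALLOW_THREADS) so that other Python threads keep running while it waits. Whether the original module does so is not stated here.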
cd Cpython-Internals/Extension/CPP/m_parallel
python3 setup.py build
mv build/lib.macosx-10.15-x86_64-3.8/m_example.cpython-38-darwin.so ./
zpoint@zpoints-MacBook-Pro m_parallel % python3
Python 3.8.4 (default, Jul 14 2020, 02:58:48)
[Clang 11.0.3 (clang-1103.0.32.62)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> import numpy as np
>>> import m_example
>>> os.putenv("CONCURRENCY_NUM", "2")
>>> one_dim = np.zeros([4, 2], dtype=np.float64)
>>> two_dim = np.array([[[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]], [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]], [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]], [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]], dtype=np.float64)
>>> m_example.example(two_dim, one_dim, 0.5)
>>> print(one_dim)
[[4.5 6. ]
 [4.5 6. ]
 [4.5 6. ]
 [4.5 6. ]]