This article is a guide to using C code in Python. It discusses the reasons for doing so and presents two methods: using the ctypes library to call C functions and writing a custom Python module in C.
Abstract
The article begins by explaining the reasons for using C code in Python, such as reusing existing C code, speeding up Python programs, and performing low-level tasks. It then introduces a shortcut for speeding up Python programs using PyPy, an alternative implementation of Python that uses just-in-time compilation. The main focus of the article is on two methods for using C code in Python: using the ctypes library and writing a custom Python module in C. The ctypes library allows calling functions in dynamic libraries and wrapping them in pure Python, making it suitable for reusing existing C code. Writing a custom Python module in C is more appropriate for writing new C code specifically for Python, as it provides access to Python built-in data types and their methods. The article provides examples and explanations for both methods, as well as a benchmark comparing the performance of pure Python, Python with ctypes, and a custom Python module in C.
Bullet points
Reasons for using C code in Python: reusing existing C code, speeding up Python programs, and performing low-level tasks
PyPy as a shortcut for speeding up Python programs
Two methods for using C code in Python: using the ctypes library and writing a custom Python module in C
ctypes library: allows calling functions in dynamic libraries and wrapping them in pure Python, suitable for reusing existing C code
Writing a custom Python module in C: more appropriate for writing new C code specifically for Python, provides access to Python built-in data types and their methods
Examples and explanations for both methods
Benchmark comparing the performance of pure Python, Python with ctypes, and a custom Python module in C
Before seeing “how” to use C code from Python, let’s first see “why” one may want to do this. If you are reading this article, you probably want to do one or more of these 3 things:
Reuse existing C code from a Python program
Speed up your Python program
Do some low-level stuff that cannot be done directly in Python
A shortcut…
If all you want is to speed up your Python program, there is an easier way than writing parts of it in C. You can just use PyPy instead of Python when executing your app. PyPy is an alternative implementation of the Python programming language which uses just-in-time compilation to speed up the same Python code with little or no changes to it. You can download and install it from here.
After installation, simply replace python with pypy when executing a Python script. So, instead of:
python my_awesome_program.py
do:
pypy my_awesome_program.py
As you will see in the benchmark at the end of this article, this method can make your code considerably faster and in some situations (as is the case with the simple program example shown below in this article) it can be even faster than a C implementation.
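When switching between python and pypy it is easy to lose track of which interpreter actually ran a script. A small sketch that reports it, using only the standard platform module (the helper name describe_interpreter is our own, not something from PyPy):

```python
import platform


def describe_interpreter():
    """Return the implementation name and version, e.g. 'CPython 3.x.y' or 'PyPy 3.x.y'."""
    return f"{platform.python_implementation()} {platform.python_version()}"


# Put this at the top of a script to confirm which interpreter is running it.
print(describe_interpreter())
```

Running the same file once with python and once with pypy should print a different implementation name each time.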
If you really want to write/re-use some C…
Next we are going to see 2 ways to use C in Python:
Using the ctypes module to call C functions
Writing a custom Python module in C
The ctypes library provides C compatible data types, and allows calling functions in DLLs or shared libraries. It can be used to wrap these libraries in pure Python.
As the description above suggests, ctypes may be the best choice if you want to reuse existing C code (your own code or even third-party libraries without the source code). With ctypes you can do so without writing any more C. You can just call the C functions using pure Python and wrap all the C functions inside Python functions.
On the other hand, if you don’t have existing C code and want to write it now specifically to be used in Python, for speed or for some low-level work, then writing a Python module in C may be the best option. When writing a Python module in C you have access to almost all the Python built-in data types and their methods (as plain C functions). This makes it easier to write your C program specifically for Python, and when you import the module in Python you don’t need to do any more work to “adapt” it to Python. You just use the module as if it were written in Python.
Using ctypes library to call C functions
ctypes exports a few objects that can be used to load dynamic libraries, each using a different calling convention: cdll, windll, oledll (the last two are available only on Windows). In this article we’ll use cdll.
To load a library:
from ctypes import cdll
mylib = cdll.LoadLibrary('library path or name')
After a library has been loaded, we can access a function inside it this way:
my_awesome_func = mylib.my_awesome_func
my_awesome_func() # call the function
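To try the load-and-call steps without compiling anything first, we can point ctypes at the C standard library, which is already present on the system. This is only a sketch and assumes a POSIX-like system (Linux or macOS); the choice of abs() as the function to call is ours, made for illustration:

```python
import ctypes
import ctypes.util

# find_library() returns a name such as "libc.so.6", or None if the library
# cannot be located; on POSIX, CDLL(None) still exposes the C library symbols
# that are already loaded into the running process.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

c_abs = libc.abs   # look up the C function by name
print(c_abs(-5))   # Python ints are passed as C ints by default; prints 5
```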
But wait… These C functions that we want to call expect C data types as parameters and return C data types. How can we make Python know how to deal with these types? Fortunately, the ctypes library also exports objects that represent C data types. They can be used to convert Python variables when passing them as parameters to C functions, and to inform Python about what type to expect as the return value of a C function.
Here is a list with most of the available types:
c_byte
c_char
c_char_p
c_double
c_longdouble
c_float
c_int
c_int8
c_int16
c_int32
c_int64
c_long
c_longlong
c_short
c_size_t
c_ssize_t
c_ubyte
c_uint
c_uint8
c_uint16
c_uint32
c_uint64
c_ulong
c_ulonglong
c_ushort
c_void_p
c_wchar
c_wchar_p
c_bool
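A short sketch of how a few of these types behave when used directly from Python. The variable names are our own; the point is that each object wraps a C value, exposed through its .value attribute:

```python
from ctypes import c_char_p, c_double, c_int, c_uint8, sizeof

number = c_int(300)
print(number.value)      # 300

number.value = -7        # ctypes objects are mutable
print(number.value)      # -7

ratio = c_double(1.5)
print(ratio.value)       # 1.5

text = c_char_p(b"hello")  # C strings are built from bytes, not str
print(text.value)          # b'hello'

# ctypes does no overflow checking: 300 does not fit in 8 bits and wraps around.
small = c_uint8(300)
print(small.value)       # 44
print(sizeof(c_uint8))   # 1 (size in bytes)
```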
And if you need a pointer of one of these types just use the POINTER(type) function like this:
pointer_to_int_type = POINTER(c_int)
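As a sketch of how such a pointer type is used: POINTER(c_int) creates the type, while the separate pointer() function creates an instance pointing at an existing ctypes object. The names below are illustrative only:

```python
from ctypes import POINTER, c_int, cast, pointer

IntPtr = POINTER(c_int)          # the pointer *type*

value = c_int(42)
ptr = pointer(value)             # a pointer *instance* that points at value
print(isinstance(ptr, IntPtr))   # True
print(ptr.contents.value)        # 42

ptr[0] = 7                       # writing through the pointer changes value
print(value.value)               # 7

# A C array can be viewed through a pointer to its first element.
numbers = (c_int * 3)(1, 2, 3)
first = cast(numbers, IntPtr)
print([first[i] for i in range(3)])   # [1, 2, 3]
```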
To inform Python of the return type of a C function use the .restype attribute, for example:
my_awesome_func.restype = POINTER(c_int) # my_awesome_func returns a pointer to int
And to pass values to functions in the correct type just wrap them in the corresponding ctype constructor:
my_awesome_func(c_int(300)) # call the function with 300 as a C int
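To see why .restype matters, here is a sketch that calls two functions from the C standard library, again assuming a POSIX-like system. It also uses .argtypes, a ctypes attribute not covered above, which lets ctypes convert the arguments for us so we do not have to wrap each one by hand:

```python
import ctypes
import ctypes.util
from ctypes import c_char_p, c_double, c_size_t

libc = ctypes.CDLL(ctypes.util.find_library("c"))

# double atof(const char *): without restype, ctypes would assume an int result.
atof = libc.atof
atof.argtypes = [c_char_p]
atof.restype = c_double
print(atof(b"2.5"))        # 2.5

# size_t strlen(const char *)
strlen = libc.strlen
strlen.argtypes = [c_char_p]
strlen.restype = c_size_t
print(strlen(b"ctypes"))   # 6
```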
Let’s assume we have the following existing C code that we’d like to use in a Python program:
Quite simple, right? It’s just a function fib() that returns an array with the first n Fibonacci numbers. And we want to call this function in Python and show the results on the screen.
First, we need to compile this code to a shared library:
gcc -c -fpic fib.c
gcc -shared -o libfib.so fib.o
Then, to call it in Python we do:
I think the above code with the comments is quite self-explanatory, but let’s still clarify some things:
If we return a C array as a pointer to the first element, Python doesn’t know how to print that. So we created a function for this.
Python doesn’t keep track of the memory that we allocated dynamically in C, so it won’t clean it up. We need to do so ourselves by calling the free() function, and we need to find the proper C library for that depending on our OS.
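The two points above can be sketched as follows. This is not the original listing: it assumes that the C function has the signature int *fib(int n), that it returns a malloc()-allocated array of n ints, that the library was built as ./libfib.so, and that we are on a POSIX-like system. The helper names are our own.

```python
import ctypes
import ctypes.util
import os
from ctypes import POINTER, c_int, c_void_p, cast


def load_libc():
    """Load the C standard library, which provides free() (POSIX assumed)."""
    return ctypes.CDLL(ctypes.util.find_library("c"))


def c_array_to_list(ptr, length):
    """Copy `length` ints from a C array (pointer to its first element) into a list."""
    return [ptr[i] for i in range(length)]


def fib_from_c(n, lib_path="./libfib.so"):
    """Call the assumed C function `int *fib(int n)` and return a Python list."""
    lib = ctypes.CDLL(lib_path)
    lib.fib.argtypes = [c_int]
    lib.fib.restype = POINTER(c_int)

    libc = load_libc()
    libc.free.argtypes = [c_void_p]
    libc.free.restype = None

    ptr = lib.fib(n)
    try:
        return c_array_to_list(ptr, n)
    finally:
        # Python will not release memory that was allocated with malloc() in C.
        libc.free(cast(ptr, c_void_p))


if os.path.exists("./libfib.so"):
    print(fib_from_c(10))
```

Copying the values into a Python list before calling free() matters: after the call, the pointer must not be read again.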
If you want to find out more about ctypes, here is the official documentation.
Writing a custom Python module in C
So how do we write such a Python module? We start with just a C file (or several files, if the module is more complex and needs to be split up; but for simplicity, let’s assume we have just one C file). This C file has to follow some special rules so that it can be compiled to a Python module.
Preprocessor directives
We need to define a macro (#define PY_SSIZE_T_CLEAN) and include the Python header file (#include <Python.h>). This header file declares (actually, it doesn’t declare them itself, but in turn includes other files that do; you get the idea) all the functions and data types that you need in order to communicate with the Python interpreter (work with Python objects received as parameters, create Python objects and return them, etc.). Make sure you define PY_SSIZE_T_CLEAN before including the Python header file. So, at the beginning of the file, we should have something like:
#define PY_SSIZE_T_CLEAN
#include <Python.h>
Creating the actual functions of our module
The functions that we want to export to Python from our module should be created according to the following template:
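static PyObject* func_name(PyObject* self, PyObject* args) {
    // Parse arguments with PyArg_ParseTuple(args, ...)
    // Do other stuff, maybe create some Python objects like a List, Tuple, Integer object etc.
    // Return one of these objects, Py_None if there is nothing to return, or NULL to signal an error
}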
Here, “func_name” can be anything we want, but a common naming convention is to use “modulename_functionname” (the module name, an underscore, then the function name) as the name of our functions.
PyObject is a generic data type that can represent any object that is used in Python, such as: Integers, Floats, Lists, Tuples, etc.
These functions receive 2 parameters: self and args, both of them pointers to PyObject. Self points to the module object, so we can use it to store an internal state of our module, or access other stuff inside our module. Args points to a tuple of positional arguments; a function that also accepts keyword arguments receives them in a third parameter, a dictionary of keywords -> arguments. To use these arguments in our C code we need to parse them with either PyArg_ParseTuple() or PyArg_ParseTupleAndKeywords().
Where: “format_stringN” is a format code that specifies what data type to store in variableN. This code stores the arguments that are passed when the function is called in Python into variable1, variable2, … which are C variables. PyArg_ParseTuple() returns true if it succeeds; otherwise it returns false and sets an exception, in which case our function should return NULL. Here you can find a complete list with format strings.
Then, to do other things inside our function and work with Python objects, such as Numbers, Lists, Tuples, etc., we need to use the functions declared in Python.h. There are too many of them to cover each one here, so have a look at the documentation; here you can find all of these functions.
Creating the methods array
The Python interpreter doesn’t automatically know how to find all the functions that we created in our module. So we need to create a mapping “function name as we want it to appear in Python” -> pointer to one of the functions in the previous section. Then, we store this mapping in a variable which we will export to Python. Each entry of the mapping is a PyMethodDef struct of the form {"name_in_python", c_name, flags, "description"}. This is how we do this:
c_name is the name of our C function (a function pointer)
for the flags, we use either METH_VARARGS, for positional arguments, or METH_VARARGS | METH_KEYWORDS, if we want to use keywords
“description” can be any string, or NULL
Our methods_array should always have {NULL, NULL, 0, NULL} as last element.
Creating the module struct
After we created the methods_array, we need to create a variable which holds information about the module as a whole (methods_array being one of its members). We do so like this:
static struct PyModuleDef module_name = {
    PyModuleDef_HEAD_INIT,
    "module_name",
    "module description/documentation",
    -1, /* size of per-interpreter state of the module,
           or -1 if the module keeps state in global variables. */
    methods_array
};
Create the initialization function
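Finally, we create the function that initializes our module, using the variable module_name from the previous step. It must be named PyInit_ followed by the module name (here, PyInit_module_name), and it typically just returns PyModule_Create(&module_name). This initialization function should be the only non-static thing in the C file. Here is what it looks like: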
As an example, here we will do the same fib function as previously shown in plain C, but this time we will do it as a Python module.
Here is the full code of this Python module:
Build the module
To build the module, we use the distutils Python library and create a setup.py script like the one below. (distutils was deprecated in Python 3.10 and removed in Python 3.12; on newer versions, setuptools provides the same setup and Extension names.)
from distutils.core import setup, Extension
module1 = Extension(name='module_name',
                    sources=['file1.c', 'file2.c', ...],
                    include_dirs=[],  # list of directories to search for C/C++ header files
                    library_dirs=[],  # list of directories to search for libraries at link time
                    runtime_library_dirs=[],  # list of directories to search for shared libraries at run time
                    libraries=[])  # list of library names to link against
setup(name='module_name',
      version='1.0',
      description='...',
      ext_modules=[module1])
After the setup.py script is done, we can run:
python setup.py build
to build it.
If everything worked fine and there are no compilation errors, we should have a build folder in the same directory as the setup.py file. Inside build you’ll find a folder whose name starts with “lib.”, and inside it is the compiled Python module.
Or:
python setup.py install
to install it in the proper location.
After it has been installed we should be able to import from it using the module name that we set previously. We can also use it without installing; just after we build it, if we create a Python file in the same folder as the resulting .so (Unix) or .pyd (Windows) file, it should work to import from it using the module name that we defined (not the whole name of the compiled file).
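If we prefer to leave the compiled file where the build step put it, another option is to add the build output folder to sys.path before importing. A minimal sketch, assuming the default layout in which the output lands in a folder under build whose name starts with “lib.”; the helper name add_build_dirs_to_path is our own:

```python
import glob
import os
import sys


def add_build_dirs_to_path(build_root="build"):
    """Put every 'lib.*' folder under build_root on sys.path and return the folders found."""
    found = sorted(
        path
        for path in glob.glob(os.path.join(build_root, "lib.*"))
        if os.path.isdir(path)
    )
    for path in found:
        if path not in sys.path:
            sys.path.insert(0, path)
    return found


# Returns an empty list if nothing has been built yet.
print(add_build_dirs_to_path())
```

After this call, an import using the module name set in setup.py should find the compiled module.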
If you are on Windows, you will need to have MS Visual Studio installed (and check the C++ option when installing, because the C/C++ compiler is needed) in order for the setup script to work.
If you want to find out more about the building process, here is the documentation.
Back to our example
Here is the setup.py file for our “my_c_module” with the fib function:
# setup.py
from distutils.core import setup, Extension
module1 = Extension('my_c_module',
sources = ['fib_c_module.c'])
setup (name = 'my_c_module',
version = '1.0',
description = 'This is a demo package',
ext_modules = [module1])
After we build or install it, we can use the module in a Python file:
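A minimal usage sketch, assuming the module exposes its function under the name fib and that it takes the number of elements as its only argument. The helper load_c_fib is our own; it simply reports a missing build instead of failing with a traceback:

```python
import importlib


def load_c_fib(module_name="my_c_module", func_name="fib"):
    """Return the function from the compiled module, or None if it is unavailable."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, func_name, None)


fib = load_c_fib()
if fib is None:
    print("my_c_module is not built or is not on sys.path")
else:
    print(fib(10))
```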
Now, let’s take all 3 ways of writing our example fib() function: 1) Pure Python, 2) Python C module, 3) Pure C function called from Python, and use them in a single Python file and benchmark it to see which one is faster.
First, let’s implement the fib() function in Python:
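A minimal sketch of such a function, assuming fib(n) returns a list with the first n Fibonacci numbers and that the sequence starts from 0 (the name fib_py is our own choice, to tell it apart from the C versions):

```python
def fib_py(n):
    """Return a list with the first n Fibonacci numbers, starting from 0."""
    result = []
    a, b = 0, 1
    for _ in range(n):
        result.append(a)
        a, b = b, a + b
    return result


print(fib_py(10))  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```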
And here we use them all:
Now, let’s benchmark it using timeit. We are going to run each fib function for 100,000 iterations and print at the end how much it took for each one to finish.
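One way such a benchmark could be set up with timeit is sketched below. The helper benchmark(), the input size n=10 and the labels are assumptions made for illustration; the C module is included only if it can be imported, and the ctypes version could be added to the dictionary in the same way.

```python
import timeit


def fib_py(n):
    """Pure Python version: the first n Fibonacci numbers, starting from 0."""
    result = []
    a, b = 0, 1
    for _ in range(n):
        result.append(a)
        a, b = b, a + b
    return result


def benchmark(candidates, n=10, iterations=100_000):
    """Time each callable in candidates (label -> function); return label -> seconds."""
    timings = {}
    for label, func in candidates.items():
        timings[label] = timeit.timeit(lambda: func(n), number=iterations)
    return timings


if __name__ == "__main__":
    candidates = {"Pure Python": fib_py}
    try:
        import my_c_module  # only available after the build step above
        candidates["Python C module"] = my_c_module.fib
    except ImportError:
        pass

    # Print from fastest to slowest.
    for label, seconds in sorted(benchmark(candidates).items(), key=lambda item: item[1]):
        print(f"{label}: {seconds:.3f} s")
```

Running the same file with pypy instead of python gives the PyPy timing for the pure Python version.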
The results may vary on different runs of the script, but as I can see on my system the order is the same (from fastest to slowest): Python C Module, Pure C function called from Python, Pure Python (executed with the regular python command).
Let’s also see how much time it takes for the pure Python version to run if we execute it with PyPy:
Surprisingly, it’s faster even than the module written in C. However, I don’t think this holds in general for any module you may write, but for our simple fib function that seems to be the case.
I hope you found this information useful and thanks for reading!
This article is also posted on my own website here. Feel free to have a look!