How to wrap libtopotoolbox
functions with the pytopotoolbox
#
What needs to be done to add a new function#
You will need to modify content in the following directories and files:
src/topotoolbox/ Refer to the section on wrapping pybind11 functions.
src/topotoolbox/init.py If you added a new file for a new class, you will need to add it here so it will be automatically imported when importing
topotoolbox
.CMakeLists.txt If you added a new class, you will need to add a link using
target_link_libraries()
,pybind_add_module()
, and the install sectioninstall()
.docs/api.rst To include your function in the API documentation, add it here. Since we are using recursive autosummary, if your function is part of a class, it will automatically be added if the class is added to this file. If your function is not part of a class, you will need to add it manually.
tests/ Include tests for your function here.
examples/ If you want to provide an example as documentation for your function, create a new Jupyter notebook here. You can fill the file however you see fit, but make sure to include a section title for the file by making the first line your title and underlining it with
====
.docs/examples.rst If you added a new example, include it in the example gallery by adding a new line:
/examples/name_of_your_example
docs/conf.py If you added a new example, you can also add a thumbnail for it here under the section
nbsphinx_thumbnails
.
Creating a wrapper for libtopotoolbox functions using pybind11#
We create a separate file for all wrappers of one class. The src/lib/grip.cpp
will contain all wrappers for the GridObject class for example. Each file has to include the following code:
extern "C" {
#include <topotoolbox.h>
}
#include <pybind11/pybind11.h>
#include <pybind11/numpy.h>
namespace py = pybind11;
Create a wrapper by creating a new function, we name them void wrap_func_name() to differentiate them from the libtopotoolbox functions. In this function you create pointers for your arrays and pass them to the function you are wrapping void func_name(). We are always editing our data in place, to all function return void.
void wrap_func_name(py::array_t<float> dem, std::tuple<ptrdiff_t,ptrdiff_t> dims){
float *dem_ptr = output.mutable_data();
std::array<ptrdiff_t, 2> dims_array = {std::get<0>(dims), std::get<1>(dims)};
ptrdiff_t *dims_ptr = dims_array.data();
func_name(dem_ptr, dims_ptr);
}
At the end of the file we will include the function to the Pybind11 module. The module will be named after the class it’s used for. For the grid.cpp
file the module is named _grid
.
PYBIND11_MODULE(_module_name_, m) {
m.def("function_name", &wrap_function_name);
}
When the topotoolbox package is beeing build these modules will be compiled and made available for the python wrappers.
Wrapping pybind11 functions#
Since the the pybind11 functions will only be available during the build process we always disable pylint errors when importing them into our python files. We do this so the code passes the pylint test we run in our .github/workflows and so that your IDE doesn’t yell at you.
# pylint: disable=no-name-in-module
from . import _module_name # type: ignore
When creating your python function you call the pybind11 wrapper like so:
_module_name.function_name()
What order arrays should be in#
Because Matlab arrays are stored in column major order, the libtopotoolbox is using the same logic. While the introduction of the dims[2]
argument added support for other approches, we still generally use the Fortran order.
The following example will showcase what this means in practice.
[1]:
import numpy as np
# creating the same array, once in C and once in F order
array_c = np.array([[1,2,3],[4,5,6],[7,8,9],[10,11,12]], order='C')
array_f = np.array([[1,2,3],[4,5,6],[7,8,9],[10,11,12]], order='F')
dims = array_f.shape
print(f"Both arrays will look like this when plotting at them:\n{array_c}")
array_c = array_c.flatten(order='C')
array_f = array_f.flatten(order='F')
print("\nBut when the arrays are passed to the C function, they will be\n"
"flattened into on dimension like this:"
f"\nC order: {array_c}\nF order: {array_f}")
Both arrays will look like this when plotting at them:
[[ 1 2 3]
[ 4 5 6]
[ 7 8 9]
[10 11 12]]
But when the arrays are passed to the C function, they will be
flattened into on dimension like this:
C order: [ 1 2 3 4 5 6 7 8 9 10 11 12]
F order: [ 1 4 7 10 2 5 8 11 3 6 9 12]
As you can see, the Column major order (F), will result in each column being listed after one another instead of each row being listed after another (C).
The first element of dims[2] (generated from the np.ndarray.shape) should be the size of the dimension that changes fastest as you scan through memory, e.g. rows for column-major/’F’ order and cols for row-major/’C’ order.
In our example for F order you can see that 1, 4 and 7 are next to each other in memory. They are all in different rows. Therefor rows change faster than cols so the value denoting how many rows there are should be first value (dims[0]).
[2]:
print(dims)
(4, 3)
In our example there are 4 rows and 3 columns. Since the rows are first in dims we should pass this array to the C/C++ code in F order.
When looping trough the array we try to access the elements after another that are actually located next to each other in memory. Therefor the outer for loop should loop over dims[1] the inner over dims[0].
[3]:
for j in range(dims[1]):
for i in range(dims[0]):
location = j * dims[0] + i
print(array_f[location], end=', ')
1, 4, 7, 10, 2, 5, 8, 11, 3, 6, 9, 12,
To access neighboring cells in the same col we just need to add or subtract 1 from the index. For neighboring cells in same row we need to either add or subtract the length of one row (dims[0]) from the index.
In our example the neighbors of 5 in memory are 2 and 8. The neigbors on the same row are 4 and 6.
[4]:
print(f"array[5] = {array_f[5]}")
print(f"above = {array_f[5-1]}")
print(f"below = {array_f[5+1]}")
print(f"left = {array_f[5-dims[0]]}")
print(f"right = {array_f[5+dims[0]]}")
array[5] = 5
above = 2
below = 8
left = 4
right = 6
Alternative way to loop through the array:
[5]:
for index in range(dims[0] * dims[1]):
col = index // dims[0]
row = index % dims[0]
print(f"i: {index}, row: {row}, col: {col}, array[i] = {array_f[index]}")
i: 0, row: 0, col: 0, array[i] = 1
i: 1, row: 1, col: 0, array[i] = 4
i: 2, row: 2, col: 0, array[i] = 7
i: 3, row: 3, col: 0, array[i] = 10
i: 4, row: 0, col: 1, array[i] = 2
i: 5, row: 1, col: 1, array[i] = 5
i: 6, row: 2, col: 1, array[i] = 8
i: 7, row: 3, col: 1, array[i] = 11
i: 8, row: 0, col: 2, array[i] = 3
i: 9, row: 1, col: 2, array[i] = 6
i: 10, row: 2, col: 2, array[i] = 9
i: 11, row: 3, col: 2, array[i] = 12