Uh oh!
There was an error while loading.Please reload this page.
- Notifications
You must be signed in to change notification settings - Fork7.9k
Use pybind11 for C/C++ extensions#23787
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.
Already on GitHub?Sign in to your account
Uh oh!
There was an error while loading.Please reload this page.
Conversation
@@ -302,6 +302,8 @@ def make_release_tree(self, base_dir, files): | |||
setup_requires=[ | |||
"certifi>=2020.06.20", | |||
"numpy>=1.19", | |||
"pybind11>=2.6", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
This is a new build dependency, but is not a runtime dependency.
"matplotlib._qhull", ["src/_qhull_wrapper.cpp"], | ||
cxx_std=11, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
We can specify the C++ standard to follow, so here we are explicitly restricting to C++11. This would probably be a configuration param at the top of the file to make it clear what C++ standard the project is currently working to.
PyTuple_SET_ITEM(tuple, 0, triangles.pyobj()); | ||
PyTuple_SET_ITEM(tuple, 1, neighbors.pyobj()); | ||
return tuple; | ||
return py::make_tuple(triangles, neighbors); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Good example of improved handling of creating a tuple to return.
: _x(x), | ||
_y(y), | ||
_triangles(triangles), | ||
_mask(mask), | ||
_edges(edges), | ||
_neighbors(neighbors) | ||
{ | ||
if (_x.ndim() != 1 || _y.ndim() != 1 || _x.shape(0) != _y.shape(0)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Argument type checking is handled automatically bypybind11
. Here we need to check that array shapes are consistent. These checks were originally in the wrapper functions that no longer exist.
ianthomas23 commentedAug 31, 2022 • edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
Can install using Edit: this is no longer a problem. |
I am weakly in favor of this. I think getting away from hand-rolled c-extensions is a good idea. Another down side is that this makes the bet that pybind11 is going to last but given that we already have a transient dependency and scipy has picked up a pybind11 dependency, I think we are past avoiding that risk. |
src/_qhull_wrapper.cpp Outdated
@@ -125,8 +135,8 @@ class QhullInfo { | |||
/* Delaunay implementation method. | |||
* If hide_qhull_errors is true then qhull error messages are discarded; | |||
* if it is false then they are written to stderr. */ | |||
static PyObject* | |||
delaunay_impl(npy_intp npoints, const double* x, const double* y, | |||
py::tuple |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
These can stay static (i.e. module-private)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Sure.
src/_qhull_wrapper.cpp Outdated
qhull_error_msg[exitcode], exitcode, | ||
hide_qhull_errors ? "; use python verbose option (-v) to see original qhull error." : ""); | ||
return NULL; | ||
std::ostringstream oss; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
FWIW I have a small helper in mplcairo for string formatting:
https://github.com/matplotlib/mplcairo/blob/3d9151da64587d0e1116ccd7a425c6e899efdfa0/src/_util.cpp#L93-L95
https://github.com/matplotlib/mplcairo/blob/3d9151da64587d0e1116ccd7a425c6e899efdfa0/src/_mplcairo.cpp#L336
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
That's nice! I hadn't considered dropping back to Python to do the string formatting.
m.doc() = "Computing Delaunay triangulations.\n"; | ||
m.def("delaunay", &delaunay, | ||
"delaunay(x, y, /)\n" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
You can usehttps://pybind11.readthedocs.io/en/stable/basics.html instead to specify argument names (possibly together withhttps://pybind11.readthedocs.io/en/stable/reference.html?highlight=kw_only#_CPPv48pos_only if you want to keep them positional-only).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
OK, done this. It does improve the docstring.
src/_qhull_wrapper.cpp Outdated
"triangles, neighbors : int arrays, shape (ntri, 3)\n" | ||
" Indices of triangle vertices and indices of triangle neighbors.\n"); | ||
m.def("version", &version, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
I would just make this a lambda[]() { return qh_version; }
(returning achar*
is OK).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Very happy to use a lambda! There are no lambdas yet in our C++ codebase, but if you think other devs are happy to have them...
&yarray.converter_contiguous, &yarray)) { | ||
return NULL; | ||
} | ||
if (x.ndim() != 1 || y.ndim() != 1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
I usually convert my arrays to use unchecked access (https://pybind11.readthedocs.io/en/stable/advanced/pycpp/numpy.html#direct-access) (which of course means that out of bounds access can result in anything) and implicitly perform the dimensionality check at the same time as the conversion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
I've included examples of both.unchecked<>()
and.data()
to illustrate the possibilities and to minimise the code changes. I would default to using.data()
for these 2 modules as all 2D arrays used have a fixedshape(1)
of either 2 or 3, and use ofdata
and a hardcoded multiple of 2 or 3 is much faster than usingunchecked
which needs to get the strides. For other extensions which accept arbitrarily-shaped arrays then probablyunchecked
would be preferred.
The lifetimes of these arrays is strictly controlled by the calling code so I may as well check the dims are OK just once, in the constructor.
See also#9763 which did the same for ft2font. |
Apart from the question of pybind11 lasting, which I cannot really judge beyond that it seems to quite actively maintained at the moment (and historically), I think this is a good idea! @anntzer it is realistic to get#9763 up and running again if it is decided to go with pybind11? (Quite a lot of conflicts, but no idea if they are just from the import section or more complicated than that.) |
That may need a bit of work, but is certainly doable. Or it could be a small project if any newcomer (well-versed in C++ and/or freetype) wants to pick it up :) |
I added it to today's agenda. If we are going to adopt pybind11 we should write up issues to move all of our wrappers over. |
As@anntzer can't make today's meeting we should probably bump discussing pybind11 to next week? |
I am inferring@anntzer is in favor of picking up the dependency, but bumping to next week is reasonable. |
anntzer commentedSep 1, 2022 • edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
I am +0.5 on picking the dependency. I really like pybind11; OTOH in our specific case the C code is already there (on the third hand switching to pybind11 is probably better for maintainability). |
the examples@ianthomas23 gave made it seem that way, plus the easier maintainability may make that part of the code base more approachable |
I didn't know about this, but I am glad that it exists as I think that wrapping ft2font is a larger task than the 2 extensions in this PR. I think we should discourage use of C++17. NumPy is using C++11 |
This is building and passing tests in CI on Linux and Windows but not macOS. Problem is the mixture of C and C++ files in I can think of a few ways of attempting to solve this, all of them slightly unpleasant. |
This now passes all CI. Some of it probably isn't yet suitable for production code, but it is a good starting point to demonstrate that the approach could work. |
@@ -62,6 +62,7 @@ install: | |||
- echo %PYTHON_VERSION% %TARGET_ARCH% | |||
# Install dependencies from PyPI. | |||
- python -m pip install --upgrade -r requirements/testing/all.txt %EXTRAREQS% %PINNEDVERS% | |||
- python -m pip install "pybind11<2.10" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
why the pin?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
I don't recall the precise reason, presumably I was experimenting when the build didn't work. But it doesn't matter now, since merging of#24102 we don't need to preinstallpybind11
for editable installs so changes like this can be reverted.
mpl._tri.Triangulation() | ||
with pytest.raises( | ||
ValueError, match=r'x and y must be 1D arrays of the same length'): | ||
mpl._tri.Triangulation([], [1], [[]], None, None, None, False) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
if None is not accepted anymore this warrants a changelog entry
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
This is the internal C++ Triangulation API (matplotllib._tri.Triangulation
) that is used by the Python Triangulation class (matplotlib.tri.Triangulation
). It is all private and nobody should be using it, so it doesn't need a changelog entry.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Python Triangulation class still acceptsNone
fortriangles
, etc.
py::arg("edges"), | ||
py::arg("neighbors"), | ||
py::arg("correct_triangle_orientations"), | ||
"Triangulation(x, y, triangles, mask, edges, neighbors)\n" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
You can get rid of this manually typed signature. (Ditto everywhere.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Ah yes, following the use ofpy::arg
.
numpy::array_view<int, ndim> triangles(dims); | ||
int* triangles_ptr = triangles.data(); | ||
int dims[2] = {ntri, 3}; | ||
IndexArray triangles(dims); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
You don't even need to allocate dims on the stack; I usually just writeIndexArray triangle{{ntri, 3}};
. Ditto below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
I know. Remember that the idea of this PR is to demonstrate the new approach works and show how to map from the old to the new implementation. There used to be an explicit line declaring thedims
, so the new code has the equivalent line declaring thedims
.
I am going to stop working on this PR. I don't like the solution I came up with for compiling the qhull wrapper (composed of both C and C++ files) as I ended up altering distutils private methods to prevent C++ specific compiler flags being used to compile C code. Instead I will submit a PR just for the triangulation module part of this PR, and I can consider a better approach for the qhull wrapper in time. I'm happy just to close this PR and leave the branch hanging around for future comparison. Alternatively it could be kept open but labelled in some way to indicate that it won't ever be merged? |
Just close the PR. Open PRs anticipate a follow up action on the PR, even if that action is only "decide what to do". |
Uh oh!
There was an error while loading.Please reload this page.
This is a proof of concept of using
pybind11
to wrap and compile MPL C/C++ extensions. It covers just the_qhull
and_tri
extensions as (1) they are relatively small and self-contained, and (2) I understand them well. The motivation is to provide information to other devs on what it would involve and then consider if this is a suitable approach for all of our C/C++ extensions or not.Advantages:
array_view
class innumpy_cpp.h
aspybind11
includes its own NumPy array wrapper.PyArg_ParseTuple
as we get this for free.list
andtuple
, so no longer need to deal with Python C/API reference counting and error checking.pybind11
catches it and converts it to a Python equivalent exception._qhull.so
and_tri.so
are 1.6 MB and 1.3 MB onmain
but only 1.0 MB and 0.7 MB on this branch.Disadvantages:
I have tried to keep the changes as small as possible to make it clear what the minimal set of changes is. But if we were to go ahead with this approach we would probably want to make more, particularly to
typedef
s and numpy array accessors, to standardise and make the code clearer.Performance of a large
qhull
Delaunay triangulation is essentially unchanged by this PR. Performance of a largetricontour
call is ~3% faster. There is a tradeoff here between performance and code clarity, in particular whether we access multi-dimensional numpy arrays via conventional-looking indexing (e.g.array(j, i)
) which is slower than accessing via low-level C pointers where we do the index maths ourselves for speed.I know of one core dev (@anntzer) and one occasional contributor (me) who have
pybind11
experience, viamplcairo
andcontourpy
respectively.(Edited to add another disadvantage).