Description
TL;DR: benchmarks are poorly readable and could be greatly improved. This is a key element in convincing people of the soundness of RustPython, so it should probably not be neglected IMHO.
The violin plots available here are not easily readable, and their Y-axis labels are hardly readable at all because they got left-cut at some point. This is especially troublesome for the `MICROBENCHMARKS` section, for which it is impossible to tell RustPython from CPython.
This issue could be alleviated by doing the following:
- Use a specific color for CPython and another one for RustPython (and keep this color pair consistent across all plots).
- Always have CPython data on top and RustPython data on bottom (this is not consistent: in the `EXECUTION` tab, CPython is on top and RustPython on bottom, while in the `PARSE_TO_AST` tab it is the other way around).
- Only keep the name of the benchmark in the Y-axis labels, i.e. replace `execution/mandelbrot.py/cpython` by either `Mandelbrot` (and use a legend to indicate which color is which interpreter), or make a plot title saying `Mandelbrot` and use the Y-axis labels to tell whether it is CPython or RustPython. (A small plotting sketch of this layout is given right after this list.)
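To make the suggestion concrete, here is a minimal, hypothetical sketch of such a layout with matplotlib. The benchmark names, timings and styling are all made up; the point is only to show one fixed color per interpreter, CPython always drawn above RustPython, short benchmark names on the Y axis, and a legend mapping colors to interpreters:

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
benchmarks = ["mandelbrot", "nbody", "fib"]                     # hypothetical benchmark names
cpython = [rng.normal(1.0, 0.05, 200) for _ in benchmarks]      # fake timings in seconds
rustpython = [rng.normal(1.8, 0.15, 200) for _ in benchmarks]   # fake timings in seconds

COLORS = {"CPython": "tab:blue", "RustPython": "tab:orange"}    # one consistent color pair

fig, ax = plt.subplots(figsize=(8, 4))
for i in range(len(benchmarks)):
    # CPython always drawn on top (higher y position), RustPython just below it.
    for offset, (label, data) in enumerate([("CPython", cpython[i]),
                                            ("RustPython", rustpython[i])]):
        parts = ax.violinplot(data, positions=[2 * i + (1 - offset) * 0.6],
                              vert=False, widths=0.5, showmedians=True)
        for body in parts["bodies"]:
            body.set_facecolor(COLORS[label])
            body.set_alpha(0.7)

ax.set_yticks([2 * i + 0.3 for i in range(len(benchmarks))])
ax.set_yticklabels(benchmarks)              # short labels only, no interpreter suffix
ax.set_xlabel("time (s)")
ax.legend(handles=[plt.Rectangle((0, 0), 1, 1, color=c, alpha=0.7) for c in COLORS.values()],
          labels=list(COLORS.keys()))
plt.tight_layout()
plt.show()
```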
In addition to these visual issues, some other improvements could be implemented:
- Make the plots user-friendly using some interactive backend such as `plotly`.
- Put hyperlinks to the benchmark script location / source code, so that users can check what the benchmarks are actually doing.
- Along the same lines, add a small descriptive text about what the benchmark does / why it is relevant (for instance "benchmark X is particularly I/O intensive" or whatnot).
- At the top of the page, give the commit hash / version (possibly with the release date, to tell at a glance whether they are outdated) of both the CPython and RustPython binaries that were used, say whether they were recompiled locally with `-O3`, and list the machine specs (this would allow for meaningful comparison and reproducibility). A hypothetical sketch combining an interactive plot with this metadata follows this list.
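Here is a hedged sketch of what an interactive version could look like with `plotly`, surfacing the run metadata (interpreter versions / commit hashes, machine specs) in the figure title and the benchmark source location in the hover text. All names, paths, hashes and timings below are invented for illustration; the real values would come from the benchmark harness:

```python
import platform
import numpy as np
import plotly.graph_objects as go

rng = np.random.default_rng(0)
meta = {
    "cpython": "3.12.1 (commit abc1234, 2023-12-08)",                           # hypothetical
    "rustpython": "0.3.0 (commit def5678, 2024-01-15, built with --release)",   # hypothetical
    "machine": f"{platform.machine()} / {platform.processor() or 'unknown CPU'}",
}

fig = go.Figure()
for interp, color, loc in [("CPython", "royalblue", 1.0),
                           ("RustPython", "darkorange", 1.8)]:
    fig.add_trace(go.Violin(
        x=rng.normal(loc, 0.1, 200),            # fake timings in seconds
        y=["mandelbrot"] * 200,                 # hypothetical benchmark name
        name=interp,
        orientation="h",
        line_color=color,
        hovertext="source: benches/mandelbrot.py (hypothetical path)",
    ))

fig.update_layout(
    title=(f"mandelbrot: CPython {meta['cpython']} vs RustPython {meta['rustpython']}"
           f"<br><sub>machine: {meta['machine']}</sub>"),
    xaxis_title="time (s)",
    violinmode="group",
)
fig.show()
```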
I think that benchmarks are one of the key elements that might convince anyone to switch from one interpreter to another (apart from functionality / low-level bindings). Hence they should not be neglected.
If someone could point me to where these plots are generated, I'd be happy to help typeset them / add further info (although I might need some technical guidance on why benchmark X is especially relevant or not).