A header-only, modern C++ interface to use scripting languages like Python, Ruby or Gnuplot directly from C++
wardw/ioscript
ioscript is an interface for calling code written in higher level scripting languages directly from C++. The library motivates writing small function objects that embed useful scripting behaviour, either to be called independently or as small scripting 'snippets' to compose as a sequence. Function objects may also be bound to C++ types and so provide a method to invoke script code directly on C++ objects.
The original use case was quickly plotting C++ objects to debug scientific code using matplotlib and Gnuplot. However, you might find it useful in any context where it's helpful to bind script code directly to C++ types for further processing.
    // Example usage: Plot vec1 & vec2 using the python code wrapped in a LineChart,
    // and vec3 using the python code wrapped in a BarChart, then Show to screen
    Script<Python,MyTypes> script;
    script.run(LineChart{}, vec1, vec2, BarChart{}, vec3, Show{});
The library is agnostic to the scripting language: it should work with any runtime that's installed on your system and accepts code via its standard input. There is a set of examples for using the library with Python, Ruby and Gnuplot, but depending on the language it should be reasonably straightforward to set up with different runtimes.
The library is a single header plus optional definitions for each language's runtime. For C++17/1z there are no further dependencies; otherwise there is a toggle to switch to Boost for Boost.Variant (also header-only) (edit: but currently broken).
The interface is currently implemented in terms of POSIX pipes, so unfortunately there is no Windows support. Tested only on macOS, although I expect it to work fine on Linux.
The following example plots the values of a `std::vector` with Python and the matplotlib plotting library.
    #include "ioscript/ioscript.h"
    #include "ioscript/python.h"
    #include <tuple>
    #include <vector>

    using namespace iosc;

    struct LineChart
    {
        template <typename T>
        void operator()(Process<Python>& python, const T& obj) const {
            // Python side: read one line of integers from channel 0 and plot it
            python << R"(
    import matplotlib.pyplot as plt
    vals = map(int, iosc_in[0].readline().split())
    plt.plot(vals, 'o-')
    )";
            // C++ side: send the elements, space-separated, on channel 0
            for (auto& elem : obj) { python.data_out(0) << elem << " "; }
            python.data_out(0) << std::endl;
        }
    };

    struct Show {
        void operator()(Process<Python>& python) const { python << "plt.show()\n"; }
    };

    // Associate std::vector with a LineChart
    template<> struct binds_to<std::vector<int>> { using type = variant<LineChart>; };

    int main()
    {
        std::vector<int> series1{0,1,1,2,3,5,8,13,21,34,55,89};
        std::vector<int> series2{0,1,3,6,10,15,21,28,36,45,55,66,78,91};

        using MyTypes = std::tuple<std::vector<int>>;
        Script<Python,MyTypes> script;
        script.run(LineChart{}, series1, series2, Show{});
    }
The sections below outline the basic parts of the above example and of using ioscript in general.
Function objects like `LineChart` embed the scripting code to define the intended behaviour, for example the Python code used to describe how to plot a data type. For functors that take an object as a second parameter, immediately following is a set of C++ statements to accept the object and describe how that data is sent to the Python subprocess. In simple cases, since the two parts are kept together there's a direct continuity between how the data is sent and how the data is received.
In the above example, the call `python.data_out(0)` returns a `std::ostream` to send the data to the Python subprocess on 'channel 0'. In Python, the read end of channel 0 is represented by the file object `iosc_in[0]`.
If the embedded script code doesn't operate on any C++ object, the second parameter to the overloaded `operator()` can be omitted. Since the library only requires that the function object is callable, you could write the code inline with a lambda:
    script.run(LineChart{}, series1, series2, [](Process<Python>& py) {
        py << "plt.savefig('vals1.png')" << std::endl;
    });
Alternatively keep it with a name:
    struct Title {
        void operator()(Process<Python>& python) const {
            python << "plt.title('" << title << "')" << std::endl;
        }
        const char* title = "Default title";
    };
Function objects passed to ioscript must be default-constructible, copy-constructible and copy-assignable, but in broad terms the library imposes few requirements on the nature of the functor classes.
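These requirements can be checked at compile time with the standard type traits. The following is a minimal sketch, not something ioscript is known to provide: `is_snippet_like_v` and `NoDefault` are hypothetical names introduced only for illustration.

```cpp
#include <type_traits>

// Hypothetical helper: true when F meets the three requirements listed above.
template <typename F>
constexpr bool is_snippet_like_v =
    std::is_default_constructible_v<F> &&
    std::is_copy_constructible_v<F> &&
    std::is_copy_assignable_v<F>;

// A functor with a defaulted data member satisfies all three.
struct Title { const char* title = "Default title"; };

// A functor that can only be built from an argument does not.
struct NoDefault { explicit NoDefault(int) {} };

static_assert(is_snippet_like_v<Title>, "Title should qualify");
static_assert(!is_snippet_like_v<NoDefault>, "NoDefault lacks a default constructor");
```

A check like this can sit next to a snippet definition so that a violated requirement is reported at the definition rather than deep inside a template instantiation.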
The following specifies that, should we pass an object of type `std::vector` to `run()`, we want ioscript to call a `LineChart` function object on that `std::vector`. To do this, first wrap `LineChart` in a `variant` (which here has only one alternative) and then add a specialization of the type trait `binds_to<std::vector<int>>` that refers to this variant:
    // std::vector currently binds only to one alternative (LineChart)
    template<> struct binds_to<std::vector<int>> { using type = variant<LineChart>; };
Should you want to associate more data types to work with a `LineChart`, add additional `binds_to<>` statements. Templates work as expected, too. The following makes a `std::array<T,N>` of any type `T` and size `N` bind for use with our `LineChart` 'snippet':
    template<typename T, size_t N>
    struct binds_to<std::array<T,N>> { using type = variant<LineChart>; };
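The trait pattern can be tried in isolation. The sketch below is a stand-in under stated assumptions: it declares its own `binds_to` with an empty primary template and uses `std::variant` directly, which may differ from how ioscript defines either; `has_binding` is a hypothetical detection helper.

```cpp
#include <array>
#include <cstddef>
#include <type_traits>
#include <variant>
#include <vector>

struct LineChart {};
struct BarChart {};

// Stand-in primary template (assumption): no nested `type` means "no binding".
template <typename T>
struct binds_to {};

// Full specialization for one concrete type.
template <>
struct binds_to<std::vector<int>> { using type = std::variant<LineChart, BarChart>; };

// Partial specialization covering every std::array<T, N>.
template <typename T, std::size_t N>
struct binds_to<std::array<T, N>> { using type = std::variant<LineChart>; };

// Hypothetical helper: detects whether binds_to<T>::type exists.
template <typename T, typename = void>
struct has_binding : std::false_type {};

template <typename T>
struct has_binding<T, std::void_t<typename binds_to<T>::type>> : std::true_type {};

static_assert(has_binding<std::array<double, 3>>::value, "any array binds");
static_assert(!has_binding<double>::value, "double has no binding");
```

Because the partial specialization is parameterised on both `T` and `N`, a single definition covers arrays of every element type and size.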
To call out one piece of housekeeping, you must register the types you expect to use in calls to `run()`. That is, you must register at least one of every type such that all snippet types are visible to ioscript. This isn't ideal, but is the fundamental mechanism by which the static binding can work. To do so, define a `std::tuple` that lists your types, and use this as the second parameter when instantiating a `Script<Process,MyTypes>`:
    using MyTypes = std::tuple<std::vector<int>, std::array<int,0>>;  // `0` is sufficient for all array<int,N>

    // Later
    Script<Python, MyTypes> script;
    script.run(...);
If you forget to do so, ioscript should provide a useful error message to that effect, rather than leave you at the mercy of your compiler's template error messages[1].
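One way such a check could be expressed is a membership test over the tuple combined with a `static_assert`. This is only a sketch of the idea, not ioscript's actual mechanism: `is_registered` and `check_registered` are hypothetical names, and the exact-match test here does not reproduce the `std::array<int,0>` shorthand.

```cpp
#include <tuple>
#include <type_traits>
#include <vector>

// Hypothetical trait: is T one of the types listed in a std::tuple?
template <typename T, typename Tuple>
struct is_registered;  // only the std::tuple specialization below is defined

template <typename T, typename... Ts>
struct is_registered<T, std::tuple<Ts...>>
    : std::disjunction<std::is_same<T, Ts>...> {};

// Hypothetical check: fails to compile, with a readable message, if T is missing.
template <typename Registered, typename T>
constexpr bool check_registered() {
    static_assert(is_registered<T, Registered>::value,
                  "type passed to run() is not listed in the registered tuple");
    return true;
}

using MyTypes = std::tuple<std::vector<int>>;
static_assert(check_registered<MyTypes, std::vector<int>>(), "registered");
```

Calling `check_registered<MyTypes, double>()` would stop compilation at the `static_assert` with the message above.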
To execute a sequence of script snippets, instantiate a `Script<Python, MyTypes>` object and call its `run()` member function, passing objects of your snippet types. The call to `run()` takes any number of parameters, with arguments always parsed from left to right and processed in sequential order:
    std::vector<int> vec1, vec2;
    std::array<int,10> arr1;
    // Fill containers ...

    Script<Python, MyTypes> script;
    script.run(LineChart{}, Title{"Number sequences"}, vec1, vec2, arr1, Show{});
In the above example, the `run()` call will fork a new Python subprocess, keep a copy of our default-constructed `LineChart`, send our `Title` code to the Python interpreter and then call the `LineChart` copy with each `vector` and `array` argument in turn.
Notice each snippet object can have its own state. For example, you might modify `LineChart` to take a line colour parameter:
script.run(LineChart{'r'}, vec1, LineChart{'b'}, vec2, Show{});
The basic approach adopted is:
- Any snippet that defines the one-parameter `operator()(Process<P>&)` will be called immediately in sequence.
- Any snippet that defines the two-parameter form `operator()(Process<P>&, const T&)` will instead be called once with each object of type `T` that binds to it.
- A function object can contain either form of `operator()`, or both, with the above semantics applied to each form in turn.
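The distinction between the two call forms can be sketched with `std::is_invocable_v`. This is an illustration of the rules above, not ioscript's dispatch code: `FakeProcess`, `on_snippet` and `on_object` are hypothetical stand-ins, and the snippets just append to a log instead of sending script code.

```cpp
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for Process<P>: records what would be sent.
struct FakeProcess { std::string log; };

struct Show {  // one-parameter form only
    void operator()(FakeProcess& p) const { p.log += "show;"; }
};

struct LineChart {  // two-parameter form only
    void operator()(FakeProcess& p, const std::vector<int>& v) const {
        p.log += "plot" + std::to_string(v.size()) + ";";
    }
};

// Called when a snippet is met in the argument list: runs the
// one-parameter form if the snippet has one, otherwise does nothing.
template <typename F>
void on_snippet(FakeProcess& p, const F& f) {
    if constexpr (std::is_invocable_v<const F&, FakeProcess&>) f(p);
}

// Called when an object is met: runs the two-parameter form if present.
template <typename F, typename T>
void on_object(FakeProcess& p, const F& f, const T& obj) {
    if constexpr (std::is_invocable_v<const F&, FakeProcess&, const T&>) f(p, obj);
}
```

With these helpers, meeting `LineChart{}` alone sends nothing, a following three-element vector appends `plot3;`, and `Show{}` appends `show;` as soon as it is met.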
It's possible to declare a C++ object to be bound to more than one snippet alternative, with the choice of functor to be selected as part of each call to `run()`. To do so, add alternative snippets as additional types to the relevant `variant<>`'s parameter list. For example, had we defined a `BarChart` and `PieChart` along similar lines to `LineChart`, then
    // A std::vector<int> is now available for use with three alternative snippet types
    template<> struct binds_to<std::vector<int>> { using type = variant<LineChart,BarChart,PieChart>; };
Where more than one alternative exists, the alternative chosen is the 'most recent' type passed that binds to that object (that is, as if reading from left to right). For example
script.run(LineChart{}, vec1, vec2, BarChart{}, vec2, Show{});
will plot `vec1` and `vec2` as a `LineChart`, and `vec2` as a `BarChart`.
If no snippet is explicitly specified, a default-initialized snippet of the type of the first alternative is constructed. This follows immediately from `std::variant` or `boost::variant`.
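The variant behaviour described here is standard and easy to observe on its own. In this small sketch, `Bound`, `active_name` and the `name()` members are hypothetical and exist only to make the active alternative visible.

```cpp
#include <string>
#include <variant>

struct LineChart { std::string name() const { return "line"; } };
struct BarChart  { std::string name() const { return "bar"; } };

// The alternatives a std::vector<int> might bind to, in declaration order.
using Bound = std::variant<LineChart, BarChart>;

// Report which alternative is currently held.
std::string active_name(const Bound& b) {
    return std::visit([](const auto& snippet) { return snippet.name(); }, b);
}
```

A default-constructed `Bound` holds a `LineChart` (the first alternative); assigning `BarChart{}` replaces it, which mirrors the "most recent snippet wins" rule.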
By default, objects passed that do not associate with any snippets will cause a deliberate compile error (under the assumption that there is never a purposeful reason to do so). You can change this behaviour in `ioscript.h`; the error will lead you directly to the line to change.
Each call to `run()` forks a new subprocess, sends the sequence of code snippets and terminates the process. No state is shared between run calls. While this is noticeably slower, it makes it much easier to reason about each plot call independently.[2]
If you do want to share state that's common to a number of successive runs, pass the snippets in the `Script` constructor:
    Script<Gnuplot,MyTypes> script(ImageSize{800,600});
    script.run(vec1);                        // plots 800x600
    script.run(ImageSize{640,480}, vec2);    // plots 640x480
    script.run(vec3);                        // plots 800x600
All snippets passed in the constructor are stored in a common 'header' that's replayed at the beginning of each call to `run()`. The `run()` call remains independent, but each call is first prepended with any statements in the header. The semantics for passing arguments in the constructor work analogously to the `run()` member function.
The `Script` constructor is useful for initializing script-wide variables, importing modules or defining default behaviour. Conceptually it may be thought of as analogous to a constructor for your script.
    struct Header
    {
        void operator()(Process<Python>& python) const {
            python << "import matplotlib.pyplot as plt\n"
                   << "figtitle = 'Default title'\n";
        }
    };
In this sense additional snippets that mutate script-wide state might be thought of as interface methods to the script 'object' itself. From my experience, ioscript works best for relatively specific self-contained tasks, or those that perhaps share just a few script-wide variables (like plotting an object). I suspect the library may be less suited to larger scales that require more structure.
If at some later time you want to add additional code to be available for all subsequent runs, you can use the method `addToHeader(args...)`. This likewise adopts the same semantics as `run()`.
    script.addToHeader(Colours{OCEAN});  // Use OCEAN colour palette from now on
    script.run(vec2);
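The header idea can be pictured as text that is stored once and prepended to every run. The class below is a simplified sketch under that assumption: `ScriptSketch` and its members are hypothetical, and it works on plain strings rather than on snippets and subprocesses as ioscript does.

```cpp
#include <string>
#include <utility>

// Hypothetical model of the header mechanism using plain strings.
class ScriptSketch {
public:
    explicit ScriptSketch(std::string header = {}) : header_(std::move(header)) {}

    // Code added here is replayed at the start of every later run.
    void add_to_header(const std::string& code) { header_ += code; }

    // Each run is independent: it sees the header followed by its own body.
    std::string run(const std::string& body) const { return header_ + body; }

private:
    std::string header_;
};
```

Since `run()` never modifies the stored header, code passed to one run cannot leak into the next; only `add_to_header()` changes what later runs see.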
Since a variant default-initializes its first alternative, the simplest usage reduces to
    // Call vec1 with the default bound function object
    script.run(vec1);
Depending on your propensity for singletons, one approach may be to wrap a `Script` as a singleton for quick one-liner debugging.
The `Script` class uses the lower level `Process<Type>` to abstract handling of the subprocess itself. If you didn't want to use the `Script` interface, you could use `Process<Type>` independently and send data directly to that subprocess' standard input.
The `Process<Type>` class is parametrised on a type that has just one static member `cmd`, the command for the subprocess to execute. For example, the following example forks a `ruby` subprocess and sends it a Ruby statement to be executed by the ruby interpreter:
    #include "ioscript.h"

    using namespace iosc;

    struct Ruby { static constexpr const char* cmd = "ruby"; };

    int main()
    {
        Process<Ruby> ruby{1};  // Open 1 channel
        ruby << "puts 'Hello, world!'" << std::endl;
    }
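For readers unfamiliar with the underlying mechanism, the following sketch shows the general POSIX pattern of forking an interpreter and feeding code to its standard input. It is not ioscript's implementation: `run_script` and `write_all` are hypothetical helpers, and error handling is kept to a minimum.

```cpp
#include <cstddef>
#include <string>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Write the whole buffer, retrying on short writes. Returns false on error.
static bool write_all(int fd, const std::string& s) {
    std::size_t done = 0;
    while (done < s.size()) {
        ssize_t n = write(fd, s.data() + done, s.size() - done);
        if (n <= 0) return false;
        done += static_cast<std::size_t>(n);
    }
    return true;
}

// Fork `cmd`, feed `code` to its standard input, and return its exit status
// (-1 if the subprocess could not be set up or did not exit normally).
int run_script(const char* cmd, const std::string& code) {
    int fds[2];
    if (pipe(fds) != 0) return -1;

    pid_t pid = fork();
    if (pid < 0) { close(fds[0]); close(fds[1]); return -1; }

    if (pid == 0) {
        // Child: make the read end of the pipe its standard input.
        dup2(fds[0], STDIN_FILENO);
        close(fds[0]);
        close(fds[1]);
        execlp(cmd, cmd, static_cast<char*>(nullptr));
        _exit(127);  // only reached if exec failed
    }

    // Parent: send the code, then close the write end so the child sees EOF.
    close(fds[0]);
    write_all(fds[1], code);
    close(fds[1]);

    int status = 0;
    if (waitpid(pid, &status, 0) < 0) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, `run_script("sh", "exit 7\n")` returns 7; the same call with `"ruby"` or `"python"` would send script text to those interpreters, provided they are installed.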
Since the subprocess' standard input is tied to sending the code to execute, to send data the `Process<Type>` class opens additional pipes before forking the process. In C++, the write end of these channels is available via the `data_out()` member function, which returns a type inherited from `std::ostream`.
    // Send data via the first data channel
    ruby.data_out(0) << "22 67 14 42" << std::endl;
To access the data from the subprocess, use the member function `fd_r()` to get the value of the open file descriptor that refers to the subprocess' inherited read end of the pipe. Typically there will be some kind of I/O facility to map this file descriptor to a useful abstraction in the scripting language. In Ruby:
    // Get data as IO object
    ruby << "io = IO.open(" << ruby.fd_r(0) << ", mode=\"r\")" << std::endl;

    // Process e.g. as a single integer array per line
    ruby << R"(io.each_line {|line| puts "vals: #{line.split().map { |s| s.to_i }}" })" << std::endl;
It seems simplest to pass the `fd` value inline with the code string, rather than, say, trying to get a list of open file descriptors from the system via a system call on the subprocess' side.
One note: if you read past the end of the input stream Ruby will wait for further input as long as the `Process<Type>` object remains in scope. However, Ruby will also wait on its own input, indefinitely, as Ruby's inherited write end of the pipe remains open. In this case it's customary to first close the unused write end on the Ruby side:
    ruby << "a = IO.new(" << ruby.fd_w(0) << ",\"w\")" << std::endl
         << "a.close" << std::endl;
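The same end-of-file rule applies to any process that inherits a pipe, and can be demonstrated without an interpreter. In this sketch (hypothetical function name, plain POSIX calls, not ioscript code) the forked child plays the role of the script: it must close its inherited copy of the write end before reading, or `read()` would never report end-of-file.

```cpp
#include <cstddef>
#include <string>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Send `data` to a forked child over a pipe and return how many bytes the
// child read before reaching end-of-file (-1 on failure). The count travels
// back in the exit status, so this is only meant for inputs under 256 bytes.
int bytes_read_until_eof(const std::string& data) {
    int fds[2];
    if (pipe(fds) != 0) return -1;

    pid_t pid = fork();
    if (pid < 0) { close(fds[0]); close(fds[1]); return -1; }

    if (pid == 0) {
        // The child inherited BOTH ends. While its own copy of the write
        // end stays open, read() below can never return 0 (end-of-file).
        close(fds[1]);
        char buf[64];
        int total = 0;
        ssize_t n;
        while ((n = read(fds[0], buf, sizeof buf)) > 0) total += static_cast<int>(n);
        _exit(total & 0xff);
    }

    close(fds[0]);
    std::size_t done = 0;
    while (done < data.size()) {
        ssize_t n = write(fds[1], data.data() + done, data.size() - done);
        if (n <= 0) break;
        done += static_cast<std::size_t>(n);
    }
    close(fds[1]);  // last write end closed: the child's read() now returns 0

    int status = 0;
    if (waitpid(pid, &status, 0) < 0) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Removing the `close(fds[1])` in the child would leave it blocked in `read()` forever, which is the situation the Ruby statement above avoids.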
The approach in Python is similar:
    Process<Python> python;

    // Send code
    python << "print \"Hello, world!\"" << '\n';

    // Send data
    python.data_out(0) << "22 67 14 42" << std::endl;

    // Close the write end
    python << "import os" << '\n'
           << "os.close(" << python.fd_w(0) << ")" << '\n';

    // Create a file object to the read end
    python << R"(iosc_in = os.fdopen()" << python.fd_r(0) << R"(, 'r')
    for line in iosc_in:
        print "vals:", map(int, line.split())
    )";
See example_process.cpp for a working example.
This work is a tidying up of a previous version used for a project that's now finished. I've not yet had cause to use it more thoroughly, so this refactoring remains largely untested in real use. It's also fair to concede that while usage remains fairly simple in practice, the use of templates and static binding can cause a number of gotchas for common errors. There is still a lot of scope to smooth the experience. However, I wanted to get this down before moving on, and if anyone finds all or parts of this useful they're welcome to hack it, raise an issue or get in touch.
1: There are a number of `static_assert`s that hopefully catch the most commonly expected issues, although where this isn't the case the (perhaps over-zealous) use of templates does unfortunately mean your typical compiler errors can be a little, let's say, verbose and indirect.
2: Depending on the scripting language, there may be effective methods for cleanly resetting the interpreter's global state. In such cases there might be mileage in adding such behaviour to the script header and so modifying the `Script` class to sustain a single subprocess for the lifetime of the C++ program.