Argument Clinic How-To
author: Larry Hastings
Abstract
Argument Clinic is a preprocessor for CPython C files. Its purpose is to automate all the boilerplate involved with writing argument parsing code for “builtins”. This document shows you how to convert your first C function to work with Argument Clinic, and then introduces some advanced topics on Argument Clinic usage.
Currently Argument Clinic is considered internal-only for CPython. Its use is not supported for files outside CPython, and no guarantees are made regarding backwards compatibility for future versions. In other words: if you maintain an external C extension for CPython, you’re welcome to experiment with Argument Clinic in your own code. But the version of Argument Clinic that ships with the next version of CPython could be totally incompatible and break all your code.
The Goals Of Argument Clinic
Argument Clinic’s primary goal is to take over responsibility for all argument parsing code inside CPython. This means that, when you convert a function to work with Argument Clinic, that function should no longer do any of its own argument parsing; the code generated by Argument Clinic should be a “black box” to you, where CPython calls in at the top, and your code gets called at the bottom, with PyObject *args (and maybe PyObject *kwargs) magically converted into the C variables and types you need.
In order for Argument Clinic to accomplish its primary goal, it must be easy to use. Currently, working with CPython’s argument parsing library is a chore, requiring maintaining redundant information in a surprising number of places. When you use Argument Clinic, you don’t have to repeat yourself.
Obviously, no one would want to use Argument Clinic unless it’s solving their problem, and without creating new problems of its own. So it’s paramount that Argument Clinic generate correct code. It’d be nice if the code was faster, too, but at the very least it should not introduce a major speed regression. (Eventually Argument Clinic should make a major speedup possible: we could rewrite its code generator to produce tailor-made argument parsing code, rather than calling the general-purpose CPython argument parsing library. That would make for the fastest argument parsing possible!)
Additionally, Argument Clinic must be flexible enough to work with any approach to argument parsing. Python has some functions with some very strange parsing behaviors; Argument Clinic’s goal is to support all of them.
Finally, the original motivation for Argument Clinic was to provide introspection “signatures” for CPython builtins. It used to be, the introspection query functions would throw an exception if you passed in a builtin. With Argument Clinic, that’s a thing of the past!
One idea you should keep in mind, as you work with Argument Clinic: the more information you give it, the better job it’ll be able to do. Argument Clinic is admittedly relatively simple right now. But as it evolves it will get more sophisticated, and it should be able to do many interesting and smart things with all the information you give it.
Basic Concepts And Usage
Argument Clinic ships with CPython; you’ll find it in Tools/clinic/clinic.py. If you run that script, specifying a C file as an argument:
    $ python3 Tools/clinic/clinic.py foo.c
Argument Clinic will scan over the file looking for lines that look exactly like this:
    /*[clinic input]
When it finds one, it reads everything up to a line that looks exactly like this:
    [clinic start generated code]*/
Everything in between these two lines is input for Argument Clinic. All of these lines, including the beginning and ending comment lines, are collectively called an Argument Clinic “block”.
When Argument Clinic parses one of these blocks, it generates output. This output is rewritten into the C file immediately after the block, followed by a comment containing a checksum. The Argument Clinic block now looks like this:
    /*[clinic input]
    ... clinic input goes here ...
    [clinic start generated code]*/
    ... clinic output goes here ...
    /*[clinic end generated code: checksum=...]*/
If you run Argument Clinic on the same file a second time, Argument Clinic will discard the old output and write out the new output with a fresh checksum line. However, if the input hasn’t changed, the output won’t change either.
You should never modify the output portion of an Argument Clinic block. Instead, change the input until it produces the output you want. (That’s the purpose of the checksum: to detect if someone changed the output, as these edits would be lost the next time Argument Clinic writes out fresh output.)
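To make the role of the checksum concrete, here is a minimal Python sketch. It is not Argument Clinic's actual implementation. It assumes the checksum is a SHA-1 hex digest of the output text, which is at least consistent with the checksum shown later in this document for a block whose output is empty. The helper names checksum_of and output_was_edited are hypothetical.

```python
import hashlib


def checksum_of(output_text):
    """Return a SHA-1 hex digest of the output text (an assumed scheme)."""
    return hashlib.sha1(output_text.encode("utf-8")).hexdigest()


def output_was_edited(output_text, recorded_checksum):
    """Report whether the output no longer matches its recorded checksum."""
    return checksum_of(output_text) != recorded_checksum


# Hypothetical generated output and the checksum recorded next to it.
generated = "static PyObject *\nexample_impl(PyObject *self)\n"
recorded = checksum_of(generated)

# Untouched output still matches the recorded checksum.
print(output_was_edited(generated, recorded))
# A hand edit to the output is detected.
print(output_was_edited(generated + "/* tweak */\n", recorded))
# The digest of an empty output.
print(checksum_of(""))
```

A tool built this way can refuse to overwrite output that someone has edited by hand, instead of silently discarding the edit.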
For the sake of clarity, here’s the terminology we’ll use with Argument Clinic:
- The first line of the comment (/*[clinic input]) is the start line.
- The last line of the initial comment ([clinic start generated code]*/) is the end line.
- The last line (/*[clinic end generated code: checksum=...]*/) is the checksum line.
- In between the start line and the end line is the input.
- In between the end line and the checksum line is the output.
- All the text collectively, from the start line to the checksum line inclusively, is the block. (A block that hasn’t been successfully processed by Argument Clinic yet doesn’t have output or a checksum line, but it’s still considered a block.)
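The terminology above can be illustrated with a short Python sketch that splits C source text into blocks. It is only an illustration under simplifying assumptions, not the parser used by Tools/clinic/clinic.py. The function name find_blocks and the dictionary layout are hypothetical.

```python
import re

START_LINE = "/*[clinic input]"
END_LINE = "[clinic start generated code]*/"
# Matches a checksum line regardless of what follows the colon.
CHECKSUM_RE = re.compile(r"/\*\[clinic end generated code: .*\]\*/")


def find_blocks(source):
    """Split C source text into blocks with 'input' and 'output' parts.

    'output' is None for a block that has no checksum line yet.
    """
    lines = source.splitlines()
    blocks = []
    i = 0
    while i < len(lines):
        if lines[i] != START_LINE:
            i += 1
            continue
        # Collect the input: everything between the start and end lines.
        i += 1
        input_lines = []
        while i < len(lines) and lines[i] != END_LINE:
            input_lines.append(lines[i])
            i += 1
        if i == len(lines):
            raise ValueError("start line without a matching end line")
        i += 1  # skip the end line
        # Look for a checksum line; give up if another block starts first.
        output = None
        j = i
        while j < len(lines) and lines[j] != START_LINE:
            if CHECKSUM_RE.fullmatch(lines[j]):
                output = "\n".join(lines[i:j])
                i = j + 1
                break
            j += 1
        blocks.append({"input": "\n".join(input_lines), "output": output})
    return blocks


sample = """\
/*[clinic input]
module _pickle
[clinic start generated code]*/
/*[clinic end generated code: checksum=...]*/

/*[clinic input]
_pickle.Pickler.dump
[clinic start generated code]*/
static int unrelated;
"""

for block in find_blocks(sample):
    print(repr(block["input"]), repr(block["output"]))
```

The first block in the sample has been processed (its output is empty), while the second has no checksum line and therefore no output yet.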
Converting Your First Function
The best way to get a sense of how Argument Clinic works is to convert a function to work with it. Here, then, are the bare minimum steps you’d need to follow to convert a function to work with Argument Clinic. Note that for code you plan to check in to CPython, you really should take the conversion farther, using some of the advanced concepts you’ll see later on in the document (like “return converters” and “self converters”). But we’ll keep it simple for this walkthrough so you can learn.
Let’s dive in!
Make sure you’re working with a freshly updated checkout of the CPython trunk.
Find a Python builtin that calls either PyArg_ParseTuple() or PyArg_ParseTupleAndKeywords(), and hasn’t been converted to work with Argument Clinic yet. For my example I’m using _pickle.Pickler.dump().
If the call to the PyArg_Parse function uses any of the following format units:
    O&
    O!
    es
    es#
    et
    et#
or if it has multiple calls to PyArg_ParseTuple(), you should choose a different function. Argument Clinic does support all of these scenarios. But these are advanced topics; let’s do something simpler for your first function.
Also, if the function has multiple calls to PyArg_ParseTuple() or PyArg_ParseTupleAndKeywords() where it supports different types for the same argument, or if the function uses something besides PyArg_Parse functions to parse its arguments, it probably isn’t suitable for conversion to Argument Clinic. Argument Clinic doesn’t support generic functions or polymorphic parameters.
Add the following boilerplate above the function, creating our block:
    /*[clinic input]
    [clinic start generated code]*/
Cut the docstring and paste it in between the [clinic] lines, removing all the junk that makes it a properly quoted C string. When you’re done you should have just the text, based at the left margin, with no line wider than 80 characters. (Argument Clinic will preserve indents inside the docstring.)
If the old docstring had a first line that looked like a function signature, throw that line away. (The docstring doesn’t need it anymore; when you use help() on your builtin in the future, the first line will be built automatically based on the function’s signature.)
Sample:
    /*[clinic input]
    Write a pickled representation of obj to the open file.
    [clinic start generated code]*/
If your docstring doesn’t have a “summary” line, Argument Clinic will complain. So let’s make sure it has one. The “summary” line should be a paragraph consisting of a single 80-column line at the beginning of the docstring.
(Our example docstring consists solely of a summary line, so the sample code doesn’t have to change for this step.)
Above the docstring, enter the name of the function, followed by a blank line. This should be the Python name of the function, and should be the full dotted path to the function: it should start with the name of the module, include any sub-modules, and if the function is a method on a class it should include the class name too.
Sample:
    /*[clinic input]
    _pickle.Pickler.dump

    Write a pickled representation of obj to the open file.
    [clinic start generated code]*/
If this is the first time that module or class has been used with Argument Clinic in this C file, you must declare the module and/or class. Proper Argument Clinic hygiene prefers declaring these in a separate block somewhere near the top of the C file, in the same way that include files and statics go at the top. (In our sample code we’ll just show the two blocks next to each other.)
The name of the class and module should be the same as the one seen by Python. Check the name defined in the PyModuleDef or PyTypeObject as appropriate.
When you declare a class, you must also specify two aspects of its type in C: the type declaration you’d use for a pointer to an instance of this class, and a pointer to the PyTypeObject for this class.
Sample:
    /*[clinic input]
    module _pickle
    class _pickle.Pickler "PicklerObject *" "&Pickler_Type"
    [clinic start generated code]*/

    /*[clinic input]
    _pickle.Pickler.dump

    Write a pickled representation of obj to the open file.
    [clinic start generated code]*/
Declare each of the parameters to the function. Each parameter should get its own line. All the parameter lines should be indented from the function name and the docstring.
The general form of these parameter lines is as follows:
    name_of_parameter: converter
If the parameter has a default value, add that after the converter:
    name_of_parameter: converter = default_value
Argument Clinic’s support for “default values” is quite sophisticated; please see the section below on default values for more information.
Add a blank line below the parameters.
What’s a “converter”? It establishes both the type of the variable used in C, and the method to convert the Python value into a C value at runtime. For now you’re going to use what’s called a “legacy converter”, a convenience syntax intended to make porting old code into Argument Clinic easier.
For each parameter, copy the “format unit” for that parameter from the PyArg_Parse() format argument and specify that as its converter, as a quoted string. (“format unit” is the formal name for the one-to-three character substring of the format parameter that tells the argument parsing function what the type of the variable is and how to convert it. For more on format units please see Parsing arguments and building values.)
For multicharacter format units like z#, use the entire two-or-three character string.
Sample:
    /*[clinic input]
    module _pickle
    class _pickle.Pickler "PicklerObject *" "&Pickler_Type"
    [clinic start generated code]*/

    /*[clinic input]
    _pickle.Pickler.dump

        obj: 'O'

    Write a pickled representation of obj to the open file.
    [clinic start generated code]*/
If your function has | in the format string, meaning some parameters have default values, you can ignore it. Argument Clinic infers which parameters are optional based on whether or not they have default values.
If your function has $ in the format string, meaning it takes keyword-only arguments, specify * on a line by itself before the first keyword-only argument, indented the same as the parameter lines.
(_pickle.Pickler.dump has neither, so our sample is unchanged.)
If the existing C function calls PyArg_ParseTuple() (as opposed to PyArg_ParseTupleAndKeywords()), then all its arguments are positional-only.
To mark all parameters as positional-only in Argument Clinic, add a / on a line by itself after the last parameter, indented the same as the parameter lines.
Currently this is all-or-nothing; either all parameters are positional-only, or none of them are. (In the future Argument Clinic may relax this restriction.)
Sample:
    /*[clinic input]
    module _pickle
    class _pickle.Pickler "PicklerObject *" "&Pickler_Type"
    [clinic start generated code]*/

    /*[clinic input]
    _pickle.Pickler.dump

        obj: 'O'
        /

    Write a pickled representation of obj to the open file.
    [clinic start generated code]*/
It’s helpful to write a per-parameter docstring for each parameter. But per-parameter docstrings are optional; you can skip this step if you prefer.
Here’s how to add a per-parameter docstring. The first line of the per-parameter docstring must be indented further than the parameter definition. The left margin of this first line establishes the left margin for the whole per-parameter docstring; all the text you write will be outdented by this amount. You can write as much text as you like, across multiple lines if you wish.
Sample:
    /*[clinic input]
    module _pickle
    class _pickle.Pickler "PicklerObject *" "&Pickler_Type"
    [clinic start generated code]*/

    /*[clinic input]
    _pickle.Pickler.dump

        obj: 'O'
            The object to be pickled.
        /

    Write a pickled representation of obj to the open file.
    [clinic start generated code]*/
Save and close the file, then run Tools/clinic/clinic.py on it. With luck everything worked: your block now has output, and a .c.h file has been generated! Reopen the file in your text editor to see:
    /*[clinic input]
    _pickle.Pickler.dump

        obj: 'O'
            The object to be pickled.
        /

    Write a pickled representation of obj to the open file.
    [clinic start generated code]*/

    static PyObject *
    _pickle_Pickler_dump(PicklerObject *self, PyObject *obj)
    /*[clinic end generated code: output=87ecad1261e02ac7 input=552eb1c0f52260d9]*/
Obviously, if Argument Clinic didn’t produce any output, it’s because it found an error in your input. Keep fixing your errors and retrying until Argument Clinic processes your file without complaint.
For readability, most of the glue code has been generated to a .c.h file. You’ll need to include that in your original .c file, typically right after the clinic module block:
    #include "clinic/_pickle.c.h"
Double-check that the argument-parsing code Argument Clinic generated looks basically the same as the existing code.
First, ensure both places use the same argument-parsing function. The existing code must call either PyArg_ParseTuple() or PyArg_ParseTupleAndKeywords(); ensure that the code generated by Argument Clinic calls the exact same function.
Second, the format string passed in to PyArg_ParseTuple() or PyArg_ParseTupleAndKeywords() should be exactly the same as the hand-written one in the existing function, up to the colon or semi-colon.
(Argument Clinic always generates its format strings with a : followed by the name of the function. If the existing code’s format string ends with ;, to provide usage help, this change is harmless; don’t worry about it.)
Third, for parameters whose format units require two arguments (like a length variable, or an encoding string, or a pointer to a conversion function), ensure that the second argument is exactly the same between the two invocations.
Fourth, inside the output portion of the block you’ll find a preprocessor macro defining the appropriate static PyMethodDef structure for this builtin:
    #define __PICKLE_PICKLER_DUMP_METHODDEF    \
    {"dump", (PyCFunction)__pickle_Pickler_dump, METH_O, __pickle_Pickler_dump__doc__},
This static structure should be exactly the same as the existing static PyMethodDef structure for this builtin.
If any of these items differ in any way, adjust your Argument Clinic function specification and rerun Tools/clinic/clinic.py until they are the same.
Notice that the last line of its output is the declaration of your “impl” function. This is where the builtin’s implementation goes. Delete the existing prototype of the function you’re modifying, but leave the opening curly brace. Now delete its argument parsing code and the declarations of all the variables it dumps the arguments into. Notice how the Python arguments are now arguments to this impl function; if the implementation used different names for these variables, fix it.
Let’s reiterate, just because it’s kind of weird. Your code should now look like this:
    static return_type
    your_function_impl(...)
    /*[clinic end generated code: checksum=...]*/
    {
    ...
Argument Clinic generated the checksum line and the function prototype just above it. You should write the opening (and closing) curly braces for the function, and the implementation inside.
Sample:
    /*[clinic input]
    module _pickle
    class _pickle.Pickler "PicklerObject *" "&Pickler_Type"
    [clinic start generated code]*/
    /*[clinic end generated code: checksum=da39a3ee5e6b4b0d3255bfef95601890afd80709]*/

    /*[clinic input]
    _pickle.Pickler.dump

        obj: 'O'
            The object to be pickled.
        /

    Write a pickled representation of obj to the open file.
    [clinic start generated code]*/

    PyDoc_STRVAR(__pickle_Pickler_dump__doc__,
    "Write a pickled representation of obj to the open file.\n"
    "\n"
    ...
    static PyObject *
    _pickle_Pickler_dump_impl(PicklerObject *self, PyObject *obj)
    /*[clinic end generated code: checksum=3bd30745bf206a48f8b576a1da3d90f55a0a4187]*/
    {
        /* Check whether the Pickler was initialized correctly (issue3664).
           Developers often forget to call __init__() in their subclasses, which
           would trigger a segfault without this check. */
        if (self->write == NULL) {
            PyErr_Format(PicklingError,
                         "Pickler.__init__() was not called by %s.__init__()",
                         Py_TYPE(self)->tp_name);
            return NULL;
        }

        if (_Pickler_ClearBuffer(self) < 0)
            return NULL;

        ...
Remember the macro with the PyMethodDef structure for this function? Find the existing PyMethodDef structure for this function and replace it with a reference to the macro. (If the builtin is at module scope, this will probably be very near the end of the file; if the builtin is a class method, this will probably be below but relatively near to the implementation.)
Note that the body of the macro contains a trailing comma. So when you replace the existing static PyMethodDef structure with the macro, don’t add a comma to the end.
Sample:
    static struct PyMethodDef Pickler_methods[] = {
        __PICKLE_PICKLER_DUMP_METHODDEF
        __PICKLE_PICKLER_CLEAR_MEMO_METHODDEF
        {NULL, NULL}                /* sentinel */
    };
Compile, then run the relevant portions of the regression-test suite. This change should not introduce any new compile-time warnings or errors, and there should be no externally-visible change to Python’s behavior.
Well, except for one difference: inspect.signature() run on your function should now provide a valid signature!
Congratulations, you’ve ported your first function to work with Argument Clinic!
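As a quick way to see the signature benefit from the Python side, the following sketch queries a builtin with inspect.signature(). It assumes a CPython build in which len() carries signature metadata; the choice of len() is only an example and is not part of the walkthrough above.

```python
import inspect

# len() is implemented in C. On CPython builds where it carries signature
# metadata, inspect.signature() can describe it like a Python function.
sig = inspect.signature(len)
print(sig)

# Each parameter reports its name and kind (positional-only, keyword, ...).
for name, param in sig.parameters.items():
    print(name, param.kind.name)
```

After converting your own function, running the same query against it is a simple check that the conversion produced a usable signature.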
Advanced Topics
Now that you’ve had some experience working with Argument Clinic, it’s time for some advanced topics.
Symbolic default values
The default value you provide for a parameter can’t be any arbitrary expression. Currently the following are explicitly supported:
- Numeric constants (integer and float)
- String constants
- True, False, and None
- Simple symbolic constants like sys.maxsize, which must start with the name of the module
In case you’re curious, this is implemented in from_builtin() in Lib/inspect.py.
(In the future, this may need to get even more elaborate, to allow full expressions like CONSTANT - 1.)
Renaming the C functions and variables generated by Argument Clinic
Argument Clinic automatically names the functions it generates for you. Occasionally this may cause a problem, if the generated name collides with the name of an existing C function. There’s an easy solution: override the names used for the C functions. Just add the keyword "as" to your function declaration line, followed by the function name you wish to use. Argument Clinic will use that function name for the base (generated) function, then add "_impl" to the end and use that for the name of the impl function.
For example, if we wanted to rename the C function names generated for pickle.Pickler.dump, it’d look like this:
    /*[clinic input]
    pickle.Pickler.dump as pickler_dumper

    ...
The base function would now be named pickler_dumper(), and the impl function would now be named pickler_dumper_impl().
Similarly, you may have a problem where you want to give a parameter a specific Python name, but that name may be inconvenient in C. Argument Clinic allows you to give a parameter different names in Python and in C, using the same "as" syntax:
    /*[clinic input]
    pickle.Pickler.dump

        obj: object
        file as file_obj: object
        protocol: object = NULL
        *
        fix_imports: bool = True
Here, the name used in Python (in the signature and the keywords array) would be file, but the C variable would be named file_obj.
You can use this to rename the self parameter too!
Converting functions using PyArg_UnpackTuple
To convert a function parsing its arguments with PyArg_UnpackTuple(), simply write out all the arguments, specifying each as an object. You may specify the type argument to cast the type as appropriate. All arguments should be marked positional-only (add a / on a line by itself after the last argument).
Currently the generated code will use PyArg_ParseTuple(), but this will change soon.
Optional Groups
Some legacy functions have a tricky approach to parsing their arguments: they count the number of positional arguments, then use a switch statement to call one of several different PyArg_ParseTuple() calls depending on how many positional arguments there are. (These functions cannot accept keyword-only arguments.) This approach was used to simulate optional arguments back before PyArg_ParseTupleAndKeywords() was created.
While functions using this approach can often be converted to use PyArg_ParseTupleAndKeywords(), optional arguments, and default values, it’s not always possible. Some of these legacy functions have behaviors PyArg_ParseTupleAndKeywords() doesn’t directly support. The most obvious example is the builtin function range(), which has an optional argument on the left side of its required argument! Another example is curses.window.addch(), which has a group of two arguments that must always be specified together. (The arguments are called x and y; if you call the function passing in x, you must also pass in y, and if you don’t pass in x you may not pass in y either.)
In any case, the goal of Argument Clinic is to support argument parsing for all existing CPython builtins without changing their semantics. Therefore Argument Clinic supports this alternate approach to parsing, using what are called optional groups. Optional groups are groups of arguments that must all be passed in together. They can be to the left or the right of the required arguments. They can only be used with positional-only parameters.
Note
Optional groups are only intended for use when converting functions that make multiple calls to PyArg_ParseTuple()! Functions that use any other approach for parsing arguments should almost never be converted to Argument Clinic using optional groups. Functions using optional groups currently cannot have accurate signatures in Python, because Python just doesn’t understand the concept. Please avoid using optional groups wherever possible.
To specify an optional group, add a [ on a line by itself before the parameters you wish to group together, and a ] on a line by itself after these parameters. As an example, here’s how curses.window.addch uses optional groups to make the first two parameters and the last parameter optional:
    /*[clinic input]

    curses.window.addch

        [
        x: int
          X-coordinate.
        y: int
          Y-coordinate.
        ]

        ch: object
          Character to add.

        [
        attr: long
          Attributes for the character.
        ]
        /

    ...
Notes:
- For every optional group, one additional parameter will be passed into the impl function representing the group. The parameter will be an int named group_{direction}_{number}, where {direction} is left if the group is before the required parameters and right if it is after them, and {number} is a monotonically increasing number (starting at 1) indicating how far away the group is from the required parameters. When the impl is called, this parameter will be set to zero if this group was unused, and set to non-zero if this group was used. (By used or unused, I mean whether or not the parameters received arguments in this invocation.)
- If there are no required arguments, the optional groups will behave as if they’re to the right of the required arguments.
- In the case of ambiguity, the argument parsing code favors parameters on the left (before the required parameters).
- Optional groups can only contain positional-only parameters.
- Optional groups are only intended for legacy code. Please do not use optional groups for new code.
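The behavior of optional groups can be emulated in a few lines of Python. The sketch below follows the curses.window.addch declaration shown above and dispatches on the number of positional arguments, as the legacy switch-based approach does. It is an illustration only, not the code Argument Clinic generates. The function name parse_addch and the dictionary it returns are hypothetical.

```python
def parse_addch(*args):
    """Emulate optional-group parsing for addch([x, y,] ch, [attr]).

    Returns the parsed values plus the group flags an impl function
    would receive (1 if the group received arguments, otherwise 0).
    """
    parsed = {
        "group_left_1": 0, "x": 0, "y": 0,
        "ch": None,
        "group_right_1": 0, "attr": 0,
    }
    count = len(args)
    if count == 1:
        # Only the required parameter.
        (parsed["ch"],) = args
    elif count == 2:
        # Required parameter plus the right-hand group.
        parsed["ch"], parsed["attr"] = args
        parsed["group_right_1"] = 1
    elif count == 3:
        # Left-hand group plus the required parameter.
        parsed["x"], parsed["y"], parsed["ch"] = args
        parsed["group_left_1"] = 1
    elif count == 4:
        # Both groups were supplied.
        parsed["x"], parsed["y"], parsed["ch"], parsed["attr"] = args
        parsed["group_left_1"] = 1
        parsed["group_right_1"] = 1
    else:
        raise TypeError("addch requires 1 to 4 arguments")
    return parsed


print(parse_addch("A"))
print(parse_addch("A", 7))
print(parse_addch(5, 10, "A"))
```

Because every group must be supplied whole, the argument count alone determines which groups were used in this example.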
Using real Argument Clinic converters, instead of “legacy converters”
To save time, and to minimize how much you need to learn to achieve your first port to Argument Clinic, the walkthrough above tells you to use “legacy converters”. “Legacy converters” are a convenience, designed explicitly to make porting existing code to Argument Clinic easier. And to be clear, their use is acceptable when porting code for Python 3.4.
However, in the long term we probably want all our blocks to use Argument Clinic’s real syntax for converters. Why? A few reasons:
- The proper converters are far easier to read and clearer in their intent.
- There are some format units that are unsupported as “legacy converters”, because they require arguments, and the legacy converter syntax doesn’t support specifying arguments.
- In the future we may have a new argument parsing library that isn’t restricted to what PyArg_ParseTuple() supports; this flexibility won’t be available to parameters using legacy converters.
Therefore, if you don’t mind a little extra effort, please use the normal converters instead of legacy converters.
In a nutshell, the syntax for Argument Clinic (non-legacy) converters looks like a Python function call. However, if there are no explicit arguments to the function (all functions take their default values), you may omit the parentheses. Thus bool and bool() are exactly the same converters.
All arguments to Argument Clinic converters are keyword-only. All Argument Clinic converters accept the following arguments:
- c_default: The default value for this parameter when defined in C. Specifically, this will be the initializer for the variable declared in the “parse function”. See the section on default values for how to use this. Specified as a string.
- annotation: The annotation value for this parameter. Not currently supported, because PEP 8 mandates that the Python library may not use annotations.
In addition, some converters accept additional arguments. Here is a list of these arguments, along with their meanings:
- accept: A set of Python types (and possibly pseudo-types); this restricts the allowable Python argument to values of these types. (This is not a general-purpose facility; as a rule it only supports specific lists of types as shown in the legacy converter table.) To accept None, add NoneType to this set.
- bitwise: Only supported for unsigned integers. The native integer value of this Python argument will be written to the parameter without any range checking, even for negative values.
- converter: Only supported by the object converter. Specifies the name of a C “converter function” to use to convert this object to a native type.
- encoding: Only supported for strings. Specifies the encoding to use when converting this string from a Python str (Unicode) value into a C char * value.
- subclass_of: Only supported for the object converter. Requires that the Python value be a subclass of a Python type, as expressed in C.
- type: Only supported for the object and self converters. Specifies the C type that will be used to declare the variable. Default value is "PyObject *".
- zeroes: Only supported for strings. If true, embedded NUL bytes ('\0') are permitted inside the value. The length of the string will be passed in to the impl function, just after the string parameter, as a parameter named <parameter_name>_length.
Please note, not every possible combination of arguments will work. Usually these arguments are implemented by specific PyArg_ParseTuple format units, with specific behavior. For example, currently you cannot call unsigned_short without also specifying bitwise=True. Although it’s perfectly reasonable to think this would work, these semantics don’t map to any existing format unit. So Argument Clinic doesn’t support it. (Or, at least, not yet.)
Below is a table showing the mapping of legacy converters into real Argument Clinic converters. On the left is the legacy converter, on the right is the text you’d replace it with.
[Table: legacy converter (left column) and the equivalent Argument Clinic converter (right column); the table rows are missing from this copy.]
As an example, here’s our sample pickle.Pickler.dump using the proper converter:
    /*[clinic input]
    pickle.Pickler.dump

        obj: object
            The object to be pickled.
        /

    Write a pickled representation of obj to the open file.
    [clinic start generated code]*/
One advantage of real converters is that they’re more flexible than legacy converters. For example, the unsigned_int converter (and all the unsigned_ converters) can be specified without bitwise=True. Their default behavior performs range checking on the value, and they won’t accept negative numbers. You just can’t do that with a legacy converter!
Argument Clinic will show you all the converters it has available. For each converter it’ll show you all the parameters it accepts, along with the default value for each parameter. Just run Tools/clinic/clinic.py --converters to see the full list.
Py_buffer
When using the Py_buffer converter (or the 's*', 'w*', 'y*', or 'z*' legacy converters), you must not call PyBuffer_Release() on the provided buffer. Argument Clinic generates code that does it for you (in the parsing function).
Advanced converters
Remember those format units you skipped for your first time because they were advanced? Here’s how to handle those too.
The trick is, all those format units take arguments: either conversion functions, or types, or strings specifying an encoding. (But “legacy converters” don’t support arguments. That’s why we skipped them for your first function.) The argument you specified to the format unit is now an argument to the converter; this argument is either converter (for O&), subclass_of (for O!), or encoding (for all the format units that start with e).
When using subclass_of, you may also want to use the other custom argument for object(): type, which lets you set the type actually used for the parameter. For example, if you want to ensure that the object is a subclass of PyUnicode_Type, you probably want to use the converter object(type='PyUnicodeObject *', subclass_of='&PyUnicode_Type').
One possible problem with using Argument Clinic: it takes away some possible flexibility for the format units starting with e. When writing a PyArg_Parse call by hand, you could theoretically decide at runtime what encoding string to pass in to PyArg_ParseTuple(). But now this string must be hard-coded at Argument-Clinic-preprocessing-time. This limitation is deliberate; it made supporting this format unit much easier, and may allow for future optimizations. This restriction doesn’t seem unreasonable; CPython itself always passes in static hard-coded encoding strings for parameters whose format units start with e.
Parameter default values
Default values for parameters can be any of a number of values. At their simplest, they can be string, int, or float literals:
    foo: str = "abc"
    bar: int = 123
    bat: float = 45.6
They can also use any of Python’s built-in constants:
    yep:  bool = True
    nope: bool = False
    nada: object = None
There’s also special support for a default value of NULL, and for simple expressions, documented in the following sections.
The NULL default value
For string and object parameters, you can set them to None to indicate that there’s no default. However, that means the C variable will be initialized to Py_None. For convenience’s sake, there’s a special value called NULL for just this reason: from Python’s perspective it behaves like a default value of None, but the C variable is initialized with NULL.
Expressions specified as default values
The default value for a parameter can be more than just a literal value. It can be an entire expression, using math operators and looking up attributes on objects. However, this support isn’t exactly simple, because of some non-obvious semantics.
Consider the following example:

    foo: Py_ssize_t = sys.maxsize - 1

sys.maxsize can have different values on different platforms. Therefore Argument Clinic can’t simply evaluate that expression locally and hard-code it in C. So it stores the default in such a way that it will get evaluated at runtime, when the user asks for the function’s signature.
What namespace is available when the expression is evaluated? It’s evaluated in the context of the module the builtin came from. So, if your module has an attribute called “max_widgets”, you may simply use it:

    foo: Py_ssize_t = max_widgets

If the symbol isn’t found in the current module, it fails over to looking in sys.modules. That’s how it can find sys.maxsize for example. (Since you don’t know in advance what modules the user will load into their interpreter, it’s best to restrict yourself to modules that are preloaded by Python itself.)
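The lookup order described above (the builtin’s own module first, then sys.modules) can be imitated in a few lines of Python. This is only a sketch of the described behaviour, not the code CPython actually runs; the function name evaluate_default and the demo module are made up for the illustration.

```python
import sys
import types
from collections import ChainMap


def evaluate_default(expression, module):
    """Evaluate a default-value expression using the lookup order in the text.

    Names are resolved in the module's namespace first; anything missing
    there is looked up in sys.modules. (Hypothetical helper, for
    illustration only.)
    """
    namespace = ChainMap(vars(module), sys.modules)
    # Builtins are emptied so that only the two namespaces above are searched.
    return eval(expression, {"__builtins__": {}}, namespace)


# A stand-in for "the module the builtin came from".
demo = types.ModuleType("demo")
demo.max_widgets = 12

print(evaluate_default("max_widgets", demo))      # found in the module itself
print(evaluate_default("sys.maxsize - 1", demo))  # "sys" comes from sys.modules
```

The second call only works because sys is already present in sys.modules, which is the reason the text recommends sticking to modules that Python preloads.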
Evaluating default values only at runtime means Argument Clinic can’t compute the correct equivalent C default value. So you need to tell it explicitly. When you use an expression, you must also specify the equivalent expression in C, using the c_default parameter to the converter:

    foo: Py_ssize_t(c_default="PY_SSIZE_T_MAX - 1") = sys.maxsize - 1

Another complication: Argument Clinic can’t know in advance whether or not the expression you supply is valid. It parses it to make sure it looks legal, but it can’t actually know. When you use an expression as a default value, be very careful to ensure it is guaranteed to be valid at runtime!
Finally, because expressions must be representable as static C values, there are many restrictions on legal expressions. Here’s a list of Python features you’re not permitted to use:
- Function calls.
- Inline if statements (3 if foo else 5).
- Automatic sequence unpacking (*[1, 2, 3]).
- List/set/dict comprehensions and generator expressions.
- Tuple/list/set/dict literals.
Using a return converter
By default the impl function Argument Clinic generates for you returns PyObject *. But your C function often computes some C type, then converts it into the PyObject * at the last moment. Argument Clinic handles converting your inputs from Python types into native C types; why not have it convert your return value from a native C type into a Python type too?
That’s what a “return converter” does. It changes your impl function to return some C type, then adds code to the generated (non-impl) function to handle converting that value into the appropriate PyObject *.
The syntax for return converters is similar to that of parameter converters. You specify the return converter like it was a return annotation on the function itself. Return converters behave much the same as parameter converters; they take arguments, the arguments are all keyword-only, and if you’re not changing any of the default arguments you can omit the parentheses.
(If you use both "as" and a return converter for your function, the "as" should come before the return converter.)
There’s one additional complication when using return converters: how do you indicate an error has occurred? Normally, a function returns a valid (non-NULL) pointer for success, and NULL for failure. But if you use an integer return converter, all integers are valid. How can Argument Clinic detect an error? Its solution: each return converter implicitly looks for a special value that indicates an error. If you return that value, and an error has been set (PyErr_Occurred() returns a true value), then the generated code will propagate the error. Otherwise it will encode the value you return like normal.
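The decision the generated code makes can be modelled in Python. The real check is emitted as C; the sketch below only mirrors the logic in the paragraph above for an integer return converter whose special value is -1, and the function name and the use of None to stand in for NULL are assumptions made for the illustration.

```python
ERROR_SENTINEL = -1  # the special value an integer return converter looks for


def finish_int_return(value, error_is_set):
    """Model of what happens after an int-returning impl function returns.

    value        -- the C integer the impl function returned
    error_is_set -- stands in for PyErr_Occurred() returning a true value
    """
    if value == ERROR_SENTINEL and error_is_set:
        return None   # stands in for returning NULL, i.e. propagating the error
    return value      # stands in for converting the C integer to a Python int


print(finish_int_return(7, False))    # ordinary result
print(finish_int_return(-1, False))   # -1 with no error set is a valid result
print(finish_int_return(-1, True))    # -1 with an error set propagates the error
```

The middle case is the important one: returning the special value is harmless unless an exception has also been set.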
Currently Argument Clinic supports only a few return converters:
- bool
- int
- unsigned int
- long
- unsigned long
- size_t
- Py_ssize_t
- float
- double
- DecodeFSDefault
None of these take parameters. For the first three, return -1 to indicate error. For DecodeFSDefault, the return type is const char *; return a NULL pointer to indicate an error.
(There’s also an experimental NoneType converter, which lets you return Py_None on success or NULL on failure, without having to increment the reference count on Py_None. I’m not sure it adds enough clarity to be worth using.)
To see all the return converters Argument Clinic supports, along with their parameters (if any), just run Tools/clinic/clinic.py --converters for the full list.
Cloning existing functions
If you have a number of functions that look similar, you may be able to use Clinic’s “clone” feature. When you clone an existing function, you reuse:
- its parameters, including
  - their names,
  - their converters, with all parameters,
  - their default values,
  - their per-parameter docstrings,
  - their kind (whether they’re positional only, positional or keyword, or keyword only), and
- its return converter.
The only thing not copied from the original function is its docstring; the syntax allows you to specify a new docstring.
Here’s the syntax for cloning a function:

    /*[clinic input]
    module.class.new_function [as c_basename] = module.class.existing_function

    Docstring for new_function goes here.
    [clinic start generated code]*/

(The functions can be in different modules or classes. I wrote module.class in the sample just to illustrate that you must use the full path to both functions.)
Sorry, there’s no syntax for partially cloning a function, or cloning a function then modifying it. Cloning is an all-or-nothing proposition.
Also, the function you are cloning from must have been previously defined in the current file.
Calling Python code
The rest of the advanced topics require you to write Python code which lives inside your C file and modifies Argument Clinic’s runtime state. This is simple: just define a Python block.
A Python block uses different delimiter lines than an Argument Clinic function block. It looks like this:

    /*[python input]
    # python code goes here
    [python start generated code]*/

All the code inside the Python block is executed at the time it’s parsed. All text written to stdout inside the block is redirected into the “output” after the block.
As an example, here’s a Python block that adds a static integer variable to the C code:

    /*[python input]
    print('static int __ignored_unused_variable__ = 0;')
    [python start generated code]*/
    static int __ignored_unused_variable__ = 0;
    /*[python checksum:...]*/

Using a “self converter”
Argument Clinic automatically adds a “self” parameter for you using a default converter. It automatically sets the type of this parameter to the “pointer to an instance” you specified when you declared the type. However, you can override Argument Clinic’s converter and specify one yourself. Just add your own self parameter as the first parameter in a block, and ensure that its converter is an instance of self_converter or a subclass thereof.
What’s the point? This lets you override the type of self, or give it a different default name.
How do you specify the custom type you want to cast self to? If you only have one or two functions with the same type for self, you can directly use Argument Clinic’s existing self converter, passing in the type you want to use as the type parameter:

    /*[clinic input]

    _pickle.Pickler.dump

      self: self(type="PicklerObject *")
      obj: object
      /

    Write a pickled representation of the given object to the open file.
    [clinic start generated code]*/

On the other hand, if you have a lot of functions that will use the same type for self, it’s best to create your own converter, subclassing self_converter but overriding the type member:

    /*[python input]
    class PicklerObject_converter(self_converter):
        type = "PicklerObject *"
    [python start generated code]*/

    /*[clinic input]

    _pickle.Pickler.dump

      self: PicklerObject
      obj: object
      /

    Write a pickled representation of the given object to the open file.
    [clinic start generated code]*/

Writing a custom converter
As we hinted at in the previous section… you can write your own converters! A converter is simply a Python class that inherits from CConverter. The main use for a custom converter is a parameter that uses the O& format unit: parsing such a parameter means calling a PyArg_ParseTuple() “converter function”.
Your converter class should be named *something*_converter. If the name follows this convention, then your converter class will be automatically registered with Argument Clinic; its name will be the name of your class with the _converter suffix stripped off. (This is accomplished with a metaclass.)
You shouldn’t override CConverter.__init__. Instead, you should write a converter_init() function. converter_init() always accepts a self parameter; after that, all additional parameters must be keyword-only. Any arguments passed in to the converter in Argument Clinic will be passed along to your converter_init().
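The naming convention and the converter_init() hook can be illustrated with a small self-contained model. The real CConverter is only available inside Argument Clinic, so the sketch below uses a stand-in class, CConverterStub, and a made-up registry and widget converter; it shows one way a metaclass could implement the registration described above, not how Argument Clinic actually does it.

```python
converters = {}  # hypothetical registry: converter name -> converter class


class _ConverterMeta(type):
    """Register every class whose name ends in '_converter'."""

    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        suffix = "_converter"
        if name.endswith(suffix):
            # The registered name is the class name with the suffix stripped.
            converters[name[:-len(suffix)]] = cls


class CConverterStub(metaclass=_ConverterMeta):
    """Stand-in for CConverter; NOT the real class."""

    type = None

    def __init__(self, **kwargs):
        # Subclasses leave __init__ alone and customise converter_init().
        self.converter_init(**kwargs)

    def converter_init(self):
        pass


class widget_converter(CConverterStub):
    type = "widget_t *"

    # After self, every additional parameter is keyword-only.
    def converter_init(self, *, nullable=False):
        self.nullable = nullable


print(sorted(converters))                         # only 'widget' is registered
print(widget_converter(nullable=True).nullable)   # argument reached converter_init
```

Because the stub’s own name does not end in _converter, it is not registered; only the subclass is.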
There are some additional members of CConverter you may wish to specify in your subclass. Here’s the current list:
- type
  The C type to use for this variable. type should be a Python string specifying the type, e.g. int. If this is a pointer type, the type string should end with ' *'.
- default
  The Python default value for this parameter, as a Python value. Or the magic value unspecified if there is no default.
- py_default
  default as it should appear in Python code, as a string. Or None if there is no default.
- c_default
  default as it should appear in C code, as a string. Or None if there is no default.
- c_ignored_default
  The default value used to initialize the C variable when there is no default, but not specifying a default may result in an “uninitialized variable” warning. This can easily happen when using option groups: although properly-written code will never actually use this value, the variable does get passed in to the impl, and the C compiler will complain about the “use” of the uninitialized value. This value should always be a non-empty string.
- converter
  The name of the C converter function, as a string.
- impl_by_reference
  A boolean value. If true, Argument Clinic will add a & in front of the name of the variable when passing it into the impl function.
- parse_by_reference
  A boolean value. If true, Argument Clinic will add a & in front of the name of the variable when passing it into PyArg_ParseTuple().
Here’s the simplest example of a custom converter, from Modules/zlibmodule.c:

    /*[python input]

    class ssize_t_converter(CConverter):
        type = 'Py_ssize_t'
        converter = 'ssize_t_converter'

    [python start generated code]*/
    /*[python end generated code: output=da39a3ee5e6b4b0d input=35521e4e733823c7]*/

This block adds a converter to Argument Clinic named ssize_t. Parameters declared as ssize_t will be declared as type Py_ssize_t, and will be parsed by the 'O&' format unit, which will call the ssize_t_converter converter function. ssize_t variables automatically support default values.
More sophisticated custom converters can insert custom C code to handle initialization and cleanup. You can see more examples of custom converters in the CPython source tree; grep the C files for the string CConverter.
Writing a custom return converter
Writing a custom return converter is much like writing a custom converter. It’s somewhat simpler, though, because return converters are themselves much simpler.
Return converters must subclass CReturnConverter. There are no examples yet of custom return converters, because they are not widely used yet. If you wish to write your own return converter, please read Tools/clinic/clinic.py, specifically the implementation of CReturnConverter and all its subclasses.
METH_O and METH_NOARGS
To convert a function using METH_O, make sure the function’s single argument is using the object converter, and mark the arguments as positional-only:

    /*[clinic input]
    meth_o_sample

         argument: object
         /
    [clinic start generated code]*/

To convert a function using METH_NOARGS, just don’t specify any arguments.
You can still use a self converter, a return converter, and specify a type argument to the object converter for METH_O.
tp_new and tp_init functions
You can convert tp_new and tp_init functions. Just name them __new__ or __init__ as appropriate. Notes:
- The function name generated for __new__ doesn’t end in __new__ like it would by default. It’s just the name of the class, converted into a valid C identifier.
- No PyMethodDef #define is generated for these functions.
- __init__ functions return int, not PyObject *.
- Use the docstring as the class docstring.
- Although __new__ and __init__ functions must always accept both the args and kwargs objects, when converting you may specify any signature for these functions that you like. (If your function doesn’t support keywords, the parsing function generated will throw an exception if it receives any.)
Changing and redirecting Clinic’s output
It can be inconvenient to have Clinic’s output interspersed with your conventional hand-edited C code. Luckily, Clinic is configurable: you can buffer up its output for printing later (or earlier!), or write its output to a separate file. You can also add a prefix or suffix to every line of Clinic’s generated output.
While changing Clinic’s output in this manner can be a boon to readability, it may result in Clinic code using types before they are defined, or your code attempting to use Clinic-generated code before it is defined. These problems can be easily solved by rearranging the declarations in your file, or moving where Clinic’s generated code goes. (This is why the default behavior of Clinic is to output everything into the current block; while many people consider this hampers readability, it will never require rearranging your code to fix definition-before-use problems.)
Let’s start with defining some terminology:
- field
  A field, in this context, is a subsection of Clinic’s output. For example, the #define for the PyMethodDef structure is a field, called methoddef_define. Clinic has seven different fields it can output per function definition:

      docstring_prototype
      docstring_definition
      methoddef_define
      impl_prototype
      parser_prototype
      parser_definition
      impl_definition

  All the names are of the form "<a>_<b>", where "<a>" is the semantic object represented (the parsing function, the impl function, the docstring, or the methoddef structure) and "<b>" represents what kind of statement the field is. Field names that end in "_prototype" represent forward declarations of that thing, without the actual body/data of the thing; field names that end in "_definition" represent the actual definition of the thing, with the body/data of the thing. ("methoddef" is special, it’s the only one that ends with "_define", representing that it’s a preprocessor #define.)
- destination
  A destination is a place Clinic can write output to. There are five built-in destinations:
  - block
    The default destination: printed in the output section of the current Clinic block.
  - buffer
    A text buffer where you can save text for later. Text sent here is appended to the end of any existing text. It’s an error to have any text left in the buffer when Clinic finishes processing a file.
  - file
    A separate “clinic file” that will be created automatically by Clinic. The filename chosen for the file is {basename}.clinic{extension}, where basename and extension were assigned the output from os.path.splitext() run on the current file. (Example: the file destination for _pickle.c would be written to _pickle.clinic.c.)
    Important: When using a file destination, you must check in the generated file!
  - two-pass
    A buffer like buffer. However, a two-pass buffer can only be dumped once, and it prints out all text sent to it during all processing, even from Clinic blocks after the dumping point.
  - suppress
    The text is suppressed (thrown away).
Clinic defines five new directives that let you reconfigure its output.
The first new directive is dump:

    dump <destination>

This dumps the current contents of the named destination into the output of the current block, and empties it. This only works with buffer and two-pass destinations.
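The difference between dumping a buffer and dumping a two-pass destination is easier to see in a toy model. The classes below are a sketch of the semantics described above, under the assumption that a two-pass dump can be modelled as a placeholder filled in at the end of processing; the class names and the placeholder mechanism are invented for the illustration and are not taken from Argument Clinic.

```python
class BufferSketch:
    """Toy 'buffer' destination: dump() returns the saved text and empties it."""

    def __init__(self):
        self._chunks = []

    def append(self, text):
        self._chunks.append(text)

    def dump(self):
        text = "".join(self._chunks)
        self._chunks.clear()
        return text


class TwoPassSketch(BufferSketch):
    """Toy 'two-pass' destination.

    dump() may be called once and returns a placeholder; resolve() later
    replaces the placeholder with everything ever appended, including text
    appended after the dump point.
    """

    PLACEHOLDER = "<<two-pass>>"

    def __init__(self):
        super().__init__()
        self._dumped = False

    def dump(self):
        if self._dumped:
            raise RuntimeError("a two-pass buffer can only be dumped once")
        self._dumped = True
        return self.PLACEHOLDER

    def resolve(self, document):
        return document.replace(self.PLACEHOLDER, "".join(self._chunks))


two_pass = TwoPassSketch()
two_pass.append("early;")
document = "top " + two_pass.dump() + " bottom"
two_pass.append("late;")            # sent after the dump point
print(two_pass.resolve(document))   # both chunks appear at the dump point
```

A plain buffer, by contrast, only ever yields what was sent to it before the dump, and is empty afterwards.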
The second new directive is output. The most basic form of output is like this:

    output <field> <destination>

This tells Clinic to output field to destination. output also supports a special meta-destination, called everything, which tells Clinic to output all fields to that destination.
output has a number of other functions:

    output push
    output pop
    output preset <preset>

output push and output pop allow you to push and pop configurations on an internal configuration stack, so that you can temporarily modify the output configuration, then easily restore the previous configuration. Simply push before your change to save the current configuration, then pop when you wish to restore the previous configuration.
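The push/pop behaviour amounts to saving and restoring a field-to-destination mapping. The following sketch models that idea; the OutputConfig class is hypothetical, and starting every field at block is a simplification made for the example rather than Clinic’s actual starting configuration.

```python
class OutputConfig:
    """Toy model of Clinic's field -> destination mapping with a save stack."""

    FIELDS = (
        "docstring_prototype", "docstring_definition", "methoddef_define",
        "impl_prototype", "parser_prototype", "parser_definition",
        "impl_definition",
    )

    def __init__(self):
        # Simplification: every field starts out going to 'block'.
        self.current = {field: "block" for field in self.FIELDS}
        self._stack = []

    def output(self, field, destination):
        """Model of 'output <field> <destination>', including 'everything'."""
        targets = self.FIELDS if field == "everything" else (field,)
        for name in targets:
            self.current[name] = destination

    def push(self):
        """Model of 'output push': save a copy of the current configuration."""
        self._stack.append(dict(self.current))

    def pop(self):
        """Model of 'output pop': restore the most recently saved configuration."""
        self.current = self._stack.pop()


config = OutputConfig()
config.push()                            # save the configuration
config.output("everything", "buffer")    # temporary change
print(config.current["impl_definition"])
config.pop()                             # back to what was saved
print(config.current["impl_definition"])
```

Copying the mapping on push is what makes the later pop immune to changes made in between.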
output preset sets Clinic’s output to one of several built-in preset configurations, as follows:
- block
  Clinic’s original starting configuration. Writes everything immediately after the input block.
  Suppresses the parser_prototype and docstring_prototype, writes everything else to block.
- file
  Designed to write everything to the “clinic file” that it can. You then #include this file near the top of your file. You may need to rearrange your file to make this work, though usually this just means creating forward declarations for various typedef and PyTypeObject definitions.
  Suppresses the parser_prototype and docstring_prototype, writes the impl_definition to block, and writes everything else to file.
  The default filename is "{dirname}/clinic/{basename}.h".
- buffer
  Save up most of the output from Clinic, to be written into your file near the end. For files implementing modules or builtin types, it’s recommended that you dump the buffer just above the static structures for your module or builtin type; these are normally very near the end. Using buffer may require even more editing than file, if your file has static PyMethodDef arrays defined in the middle of the file.
  Suppresses the parser_prototype, impl_prototype, and docstring_prototype, writes the impl_definition to block, and writes everything else to buffer.
- two-pass
  Similar to the buffer preset, but writes forward declarations to the two-pass buffer, and definitions to the buffer. This may require less editing than buffer. Dump the two-pass buffer near the top of your file, and dump the buffer near the end just like you would when using the buffer preset.
  Suppresses the impl_prototype, writes the impl_definition to block, writes docstring_prototype, methoddef_define, and parser_prototype to two-pass, and writes everything else to buffer.
- partial-buffer
  Similar to the buffer preset, but writes more things to block, only writing the really big chunks of generated code to buffer. This avoids the definition-before-use problem of buffer completely, at the small cost of having slightly more stuff in the block’s output. Dump the buffer near the end, just like you would when using the buffer preset.
  Suppresses the impl_prototype, writes the docstring_definition and parser_definition to buffer, and writes everything else to block.
The third new directive is destination:

    destination <name> <command> [...]

This performs an operation on the destination named name.
There are two defined subcommands: new and clear.
The new subcommand works like this:

    destination <name> new <type>

This creates a new destination with name <name> and type <type>.
There are five destination types:
- suppress
  Throws the text away.
- block
  Writes the text to the current block. This is what Clinic originally did.
- buffer
  A simple text buffer, like the “buffer” builtin destination above.
- file
  A text file. The file destination takes an extra argument, a template to use for building the filename, like so:

      destination <name> new <type> <file_template>

  The template can use five strings internally that will be replaced by bits of the filename:
  - {path}
    The full path to the file, including directory and full filename.
  - {dirname}
    The name of the directory the file is in.
  - {basename}
    Just the name of the file, not including the directory.
  - {basename_root}
    Basename with the extension clipped off (everything up to but not including the last ‘.’).
  - {basename_extension}
    The last ‘.’ and everything after it. If the basename does not contain a period, this will be the empty string.
  If there are no periods in the filename, {basename} and {basename_root} are the same, and {basename_extension} is empty. “{basename_root}{basename_extension}” is always exactly the same as “{basename}”.
- two-pass
  A two-pass buffer, like the “two-pass” builtin destination above.
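The filename template strings described for the file destination type can be computed with os.path. The sketch below follows the descriptions in the list above; template_fields and expand are hypothetical helper names, and the real expansion inside Argument Clinic may differ in detail.

```python
import os.path


def template_fields(path):
    """Compute the template strings described above for a given source path."""
    dirname, basename = os.path.split(path)
    basename_root, basename_extension = os.path.splitext(basename)
    return {
        "path": path,                               # directory and full filename
        "dirname": dirname,                         # directory only
        "basename": basename,                       # filename without directory
        "basename_root": basename_root,             # basename minus the extension
        "basename_extension": basename_extension,   # last '.' and what follows
    }


def expand(template, path):
    """Fill a filename template such as '{dirname}/clinic/{basename}.h'."""
    return template.format_map(template_fields(path))


fields = template_fields("Modules/_pickle.c")
print(fields["basename_root"], fields["basename_extension"])
print(expand("{dirname}/clinic/{basename}.h", "Modules/_pickle.c"))
```

For a name without a period, such as Makefile, basename_root equals basename and basename_extension is the empty string, so joining the two always reproduces basename.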
The clear subcommand works like this:

    destination <name> clear

It removes all the accumulated text up to this point in the destination. (I don’t know what you’d need this for, but I thought maybe it’d be useful while someone’s experimenting.)
The fourth new directive is set:

    set line_prefix "string"
    set line_suffix "string"

set lets you set two internal variables in Clinic. line_prefix is a string that will be prepended to every line of Clinic’s output; line_suffix is a string that will be appended to every line of Clinic’s output.
Both of these support two format strings:
- {block comment start}
  Turns into the string /*, the start-comment text sequence for C files.
- {block comment end}
  Turns into the string */, the end-comment text sequence for C files.
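The effect of line_prefix and line_suffix can be previewed with a short Python function. This is a sketch that assumes the two format strings are spelled {block comment start} and {block comment end} and are substituted before the prefix and suffix are applied to each line; decorate and expand_markers are hypothetical names, not part of Argument Clinic.

```python
def expand_markers(value):
    """Replace the two supported format strings with C comment delimiters."""
    return (value.replace("{block comment start}", "/*")
                 .replace("{block comment end}", "*/"))


def decorate(text, line_prefix="", line_suffix=""):
    """Prepend line_prefix and append line_suffix to every line of text."""
    prefix = expand_markers(line_prefix)
    suffix = expand_markers(line_suffix)
    return "\n".join(prefix + line + suffix for line in text.splitlines())


generated = "int x;\nint y;"
print(decorate(generated,
               line_prefix="{block comment start} ",
               line_suffix=" {block comment end}"))
```

With these settings every generated line ends up wrapped in its own C comment.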
The final new directive is one you shouldn’t need to use directly, called preserve:

    preserve

This tells Clinic that the current contents of the output should be kept, unmodified. This is used internally by Clinic when dumping output into the files written by file destinations; wrapping it in a Clinic block lets Clinic use its existing checksum functionality to ensure the file was not modified by hand before it gets overwritten.
The #ifdef trick
If you’re converting a function that isn’t available on all platforms, there’s a trick you can use to make life a little easier. The existing code probably looks like this:

    #ifdef HAVE_FUNCTIONNAME
    static module_functionname(...)
    {
    ...
    }
    #endif /* HAVE_FUNCTIONNAME */

And then in the PyMethodDef structure at the bottom the existing code will have:

    #ifdef HAVE_FUNCTIONNAME
    {"functionname", ... },
    #endif /* HAVE_FUNCTIONNAME */

In this scenario, you should enclose the body of your impl function inside the #ifdef, like so:

    #ifdef HAVE_FUNCTIONNAME
    /*[clinic input]
    module.functionname
    ...
    [clinic start generated code]*/
    static module_functionname(...)
    {
    ...
    }
    #endif /* HAVE_FUNCTIONNAME */

Then, remove those three lines from the PyMethodDef structure, replacing them with the macro Argument Clinic generated:

    MODULE_FUNCTIONNAME_METHODDEF

(You can find the real name for this macro inside the generated code. Or you can calculate it yourself: it’s the name of your function as defined on the first line of your block, but with periods changed to underscores, uppercased, and "_METHODDEF" added to the end.)
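The naming rule in the parenthetical above is mechanical enough to write down. The helper below, methoddef_macro, is a hypothetical name and implements only the rule as stated; when in doubt, the name in the generated code is authoritative.

```python
def methoddef_macro(full_name):
    """Apply the stated rule: periods to underscores, uppercase, add the suffix."""
    return full_name.replace(".", "_").upper() + "_METHODDEF"


print(methoddef_macro("module.functionname"))
print(methoddef_macro("_pickle.Pickler.dump"))
```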
Perhaps you’re wondering: what if HAVE_FUNCTIONNAME isn’t defined? The MODULE_FUNCTIONNAME_METHODDEF macro won’t be defined either!
Here’s where Argument Clinic gets very clever. It actually detects that the Argument Clinic block might be deactivated by the #ifdef. When that happens, it generates a little extra code that looks like this:

    #ifndef MODULE_FUNCTIONNAME_METHODDEF
        #define MODULE_FUNCTIONNAME_METHODDEF
    #endif /* !defined(MODULE_FUNCTIONNAME_METHODDEF) */

That means the macro always works. If the function is defined, this turns into the correct structure, including the trailing comma. If the function is undefined, this turns into nothing.
However, this causes one ticklish problem: where should Argument Clinic put this extra code when using the “block” output preset? It can’t go in the output block, because that could be deactivated by the #ifdef. (That’s the whole point!)
In this situation, Argument Clinic writes the extra code to the “buffer” destination. This may mean that you get a complaint from Argument Clinic:

    Warning in file "Modules/posixmodule.c" on line 12357:
    Destination buffer 'buffer' not empty at end of file, emptying.

When this happens, just open your file, find the dump buffer block that Argument Clinic added to your file (it’ll be at the very bottom), then move it above the PyMethodDef structure where that macro is used.
Using Argument Clinic in Python files
It’s actually possible to use Argument Clinic to preprocess Python files. There’s no point to using Argument Clinic blocks, of course, as the output wouldn’t make any sense to the Python interpreter. But using Argument Clinic to run Python blocks lets you use Python as a Python preprocessor!
Since Python comments are different from C comments, Argument Clinic blocks embedded in Python files look slightly different. They look like this:

    #/*[python input]
    #print("def foo(): pass")
    #[python start generated code]*/
    def foo(): pass
    #/*[python checksum:...]*/
