
Python Enhancement Proposals

PEP 436 – The Argument Clinic DSL

Author: Larry Hastings <larry at hastings.org>
Discussions-To: Python-Dev list
Status: Final
Type: Standards Track
Created: 22-Feb-2013
Python-Version: 3.4


Abstract

This document proposes “Argument Clinic”, a DSL to facilitate argument processing for built-in functions in the implementation of CPython.

Rationale and Goals

The primary implementation of Python, “CPython”, is written in a mixture of Python and C. One implementation detail of CPython is what are called “built-in” functions – functions available to Python programs but written in C. When a Python program calls a built-in function and passes in arguments, those arguments must be translated from Python values into C values. This process is called “parsing arguments”.

As of CPython 3.3, builtin functions nearly always parse their arguments with one of two functions: the original PyArg_ParseTuple(),[1] and the more modern PyArg_ParseTupleAndKeywords().[2] The former only handles positional parameters; the latter also accommodates keyword and keyword-only parameters, and is preferred for new code.

With either function, the caller specifies the translation for parsing arguments in a “format string”:[3] each parameter corresponds to a “format unit”, a short character sequence telling the parsing function what Python types to accept and how to translate them into the appropriate C value for that parameter.

PyArg_ParseTuple() was reasonable when it was first conceived. There were only a dozen or so of these “format units”; each one was distinct, and easy to understand and remember. But over the years the PyArg_Parse interface has been extended in numerous ways. The modern API is complex, to the point that it is somewhat painful to use. Consider:

  • There are now forty different “format units”; a few are even three characters long. This makes it difficult for the programmer to understand what the format string says – or even perhaps to parse it – without constantly cross-indexing it with the documentation.
  • There are also six meta-format units that may be buried in the format string. (They are: "()|$:;".)
  • The more format units are added, the less likely it is the implementer can pick an easy-to-use mnemonic for the format unit, because the character of choice is probably already in use. In other words, the more format units we have, the more obtuse the format units become.
  • Several format units are nearly identical to others, having only subtle differences. This makes understanding the exact semantics of the format string even harder, and can make it difficult to figure out exactly which format unit you want.
  • The docstring is specified as a static C string, making it mildly bothersome to read and edit since it must obey C string quoting rules.
  • When adding a new parameter to a function using PyArg_ParseTupleAndKeywords(), it’s necessary to touch six different places in the code:[4]
    • Declaring the variable to store the argument.
    • Passing in a pointer to that variable in the correct spot in PyArg_ParseTupleAndKeywords(), also passing in any “length” or “converter” arguments in the correct order.
    • Adding the name of the argument in the correct spot of the “keywords” array passed in to PyArg_ParseTupleAndKeywords().
    • Adding the format unit to the correct spot in the format string.
    • Adding the parameter to the prototype in the docstring.
    • Documenting the parameter in the docstring.
  • There is currently no mechanism for builtin functions to provide their “signature” information (see inspect.getfullargspec and inspect.Signature). Adding this information using a mechanism similar to the existing PyArg_Parse functions would require repeating ourselves yet again.

The goal of Argument Clinic is to replace this API with a mechanism inheriting none of these downsides:

  • You need specify each parameter only once.
  • All information about a parameter is kept together in one place.
  • For each parameter, you specify a conversion function; Argument Clinic handles the translation from Python value into C value for you.
  • Argument Clinic also allows for fine-tuning of argument processing behavior with parameterized conversion functions.
  • Docstrings are written in plain text. Function docstrings are required; per-parameter docstrings are encouraged.
  • From this, Argument Clinic generates for you all the mundane, repetitious code and data structures CPython needs internally. Once you’ve specified the interface, the next step is simply to write your implementation using native C types. Every detail of argument parsing is handled for you.

Argument Clinic is implemented as a preprocessor. It draws inspiration for its workflow directly from [Cog] by Ned Batchelder. To use Clinic, add a block comment to your C source code beginning and ending with special text strings, then run Clinic on the file. Clinic will find the block comment, process the contents, and write the output back into your C source file directly after the comment. The intent is that Clinic’s output becomes part of your source code; it’s checked in to revision control, and distributed with source packages. This means that Python will still ship ready-to-build. It does complicate development slightly; in order to add a new function, or modify the arguments or documentation of an existing function using Clinic, you’ll need a working Python 3 interpreter.
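The Cog-style workflow described above can be sketched in a few lines of Python. This is a minimal illustration, not the prototype's code: the regular expression, the function names, and the `render` callback are assumptions made for the example, and the sketch does not remove output left by a previous run.

```python
import re

# Markers as they appear in this document's examples; the pattern itself
# is an assumption made for this sketch.
BLOCK_RE = re.compile(r"/\*\[clinic\]\n(?P<body>.*?)\n\[clinic\]\*/", re.DOTALL)


def find_clinic_blocks(c_source):
    """Return the DSL text inside each /*[clinic] ... [clinic]*/ comment."""
    return [match.group("body") for match in BLOCK_RE.finditer(c_source)]


def insert_output(c_source, render):
    """Write render(body) directly after each Clinic comment.

    `render` stands in for the real code generator: any callable that
    maps DSL text to a string of C code.
    """
    def replace(match):
        return match.group(0) + "\n" + render(match.group("body"))

    return BLOCK_RE.sub(replace, c_source)


example = (
    "/*[clinic]\n"
    "os.access\n"
    "[clinic]*/\n"
    "static int unrelated;\n"
)
print(insert_output(example, lambda body: "/* generated for: " + body + " */"))
```

Because the generated text lands in the C file itself, the file can be checked in and compiled without running the preprocessor again.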

Future goals of Argument Clinic include:

  • providing signature information for builtins,
  • enabling alternative implementations of Python to create automated library compatibility tests, and
  • speeding up argument parsing with improvements to the generated code.

DSL Syntax Summary

The Argument Clinic DSL is specified as a comment embedded in a C file, as follows. The “Example” column on the right shows you sample input to the Argument Clinic DSL, and the “Section” column on the left specifies what each line represents in turn.

Argument Clinic’s DSL syntax mirrors the Python def statement, lending it some familiarity to Python core developers.

+-----------------------+-----------------------------------------------------------------+
| Section               | Example                                                         |
+-----------------------+-----------------------------------------------------------------+
| Clinic DSL start      | /*[clinic]                                                      |
| Module declaration    | module module_name                                              |
| Class declaration     | class module_name.class_name                                    |
| Function declaration  | module_name.function_name  -> return_annotation                 |
| Parameter declaration |       name : converter(param=value)                             |
| Parameter docstring   |           Lorem ipsum dolor sit amet, consectetur               |
|                       |           adipisicing elit, sed do eiusmod tempor               |
| Function docstring    | Lorem ipsum dolor sit amet, consectetur adipisicing             |
|                       | elit, sed do eiusmod tempor incididunt ut labore et             |
| Clinic DSL end        | [clinic]*/                                                      |
| Clinic output         | ...                                                             |
| Clinic output end     | /*[clinic end output:<checksum>]*/                              |
+-----------------------+-----------------------------------------------------------------+

To give some flavor of the proposed DSL syntax, here are some sample Clinic code blocks. This first block reflects the normally preferred style, including blank lines between parameters and per-argument docstrings. It also includes a user-defined converter (path_t) created locally:

    /*[clinic]
    os.stat as os_stat_fn -> stat result

       path: path_t(allow_fd=1)
           Path to be examined; can be string, bytes, or open-file-descriptor int.

       *

       dir_fd: OS_STAT_DIR_FD_CONVERTER = DEFAULT_DIR_FD
           If not None, it should be a file descriptor open to a directory,
           and path should be a relative string; path will then be relative to
           that directory.

       follow_symlinks: bool = True
           If False, and the last element of the path is a symbolic link,
           stat will examine the symbolic link itself instead of the file
           the link points to.

    Perform a stat system call on the given path.

    {parameters}

    dir_fd and follow_symlinks may not be implemented
      on your platform.  If they are unavailable, using them will raise a
      NotImplementedError.

    It's an error to use dir_fd or follow_symlinks when specifying path as
      an open file descriptor.

    [clinic]*/

This second example shows a minimal Clinic code block, omitting all parameter docstrings and non-significant blank lines:

    /*[clinic]
    os.access
       path: path
       mode: int
       *
       dir_fd: OS_ACCESS_DIR_FD_CONVERTER = 1
       effective_ids: bool = False
       follow_symlinks: bool = True
    Use the real uid/gid to test for access to a path.
    Returns True if granted, False otherwise.

    {parameters}

    dir_fd, effective_ids, and follow_symlinks may not be implemented
      on your platform.  If they are unavailable, using them will raise a
      NotImplementedError.

    Note that most operations will use the effective uid/gid, therefore this
      routine can be used in a suid/sgid environment to test if the invoking user
      has the specified access to the path.

    [clinic]*/

This final example shows a Clinic code block handling groups of optional parameters, including parameters on the left:

    /*[clinic]
    curses.window.addch

       [
       y: int
         Y-coordinate.

       x: int
         X-coordinate.
       ]

       ch: char
         Character to add.

       [
       attr: long
         Attributes for the character.
       ]

       /

    Paint character ch at (y, x) with attributes attr,
    overwriting any character previously painted at that location.
    By default, the character position and attributes are the
    current settings for the window object.
    [clinic]*/

General Behavior Of the Argument Clinic DSL

All lines support # as a line comment delimiter except docstrings. Blank lines are always ignored.

Like Python itself, leading whitespace is significant in the Argument Clinic DSL. The first line of the “function” section is the function declaration. Indented lines below the function declaration declare parameters, one per line; lines below those that are indented even further are per-parameter docstrings. Finally, the first line dedented back to column 0 ends the parameter declarations and starts the function docstring.

Parameter docstrings are optional; function docstrings are not. Functions that specify no arguments may simply specify the function declaration followed by the docstring.
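The indentation rules above can be illustrated with a small splitter. This is a simplified sketch, not the prototype's parser: the function name is invented here, and directives, # comments, and the special symbols (*, [, ], /) are deliberately not handled. The per-parameter docstring text in the sample is made up for the example.

```python
def split_clinic_block(text):
    """Split DSL text into (declaration, parameters, docstring lines).

    parameters is a list of (declaration line, [docstring lines]) pairs.
    """
    lines = text.splitlines()
    while lines and not lines[0].strip():
        lines.pop(0)                      # skip leading blank lines
    declaration = lines[0].strip()
    parameters = []
    param_indent = None
    for index in range(1, len(lines)):
        line = lines[index]
        if not line.strip():
            continue                      # blank lines are ignored here
        indent = len(line) - len(line.lstrip())
        if indent == 0:
            # First line back at column 0: the rest is the function docstring.
            return declaration, parameters, lines[index:]
        if param_indent is None:
            param_indent = indent         # first parameter sets the indent
        if indent == param_indent:
            parameters.append((line.strip(), []))
        elif indent > param_indent:
            parameters[-1][1].append(line.strip())
        else:
            raise ValueError("inconsistent indentation: %r" % line)
    return declaration, parameters, []


block = """\
os.access
   path: path
       Path to be tested.
   mode: int

Use the real uid/gid to test for access to a path.
"""
declaration, parameters, docstring = split_clinic_block(block)
print(declaration)
print(parameters)
print(docstring)
```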

Module and Class Declarations

When a C file implements a module or class, this should be declared to Clinic. The syntax is simple:

    module module_name

or

    class module_name.class_name

(Note that these are not actually special syntax; they are implemented as Directives.)

The module name or class name should always be the full dotted path from the top-level module. Nested modules and classes are supported.

Function Declaration

The full form of the function declaration is as follows:

    dotted.name [as legal_c_id] [-> return_annotation]

The dotted name should be the full name of the function, starting with the highest-level package (e.g. “os.stat” or “curses.window.addch”).

The “as legal_c_id” syntax is optional. Argument Clinic uses the name of the function to create the names of the generated C functions. In some circumstances, the generated name may collide with other global names in the C program’s namespace. The “as legal_c_id” syntax allows you to override the generated name with your own; substitute “legal_c_id” with any legal C identifier. If skipped, the “as” keyword must also be omitted.

The return annotation is also optional. If skipped, the arrow (“->”) must also be omitted. If specified, the value for the return annotation must be compatible with ast.literal_eval, and it is interpreted as a return converter.
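One way to read the function declaration line is with a regular expression, as in the sketch below. The pattern and the function name are assumptions for illustration only; the document does not say how the prototype parses this line. The return annotation `'int'` in the usage example is hypothetical.

```python
import ast
import re

DECLARATION_RE = re.compile(
    r"^(?P<name>[A-Za-z_]\w*(?:\.[A-Za-z_]\w*)*)"   # dotted.name
    r"(?:\s+as\s+(?P<c_id>[A-Za-z_]\w*))?"          # optional: as legal_c_id
    r"(?:\s*->\s*(?P<returns>.+))?$",               # optional: -> return_annotation
    re.ASCII,
)


def parse_function_declaration(line):
    """Return (dotted name, C identifier or None, return annotation or None)."""
    match = DECLARATION_RE.match(line.strip())
    if match is None:
        raise ValueError("not a function declaration: %r" % line)
    returns = match.group("returns")
    if returns is not None:
        # The annotation must be acceptable to ast.literal_eval.
        returns = ast.literal_eval(returns.strip())
    return match.group("name"), match.group("c_id"), returns


print(parse_function_declaration("curses.window.addch"))
print(parse_function_declaration("os.stat as os_stat_fn"))
print(parse_function_declaration("module_name.function_name -> 'int'"))
```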

Parameter Declaration

The full form of the parameter declaration line is as follows:

    name: converter [ (parameter=value [, parameter2=value2]) ] [ = default]

The “name” must be a legal C identifier. Whitespace is permitted between the name and the colon (though this is not the preferred style). Whitespace is permitted (and encouraged) between the colon and the converter.

The “converter” is the name of one of the “converter functions” registered with Argument Clinic. Clinic will ship with a number of built-in converters; new converters can also be added dynamically. In choosing a converter, you are automatically constraining what Python types are permitted on the input, and specifying what type the output variable (or variables) will be. Although many of the converters will resemble the names of C types or perhaps Python types, the name of a converter may be any legal Python identifier.

If the converter is followed by parentheses, these parentheses enclose parameters to the conversion function. The syntax mirrors providing arguments to a Python function call: the parameters must always be named, as if they were “keyword-only parameters”, and the values provided for the parameters will syntactically resemble Python literal values. These parameters are always optional, permitting all conversion functions to be called without any parameters. In this case, you may also omit the parentheses entirely; this is always equivalent to specifying empty parentheses. The values supplied for these parameters must be compatible with ast.literal_eval.

The “default” is a Python literal value. Default values are optional; if not specified you must omit the equals sign too. Parameters which don’t have a default are implicitly required. The default value is dynamically assigned, “live” in the generated C code, and although it’s specified as a Python value, it’s translated into a native C value in the generated C code. Few default values are permitted, owing to this manual translation step.

If this were a Python function declaration, a parameter declaration would be delimited by either a trailing comma or an ending parenthesis. However, Argument Clinic uses neither; parameter declarations are delimited by a newline. A trailing comma or right parenthesis is not permitted.

The first parameter declaration establishes the indent for all parameter declarations in a particular Clinic code block. All subsequent parameters must be indented to the same level.
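A parameter declaration line happens to have the same shape as a Python annotated assignment, so one way to sketch a parser is to lean on the ast module. This is an illustrative shortcut chosen for the example, not a description of how the prototype works; the function name and the returned tuple layout are assumptions. Defaults that are not literals (such as a C macro name) are not handled.

```python
import ast


def parse_parameter(line):
    """Parse 'name: converter(param=value) = default'.

    Returns (name, converter, converter parameters, has_default, default).
    """
    node = ast.parse(line.strip()).body[0]
    if not isinstance(node, ast.AnnAssign) or not isinstance(node.target, ast.Name):
        raise ValueError("not a parameter declaration: %r" % line)

    annotation = node.annotation
    if isinstance(annotation, ast.Call) and isinstance(annotation.func, ast.Name):
        if annotation.args:
            raise ValueError("converter parameters must be named: %r" % line)
        converter = annotation.func.id
        # Values must be acceptable to ast.literal_eval.
        options = {kw.arg: ast.literal_eval(kw.value) for kw in annotation.keywords}
    elif isinstance(annotation, ast.Name):
        converter, options = annotation.id, {}   # parentheses omitted
    else:
        raise ValueError("unsupported converter: %r" % line)

    has_default = node.value is not None
    default = ast.literal_eval(node.value) if has_default else None
    return node.target.id, converter, options, has_default, default


print(parse_parameter("path: path_t(allow_fd=1)"))
print(parse_parameter("follow_symlinks: bool = True"))
```

The separate has_default flag matters because None is itself a legal default value, so the default alone cannot signal "no default given".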

Legacy Converters

For convenience’s sake in converting existing code to Argument Clinic, Clinic provides a set of legacy converters that match PyArg_ParseTuple format units. They are specified as a C string containing the format unit. For example, to specify a parameter “foo” as taking a Python “int” and emitting a C int, you could specify:

    foo: "i"

(To more closely resemble a C string, these must always use double quotes.)

Although these resemble PyArg_ParseTuple format units, no guarantee is made that the implementation will call a PyArg_Parse function for parsing.

This syntax does not support parameters. Therefore, it doesn’t support any of the format units that require input parameters ("O!", "O&", "es", "es#", "et", "et#"). Parameters requiring one of these conversions cannot use the legacy syntax. (You may still, however, supply a default value.)

Parameter Docstrings

All lines that appear below and are indented further than a parameter declaration are the docstring for that parameter. All such lines are “dedented” until the first line is flush left.
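As a small illustration of this dedenting rule, the sketch below strips the first line's indentation from every line, so that deeper relative indentation survives. The function name is invented for the example, and the second sample line is a made-up continuation.

```python
def dedent_parameter_docstring(lines):
    """Remove the first non-blank line's indentation from every line."""
    first = next((line for line in lines if line.strip()), "")
    width = len(first) - len(first.lstrip())
    return [
        # Lines indented less than the first line are simply left-stripped.
        line[width:] if line[:width].strip() == "" else line.lstrip()
        for line in lines
    ]


raw = [
    "       If False, and the last element of the path is a symbolic link,",
    "         (a hypothetical continuation indented two spaces further)",
]
for line in dedent_parameter_docstring(raw):
    print(line)
```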

Special Syntax For Parameter Lines

There are four special symbols that may be used in the parameter section. Each of these must appear on a line by itself, indented to the same level as parameter declarations. The four symbols are:

*
    Establishes that all subsequent parameters are keyword-only.
[
    Establishes the start of an optional “group” of parameters. Note that “groups” may nest inside other “groups”. See Functions With Positional-Only Parameters below. Note that currently [ is only legal for use in functions where all parameters are marked positional-only, see / below.
]
    Ends an optional “group” of parameters.
/
    Establishes that all the preceding arguments are positional-only. For now, Argument Clinic does not support functions with both positional-only and non-positional-only arguments. Therefore: if / is specified for a function, it must currently always be after the last parameter. Also, Argument Clinic does not currently support default values for positional-only parameters.

(The semantics of / follow a syntax for positional-only parameters in Python once proposed by Guido.[5])

Function Docstring

The first line with no leading whitespace after the function declaration is the first line of the function docstring. All subsequent lines of the Clinic block are considered part of the docstring, and their leading whitespace is preserved.

If the string {parameters} appears on a line by itself inside the function docstring, Argument Clinic will insert a list of all parameters that have docstrings, each such parameter followed by its docstring. The name of the parameter is on a line by itself; the docstring starts on a subsequent line, and all lines of the docstring are indented by two spaces. (Parameters with no per-parameter docstring are suppressed.) The entire list is indented by the leading whitespace that appeared before the {parameters} token.

If the string {parameters} doesn’t appear in the docstring, Argument Clinic will append one to the end of the docstring, inserting a blank line above it if the docstring does not end with a blank line, and with the parameter list at column 0.
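The {parameters} expansion rules can be sketched as follows. This is an illustration of the behavior described above under simplifying assumptions: the function name and the data layout (a list of name and docstring-line pairs) are invented for the example, and the sample docstring text is abbreviated.

```python
def render_docstring(docstring_lines, parameters):
    """Expand the {parameters} token in a function docstring.

    parameters is a list of (name, [docstring lines]) pairs.
    """
    rendered = []
    for name, doc in parameters:
        if not doc:
            continue                     # no per-parameter docstring: suppressed
        rendered.append(name)            # the name on a line by itself
        rendered.extend("  " + line for line in doc)

    output = []
    found = False
    for line in docstring_lines:
        if line.strip() == "{parameters}":
            found = True
            indent = line[: len(line) - len(line.lstrip())]
            output.extend(indent + item for item in rendered)
        else:
            output.append(line)

    if not found:
        # Append the list at column 0, separated by a blank line.
        if output and output[-1].strip():
            output.append("")
        output.extend(rendered)
    return output


params = [("path", ["Path to be examined."]), ("dir_fd", [])]
doc = ["Perform a stat system call on the given path.", "", "  {parameters}"]
print("\n".join(render_docstring(doc, params)))
```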

Converters

Argument Clinic contains a pre-initialized registry of converter functions. Example converter functions:

int
    Accepts a Python object implementing __int__; emits a C int.
byte
    Accepts a Python int; emits an unsigned char. The integer must be in the range [0, 256).
str
    Accepts a Python str object; emits a C char *. Automatically encodes the string using the ascii codec.
PyObject
    Accepts any object; emits a C PyObject * without any conversion.

All converters accept the following parameters:

doc_default
The Python value to use in place of the parameter’s actual default in Python contexts. In other words: when specified, this value will be used for the parameter’s default in the docstring, and in the Signature. (TBD alternative semantics: If the string is a valid Python expression which can be rendered into a Python value using eval(), then the result of eval() on it will be used as the default in the Signature.) Ignored if there is no default.
required
Normally any parameter that has a default value is automaticallyoptional. A parameter that has “required” set will be consideredrequired (non-optional) even if it has a default value. Thegenerated documentation will also not show any default value.

Additionally, converters may accept one or more of these optional parameters, on an individual basis:

annotation
    Explicitly specifies the per-parameter annotation for this parameter. Normally it’s the responsibility of the conversion function to generate the annotation (if any).
bitwise
    For converters that accept unsigned integers. If the Python integer passed in is signed, copy the bits directly even if it is negative.
encoding
    For converters that accept str. Encoding to use when encoding a Unicode string to a char *.
immutable
    Only accept immutable values.
length
    For converters that accept iterable types. Requests that the converter also emit the length of the iterable, passed in to the _impl function in a Py_ssize_t variable; its name will be this parameter’s name appended with “_length”.
nullable
    This converter normally does not accept None, but in this case it should. If None is supplied on the Python side, the equivalent C argument will be NULL. (The _impl argument emitted by this converter will presumably be a pointer type.)
types
    A list of strings representing acceptable Python types for this object. There are also four strings which represent Python protocols:
      • “buffer”
      • “mapping”
      • “number”
      • “sequence”
zeroes
    For converters that accept string types. The converted value should be allowed to have embedded zeroes.

Return Converters

A return converter conceptually performs the inverse operation of a converter: it converts a native C value into its equivalent Python value.

Directives

Argument Clinic also permits “directives” in Clinic code blocks. Directives are similar to pragmas in C; they are statements that modify Argument Clinic’s behavior.

The format of a directive is as follows:

    directive_name [argument [second_argument [...]]]

Directives only take positional arguments.

A Clinic code block must contain either one or more directives, or a function declaration. It may contain both, in which case all directives must come before the function declaration.

Internally directives map directly to Python callables. The directive’s arguments are passed directly to the callable as positional arguments of type str().
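The mapping from directive lines to Python callables might look roughly like the sketch below. The class name, method names, and registry dictionary are assumptions made for illustration; only the idea of passing whitespace-separated arguments as str values to a callable comes from the text.

```python
class DirectiveRunner:
    """Dispatch directive lines to callables, passing str arguments."""

    def __init__(self):
        self.module = None
        self.classes = []
        # Hypothetical registry: directive name -> callable.
        self.directives = {
            "module": self.directive_module,
            "class": self.directive_class,
        }

    def directive_module(self, name):
        self.module = name

    def directive_class(self, name):
        self.classes.append(name)

    def run(self, line):
        name, *arguments = line.split()
        if name not in self.directives:
            raise ValueError("unknown directive: %r" % name)
        # Every argument is passed positionally, as a str.
        self.directives[name](*arguments)


runner = DirectiveRunner()
runner.run("module module_name")
runner.run("class module_name.class_name")
print(runner.module, runner.classes)
```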

Example possible directives include the production, suppression, or redirection of Clinic output. Also, the “module” and “class” keywords are implemented as directives in the prototype.

Python Code

Argument Clinic also permits embedding Python code inside C files, which is executed in-place when Argument Clinic processes the file. Embedded code looks like this:

    /*[python]

    # this is python code!
    print("/" + "* Hello world! *" + "/")

    [python]*/
    /* Hello world! */
    /*[python end:da39a3ee5e6b4b0d3255bfef95601890afd80709]*/

The "/* Hello world! */" line above was generated by running the Python code in the preceding comment.

Any Python code is valid. Python code sections in Argument Clinic can also be used to directly interact with Clinic; see Argument Clinic Programmatic Interfaces.

Output

Argument Clinic writes its output inline in the C file, immediately after the section of Clinic code. For “python” sections, the output is everything printed using builtins.print. For “clinic” sections, the output is valid C code, including:

  • a #define providing the correct methoddef structure for the function
  • a prototype for the “impl” function – this is what you’ll write to implement this function
  • a function that handles all argument processing, which calls your “impl” function
  • the definition line of the “impl” function
  • and a comment indicating the end of output.

The intention is that you write the body of your impl function immediately after the output – as in, you write a left-curly-brace immediately after the end-of-output comment and implement the builtin in the body there. (It’s a bit strange at first, but oddly convenient.)

Argument Clinic will define the parameters of the impl function for you. The function will take the “self” parameter passed in originally, all the parameters you define, and possibly some extra generated parameters (“length” parameters; also “group” parameters, see next section).

Argument Clinic also writes a checksum for the output section. This is a valuable safety feature: if you modify the output by hand, Clinic will notice that the checksum doesn’t match, and will refuse to overwrite the file. (You can force Clinic to overwrite with the “-f” command-line argument; Clinic will also ignore the checksums when using the “-o” command-line argument.)
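The checksum guard can be sketched as below. The document does not name the checksum algorithm; SHA-1 is an assumption here, suggested only by the 40-hex-digit value in the Python Code example, which equals the SHA-1 digest of empty input. The function names and the `force` parameter (standing in for the “-f” flag) are likewise hypothetical.

```python
import hashlib


def checksum(output_text):
    """Hex digest of the generated output (SHA-1 is an assumption)."""
    return hashlib.sha1(output_text.encode("utf-8")).hexdigest()


def may_overwrite(current_output, recorded_checksum, force=False):
    """Allow regeneration only if the output was not edited by hand."""
    return force or checksum(current_output) == recorded_checksum


generated = "/* Hello world! */\n"
recorded = checksum(generated)

print(may_overwrite(generated, recorded))                    # untouched output
print(may_overwrite(generated + "int hack;\n", recorded))    # hand-edited output
print(may_overwrite(generated + "int hack;\n", recorded, force=True))
```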

Finally, Argument Clinic can also emit the boilerplate definitionof the PyMethodDef array for the defined classes and modules.

Functions With Positional-Only Parameters

A significant fraction of Python builtins implemented in C use the older positional-only API for processing arguments (PyArg_ParseTuple()). In some instances, these builtins parse their arguments differently based on how many arguments were passed in. This can provide some bewildering flexibility: there may be groups of optional parameters, which must either all be specified or none specified. And occasionally these groups are on the left! (A representative example: curses.window.addch().)

Argument Clinic supports these legacy use-cases by allowing you to specify parameters in groups. Each optional group of parameters is marked with square brackets. Note that these groups are permitted on the right or left of any required parameters!

The impl function generated by Clinic will add an extra parameter for every group, “int group_{left|right}_<x>”, where x is a monotonically increasing number assigned to each group as it builds away from the required arguments. This argument will be nonzero if the group was specified on this call, and zero if it was not.

Note that when operating in this mode, you cannot specify default arguments.

Also, note that it’s possible to specify a set of groups to a function such that there are several valid mappings from the number of arguments to a valid set of groups. If this happens, Clinic will abort with an error message. This should not be a problem, as positional-only operation is only intended for legacy use cases, and all the legacy functions using this quirky behavior have unambiguous mappings.
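To make the ambiguity check concrete, the sketch below maps each possible argument count to the group selections that would produce it. It assumes that groups on each side are filled from the one nearest the required parameters outward; the function names and the list-of-sizes representation are inventions for this example, not the prototype's data model.

```python
from collections import defaultdict


def group_mappings(left, required, right):
    """Map each argument count to the group selections producing it.

    left and right list the sizes of the optional groups on each side,
    starting with the group nearest the required parameters. Each
    selection is (number of left groups used, number of right groups used).
    """
    mappings = defaultdict(list)
    for used_left in range(len(left) + 1):
        for used_right in range(len(right) + 1):
            count = sum(left[:used_left]) + required + sum(right[:used_right])
            mappings[count].append((used_left, used_right))
    return dict(mappings)


def is_ambiguous(left, required, right):
    """True if some argument count matches more than one selection."""
    return any(len(choices) > 1
               for choices in group_mappings(left, required, right).values())


# curses.window.addch: [y, x] on the left, ch required, [attr] on the right.
print(group_mappings(left=[2], required=1, right=[1]))
# → {1: [(0, 0)], 2: [(0, 1)], 3: [(1, 0)], 4: [(1, 1)]}
print(is_ambiguous(left=[2], required=1, right=[1]))   # → False
print(is_ambiguous(left=[1], required=1, right=[1]))   # → True
```

In the last call, two arguments could fill either the left group or the right group, which is the kind of declaration the text says Clinic rejects.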

Current Status

As of this writing, there is a working prototype implementation of Argument Clinic available online (though the syntax may be out of date as you read this).[6] The prototype generates code using the existing PyArg_Parse APIs. It supports translating to all current format units except the mysterious "w*". Sample functions using Argument Clinic exercise all major features, including positional-only argument parsing.

Argument Clinic Programmatic Interfaces

The prototype also currently provides an experimental extension mechanism, allowing adding support for new types on-the-fly. See Modules/posixmodule.c in the prototype for an example of its use.

In the future, Argument Clinic is expected to be automatable enough to allow querying, modification, or outright new construction of function declarations through Python code. It may even permit dynamically adding your own custom DSL!

Notes / TBD

  • The API for supplying inspect.Signature metadata for builtins is currently under discussion. Argument Clinic will add support for the prototype when it becomes viable.
  • Alyssa Coghlan suggests that we a) only support at most one left-optional group per function, and b) in the face of ambiguity, prefer the left group over the right group. This would solve all our existing use cases including range().
  • Optimally we’d want Argument Clinic run automatically as part of the normal Python build process. But this presents a bootstrapping problem; if you don’t have a system Python 3, you need a Python 3 executable to build Python 3. I’m sure this is a solvable problem, but I don’t know what the best solution might be. (Supporting this will also require a parallel solution for Windows.)
  • On a related note: inspect.Signature has no way of representing blocks of arguments, like the left-optional block of y and x for curses.window.addch. How far are we going to go in supporting this admittedly aberrant parameter paradigm?
  • During the PyCon US 2013 Language Summit, there was discussion of having Argument Clinic also generate the actual documentation (in ReST, processed by Sphinx) for the function. The logistics of this are TBD, but it would require that the docstrings be written in ReST, and require that Python ship a ReST -> ascii converter. It would be best to come to a decision about this before we begin any large-scale conversion of the CPython source tree to using Clinic.
  • Guido proposed having the “function docstring” be hand-written inline, in the middle of the output, something like this:

        /*[clinic]
          ... prototype and parameters (including parameter docstrings) go here
        [clinic]*/
        ... some output ...
        /*[clinic docstring start]*/
        ... hand-edited function docstring goes here   <-- you edit this by hand!
        /*[clinic docstring end]*/
        ... more output
        /*[clinic output end]*/

    I tried it this way and don’t like it – I think it’s clumsy. I prefer that everything you write goes in one place, rather than having an island of hand-edited stuff in the middle of the DSL output.

  • Argument Clinic does not support automatic tuple unpacking (the “(OOO)” style format string for PyArg_ParseTuple().)
  • Argument Clinic removes some dynamism / flexibility. With PyArg_ParseTuple() one could theoretically pass in different encodings at runtime for the “es”/“et” format units. AFAICT CPython doesn’t do this itself, however it’s possible external users might do this. (Trivia: there are no uses of “es” exercised by regrtest, and all the uses of “et” exercised are in socketmodule.c, except for one in _ssl.c. They’re all static, specifying the encoding "idna".)

Acknowledgements

The PEP author wishes to thank Ned Batchelder for permission to shamelessly rip off his clever design for Cog – “my favorite tool that I’ve never gotten to use”. Thanks also to everyone who provided feedback on the [bugtracker issue] and on python-dev. Special thanks to Alyssa (Nick) Coghlan and Guido van Rossum for a rousing two-hour in-person deep dive on the topic at PyCon US 2013.

References

[Cog] Cog: http://nedbatchelder.com/code/cog/
[1] PyArg_ParseTuple(): http://docs.python.org/3/c-api/arg.html#PyArg_ParseTuple
[2] PyArg_ParseTupleAndKeywords(): http://docs.python.org/3/c-api/arg.html#PyArg_ParseTupleAndKeywords
[3] PyArg_ format units: http://docs.python.org/3/c-api/arg.html#strings-and-buffers
[4] Keyword parameters for extension functions: http://docs.python.org/3/extending/extending.html#keyword-parameters-for-extension-functions
[5] Guido van Rossum, posting to python-ideas, March 2012: https://mail.python.org/pipermail/python-ideas/2012-March/014364.html and https://mail.python.org/pipermail/python-ideas/2012-March/014378.html and https://mail.python.org/pipermail/python-ideas/2012-March/014417.html
[6] Argument Clinic prototype: https://bitbucket.org/larry/python-clinic/

Copyright

This document has been placed in the public domain.


Source: https://github.com/python/peps/blob/main/peps/pep-0436.rst

Last modified: 2025-02-01 08:59:27 GMT

