cgi — Common Gateway Interface support

Source code: Lib/cgi.py

Deprecated since version 3.11, will be removed in version 3.13: The cgi module is deprecated (see PEP 594 for details and alternatives).

The FieldStorage class can typically be replaced with urllib.parse.parse_qsl() for GET and HEAD requests, and the email.message module or multipart for POST and PUT. Most utility functions have replacements.


Support module for Common Gateway Interface (CGI) scripts.

This module defines a number of utilities for use by CGI scripts written in Python.

The global variable maxlen can be set to an integer indicating the maximum size of a POST request. POST requests larger than this size will result in a ValueError being raised during parsing. The default value of this variable is 0, meaning the request size is unlimited.

Availability: not Emscripten, not WASI.

This module does not work or is not available on WebAssembly platforms wasm32-emscripten and wasm32-wasi. See WebAssembly platforms for more information.

Introduction

A CGI script is invoked by an HTTP server, usually to process user input submitted through an HTML <FORM> or <ISINDEX> element.

Most often, CGI scripts live in the server’s special cgi-bin directory. The HTTP server places all sorts of information about the request (such as the client’s hostname, the requested URL, the query string, and lots of other goodies) in the script’s shell environment, executes the script, and sends the script’s output back to the client.

The script’s input is connected to the client too, and sometimes the form data is read this way; at other times the form data is passed via the “query string” part of the URL. This module is intended to take care of the different cases and provide a simpler interface to the Python script. It also provides a number of utilities that help in debugging scripts, and the latest addition is support for file uploads from a form (if your browser supports it).
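The two input paths described above can be sketched without the cgi module. This is only an illustration, not how cgi itself is implemented: the helper name read_form_pairs is hypothetical, the body is assumed to be application/x-www-form-urlencoded text in UTF-8, and the environment is passed in as a plain dictionary standing in for os.environ. REQUEST_METHOD, QUERY_STRING and CONTENT_LENGTH are the standard CGI variable names.

```python
import io
from urllib.parse import parse_qsl


def read_form_pairs(environ, stdin):
    """Return (name, value) pairs from the query string and, for POST, the body."""
    # Data passed via the "query string" part of the URL.
    pairs = parse_qsl(environ.get("QUERY_STRING", ""))
    if environ.get("REQUEST_METHOD", "GET").upper() == "POST":
        # Data sent on the script's standard input; read exactly CONTENT_LENGTH bytes.
        length = int(environ.get("CONTENT_LENGTH") or 0)
        body = stdin.read(length).decode("utf-8")
        pairs += parse_qsl(body)
    return pairs


# Simulated GET request: the form data travels in the URL.
get_pairs = read_form_pairs(
    {"REQUEST_METHOD": "GET", "QUERY_STRING": "name=Joe+Blow"},
    io.BytesIO(b""),
)

# Simulated POST request: the form data travels on standard input.
post_body = b"addr=At+Home"
post_pairs = read_form_pairs(
    {"REQUEST_METHOD": "POST", "QUERY_STRING": "", "CONTENT_LENGTH": str(len(post_body))},
    io.BytesIO(post_body),
)
```

In a real script the second argument would typically be sys.stdin.buffer and the first os.environ.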

The output of a CGI script should consist of two sections, separated by a blank line. The first section contains a number of headers, telling the client what kind of data is following. Python code to generate a minimal header section looks like this:

print("Content-Type: text/html")    # HTML is following
print()                             # blank line, end of headers

The second section is usually HTML, which allows the client software to display nicely formatted text with headers, in-line images, etc. Here’s Python code that prints a simple piece of HTML:

print("<TITLE>CGI script output</TITLE>")
print("<H1>This is my first CGI script</H1>")
print("Hello, world!")
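Putting the two sections together, a complete response could be produced as in the following sketch. The function name emit_page is hypothetical, and the output is captured in a string here only so the header/body split can be inspected; a real CGI script would simply print to standard output.

```python
import html
import io
from contextlib import redirect_stdout


def emit_page(title, body):
    """Print a complete CGI response: header section, blank line, HTML body."""
    print("Content-Type: text/html")    # header section
    print()                             # blank line, end of headers
    # Escape the text so it cannot be interpreted as markup.
    print(f"<TITLE>{html.escape(title)}</TITLE>")
    print(f"<H1>{html.escape(title)}</H1>")
    print(html.escape(body))


buffer = io.StringIO()
with redirect_stdout(buffer):
    emit_page("CGI script output", "Hello, world!")

# Everything before the first blank line is the header section.
headers, _, html_body = buffer.getvalue().partition("\n\n")
```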

Using the cgi module

Begin by writing import cgi.

When you write a new script, consider adding these lines:

import cgitb
cgitb.enable()

This activates a special exception handler that will display detailed reports in the web browser if any errors occur. If you’d rather not show the guts of your program to users of your script, you can have the reports saved to files instead, with code like this:

import cgitb
cgitb.enable(display=0, logdir="/path/to/logdir")

It’s very helpful to use this feature during script development. The reports produced by cgitb provide information that can save you a lot of time in tracking down bugs. You can always remove the cgitb line later when you have tested your script and are confident that it works correctly.

To get at submitted form data, use the FieldStorage class. If the form contains non-ASCII characters, use the encoding keyword parameter set to the value of the encoding defined for the document. The encoding is usually given in the META tag in the HEAD section of the HTML document or by the Content-Type header. Instantiating the class reads the form contents from the standard input or the environment (depending on the value of various environment variables set according to the CGI standard). Since it may consume standard input, it should be instantiated only once.

The FieldStorage instance can be indexed like a Python dictionary. It allows membership testing with the in operator, and also supports the standard dictionary method keys() and the built-in function len(). Form fields containing empty strings are ignored and do not appear in the dictionary; to keep such values, provide a true value for the optional keep_blank_values keyword parameter when creating the FieldStorage instance.

For instance, the following code (which assumes that the Content-Type header and blank line have already been printed) checks that the fields name and addr are both set to a non-empty string:

form = cgi.FieldStorage()
if "name" not in form or "addr" not in form:
    print("<H1>Error</H1>")
    print("Please fill in the name and addr fields.")
    return
print("<p>name:", form["name"].value)
print("<p>addr:", form["addr"].value)
...further form processing here...

Here the fields, accessed through form[key], are themselves instances of FieldStorage (or MiniFieldStorage, depending on the form encoding). The value attribute of the instance yields the string value of the field. The getvalue() method returns this string value directly; it also accepts an optional second argument as a default to return if the requested key is not present.

If the submitted form data contains more than one field with the same name, the object retrieved by form[key] is not a FieldStorage or MiniFieldStorage instance but a list of such instances. Similarly, in this situation, form.getvalue(key) would return a list of strings. If you expect this possibility (when your HTML form contains multiple fields with the same name), use the getlist() method, which always returns a list of values (so that you do not need to special-case the single item case). For example, this code concatenates any number of username fields, separated by commas:

value = form.getlist("username")
usernames = ",".join(value)

If a field represents an uploaded file, accessing the value via the value attribute or the getvalue() method reads the entire file in memory as bytes. This may not be what you want. You can test for an uploaded file by testing either the filename attribute or the file attribute. You can then read the data from the file attribute before it is automatically closed as part of the garbage collection of the FieldStorage instance (the read() and readline() methods will return bytes):

fileitem = form["userfile"]
if fileitem.file:
    # It's an uploaded file; count lines
    linecount = 0
    while True:
        line = fileitem.file.readline()
        if not line: break
        linecount = linecount + 1

FieldStorage objects also support being used in a with statement, which will automatically close them when done.

If an error is encountered when obtaining the contents of an uploaded file (for example, when the user interrupts the form submission by clicking on a Back or Cancel button), the done attribute of the object for the field will be set to the value -1.

The file upload draft standard entertains the possibility of uploading multiple files from one field (using a recursive multipart/* encoding). When this occurs, the item will be a dictionary-like FieldStorage item. This can be determined by testing its type attribute, which should be multipart/form-data (or perhaps another MIME type matching multipart/*). In this case, it can be iterated over recursively just like the top-level form object.

When a form is submitted in the “old” format (as the query string or as a single data part of type application/x-www-form-urlencoded), the items will actually be instances of the class MiniFieldStorage. In this case, the list, file, and filename attributes are always None.

A form submitted via POST that also has a query string will contain both FieldStorage and MiniFieldStorage items.

Changed in version 3.4: The file attribute is automatically closed upon the garbage collection of the creating FieldStorage instance.

Changed in version 3.5: Added support for the context management protocol to the FieldStorage class.

Higher Level Interface

The previous section explains how to read CGI form data using the FieldStorage class. This section describes a higher level interface which was added to this class to allow one to do it in a more readable and intuitive way. The interface doesn’t make the techniques described in previous sections obsolete; they are still useful to process file uploads efficiently, for example.

The interface consists of two simple methods. Using the methods you can process form data in a generic way, without the need to worry whether only one or more values were posted under one name.

In the previous section, you learned to write the following code anytime you expected a user to post more than one value under one name:

item = form.getvalue("item")
if isinstance(item, list):
    # The user is requesting more than one item.
    ...
else:
    # The user is requesting only one item.
    ...

This situation is common, for example, when a form contains a group of multiple checkboxes with the same name:

<input type="checkbox" name="item" value="1" />
<input type="checkbox" name="item" value="2" />

In most situations, however, there’s only one form control with a particular name in a form and then you expect and need only one value associated with this name. So you write a script containing for example this code:

user = form.getvalue("user").upper()

The problem with the code is that you should never expect that a client will provide valid input to your scripts. For example, if a curious user appends another user=foo pair to the query string, then the script would crash, because in this situation the getvalue("user") method call returns a list instead of a string. Calling the upper() method on a list is not valid (since lists do not have a method of this name) and results in an AttributeError exception.

Therefore, the appropriate way to read form data values was to always use the code which checks whether the obtained value is a single value or a list of values. That’s annoying and leads to less readable scripts.

A more convenient approach is to use the methods getfirst() and getlist() provided by this higher level interface.

FieldStorage.getfirst(name, default=None)

This method always returns only one value associated with form field name. The method returns only the first value if more than one value was posted under that name. Please note that the order in which the values are received may vary from browser to browser and should not be counted on. [1] If no such form field or value exists then the method returns the value specified by the optional parameter default. This parameter defaults to None if not specified.

FieldStorage.getlist(name)

This method always returns a list of values associated with form field name. The method returns an empty list if no such form field or value exists for name. It returns a list consisting of one item if only one such value exists.

Using these methods you can write nice compact code:

import cgi
form = cgi.FieldStorage()
user = form.getfirst("user", "").upper()    # This way it's safe.
for item in form.getlist("item"):
    do_something(item)
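On Python versions where cgi is no longer available, the same “first value” and “always a list” behaviour can be approximated on top of urllib.parse.parse_qs(). The helpers get_first and get_list below are hypothetical names for this sketch and are not part of any library; they assume the form data is already available as a URL-encoded string.

```python
from urllib.parse import parse_qs


def get_list(fields, name):
    """Always return a list of values; empty if the field is absent."""
    return list(fields.get(name, []))


def get_first(fields, name, default=None):
    """Return only the first value, or default if the field is absent."""
    values = fields.get(name)
    return values[0] if values else default


# A curious user appended a second user=... pair to the query string.
fields = parse_qs("user=joe&user=foo&item=1&item=2")

user = get_first(fields, "user", "").upper()   # safe even with duplicate fields
items = get_list(fields, "item")
missing = get_list(fields, "nothing")
```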

Functions

These are useful if you want more control, or if you want to employ some of the algorithms implemented in this module in other circumstances.

cgi.parse(fp=None, environ=os.environ, keep_blank_values=False, strict_parsing=False, separator='&')

Parse a query in the environment or from a file (the file defaults to sys.stdin). The keep_blank_values, strict_parsing and separator parameters are passed to urllib.parse.parse_qs() unchanged.

Deprecated since version 3.11, will be removed in version 3.13: This function, like the rest of the cgi module, is deprecated. It can be replaced by calling urllib.parse.parse_qs() directly on the desired query string (except for multipart/form-data input, which can be handled as described for parse_multipart()).
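As a rough illustration of that replacement, the sketch below calls urllib.parse.parse_qs() directly on a query string. The environ dictionary is a stand-in for os.environ, and the sample query strings are made up for the example.

```python
from urllib.parse import parse_qs

# Stand-in for os.environ in a CGI process (assumed sample data).
environ = {"QUERY_STRING": "name=Joe+Blow&addr=&item=1&item=2"}
query = environ.get("QUERY_STRING", "")

# Default behaviour: fields with empty values are dropped.
default = parse_qs(query)

# keep_blank_values=True keeps the empty addr field.
with_blanks = parse_qs(query, keep_blank_values=True)

# separator selects the character that splits the pairs.
semicolons = parse_qs("a=1;b=2", separator=";")
```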

cgi.parse_multipart(fp, pdict, encoding='utf-8', errors='replace', separator='&')

Parse input of type multipart/form-data (for file uploads). Arguments are fp for the input file, pdict for a dictionary containing other parameters in the Content-Type header, and encoding, the request encoding.

Returns a dictionary just like urllib.parse.parse_qs(): keys are the field names, each value is a list of values for that field. For non-file fields, the value is a list of strings.

This is easy to use but not much good if you are expecting megabytes to be uploaded; in that case, use the FieldStorage class instead, which is much more flexible.

Changed in version 3.7: Added the encoding and errors parameters. For non-file fields, the value is now a list of strings, not bytes.

Changed in version 3.10: Added the separator parameter.

Deprecated since version 3.11, will be removed in version 3.13: This function, like the rest of the cgi module, is deprecated. It can be replaced with the functionality in the email package (e.g. email.message.EmailMessage/email.message.Message) which implements the same MIME RFCs, or with the multipart PyPI project.
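One possible way to use the email package for this is sketched below. It is not a drop-in replacement for parse_multipart(): the helper name parse_multipart_form is hypothetical, the whole request body is assumed to fit in memory, and non-file fields are assumed to be UTF-8 unless the part declares a charset. The sample body and boundary are made up for the example.

```python
from email import policy
from email.parser import BytesParser


def parse_multipart_form(body, content_type):
    """Return {field name: [values]}; str for plain fields, bytes for files."""
    # The email parser expects a full MIME message, so prepend the
    # Content-Type header (which carries the boundary) to the raw body.
    raw = b"Content-Type: " + content_type.encode("ascii") + b"\r\n\r\n" + body
    message = BytesParser(policy=policy.HTTP).parsebytes(raw)

    fields = {}
    for part in message.iter_parts():
        name = part.get_param("name", header="content-disposition")
        payload = part.get_payload(decode=True)
        if part.get_filename() is None:
            # Non-file field: decode to text.
            charset = part.get_content_charset() or "utf-8"
            value = payload.decode(charset, errors="replace")
        else:
            # Uploaded file: keep the raw bytes.
            value = payload
        fields.setdefault(name, []).append(value)
    return fields


body = (
    b"--XyZ\r\n"
    b'Content-Disposition: form-data; name="name"\r\n'
    b"\r\n"
    b"Joe Blow\r\n"
    b"--XyZ\r\n"
    b'Content-Disposition: form-data; name="userfile"; filename="notes.txt"\r\n'
    b"Content-Type: text/plain\r\n"
    b"\r\n"
    b"hello\r\n"
    b"--XyZ--\r\n"
)
fields = parse_multipart_form(body, "multipart/form-data; boundary=XyZ")
```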

cgi.parse_header(string)

Parse a MIME header (such as Content-Type) into a main value and a dictionary of parameters.

Deprecated since version 3.11, will be removed in version 3.13: This function, like the rest of the cgi module, is deprecated. It can be replaced with the functionality in the email package, which implements the same MIME RFCs.

For example, with email.message.EmailMessage:

from email.message import EmailMessage
msg = EmailMessage()
msg['content-type'] = 'application/json; charset="utf8"'
main, params = msg.get_content_type(), msg['content-type'].params

cgi.test()

Robust test CGI script, usable as main program. Writes minimal HTTP headers and formats all information provided to the script in HTML format.

cgi.print_environ()

Format the shell environment in HTML.

cgi.print_form(form)

Format a form in HTML.

cgi.print_directory()

Format the current directory in HTML.

cgi.print_environ_usage()

Print a list of useful (used by CGI) environment variables in HTML.

Caring about security

There’s one important rule: if you invoke an external program (via os.system(), os.popen() or other functions with similar functionality), make very sure you don’t pass arbitrary strings received from the client to the shell. This is a well-known security hole whereby clever hackers anywhere on the web can exploit a gullible CGI script to invoke arbitrary shell commands. Even parts of the URL or field names cannot be trusted, since the request doesn’t have to come from your form!

To be on the safe side, if you must pass a string gotten from a form to a shell command, you should make sure the string contains only alphanumeric characters, dashes, underscores, and periods.
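The character rule above could be checked as in the following sketch. The names SAFE_TOKEN, is_safe_token and echo_via_child are hypothetical, the pattern is restricted to ASCII letters and digits as an assumption, and the child process is just the Python interpreter echoing its argument. Using subprocess.run() with an argument list rather than a shell command string is a substitution made for this example, not something the text above prescribes.

```python
import re
import subprocess
import sys

# Only ASCII letters, digits, dashes, underscores and periods are accepted.
SAFE_TOKEN = re.compile(r"[A-Za-z0-9_.-]+")


def is_safe_token(value):
    """Return True if value consists solely of the allowed characters."""
    return SAFE_TOKEN.fullmatch(value) is not None


def echo_via_child(value):
    """Pass a validated form value to an external program and return its output."""
    if not is_safe_token(value):
        raise ValueError("unsafe characters in form value")
    # An argument list without shell=True means no shell ever parses the value.
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.argv[1])", value],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

Note that a value made only of allowed characters can still begin with a dash, which some programs would treat as an option, so the check is a minimum rather than a complete defence.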

Installing your CGI script on a Unix system

Read the documentation for your HTTP server and check with your local system administrator to find the directory where CGI scripts should be installed; usually this is in a directory cgi-bin in the server tree.

Make sure that your script is readable and executable by “others”; the Unix file mode should be 0o755 octal (use chmod 0755 filename). Make sure that the first line of the script contains #! starting in column 1 followed by the pathname of the Python interpreter, for instance:

#!/usr/local/bin/python

Make sure the Python interpreter exists and is executable by “others”.

Make sure that any files your script needs to read or write are readable or writable, respectively, by “others”; their mode should be 0o644 for readable and 0o666 for writable. This is because, for security reasons, the HTTP server executes your script as user “nobody”, without any special privileges. It can only read (write, execute) files that everybody can read (write, execute). The current directory at execution time is also different (it is usually the server’s cgi-bin directory) and the set of environment variables is also different from what you get when you log in. In particular, don’t count on the shell’s search path for executables (PATH) or the Python module search path (PYTHONPATH) to be set to anything interesting.
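The permission bits mentioned above can be inspected from Python with os.stat() and the stat module. The sketch below assumes a Unix system; the helper name others_can is hypothetical, and a temporary file stands in for a file your script would read or write.

```python
import os
import stat
import tempfile


def others_can(path, read=False, write=False, execute=False):
    """Return True if the 'others' permission bits allow all requested accesses."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    wanted = 0
    if read:
        wanted |= stat.S_IROTH
    if write:
        wanted |= stat.S_IWOTH
    if execute:
        wanted |= stat.S_IXOTH
    return mode & wanted == wanted


with tempfile.NamedTemporaryFile() as tmp:
    os.chmod(tmp.name, 0o644)            # readable by everybody
    readable = others_can(tmp.name, read=True)
    writable = others_can(tmp.name, write=True)

    os.chmod(tmp.name, 0o666)            # readable and writable by everybody
    writable_after = others_can(tmp.name, read=True, write=True)
```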

If you need to load modules from a directory which is not on Python’s default module search path, you can change the path in your script, before importing other modules. For example:

import sys
sys.path.insert(0, "/usr/home/joe/lib/python")
sys.path.insert(0, "/usr/local/lib/python")

(This way, the directory inserted last will be searched first!)
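The ordering can be checked on an ordinary list, since sys.path is one. Here search_path is a made-up stand-in with a single assumed entry, so that the real sys.path is left alone:

```python
# Stand-in for sys.path; the initial entry is an assumption for the example.
search_path = ["/usr/lib/python3"]

search_path.insert(0, "/usr/home/joe/lib/python")
search_path.insert(0, "/usr/local/lib/python")   # inserted last, searched first

first_searched = search_path[0]
```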

Instructions for non-Unix systems will vary; check your HTTP server’s documentation (it will usually have a section on CGI scripts).

Testing your CGI script

Unfortunately, a CGI script will generally not run when you try it from the command line, and a script that works perfectly from the command line may fail mysteriously when run from the server. There’s one reason why you should still test your script from the command line: if it contains a syntax error, the Python interpreter won’t execute it at all, and the HTTP server will most likely send a cryptic error to the client.

Assuming your script has no syntax errors, yet it does not work, you have no choice but to read the next section.

Debugging CGI scripts

First of all, check for trivial installation errors; reading the section above on installing your CGI script carefully can save you a lot of time. If you wonder whether you have understood the installation procedure correctly, try installing a copy of this module file (cgi.py) as a CGI script. When invoked as a script, the file will dump its environment and the contents of the form in HTML format. Give it the right mode etc., and send it a request. If it’s installed in the standard cgi-bin directory, it should be possible to send it a request by entering a URL into your browser of the form:

http://yourhostname/cgi-bin/cgi.py?name=Joe+Blow&addr=At+Home

If this gives an error of type 404, the server cannot find the script; perhaps you need to install it in a different directory. If it gives another error, there’s an installation problem that you should fix before trying to go any further. If you get a nicely formatted listing of the environment and form content (in this example, the fields should be listed as “addr” with value “At Home” and “name” with value “Joe Blow”), the cgi.py script has been installed correctly. If you follow the same procedure for your own script, you should now be able to debug it.

The next step could be to call the cgi module’s test() function from your script: replace its main code with the single statement

cgi.test()

This should produce the same results as those gotten from installing the cgi.py file itself.

When an ordinary Python script raises an unhandled exception (for whatever reason: a typo in a module name, a file that can’t be opened, etc.), the Python interpreter prints a nice traceback and exits. While the Python interpreter will still do this when your CGI script raises an exception, most likely the traceback will end up in one of the HTTP server’s log files, or be discarded altogether.

Fortunately, once you have managed to get your script to execute some code, you can easily send tracebacks to the web browser using the cgitb module. If you haven’t done so already, just add the lines:

import cgitb
cgitb.enable()

to the top of your script. Then try running it again; when a problem occurs, you should see a detailed report that will likely make apparent the cause of the crash.

If you suspect that there may be a problem in importing the cgitb module, you can use an even more robust approach (which only uses built-in modules):

import sys
sys.stderr = sys.stdout
print("Content-Type: text/plain")
print()
...your code here...

This relies on the Python interpreter to print the traceback. The content type of the output is set to plain text, which disables all HTML processing. If your script works, the raw HTML will be displayed by your client. If it raises an exception, most likely after the first two lines have been printed, a traceback will be displayed. Because no HTML interpretation is going on, the traceback will be readable.
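A variation on the same idea, shown only as a sketch, catches the exception explicitly and prints the traceback text itself instead of redirecting sys.stderr. The names run_cgi_main and broken_main are hypothetical, and the output is captured in a string here purely so it can be inspected.

```python
import io
import traceback
from contextlib import redirect_stdout


def run_cgi_main(main):
    """Run main() and print any traceback as plain text in the response."""
    print("Content-Type: text/plain")
    print()
    try:
        main()
    except Exception:
        # format_exc() returns the traceback of the exception being handled.
        print(traceback.format_exc())


def broken_main():
    """Stand-in for script code that fails."""
    return 1 / 0


buffer = io.StringIO()
with redirect_stdout(buffer):
    run_cgi_main(broken_main)
output = buffer.getvalue()
```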

Common problems and solutions

  • Most HTTP servers buffer the output from CGI scripts until the script is completed. This means that it is not possible to display a progress report on the client’s display while the script is running.

  • Check the installation instructions above.

  • Check the HTTP server’s log files. (tail -f logfile in a separate window may be useful!)

  • Always check a script for syntax errors first, by doing something like python script.py.

  • If your script does not have any syntax errors, try adding import cgitb; cgitb.enable() to the top of the script.

  • When invoking external programs, make sure they can be found. Usually, this means using absolute path names; PATH is usually not set to a very useful value in a CGI script.

  • When reading or writing external files, make sure they can be read or written by the userid under which your CGI script will be running: this is typically the userid under which the web server is running, or some explicitly specified userid for a web server’s suexec feature.

  • Don’t try to give a CGI script a set-uid mode. This doesn’t work on most systems, and is a security liability as well.

Footnotes

[1]

Note that some recent versions of the HTML specification do state what order the field values should be supplied in, but knowing whether a request was received from a conforming browser, or even from a browser at all, is tedious and error-prone.