PHP-FIG: The PHP Framework Interop Group

PSR-7 Meta Document

HTTP Message Meta Document

1. Summary

The purpose of this proposal is to provide a set of common interfaces for HTTP messages as described in RFC 7230 and RFC 7231, and URIs as described in RFC 3986 (in the context of HTTP messages).

All HTTP messages consist of the HTTP protocol version being used, headers, and a message body. A Request builds on the message to include the HTTP method used to make the request, and the URI to which the request is made. A Response includes the HTTP status code and reason phrase.

In PHP, HTTP messages are used in two contexts:

  • To send an HTTP request, via the ext/curl extension, PHP's native stream layer, etc., and process the received HTTP response. In other words, HTTP messages are used when using PHP as an HTTP client.
  • To process an incoming HTTP request to the server, and return an HTTP response to the client making the request. PHP can use HTTP messages when used as a server-side application to fulfill HTTP requests.

This proposal presents an API for fully describing all parts of the various HTTP messages within PHP.

2. HTTP Messages in PHP

PHP does not have built-in support for HTTP messages.

Client-side HTTP support

PHP supports sending HTTP requests via several mechanisms:

  • PHP streams
  • The cURL extension
  • The http extension (ext/http)

PHP streams are the most convenient and ubiquitous way to send HTTP requests, but pose a number of limitations with regards to properly configuring SSL support, and provide a cumbersome interface around setting things such as headers. cURL provides a complete and expanded feature-set, but, as it is not a default extension, is often not present. The http extension suffers from the same problem as cURL, as well as the fact that it has traditionally had far fewer examples of usage.

Most modern HTTP client libraries tend to abstract the implementation, to ensure they can work on whatever environment they are executed on, and across any of the above layers.

Server-side HTTP Support

PHP uses Server APIs (SAPI) to interpret incoming HTTP requests, marshal input, and pass off handling to scripts. The original SAPI design mirrored the Common Gateway Interface, which would marshal request data and push it into environment variables before passing delegation to a script; the script would then pull from the environment variables in order to process the request and return a response.

PHP's SAPI design abstracts common input sources such as cookies, query string arguments, and url-encoded POST content via superglobals ($_COOKIE, $_GET, and $_POST, respectively), providing a layer of convenience for web developers.

On the response side of the equation, PHP was originally developed as a templating language, and allows intermixing HTML and PHP; any HTML portions of a file are immediately flushed to the output buffer. Modern applications and frameworks eschew this practice, as it can lead to issues with regards to emitting a status line and/or response headers; they tend to aggregate all headers and content, and emit them at once when all other application processing is complete. Special care needs to be paid to ensure that error reporting and other actions that send content to the output buffer do not flush the output buffer.

3. Why Bother?

HTTP messages are used in a wide number of PHP projects -- both clients and servers. In each case, we observe one or more of the following patterns or situations:

  1. Projects use PHP's superglobals directly.
  2. Projects will create implementations from scratch.
  3. Projects may require a specific HTTP client/server library that provides HTTP message implementations.
  4. Projects may create adapters for common HTTP message implementations.

As examples:

  1. Just about any application that began development before the rise of frameworks, which includes a number of very popular CMS, forum, and shopping cart systems, has historically used superglobals.
  2. Frameworks such as Symfony and Zend Framework each define HTTP components that form the basis of their MVC layers; even small, single-purpose libraries such as oauth2-server-php provide and require their own HTTP request/response implementations. Guzzle, Buzz, and other HTTP client implementations each create their own HTTP message implementations as well.
  3. Projects such as Silex, Stack, and Drupal 8 have hard dependencies on Symfony's HTTP kernel. Any SDK built on Guzzle has a hard requirement on Guzzle's HTTP message implementations.
  4. Projects such as Geocoder create redundant adapters for common libraries.

Direct usage of superglobals has a number of concerns. First, these are mutable, which makes it possible for libraries and code to alter the values, and thus alter state for the application. Additionally, superglobals make unit and integration testing difficult and brittle, leading to code quality degradation.

In the current ecosystem of frameworks that implement HTTP message abstractions, the net result is that projects are not capable of interoperability or cross-pollination. In order to consume code targeting one framework from another, the first order of business is building a bridge layer between the HTTP message implementations. On the client-side, if a particular library does not have an adapter you can utilize, you need to bridge the request/response pairs if you wish to use an adapter from another library.

Finally, when it comes to server-side responses, PHP gets in its own way: any content emitted before a call to header() will result in that call becoming a no-op; depending on error reporting settings, this can often mean headers and/or response status are not correctly sent. One way to work around this is to use PHP's output buffering features, but nesting of output buffers can become problematic and difficult to debug. Frameworks and applications thus tend to create response abstractions for aggregating headers and content that can be emitted at once - and these abstractions are often incompatible.

Thus, the goal of this proposal is to abstract both client- and server-side request and response interfaces in order to promote interoperability between projects. If projects implement these interfaces, a reasonable level of compatibility may be assumed when adopting code from different libraries.

It should be noted that the goal of this proposal is not to obsolete the current interfaces utilized by existing PHP libraries. This proposal is aimed at interoperability between PHP packages for the purpose of describing HTTP messages.

4. Scope

4.1 Goals

  • Provide the interfaces needed for describing HTTP messages.
  • Focus on practical applications and usability.
  • Define the interfaces to model all elements of the HTTP message and URI specifications.
  • Ensure that the API does not impose arbitrary limits on HTTP messages. For example, some HTTP message bodies can be too large to store in memory, so we must account for this.
  • Provide useful abstractions both for handling incoming requests for server-side applications and for sending outgoing requests in HTTP clients.

4.2 Non-Goals

  • This proposal does not expect all HTTP client libraries or server-side frameworks to change their interfaces to conform. It is strictly meant for interoperability.
  • While everyone's perception of what is and is not an implementation detail varies, this proposal should not impose implementation details. As RFCs 7230, 7231, and 3986 do not force any particular implementation, there will be a certain amount of invention needed to describe HTTP message interfaces in PHP.

5. Design Decisions

Message design

The MessageInterface provides accessors for the elements common to all HTTP messages, whether they are for requests or responses. These elements include:

  • HTTP protocol version (e.g., "1.0", "1.1")
  • HTTP headers
  • HTTP message body

More specific interfaces are used to describe requests and responses, and more specifically the context of each (client- vs. server-side). These divisions are partly inspired by existing PHP usage, but also by other languages such as Ruby's Rack, Python's WSGI, Go's http package, Node's http module, etc.

Why are there header methods on messages rather than in a header bag?

The message itself is a container for the headers (as well as the other message properties). How these are represented internally is an implementation detail, but uniform access to headers is a responsibility of the message.

Why are URIs represented as objects?

URIs are values, with identity defined by the value, and thus should be modeled as value objects.

Additionally, URIs contain a variety of segments which may be accessed many times in a given request -- and which would require parsing the URI in order to determine (e.g., via parse_url()). Modeling URIs as value objects allows parsing once only, and simplifies access to individual segments. It also provides convenience in client applications by allowing users to create new instances of a base URI instance with only the segments that change (e.g., updating the path only).
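As a rough illustration of the "parse once, derive new instances" idea, the sketch below wraps parse_url() in a tiny value object. SketchUri is a hypothetical class invented for this example, it handles only scheme, host, and path, and it is not the UriInterface defined by the specification.

```php
<?php
declare(strict_types=1);

// Hypothetical value object for illustration; a real UriInterface
// implementation covers many more components and edge cases.
final class SketchUri
{
    private string $scheme;
    private string $host;
    private string $path;

    public function __construct(string $uri)
    {
        $parts = parse_url($uri);          // the string is parsed exactly once
        if ($parts === false) {
            throw new InvalidArgumentException("Unparseable URI: $uri");
        }
        $this->scheme = $parts['scheme'] ?? '';
        $this->host   = $parts['host'] ?? '';
        $this->path   = $parts['path'] ?? '';
    }

    public function getPath(): string
    {
        return $this->path;
    }

    // Returns a modified copy; the instance it is called on is untouched.
    public function withPath(string $path): self
    {
        $new = clone $this;
        $new->path = $path;
        return $new;
    }

    // Assumes an absolute URI with scheme and host, as in the usage below.
    public function __toString(): string
    {
        return $this->scheme . '://' . $this->host . $this->path;
    }
}

$base = new SketchUri('http://api.example.com');
$user = $base->withPath('/user');   // $base still has an empty path
```

Because withPath() clones, the base URI can be reused for any number of derived requests.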

Why does the request interface have methods for dealing with the request-target AND compose a URI?

RFC 7230 details the request line as containing a "request-target". Of the four forms of request-target, only one is a URI compliant with RFC 3986; the most common form used is origin-form, which represents the URI without the scheme or authority information. Moreover, since all forms are valid for purposes of requests, the proposal must accommodate each.

RequestInterface thus has methods relating to the request-target. By default, it will use the composed URI to present an origin-form request-target, and, in the absence of a URI instance, return the string "/". Another method, withRequestTarget(), allows specifying an instance with a specific request-target, allowing users to create requests that use one of the other valid request-target forms.

The URI is kept as a discrete member of the request for a variety of reasons. For both clients and servers, knowledge of the absolute URI is typically required. In the case of clients, the URI, and specifically the scheme and authority details, is needed in order to make the actual TCP connection. For server-side applications, the full URI is often required in order to validate the request or to route to an appropriate handler.
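The default behaviour described above might be sketched as a standalone function. sketchRequestTarget() and its parameters are hypothetical stand-ins for the state a request would hold, and treating an empty path as "/" is an assumption of this sketch rather than something stated here.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper, not part of the specification.
 *
 * @param string|null $explicit target previously set via withRequestTarget(), if any
 * @param string|null $path     path of the composed URI, or null when no URI is composed
 * @param string      $query    query string of the composed URI, without the leading "?"
 */
function sketchRequestTarget(?string $explicit, ?string $path, string $query = ''): string
{
    if ($explicit !== null) {
        return $explicit;        // e.g. "*" or an absolute-form target
    }
    if ($path === null || $path === '') {
        $path = '/';             // no URI available (or, by assumption, an empty path)
    }
    // origin-form: path plus optional query, no scheme or authority
    return $query === '' ? $path : $path . '?' . $query;
}
```

An explicitly provided request-target always wins, which is what lets callers use the less common request-target forms.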

Why value objects?

The proposal models messages and URIs as value objects.

Messages are values where the identity is the aggregate of all parts of the message; a change to any aspect of the message is essentially a new message. This is the very definition of a value object. The practice by which changes result in a new instance is termed immutability, and is a feature designed to ensure the integrity of a given value.

The proposal also recognizes that most clients and server-side applications will need to be able to easily update message aspects, and, as such, provides interface methods that will create new message instances with the updates. These are generally prefixed with the verbiage with or without.
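A minimal sketch of how such with/without methods could be written, assuming a clone-then-modify approach. SketchMessage is a hypothetical class for illustration only and implements just a fraction of what MessageInterface requires.

```php
<?php
declare(strict_types=1);

// Hypothetical message; header names are stored lower-cased so that
// lookups are case-insensitive.
final class SketchMessage
{
    /** @var array<string, string[]> */
    private array $headers = [];

    public function withHeader(string $name, string $value): self
    {
        $new = clone $this;                            // copy, never mutate $this
        $new->headers[strtolower($name)] = [$value];
        return $new;
    }

    public function withoutHeader(string $name): self
    {
        $new = clone $this;
        unset($new->headers[strtolower($name)]);
        return $new;
    }

    public function getHeaderLine(string $name): string
    {
        return implode(', ', $this->headers[strtolower($name)] ?? []);
    }
}

$base   = new SketchMessage();
$tagged = $base->withHeader('X-Foo', 'bar');   // $base is left unchanged
```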

Value objects provide several benefits when modeling HTTP messages:

  • Changes in URI state cannot alter the request composing the URI instance.
  • Changes in headers cannot alter the message composing them.

In essence, modeling HTTP messages as value objects ensures the integrity of the message state, and prevents the need for bi-directional dependencies, which can often go out-of-sync or lead to debugging or performance issues.

For HTTP clients, they allow consumers to build a base request with data such as the base URI and required headers, without needing to build a brand new request or reset request state for each message the client sends:

    // Uri, Request, and StringStream stand for concrete implementations;
    // $token and $client are assumed to be defined elsewhere.
    $uri = new Uri('http://api.example.com');
    $baseRequest = new Request($uri, null, [
        'Authorization' => 'Bearer ' . $token,
        'Accept'        => 'application/json',
    ]);

    $request = $baseRequest->withUri($uri->withPath('/user'))->withMethod('GET');
    $response = $client->sendRequest($request);

    // get user id from $response

    $body = new StringStream(json_encode(['tasks' => [
        'Code',
        'Coffee',
    ]]));
    $request = $baseRequest
        ->withUri($uri->withPath('/tasks/user/' . $userId))
        ->withMethod('POST')
        ->withHeader('Content-Type', 'application/json')
        ->withBody($body);
    $response = $client->sendRequest($request);

    // No need to overwrite headers or body!
    $request = $baseRequest->withUri($uri->withPath('/tasks'))->withMethod('GET');
    $response = $client->sendRequest($request);

On the server-side, developers will need to:

  • Deserialize the request message body.
  • Decrypt HTTP cookies.
  • Write to the response.

These operations can be accomplished with value objects as well, with a numberof benefits:

  • The original request state can be stored for retrieval by any consumer.
  • A default response state can be created with default headers and/or message body.

Most popular PHP frameworks have fully mutable HTTP messages today. The main changes necessary in consuming true value objects are:

  • Instead of calling setter methods or setting public properties, mutator methods will be called, and the result assigned.
  • Developers must notify the application on a change in state.

As an example, in Zend Framework 2, instead of the following:

    function (MvcEvent $e)
    {
        $response = $e->getResponse();
        $response->setHeaderLine('x-foo', 'bar');
    }

one would now write:

    function (MvcEvent $e)
    {
        $response = $e->getResponse();
        $e->setResponse(
            $response->withHeader('x-foo', 'bar')
        );
    }

The above combines assignment and notification in a single call.

This practice has a side benefit of making explicit any changes to application state being made.

New instances vs returning $this

One observation made on the various with*() methods is that they can likely safely return $this if the argument presented will not result in a change in the value. One rationale for doing so is performance (as this will not result in a cloning operation).

The various interfaces have been written with verbiage indicating that immutability MUST be preserved, but only indicate that "an instance" must be returned containing the new state. Since instances that represent the same value are considered equal, returning $this is functionally equivalent, and thus allowed.
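The shortcut could look roughly like the following. SketchResponse is a hypothetical class that tracks only a status code, which is enough to show the identity check before cloning.

```php
<?php
declare(strict_types=1);

// Hypothetical response holding only a status code.
final class SketchResponse
{
    private int $status = 200;

    public function getStatusCode(): int
    {
        return $this->status;
    }

    public function withStatus(int $code): self
    {
        if ($code === $this->status) {
            return $this;            // same value: no clone needed
        }
        $new = clone $this;
        $new->status = $code;
        return $new;
    }
}

$ok       = new SketchResponse();
$same     = $ok->withStatus(200);    // identical instance
$notFound = $ok->withStatus(404);    // new instance, $ok unchanged
```

Callers should therefore compare messages by value rather than rely on with*() always producing a distinct object.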

Using streams instead of X

MessageInterface uses a body value that must implement StreamInterface. This design decision was made so that developers can send and receive (and/or receive and send) HTTP messages that contain more data than can practically be stored in memory while still allowing the convenience of interacting with message bodies as a string. While PHP provides a stream abstraction by way of stream wrappers, stream resources can be cumbersome to work with: stream resources can only be cast to a string using stream_get_contents() or by manually reading the remainder of the stream. Adding custom behavior to a stream as it is consumed or populated requires registering a stream filter; however, stream filters can only be added to a stream after the filter is registered with PHP (i.e., there is no stream filter autoloading mechanism).

The use of a well-defined stream interface allows for the potential of flexible stream decorators that can be added to a request or response pre-flight to enable things like encryption, compression, ensuring that the number of bytes downloaded reflects the number of bytes reported in the Content-Length of a response, etc. Decorating streams is a well-established pattern in the Java and Node communities that allows for very flexible streams.

The majority of the StreamInterface API is based on Python's io module, which provides a practical and consumable API. Instead of implementing stream capabilities using something like a WritableStreamInterface and ReadableStreamInterface, the capabilities of a stream are provided by methods like isReadable(), isWritable(), etc. This approach is used by Python, C#, C++, Ruby, Node, and likely others.
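To make the capability-method approach concrete, here is a minimal sketch of a wrapper around a PHP stream resource. SketchStream is hypothetical and exposes only a handful of StreamInterface-like methods; deriving the capabilities from the stream's mode string is one possible strategy, not the prescribed one.

```php
<?php
declare(strict_types=1);

// Hypothetical wrapper; a real StreamInterface implementation has many more methods.
final class SketchStream
{
    /** @var resource */
    private $resource;

    /** @param resource $resource an open PHP stream */
    public function __construct($resource)
    {
        $this->resource = $resource;
    }

    public function isReadable(): bool
    {
        $mode = stream_get_meta_data($this->resource)['mode'];
        return strpbrk($mode, 'r+') !== false;      // "r" or any "+" mode
    }

    public function isWritable(): bool
    {
        $mode = stream_get_meta_data($this->resource)['mode'];
        return strpbrk($mode, 'waxc+') !== false;   // write modes or any "+" mode
    }

    public function write(string $data): int
    {
        if (!$this->isWritable()) {
            throw new RuntimeException('Stream is not writable');
        }
        $written = fwrite($this->resource, $data);
        if ($written === false) {
            throw new RuntimeException('Write failed');
        }
        return $written;
    }

    // String convenience: rewind, then read everything.
    public function __toString(): string
    {
        rewind($this->resource);
        return (string) stream_get_contents($this->resource);
    }
}

$stream = new SketchStream(fopen('php://memory', 'r+'));
$stream->write('Hello');
$contents = (string) $stream;
```

A consumer can check isWritable() before writing instead of testing for a separate writable-stream interface.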

What if I just want to return a file?

In some cases, you may want to return a file from the filesystem. The typical way to do this in PHP is one of the following:

    readfile($filename);

    stream_copy_to_stream(fopen($filename, 'r'), fopen('php://output', 'w'));

Note that the above omits sending appropriate Content-Type and Content-Length headers; the developer would need to emit these prior to calling the above code.

The equivalent using HTTP messages would be to use a StreamInterface implementation that accepts a filename and/or stream resource, and to provide this to the response instance. A complete example, including setting appropriate headers:

    // where Stream is a concrete StreamInterface:
    $stream   = new Stream($filename);
    $finfo    = new finfo(FILEINFO_MIME);
    $response = $response
        ->withHeader('Content-Type', $finfo->file($filename))
        ->withHeader('Content-Length', (string) filesize($filename))
        ->withBody($stream);

Emitting this response will send the file to the client.

What if I want to directly emit output?

Directly emitting output (e.g. via echo, printf, or writing to the php://output stream) is generally only advisable as a performance optimization or when emitting large data sets. If it needs to be done and you still wish to work in an HTTP message paradigm, one approach would be to use a callback-based StreamInterface implementation, per this example. Wrap any code emitting output directly in a callback, pass that to an appropriate StreamInterface implementation, and provide it to the message body:

    $output = new CallbackStream(function () use ($request) {
        printf("The requested URI was: %s<br>\n", $request->getUri());
        return '';
    });
    return (new Response())
        ->withHeader('Content-Type', 'text/html')
        ->withBody($output);

What if I want to use an iterator for content?

Ruby's Rack implementation uses an iterator-based approach for server-side response message bodies. This can be emulated using an HTTP message paradigm via an iterator-backed StreamInterface approach, as detailed in the psr7examples repository.

Why are streams mutable?

The StreamInterface API includes methods such as write() which can change the message content -- which directly contradicts having immutable messages.

The problem that arises is due to the fact that the interface is intended to wrap a PHP stream or similar. A write operation therefore will proxy to writing to the stream. Even if we made StreamInterface immutable, once the stream has been updated, any instance that wraps that stream will also be updated -- making immutability impossible to enforce.

Our recommendation is that implementations use read-only streams for server-side requests and client-side responses.
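The underlying issue can be reproduced with a bare stream resource. In this sketch, sketchContents() is a hypothetical helper standing in for what a stream wrapper's string conversion would do; the point is only that every reader of the same resource observes a write made through any handle.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical read helper: rewinds and returns the full contents.
 *
 * @param resource $resource
 */
function sketchContents($resource): string
{
    rewind($resource);
    return (string) stream_get_contents($resource);
}

$resource = fopen('php://memory', 'r+');
fwrite($resource, 'original');

$before = sketchContents($resource);   // what a first wrapper would report
fwrite($resource, ' + appended');      // a write through any other handle
$after  = sketchContents($resource);   // the first wrapper's view changes too
```

Opening the underlying resource read-only, as recommended above, removes the write path entirely.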

Rationale for ServerRequestInterface

The RequestInterface and ResponseInterface have essentially 1:1 correlations with the request and response messages described in RFC 7230. They provide interfaces for implementing value objects that correspond to the specific HTTP message types they model.

For server-side applications there are other considerations for incoming requests:

  • Access to server parameters (potentially derived from the request, but also potentially the result of server configuration, and generally represented via the $_SERVER superglobal; these are part of the PHP Server API (SAPI)).
  • Access to the query string arguments (usually encapsulated in PHP via the $_GET superglobal).
  • Access to the parsed body (i.e., data deserialized from the incoming request body; in PHP, this is typically the result of POST requests using application/x-www-form-urlencoded content types, and encapsulated in the $_POST superglobal, but for non-POST, non-form-encoded data, could be an array or an object).
  • Access to uploaded files (encapsulated in PHP via the $_FILES superglobal).
  • Access to cookie values (encapsulated in PHP via the $_COOKIE superglobal).
  • Access to attributes derived from the request (usually, but not limited to, those matched against the URL path).

Uniform access to these parameters increases the viability of interoperability between frameworks and libraries, as they can now assume that if a request implements ServerRequestInterface, they can get at these values. It also solves problems within the PHP language itself:

  • Until PHP 5.6.0, php://input was read-once; as such, instantiating multiple request instances from multiple frameworks/libraries could lead to inconsistent state, as the first to access php://input would be the only one to receive the data.
  • Unit testing against superglobals (e.g., $_GET, $_FILES, etc.) is difficult and typically brittle. Encapsulating them inside the ServerRequestInterface implementation eases testing considerations.

Why "parsed body" in the ServerRequestInterface?

Arguments were made to use the terminology "BodyParams", and require the value to be an array, with the following rationale:

  • Consistency with other server-side parameter access.
  • $_POST is an array, and the 80% use case would target that superglobal.
  • A single type makes for a strong contract, simplifying usage.

The main argument is that if the body parameters are an array, developers have predictable access to values:

    $foo = isset($request->getBodyParams()['foo'])
        ? $request->getBodyParams()['foo']
        : null;

The argument for using "parsed body" was made by examining the domain. A message body can contain literally anything. While traditional web applications use forms and submit data using POST, this is a use case that is quickly being challenged in current web development trends, which are often API-centric, and thus use alternate request methods (notably PUT and PATCH), as well as non-form-encoded content (generally JSON or XML) that can be coerced to arrays in many cases, but in many cases also cannot or should not.

If forcing the property representing the parsed body to be only an array, developers then need a shared convention about where to put the results of parsing the body. These might include:

  • A special key under the body parameters, such as __parsed__.
  • A specially named attribute, such as __body__.

The end result is that a developer now has to look in multiple locations:

    $data = $request->getBodyParams();
    if (isset($data['__parsed__']) && is_object($data['__parsed__'])) {
        $data = $data['__parsed__'];
    }

    // or:
    $data = $request->getBodyParams();
    if ($request->hasAttribute('__body__')) {
        $data = $request->getAttribute('__body__');
    }

The solution presented is to use the terminology "ParsedBody", which implies that the values are the results of parsing the message body. This also means that the return value will be ambiguous; however, because this is an attribute of the domain, this is also expected. As such, usage will become:

    $data = $request->getParsedBody();
    if (! $data instanceof \stdClass) {
        // raise an exception!
    }
    // otherwise, we have what we expected

This approach removes the limitations of forcing an array, at the expense of ambiguity of return value. Considering that the other suggested solutions (pushing the parsed data into a special body parameter key or into an attribute) also suffer from ambiguity, the proposed solution is simpler as it does not require additions to the interface specification. Ultimately, the ambiguity enables the flexibility required when representing the results of parsing the body.
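One way an application might produce such a parsed body is sketched below. sketchParseBody() is a hypothetical function, and the mapping from media type to parser is an assumption of this example; the proposal does not prescribe how or when parsing happens.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical body parser returning an array, an object, or null.
 *
 * @return array|object|null
 */
function sketchParseBody(string $contentType, string $rawBody)
{
    // Strip parameters such as "; charset=UTF-8" from the media type.
    $mediaType = strtolower(trim(explode(';', $contentType)[0]));

    if ($mediaType === 'application/x-www-form-urlencoded') {
        parse_str($rawBody, $params);     // same shape $_POST would have
        return $params;
    }

    if ($mediaType === 'application/json') {
        $decoded = json_decode($rawBody); // JSON objects become stdClass instances
        return (is_array($decoded) || is_object($decoded)) ? $decoded : null;
    }

    return null;                          // nothing this sketch knows how to parse
}
```

The three possible return shapes are exactly the ambiguity discussed above: consumers check for the type they expect and fail otherwise.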

Why is no functionality included for retrieving the "base path"?

Many frameworks provide the ability to get the "base path," usually considered the path up to and including the front controller. As an example, if the application is served at http://example.com/b2b/index.php, and the current URI used to request it is http://example.com/b2b/index.php/customer/register, the functionality to retrieve the base path would return /b2b/index.php. This value can then be used by routers to strip that path segment prior to attempting a match.

This value is often also then used for URI generation within applications; parameters will be passed to the router, which will generate the path, and prefix it with the base path in order to return a fully-qualified URI. Other tools (typically view helpers, template filters, or template functions) are used to resolve a path relative to the base path in order to generate a URI for linking to resources such as static assets.

On examination of several different implementations, we noticed the following:

  • The logic for determining the base path varies widely between implementations. As an example, compare the logic in ZF2 to the logic in Symfony 2.
  • Most implementations appear to allow manual injection of a base path to the router and/or any facilities used for URI generation.
  • The primary use cases, routing and URI generation, typically are the only consumers of the functionality; developers usually do not need to be aware of the base path concept as other objects take care of that detail for them. As examples:
    • A router will strip off the base path for you during routing; you do not need to pass the modified path to the router.
    • View helpers, template filters, etc. typically are injected with a base path prior to invocation. Sometimes this is manually done, though more often it is the result of framework wiring.
  • All sources necessary for calculating the base path are already in the RequestInterface instance, via server parameters and the URI instance.

Our stance is that base path detection is framework and/or application specific, and the results of detection can be easily injected into objects that need it, and/or calculated as needed using utility functions and/or classes from the RequestInterface instance itself.

Why does getUploadedFiles() return objects instead of arrays?

getUploadedFiles() returns a tree of Psr\Http\Message\UploadedFileInterface instances. This is done primarily to simplify specification: instead of requiring paragraphs of implementation specification for an array, we specify an interface.

Additionally, the data in an UploadedFileInterface is normalized to work in both SAPI and non-SAPI environments. This allows the creation of processes to parse the message body manually and assign contents to streams without first writing to the filesystem, while still allowing proper handling of file uploads in SAPI environments.

What about "special" header values?

A number of header values contain unique representation requirements which can pose problems both for consumption as well as generation; in particular, cookies and the Accept header.

This proposal does not provide any special treatment of any header types. The base MessageInterface provides methods for header retrieval and setting, and all header values are, in the end, string values.

Developers are encouraged to write commodity libraries for interacting with these header values, either for the purposes of parsing or generation. Users may then consume these libraries when needing to interact with those values. Examples of this practice already exist in libraries such as willdurand/Negotiation and Aura.Accept. So long as the object has functionality for casting the value to a string, these objects can be used to populate the headers of an HTTP message.
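As a sketch of the "castable to string" idea, the hypothetical SketchAccept class below models an Accept header as a map of media types to quality values and renders it via __toString(). It is not modeled on the API of either library named above.

```php
<?php
declare(strict_types=1);

// Hypothetical header value object; real negotiation libraries do far more.
final class SketchAccept
{
    /** @var array<string, float> media type => quality value */
    private array $types;

    /** @param array<string, float> $types */
    public function __construct(array $types)
    {
        $this->types = $types;
    }

    public function __toString(): string
    {
        $parts = [];
        foreach ($this->types as $type => $quality) {
            // q=1 is the default and can be omitted; %F is locale-independent.
            $parts[] = $quality >= 1.0 ? $type : sprintf('%s;q=%.1F', $type, $quality);
        }
        return implode(', ', $parts);
    }
}

$accept      = new SketchAccept(['application/json' => 1.0, 'text/html' => 0.8]);
$headerValue = (string) $accept;   // string suitable for a withHeader() call
```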

6. People

6.1 Editor(s)

  • Matthew Weier O'Phinney

6.2 Sponsors

  • Paul M. Jones
  • Beau Simensen (coordinator)

6.3 Contributors

  • Michael Dowling
  • Larry Garfield
  • Evert Pot
  • Tobias Schultze
  • Bernhard Schussek
  • Anton Serdyuk
  • Phil Sturgeon
  • Chris Wilkinson

7. Errata

7.1 Validation of Header Names and Values

Some special characters within the name or value of an HTTP header might affect the parsing of the serialized message in a way that the contents of unrelated headers are changed. This misparsing can open up an application to security vulnerabilities. A common type of vulnerability is CRLF injection, allowing an attacker to inject additional headers or to end the list of headers early.

For this reason classes implementing the MessageInterface SHOULD strictly validate the header names and contents according to the most recent HTTP specification (RFC 7230#3.2 at the time of writing). The implementation SHOULD reject invalid values and SHOULD NOT make any attempt to automatically correct the provided values.

A minimally viable validator is expected to reject header names containing the following characters:

  • NUL (0x00)
  • \r (0x0D)
  • \n (0x0A)
  • Any character less than or equal to 0x20.

Further characters or sequences in header names should be rejected according to the HTTP specification.

A minimally viable validator is expected to reject header values containing the following characters:

  • NUL (0x00)
  • \r (0x0D)
  • \n (0x0A)

If compatibility with older systems is desired then the sequence \r\n (0x0D0A) within a header value MAY be accepted if and only if it is immediately followed by either SPACE (0x20) or \t (0x09). The full sequence SHOULD then internally be normalized to a single SPACE (0x20).

Further characters or sequences in header values should be rejected according to the HTTP specification.
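The rules above might be implemented along the following lines. Both function names are hypothetical. For header names, the sketch applies the RFC 7230 token grammar, which is stricter than the minimum listed here and already excludes NUL, CR, LF, and every character at or below 0x20; the legacy folding behaviour is opt-in through an assumed $allowObsFold flag.

```php
<?php
declare(strict_types=1);

// Hypothetical validator: throws on any name that is not an RFC 7230 token.
function sketchAssertHeaderName(string $name): void
{
    if (preg_match('/^[!#$%&\'*+\-.^_`|~0-9A-Za-z]+$/D', $name) !== 1) {
        throw new InvalidArgumentException('Invalid header name');
    }
}

// Hypothetical validator: returns the value unchanged, or with legacy line
// folding normalized when $allowObsFold is true; throws on NUL, CR, or LF.
function sketchNormalizeHeaderValue(string $value, bool $allowObsFold = false): string
{
    if ($allowObsFold) {
        // CRLF immediately followed by SPACE or TAB becomes a single SPACE.
        $value = (string) preg_replace('/\r\n[ \t]/', ' ', $value);
    }
    if (preg_match('/[\x00\r\n]/', $value) === 1) {
        throw new InvalidArgumentException('Invalid header value');
    }
    return $value;
}
```

Apart from the explicitly permitted folding case, the functions reject rather than repair input, in line with the SHOULD NOT on automatic correction.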

7.2 Type Additions

The 1.1 release of the psr/http-message package includes scalar parameter types. The 2.0 release of the package includes return types. This structure leverages PHP 7.2 covariance support to allow for a gradual upgrade process, but requires PHP 8.0 for type compatibility.

Implementers MAY add return types to their own packages at their discretion, provided that:

  • the return types match those in the 2.0 package.
  • the implementation specifies a minimum PHP version of 7.2.0 or later.

Implementers MAY add parameter types to their own packages in a new major release, either at the same time as adding return types or in a subsequent release, provided that:

  • the parameter types match those in the 1.1 package.
  • the implementation specifies a minimum PHP version of 7.2.0 or later.
  • the implementation depends on "psr/http-message": "^1.1 || ^2.0" so as to exclude the untyped 1.0 version.

Implementers are encouraged but not required to transition their packages toward the 2.0 version of the package at their earliest convenience.

7.3 Escaping User Info

Some characters are reserved in the user info part of the authority section. According to RFC 3986, sections 2.2 and 3.2.1 (https://www.rfc-editor.org/rfc/rfc3986), the reserved characters are "/" / "?" / "#" / "[" / "]" / "@". Additionally, ":" must be encoded when in the username because it is used to separate username and password.

UriInterface::withUserInfo() MUST NOT double encode reserved characters.

UriInterface::getUserInfo() MUST encode the reserved characters according to RFC 3986 when returning the authority. If there is a password, the ":" between username and password MUST NOT be encoded.
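A possible encoding routine satisfying both requirements is sketched below. The function names are hypothetical. The regular expression percent-encodes everything outside the RFC 3986 unreserved and sub-delims sets, and leaves an existing "%" alone when it already starts a valid percent-encoded triplet, which is what prevents double encoding.

```php
<?php
declare(strict_types=1);

// Hypothetical helper: encodes one user info component (username or password).
function sketchEncodeUserInfoPart(string $part): string
{
    return (string) preg_replace_callback(
        // Runs of disallowed characters, or a "%" not followed by two hex digits.
        '/(?:[^%A-Za-z0-9_\-.~!$&\'()*+,;=]+|%(?![A-Fa-f0-9]{2}))/',
        static fn (array $match): string => rawurlencode($match[0]),
        $part
    );
}

// Hypothetical helper: joins the components with a raw ":" separator.
function sketchUserInfo(string $user, ?string $password = null): string
{
    $info = sketchEncodeUserInfoPart($user);
    if ($password !== null && $password !== '') {
        $info .= ':' . sketchEncodeUserInfoPart($password);
    }
    return $info;
}
```

Encoding each component separately is what keeps a literal ":" inside the username from being confused with the separator.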

