draft-ietf-httpbis-semantics-latest currently says:
> Field values containing control (<x:ref>CTL</x:ref>) characters such as CR or LF are invalid; recipients &MUST; either reject a field value containing control characters, or convert them to SP before processing or forwarding the message.
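
For concreteness, here is a minimal sketch of the "convert them to SP" option from the quoted text. It is not taken from any real library: `sanitize_strict` is a hypothetical name, CTL is assumed to mean `%x00-1F / %x7F` as in RFC 5234, and HTAB is exempted on the assumption that the field-value grammar permits it.

```python
# Hypothetical illustration of the draft's "convert CTL to SP" option.
# Assumption: CTL = 0x00-0x1F and 0x7F (RFC 5234), with HTAB (0x09) exempted.
_CTL = bytes(b for b in list(range(0x20)) + [0x7F] if b != 0x09)
_CTL_TO_SP = bytes.maketrans(_CTL, b" " * len(_CTL))


def sanitize_strict(value: bytes) -> bytes:
    """Replace every control byte (except HTAB) with a single space."""
    return value.translate(_CTL_TO_SP)
```

Under this rule a cookie value containing `\x01` is silently altered (or, with the "reject" option, the whole message fails).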
However, when I implemented an HTTP/1.1 parsing library that followed this rule, it turned out to be broken in the real world. Specifically, users reported that sites using Google Analytics were setting cookies with values containing the ASCII character \x01, which is a control character (python-hyper/h11#57). And in my investigations at the time, browsers and popular clients like curl all supported this just fine.
It would be great if the next round of RFCs could look into this in more detail and define the field-content production in a way that would be acceptable to e.g. the browsers.
In my library we currently reject NUL and exotic whitespace (\r, \v, \f, \n), on the grounds that those are high-risk for interop bugs and splitting exploits, but allow all other bytes to pass through, including control characters. I don't know if that's the best solution, but in our experience it's at least closer to being real-world compatible than what the RFC draft currently says.
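
To make the policy above concrete, here is a rough sketch of such a check. It is an illustration only, not the library's actual code: the function name `is_acceptable_field_value` and the regex are made up for this example.

```python
import re

# Hypothetical sketch of the lenient policy described above.
# Rejected bytes: NUL, CR (\r), VT (\v), FF (\f), LF (\n).
# Every other byte, including other control characters, passes through.
_FORBIDDEN = re.compile(rb"[\x00\r\x0b\x0c\n]")


def is_acceptable_field_value(value: bytes) -> bool:
    """Return True if the field value contains none of the rejected bytes."""
    return _FORBIDDEN.search(value) is None
```

With this check, a value like `b"_ga=GA1.2\x01abc"` is accepted, while anything containing a CR or LF (the bytes relevant to splitting attacks) is rejected.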