Link: http://trac.tools.ietf.org/wg/httpbis/trac/ticket/30
Origin: http://lists.w3.org/Archives/Public/ietf-http-wg/2004JanMar/0050.html
Component: p1-messaging
Is LWS permitted between the field-name and colon?
The grammar of RFC 2616 suggests that it is, because ":" is a separator character, and thus the rule for implied LWS between a token and a separator applies.
The wording suggests otherwise, although it is not explicit:
Each header field consists of a name followed by a colon (":") and the field value. Field names are case-insensitive. The field value MAY be preceded by any amount of LWS, though a single SP is preferred.
The wording explicitly states that LWS is permitted after the colon, suggesting that the intention is that it is not permitted before the colon.
Many authors have taken that interpretation, with the result that most of the servers I looked at do not accept LWS before the colon. (They should probably reject the request, but all of them treat it as an unknown header name that includes a space in the name token.)
Apache now accepts LWS at that position, as does Mozilla.
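To make the two readings concrete, here is a minimal sketch of both parsers. It is an illustration only, not taken from any of the implementations mentioned; the names TOKEN, STRICT, LENIENT and parse_field are hypothetical, and the input is assumed to be a single unfolded header line with the CRLF already removed.

```python
import re

# token = 1*<any CHAR except CTLs or separators> (RFC 2616, section 2.2)
TOKEN = r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+"

# Strict reading: the colon must immediately follow the field-name.
STRICT = re.compile(r"(" + TOKEN + r"):[ \t]*(.*)")

# Lenient reading: implied LWS is allowed between the token and the ":" separator.
LENIENT = re.compile(r"(" + TOKEN + r")[ \t]*:[ \t]*(.*)")


def parse_field(line, pattern):
    """Return (lower-cased name, value), or None if the line does not match."""
    m = pattern.fullmatch(line)
    if m is None:
        return None
    # Field names are case-insensitive, so normalise the name.
    return m.group(1).lower(), m.group(2)


# The same line is a valid header under one reading and invalid under the other.
print(parse_field("Host : example.com", STRICT))
print(parse_field("Host : example.com", LENIENT))
```

A parser that simply splits on the first colon without validating the token behaves in a third way: it ends up with the name "Host " (trailing space included), which matches the "unknown header name" behaviour described above.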
What about LWS before the field-name?
At first sight, this doesn't make sense: LWS at the start of a line indicates folding. However, all implementations I looked at accept a line beginning with LWS immediately after the Request-Line or Status-Line. Some treat the initial LWS as part of the field-name (they don't enforce the limited character range of tokens); others skip the LWS.
Apache does not detect and skip LWS before the first field-name. Neither do Squid, thttpd or lighttpd. Mozilla and phttpd do.
Technically, the grammar disallows LWS before the field-name: implied LWS is implied only _between_ words and separators.
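The interaction with folding can be sketched as follows. This is a hypothetical helper (unfold is not from any implementation named above) that assumes the header block has already been split into lines. It applies one possible policy, rejecting LWS before the first field-name; the implementations surveyed above instead accept such a line in one way or another.

```python
def unfold(lines):
    """Join folded continuation lines onto the field they continue.

    Assumption for this sketch: a line starting with SP or HT is a
    continuation, and a continuation with no preceding field is an error.
    """
    fields = []
    for line in lines:
        if line[:1] in (" ", "\t"):
            if not fields:
                # Nothing to continue: this is LWS before the first field-name.
                raise ValueError("LWS before the first field-name")
            # Replace the fold with a single SP, as folding semantics allow.
            fields[-1] += " " + line.strip(" \t")
        else:
            fields.append(line)
    return fields


print(unfold(["Accept: text/html,", "  text/plain", "Host: example.com"]))
```

The point of the sketch is that a leading-LWS line has a defined meaning only when a field precedes it, which is why the first line after the Request-Line or Status-Line is the ambiguous case.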
Both of these inconsistencies between programs, and the fact that a lone CR is treated as LWS by some and not by others, lead to potential security holes arising from non-compliant messages that claim to be HTTP/1.1. Although it isn't the standard's role to state how a program should respond to every kind of invalid message, it would be good to clarify these points because they do have security implications (which was Apache's stated reason for their change).
My suggestion at the Vancouver meeting was to use
   BWS = LWS    ; bad whitespace that MUST NOT be sent
                ; but MUST be detected and removed by recipients
because failure to handle this consistently is a security hole.
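A rough sketch of what "detected and removed by recipients" could look like for the field-name/colon case follows. The function name remove_bws is hypothetical, and applying BWS only between the field-name and the colon is an assumption of this example, not something the suggestion above spells out.

```python
def remove_bws(line):
    """Return (clean_line, had_bws) for one unfolded header line.

    Assumption for this sketch: BWS is any run of SP or HT between the
    field-name and the colon. It is removed, and its presence is reported
    so that the caller can log or count non-compliant senders.
    """
    name, colon, value = line.partition(":")
    if not colon:
        raise ValueError("header line has no colon")
    clean_name = name.rstrip(" \t")
    return clean_name + colon + value, clean_name != name


print(remove_bws("Host : example.com"))
print(remove_bws("Host: example.com"))
```

Because every recipient that follows this rule ends up with the same field-name, intermediaries and origin servers would no longer disagree about which header a message carries, which is the consistency the security argument depends on.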