draft-ietf-httpbis-header-structure-00.txt   draft-ietf-httpbis-header-structure-latest.txt 
HTTP Working Group P-H. Kamp HTTP Working Group P-H. Kamp
Internet-Draft The Varnish Cache Project Internet-Draft The Varnish Cache Project
Intended status: Standards Track December 10, 2016 Intended status: Standards Track February 20, 2017
Expires: June 13, 2017 Expires: August 24, 2017
HTTP Header Common Structure HTTP Header Common Structure
draft-ietf-httpbis-header-structure-00 draft-ietf-httpbis-header-structure-latest
Abstract Abstract
An abstract data model for HTTP headers, "Common Structure", and a An abstract data model for HTTP headers, "Common Structure", and a
HTTP/1 serialization of it, generalized from current HTTP headers. HTTP/1 serialization of it, generalized from current HTTP headers.
Note to Readers Note to Readers
Discussion of this draft takes place on the HTTP working group Discussion of this draft takes place on the HTTP working group
mailing list (ietf-http-wg@w3.org), which is archived at mailing list (ietf-http-wg@w3.org), which is archived at
skipping to change at page 2, line 30 skipping to change at page 2, line 30
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/. Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
This Internet-Draft will expire on June 13, 2017. This Internet-Draft will expire on August 24, 2017.
Copyright Notice Copyright Notice
Copyright (c) 2016 IETF Trust and the persons identified as the Copyright (c) 2017 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of (http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as the Trust Legal Provisions and are provided without warranty as
skipping to change at page 6, line 10 skipping to change at page 5, line 22
representation, meant as the foundation on which all such representation, meant as the foundation on which all such
manifestations of the model can be built. manifestations of the model can be built.
Common Structure in ABNF: Common Structure in ABNF:
import token from RFC7230 import token from RFC7230
import DIGIT from RFC5234 import DIGIT from RFC5234
common-structure = 1* ( identifier dictionary ) common-structure = 1* ( identifier dictionary )
dictionary = * ( identifier value ) dictionary = * ( identifier [ value ] )
Key identifiers in dictionaries SHALL be unique, but semantically
overlapping key identifiers, for instance 'text/plain' and 'text/*',
are ok.
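As an illustration of the uniqueness rule, the following is a minimal
sketch and not part of the draft. The function name build_dictionary
and the choice of ValueError are hypothetical. It rejects a repeated
key identifier while accepting keys that merely overlap semantically.

```python
def build_dictionary(pairs):
    """Build a dictionary from (identifier, value) pairs.

    Hypothetical helper: raises ValueError when a key identifier
    repeats, as the uniqueness rule requires. Keys are compared as
    plain strings, so 'text/plain' and 'text/*' are distinct keys.
    """
    result = {}
    for key, value in pairs:
        if key in result:
            raise ValueError(f"duplicate dictionary key: {key!r}")
        result[key] = value
    return result
```

A value of None stands in here for an identifier that carries no
value, which the revised grammar permits.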
value = identifier / value = identifier /
integer /
number / number /
ascii_string / ascii-string /
unicode_string / unicode-string /
blob / blob /
timestamp / timestamp /
common-structure common-structure
Recursion is included as a way to support deep and more general
data structures, but its use is highly discouraged and where it is
used the depth of recursion SHALL always be explicitly limited.
identifier = token [ "/" token ] identifier = token [ "/" token ]
number = ["-"] 1*15 DIGIT integer = ["-"] 1*19 DIGIT
# XXX: Not sure how to do this in ABNF:
# XXX: A single "." allowed between any two digits
# The range is limited to ensure it can be
# correctly represented in IEEE754 64 bit
# binary floating point format.
ascii_string = * %x20-7e Integers SHALL be in the range +/- 2^63-1 (= +/- 9223372036854775807)
# This is a "safe" string in the sense that it
# contains no control characters or multi-byte
# sequences. If that is not fancy enough, use
# unicode_string.
unicode_string = * unicode_codepoint number = ["-"] DIGIT '.' 1*14DIGIT /
# XXX: Is there a place to import this from ? ["-"] 2DIGIT '.' 1*13DIGIT /
# Unrestricted unicode, because there is no sane ["-"] 3DIGIT '.' 1*12DIGIT /
# way to restrict or otherwise make unicode "safe". ... /
["-"] 12DIGIT '.' 1*3DIGIT /
["-"] 13DIGIT '.' 1*2DIGIT /
["-"] 14DIGIT '.' 1DIGIT
The limit of 15 significant digits is chosen so that numbers can be
correctly represented by IEEE754 64 bit binary floating point.
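The integer and number rules can be checked mechanically. The sketch
below is illustrative only. The names INTEGER_RE, NUMBER_RE, INT_MAX,
is_integer and is_number are hypothetical, and it assumes that the
alternatives elided by "..." in the number rule continue the visible
pattern: 1 to 14 digits before the ".", at least one digit after it,
and no more than 15 digits in total.

```python
import re

# ["-"] 1*19 DIGIT; ASCII digits only, so [0-9] rather than \d.
INTEGER_RE = re.compile(r"-?[0-9]{1,19}")
# ["-"] 1*14 DIGIT "." 1*DIGIT; the 15-digit total is checked below.
NUMBER_RE = re.compile(r"-?([0-9]{1,14})\.([0-9]+)")
INT_MAX = 2**63 - 1  # 9223372036854775807


def is_integer(text):
    """Return True if text matches the integer rule and range."""
    if INTEGER_RE.fullmatch(text) is None:
        return False
    return abs(int(text)) <= INT_MAX


def is_number(text):
    """Return True if text matches the number rule (assumed pattern)."""
    match = NUMBER_RE.fullmatch(text)
    if match is None:
        return False
    return len(match.group(1)) + len(match.group(2)) <= 15
```

The range check is needed in addition to the pattern because 19
digits can express values larger than 2^63-1.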
ascii-string = * %x20-7e
This is intended to be an efficient, "safe" and uncomplicated string
type, for uses where the string content is culturally neutral or
where it will not be user visible.
unicode-string = * UNICODE
UNICODE = <U+0000-U+D7FF / U+E000-U+10FFFF>
# UNICODE nicked from draft-seantek-unicode-in-abnf-02
Unicode-strings are unrestricted because there is no sane and/or
culturally neutral way to subset or otherwise make unicode "safe",
and Unicode is still evolving new and interesting code points.
Users of unicode-string SHALL be prepared for the full gamut of
glyph-gymnastics in order to avoid U+1F4A9 U+08 U+1F574.
blob = * %x00-ff blob = * %x00-ff
# Intended for cryptographic data and as a general
# escape mechanism for unmet requirements.
timestamp = POSIX time_t with optional millisecond resolution Blobs are intended primarily for cryptographic data, but can be used
# XXX: Is there a place to import this from ? for any otherwise unsatisfied needs.
timestamp = number
A timestamp counts seconds since the UNIX time_t epoch, including the
"invisible leap-seconds" misfeature.
3. HTTP/1 Serialization of HTTP Header Common Structure 3. HTTP/1 Serialization of HTTP Header Common Structure
In ABNF: In ABNF:
import OWS from RFC7230 import OWS from RFC7230
import HEXDIG, DQUOTE from RFC5234 import HEXDIG, DQUOTE from RFC5234
import UTF8-2, UTF8-3, UTF8-4 from RFC3629 import EmbeddedUnicodeChar from BCP137
h1_common-structure-header = h1-common-structure-header =
( field-name ":" OWS ">" h1_common_structure "<" ) h1-common-structure-legacy-header /
# Self-identifying HTTP headers h1-common-structure-self-identifying-header
( field-name ":" OWS h1_common_structure ) /
# legacy HTTP headers on white-list, see {{iana}}
h1_common_structure = h1_element * ("," h1_element) h1-common-structure-legacy-header =
field-name ":" OWS h1-common-structure
h1_element = identifier * (";" identifier ["=" h1_value]) Only white-listed legacy headers (see Section 8) can use this format.
h1_value = identifier / h1-common-structure-self-identifying-header =
field-name ":" OWS ">" h1-common-structure "<"
h1-common-structure = h1-element * ("," h1-element)
h1-element = identifier * (";" identifier ["=" h1-value])
h1-value = identifier /
integer /
number / number /
h1_ascii_string / h1-ascii-string /
h1_unicode_string / h1-unicode-string /
h1_blob / h1-blob /
h1_timestamp / h1-timestamp /
h1_common-structure ">" h1-common-structure "<"
h1_ascii_string = DQUOTE *( h1-ascii-string = DQUOTE *(
( "\" DQUOTE ) / ( "\" DQUOTE ) /
( "\" "\" ) / ( "\" "\" ) /
%x20-21 / %x20-21 /
%x23-5B / %x23-5B /
%x5D-7E %x5D-7E
) DQUOTE ) DQUOTE
# This is a proper subset of h1_unicode_string
# NB only allowed backslash escapes are \" and \\
h1_unicode_string = DQUOTE *( h1-unicode-string = DQUOTE *(
( "\" DQUOTE ) / ( "\" DQUOTE ) /
( "\" "\" ) / ( "\" "\" ) /
( "\" "u" 4*HEXDIG ) / EmbeddedUnicodeChar /
%x20-21 / %x20-21 /
%x23-5B / %x23-5B /
%x5D-7E / %x5D-7E
UTF8-2 /
UTF8-3 /
UTF8-4
) DQUOTE ) DQUOTE
# This is UTF8 with HTTP1 unfriendly codepoints
# (00-1f, 7f) neutered with \uXXXX escapes.
h1_blob = "'" base64 "'" The dim prospects of ever getting a majority of HTTP1 paths 8-bit
# XXX: where to import base64 from ? clean make UTF-8 unviable as an H1 serialization. Given that very
little of the information in HTTP headers is presented to users in
the first place, improving H1 and HPACK efficiency by inventing a
more efficient BCP137 compliant escape-sequences seems unwarranted.
h1_timestamp = number h1-blob = ":" base64 ":"
# UNIX/POSIX time_t semantics. # XXX: where to import base64 from ?
# fractional seconds allowed.
h1_common_structure = ">" h1_common_structure "<" h1-timestamp = number
XXX: Allow OWS in parsers, but not in generators ? XXX: Allow OWS in parsers, but not in generators ?
In programming environments which do not define a native In programming environments which do not define a native
representation or serialization of Common Structure, the HTTP/1 representation or serialization of Common Structure, the HTTP/1
serialization should be used. serialization should be used.
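The h1-ascii-string and h1-blob productions can be generated with a
few lines of code. This is a minimal sketch, not a reference
implementation. The function names are hypothetical, and because the
draft leaves the source of "base64" open, the sketch assumes the
standard RFC 4648 alphabet with padding as produced by Python's
base64.b64encode.

```python
import base64


def h1_ascii_string(text):
    """Serialize text as an h1-ascii-string.

    Only %x20-7e is allowed; the only escapes are \\" and \\\\.
    """
    for ch in text:
        if not 0x20 <= ord(ch) <= 0x7E:
            raise ValueError(f"not allowed in ascii-string: {ch!r}")
    # Escape backslashes first so the escapes added for quotes
    # are not escaped a second time.
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def h1_blob(data):
    """Serialize bytes as an h1-blob using the ":" delimiters."""
    # Assumption: standard base64 (RFC 4648) with padding.
    return ":" + base64.b64encode(data).decode("ascii") + ":"
```

For example, h1_blob(b"hello") yields ":aGVsbG8=:", and a string
containing a double quote is emitted with that quote preceded by a
backslash.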
4. When to use Common Structure Parser 4. When to use Common Structure Parser
All future standardized and all private HTTP headers using Common All future standardized and all private HTTP headers using Common
skipping to change at page 14, line 7 skipping to change at page 14, line 7
in the new field. in the new field.
The RFC723x headers listed in Appendix A.5 will get the value "False" The RFC723x headers listed in Appendix A.5 will get the value "False"
in the new field. in the new field.
All other existing entries in the registry will be set to "Unknown" All other existing entries in the registry will be set to "Unknown"
until and if the owner of the entry requests otherwise. until and if the owner of the entry requests otherwise.
9. Security Considerations 9. Security Considerations
TBD Unique dictionary keys are required to reduce the risk of smuggling
attacks.
10. Normative References 10. Normative References
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/ Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/
RFC2119, March 1997, RFC2119, March 1997,
<http://www.rfc-editor.org/info/rfc2119>. <http://www.rfc-editor.org/info/rfc2119>.
[RFC7230] Fielding, R., Ed. and J. Reschke, Ed., "Hypertext Transfer [RFC7230] Fielding, R., Ed. and J. Reschke, Ed., "Hypertext Transfer
Protocol (HTTP/1.1): Message Syntax and Routing", Protocol (HTTP/1.1): Message Syntax and Routing",
skipping to change at page 16, line 33 skipping to change at page 16, line 33
The majority of RFC723x HTTP headers are lists. A few of them are The majority of RFC723x HTTP headers are lists. A few of them are
ordered ('Content-Encoding'), some are unordered ('Connection') and ordered ('Content-Encoding'), some are unordered ('Connection') and
some are ordered by 'q=%f' weight parameters ('Accept'). some are ordered by 'q=%f' weight parameters ('Accept').
In most cases, the list elements are some kind of identifier, usually In most cases, the list elements are some kind of identifier, usually
derived from ABNF 'token' as defined by [RFC7230]. derived from ABNF 'token' as defined by [RFC7230].
A subgroup of headers, mostly related to MIME, uses what one could A subgroup of headers, mostly related to MIME, uses what one could
call a 'qualified token':: call a 'qualified token'::
qualified_token = token_or_asterix [ "/" token_or_asterix ] qualified-token = token-or-asterix [ "/" token-or-asterix ]
The second motif is parameterized list elements. The best known is The second motif is parameterized list elements. The best known is
the "q=0.5" weight parameter, but other parameters exist as well. the "q=0.5" weight parameter, but other parameters exist as well.
Generalizing from these motifs, our candidate "Common Structure" data Generalizing from these motifs, our candidate "Common Structure" data
model becomes an ordered list of named dictionaries. model becomes an ordered list of named dictionaries.
In pidgin ABNF, ignoring white-space for the sake of clarity, the In pidgin ABNF, ignoring white-space for the sake of clarity, the
HTTP/1.1 serialization of Common Structure is something like: HTTP/1.1 serialization of Common Structure is something like:
token_or_asterix = token from {{RFC7230}}, but also allowing "*" token-or-asterix = token from {{RFC7230}}, but also allowing "*"
qualified_token = token_or_asterix [ "/" token_or_asterix ] qualified-token = token-or-asterix [ "/" token-or-asterix ]
field-name, see {{RFC7230}} field-name, see {{RFC7230}}
Common_Structure_Header = field-name ":" 1#named_dictionary Common-Structure-Header = field-name ":" 1#named-dictionary
named_dictionary = qualified_token [ *(";" param) ] named-dictionary = qualified-token [ *(";" param) ]
param = token [ "=" value ] param = token [ "=" value ]
value = we'll get back to this in a moment. value = we'll get back to this in a moment.
Nineteen out of the RFC723x's 48 headers, almost 40%, can already be Nineteen out of the RFC723x's 48 headers, almost 40%, can already be
parsed using this definition, and none of the rest have requirements parsed using this definition, and none of the rest have requirements
which could not be met by this data model. See Appendix A.4 and which could not be met by this data model. See Appendix A.4 and
Appendix A.5 for the full survey details. Appendix A.5 for the full survey details.
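To make the pidgin ABNF concrete, the following sketch parses a field
value into an ordered list of named dictionaries. It is illustrative
only. The function name parse_named_dictionaries is hypothetical, and
the sketch assumes that values contain no quoted strings or nested
structures, so splitting on "," and ";" is safe. Parameter values are
kept as raw text, and a parameter without "=" maps to None.

```python
def parse_named_dictionaries(field_value):
    """Parse '1#named-dictionary' into [(name, {param: value})].

    Simplifying assumption: no quoted strings, so "," and ";"
    never occur inside a value.
    """
    result = []
    for element in field_value.split(","):
        parts = [part.strip() for part in element.split(";")]
        name = parts[0]
        if not name:
            raise ValueError("empty list element")
        params = {}
        for param in parts[1:]:
            key, sep, value = param.partition("=")
            key = key.strip()
            if key in params:
                raise ValueError(f"duplicate parameter: {key!r}")
            params[key] = value.strip() if sep else None
        result.append((name, params))
    return result
```

Applied to an Accept-style value such as
"text/html;q=0.9, text/*;q=0.5, */*", this yields three named
dictionaries in their original order, the last with no parameters.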
skipping to change at page 23, line 4 skipping to change at page 22, line 8
Warning [RFC7234, Section 5.5] Warning [RFC7234, Section 5.5]
1#warning-value 1#warning-value
Proxy-Authenticate [RFC7235, Section 4.3] Proxy-Authenticate [RFC7235, Section 4.3]
WWW-Authenticate [RFC7235, Section 4.1] WWW-Authenticate [RFC7235, Section 4.1]
1#challenge 1#challenge
Appendix B. Changes Appendix B. Changes
B.1. Since draft-ietf-httpbis-header-structure-00 B.1. Since draft-ietf-httpbis-header-structure-00
Added uniqueness requirement on dictionary keys.
Added signed 64bit integer type
Drop UTF8, and settle on BCP137::EmbeddedUnicodeChar for h1-unicode-
string
Change h1_blob delimiter to ":" as "'" is valid t_char
Author's Address Author's Address
Poul-Henning Kamp Poul-Henning Kamp
The Varnish Cache Project The Varnish Cache Project
Email: phk@varnish-cache.org Email: phk@varnish-cache.org
 End of changes. 35 change blocks. 
63 lines changed or deleted 104 lines changed or added
