draft-ietf-httpapi-ratelimit-headers-00.unpg.txt   draft-ietf-httpapi-ratelimit-headers-latest.txt 
HTTPAPI Working Group R. Polli HTTPAPI Working Group R. Polli
Internet-Draft Team Digitale, Italian Government Internet-Draft Team Digitale, Italian Government
Intended status: Standards Track A. Martinez Intended status: Standards Track A. Martinez
Expires: June 21, 2021 Red Hat Expires: September 19, 2021 Red Hat
December 18, 2020 March 18, 2021
RateLimit Header Fields for HTTP RateLimit Header Fields for HTTP
draft-ietf-httpapi-ratelimit-headers-00 draft-ietf-httpapi-ratelimit-headers-latest
Abstract Abstract
This document defines the RateLimit-Limit, RateLimit-Remaining, This document defines the RateLimit-Limit, RateLimit-Remaining,
RateLimit-Reset fields for HTTP, thus allowing servers to publish RateLimit-Reset fields for HTTP, thus allowing servers to publish
current request quotas and clients to shape their request policy and current service limits and clients to shape their request policy and
avoid being throttled out. avoid being throttled out.
Note to Readers Note to Readers
_RFC EDITOR: please remove this section before publication_ _RFC EDITOR: please remove this section before publication_
Discussion of this draft takes place on the HTTP working group Discussion of this draft takes place on the HTTP working group
mailing list (httpapi@ietf.org), which is archived at mailing list (httpapi@ietf.org), which is archived at
https://lists.w3.org/Archives/Public/ietf-httpapi-wg/ [1]. https://lists.w3.org/Archives/Public/ietf-httpapi-wg/ [1].
skipping to change at line 44 skipping to change at page 1, line 45
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/. Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
This Internet-Draft will expire on June 21, 2021. This Internet-Draft will expire on September 19, 2021.
Copyright Notice Copyright Notice
Copyright (c) 2020 IETF Trust and the persons identified as the Copyright (c) 2021 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(https://trustee.ietf.org/license-info) in effect on the date of (https://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License. described in the Simplified BSD License.
Table of Contents Table of Contents
1. Introduction 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3
1.1. Rate-limiting and quotas 1.1. Rate-limiting and quotas . . . . . . . . . . . . . . . . 3
1.2. Current landscape of rate-limiting headers 1.2. Current landscape of rate-limiting headers . . . . . . . 4
1.2.1. Interoperability issues 1.2.1. Interoperability issues . . . . . . . . . . . . . . . 4
1.3. This proposal 1.3. This proposal . . . . . . . . . . . . . . . . . . . . . . 5
1.4. Goals 1.4. Goals . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.5. Notational Conventions 1.5. Notational Conventions . . . . . . . . . . . . . . . . . 6
2. Expressing rate-limit policies 2. Expressing rate-limit policies . . . . . . . . . . . . . . . 6
2.1. Time window 2.1. Time window . . . . . . . . . . . . . . . . . . . . . . . 6
2.2. Request quota 2.2. Service limit . . . . . . . . . . . . . . . . . . . . . . 6
2.3. Quota policy 2.3. Quota policy . . . . . . . . . . . . . . . . . . . . . . 7
3. Header Specifications 3. Header Specifications . . . . . . . . . . . . . . . . . . . . 8
3.1. RateLimit-Limit 3.1. RateLimit-Limit . . . . . . . . . . . . . . . . . . . . . 8
3.2. RateLimit-Remaining 3.2. RateLimit-Remaining . . . . . . . . . . . . . . . . . . . 9
3.3. RateLimit-Reset 3.3. RateLimit-Reset . . . . . . . . . . . . . . . . . . . . . 9
4. Providing RateLimit headers 4. Providing RateLimit fields . . . . . . . . . . . . . . . . . 10
5. Intermediaries 5. Intermediaries . . . . . . . . . . . . . . . . . . . . . . . 11
6. Caching 6. Caching . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
7. Receiving RateLimit headers 7. Receiving RateLimit fields . . . . . . . . . . . . . . . . . 11
8. Examples 8. Examples . . . . . . . . . . . . . . . . . . . . . . . . . . 12
8.1. Unparameterized responses 8.1. Unparameterized responses . . . . . . . . . . . . . . . . 12
8.1.1. Throttling informations in responses 8.1.1. Throttling information in responses . . . . . . . . . 12
8.1.2. Use in conjunction with custom headers 8.1.2. Use in conjunction with custom fields . . . . . . . . 13
8.1.3. Use for limiting concurrency 8.1.3. Use for limiting concurrency . . . . . . . . . . . . 14
8.1.4. Use in throttled responses 8.1.4. Use in throttled responses . . . . . . . . . . . . . 15
8.2. Parameterized responses 8.2. Parameterized responses . . . . . . . . . . . . . . . . . 15
8.2.1. Throttling window specified via parameter 8.2.1. Throttling window specified via parameter . . . . . . 15
8.2.2. Dynamic limits with parameterized windows 8.2.2. Dynamic limits with parameterized windows . . . . . . 16
8.2.3. Dynamic limits for pushing back and slowing down 8.2.3. Dynamic limits for pushing back and slowing down . . 16
8.3. Dynamic limits for pushing back with Retry-After and slow 8.3. Dynamic limits for pushing back with Retry-After and slow
down down . . . . . . . . . . . . . . . . . . . . . . . . . . 17
8.3.1. Missing Remaining informations 8.3.1. Missing Remaining information . . . . . . . . . . . . 18
8.3.2. Use with multiple windows 8.3.2. Use with multiple windows . . . . . . . . . . . . . . 19
9. Security Considerations 9. Security Considerations . . . . . . . . . . . . . . . . . . . 20
9.1. Throttling does not prevent clients from issuing requests 9.1. Throttling does not prevent clients from issuing requests 20
9.2. Information disclosure 9.2. Information disclosure . . . . . . . . . . . . . . . . . 20
9.3. Remaining quota-units are not granted requests 9.3. Remaining quota-units are not granted requests . . . . . 20
9.4. Reliability of RateLimit-Reset 9.4. Reliability of RateLimit-Reset . . . . . . . . . . . . . 20
9.5. Resource exhaustion 9.5. Resource exhaustion . . . . . . . . . . . . . . . . . . . 21
9.6. Denial of Service 9.6. Denial of Service . . . . . . . . . . . . . . . . . . . . 21
10. IANA Considerations 10. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 21
10.1. RateLimit-Limit Field Registration 10.1. RateLimit-Limit Field Registration . . . . . . . . . . . 21
10.2. RateLimit-Remaining Field Registration 10.2. RateLimit-Remaining Field Registration . . . . . . . . . 22
10.3. RateLimit-Reset Field Registration 10.3. RateLimit-Reset Field Registration . . . . . . . . . . . 22
11. References 11. References . . . . . . . . . . . . . . . . . . . . . . . . . 22
11.1. Normative References 11.1. Normative References . . . . . . . . . . . . . . . . . . 22
11.2. Informative References 11.2. Informative References . . . . . . . . . . . . . . . . . 23
11.3. URIs 11.3. URIs . . . . . . . . . . . . . . . . . . . . . . . . . . 23
Appendix A. Change Log Appendix A. Acknowledgements . . . . . . . . . . . . . . . . . . 24
Appendix B. Acknowledgements Appendix B. FAQ . . . . . . . . . . . . . . . . . . . . . . . . 24
Appendix C. RateLimit headers currently used on the web RateLimit fields currently used on the web . . . . . . . . . . . 27
Appendix D. FAQ Changes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
D.1. Since draft-ietf-httpapi-ratelimit-headers-00 . . . . . . 28
Authors' Addresses Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 28
1. Introduction 1. Introduction
The widespreading of HTTP as a distributed computation protocol The widespreading of HTTP as a distributed computation protocol
requires an explicit way of communicating service status and usage requires an explicit way of communicating service status and usage
quotas. quotas.
This was partially addressed with the "Retry-After" header field This was partially addressed by the "Retry-After" header field
defined in [SEMANTICS] to be returned in "429 Too Many Requests" or defined in [SEMANTICS] to be returned in "429 Too Many Requests" (see
"503 Service Unavailable" responses. [STATUS429]) or "503 Service Unavailable" responses.
Still, there is not a standard way to communicate service quotas so Still, there is not a standard way to communicate service quotas so
that the client can throttle its requests and prevent 4xx or 5xx that the client can throttle its requests and prevent 4xx or 5xx
responses. responses.
1.1. Rate-limiting and quotas 1.1. Rate-limiting and quotas
Servers use quota mechanisms to avoid systems overload, to ensure an Servers use quota mechanisms to avoid systems overload, to ensure an
equitable distribution of computational resources or to enforce other equitable distribution of computational resources or to enforce other
policies - eg. monetization. policies - eg. monetization.
skipping to change at line 189 skipping to change at page 5, line 5
1.2.1. Interoperability issues 1.2.1. Interoperability issues
A major interoperability issue in throttling is the lack of standard A major interoperability issue in throttling is the lack of standard
headers, because: headers, because:
o each implementation associates different semantics to the same o each implementation associates different semantics to the same
header field names; header field names;
o header field names proliferates. o header field names proliferates.
Client applications interfacing with different servers may thus need User Agents interfacing with different servers may thus need to
to process different headers, or the very same application interface process different headers, or the very same application interface
that sits behind different reverse proxies may reply with different that sits behind different reverse proxies may reply with different
throttling headers. throttling headers.
1.3. This proposal 1.3. This proposal
This proposal defines syntax and semantics for the following fields: This proposal defines syntax and semantics for the following fields:
o "RateLimit-Limit": containing the requests quota in the time o "RateLimit-Limit": containing the requests quota in the time
window; window;
o "RateLimit-Remaining": containing the remaining requests quota in o "RateLimit-Remaining": containing the remaining requests quota in
the current window; the current window;
o "RateLimit-Reset": containing the time remaining in the current o "RateLimit-Reset": containing the time remaining in the current
window, specified in seconds. window, specified in seconds.
The behavior of "RateLimit-Reset" is compatible with the "delta- The behavior of "RateLimit-Reset" is compatible with the "delay-
seconds" notation of "Retry-After". seconds" notation of "Retry-After".
The fields definition allows to describe complex policies, including The fields definition allows to describe complex policies, including
the ones using multiple and variable time windows and dynamic quotas, the ones using multiple and variable time windows and dynamic quotas,
or implementing concurrency limits. or implementing concurrency limits.
1.4. Goals 1.4. Goals
The goals of this proposal are: The goals of this proposal are:
1. Standardizing the names and semantic of rate-limit headers; 1. Standardizing the names and semantic of rate-limit headers;
2. Improve resiliency of HTTP infrastructures simplifying the 2. Improve resiliency of HTTP infrastructures simplifying the
enforcement and the adoption of rate-limit headers; enforcement and the adoption of rate-limit headers;
3. Simplify API documentation avoiding expliciting rate-limit fields 3. Simplify API documentation avoiding expliciting rate-limit fields
semantic in documentation. semantic in documentation.
The goals do not include: The goals do not include:
Authorization: The rate-limit headers described here are not meant Authorization: The rate-limit fields described here are not meant to
to support authorization or other kinds of access controls. support authorization or other kinds of access controls.
Throttling scope: This specification does not cover the throttling Throttling scope: This specification does not cover the throttling
scope, that may be the given resource-target, its parent path or scope, that may be the given resource-target, its parent path or
the whole Origin [RFC6454] section 7. the whole Origin (see Section 7 of [RFC6454]).
Response status code: The rate-limit headers may be returned in both Response status code: The rate-limit fields may be returned in both
Successful and non Successful responses. This specification does Successful and non Successful responses. This specification does
not cover whether non Successful responses count on quota usage. not cover whether non Successful responses count on quota usage.
Throttling policy: This specification does not mandate a specific Throttling policy: This specification does not mandate a specific
throttling policy. The values published in the headers, including throttling policy. The values published in the fields, including
the window size, can be statically or dynamically evaluated. the window size, can be statically or dynamically evaluated.
Service Level Agreement: Conveyed quota hints do not imply any Service Level Agreement: Conveyed quota hints do not imply any
service guarantee. Server is free to throttle respectful clients service guarantee. Server is free to throttle respectful clients
under certain circumstances. under certain circumstances.
1.5. Notational Conventions 1.5. Notational Conventions
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
"OPTIONAL" in this document are to be interpreted as described in "OPTIONAL" in this document are to be interpreted as described in
BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all
capitals, as shown here. capitals, as shown here.
This document uses the Augmented BNF defined in [RFC5234] and updated This document uses the Augmented BNF defined in [RFC5234] and updated
by [RFC7405] along with the "#rule" extension defined in Section 7 of by [RFC7405] along with the "#rule" extension defined in
[MESSAGING]. Section 5.6.1 of [SEMANTICS].
The term Origin is to be interpreted as described in [RFC6454] The term Origin is to be interpreted as described in Section 7 of
section 7. [RFC6454].
The "delta-seconds" rule is defined in [CACHING] section 1.2.1. The "delay-seconds" rule is defined in Section 10.2.4 of [SEMANTICS].
2. Expressing rate-limit policies 2. Expressing rate-limit policies
2.1. Time window 2.1. Time window
Rate limit policies limit the number of acceptable requests in a Rate limit policies limit the number of acceptable requests in a
given time window. given time window.
A time window is expressed in seconds, using the following syntax: A time window is expressed in seconds, using the following syntax:
time-window = delta-seconds time-window = delay-seconds
Subsecond precision is not supported. Subsecond precision is not supported.
2.2. Request quota 2.2. Service limit
The request-quota is a value associated to the maximum number of The service-limit is a value associated to the maximum number of
requests that the server is willing to accept from one or more requests that the server is willing to accept from one or more
clients on a given basis (originating IP, authenticated user, clients on a given basis (originating IP, authenticated user,
geographical, ..) during a "time-window" as defined in Section 2.1. geographical, ..) during a "time-window" as defined in Section 2.1.
The "request-quota" is expressed in "quota-units" and has the The "service-limit" is expressed in "quota-units" and has the
following syntax: following syntax:
request-quota = quota-units service-limit = quota-units
quota-units = 1*DIGIT quota-units = 1*DIGIT
The "request-quota" SHOULD match the maximum number of acceptable The "service-limit" SHOULD match the maximum number of acceptable
requests. requests.
The "request-quota" MAY differ from the total number of acceptable The "service-limit" MAY differ from the total number of acceptable
requests when weight mechanisms, bursts, or other server policies are requests when weight mechanisms, bursts, or other server policies are
implemented. implemented.
If the "request-quota" does not match the maximum number of If the "service-limit" does not match the maximum number of
acceptable requests the relation with that SHOULD be communicated acceptable requests the relation with that SHOULD be communicated
out-of-band. out-of-band.
Example: A server could Example: A server could
o count once requests like "/books/{id}" o count once requests like "/books/{id}"
o count twice search requests like "/books?author=Camilleri" o count twice search requests like "/books?author=Camilleri"
so that we have the following counters so that we have the following counters
GET /books/123 ; request-quota=4, remaining: 3, status=200 GET /books/123 ; service-limit=4, remaining: 3, status=200
GET /books?author=Camilleri ; request-quota=4, remaining: 1, status=200 GET /books?author=Camilleri ; service-limit=4, remaining: 1, status=200
GET /books?author=Eco ; request-quota=4, remaining: 0, status=429 GET /books?author=Eco ; service-limit=4, remaining: 0, status=429
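The counters above can be reproduced with a small sketch. It is not part of either draft revision: the class name, the cost rule (any request carrying a query string costs two quota-units) and the choice to clamp the counter at zero on a throttled request are assumptions made only to match the three example lines.

```python
from urllib.parse import urlsplit


def cost(target: str) -> int:
    """Hypothetical weights: search requests (with a query) cost 2, others 1."""
    return 2 if urlsplit(target).query else 1


class WeightedQuota:
    """Tracks quota-units for one client inside a single time-window."""

    def __init__(self, service_limit: int) -> None:
        self.service_limit = service_limit
        self.remaining = service_limit

    def handle(self, target: str) -> tuple:
        """Return (remaining, status) after accounting for the request."""
        units = cost(target)
        # Throttle when the request costs more than what is left.
        status = 200 if units <= self.remaining else 429
        # Assumption: the counter is clamped at zero, as in the example.
        self.remaining = max(0, self.remaining - units)
        return self.remaining, status


quota = WeightedQuota(service_limit=4)
for target in ("/books/123", "/books?author=Camilleri", "/books?author=Eco"):
    remaining, status = quota.handle(target)
    print(f"GET {target} ; remaining: {remaining}, status={status}")
```

With these weights the third request is throttled even though one quota-unit was still available, which is why the limit and the number of acceptable requests can differ.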
2.3. Quota policy 2.3. Quota policy
This specification allows describing a quota policy with the This specification allows describing a quota policy with the
following syntax: following syntax:
quota-policy = request-quota; "w" "=" time-window quota-policy = service-limit; "w" "=" time-window
*( OWS ";" OWS quota-comment) *( OWS ";" OWS quota-comment)
quota-comment = token "=" (token / quoted-string) quota-comment = token "=" (token / quoted-string)
quota-policy parameters like "w" and quota-comment tokens MUST NOT quota-policy parameters like "w" and quota-comment tokens MUST NOT
occur multiple times within the same quota-policy. occur multiple times within the same quota-policy.
An example policy of 100 quota-units per minute. An example policy of 100 quota-units per minute.
100;w=60 100;w=60
skipping to change at line 341 skipping to change at page 8, line 14
100;w=60;comment="fixed window" 100;w=60;comment="fixed window"
12;w=1;burst=1000;policy="leaky bucket" 12;w=1;burst=1000;policy="leaky bucket"
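A minimal parsing sketch for a single quota-policy item such as the two examples above follows. It is an illustration under stated assumptions, not a reference parser: the function name and the returned dictionary layout are hypothetical, and parameter names are matched with a simplified character set rather than the full HTTP "token" rule.

```python
import re

# Simplified parameter pattern: name "=" (quoted-string / token-like value).
_PARAM = re.compile(r'\s*;\s*([A-Za-z0-9_-]+)=("(?:[^"\\]|\\.)*"|[^\s;,"]+)')


def parse_quota_policy(text: str) -> dict:
    """Parse one quota-policy item, e.g. '100;w=60;comment="fixed window"'."""
    text = text.strip()
    head = re.match(r'\d+', text)
    if head is None:
        raise ValueError("quota-policy must start with a limit (1*DIGIT)")
    params = {}
    pos = head.end()
    while pos < len(text):
        match = _PARAM.match(text, pos)
        if match is None:
            raise ValueError(f"malformed parameter at offset {pos}")
        name, value = match.group(1), match.group(2)
        # Parameters MUST NOT occur multiple times within the same policy.
        if name in params:
            raise ValueError(f"parameter {name!r} occurs more than once")
        if value.startswith('"'):
            # Strip the quotes and undo backslash escapes.
            value = re.sub(r'\\(.)', r'\1', value[1:-1])
        params[name] = value
        pos = match.end()
    if "w" not in params:
        raise ValueError("quota-policy is missing the w parameter")
    window = int(params.pop("w"))
    return {"limit": int(head.group()), "window": window, "comments": params}


print(parse_quota_policy('12;w=1;burst=1000;policy="leaky bucket"'))
```

Comment values are kept as strings because the grammar gives them no numeric meaning; only the limit and "w" are converted to integers.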
3. Header Specifications 3. Header Specifications
The following "RateLimit" response fields are defined The following "RateLimit" response fields are defined
3.1. RateLimit-Limit 3.1. RateLimit-Limit
The "RateLimit-Limit" response field indicates the "request-quota" The "RateLimit-Limit" response field indicates the "service-limit"
associated to the client in the current "time-window". associated to the client in the current "time-window".
If the client exceeds that limit, it MAY not be served. If the client exceeds that limit, it MAY not be served.
The header value is The field value is
RateLimit-Limit = expiring-limit [, 1#quota-policy ] RateLimit-Limit = expiring-limit [, 1#quota-policy ]
expiring-limit = request-quota expiring-limit = service-limit
The "expiring-limit" value MUST be set to the "request-quota" that is The "expiring-limit" value MUST be set to the "service-limit" that is
closer to reach its limit. closer to reach its limit.
The "quota-policy" is defined in Section 2.3, and its values are The "quota-policy" is defined in Section 2.3, and its values are
informative. informative.
RateLimit-Limit: 100 RateLimit-Limit: 100
A "time-window" associated to "expiring-limit" can be communicated A "time-window" associated to "expiring-limit" can be communicated
via an optional "quota-policy" value, like shown in the following via an optional "quota-policy" value, like shown in the following
example example
skipping to change at line 379 skipping to change at page 9, line 5
reset, or reset, or
o communicated out-of-band (eg. in the documentation). o communicated out-of-band (eg. in the documentation).
Policies using multiple quota limits MAY be returned using multiple Policies using multiple quota limits MAY be returned using multiple
"quota-policy" items, like shown in the following two examples: "quota-policy" items, like shown in the following two examples:
RateLimit-Limit: 10, 10;w=1, 50;w=60, 1000;w=3600, 5000;w=86400 RateLimit-Limit: 10, 10;w=1, 50;w=60, 1000;w=3600, 5000;w=86400
RateLimit-Limit: 10, 10;w=1;burst=1000, 1000;w=3600 RateLimit-Limit: 10, 10;w=1;burst=1000, 1000;w=3600
This header MUST NOT occur multiple times and can be sent in a This field MUST NOT occur multiple times and can be sent in a trailer
trailer section. section.
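As a rough illustration of how a client might read a RateLimit-Limit value like the two examples above, the sketch below splits the field into the expiring-limit and a list of (limit, window) pairs. The function name is hypothetical, and the splitting is deliberately naive: it assumes no comma or semicolon appears inside a quoted-string comment.

```python
def parse_ratelimit_limit(value: str):
    """Return (expiring_limit, [(limit, window_seconds), ...])."""
    items = [item.strip() for item in value.split(",")]
    # The first list member is the expiring-limit.
    expiring_limit = int(items[0])
    policies = []
    for item in items[1:]:
        limit, *params = [part.strip() for part in item.split(";")]
        pairs = dict(param.split("=", 1) for param in params)
        # Only "w" is interpreted here; other parameters are informative.
        policies.append((int(limit), int(pairs["w"])))
    return expiring_limit, policies


print(parse_ratelimit_limit("10, 10;w=1;burst=1000, 1000;w=3600"))
```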
3.2. RateLimit-Remaining 3.2. RateLimit-Remaining
The "RateLimit-Remaining" response field indicates the remaining The "RateLimit-Remaining" response field indicates the remaining
"quota-units" defined in Section 2.2 associated to the client. "quota-units" defined in Section 2.2 associated to the client.
The header value is The field value is
RateLimit-Remaining = quota-units RateLimit-Remaining = quota-units
This header MUST NOT occur multiple times and can be sent in a This field MUST NOT occur multiple times and can be sent in a trailer
trailer section. section.
Clients MUST NOT assume that a positive "RateLimit-Remaining" value Clients MUST NOT assume that a positive "RateLimit-Remaining" value
is a guarantee of being served. is a guarantee that further requests will be served.
A low "RateLimit-Remaining" value is like a yellow traffic-light: the A low "RateLimit-Remaining" value is like a yellow traffic-light for
red light may arrive suddenly. either the number of requests issued in the "time-window" or the
request throughput: the red light may arrive suddenly (see
Section 4).
One example of "RateLimit-Remaining" use is below. One example of "RateLimit-Remaining" use is below.
RateLimit-Remaining: 50 RateLimit-Remaining: 50
3.3. RateLimit-Reset 3.3. RateLimit-Reset
The "RateLimit-Reset" response field indicates either The "RateLimit-Reset" response field indicates either
o the number of seconds until the quota resets. o the number of seconds until the quota resets.
The header value is The field value is
RateLimit-Reset = delta-seconds RateLimit-Reset = delay-seconds
The delta-seconds format is used because: The delay-seconds format is used because:
o it does not rely on clock synchronization and is resilient to o it does not rely on clock synchronization and is resilient to
clock adjustment and clock skew between client and server (see clock adjustment and clock skew between client and server (see
[SEMANTICS] Section 4.1.1.1); Section 5.6.7 of [SEMANTICS]);
o it mitigates the risk related to thundering herd when too many o it mitigates the risk related to thundering herd when too many
clients are serviced with the same timestamp. clients are serviced with the same timestamp.
This header MUST NOT occur multiple times and can be sent in a This field MUST NOT occur multiple times and can be sent in a trailer
trailer section. section.
An example of "RateLimit-Reset" use is below. An example of "RateLimit-Reset" use is below.
RateLimit-Reset: 50 RateLimit-Reset: 50
The client MUST NOT assume that all its "request-quota" will be The client MUST NOT assume that all its "service-limit" will be
restored after the moment referenced by "RateLimit-Reset". The restored after the moment referenced by "RateLimit-Reset". The
server MAY arbitrarily alter the "RateLimit-Reset" value between server MAY arbitrarily alter the "RateLimit-Reset" value between
subsequent requests eg. in case of resource saturation or to subsequent requests eg. in case of resource saturation or to
implement sliding window policies. implement sliding window policies.
4. Providing RateLimit headers 4. Providing RateLimit fields
A server MAY use one or more "RateLimit" response fields defined in A server MAY use one or more "RateLimit" response fields defined in
this document to communicate its quota policies. this document to communicate its quota policies.
The returned values refers to the metrics used to evaluate if the The returned values refers to the metrics used to evaluate if the
current request respects the quota policy and MAY not apply to current request respects the quota policy and MAY not apply to
subsequent requests. subsequent requests.
Example: a successful response with the following fields Example: a successful response with the following fields
skipping to change at line 463 skipping to change at page 10, line 42
example from Section 2.2. example from Section 2.2.
A server MAY return "RateLimit" response fields independently of the A server MAY return "RateLimit" response fields independently of the
response status code. This includes throttled responses. response status code. This includes throttled responses.
If a response contains both the "Retry-After" and the "RateLimit- If a response contains both the "Retry-After" and the "RateLimit-
Reset" fields, the value of "RateLimit-Reset" SHOULD reference the Reset" fields, the value of "RateLimit-Reset" SHOULD reference the
same point in time as "Retry-After". same point in time as "Retry-After".
When using a policy involving more than one "time-window", the server When using a policy involving more than one "time-window", the server
MUST reply with the "RateLimit" headers related to the window with MUST reply with the "RateLimit" fields related to the window with the
the lower "RateLimit-Remaining" values. lower "RateLimit-Remaining" values.
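One possible server-side reading of the multiple-window rule is sketched below. The Window type, its attribute names and the tie-breaking behavior (the first window listed wins) are assumptions for illustration; neither draft revision prescribes a data model.

```python
from dataclasses import dataclass


@dataclass
class Window:
    """Hypothetical per-window counters kept by the server."""
    limit: int      # service-limit for this window
    remaining: int  # quota-units left in this window
    reset: int      # seconds until this window resets


def ratelimit_fields(windows):
    """Pick the window with the lowest remaining quota and render its fields."""
    tightest = min(windows, key=lambda window: window.remaining)
    return {
        "RateLimit-Limit": str(tightest.limit),
        "RateLimit-Remaining": str(tightest.remaining),
        "RateLimit-Reset": str(tightest.reset),
    }


print(ratelimit_fields([Window(10, 9, 1), Window(50, 3, 42)]))
```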
Under certain conditions, a server MAY artificially lower "RateLimit" Under certain conditions, a server MAY artificially lower "RateLimit"
field values between subsequent requests, eg. to respond to Denial of field values between subsequent requests, eg. to respond to Denial of
Service attacks or in case of resource saturation. Service attacks or in case of resource saturation.
Servers usually establish whether the request is in-quota before Servers usually establish whether the request is in-quota before
creating a response, so the RateLimit field values should be already creating a response, so the RateLimit field values should be already
available in that moment. Nonetheless servers MAY decide to send the available in that moment. Nonetheless servers MAY decide to send the
"RateLimit" fields in a trailer section. "RateLimit" fields in a trailer section.
5. Intermediaries 5. Intermediaries
This section documents the considerations advised in Section 15.3.3 This section documents the considerations advised in Section 16.3.3
of [SEMANTICS]. of [SEMANTICS].
An intermediary that is not part of the originating service An intermediary that is not part of the originating service
infrastructure and is not aware of the quota-policy semantic used by infrastructure and is not aware of the quota-policy semantic used by
the Origin Server SHOULD NOT alter the RateLimit fields' values in the Origin Server SHOULD NOT alter the RateLimit fields' values in
such a way as to communicate a more permissive quota-policy; this such a way as to communicate a more permissive quota-policy; this
includes removing the RateLimit fields. includes removing the RateLimit fields.
An intermediary MAY alter the RateLimit fields in such a way as to An intermediary MAY alter the RateLimit fields in such a way as to
communicate a more restrictive quota-policy when: communicate a more restrictive quota-policy when:
skipping to change at line 509 skipping to change at page 11, line 39
This specification does not mandate any behavior on intermediaries This specification does not mandate any behavior on intermediaries
respect to retries, nor requires that intermediaries have any role in respect to retries, nor requires that intermediaries have any role in
respecting quota-policies. For example, it is legitimate for a proxy respecting quota-policies. For example, it is legitimate for a proxy
to retransmit a request without notifying the client, and thus to retransmit a request without notifying the client, and thus
consuming quota-units. consuming quota-units.
6. Caching 6. Caching
As is the ordinary case for HTTP caching ([RFC7234]), a response with As is the ordinary case for HTTP caching ([RFC7234]), a response with
RateLimit fields might be cached and re-used for subsequent requests. RateLimit fields might be cached and re-used for subsequent requests.
A cached RateLimit response, does not modify quota counters but could A cached "RateLimit" response does not modify quota counters but
contain stale information. Clients interested in determining the could contain stale information. Clients interested in determining
freshness of the RateLimit fields could rely on fields such as "Date" the freshness of the "RateLimit" fields could rely on fields such as
and on the "window" value of a "quota-policy". "Date" and on the "time-window" of a "quota-policy".
7. Receiving RateLimit headers 7. Receiving RateLimit fields
A client MUST process the received "RateLimit" headers. A client MUST process the received "RateLimit" fields.
A client MUST validate the values received in the "RateLimit" headers A client MUST validate the values received in the "RateLimit" fields
before using them and check if there are significant discrepancies before using them and check if there are significant discrepancies
with the expected ones. This includes a "RateLimit-Reset" moment too with the expected ones. This includes a "RateLimit-Reset" moment too
far in the future or a "request-quota" too high. far in the future or a "service-limit" too high.
Malformed "RateLimit" headers MAY be ignored. Malformed "RateLimit" fields MAY be ignored.
A client SHOULD NOT exceed the "quota-units" expressed in "RateLimit- A client SHOULD NOT exceed the "quota-units" expressed in "RateLimit-
Remaining" before the "time-window" expressed in "RateLimit-Reset". Remaining" before the "time-window" expressed in "RateLimit-Reset".
A client MAY still probe the server if the "RateLimit-Reset" is A client MAY still probe the server if the "RateLimit-Reset" is
considered too high. considered too high.
The value of "RateLimit-Reset" is generated at response time: a The value of "RateLimit-Reset" is generated at response time: a
client aware of a significant network latency MAY behave accordingly client aware of a significant network latency MAY behave accordingly
and use other informations (eg. the "Date" response header, or and use other information (eg. the "Date" response header field, or
otherwise gathered metrics) to better estimate the "RateLimit-Reset" otherwise gathered metrics) to better estimate the "RateLimit-Reset"
moment intended by the server. moment intended by the server.
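One way a client might combine "Date" with "RateLimit-Reset" is
sketched below. It is only an illustration: it assumes the client
clock is reasonably close to the server clock, and the function names
"reset_moment" and "seconds_until_reset" are hypothetical, not part of
this specification.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def reset_moment(date_field: str, reset_seconds: int) -> datetime:
    """Moment at which the server expects the window to reset."""
    # "Date" marks when the response was generated, which is also when
    # the RateLimit-Reset delay started counting.
    return parsedate_to_datetime(date_field) + timedelta(seconds=reset_seconds)


def seconds_until_reset(date_field: str, reset_seconds: int, now: datetime) -> float:
    """Seconds left before the reset moment, never negative."""
    left = (reset_moment(date_field, reset_seconds) - now).total_seconds()
    return max(0.0, left)


# Field values taken from the throttled response in Section 8.1.4;
# the receive time below is a hypothetical value for illustration.
received_at = datetime(2019, 8, 5, 9, 27, 2, tzinfo=timezone.utc)
print(seconds_until_reset("Mon, 05 Aug 2019 09:27:00 GMT", 5, received_at))
```

A client without a trustworthy clock could instead measure the
request round-trip time with a monotonic timer and subtract part of
it from "RateLimit-Reset".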
The "quota-policy" values and comments provided in "RateLimit-Limit" The "quota-policy" values and comments provided in "RateLimit-Limit"
are informative and MAY be ignored. are informative and MAY be ignored.
If a response contains both the "RateLimit-Reset" and "Retry-After" If a response contains both the "RateLimit-Reset" and "Retry-After"
fields, the "Retry-After" header field MUST take precedence and the fields, "Retry-After" MUST take precedence and "RateLimit-Reset" MAY
"RateLimit-Reset" field MAY be ignored. be ignored.
This specification does not mandate a specific throttling behavior
and implementers can adopt their preferred policies, including:
o slowing down or preemptively backoff their request rate when
approaching quota limits;
o consuming all the quota according to the exposed limits and then
wait.
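The receiving rules of this section can be sketched as follows. This
is a minimal illustration, not a reference implementation: the
function names, the lower-cased "headers" mapping and the
"MAX_RESET_SECONDS" bound are assumptions, and the policy shown is
the second one listed above (consume the quota, then wait).

```python
from email.utils import parsedate_to_datetime
from typing import Mapping, Optional

# Hypothetical sanity bound: the draft asks clients to reject a reset
# "too far in the future" but does not define a threshold.
MAX_RESET_SECONDS = 86400


def _non_negative_int(value: Optional[str]) -> Optional[int]:
    """Return value as a non-negative integer, or None if absent or malformed."""
    if value is None:
        return None
    value = value.strip()
    if not (value.isascii() and value.isdigit()):
        return None
    return int(value)


def _retry_after(headers: Mapping[str, str]) -> Optional[int]:
    """Retry-After as seconds, accepting delay-seconds or an HTTP-date."""
    raw = headers.get("retry-after")
    delay = _non_negative_int(raw)
    if delay is not None or raw is None:
        return delay
    try:
        # An HTTP-date is measured against the response's own Date field.
        moment = parsedate_to_datetime(raw)
        sent = parsedate_to_datetime(headers["date"])
        return max(0, int((moment - sent).total_seconds()))
    except (KeyError, TypeError, ValueError):
        return None


def seconds_to_wait(headers: Mapping[str, str]) -> int:
    """Suggest how long to pause before sending the next request."""
    # Retry-After takes precedence over RateLimit-Reset.
    retry_after = _retry_after(headers)
    if retry_after is not None:
        return retry_after

    remaining = _non_negative_int(headers.get("ratelimit-remaining"))
    reset = _non_negative_int(headers.get("ratelimit-reset"))

    # Malformed or absent fields are ignored.
    if remaining is None or reset is None:
        return 0
    # An implausibly distant reset is not trusted; the client may probe.
    if reset > MAX_RESET_SECONDS:
        return 0
    # Quota left: keep going. Quota exhausted: wait for the reset.
    return reset if remaining == 0 else 0
```

With the fields of the first example in Section 8.1.1
("RateLimit-Remaining: 0", "RateLimit-Reset: 50") the sketch suggests
a 50 second pause.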
8. Examples 8. Examples
8.1. Unparameterized responses 8.1. Unparameterized responses
8.1.1. Throttling informations in responses 8.1.1. Throttling information in responses
The client exhausted its request-quota for the next 50 seconds. The The client exhausted its service-limit for the next 50 seconds. The
"time-window" is communicated out-of-band or inferred by the header "time-window" is communicated out-of-band or inferred by the field
values. values.
Request: Request:
GET /items/123 GET /items/123 HTTP/1.1
Host: api.example
Response: Response:
HTTP/1.1 200 Ok HTTP/1.1 200 Ok
Content-Type: application/json Content-Type: application/json
RateLimit-Limit: 100 RateLimit-Limit: 100
Ratelimit-Remaining: 0 Ratelimit-Remaining: 0
Ratelimit-Reset: 50 Ratelimit-Reset: 50
{"hello": "world"} {"hello": "world"}
8.1.2. Use in conjunction with custom headers 8.1.2. Use in conjunction with custom fields
The server uses two custom headers, namely "acme-RateLimit-DayLimit" The server uses two custom fields, namely "acme-RateLimit-DayLimit"
and "acme-RateLimit-HourLimit" to expose the following policy: and "acme-RateLimit-HourLimit" to expose the following policy:
o 5000 daily quota-units; o 5000 daily quota-units;
o 1000 hourly quota-units. o 1000 hourly quota-units.
The client consumed 4900 quota-units in the first 14 hours. The client consumed 4900 quota-units in the first 14 hours.
Despite the next hourly limit of 1000 quota-units, the closest limit Despite the next hourly limit of 1000 quota-units, the closest limit
to reach is the daily one. to reach is the daily one.
The server then exposes the "RateLimit-*" headers to inform the The server then exposes the "RateLimit-*" fields to inform the client
client that: that:
o it has only 100 quota-units left; o it has only 100 quota-units left;
o the window will reset in 10 hours. o the window will reset in 10 hours.
Request: Request:
GET /items/123 GET /items/123 HTTP/1.1
Host: api.example
Response: Response:
HTTP/1.1 200 Ok HTTP/1.1 200 Ok
Content-Type: application/json Content-Type: application/json
acme-RateLimit-DayLimit: 5000 acme-RateLimit-DayLimit: 5000
acme-RateLimit-HourLimit: 1000 acme-RateLimit-HourLimit: 1000
RateLimit-Limit: 5000 RateLimit-Limit: 5000
RateLimit-Remaining: 100 RateLimit-Remaining: 100
RateLimit-Reset: 36000 RateLimit-Reset: 36000
{"hello": "world"} {"hello": "world"}
8.1.3. Use for limiting concurrency 8.1.3. Use for limiting concurrency
Throttling headers may be used to limit concurrency, advertising Throttling fields may be used to limit concurrency, advertising
limits that are lower than the usual ones in case of saturation, thus limits that are lower than the usual ones in case of saturation, thus
increasing availability. increasing availability.
The server adopted a basic policy of 100 quota-units per minute, and The server adopted a basic policy of 100 quota-units per minute, and
in case of resource exhaustion adapts the returned values reducing in case of resource exhaustion adapts the returned values reducing
both "RateLimit-Limit" and "RateLimit-Remaining". both "RateLimit-Limit" and "RateLimit-Remaining".
After 2 seconds the client consumed 40 quota-units. After 2 seconds the client consumed 40 quota-units.
Request: Request:
GET /items/123 GET /items/123 HTTP/1.1
Host: api.example
Response: Response:
HTTP/1.1 200 Ok HTTP/1.1 200 Ok
Content-Type: application/json Content-Type: application/json
RateLimit-Limit: 100 RateLimit-Limit: 100
RateLimit-Remaining: 60 RateLimit-Remaining: 60
RateLimit-Reset: 58 RateLimit-Reset: 58
{"elapsed": 2, "issued": 40} {"elapsed": 2, "issued": 40}
At the subsequent request - due to resource exhaustion - the server At the subsequent request - due to resource exhaustion - the server
advertises only "RateLimit-Remaining: 20". advertises only "RateLimit-Remaining: 20".
Request: Request:
GET /items/123 GET /items/123 HTTP/1.1
Host: api.example
Response: Response:
HTTP/1.1 200 Ok HTTP/1.1 200 Ok
Content-Type: application/json Content-Type: application/json
RateLimit-Limit: 100 RateLimit-Limit: 100
RateLimit-Remaining: 20 RateLimit-Remaining: 20
RateLimit-Reset: 56 RateLimit-Reset: 56
{"elapsed": 4, "issued": 41} {"elapsed": 4, "issued": 41}
8.1.4. Use in throttled responses 8.1.4. Use in throttled responses
A client exhausted its quota and the server throttles the request A client exhausted its quota and the server throttles it sending
sending the "Retry-After" response header field. "Retry-After".
In this example, the values of "Retry-After" and "RateLimit-Reset" In this example, the values of "Retry-After" and "RateLimit-Reset"
reference the same moment, but this is not a requirement. reference the same moment, but this is not a requirement.
The "429 Too Many Requests" HTTP status code is just used as an The "429 Too Many Requests" HTTP status code is just used as an
example. example.
Request: Request:
GET /items/123 GET /items/123 HTTP/1.1
Host: api.example
Response: Response:
HTTP/1.1 429 Too Many Requests HTTP/1.1 429 Too Many Requests
Content-Type: application/json Content-Type: application/json
Date: Mon, 05 Aug 2019 09:27:00 GMT Date: Mon, 05 Aug 2019 09:27:00 GMT
Retry-After: Mon, 05 Aug 2019 09:27:05 GMT Retry-After: Mon, 05 Aug 2019 09:27:05 GMT
RateLimit-Reset: 5 RateLimit-Reset: 5
RateLimit-Limit: 100 RateLimit-Limit: 100
Ratelimit-Remaining: 0 Ratelimit-Remaining: 0
skipping to change at line 689 skipping to change at page 15, line 47
8.2. Parameterized responses 8.2. Parameterized responses
8.2.1. Throttling window specified via parameter 8.2.1. Throttling window specified via parameter
The client has 99 "quota-units" left for the next 50 seconds. The The client has 99 "quota-units" left for the next 50 seconds. The
"time-window" is communicated by the "w" parameter, so we know the "time-window" is communicated by the "w" parameter, so we know the
throughput is 100 "quota-units" per minute. throughput is 100 "quota-units" per minute.
Request: Request:
GET /items/123 GET /items/123 HTTP/1.1
Host: api.example
Response: Response:
HTTP/1.1 200 Ok HTTP/1.1 200 Ok
Content-Type: application/json Content-Type: application/json
RateLimit-Limit: 100, 100;w=60 RateLimit-Limit: 100, 100;w=60
Ratelimit-Remaining: 99 Ratelimit-Remaining: 99
Ratelimit-Reset: 50 Ratelimit-Reset: 50
{"hello": "world"} {"hello": "world"}
skipping to change at line 718 skipping to change at page 16, line 30
The "RateLimit-Remaining" then advertises only 9 quota-units for the The "RateLimit-Remaining" then advertises only 9 quota-units for the
next 50 seconds to slow down the client. next 50 seconds to slow down the client.
Note that the server could have lowered even the other values in Note that the server could have lowered even the other values in
"RateLimit-Limit": this specification does not mandate any relation "RateLimit-Limit": this specification does not mandate any relation
between the field values contained in subsequent responses. between the field values contained in subsequent responses.
Request: Request:
GET /items/123 GET /items/123 HTTP/1.1
Host: api.example
Response: Response:
HTTP/1.1 200 Ok HTTP/1.1 200 Ok
Content-Type: application/json Content-Type: application/json
RateLimit-Limit: 10, 100;w=60 RateLimit-Limit: 10, 100;w=60
Ratelimit-Remaining: 9 Ratelimit-Remaining: 9
Ratelimit-Reset: 50 Ratelimit-Reset: 50
{ {
skipping to change at line 746 skipping to change at page 17, line 11
seconds and performs a new request which, due to resource exhaustion, seconds and performs a new request which, due to resource exhaustion,
the server rejects and pushes back, advertising "RateLimit-Remaining: the server rejects and pushes back, advertising "RateLimit-Remaining:
0" for the next 20 seconds. 0" for the next 20 seconds.
The server advertises a smaller window with a lower limit to slow The server advertises a smaller window with a lower limit to slow
down the client for the rest of its original window after the 20 down the client for the rest of its original window after the 20
seconds elapse. seconds elapse.
Request: Request:
GET /items/123 GET /items/123 HTTP/1.1
Host: api.example
Response: Response:
HTTP/1.1 429 Too Many Requests HTTP/1.1 429 Too Many Requests
Content-Type: application/json Content-Type: application/json
RateLimit-Limit: 0, 15;w=20 RateLimit-Limit: 0, 15;w=20
Ratelimit-Remaining: 0 Ratelimit-Remaining: 0
Ratelimit-Reset: 20 Ratelimit-Reset: 20
{ {
"status": 429, "status": 429,
"detail": "Wait 20 seconds, then slow down!" "detail": "Wait 20 seconds, then slow down!"
} }
8.3. Dynamic limits for pushing back with Retry-After and slow down 8.3. Dynamic limits for pushing back with Retry-After and slow down
Alternatively, given the same context where the previous example Alternatively, given the same context where the previous example
starts, we can convey the same information to the client via the starts, we can convey the same information to the client via "Retry-
Retry-After header, with the advantage that the server can now After", with the advantage that the server can now specify the
specify the policy's nominal limit and window that will apply after policy's nominal limit and window that will apply after the reset,
the reset, ie. assuming the resource exhaustion is likely to be gone ie. assuming the resource exhaustion is likely to be gone by then, so
by then, so the advertised policy does not need to be adjusted, yet the advertised policy does not need to be adjusted, yet we managed to
we managed to stop requests for a while and slow down the rest of the stop requests for a while and slow down the rest of the current
current window. window.
Request: Request:
GET /items/123 GET /items/123 HTTP/1.1
Host: api.example
Response: Response:
HTTP/1.1 429 Too Many Requests HTTP/1.1 429 Too Many Requests
Content-Type: application/json Content-Type: application/json
Retry-After: 20 Retry-After: 20
RateLimit-Limit: 15, 100;w=60 RateLimit-Limit: 15, 100;w=60
Ratelimit-Remaining: 15 Ratelimit-Remaining: 15
Ratelimit-Reset: 40 Ratelimit-Reset: 40
{ {
"status": 429, "status": 429,
"detail": "Wait 20 seconds, then slow down!" "detail": "Wait 20 seconds, then slow down!"
} }
Note that in this last response the client is expected to honor the Note that in this last response the client is expected to honor
"Retry-After" header and perform no requests for the specified amount "Retry-After" and perform no requests for the specified amount of
of time, whereas the previous example would not force the client to time, whereas the previous example would not force the client to stop
stop requests before the reset time is elapsed, as it would still be requests before the reset time is elapsed, as it would still be free
free to query again the server even if it is likely to have the to query again the server even if it is likely to have the request
request rejected. rejected.
8.3.1. Missing Remaining informations 8.3.1. Missing Remaining information
The server does not expose "RateLimit-Remaining" values, but resets The server does not expose "RateLimit-Remaining" values, but resets
the limit counter every second. the limit counter every second.
It communicates to the client the limit of 10 quota-units per second It communicates to the client the limit of 10 quota-units per second
always returning the couple "RateLimit-Limit" and "RateLimit-Reset". always returning the couple "RateLimit-Limit" and "RateLimit-Reset".
Request: Request:
GET /items/123 GET /items/123 HTTP/1.1
Host: api.example
Response: Response:
HTTP/1.1 200 Ok HTTP/1.1 200 Ok
Content-Type: application/json Content-Type: application/json
RateLimit-Limit: 10 RateLimit-Limit: 10
Ratelimit-Reset: 1 Ratelimit-Reset: 1
{"first": "request"} {"first": "request"}
Request: Request:
GET /items/123 GET /items/123 HTTP/1.1
Host: api.example
Response: Response:
HTTP/1.1 200 Ok HTTP/1.1 200 Ok
Content-Type: application/json Content-Type: application/json
RateLimit-Limit: 10 RateLimit-Limit: 10
Ratelimit-Reset: 1 Ratelimit-Reset: 1
{"second": "request"} {"second": "request"}
8.3.2. Use with multiple windows 8.3.2. Use with multiple windows
skipping to change at line 845 skipping to change at page 19, line 27
o 5000 daily quota-units; o 5000 daily quota-units;
o 1000 hourly quota-units. o 1000 hourly quota-units.
The client consumed 4900 quota-units in the first 14 hours. The client consumed 4900 quota-units in the first 14 hours.
Despite the next hourly limit of 1000 quota-units, the closest limit Despite the next hourly limit of 1000 quota-units, the closest limit
to reach is the daily one. to reach is the daily one.
The server then exposes the "RateLimit" headers to inform the client The server then exposes the "RateLimit" fields to inform the client
that: that:
o it has only 100 quota-units left; o it has only 100 quota-units left;
o the window will reset in 10 hours; o the window will reset in 10 hours;
o the "expiring-limit" is 5000. o the "expiring-limit" is 5000.
Request: Request:
GET /items/123 GET /items/123 HTTP/1.1
Host: api.example
Response: Response:
HTTP/1.1 200 OK HTTP/1.1 200 OK
Content-Type: application/json Content-Type: application/json
RateLimit-Limit: 5000, 1000;w=3600, 5000;w=86400 RateLimit-Limit: 5000, 1000;w=3600, 5000;w=86400
RateLimit-Remaining: 100 RateLimit-Remaining: 100
RateLimit-Reset: 36000 RateLimit-Reset: 36000
{"hello": "world"} {"hello": "world"}
skipping to change at line 880 skipping to change at page 20, line 17
9.1. Throttling does not prevent clients from issuing requests 9.1. Throttling does not prevent clients from issuing requests
This specification does not prevent clients from making over-quota This specification does not prevent clients from making over-quota
requests. requests.
Servers should always implement mechanisms to prevent resource Servers should always implement mechanisms to prevent resource
exhaustion. exhaustion.
9.2. Information disclosure 9.2. Information disclosure
Servers should not disclose operational capacity informations that Servers should not disclose operational capacity information that can
can be used to saturate its resources. be used to saturate its resources.
While this specification does not mandate whether non 2xx responses While this specification does not mandate whether non 2xx responses
consume quota, if 401 and 403 responses count on quota a malicious consume quota, if 401 and 403 responses count on quota a malicious
client could probe the endpoint to get traffic informations of client could probe the endpoint to get traffic information of another
another user. user.
As intermediaries might retransmit requests and consume quota-units As intermediaries might retransmit requests and consume quota-units
without prior knowledge of the User Agent, RateLimit headers might without prior knowledge of the User Agent, RateLimit fields might
reveal the existence of an intermediary to the User Agent. reveal the existence of an intermediary to the User Agent.
9.3. Remaining quota-units are not granted requests 9.3. Remaining quota-units are not granted requests
"RateLimit-*" headers convey hints from the server to the clients in "RateLimit-*" fields convey hints from the server to the clients in
order to avoid being throttled out. order to avoid being throttled out.
Clients MUST NOT consider the "quota-units" returned in "RateLimit- Clients MUST NOT consider the "quota-units" returned in "RateLimit-
Remaining" as a service level agreement. Remaining" as a service level agreement.
In case of resource saturation, the server MAY artificially lower the In case of resource saturation, the server MAY artificially lower the
returned values or not serve the request anyway. returned values or not serve the request anyway.
9.4. Reliability of RateLimit-Reset 9.4. Reliability of RateLimit-Reset
Consider that "request-quota" may not be restored after the moment Consider that "service-limit" may not be restored after the moment
referenced by "RateLimit-Reset", and the "RateLimit-Reset" value referenced by "RateLimit-Reset", and the "RateLimit-Reset" value
should not be considered fixed nor constant. should not be considered fixed nor constant.
Subsequent requests may return a higher "RateLimit-Reset" value to Subsequent requests may return a higher "RateLimit-Reset" value to
limit concurrency or implement dynamic or adaptive throttling limit concurrency or implement dynamic or adaptive throttling
policies. policies.
9.5. Resource exhaustion 9.5. Resource exhaustion
When returning "RateLimit-Reset" you must be aware that many When returning "RateLimit-Reset" you must be aware that many
skipping to change at line 986 skipping to change at page 22, line 32
Field name: "RateLimit-Reset" Field name: "RateLimit-Reset"
Status: permanent Status: permanent
Specification document(s): Section 3.3 of this document Specification document(s): Section 3.3 of this document
11. References 11. References
11.1. Normative References 11.1. Normative References
[CACHING] Fielding, R., Ed., Nottingham, M., Ed., and J. Reschke,
Ed., "Hypertext Transfer Protocol (HTTP/1.1): Caching",
RFC 7234, DOI 10.17487/RFC7234, June 2014,
<https://www.rfc-editor.org/info/rfc7234>.
[MESSAGING]
Fielding, R., Ed. and J. Reschke, Ed., "Hypertext Transfer
Protocol (HTTP/1.1): Message Syntax and Routing",
RFC 7230, DOI 10.17487/RFC7230, June 2014,
<https://www.rfc-editor.org/info/rfc7230>.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, Requirement Levels", BCP 14, RFC 2119,
DOI 10.17487/RFC2119, March 1997, DOI 10.17487/RFC2119, March 1997,
<https://www.rfc-editor.org/info/rfc2119>. <https://www.rfc-editor.org/info/rfc2119>.
[RFC5234] Crocker, D., Ed. and P. Overell, "Augmented BNF for Syntax [RFC5234] Crocker, D., Ed. and P. Overell, "Augmented BNF for Syntax
Specifications: ABNF", STD 68, RFC 5234, Specifications: ABNF", STD 68, RFC 5234,
DOI 10.17487/RFC5234, January 2008, DOI 10.17487/RFC5234, January 2008,
<https://www.rfc-editor.org/info/rfc5234>. <https://www.rfc-editor.org/info/rfc5234>.
skipping to change at line 1020 skipping to change at page 23, line 6
[RFC7405] Kyzivat, P., "Case-Sensitive String Support in ABNF", [RFC7405] Kyzivat, P., "Case-Sensitive String Support in ABNF",
RFC 7405, DOI 10.17487/RFC7405, December 2014, RFC 7405, DOI 10.17487/RFC7405, December 2014,
<https://www.rfc-editor.org/info/rfc7405>. <https://www.rfc-editor.org/info/rfc7405>.
[RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
May 2017, <https://www.rfc-editor.org/info/rfc8174>. May 2017, <https://www.rfc-editor.org/info/rfc8174>.
[SEMANTICS] [SEMANTICS]
Fielding, R., Ed. and J. Reschke, Ed., "Hypertext Transfer Fielding, R. T., Nottingham, M., and J. Reschke, "HTTP
Protocol (HTTP/1.1): Semantics and Content", RFC 7231, Semantics", draft-ietf-httpbis-semantics-14 (work in
DOI 10.17487/RFC7231, June 2014, progress), January 2021.
<https://www.rfc-editor.org/info/rfc7231>.
[UNIX] The Open Group, ., "The Single UNIX Specification, Version
2 - 6 Vol Set for UNIX 98", February 1997.
11.2. Informative References 11.2. Informative References
[RFC3339] Klyne, G. and C. Newman, "Date and Time on the Internet: [RFC3339] Klyne, G. and C. Newman, "Date and Time on the Internet:
Timestamps", RFC 3339, DOI 10.17487/RFC3339, July 2002, Timestamps", RFC 3339, DOI 10.17487/RFC3339, July 2002,
<https://www.rfc-editor.org/info/rfc3339>. <https://www.rfc-editor.org/info/rfc3339>.
[RFC6585] Nottingham, M. and R. Fielding, "Additional HTTP Status [RFC6585] Nottingham, M. and R. Fielding, "Additional HTTP Status
Codes", RFC 6585, DOI 10.17487/RFC6585, April 2012, Codes", RFC 6585, DOI 10.17487/RFC6585, April 2012,
<https://www.rfc-editor.org/info/rfc6585>. <https://www.rfc-editor.org/info/rfc6585>.
[RFC7234] Fielding, R., Ed., Nottingham, M., Ed., and J. Reschke, [RFC7234] Fielding, R., Ed., Nottingham, M., Ed., and J. Reschke,
Ed., "Hypertext Transfer Protocol (HTTP/1.1): Caching", Ed., "Hypertext Transfer Protocol (HTTP/1.1): Caching",
RFC 7234, DOI 10.17487/RFC7234, June 2014, RFC 7234, DOI 10.17487/RFC7234, June 2014,
<https://www.rfc-editor.org/info/rfc7234>. <https://www.rfc-editor.org/info/rfc7234>.
[STATUS429]
Stewart, R., Tuexen, M., and P. Lei, "Stream Control
Transmission Protocol (SCTP) Stream Reconfiguration",
RFC 6525, DOI 10.17487/RFC6525, February 2012,
<https://www.rfc-editor.org/info/rfc6525>.
[UNIX] The Open Group, ., "The Single UNIX Specification, Version
2 - 6 Vol Set for UNIX 98", February 1997.
11.3. URIs 11.3. URIs
[1] https://lists.w3.org/Archives/Public/ietf-httpapi-wg/ [1] https://lists.w3.org/Archives/Public/ietf-httpapi-wg/
[2] https://github.com/ietf-wg-httpapi/ratelimit-headers [2] https://github.com/ietf-wg-httpapi/ratelimit-headers
[3] https://github.com/httpwg/http-core/ [3] https://github.com/httpwg/http-core/
pull/317#issuecomment-585868767 pull/317#issuecomment-585868767
[4] https://github.com/ioggstream/draft-polli-ratelimit-headers/ [4] https://github.com/ioggstream/draft-polli-ratelimit-headers/
skipping to change at line 1064 skipping to change at page 24, line 8
[5] https://community.ntppool.org/t/another-ntp-client-failure- [5] https://community.ntppool.org/t/another-ntp-client-failure-
story/1014/ story/1014/
[6] https://lists.w3.org/Archives/Public/ietf-http- [6] https://lists.w3.org/Archives/Public/ietf-http-
wg/2019JulSep/0202.html wg/2019JulSep/0202.html
[7] https://github.com/ioggstream/draft-polli-ratelimit-headers/ [7] https://github.com/ioggstream/draft-polli-ratelimit-headers/
issues/34#issuecomment-519366481 issues/34#issuecomment-519366481
Appendix A. Change Log Appendix A. Acknowledgements
RFC EDITOR PLEASE DELETE THIS SECTION.
Appendix B. Acknowledgements
Thanks to Willi Schoenborn, Alejandro Martinez Ruiz, Alessandro Thanks to Willi Schoenborn, Alejandro Martinez Ruiz, Alessandro
Ranellucci, Amos Jeffries, Martin Thomson, Erik Wilde and Mark Ranellucci, Amos Jeffries, Martin Thomson, Erik Wilde and Mark
Nottingham for being the initial contributors of these Nottingham for being the initial contributors of these
specifications. Kudos to the first community implementors: Aapo specifications. Kudos to the first community implementors: Aapo
Talvensaari, Nathan Friedly and Sanyam Dogra. Talvensaari, Nathan Friedly and Sanyam Dogra.
Appendix C. RateLimit headers currently used on the web Appendix B. FAQ
RFC EDITOR PLEASE DELETE THIS SECTION.
Commonly used header field names are:
o "X-RateLimit-Limit", "X-RateLimit-Remaining", "X-RateLimit-Reset";
o "X-Rate-Limit-Limit", "X-Rate-Limit-Remaining", "X-Rate-Limit-
Reset".
There are variants too, where the window is specified in the header
field name, eg:
o "x-ratelimit-limit-minute", "x-ratelimit-limit-hour", "x-
ratelimit-limit-day"
o "x-ratelimit-remaining-minute", "x-ratelimit-remaining-hour", "x-
ratelimit-remaining-day"
Here are some interoperability issues:
o "X-RateLimit-Remaining" references different values, depending on
the implementation:
* seconds remaining to the window expiration
* milliseconds remaining to the window expiration
* seconds since UTC, in UNIX Timestamp
* a datetime, either "IMF-fixdate" [SEMANTICS] or [RFC3339]
o different headers, with the same semantic, are used by different
implementers:
* X-RateLimit-Limit and X-Rate-Limit-Limit
* X-RateLimit-Remaining and X-Rate-Limit-Remaining
* X-RateLimit-Reset and X-Rate-Limit-Reset
The semantic of RateLimit-Remaining depends on the windowing
algorithm. A sliding window policy for example may result in having
a ratelimit-remaining value related to the ratio between the current
and the maximum throughput. Eg.
RateLimit-Limit: 12, 12;w=1
RateLimit-Remaining: 6 ; using 50% of throughput, that is 6 units/s
RateLimit-Reset: 1
If this is the case, the optimal solution is to achieve
RateLimit-Limit: 12, 12;w=1
RateLimit-Remaining: 1 ; using 100% of throughput, that is 12 units/s
RateLimit-Reset: 1
At this point you should stop increasing your request rate.
Appendix D. FAQ
1. Why defining standard headers for throttling? 1. Why defining standard fields for throttling?
To simplify enforcement of throttling policies. To simplify enforcement of throttling policies.
2. Can I use RateLimit-* in throttled responses (eg with status code 2. Can I use RateLimit-* in throttled responses (eg with status code
429)? 429)?
Yes, you can. Yes, you can.
3. Are those specs tied to RFC 6585? 3. Are those specs tied to RFC 6585?
No. [RFC6585] defines the "429" status code and we use it just No. [RFC6585] defines the "429" status code and we use it just
as an example of a throttled request, that could instead use even as an example of a throttled request, that could instead use even
403 or whatever status code. "403" or whatever status code. The goal of this specification is
to standardize the name and semantic of three ratelimit fields
widely used on the internet. Stricter relations with status
codes or error response payloads would impose behaviors to all
the existing implementations making the adoption more complex.
4. Why don't pass the throttling scope as a parameter? 4. Why don't pass the throttling scope as a parameter?
After a discussion on a similar thread [3] we will probably add a After a discussion on a similar thread [3] we will probably add a
new "RateLimit-Scope" header to this spec. new "RateLimit-Scope" field to this spec.
I'm open to suggestions: comment on this issue [4] I'm open to suggestions: comment on this issue [4]
5. Why using delta-seconds instead of a UNIX Timestamp? Why not 5. Why using delay-seconds instead of a UNIX Timestamp? Why not
using subsecond precision? using subsecond precision?
Using delta-seconds aligns with "Retry-After", which is returned Using delay-seconds aligns with "Retry-After", which is returned
in similar contexts, eg on 429 responses. in similar contexts, eg on 429 responses.
delta-seconds as defined in [CACHING] section 1.2.1 clarifies
some parsing rules too.
Timestamps require a clock synchronization protocol (see Timestamps require a clock synchronization protocol (see
[SEMANTICS] section 4.1.1.1). This may be problematic (eg. clock Section 5.6.7 of [SEMANTICS]). This may be problematic (eg.
adjustment, clock skew, failure of hardcoded clock clock adjustment, clock skew, failure of hardcoded clock
synchronization servers, IoT devices, ..). Moreover timestamps synchronization servers, IoT devices, ..). Moreover timestamps
may not be monotonically increasing due to clock adjustment. See may not be monotonically increasing due to clock adjustment. See
Another NTP client failure story [5] Another NTP client failure story [5]
We did not use subsecond precision because: We did not use subsecond precision because:
* that is more subject to system clock correction like the one * that is more subject to system clock correction like the one
implemented via the adjtimex() Linux system call; implemented via the adjtimex() Linux system call;
* response-time latency may not make it worth. A brief * response-time latency may not make it worth. A brief
skipping to change at line 1212 skipping to change at page 25, line 45
A semantic way to limit concurrency is to return 503 + Retry- A semantic way to limit concurrency is to return 503 + Retry-
After in case of resource saturation (eg. thrashing, connection After in case of resource saturation (eg. thrashing, connection
queues too long, Service Level Objectives not met, ..). queues too long, Service Level Objectives not met, ..).
Saturation conditions can be either dynamic or static: all this Saturation conditions can be either dynamic or static: all this
is out of the scope for the current document. is out of the scope for the current document.
8. Do a positive value of "RateLimit-Remaining" imply any service 8. Do a positive value of "RateLimit-Remaining" imply any service
guarantee for my future requests to be served? guarantee for my future requests to be served?
No. The returned values were used to decide whether to serve or No. FAQ integrated in Section 3.2.
not _the current request_ and do not imply any guarantee that
future requests will be successful.
Instead they help to understand when future requests will
probably be throttled. A low value for "RateLimit-Remaining"
should be interpreted as a yellow traffic-light for either the
number of requests issued in the "time-window" or the request
throughput.
9. Is the quota-policy definition Section 2.3 too complex? 9. Is the quota-policy definition Section 2.3 too complex?
You can always return the simplest form of the 3 headers You can always return the simplest form of the 3 fields
RateLimit-Limit: 100 RateLimit-Limit: 100
RateLimit-Remaining: 50 RateLimit-Remaining: 50
RateLimit-Reset: 60 RateLimit-Reset: 60
The key runtime value is the first element of the list: "expiring- The key runtime value is the first element of the list: "expiring-
limit", the others "quota-policy" are informative. So for the limit", the others "quota-policy" are informative. So for the
following header: following field:
RateLimit-Limit: 100, 100;w=60;burst=1000;comment="sliding window", 5000;w=3600;burst=0;comment="fixed window" RateLimit-Limit: 100, 100;w=60;burst=1000;comment="sliding window", 5000;w=3600;burst=0;comment="fixed window"
the key value is the one referencing the lowest limit: "100" the key value is the one referencing the lowest limit: "100"
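Splitting a "RateLimit-Limit" value into its "expiring-limit" and its
informative "quota-policy" items can be sketched as follows. This is
a simplified illustration that assumes no comma or semicolon appears
inside a quoted parameter value. The function name
"parse_ratelimit_limit" is hypothetical.

```python
from typing import Dict, List, Tuple

QuotaPolicy = Tuple[int, Dict[str, str]]


def parse_ratelimit_limit(value: str) -> Tuple[int, List[QuotaPolicy]]:
    """Return (expiring-limit, [(limit, parameters), ...])."""
    members = [member.strip() for member in value.split(",")]
    # The first list member is the expiring-limit, the key runtime value.
    expiring_limit = int(members[0])
    policies: List[QuotaPolicy] = []
    for member in members[1:]:
        limit, *params = [part.strip() for part in member.split(";")]
        parsed: Dict[str, str] = {}
        for param in params:
            name, _, raw = param.partition("=")
            # Parameter values are kept as strings; quotes are dropped.
            parsed[name.strip()] = raw.strip().strip('"')
        policies.append((int(limit), parsed))
    return expiring_limit, policies


field = ('100, 100;w=60;burst=1000;comment="sliding window", '
         '5000;w=3600;burst=0;comment="fixed window"')
expiring_limit, policies = parse_ratelimit_limit(field)
print(expiring_limit)
for limit, params in policies:
    print(limit, params)
```

A client that only needs the key value can stop after reading the
first list member and ignore the rest.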
1. Can we use shorter names? Why don't put everything in one 1. Can we use shorter names? Why don't put everything in one field?
header?
The most common syntax we found on the web is "X-RateLimit-*" and The most common syntax we found on the web is "X-RateLimit-*" and
when starting this I-D we opted for it [7] when starting this I-D we opted for it [7]
The basic form of those headers is easily parseable, even by The basic form of those fields is easily parseable, even by
implementors processing responses using technologies like dynamic implementors processing responses using technologies like dynamic
interpreter with limited syntax. interpreter with limited syntax.
Using a single header complicates parsing and takes a significantly Using a single field complicates parsing and takes a significantly
different approach from the existing ones: this can limit adoption. different approach from the existing ones: this can limit adoption.
1. Why don't mention connections? 1. Why don't mention connections?
Beware of the term "connection": - it is just Beware of the term "connection": - it is just
_one_ possible saturation cause. Once you go that path _one_ possible saturation cause. Once you go that path
you will expose other infrastructural details (bandwidth, CPU, .. you will expose other infrastructural details (bandwidth, CPU, ..
see Section 9.2) and complicate client compliance; see Section 9.2) and complicate client compliance;
- it is an infrastructural detail defined in terms of - it is an infrastructural detail defined in terms of
server and network rather than the consumed service. server and network rather than the consumed service.
This specification protects the services first, and then the This specification protects the services first, and then the
infrastructures through client cooperation (see Section 9.1). infrastructures through client cooperation (see Section 9.1).
RateLimit headers enable sending _on the same RateLimit fields enable sending _on the same
connection_ different limit values on each response, connection_ different limit values on each response,
depending on the policy scope (eg. per-user, per-custom-key, ..) depending on the policy scope (eg. per-user, per-custom-key, ..)
2. Can intermediaries alter RateLimit fields? 2. Can intermediaries alter RateLimit fields?
Generally, they should not because it might result in unserviced Generally, they should not because it might result in unserviced
requests. There are reasonable use cases for intermediaries requests. There are reasonable use cases for intermediaries
mangling RateLimit fields though, e.g. when they enforce stricter mangling RateLimit fields though, e.g. when they enforce stricter
quota-policies, or when they are an active component of the quota-policies, or when they are an active component of the
service. In those cases we will consider them as part of the service. In those cases we will consider them as part of the
originating infrastructure. originating infrastructure.
RateLimit fields currently used on the web
_RFC Editor: Please remove this section before publication._
Commonly used header field names are:
o "X-RateLimit-Limit", "X-RateLimit-Remaining", "X-RateLimit-Reset";
o "X-Rate-Limit-Limit", "X-Rate-Limit-Remaining", "X-Rate-Limit-
Reset".
There are variants too, where the window is specified in the header
field name, eg:
o "x-ratelimit-limit-minute", "x-ratelimit-limit-hour", "x-
ratelimit-limit-day"
o "x-ratelimit-remaining-minute", "x-ratelimit-remaining-hour", "x-
ratelimit-remaining-day"
Here are some interoperability issues:
o "X-RateLimit-Reset" references different values, depending on
the implementation:
* seconds remaining to the window expiration
* milliseconds remaining to the window expiration
* seconds since the UNIX epoch, as a UNIX timestamp [UNIX]
* a datetime, either "IMF-fixdate" [SEMANTICS] or [RFC3339]
o different header fields with the same semantics are used by
different implementers:
* X-RateLimit-Limit and X-Rate-Limit-Limit
* X-RateLimit-Remaining and X-Rate-Limit-Remaining
* X-RateLimit-Reset and X-Rate-Limit-Reset
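A client facing both spellings could fold them into the names defined by this document. The following Python sketch is illustrative only: the function and constant names are hypothetical, and it deliberately leaves values untouched because the legacy reset values do not share a single meaning across implementations.

```python
# Legacy prefixes seen on the web, compared case-insensitively.
LEGACY_PREFIXES = ("x-ratelimit-", "x-rate-limit-")

# Suffix of the legacy name -> field name defined by this document.
STANDARD_NAMES = {
    "limit": "RateLimit-Limit",
    "remaining": "RateLimit-Remaining",
    "reset": "RateLimit-Reset",
}


def normalize_ratelimit_names(headers):
    """Return a copy of headers with legacy rate limit names renamed.

    Hypothetical helper: only the field names change; values are copied
    as-is. Window-specific variants such as "x-ratelimit-limit-minute"
    are not recognized and are passed through unchanged.
    """
    normalized = {}
    for name, value in headers.items():
        lowered = name.lower()
        for prefix in LEGACY_PREFIXES:
            suffix = lowered[len(prefix):]
            if lowered.startswith(prefix) and suffix in STANDARD_NAMES:
                normalized[STANDARD_NAMES[suffix]] = value
                break
        else:
            # Not a recognized legacy name: keep the field untouched.
            normalized[name] = value
    return normalized


response_headers = {
    "X-Rate-Limit-Limit": "100",
    "x-ratelimit-remaining": "50",
    "Content-Type": "text/plain",
}
print(normalize_ratelimit_names(response_headers))
```

Renaming alone does not resolve the value ambiguity listed above: the unit and format of a legacy reset value still have to be known per implementation.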
The semantics of RateLimit-Remaining depend on the windowing
algorithm. A sliding window policy, for example, may result in
a "RateLimit-Remaining" value related to the ratio between the
current and the maximum throughput. For example:
RateLimit-Limit: 12, 12;w=1
RateLimit-Remaining: 6 ; using 50% of throughput, that is 6 units/s
RateLimit-Reset: 1
If this is the case, the optimal solution is to achieve
RateLimit-Limit: 12, 12;w=1
RateLimit-Remaining: 1 ; using 100% of throughput, that is 12 units/s
RateLimit-Reset: 1
At this point you should stop increasing your request rate.
Changes
_RFC Editor: Please remove this section before publication._
D.1. Since draft-ietf-httpapi-ratelimit-headers-00
o Use I-D.httpbis-semantics, which includes referencing "delay-
seconds" instead of "delta-seconds". #5
Authors' Addresses Authors' Addresses
Roberto Polli Roberto Polli
Team Digitale, Italian Government Team Digitale, Italian Government
Italy
Email: robipolli@gmail.com Email: robipolli@gmail.com
Alejandro Martinez Ruiz Alejandro Martinez Ruiz
Red Hat Red Hat
Email: amr@redhat.com Email: amr@redhat.com
 End of changes. 102 change blocks. 
275 lines changed or deleted 291 lines changed or added
