draft-ietf-quic-recovery-24.txt   draft-ietf-quic-recovery-latest.txt 
QUIC Working Group J. Iyengar, Ed. QUIC Working Group J. Iyengar, Ed.
Internet-Draft Fastly Internet-Draft Fastly
Intended status: Standards Track I. Swett, Ed. Intended status: Standards Track I. Swett, Ed.
Expires: May 7, 2020 Google Expires: July 20, 2020 Google
November 4, 2019 January 17, 2020
QUIC Loss Detection and Congestion Control QUIC Loss Detection and Congestion Control
draft-ietf-quic-recovery-24 draft-ietf-quic-recovery-latest
Abstract Abstract
This document describes loss detection and congestion control This document describes loss detection and congestion control
mechanisms for QUIC. mechanisms for QUIC.
Note to Readers Note to Readers
Discussion of this draft takes place on the QUIC working group Discussion of this draft takes place on the QUIC working group
mailing list (quic@ietf.org), which is archived at mailing list (quic@ietf.org), which is archived at
skipping to change at page 1, line 42 skipping to change at page 1, line 42
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/. Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
This Internet-Draft will expire on May 7, 2020. This Internet-Draft will expire on July 20, 2020.
Copyright Notice Copyright Notice
Copyright (c) 2019 IETF Trust and the persons identified as the Copyright (c) 2020 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(https://trustee.ietf.org/license-info) in effect on the date of (https://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as the Trust Legal Provisions and are provided without warranty as
skipping to change at page 2, line 30 skipping to change at page 2, line 30
3.1. Relevant Differences Between QUIC and TCP . . . . . . . . 5 3.1. Relevant Differences Between QUIC and TCP . . . . . . . . 5
3.1.1. Separate Packet Number Spaces . . . . . . . . . . . . 6 3.1.1. Separate Packet Number Spaces . . . . . . . . . . . . 6
3.1.2. Monotonically Increasing Packet Numbers . . . . . . . 6 3.1.2. Monotonically Increasing Packet Numbers . . . . . . . 6
3.1.3. Clearer Loss Epoch . . . . . . . . . . . . . . . . . 6 3.1.3. Clearer Loss Epoch . . . . . . . . . . . . . . . . . 6
3.1.4. No Reneging . . . . . . . . . . . . . . . . . . . . . 7 3.1.4. No Reneging . . . . . . . . . . . . . . . . . . . . . 7
3.1.5. More ACK Ranges . . . . . . . . . . . . . . . . . . . 7 3.1.5. More ACK Ranges . . . . . . . . . . . . . . . . . . . 7
3.1.6. Explicit Correction For Delayed Acknowledgements . . 7 3.1.6. Explicit Correction For Delayed Acknowledgements . . 7
4. Estimating the Round-Trip Time . . . . . . . . . . . . . . . 7 4. Estimating the Round-Trip Time . . . . . . . . . . . . . . . 7
4.1. Generating RTT samples . . . . . . . . . . . . . . . . . 7 4.1. Generating RTT samples . . . . . . . . . . . . . . . . . 7
4.2. Estimating min_rtt . . . . . . . . . . . . . . . . . . . 8 4.2. Estimating min_rtt . . . . . . . . . . . . . . . . . . . 8
4.3. Estimating smoothed_rtt and rttvar . . . . . . . . . . . 8 4.3. Estimating smoothed_rtt and rttvar . . . . . . . . . . . 9
5. Loss Detection . . . . . . . . . . . . . . . . . . . . . . . 9 5. Loss Detection . . . . . . . . . . . . . . . . . . . . . . . 10
5.1. Acknowledgement-based Detection . . . . . . . . . . . . . 10 5.1. Acknowledgement-based Detection . . . . . . . . . . . . . 10
5.1.1. Packet Threshold . . . . . . . . . . . . . . . . . . 10 5.1.1. Packet Threshold . . . . . . . . . . . . . . . . . . 11
5.1.2. Time Threshold . . . . . . . . . . . . . . . . . . . 10 5.1.2. Time Threshold . . . . . . . . . . . . . . . . . . . 11
5.2. Probe Timeout . . . . . . . . . . . . . . . . . . . . . . 11 5.2. Probe Timeout . . . . . . . . . . . . . . . . . . . . . . 12
5.2.1. Computing PTO . . . . . . . . . . . . . . . . . . . . 11 5.2.1. Computing PTO . . . . . . . . . . . . . . . . . . . . 12
5.3. Handshakes and New Paths . . . . . . . . . . . . . . . . 12 5.3. Handshakes and New Paths . . . . . . . . . . . . . . . . 13
5.3.1. Sending Probe Packets . . . . . . . . . . . . . . . . 13 5.3.1. Sending Probe Packets . . . . . . . . . . . . . . . . 14
5.3.2. Loss Detection . . . . . . . . . . . . . . . . . . . 14 5.3.2. Loss Detection . . . . . . . . . . . . . . . . . . . 15
5.4. Handling Retry Packets . . . . . . . . . . . . . . . . . 14 5.4. Handling Retry Packets . . . . . . . . . . . . . . . . . 15
5.5. Discarding Keys and Packet State . . . . . . . . . . . . 14 5.5. Discarding Keys and Packet State . . . . . . . . . . . . 15
6. Congestion Control . . . . . . . . . . . . . . . . . . . . . 15 6. Congestion Control . . . . . . . . . . . . . . . . . . . . . 16
6.1. Explicit Congestion Notification . . . . . . . . . . . . 15 6.1. Explicit Congestion Notification . . . . . . . . . . . . 16
6.2. Slow Start . . . . . . . . . . . . . . . . . . . . . . . 16 6.2. Slow Start . . . . . . . . . . . . . . . . . . . . . . . 17
6.3. Congestion Avoidance . . . . . . . . . . . . . . . . . . 16 6.3. Congestion Avoidance . . . . . . . . . . . . . . . . . . 17
6.4. Recovery Period . . . . . . . . . . . . . . . . . . . . . 16 6.4. Recovery Period . . . . . . . . . . . . . . . . . . . . . 17
6.5. Ignoring Loss of Undecryptable Packets . . . . . . . . . 16 6.5. Ignoring Loss of Undecryptable Packets . . . . . . . . . 17
6.6. Probe Timeout . . . . . . . . . . . . . . . . . . . . . . 16 6.6. Probe Timeout . . . . . . . . . . . . . . . . . . . . . . 17
6.7. Persistent Congestion . . . . . . . . . . . . . . . . . . 17 6.7. Persistent Congestion . . . . . . . . . . . . . . . . . . 18
6.8. Pacing . . . . . . . . . . . . . . . . . . . . . . . . . 18 6.8. Pacing . . . . . . . . . . . . . . . . . . . . . . . . . 19
6.9. Under-utilizing the Congestion Window . . . . . . . . . . 18 6.9. Under-utilizing the Congestion Window . . . . . . . . . . 19
7. Security Considerations . . . . . . . . . . . . . . . . . . . 19 7. Security Considerations . . . . . . . . . . . . . . . . . . . 20
7.1. Congestion Signals . . . . . . . . . . . . . . . . . . . 19 7.1. Congestion Signals . . . . . . . . . . . . . . . . . . . 20
7.2. Traffic Analysis . . . . . . . . . . . . . . . . . . . . 19 7.2. Traffic Analysis . . . . . . . . . . . . . . . . . . . . 20
7.3. Misreporting ECN Markings . . . . . . . . . . . . . . . . 19 7.3. Misreporting ECN Markings . . . . . . . . . . . . . . . . 20
8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 20 8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 21
9. References . . . . . . . . . . . . . . . . . . . . . . . . . 20 9. References . . . . . . . . . . . . . . . . . . . . . . . . . 21
9.1. Normative References . . . . . . . . . . . . . . . . . . 20 9.1. Normative References . . . . . . . . . . . . . . . . . . 21
9.2. Informative References . . . . . . . . . . . . . . . . . 20 9.2. Informative References . . . . . . . . . . . . . . . . . 21
9.3. URIs . . . . . . . . . . . . . . . . . . . . . . . . . . 22 9.3. URIs . . . . . . . . . . . . . . . . . . . . . . . . . . 23
Appendix A. Loss Recovery Pseudocode . . . . . . . . . . . . . . 22 Appendix A. Loss Recovery Pseudocode . . . . . . . . . . . . . . 23
A.1. Tracking Sent Packets . . . . . . . . . . . . . . . . . . 22 A.1. Tracking Sent Packets . . . . . . . . . . . . . . . . . . 23
A.1.1. Sent Packet Fields . . . . . . . . . . . . . . . . . 22 A.1.1. Sent Packet Fields . . . . . . . . . . . . . . . . . 23
A.2. Constants of interest . . . . . . . . . . . . . . . . . . 23 A.2. Constants of interest . . . . . . . . . . . . . . . . . . 24
A.3. Variables of interest . . . . . . . . . . . . . . . . . . 23 A.3. Variables of interest . . . . . . . . . . . . . . . . . . 24
A.4. Initialization . . . . . . . . . . . . . . . . . . . . . 24 A.4. Initialization . . . . . . . . . . . . . . . . . . . . . 25
A.5. On Sending a Packet . . . . . . . . . . . . . . . . . . . 24 A.5. On Sending a Packet . . . . . . . . . . . . . . . . . . . 26
A.6. On Receiving an Acknowledgment . . . . . . . . . . . . . 25 A.6. On Receiving an Acknowledgment . . . . . . . . . . . . . 26
A.7. On Packet Acknowledgment . . . . . . . . . . . . . . . . 26 A.7. On Packet Acknowledgment . . . . . . . . . . . . . . . . 27
A.8. Setting the Loss Detection Timer . . . . . . . . . . . . 27 A.8. Setting the Loss Detection Timer . . . . . . . . . . . . 28
A.9. On Timeout . . . . . . . . . . . . . . . . . . . . . . . 29 A.9. On Timeout . . . . . . . . . . . . . . . . . . . . . . . 30
A.10. Detecting Lost Packets . . . . . . . . . . . . . . . . . 29 A.10. Detecting Lost Packets . . . . . . . . . . . . . . . . . 30
Appendix B. Congestion Control Pseudocode . . . . . . . . . . . 30 Appendix B. Congestion Control Pseudocode . . . . . . . . . . . 31
B.1. Constants of interest . . . . . . . . . . . . . . . . . . 30 B.1. Constants of interest . . . . . . . . . . . . . . . . . . 31
B.2. Variables of interest . . . . . . . . . . . . . . . . . . 31 B.2. Variables of interest . . . . . . . . . . . . . . . . . . 32
B.3. Initialization . . . . . . . . . . . . . . . . . . . . . 32 B.3. Initialization . . . . . . . . . . . . . . . . . . . . . 33
B.4. On Packet Sent . . . . . . . . . . . . . . . . . . . . . 32 B.4. On Packet Sent . . . . . . . . . . . . . . . . . . . . . 33
B.5. On Packet Acknowledgement . . . . . . . . . . . . . . . . 32 B.5. On Packet Acknowledgement . . . . . . . . . . . . . . . . 33
B.6. On New Congestion Event . . . . . . . . . . . . . . . . . 33 B.6. On New Congestion Event . . . . . . . . . . . . . . . . . 34
B.7. Process ECN Information . . . . . . . . . . . . . . . . . 33 B.7. Process ECN Information . . . . . . . . . . . . . . . . . 34
B.8. On Packets Lost . . . . . . . . . . . . . . . . . . . . . 34 B.8. On Packets Lost . . . . . . . . . . . . . . . . . . . . . 35
Appendix C. Change Log . . . . . . . . . . . . . . . . . . . . . 34 Appendix C. Change Log . . . . . . . . . . . . . . . . . . . . . 35
C.1. Since draft-ietf-quic-recovery-23 . . . . . . . . . . . . 34 C.1. Since draft-ietf-quic-recovery-23 . . . . . . . . . . . . 35
C.2. Since draft-ietf-quic-recovery-22 . . . . . . . . . . . . 35 C.2. Since draft-ietf-quic-recovery-22 . . . . . . . . . . . . 36
C.3. Since draft-ietf-quic-recovery-21 . . . . . . . . . . . . 35 C.3. Since draft-ietf-quic-recovery-21 . . . . . . . . . . . . 36
C.4. Since draft-ietf-quic-recovery-20 . . . . . . . . . . . . 35 C.4. Since draft-ietf-quic-recovery-20 . . . . . . . . . . . . 36
C.5. Since draft-ietf-quic-recovery-19 . . . . . . . . . . . . 35 C.5. Since draft-ietf-quic-recovery-19 . . . . . . . . . . . . 36
C.6. Since draft-ietf-quic-recovery-18 . . . . . . . . . . . . 36 C.6. Since draft-ietf-quic-recovery-18 . . . . . . . . . . . . 37
C.7. Since draft-ietf-quic-recovery-17 . . . . . . . . . . . . 36 C.7. Since draft-ietf-quic-recovery-17 . . . . . . . . . . . . 37
C.8. Since draft-ietf-quic-recovery-16 . . . . . . . . . . . . 36 C.8. Since draft-ietf-quic-recovery-16 . . . . . . . . . . . . 37
C.9. Since draft-ietf-quic-recovery-14 . . . . . . . . . . . . 37 C.9. Since draft-ietf-quic-recovery-14 . . . . . . . . . . . . 38
C.10. Since draft-ietf-quic-recovery-13 . . . . . . . . . . . . 37 C.10. Since draft-ietf-quic-recovery-13 . . . . . . . . . . . . 38
C.11. Since draft-ietf-quic-recovery-12 . . . . . . . . . . . . 38 C.11. Since draft-ietf-quic-recovery-12 . . . . . . . . . . . . 39
C.12. Since draft-ietf-quic-recovery-11 . . . . . . . . . . . . 38 C.12. Since draft-ietf-quic-recovery-11 . . . . . . . . . . . . 39
C.13. Since draft-ietf-quic-recovery-10 . . . . . . . . . . . . 38 C.13. Since draft-ietf-quic-recovery-10 . . . . . . . . . . . . 39
C.14. Since draft-ietf-quic-recovery-09 . . . . . . . . . . . . 38 C.14. Since draft-ietf-quic-recovery-09 . . . . . . . . . . . . 39
C.15. Since draft-ietf-quic-recovery-08 . . . . . . . . . . . . 38 C.15. Since draft-ietf-quic-recovery-08 . . . . . . . . . . . . 39
C.16. Since draft-ietf-quic-recovery-07 . . . . . . . . . . . . 38 C.16. Since draft-ietf-quic-recovery-07 . . . . . . . . . . . . 39
C.17. Since draft-ietf-quic-recovery-06 . . . . . . . . . . . . 39 C.17. Since draft-ietf-quic-recovery-06 . . . . . . . . . . . . 40
C.18. Since draft-ietf-quic-recovery-05 . . . . . . . . . . . . 39 C.18. Since draft-ietf-quic-recovery-05 . . . . . . . . . . . . 40
C.19. Since draft-ietf-quic-recovery-04 . . . . . . . . . . . . 39 C.19. Since draft-ietf-quic-recovery-04 . . . . . . . . . . . . 40
C.20. Since draft-ietf-quic-recovery-03 . . . . . . . . . . . . 39 C.20. Since draft-ietf-quic-recovery-03 . . . . . . . . . . . . 40
C.21. Since draft-ietf-quic-recovery-02 . . . . . . . . . . . . 39 C.21. Since draft-ietf-quic-recovery-02 . . . . . . . . . . . . 40
C.22. Since draft-ietf-quic-recovery-01 . . . . . . . . . . . . 39 C.22. Since draft-ietf-quic-recovery-01 . . . . . . . . . . . . 40
C.23. Since draft-ietf-quic-recovery-00 . . . . . . . . . . . . 39 C.23. Since draft-ietf-quic-recovery-00 . . . . . . . . . . . . 40
C.24. Since draft-iyengar-quic-loss-recovery-01 . . . . . . . . 39 C.24. Since draft-iyengar-quic-loss-recovery-01 . . . . . . . . 40
Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . 40 Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . 41
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 40 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 41
1. Introduction 1. Introduction
QUIC is a new multiplexed and secure transport atop UDP. QUIC builds QUIC is a new multiplexed and secure transport atop UDP. QUIC builds
on decades of transport and security experience, and implements on decades of transport and security experience, and implements
mechanisms that make it attractive as a modern general-purpose mechanisms that make it attractive as a modern general-purpose
transport. The QUIC protocol is described in [QUIC-TRANSPORT]. transport. The QUIC protocol is described in [QUIC-TRANSPORT].
QUIC implements the spirit of existing TCP loss recovery mechanisms, QUIC implements the spirit of existing TCP congestion control and
described in RFCs, various Internet-drafts, and also those prevalent loss recovery mechanisms, described in RFCs, various Internet-drafts,
in the Linux TCP implementation. This document describes QUIC and also those prevalent in the Linux TCP implementation. This
congestion control and loss recovery, and where applicable, document describes QUIC congestion control and loss recovery, and
attributes the TCP equivalent in RFCs, Internet-drafts, academic where applicable, attributes the TCP equivalent in RFCs, Internet-
papers, and/or TCP implementations. drafts, academic papers, and/or TCP implementations.
2. Conventions and Definitions 2. Conventions and Definitions
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
"OPTIONAL" in this document are to be interpreted as described in BCP "OPTIONAL" in this document are to be interpreted as described in BCP
14 [RFC2119] [RFC8174] when, and only when, they appear in all 14 [RFC2119] [RFC8174] when, and only when, they appear in all
capitals, as shown here. capitals, as shown here.
Definitions of terms that are used in this document: Definitions of terms that are used in this document:
skipping to change at page 4, line 50 skipping to change at page 4, line 50
and are not ACK-only, and they are not acknowledged, declared and are not ACK-only, and they are not acknowledged, declared
lost, or abandoned along with old keys. lost, or abandoned along with old keys.
Ack-eliciting Frames: All frames other than ACK, PADDING, and Ack-eliciting Frames: All frames other than ACK, PADDING, and
CONNECTION_CLOSE are considered ack-eliciting. CONNECTION_CLOSE are considered ack-eliciting.
Ack-eliciting Packets: Packets that contain ack-eliciting frames Ack-eliciting Packets: Packets that contain ack-eliciting frames
elicit an ACK from the receiver within the maximum ack delay and elicit an ACK from the receiver within the maximum ack delay and
are called ack-eliciting packets. are called ack-eliciting packets.
Crypto Packets: Packets containing CRYPTO data sent in Initial or
Handshake packets.
Out-of-order Packets: Packets that do not increase the largest Out-of-order Packets: Packets that do not increase the largest
received packet number for its packet number space by exactly one. received packet number for its packet number space by exactly one.
Packets arrive out of order when earlier packets are lost or Packets arrive out of order when earlier packets are lost or
delayed. delayed.
3. Design of the QUIC Transmission Machinery 3. Design of the QUIC Transmission Machinery
All transmissions in QUIC are sent with a packet-level header, which All transmissions in QUIC are sent with a packet-level header, which
indicates the encryption level and includes a packet sequence number indicates the encryption level and includes a packet sequence number
(referred to below as a packet number). The encryption level (referred to below as a packet number). The encryption level
indicates the packet number space, as described in [QUIC-TRANSPORT]. indicates the packet number space, as described in [QUIC-TRANSPORT].
Packet numbers never repeat within a packet number space for the Packet numbers never repeat within a packet number space for the
lifetime of a connection. Packet numbers monotonically increase lifetime of a connection. Packet numbers are sent in monotonically
within a space, preventing ambiguity. increasing order within a space, preventing ambiguity.
This design obviates the need for disambiguating between This design obviates the need for disambiguating between
transmissions and retransmissions and eliminates significant transmissions and retransmissions and eliminates significant
complexity from QUIC's interpretation of TCP loss detection complexity from QUIC's interpretation of TCP loss detection
mechanisms. mechanisms.
QUIC packets can contain multiple frames of different types. The QUIC packets can contain multiple frames of different types. The
recovery mechanisms ensure that data and frames that need reliable recovery mechanisms ensure that data and frames that need reliable
delivery are acknowledged or declared lost and sent in new packets as delivery are acknowledged or declared lost and sent in new packets as
necessary. The types of frames contained in a packet affect recovery necessary. The types of frames contained in a packet affect recovery
skipping to change at page 6, line 43 skipping to change at page 6, line 43
retransmissions are trivially detected, and mechanisms such as Fast retransmissions are trivially detected, and mechanisms such as Fast
Retransmit can be applied universally, based only on packet number. Retransmit can be applied universally, based only on packet number.
This design point significantly simplifies loss detection mechanisms This design point significantly simplifies loss detection mechanisms
for QUIC. Most TCP mechanisms implicitly attempt to infer for QUIC. Most TCP mechanisms implicitly attempt to infer
transmission ordering based on TCP sequence numbers - a non-trivial transmission ordering based on TCP sequence numbers - a non-trivial
task, especially when TCP timestamps are not available. task, especially when TCP timestamps are not available.
3.1.3. Clearer Loss Epoch 3.1.3. Clearer Loss Epoch
QUIC ends a loss epoch when a packet sent after loss is declared is QUIC starts a loss epoch when a packet is lost and ends one when any
acknowledged. TCP waits for the gap in the sequence number space to packet sent after the epoch starts is acknowledged. TCP waits for
be filled, and so if a segment is lost multiple times in a row, the the gap in the sequence number space to be filled, and so if a
loss epoch may not end for several round trips. Because both should segment is lost multiple times in a row, the loss epoch may not end
reduce their congestion windows only once per epoch, QUIC will do it for several round trips. Because both should reduce their congestion
correctly once for every round trip that experiences loss, while TCP windows only once per epoch, QUIC will do it once for every round
may only do it once across multiple round trips. trip that experiences loss, while TCP may only do it once across
multiple round trips.
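The once-per-epoch rule described above can be sketched as follows. This is a minimal illustration, not the draft's normative pseudocode: the class name, the method names, the halving factor, and the MIN_WINDOW value are assumptions chosen for the example.

```python
from typing import Optional

MIN_WINDOW = 2400  # assumed floor for the congestion window, in bytes


class CongestionState:
    """Tracks a loss epoch so the window is reduced at most once per epoch."""

    def __init__(self, cwnd: int) -> None:
        self.cwnd = cwnd
        self.epoch_start: Optional[float] = None  # time the latest epoch began
        self.in_loss_epoch = False

    def on_packet_lost(self, sent_time: float, now: float) -> None:
        # A loss of a packet sent at or before the epoch start belongs to the
        # epoch that was already counted, so it causes no further reduction.
        if self.epoch_start is not None and sent_time <= self.epoch_start:
            return
        self.epoch_start = now
        self.in_loss_epoch = True
        self.cwnd = max(self.cwnd // 2, MIN_WINDOW)  # assumed halving

    def on_packet_acked(self, sent_time: float) -> None:
        # The epoch ends when any packet sent after it started is acknowledged.
        if self.epoch_start is not None and sent_time > self.epoch_start:
            self.in_loss_epoch = False
```

In this sketch, two losses among packets sent before the epoch began produce a single reduction, while a loss of a packet sent after the epoch began starts a new epoch and a second reduction.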
3.1.4. No Reneging 3.1.4. No Reneging
QUIC ACKs contain information that is similar to TCP SACK, but QUIC QUIC ACKs contain information that is similar to TCP SACK, but QUIC
does not allow any acked packet to be reneged, greatly simplifying does not allow any acked packet to be reneged, greatly simplifying
implementations on both sides and reducing memory pressure on the implementations on both sides and reducing memory pressure on the
sender. sender.
3.1.5. More ACK Ranges 3.1.5. More ACK Ranges
skipping to change at page 7, line 32 skipping to change at page 7, line 32
received and when the corresponding acknowledgment is sent, allowing received and when the corresponding acknowledgment is sent, allowing
a peer to maintain a more accurate round-trip time estimate (see a peer to maintain a more accurate round-trip time estimate (see
Section 13.2 of [QUIC-TRANSPORT]). Section 13.2 of [QUIC-TRANSPORT]).
4. Estimating the Round-Trip Time 4. Estimating the Round-Trip Time
At a high level, an endpoint measures the time from when a packet was At a high level, an endpoint measures the time from when a packet was
sent to when it is acknowledged as a round-trip time (RTT) sample. sent to when it is acknowledged as a round-trip time (RTT) sample.
The endpoint uses RTT samples and peer-reported host delays (see The endpoint uses RTT samples and peer-reported host delays (see
Section 13.2 of [QUIC-TRANSPORT]) to generate a statistical Section 13.2 of [QUIC-TRANSPORT]) to generate a statistical
description of the connection's RTT. An endpoint computes the description of the network path's RTT. An endpoint computes the
following three values: the minimum value observed over the lifetime following three values for each path: the minimum value observed over
of the connection (min_rtt), an exponentially-weighted moving average the lifetime of the path (min_rtt), an exponentially-weighted moving
(smoothed_rtt), and the variance in the observed RTT samples average (smoothed_rtt), and the mean deviation (referred to as
"variation" in the rest of this document) in the observed RTT samples
(rttvar). (rttvar).
4.1. Generating RTT samples 4.1. Generating RTT samples
An endpoint generates an RTT sample on receiving an ACK frame that An endpoint generates an RTT sample on receiving an ACK frame that
meets the following two conditions: meets the following two conditions:
o the largest acknowledged packet number is newly acknowledged, and o the largest acknowledged packet number is newly acknowledged, and
o at least one of the newly acknowledged packets was ack-eliciting. o at least one of the newly acknowledged packets was ack-eliciting.
The RTT sample, latest_rtt, is generated as the time elapsed since The RTT sample, latest_rtt, is generated as the time elapsed since
the largest acknowledged packet was sent: the largest acknowledged packet was sent:
latest_rtt = ack_time - send_time_of_largest_acked latest_rtt = ack_time - send_time_of_largest_acked
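The two conditions and the latest_rtt formula above can be sketched as a single function. This is an illustrative sketch only: the SentPacket type, the function name, and the convention that the "outstanding" map holds only packets not yet acknowledged (so that presence in the map means "newly acknowledged") are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass
class SentPacket:
    time_sent: float      # send time, in seconds
    ack_eliciting: bool   # whether the packet carried an ack-eliciting frame


def rtt_sample(outstanding: Dict[int, SentPacket],
               acked_numbers: Iterable[int],
               ack_time: float) -> Optional[float]:
    """Return latest_rtt for an ACK frame, or None if no sample is generated."""
    acked = list(acked_numbers)
    if not acked:
        return None
    largest = max(acked)
    # Condition 1: the largest acknowledged packet is newly acknowledged.
    if largest not in outstanding:
        return None
    # Condition 2: at least one newly acknowledged packet was ack-eliciting.
    newly_acked = [pn for pn in acked if pn in outstanding]
    if not any(outstanding[pn].ack_eliciting for pn in newly_acked):
        return None
    return ack_time - outstanding[largest].time_sent
```

A caller would remove the acknowledged packet numbers from the map after taking the sample, so that a later ACK frame covering the same packets yields no second sample.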
An RTT sample is generated using only the largest acknowledged packet An RTT sample is generated using only the largest acknowledged packet
in the received ACK frame. This is because a peer reports host in the received ACK frame. This is because a peer reports ACK delays
delays for only the largest acknowledged packet in an ACK frame. for only the largest acknowledged packet in an ACK frame. While the
While the reported host delay is not used by the RTT sample reported ACK delay is not used by the RTT sample measurement, it is
measurement, it is used to adjust the RTT sample in subsequent used to adjust the RTT sample in subsequent computations of
computations of smoothed_rtt and rttvar (Section 4.3). smoothed_rtt and rttvar (Section 4.3).
To avoid generating multiple RTT samples using the same packet, an To avoid generating multiple RTT samples for a single packet, an ACK
ACK frame SHOULD NOT be used to update RTT estimates if it does not frame SHOULD NOT be used to update RTT estimates if it does not newly
newly acknowledge the largest acknowledged packet. acknowledge the largest acknowledged packet.
An RTT sample MUST NOT be generated on receiving an ACK frame that An RTT sample MUST NOT be generated on receiving an ACK frame that
does not newly acknowledge at least one ack-eliciting packet. A peer does not newly acknowledge at least one ack-eliciting packet. A peer
does not send an ACK frame on receiving only non-ack-eliciting does not send an ACK frame on receiving only non-ack-eliciting
packets, so an ACK frame that is subsequently sent can include an packets, so an ACK frame that is subsequently sent can include an
arbitrarily large Ack Delay field. Ignoring such ACK frames avoids arbitrarily large Ack Delay field. Ignoring such ACK frames avoids
complications in subsequent smoothed_rtt and rttvar computations. complications in subsequent smoothed_rtt and rttvar computations.
A sender might generate multiple RTT samples per RTT when multiple A sender might generate multiple RTT samples per RTT when multiple
ACK frames are received within an RTT. As suggested in [RFC6298], ACK frames are received within an RTT. As suggested in [RFC6298],
doing so might result in inadequate history in smoothed_rtt and doing so might result in inadequate history in smoothed_rtt and
rttvar. Ensuring that RTT estimates retain sufficient history is an rttvar. Ensuring that RTT estimates retain sufficient history is an
open research question. open research question.
4.2. Estimating min_rtt 4.2. Estimating min_rtt
min_rtt is the minimum RTT observed over the lifetime of the min_rtt is the minimum RTT observed for a given network path. min_rtt
connection. min_rtt is set to the latest_rtt on the first sample in a is set to the latest_rtt on the first RTT sample, and to the lesser
connection, and to the lesser of min_rtt and latest_rtt on subsequent of min_rtt and latest_rtt on subsequent samples. In this document,
min_rtt is used by loss detection to reject implausibly small RTT
samples. samples.
An endpoint uses only locally observed times in computing the min_rtt An endpoint uses only locally observed times in computing the min_rtt
and does not adjust for host delays reported by the peer. Doing so and does not adjust for ACK delays reported by the peer. Doing so
allows the endpoint to set a lower bound for the smoothed_rtt based allows the endpoint to set a lower bound for the smoothed_rtt based
entirely on what it observes (see Section 4.3), and limits potential entirely on what it observes (see Section 4.3), and limits potential
underestimation due to erroneously-reported delays by the peer. underestimation due to erroneously-reported delays by the peer.
The RTT for a network path may change over time. If a path's actual
RTT decreases, the min_rtt will adapt immediately on the first low
sample. If the path's actual RTT increases, the min_rtt will not
adapt to it, allowing future RTT samples that are smaller than the
new RTT to be included in smoothed_rtt.
4.3. Estimating smoothed_rtt and rttvar 4.3. Estimating smoothed_rtt and rttvar
smoothed_rtt is an exponentially-weighted moving average of an smoothed_rtt is an exponentially-weighted moving average of an
endpoint's RTT samples, and rttvar is the endpoint's estimated endpoint's RTT samples, and rttvar is the variation in the RTT
variance in the RTT samples. samples, estimated using a mean variation.
The calculation of smoothed_rtt uses path latency after adjusting RTT The calculation of smoothed_rtt uses path latency after adjusting RTT
samples for host delays. For packets sent in the ApplicationData samples for ACK delays. For packets sent in the ApplicationData
packet number space, a peer limits any delay in sending an packet number space, a peer limits any delay in sending an
acknowledgement for an ack-eliciting packet to no greater than the acknowledgement for an ack-eliciting packet to no greater than the
value it advertised in the max_ack_delay transport parameter. value it advertised in the max_ack_delay transport parameter.
Consequently, when a peer reports an Ack Delay that is greater than Consequently, when a peer reports an Ack Delay that is greater than
its max_ack_delay, the delay is attributed to reasons out of the its max_ack_delay, the delay is attributed to reasons out of the
peer's control, such as scheduler latency at the peer or loss of peer's control, such as scheduler latency at the peer or loss of
previous ACK frames. Any delays beyond the peer's max_ack_delay are previous ACK frames. Any delays beyond the peer's max_ack_delay are
therefore considered effectively part of path delay and incorporated therefore considered effectively part of path delay and incorporated
into the smoothed_rtt estimate. into the smoothed_rtt estimate.
When adjusting an RTT sample using peer-reported acknowledgement When adjusting an RTT sample using peer-reported acknowledgement
delays, an endpoint: delays, an endpoint:
o MUST ignore the Ack Delay field of the ACK frame for packets sent o MUST ignore the Ack Delay field of the ACK frame for packets sent
in the Initial and Handshake packet number space. in the Initial and Handshake packet number space.
o MUST use the lesser of the value reported in Ack Delay field of o MUST use the lesser of the value reported in Ack Delay field of
the ACK frame and the peer's max_ack_delay transport parameter. the ACK frame and the peer's max_ack_delay transport parameter.
o MUST NOT apply the adjustment if the resulting RTT sample is o MUST NOT apply the adjustment if the resulting RTT sample is
smaller than the min_rtt. This limits the underestimation that a smaller than the min_rtt. This limits the underestimation that a
misreporting peer can cause to the smoothed_rtt. misreporting peer can cause to the smoothed_rtt.
On the first RTT sample in a connection, the smoothed_rtt is set to On the first RTT sample for a network path, the smoothed_rtt is set
the latest_rtt. to the latest_rtt.
smoothed_rtt and rttvar are computed as follows, similar to smoothed_rtt and rttvar are computed as follows, similar to
[RFC6298]. On the first RTT sample in a connection: [RFC6298]. On the first RTT sample for a network path:
smoothed_rtt = latest_rtt smoothed_rtt = latest_rtt
rttvar = latest_rtt / 2 rttvar = latest_rtt / 2
On subsequent RTT samples, smoothed_rtt and rttvar evolve as follows: On subsequent RTT samples, smoothed_rtt and rttvar evolve as follows:
ack_delay = min(Ack Delay in ACK Frame, max_ack_delay) ack_delay = min(Ack Delay in ACK Frame, max_ack_delay)
adjusted_rtt = latest_rtt adjusted_rtt = latest_rtt
if (min_rtt + ack_delay < latest_rtt): if (min_rtt + ack_delay < latest_rtt):
adjusted_rtt = latest_rtt - ack_delay adjusted_rtt = latest_rtt - ack_delay
smoothed_rtt = 7/8 * smoothed_rtt + 1/8 * adjusted_rtt smoothed_rtt = 7/8 * smoothed_rtt + 1/8 * adjusted_rtt
rttvar_sample = abs(smoothed_rtt - adjusted_rtt) rttvar_sample = abs(smoothed_rtt - adjusted_rtt)
rttvar = 3/4 * rttvar + 1/4 * rttvar_sample rttvar = 3/4 * rttvar + 1/4 * rttvar_sample
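The estimator above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the names RttState and update_rtt are hypothetical, all times are assumed to be in milliseconds, and min_rtt is assumed to be maintained elsewhere. The update order follows the pseudocode as shown (smoothed_rtt is updated before rttvar_sample is taken).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RttState:
    """Hypothetical container for the estimator variables (milliseconds)."""
    smoothed_rtt: Optional[float] = None
    rttvar: Optional[float] = None


def update_rtt(state: RttState, latest_rtt: float, min_rtt: float,
               ack_delay: float, max_ack_delay: float) -> RttState:
    # First RTT sample for a network path: seed both variables.
    if state.smoothed_rtt is None:
        state.smoothed_rtt = latest_rtt
        state.rttvar = latest_rtt / 2
        return state

    # Cap the peer-reported delay at the peer's max_ack_delay.
    ack_delay = min(ack_delay, max_ack_delay)

    # Apply the adjustment only if the result stays above min_rtt.
    adjusted_rtt = latest_rtt
    if min_rtt + ack_delay < latest_rtt:
        adjusted_rtt = latest_rtt - ack_delay

    state.smoothed_rtt = 7 / 8 * state.smoothed_rtt + 1 / 8 * adjusted_rtt
    rttvar_sample = abs(state.smoothed_rtt - adjusted_rtt)
    state.rttvar = 3 / 4 * state.rttvar + 1 / 4 * rttvar_sample
    return state
```

For example, after a first sample of 100 ms, a second sample of 120 ms with a reported Ack Delay of 30 ms (capped to a max_ack_delay of 25 ms) is left unadjusted, because subtracting 25 ms would take it below a min_rtt of 100 ms.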
5. Loss Detection 5. Loss Detection
QUIC senders use both ack information and timeouts to detect lost QUIC senders use acknowledgements to detect lost packets, and a probe
packets, and this section provides a description of these algorithms. timeout (Section 5.2) to ensure acknowledgements are received. This
section provides a description of these algorithms.
If a packet is lost, the QUIC transport needs to recover from that If a packet is lost, the QUIC transport needs to recover from that
loss, such as by retransmitting the data, sending an updated frame, loss, such as by retransmitting the data, sending an updated frame,
or abandoning the frame. For more information, see Section 13.3 of or abandoning the frame. For more information, see Section 13.3 of
[QUIC-TRANSPORT]. [QUIC-TRANSPORT].
5.1. Acknowledgement-based Detection 5.1. Acknowledgement-based Detection
Acknowledgement-based loss detection implements the spirit of TCP's Acknowledgement-based loss detection implements the spirit of TCP's
Fast Retransmit [RFC5681], Early Retransmit [RFC5827], FACK [FACK], Fast Retransmit [RFC5681], Early Retransmit [RFC5827], FACK [FACK],
skipping to change at page 10, line 22 skipping to change at page 10, line 41
A packet is declared lost if it meets all the following conditions: A packet is declared lost if it meets all the following conditions:
o The packet is unacknowledged, in-flight, and was sent prior to an o The packet is unacknowledged, in-flight, and was sent prior to an
acknowledged packet. acknowledged packet.
o Either its packet number is kPacketThreshold smaller than an o Either its packet number is kPacketThreshold smaller than an
acknowledged packet (Section 5.1.1), or it was sent long enough in acknowledged packet (Section 5.1.1), or it was sent long enough in
the past (Section 5.1.2). the past (Section 5.1.2).
The acknowledgement indicates that a packet sent later was delivered, The acknowledgement indicates that a packet sent later was delivered,
while the packet and time thresholds provide some tolerance for and the packet and time thresholds provide some tolerance for packet
packet reordering. reordering.
Spuriously declaring packets as lost leads to unnecessary Spuriously declaring packets as lost leads to unnecessary
retransmissions and may result in degraded performance due to the retransmissions and may result in degraded performance due to the
actions of the congestion controller upon detecting loss. actions of the congestion controller upon detecting loss.
Implementations that detect spurious retransmissions and increase the Implementations that detect spurious retransmissions and increase the
reordering threshold in packets or time MAY choose to start with reordering threshold in packets or time MAY choose to start with
smaller initial reordering thresholds to minimize recovery latency. smaller initial reordering thresholds to minimize recovery latency.
5.1.1. Packet Threshold 5.1.1. Packet Threshold
The RECOMMENDED initial value for the packet reordering threshold The RECOMMENDED initial value for the packet reordering threshold
(kPacketThreshold) is 3, based on best practices for TCP loss (kPacketThreshold) is 3, based on best practices for TCP loss
detection [RFC5681] [RFC6675]. detection [RFC5681] [RFC6675]. Implementations SHOULD NOT use a
packet threshold less than 3, to keep in line with TCP [RFC5681].
Some networks may exhibit higher degrees of reordering, causing a Some networks may exhibit higher degrees of reordering, causing a
sender to detect spurious losses. Implementers MAY use algorithms sender to detect spurious losses. Implementers MAY use algorithms
developed for TCP, such as TCP-NCR [RFC4653], to improve QUIC's developed for TCP, such as TCP-NCR [RFC4653], to improve QUIC's
reordering resilience. reordering resilience.
5.1.2. Time Threshold 5.1.2. Time Threshold
Once a later packet packet within the same packet number space has Once a later packet within the same packet number space has been
been acknowledged, an endpoint SHOULD declare an earlier packet lost acknowledged, an endpoint SHOULD declare an earlier packet lost if it
if it was sent a threshold amount of time in the past. To avoid was sent a threshold amount of time in the past. To avoid declaring
declaring packets as lost too early, this time threshold MUST be set packets as lost too early, this time threshold MUST be set to at
to at least kGranularity. The time threshold is: least kGranularity. The time threshold is:
max(kTimeThreshold * max(smoothed_rtt, latest_rtt), kGranularity)
kTimeThreshold * max(smoothed_rtt, latest_rtt, kGranularity)
If packets sent prior to the largest acknowledged packet cannot yet If packets sent prior to the largest acknowledged packet cannot yet
be declared lost, then a timer SHOULD be set for the remaining time. be declared lost, then a timer SHOULD be set for the remaining time.
Using max(smoothed_rtt, latest_rtt) protects from the two following Using max(smoothed_rtt, latest_rtt) protects from the two following
cases: cases:
o the latest RTT sample is lower than the smoothed RTT, perhaps due o the latest RTT sample is lower than the smoothed RTT, perhaps due
to reordering where the acknowledgement encountered a shorter to reordering where the acknowledgement encountered a shorter
path; path;
o the latest RTT sample is higher than the smoothed RTT, perhaps due o the latest RTT sample is higher than the smoothed RTT, perhaps due
to a sustained increase in the actual RTT, but the smoothed RTT to a sustained increase in the actual RTT, but the smoothed RTT
has not yet caught up. has not yet caught up.
The RECOMMENDED time threshold (kTimeThreshold), expressed as a The RECOMMENDED time threshold (kTimeThreshold), expressed as a
round-trip time multiplier, is 9/8. round-trip time multiplier, is 9/8.
Implementations MAY experiment with absolute thresholds, thresholds Implementations MAY experiment with absolute thresholds, thresholds
from previous connections, adaptive thresholds, or including RTT from previous connections, adaptive thresholds, or including RTT
variance. Smaller thresholds reduce reordering resilience and variation. Smaller thresholds reduce reordering resilience and
increase spurious retransmissions, and larger thresholds increase increase spurious retransmissions, and larger thresholds increase
loss detection delay. loss detection delay.
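The packet and time thresholds of Section 5.1 can be combined in a short sketch. The following Python is illustrative only: detect_lost and loss_delay are hypothetical names, the map of unacknowledged packets is an assumed data structure, kGranularity is assumed to be 1 ms, and the time threshold uses the max(kTimeThreshold * max(smoothed_rtt, latest_rtt), kGranularity) form. Times are in seconds.

```python
from typing import Dict, List, Optional, Tuple

K_PACKET_THRESHOLD = 3      # RECOMMENDED initial value
K_TIME_THRESHOLD = 9 / 8    # RECOMMENDED RTT multiplier
K_GRANULARITY = 0.001       # assumed timer granularity (1 ms)


def loss_delay(smoothed_rtt: float, latest_rtt: float) -> float:
    """Time threshold, never smaller than kGranularity."""
    return max(K_TIME_THRESHOLD * max(smoothed_rtt, latest_rtt),
               K_GRANULARITY)


def detect_lost(unacked: Dict[int, float], largest_acked: int, now: float,
                smoothed_rtt: float, latest_rtt: float
                ) -> Tuple[List[int], Optional[float]]:
    """Return (lost packet numbers, time at which to re-check).

    unacked maps packet number -> send time for in-flight,
    unacknowledged packets in one packet number space.
    """
    delay = loss_delay(smoothed_rtt, latest_rtt)
    lost: List[int] = []
    loss_time: Optional[float] = None
    for pn, sent in sorted(unacked.items()):
        if pn >= largest_acked:
            continue  # not sent prior to an acknowledged packet
        if pn <= largest_acked - K_PACKET_THRESHOLD or sent <= now - delay:
            lost.append(pn)
        else:
            # Cannot be declared lost yet: schedule a timer for the rest.
            candidate = sent + delay
            loss_time = candidate if loss_time is None else min(loss_time, candidate)
    return lost, loss_time
```

The second return value corresponds to the timer that SHOULD be set for the remaining time when earlier packets cannot yet be declared lost.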
5.2. Probe Timeout 5.2. Probe Timeout
A Probe Timeout (PTO) triggers sending one or two probe datagrams A Probe Timeout (PTO) triggers sending one or two probe datagrams
when ack-eliciting packets are not acknowledged within the expected when ack-eliciting packets are not acknowledged within the expected
period of time or the handshake has not been completed. A PTO period of time or the handshake has not been completed. A PTO
enables a connection to recover from loss of tail packets or enables a connection to recover from loss of tail packets or
acknowledgements. The PTO algorithm used in QUIC implements the acknowledgements.
reliability functions of Tail Loss Probe [RACK], RTO [RFC5681] and
F-RTO algorithms for TCP [RFC5682], and the timeout computation is As with loss detection, the probe timeout is per packet number space.
based on TCP's retransmission timeout period [RFC6298]. The PTO algorithm used in QUIC implements the reliability functions
of Tail Loss Probe [RACK], RTO [RFC5681], and F-RTO algorithms for
TCP [RFC5682]. The timeout computation is based on TCP's
retransmission timeout period [RFC6298].
5.2.1. Computing PTO 5.2.1. Computing PTO
When an ack-eliciting packet is transmitted, the sender schedules a When an ack-eliciting packet is transmitted, the sender schedules a
timer for the PTO period as follows: timer for the PTO period as follows:
PTO = smoothed_rtt + max(4*rttvar, kGranularity) + max_ack_delay PTO = smoothed_rtt + max(4*rttvar, kGranularity) + max_ack_delay
kGranularity, smoothed_rtt, rttvar, and max_ack_delay are defined in kGranularity, smoothed_rtt, rttvar, and max_ack_delay are defined in
Appendix A.2 and Appendix A.3. Appendix A.2 and Appendix A.3.
The PTO period is the amount of time that a sender ought to wait for The PTO period is the amount of time that a sender ought to wait for
an acknowledgement of a sent packet. This time period includes the an acknowledgement of a sent packet. This time period includes the
estimated network round-trip time (smoothed_rtt), the variance in the estimated network round-trip time (smoothed_rtt), the variation in the
estimate (4*rttvar), and max_ack_delay, to account for the maximum estimate (4*rttvar), and max_ack_delay, to account for the maximum
time by which a receiver might delay sending an acknowledgement. time by which a receiver might delay sending an acknowledgement.
When the PTO is armed for Initial or Handshake packet number spaces,
the max_ack_delay is 0, as specified in Section 13.2.5 of [QUIC-TRANSPORT].
The PTO value MUST be set to at least kGranularity, to avoid the The PTO value MUST be set to at least kGranularity, to avoid the
timer expiring immediately. timer expiring immediately.
A sender computes its PTO timer every time an ack-eliciting packet is
sent. When ack-eliciting packets are in-flight in multiple packet
number spaces, the timer MUST be set for the packet number space with
the earliest timeout, except for ApplicationData, which MUST be
ignored until the handshake completes; see Section 4.1.1 of
[QUIC-TLS]. Not arming the PTO for ApplicationData prioritizes
completing the handshake and prevents the server from sending a 1-RTT
packet on a PTO before it has the keys to process a 1-RTT
packet.
When a PTO timer expires, the PTO period MUST be set to twice its When a PTO timer expires, the PTO period MUST be set to twice its
current value. This exponential reduction in the sender's rate is current value. This exponential reduction in the sender's rate is
important because the PTOs might be caused by loss of packets or important because consecutive PTOs might be caused by loss of packets
acknowledgements due to severe congestion. The life of a connection or acknowledgements due to severe congestion. Even when there are
that is experiencing consecutive PTOs is limited by the endpoint's ack-eliciting packets in-flight in multiple packet number spaces, the
idle timeout. exponential increase in probe timeout occurs across all spaces to
prevent excess load on the network. For example, a timeout in the
Initial packet number space doubles the length of the timeout in the
Handshake packet number space.
A sender computes its PTO timer every time an ack-eliciting packet is The life of a connection that is experiencing consecutive PTOs is
sent. A sender might choose to optimize this by setting the timer limited by the endpoint's idle timeout.
fewer times if it knows that more ack-eliciting packets will be sent
within a short period of time.
The probe timer is not set if the time threshold (Section 5.1.2) loss The probe timer is not set if the time threshold (Section 5.1.2) loss
detection timer is set. The time threshold loss detection timer is detection timer is set. The time threshold loss detection timer is
expected to both expire earlier than the PTO and be less likely to expected to both expire earlier than the PTO and be less likely to
spuriously retransmit data. spuriously retransmit data.
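A minimal sketch of the PTO computation, assuming times in seconds and a kGranularity of 1 ms. The function names, the pto_count argument (number of consecutive PTO expirations, shared across spaces), and the string labels for packet number spaces are hypothetical and chosen only for illustration.

```python
from typing import Dict, Optional, Tuple

K_GRANULARITY = 0.001  # assumed timer granularity (1 ms)


def pto_period(smoothed_rtt: float, rttvar: float, max_ack_delay: float,
               pto_count: int = 0, space: str = "ApplicationData") -> float:
    """PTO period for one packet number space, with exponential backoff."""
    # max_ack_delay is 0 for the Initial and Handshake spaces.
    if space in ("Initial", "Handshake"):
        max_ack_delay = 0.0
    base = smoothed_rtt + max(4 * rttvar, K_GRANULARITY) + max_ack_delay
    # Each expiration doubles the period; the count applies to all spaces.
    return base * (2 ** pto_count)


def select_pto_space(last_sent: Dict[str, float], periods: Dict[str, float],
                     handshake_complete: bool) -> Optional[Tuple[str, float]]:
    """Pick the space with the earliest timeout.

    last_sent maps a space to the send time of its latest in-flight
    ack-eliciting packet; periods maps a space to its PTO period.
    ApplicationData is ignored until the handshake completes.
    """
    deadlines = {}
    for space, sent in last_sent.items():
        if space == "ApplicationData" and not handshake_complete:
            continue
        deadlines[space] = sent + periods[space]
    if not deadlines:
        return None
    earliest = min(deadlines, key=deadlines.get)
    return earliest, deadlines[earliest]
```

With smoothed_rtt = 100 ms, rttvar = 20 ms, and max_ack_delay = 25 ms, the period is 205 ms for ApplicationData and 180 ms for Handshake; after two consecutive expirations the ApplicationData period becomes 820 ms.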
5.3. Handshakes and New Paths 5.3. Handshakes and New Paths
The initial probe timeout for a new connection or new path SHOULD be The initial probe timeout for a new connection or new path SHOULD be
set to twice the initial RTT. Resumed connections over the same set to twice the initial RTT. Resumed connections over the same
network SHOULD use the previous connection's final smoothed RTT value network SHOULD use the previous connection's final smoothed RTT value
as the resumed connection's initial RTT. If no previous RTT is as the resumed connection's initial RTT. If no previous RTT is
available, the initial RTT SHOULD be set to 500ms, resulting in a 1 available, the initial RTT SHOULD be set to 500ms, resulting in a 1
second initial timeout as recommended in [RFC6298]. second initial timeout as recommended in [RFC6298].
A connection MAY use the delay between sending a PATH_CHALLENGE and A connection MAY use the delay between sending a PATH_CHALLENGE and
receiving a PATH_RESPONSE to seed initial_rtt for a new path, but the receiving a PATH_RESPONSE to set the initial RTT (see kInitialRtt in
delay SHOULD NOT be considered an RTT sample. Appendix A.2) for a new path, but the delay SHOULD NOT be considered
an RTT sample.
Until the server has validated the client's address on the path, the Until the server has validated the client's address on the path, the
amount of data it can send is limited to three times the amount of amount of data it can send is limited to three times the amount of
data received, as specified in Section 8.1 of [QUIC-TRANSPORT]. If data received, as specified in Section 8.1 of [QUIC-TRANSPORT]. If
no data can be sent, then the PTO alarm MUST NOT be armed. no data can be sent, then the PTO alarm MUST NOT be armed until
datagrams have been received from the client.
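The interaction between the anti-amplification limit and the PTO timer can be illustrated with a small sketch. The class AmplificationLimit and its methods are hypothetical; only the factor of three and the rule that the PTO is not armed when nothing can be sent come from the text above.

```python
import math


class AmplificationLimit:
    """Tracks how much a server may send before address validation."""

    def __init__(self) -> None:
        self.bytes_received = 0
        self.bytes_sent = 0
        self.address_validated = False

    def on_datagram_received(self, size: int) -> None:
        self.bytes_received += size

    def on_datagram_sent(self, size: int) -> None:
        self.bytes_sent += size

    def send_budget(self) -> float:
        """Bytes the server may still send on this path."""
        if self.address_validated:
            return math.inf
        return max(0, 3 * self.bytes_received - self.bytes_sent)

    def may_arm_pto(self) -> bool:
        # If no data can be sent, the PTO timer is not armed until
        # more datagrams arrive from the client.
        return self.send_budget() > 0
```

A server that has received one 1200-byte datagram can send 3600 bytes; once it has done so, the PTO stays disarmed until the client sends more data.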
Since the server could be blocked until more packets are received Since the server could be blocked until more packets are received
from the client, it is the client's responsibility to send packets to from the client, it is the client's responsibility to send packets to
unblock the server until it is certain that the server has finished unblock the server until it is certain that the server has finished
its address validation (see Section 8 of [QUIC-TRANSPORT]). That is, its address validation (see Section 8 of [QUIC-TRANSPORT]). That is,
the client MUST set the probe timer if the client has not received an the client MUST set the probe timer if the client has not received an
acknowledgement for one of its Handshake or 1-RTT packets. acknowledgement for one of its Handshake or 1-RTT packets.
Prior to handshake completion, when few or no RTT samples have been Prior to handshake completion, when few or no RTT samples have been
generated, it is possible that the probe timer expiration is due to generated, it is possible that the probe timer expiration is due to
an incorrect RTT estimate at the client. To allow the client to an incorrect RTT estimate at the client. To allow the client to
improve its RTT estimate, the new packet that it sends MUST be ack- improve its RTT estimate, the new packet that it sends MUST be ack-
eliciting. If Handshake keys are available to the client, it MUST eliciting. If Handshake keys are available to the client, it MUST
send a Handshake packet, and otherwise it MUST send an Initial packet send a Handshake packet, and otherwise it MUST send an Initial packet
in a UDP datagram of at least 1200 bytes. in a UDP datagram of at least 1200 bytes.
Initial packets and Handshake packets may never be acknowledged, but Initial packets and Handshake packets could be never acknowledged,
they are removed from bytes in flight when the Initial and Handshake but they are removed from bytes in flight when the Initial and
keys are discarded. Handshake keys are discarded.
5.3.1. Sending Probe Packets 5.3.1. Sending Probe Packets
When a PTO timer expires, a sender MUST send at least one ack- When a PTO timer expires, a sender MUST send at least one ack-
eliciting packet as a probe, unless there is no data available to eliciting packet in the packet number space as a probe, unless there
send. An endpoint MAY send up to two full-sized datagrams containing is no data available to send. An endpoint MAY send up to two full-
ack-eliciting packets, to avoid an expensive consecutive PTO sized datagrams containing ack-eliciting packets, to avoid an
expiration due to a single lost datagram. expensive consecutive PTO expiration due to a single lost datagram or
transmit data from multiple packet number spaces.
In addition to sending data in the packet number space for which the
timer expired, the sender SHOULD send ack-eliciting packets from
other packet number spaces with in-flight data, coalescing packets if
possible.
When the PTO timer expires, and there is new or previously sent When the PTO timer expires, and there is new or previously sent
unacknowledged data, it MUST be sent. Data that was previously sent unacknowledged data, it MUST be sent.
with Initial encryption MUST be sent before Handshake data and data
previously sent at Handshake encryption MUST be sent before any
ApplicationData data.
It is possible the sender has no new or previously-sent data to send. It is possible the sender has no new or previously-sent data to send.
As an example, consider the following sequence of events: new As an example, consider the following sequence of events: new
application data is sent in a STREAM frame, deemed lost, then application data is sent in a STREAM frame, deemed lost, then
retransmitted in a new packet, and then the original transmission is retransmitted in a new packet, and then the original transmission is
acknowledged. When there is no data to send, the sender SHOULD send acknowledged. When there is no data to send, the sender SHOULD send
a PING or other ack-eliciting frame in a single packet, re-arming the a PING or other ack-eliciting frame in a single packet, re-arming the
PTO timer. PTO timer.
Alternatively, instead of sending an ack-eliciting packet, the sender Alternatively, instead of sending an ack-eliciting packet, the sender
skipping to change at page 14, line 49 skipping to change at page 15, line 47
connection state, in particular cryptographic handshake messages, is connection state, in particular cryptographic handshake messages, is
retained; see Section 17.2.5 of [QUIC-TRANSPORT]. retained; see Section 17.2.5 of [QUIC-TRANSPORT].
The client MAY compute an RTT estimate to the server as the time The client MAY compute an RTT estimate to the server as the time
period from when the first Initial was sent to when a Retry or a period from when the first Initial was sent to when a Retry or a
Version Negotiation packet is received. The client MAY use this Version Negotiation packet is received. The client MAY use this
value in place of its default for the initial RTT estimate. value in place of its default for the initial RTT estimate.
5.5. Discarding Keys and Packet State 5.5. Discarding Keys and Packet State
When packet protection keys are discarded (see Section 4.9 of When packet protection keys are discarded (see Section 4.10 of
[QUIC-TLS]), all packets that were sent with those keys can no longer [QUIC-TLS]), all packets that were sent with those keys can no longer
be acknowledged because their acknowledgements cannot be processed be acknowledged because their acknowledgements cannot be processed
anymore. The sender MUST discard all recovery state associated with anymore. The sender MUST discard all recovery state associated with
those packets and MUST remove them from the count of bytes in flight. those packets and MUST remove them from the count of bytes in flight.
Endpoints stop sending and receiving Initial packets once they start Endpoints stop sending and receiving Initial packets once they start
exchanging Handshake packets (see Section 17.2.2.1 of exchanging Handshake packets (see Section 17.2.2.1 of
[QUIC-TRANSPORT]). At this point, recovery state for all in-flight [QUIC-TRANSPORT]). At this point, recovery state for all in-flight
Initial packets is discarded. Initial packets is discarded.
When 0-RTT is rejected, recovery state for all in-flight 0-RTT When 0-RTT is rejected, recovery state for all in-flight 0-RTT
packets is discarded. packets is discarded.
If a server accepts 0-RTT, but does not buffer 0-RTT packets that If a server accepts 0-RTT, but does not buffer 0-RTT packets that
arrive before Initial packets, early 0-RTT packets will be declared arrive before Initial packets, early 0-RTT packets will be declared
lost, but that is expected to be infrequent. lost, but that is expected to be infrequent.
It is expected that keys are discarded after packets encrypted with It is expected that keys are discarded after packets encrypted with
them would be acknowledged or declared lost. Initial secrets however them would be acknowledged or declared lost. Initial secrets however
might be destroyed sooner, as soon as handshake keys are available might be destroyed sooner, as soon as handshake keys are available
(see Section 4.9.1 of [QUIC-TLS]). (see Section 4.10.1 of [QUIC-TLS]).
6. Congestion Control 6. Congestion Control
QUIC's congestion control is based on TCP NewReno [RFC6582]. NewReno This document specifies a Reno congestion controller for QUIC
is a congestion window based congestion control. QUIC specifies the [RFC6582].
congestion window in bytes rather than packets due to finer control
and the ease of appropriate byte counting [RFC3465].
QUIC hosts MUST NOT send packets if they would increase The signals QUIC provides for congestion control are generic and are
bytes_in_flight (defined in Appendix B.2) beyond the available designed to support different algorithms. Endpoints can unilaterally
congestion window, unless the packet is a probe packet sent after a choose a different algorithm to use, such as Cubic [RFC8312].
PTO timer expires, as described in Section 5.2.
Implementations MAY use other congestion control algorithms, such as If an endpoint uses a different controller than that specified in
Cubic [RFC8312], and endpoints MAY use different algorithms from one this document, the chosen controller MUST conform to the congestion
another. The signals QUIC provides for congestion control are control guidelines specified in Section 3.1 of [RFC8085].
generic and are designed to support different algorithms.
The algorithm in this document specifies and uses the controller's
congestion window in bytes.
An endpoint MUST NOT send a packet if it would cause bytes_in_flight
(see Appendix B.2) to be larger than the congestion window, unless
the packet is sent on a PTO timer expiration (see Section 5.2).
6.1. Explicit Congestion Notification 6.1. Explicit Congestion Notification
If a path has been verified to support ECN, QUIC treats a Congestion If a path has been verified to support ECN [RFC3168] [RFC8311], QUIC
Experienced codepoint in the IP header as a signal of congestion. treats a Congestion Experienced (CE) codepoint in the IP header as a
This document specifies an endpoint's response when its peer receives signal of congestion. This document specifies an endpoint's response
packets with the Congestion Experienced codepoint. As discussed in when its peer receives packets with the Congestion Experienced
[RFC8311], endpoints are permitted to experiment with other response codepoint.
functions.
6.2. Slow Start 6.2. Slow Start
QUIC begins every connection in slow start and exits slow start upon QUIC begins every connection in slow start and exits slow start upon
loss or upon increase in the ECN-CE counter. QUIC re-enters slow loss or upon increase in the ECN-CE counter. QUIC re-enters slow
start anytime the congestion window is less than ssthresh, which only start any time the congestion window is less than ssthresh, which
occurs after persistent congestion is declared. While in slow start, only occurs after persistent congestion is declared. While in slow
QUIC increases the congestion window by the number of bytes start, QUIC increases the congestion window by the number of bytes
acknowledged when each acknowledgment is processed. acknowledged when each acknowledgment is processed.
6.3. Congestion Avoidance 6.3. Congestion Avoidance
Slow start exits to congestion avoidance. Congestion avoidance in Slow start exits to congestion avoidance. Congestion avoidance in
NewReno uses an additive increase multiplicative decrease (AIMD) NewReno uses an additive increase multiplicative decrease (AIMD)
approach that increases the congestion window by one maximum packet approach that increases the congestion window by one maximum packet
size per congestion window acknowledged. When a loss is detected, size per congestion window acknowledged. When a loss is detected,
NewReno halves the congestion window and sets the slow start NewReno halves the congestion window and sets the slow start
threshold to the new congestion window. threshold to the new congestion window.
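The window behaviour described in Sections 6, 6.2, and 6.3 can be sketched as follows. This is a simplified illustration: the class RenoSketch, the 1200-byte maximum datagram size, the initial window of ten datagrams, and the minimum window of two datagrams are assumptions made for the example, and recovery-period handling is omitted.

```python
class RenoSketch:
    """Simplified byte-based Reno window; not a complete controller."""

    def __init__(self, max_datagram_size: int = 1200) -> None:
        self.max_datagram_size = max_datagram_size
        self.minimum_window = 2 * max_datagram_size   # assumed value
        self.cwnd = 10.0 * max_datagram_size          # assumed initial window
        self.ssthresh = float("inf")
        self.bytes_in_flight = 0

    def can_send(self, size: int, is_pto_probe: bool = False) -> bool:
        # A packet sent on PTO expiration may exceed the window.
        return is_pto_probe or self.bytes_in_flight + size <= self.cwnd

    def on_ack(self, acked_bytes: int) -> None:
        self.bytes_in_flight -= acked_bytes
        if self.cwnd < self.ssthresh:
            # Slow start: grow by the number of bytes acknowledged.
            self.cwnd += acked_bytes
        else:
            # Congestion avoidance: about one maximum packet size
            # per congestion window acknowledged.
            self.cwnd += self.max_datagram_size * acked_bytes / self.cwnd

    def on_congestion_event(self) -> None:
        # Halve the window and set ssthresh to the new window.
        self.cwnd = max(self.cwnd / 2, self.minimum_window)
        self.ssthresh = self.cwnd
```

Because ssthresh is set to the new window, the controller is in congestion avoidance immediately after a loss; it returns to slow start only if the window later drops below ssthresh.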
skipping to change at page 17, line 40 skipping to change at page 18, line 40
+-----+------------------------+ +-----+------------------------+
| t=1 | Send Pkt #2 (PTO 1) | | t=1 | Send Pkt #2 (PTO 1) |
| | | | | |
| t=3 | Send Pkt #3 (PTO 2) | | t=3 | Send Pkt #3 (PTO 2) |
| | | | | |
| t=7 | Send Pkt #4 (PTO 3) | | t=7 | Send Pkt #4 (PTO 3) |
| | | | | |
| t=8 | Recv ACK of Pkt #4 | | t=8 | Recv ACK of Pkt #4 |
+-----+------------------------+ +-----+------------------------+
The first three packets are determined to be lost when the ACK of The first three packets are determined to be lost when the
packet 4 is received at t=8. The congestion period is calculated as acknowledgement of packet 4 is received at t=8. The congestion period
the time between the oldest and newest lost packets: (3 - 0) = 3. is calculated as the time between the oldest and newest lost packets:
The duration for persistent congestion is equal to: (1 * (3 - 0) = 3. The duration for persistent congestion is equal to: (1
kPersistentCongestionThreshold) = 3. Because the threshold was * kPersistentCongestionThreshold) = 3. Because the threshold was
reached and because none of the packets between the oldest and the reached and because none of the packets between the oldest and the
newest packets are acknowledged, the network is considered to have newest packets are acknowledged, the network is considered to have
experienced persistent congestion. experienced persistent congestion.
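The arithmetic of this example can be checked with a short sketch. The function name is hypothetical, the value 3 for kPersistentCongestionThreshold is the one implied by the example, and the duration is simplified to pto * kPersistentCongestionThreshold with a PTO of 1 time unit, as in the example. The sketch assumes the caller has already confirmed that no packet between the oldest and newest lost packets was acknowledged.

```python
from typing import Sequence

K_PERSISTENT_CONGESTION_THRESHOLD = 3  # value implied by the example


def is_persistent_congestion(lost_send_times: Sequence[float],
                             pto: float) -> bool:
    """Compare the span of lost packets against the required duration.

    lost_send_times holds the send times of the lost packets; all
    packets sent between the oldest and newest are assumed lost.
    """
    if not lost_send_times:
        return False
    congestion_period = max(lost_send_times) - min(lost_send_times)
    duration = pto * K_PERSISTENT_CONGESTION_THRESHOLD
    return congestion_period >= duration
```

For the table above, the lost packets were sent at t=0, t=1, and t=3, so the period is 3 and the threshold is reached.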
When persistent congestion is established, the sender's congestion When persistent congestion is established, the sender's congestion
window MUST be reduced to the minimum congestion window window MUST be reduced to the minimum congestion window
(kMinimumWindow). This response of collapsing the congestion window (kMinimumWindow). This response of collapsing the congestion window
on persistent congestion is functionally similar to a sender's on persistent congestion is functionally similar to a sender's
response on a Retransmission Timeout (RTO) in TCP [RFC5681] after response on a Retransmission Timeout (RTO) in TCP [RFC5681] after
Tail Loss Probes (TLP) [RACK]. Tail Loss Probes (TLP) [RACK].
skipping to change at page 19, line 37 skipping to change at page 20, line 37
frames to reduce leaked information. frames to reduce leaked information.
7.3. Misreporting ECN Markings 7.3. Misreporting ECN Markings
A receiver can misreport ECN markings to alter the congestion A receiver can misreport ECN markings to alter the congestion
response of a sender. Suppressing reports of ECN-CE markings could response of a sender. Suppressing reports of ECN-CE markings could
cause a sender to increase their send rate. This increase could cause a sender to increase their send rate. This increase could
result in congestion and loss. result in congestion and loss.
A sender MAY attempt to detect suppression of reports by marking A sender MAY attempt to detect suppression of reports by marking
occasional packets that they send with ECN-CE. If a packet marked occasional packets that they send with ECN-CE. If a packet sent with
with ECN-CE is not reported as having been marked when the packet is ECN-CE is not reported as having been CE marked when the packet is
acknowledged, the sender SHOULD then disable ECN for that path. acknowledged, then the sender SHOULD disable ECN for that path.
Reporting additional ECN-CE markings will cause a sender to reduce Reporting additional ECN-CE markings will cause a sender to reduce
their sending rate, which is similar in effect to advertising reduced their sending rate, which is similar in effect to advertising reduced
connection flow control limits and so no advantage is gained by doing connection flow control limits and so no advantage is gained by doing
so. so.
Endpoints choose the congestion controller that they use. Though Endpoints choose the congestion controller that they use. Though
congestion controllers generally treat reports of ECN-CE markings as congestion controllers generally treat reports of ECN-CE markings as
equivalent to loss [RFC8311], the exact response for each controller equivalent to loss [RFC8311], the exact response for each controller
could be different. Failure to correctly respond to information could be different. Failure to correctly respond to information
skipping to change at page 20, line 15 skipping to change at page 21, line 15
8. IANA Considerations 8. IANA Considerations
This document has no IANA actions. Yet. This document has no IANA actions. Yet.
9. References 9. References
9.1. Normative References 9.1. Normative References
[QUIC-TLS] [QUIC-TLS]
Thomson, M., Ed. and S. Turner, Ed., "Using TLS to Secure Thomson, M., Ed. and S. Turner, Ed., "Using TLS to Secure
QUIC", draft-ietf-quic-tls-24 (work in progress). QUIC", draft-ietf-quic-tls-latest (work in progress).
[QUIC-TRANSPORT] [QUIC-TRANSPORT]
Iyengar, J., Ed. and M. Thomson, Ed., "QUIC: A UDP-Based Iyengar, J., Ed. and M. Thomson, Ed., "QUIC: A UDP-Based
Multiplexed and Secure Transport", draft-ietf-quic- Multiplexed and Secure Transport", draft-ietf-quic-
transport-24 (work in progress). transport-latest (work in progress).
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, Requirement Levels", BCP 14, RFC 2119,
DOI 10.17487/RFC2119, March 1997, DOI 10.17487/RFC2119, March 1997,
<https://www.rfc-editor.org/info/rfc2119>. <https://www.rfc-editor.org/info/rfc2119>.
[RFC8085] Eggert, L., Fairhurst, G., and G. Shepherd, "UDP Usage
Guidelines", BCP 145, RFC 8085, DOI 10.17487/RFC8085,
March 2017, <https://www.rfc-editor.org/info/rfc8085>.
[RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
May 2017, <https://www.rfc-editor.org/info/rfc8174>. May 2017, <https://www.rfc-editor.org/info/rfc8174>.
[RFC8311] Black, D., "Relaxing Restrictions on Explicit Congestion
Notification (ECN) Experimentation", RFC 8311,
DOI 10.17487/RFC8311, January 2018,
<https://www.rfc-editor.org/info/rfc8311>.
9.2. Informative References 9.2. Informative References
[FACK] Mathis, M. and J. Mahdavi, "Forward Acknowledgement: [FACK] Mathis, M. and J. Mahdavi, "Forward Acknowledgement:
Refining TCP Congestion Control", ACM SIGCOMM, August Refining TCP Congestion Control", ACM SIGCOMM, August
1996. 1996.
[RACK] Cheng, Y., Cardwell, N., Dukkipati, N., and P. Jha, "RACK: [RACK] Cheng, Y., Cardwell, N., Dukkipati, N., and P. Jha, "RACK:
a time-based fast loss detection algorithm for TCP", a time-based fast loss detection algorithm for TCP",
draft-ietf-tcpm-rack-06 (work in progress), November 2019. draft-ietf-tcpm-rack-06 (work in progress), November 2019.
[RFC3465] Allman, M., "TCP Congestion Control with Appropriate Byte [RFC3168] Ramakrishnan, K., Floyd, S., and D. Black, "The Addition
Counting (ABC)", RFC 3465, DOI 10.17487/RFC3465, February of Explicit Congestion Notification (ECN) to IP",
2003, <https://www.rfc-editor.org/info/rfc3465>. RFC 3168, DOI 10.17487/RFC3168, September 2001,
<https://www.rfc-editor.org/info/rfc3168>.
[RFC4653] Bhandarkar, S., Reddy, A., Allman, M., and E. Blanton, [RFC4653] Bhandarkar, S., Reddy, A., Allman, M., and E. Blanton,
"Improving the Robustness of TCP to Non-Congestion "Improving the Robustness of TCP to Non-Congestion
Events", RFC 4653, DOI 10.17487/RFC4653, August 2006, Events", RFC 4653, DOI 10.17487/RFC4653, August 2006,
<https://www.rfc-editor.org/info/rfc4653>. <https://www.rfc-editor.org/info/rfc4653>.
[RFC5681] Allman, M., Paxson, V., and E. Blanton, "TCP Congestion [RFC5681] Allman, M., Paxson, V., and E. Blanton, "TCP Congestion
Control", RFC 5681, DOI 10.17487/RFC5681, September 2009, Control", RFC 5681, DOI 10.17487/RFC5681, September 2009,
<https://www.rfc-editor.org/info/rfc5681>. <https://www.rfc-editor.org/info/rfc5681>.
skipping to change at page 22, line 5 skipping to change at page 23, line 5
[RFC6928] Chu, J., Dukkipati, N., Cheng, Y., and M. Mathis, [RFC6928] Chu, J., Dukkipati, N., Cheng, Y., and M. Mathis,
"Increasing TCP's Initial Window", RFC 6928, "Increasing TCP's Initial Window", RFC 6928,
DOI 10.17487/RFC6928, April 2013, DOI 10.17487/RFC6928, April 2013,
<https://www.rfc-editor.org/info/rfc6928>. <https://www.rfc-editor.org/info/rfc6928>.
[RFC7661] Fairhurst, G., Sathiaseelan, A., and R. Secchi, "Updating [RFC7661] Fairhurst, G., Sathiaseelan, A., and R. Secchi, "Updating
TCP to Support Rate-Limited Traffic", RFC 7661, TCP to Support Rate-Limited Traffic", RFC 7661,
DOI 10.17487/RFC7661, October 2015, DOI 10.17487/RFC7661, October 2015,
<https://www.rfc-editor.org/info/rfc7661>. <https://www.rfc-editor.org/info/rfc7661>.
[RFC8311] Black, D., "Relaxing Restrictions on Explicit Congestion
Notification (ECN) Experimentation", RFC 8311,
DOI 10.17487/RFC8311, January 2018,
<https://www.rfc-editor.org/info/rfc8311>.
[RFC8312] Rhee, I., Xu, L., Ha, S., Zimmermann, A., Eggert, L., and [RFC8312] Rhee, I., Xu, L., Ha, S., Zimmermann, A., Eggert, L., and
R. Scheffenegger, "CUBIC for Fast Long-Distance Networks", R. Scheffenegger, "CUBIC for Fast Long-Distance Networks",
RFC 8312, DOI 10.17487/RFC8312, February 2018, RFC 8312, DOI 10.17487/RFC8312, February 2018,
<https://www.rfc-editor.org/info/rfc8312>. <https://www.rfc-editor.org/info/rfc8312>.
9.3. URIs 9.3. URIs
[1] https://mailarchive.ietf.org/arch/search/?email_list=quic [1] https://mailarchive.ietf.org/arch/search/?email_list=quic
[2] https://github.com/quicwg [2] https://github.com/quicwg
skipping to change at page 23, line 10 skipping to change at page 24, line 16
towards bytes in flight. towards bytes in flight.
sent_bytes: The number of bytes sent in the packet, not including sent_bytes: The number of bytes sent in the packet, not including
UDP or IP overhead, but including QUIC framing overhead. UDP or IP overhead, but including QUIC framing overhead.
time_sent: The time the packet was sent. time_sent: The time the packet was sent.
A.2. Constants of interest A.2. Constants of interest
Constants used in loss recovery are based on a combination of RFCs, Constants used in loss recovery are based on a combination of RFCs,
papers, and common practice. Some may need to be changed or papers, and common practice.
negotiated in order to better suit a variety of environments.
kPacketThreshold: Maximum reordering in packets before packet kPacketThreshold: Maximum reordering in packets before packet
threshold loss detection considers a packet lost. The RECOMMENDED threshold loss detection considers a packet lost. The RECOMMENDED
value is 3. value is 3.
kTimeThreshold: Maximum reordering in time before time threshold kTimeThreshold: Maximum reordering in time before time threshold
loss detection considers a packet lost. Specified as an RTT loss detection considers a packet lost. Specified as an RTT
multiplier. The RECOMMENDED value is 9/8. multiplier. The RECOMMENDED value is 9/8.
kGranularity: Timer granularity. This is a system-dependent value. kGranularity: Timer granularity. This is a system-dependent value.
skipping to change at page 23, line 47 skipping to change at page 24, line 52
Variables required to implement the congestion control mechanisms are Variables required to implement the congestion control mechanisms are
described in this section. described in this section.
latest_rtt: The most recent RTT measurement made when receiving an latest_rtt: The most recent RTT measurement made when receiving an
ack for a previously unacked packet. ack for a previously unacked packet.
smoothed_rtt: The smoothed RTT of the connection, computed as smoothed_rtt: The smoothed RTT of the connection, computed as
described in [RFC6298]. described in [RFC6298].
rttvar: The RTT variance, computed as described in [RFC6298]. rttvar: The RTT variation, computed as described in [RFC6298].
min_rtt: The minimum RTT seen in the connection, ignoring ack delay. min_rtt: The minimum RTT seen in the connection, ignoring ack delay.
max_ack_delay: The maximum amount of time by which the receiver max_ack_delay: The maximum amount of time by which the receiver
intends to delay acknowledgments for packets in the intends to delay acknowledgments for packets in the
ApplicationData packet number space. The actual ack_delay in a ApplicationData packet number space. The actual ack_delay in a
received ACK frame may be larger due to late timers, reordering, received ACK frame may be larger due to late timers, reordering,
or lost ACKs. or lost ACK frames.
loss_detection_timer: Multi-modal timer used for loss detection. loss_detection_timer: Multi-modal timer used for loss detection.
pto_count: The number of times a PTO has been sent without receiving pto_count: The number of times a PTO has been sent without receiving
an ack. an ack.
time_of_last_sent_ack_eliciting_packet: The time the most recent time_of_last_sent_ack_eliciting_packet[kPacketNumberSpace]: The time
ack-eliciting packet was sent. the most recent ack-eliciting packet was sent.
largest_acked_packet[kPacketNumberSpace]: The largest packet number largest_acked_packet[kPacketNumberSpace]: The largest packet number
acknowledged in the packet number space so far. acknowledged in the packet number space so far.
loss_time[kPacketNumberSpace]: The time at which the next packet in loss_time[kPacketNumberSpace]: The time at which the next packet in
that packet number space will be considered lost based on that packet number space will be considered lost based on
exceeding the reordering window in time. exceeding the reordering window in time.
sent_packets[kPacketNumberSpace]: An association of packet numbers sent_packets[kPacketNumberSpace]: An association of packet numbers
in a packet number space to information about them. Described in in a packet number space to information about them. Described in
skipping to change at page 24, line 39 skipping to change at page 25, line 43
At the beginning of the connection, initialize the loss detection At the beginning of the connection, initialize the loss detection
variables as follows: variables as follows:
loss_detection_timer.reset() loss_detection_timer.reset()
pto_count = 0 pto_count = 0
latest_rtt = 0 latest_rtt = 0
smoothed_rtt = 0 smoothed_rtt = 0
rttvar = 0 rttvar = 0
min_rtt = 0 min_rtt = 0
max_ack_delay = 0 max_ack_delay = 0
time_of_last_sent_ack_eliciting_packet = 0
for pn_space in [ Initial, Handshake, ApplicationData ]: for pn_space in [ Initial, Handshake, ApplicationData ]:
largest_acked_packet[pn_space] = infinite largest_acked_packet[pn_space] = infinite
time_of_last_sent_ack_eliciting_packet[pn_space] = 0
loss_time[pn_space] = 0 loss_time[pn_space] = 0
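The per-packet-number-space initialization on the new side of this change can be sketched in Python as follows. This is only an illustration under stated assumptions: the names PnSpace and LossDetectionState are hypothetical, times are modeled as floating-point seconds, and "infinite" is modeled as math.inf. The loss detection timer itself is omitted.

```python
import math
from dataclasses import dataclass, field
from enum import Enum


class PnSpace(Enum):
    """Hypothetical enum for the three packet number spaces."""
    INITIAL = 0
    HANDSHAKE = 1
    APPLICATION_DATA = 2


def _per_space(value):
    # Build a default factory so each instance gets its own dict.
    return lambda: {space: value for space in PnSpace}


@dataclass
class LossDetectionState:
    """Hypothetical container for the loss detection variables."""
    pto_count: int = 0
    latest_rtt: float = 0.0
    smoothed_rtt: float = 0.0
    rttvar: float = 0.0
    min_rtt: float = 0.0
    max_ack_delay: float = 0.0
    # Per-space variables, mirroring the loop over pn_space.
    largest_acked_packet: dict = field(default_factory=_per_space(math.inf))
    time_of_last_sent_ack_eliciting_packet: dict = field(
        default_factory=_per_space(0.0))
    loss_time: dict = field(default_factory=_per_space(0.0))


state = LossDetectionState()
```

Keeping time_of_last_sent_ack_eliciting_packet as a per-space map, rather than a single scalar, is what allows the PTO to be armed separately for each packet number space.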
A.5. On Sending a Packet A.5. On Sending a Packet
After a packet is sent, information about the packet is stored. The After a packet is sent, information about the packet is stored. The
parameters to OnPacketSent are described in detail above in parameters to OnPacketSent are described in detail above in
Appendix A.1.1. Appendix A.1.1.
Pseudocode for OnPacketSent follows: Pseudocode for OnPacketSent follows:
OnPacketSent(packet_number, pn_space, ack_eliciting, OnPacketSent(packet_number, pn_space, ack_eliciting,
in_flight, sent_bytes): in_flight, sent_bytes):
sent_packets[pn_space][packet_number].packet_number = sent_packets[pn_space][packet_number].packet_number =
packet_number packet_number
sent_packets[pn_space][packet_number].time_sent = now sent_packets[pn_space][packet_number].time_sent = now
sent_packets[pn_space][packet_number].ack_eliciting = sent_packets[pn_space][packet_number].ack_eliciting =
ack_eliciting ack_eliciting
sent_packets[pn_space][packet_number].in_flight = in_flight sent_packets[pn_space][packet_number].in_flight = in_flight
if (in_flight): if (in_flight):
if (ack_eliciting): if (ack_eliciting):
time_of_last_sent_ack_eliciting_packet = now time_of_last_sent_ack_eliciting_packet[pn_space] = now
OnPacketSentCC(sent_bytes) OnPacketSentCC(sent_bytes)
sent_packets[pn_space][packet_number].size = sent_bytes sent_packets[pn_space][packet_number].size = sent_bytes
SetLossDetectionTimer() SetLossDetectionTimer()
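A minimal Python sketch of the new-side OnPacketSent is given below, assuming a hypothetical Sender class. The congestion control hook and the timer hook are stand-ins: on_packet_sent_cc is assumed to simply add to bytes_in_flight, and set_loss_detection_timer only counts how often it is called. The injectable clock is also an assumption made for testability.

```python
import time
from dataclasses import dataclass

SPACES = ("Initial", "Handshake", "ApplicationData")


@dataclass
class SentPacket:
    packet_number: int
    time_sent: float
    ack_eliciting: bool
    in_flight: bool
    size: int = 0


class Sender:
    """Hypothetical holder for the state touched by OnPacketSent."""

    def __init__(self, clock=time.monotonic):
        self.clock = clock
        self.sent_packets = {s: {} for s in SPACES}
        self.time_of_last_sent_ack_eliciting_packet = {
            s: 0.0 for s in SPACES}
        self.bytes_in_flight = 0
        self.timer_updates = 0

    def on_packet_sent_cc(self, sent_bytes):
        # Stand-in for OnPacketSentCC (assumed behavior).
        self.bytes_in_flight += sent_bytes

    def set_loss_detection_timer(self):
        # Stand-in for SetLossDetectionTimer; only counts calls.
        self.timer_updates += 1

    def on_packet_sent(self, packet_number, pn_space, ack_eliciting,
                       in_flight, sent_bytes):
        now = self.clock()
        packet = SentPacket(packet_number, now, ack_eliciting, in_flight)
        self.sent_packets[pn_space][packet_number] = packet
        if in_flight:
            if ack_eliciting:
                # Tracked per packet number space on the new side.
                self.time_of_last_sent_ack_eliciting_packet[pn_space] = now
            self.on_packet_sent_cc(sent_bytes)
            packet.size = sent_bytes
            self.set_loss_detection_timer()
```

With this shape, sending an ack-eliciting Handshake packet updates only the Handshake entry of time_of_last_sent_ack_eliciting_packet and leaves the other spaces untouched.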
A.6. On Receiving an Acknowledgment A.6. On Receiving an Acknowledgment
When an ACK frame is received, it may newly acknowledge any number of When an ACK frame is received, it may newly acknowledge any number of
packets. packets.
Pseudocode for OnAckReceived and UpdateRtt follow: Pseudocode for OnAckReceived and UpdateRtt follow:
skipping to change at page 28, line 5 skipping to change at page 29, line 5
which is set in the packet and timer events further below. The which is set in the packet and timer events further below. The
function SetLossDetectionTimer defined below shows how the single function SetLossDetectionTimer defined below shows how the single
timer is set. timer is set.
This algorithm may result in the timer being set in the past, This algorithm may result in the timer being set in the past,
particularly if timers wake up late. Timers set in the past SHOULD particularly if timers wake up late. Timers set in the past SHOULD
fire immediately. fire immediately.
Pseudocode for SetLossDetectionTimer follows: Pseudocode for SetLossDetectionTimer follows:
// Returns the earliest loss_time and the packet number GetEarliestTimeAndSpace(times):
// space it's from. Returns 0 if all times are 0. time = times[Initial]
GetEarliestLossTime():
time = loss_time[Initial]
space = Initial space = Initial
for pn_space in [ Handshake, ApplicationData ]: for pn_space in [ Handshake, ApplicationData ]:
if (loss_time[pn_space] != 0 && if (times[pn_space] != 0 &&
(time == 0 || loss_time[pn_space] < time)): (time == 0 || times[pn_space] < time) &&
time = loss_time[pn_space]; # Skip ApplicationData until handshake completion.
(pn_space != ApplicationData ||
IsHandshakeComplete())):
time = times[pn_space];
space = pn_space space = pn_space
return time, space return time, space
PeerNotAwaitingAddressValidation(): PeerNotAwaitingAddressValidation():
# Assume clients validate the server's address implicitly. # Assume clients validate the server's address implicitly.
if (endpoint is server): if (endpoint is server):
return true return true
# Servers complete address validation when a # Servers complete address validation when a
# protected packet is received. # protected packet is received.
return has received Handshake ACK || return has received Handshake ACK ||
has received 1-RTT ACK has received 1-RTT ACK
SetLossDetectionTimer(): SetLossDetectionTimer():
loss_time, _ = GetEarliestLossTime() earliest_loss_time, _ = GetEarliestTimeAndSpace(loss_time)
if (loss_time != 0): if (earliest_loss_time != 0):
// Time threshold loss detection. // Time threshold loss detection.
loss_detection_timer.update(loss_time) loss_detection_timer.update(earliest_loss_time)
return return
if (no ack-eliciting packets in flight && if (no ack-eliciting packets in flight &&
PeerNotAwaitingAddressValidation()): PeerNotAwaitingAddressValidation()):
loss_detection_timer.cancel() loss_detection_timer.cancel()
return return
// Use a default timeout if there are no RTT measurements // Use a default timeout if there are no RTT measurements
if (smoothed_rtt == 0): if (smoothed_rtt == 0):
timeout = 2 * kInitialRtt timeout = 2 * kInitialRtt
else: else:
// Calculate PTO duration // Calculate PTO duration
timeout = smoothed_rtt + max(4 * rttvar, kGranularity) + timeout = smoothed_rtt + max(4 * rttvar, kGranularity) +
max_ack_delay max_ack_delay
timeout = timeout * (2 ^ pto_count) timeout = timeout * (2 ^ pto_count)
loss_detection_timer.update( sent_time, _ = GetEarliestTimeAndSpace(
time_of_last_sent_ack_eliciting_packet + timeout) time_of_last_sent_ack_eliciting_packet)
loss_detection_timer.update(sent_time + timeout)
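The two building blocks of the new-side SetLossDetectionTimer, the earliest-time selection and the PTO duration, can be sketched in Python as below. This is a sketch under assumptions: times are seconds as floats, the handshake state is passed in as a boolean instead of calling IsHandshakeComplete(), and the values chosen for K_INITIAL_RTT and K_GRANULARITY are illustrative placeholders rather than values taken from this section.

```python
K_INITIAL_RTT = 0.5    # seconds; illustrative placeholder
K_GRANULARITY = 0.001  # seconds; illustrative placeholder


def get_earliest_time_and_space(times, handshake_complete):
    """Return the earliest non-zero time and its packet number space.

    Returns (0, "Initial") if no eligible time is set.
    """
    time, space = times["Initial"], "Initial"
    for pn_space in ("Handshake", "ApplicationData"):
        if (times[pn_space] != 0
                and (time == 0 or times[pn_space] < time)
                # Skip ApplicationData until handshake completion.
                and (pn_space != "ApplicationData" or handshake_complete)):
            time, space = times[pn_space], pn_space
    return time, space


def pto_timeout(smoothed_rtt, rttvar, max_ack_delay, pto_count):
    """Compute the PTO duration, including exponential backoff."""
    if smoothed_rtt == 0:
        # No RTT measurements yet: use a default timeout.
        timeout = 2 * K_INITIAL_RTT
    else:
        timeout = (smoothed_rtt + max(4 * rttvar, K_GRANULARITY)
                   + max_ack_delay)
    return timeout * (2 ** pto_count)


last_sent = {"Initial": 0, "Handshake": 3.0, "ApplicationData": 1.0}
sent_time, _ = get_earliest_time_and_space(last_sent, False)
deadline = sent_time + pto_timeout(0.1, 0.01, 0.025, 0)
```

In the usage lines, the ApplicationData entry is earlier but is ignored because the handshake is not complete, so the deadline is computed from the Handshake space.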
A.9. On Timeout A.9. On Timeout
When the loss detection timer expires, the timer's mode determines When the loss detection timer expires, the timer's mode determines
the action to be performed. the action to be performed.
Pseudocode for OnLossDetectionTimeout follows: Pseudocode for OnLossDetectionTimeout follows:
OnLossDetectionTimeout(): OnLossDetectionTimeout():
loss_time, pn_space = GetEarliestLossTime() earliest_loss_time, pn_space =
if (loss_time != 0): GetEarliestTimeAndSpace(loss_time)
if (earliest_loss_time != 0):
// Time threshold loss detection // Time threshold loss detection
DetectLostPackets(pn_space) DetectLostPackets(pn_space)
SetLossDetectionTimer() SetLossDetectionTimer()
return return
if (endpoint is client without 1-RTT keys): if (endpoint is client without 1-RTT keys):
// Client sends an anti-deadlock packet: Initial is padded // Client sends an anti-deadlock packet: Initial is padded
// to earn more anti-amplification credit, // to earn more anti-amplification credit,
// a Handshake packet proves address ownership. // a Handshake packet proves address ownership.
if (has Handshake keys): if (has Handshake keys):
SendOneAckElicitingHandshakePacket() SendOneAckElicitingHandshakePacket()
else: else:
SendOneAckElicitingPaddedInitialPacket() SendOneAckElicitingPaddedInitialPacket()
else: else:
// PTO. Send new data if available, else retransmit old data. // PTO. Send new data if available, else retransmit old data.
// If neither is available, send a single PING frame. // If neither is available, send a single PING frame.
SendOneOrTwoAckElicitingPackets() _, pn_space = GetEarliestTimeAndSpace(
time_of_last_sent_ack_eliciting_packet)
SendOneOrTwoAckElicitingPackets(pn_space)
pto_count++ pto_count++
SetLossDetectionTimer() SetLossDetectionTimer()
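The decision logic of the new-side OnLossDetectionTimeout can be sketched as a pure Python function that returns what to do instead of doing it. The action strings, the keyword parameters, and the returned tuple are all hypothetical; a real implementation would call DetectLostPackets or send packets directly and then re-arm the timer.

```python
def get_earliest_time_and_space(times, handshake_complete):
    """Earliest non-zero time and its space; (0, "Initial") if none."""
    time, space = times["Initial"], "Initial"
    for pn_space in ("Handshake", "ApplicationData"):
        if (times[pn_space] != 0
                and (time == 0 or times[pn_space] < time)
                and (pn_space != "ApplicationData" or handshake_complete)):
            time, space = times[pn_space], pn_space
    return time, space


def on_loss_detection_timeout(loss_time, last_sent, pto_count, *,
                              is_client, has_1rtt_keys,
                              has_handshake_keys, handshake_complete):
    """Return (action, pn_space, new_pto_count) for a timer expiry."""
    earliest_loss_time, pn_space = get_earliest_time_and_space(
        loss_time, handshake_complete)
    if earliest_loss_time != 0:
        # Time threshold loss detection; pto_count is unchanged.
        return "detect_lost_packets", pn_space, pto_count
    if is_client and not has_1rtt_keys:
        # Anti-deadlock packet from a client without 1-RTT keys.
        if has_handshake_keys:
            action, pn_space = "send_handshake_packet", "Handshake"
        else:
            action, pn_space = "send_padded_initial_packet", "Initial"
    else:
        # PTO: probe in the space with the earliest last-sent time.
        _, pn_space = get_earliest_time_and_space(
            last_sent, handshake_complete)
        action = "send_pto_probes"
    return action, pn_space, pto_count + 1
```

Returning the chosen packet number space alongside the action mirrors the new side of the change, where SendOneOrTwoAckElicitingPackets takes pn_space as an argument.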
A.10. Detecting Lost Packets A.10. Detecting Lost Packets
DetectLostPackets is called every time an ACK is received and DetectLostPackets is called every time an ACK is received and
operates on the sent_packets for that packet number space. operates on the sent_packets for that packet number space.
Pseudocode for DetectLostPackets follows: Pseudocode for DetectLostPackets follows:
skipping to change at page 30, line 48 skipping to change at page 31, line 48
OnPacketsLost(lost_packets) OnPacketsLost(lost_packets)
Appendix B. Congestion Control Pseudocode Appendix B. Congestion Control Pseudocode
We now describe an example implementation of the congestion We now describe an example implementation of the congestion
controller described in Section 6. controller described in Section 6.
B.1. Constants of interest B.1. Constants of interest
Constants used in congestion control are based on a combination of Constants used in congestion control are based on a combination of
RFCs, papers, and common practice. Some may need to be changed or RFCs, papers, and common practice.
negotiated in order to better suit a variety of environments.
kInitialWindow: Default limit on the initial amount of data in kInitialWindow: Default limit on the initial amount of data in
flight, in bytes. Taken from [RFC6928], but increased slightly to flight, in bytes. The RECOMMENDED value is the minimum of 10 *
account for the smaller 8 byte overhead of UDP vs 20 bytes for max_datagram_size and max(2 * max_datagram_size, 14720). This
TCP. The RECOMMENDED value is the minimum of 10 * follows the analysis and recommendations in [RFC6928], increasing
max_datagram_size and max(2 * max_datagram_size, 14720). the byte limit to account for the smaller 8 byte overhead of UDP
compared to the 20 byte overhead for TCP.
kMinimumWindow: Minimum congestion window in bytes. The RECOMMENDED kMinimumWindow: Minimum congestion window in bytes. The RECOMMENDED
value is 2 * max_datagram_size. value is 2 * max_datagram_size.
kLossReductionFactor: Reduction in congestion window when a new loss kLossReductionFactor: Reduction in congestion window when a new loss
event is detected. The RECOMMENDED value is 0.5. event is detected. The RECOMMENDED value is 0.5.
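As a worked illustration of the three constants above, the following Python sketch evaluates the RECOMMENDED formulas for a given max_datagram_size. The function names are hypothetical, and window_after_loss assumes the reduced window is floored at kMinimumWindow and truncated to whole bytes; that combination is an assumption of this sketch, not something stated in the definitions above.

```python
K_LOSS_REDUCTION_FACTOR = 0.5


def initial_window(max_datagram_size):
    """kInitialWindow: min(10 * MDS, max(2 * MDS, 14720)) bytes."""
    return min(10 * max_datagram_size,
               max(2 * max_datagram_size, 14720))


def minimum_window(max_datagram_size):
    """kMinimumWindow: 2 * max_datagram_size bytes."""
    return 2 * max_datagram_size


def window_after_loss(congestion_window, max_datagram_size):
    """Apply kLossReductionFactor, floored at kMinimumWindow.

    The floor and the truncation to an integer are assumptions
    of this sketch.
    """
    reduced = int(congestion_window * K_LOSS_REDUCTION_FACTOR)
    return max(reduced, minimum_window(max_datagram_size))


for mds in (1200, 1472, 9000):
    print(mds, initial_window(mds), minimum_window(mds))
```

For a 1200-byte datagram size the 10-packet term governs (12000 bytes); at 1472 bytes both terms equal 14720; for large datagrams such as 9000 bytes the window falls back to two datagrams (18000 bytes).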
kPersistentCongestionThreshold: Period of time for persistent kPersistentCongestionThreshold: Period of time for persistent
congestion to be established, specified as a PTO multiplier. The congestion to be established, specified as a PTO multiplier. The
rationale for this threshold is to enable a sender to use initial rationale for this threshold is to enable a sender to use initial
skipping to change at page 33, line 14 skipping to change at page 34, line 14
InCongestionRecovery(sent_time): InCongestionRecovery(sent_time):
return sent_time <= congestion_recovery_start_time return sent_time <= congestion_recovery_start_time
OnPacketAckedCC(acked_packet): OnPacketAckedCC(acked_packet):
// Remove from bytes_in_flight. // Remove from bytes_in_flight.
bytes_in_flight -= acked_packet.size bytes_in_flight -= acked_packet.size
if (InCongestionRecovery(acked_packet.time_sent)): if (InCongestionRecovery(acked_packet.time_sent)):
// Do not increase congestion window in recovery period. // Do not increase congestion window in recovery period.
return return
if (IsAppLimited()): if (IsAppOrFlowControlLimited()):
// Do not increase congestion_window if application // Do not increase congestion_window if application
// limited. // limited or flow control limited.
return return
if (congestion_window < ssthresh): if (congestion_window < ssthresh):
// Slow start. // Slow start.
congestion_window += acked_packet.size congestion_window += acked_packet.size
else: else:
// Congestion avoidance. // Congestion avoidance.
congestion_window += max_datagram_size * acked_packet.size congestion_window += max_datagram_size * acked_packet.size
/ congestion_window / congestion_window
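The new-side OnPacketAckedCC can be sketched in Python as follows. The CongestionController class, its default values, and the boolean app_or_flow_control_limited field (standing in for IsAppOrFlowControlLimited()) are hypothetical. The sketch also uses integer division so that the window stays a whole number of bytes, which the pseudocode does not specify.

```python
import math
from dataclasses import dataclass


@dataclass
class CongestionController:
    """Hypothetical holder for the congestion control variables."""
    max_datagram_size: int = 1200
    congestion_window: int = 12000
    ssthresh: float = math.inf
    bytes_in_flight: int = 0
    congestion_recovery_start_time: float = 0.0
    app_or_flow_control_limited: bool = False

    def in_congestion_recovery(self, sent_time):
        return sent_time <= self.congestion_recovery_start_time

    def on_packet_acked(self, size, time_sent):
        # Remove from bytes_in_flight.
        self.bytes_in_flight -= size
        if self.in_congestion_recovery(time_sent):
            # Do not increase the window in the recovery period.
            return
        if self.app_or_flow_control_limited:
            # Do not increase the window if the sender is
            # application limited or flow control limited.
            return
        if self.congestion_window < self.ssthresh:
            # Slow start: grow by the number of bytes acked.
            self.congestion_window += size
        else:
            # Congestion avoidance (integer division is an
            # assumption of this sketch).
            self.congestion_window += (
                self.max_datagram_size * size // self.congestion_window)


cc = CongestionController(bytes_in_flight=3600)
cc.on_packet_acked(1200, time_sent=1.0)  # slow start
```

Note that bytes_in_flight is reduced in every case; only the window growth is suppressed during recovery or when the sender is application or flow control limited.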
B.6. On New Congestion Event B.6. On New Congestion Event
skipping to change at page 34, line 50 skipping to change at page 36, line 5
o PTO MUST send data if possible (#3056, #3057) o PTO MUST send data if possible (#3056, #3057)
o Connection Close is not ack-eliciting (#3097, #3098) o Connection Close is not ack-eliciting (#3097, #3098)
o MUST limit bursts to the initial congestion window (#3160) o MUST limit bursts to the initial congestion window (#3160)
o Define the current max_datagram_size for congestion control o Define the current max_datagram_size for congestion control
(#3041, #3167) (#3041, #3167)
o Separate PTO by packet number space (#3067, #3074, #3066)
C.2. Since draft-ietf-quic-recovery-22 C.2. Since draft-ietf-quic-recovery-22
o PTO should always send an ack-eliciting packet (#2895) o PTO should always send an ack-eliciting packet (#2895)
o Unify the Handshake Timer with the PTO timer (#2648, #2658, #2886) o Unify the Handshake Timer with the PTO timer (#2648, #2658, #2886)
o Move ACK generation text to transport draft (#1860, #2916) o Move ACK generation text to transport draft (#1860, #2916)
C.3. Since draft-ietf-quic-recovery-21 C.3. Since draft-ietf-quic-recovery-21
 End of changes. 73 change blocks. 
238 lines changed or deleted 276 lines changed or added
