Network Working Group B. de hÓra 
Request for Comments:  Propylon Ltd. 
<draft-httplr-20041215.txt>  December 2004 
Category: Informational  

HTTPLR
draft-httplr-20041215.txt

Status of this Memo

This memo provides information for the Internet community. It does not specify an Internet standard of any kind. Distribution of this memo is unlimited.

Copyright Notice

Copyright (C) The Internet Society (2004). All Rights Reserved.

Abstract

This document describes HTTPLR version 20041215, an application protocol for reliable transmission of messages using HTTP. The protocol provides a measure of reliability within the client server model of HTTP without recourse to a peer to peer model, where both communicators are HTTP servers.

Editorial Note

This draft HAS NOT been submitted for publication, and does not have any status; it should be referred to as a "pre-draft."

Please submit comments to the draft author.

Sections marked out, @@@like this@@@, indicate editorial notes that MUST be removed prior to publication.


Table of Contents

1 Requirements notation
2 Introduction
3 Requirements and Assumptions
 3.1 Requirements
  3.1.1 Agreement
  3.1.2 Message Duplication
  3.1.3 Message Ordering
  3.1.4 Message Opacity
  3.1.5 Low impact on the Client
  3.1.6 Fidelity with HTTP
 3.2 Assumptions
  3.2.1 Eventual Arrival
  3.2.2 Regression
4 URI state resources
5 HTTP Methods and message delivery
6 Identification of exchanges
 6.1 Identifier generation
7 Retries and timeouts
8 State management and duration
9 The Upload Protocol
 9.1 Step One: establish exchange URL
  9.1.1 Phantom Exchanges
 9.2 Step two: send message
  9.2.1 Rejection of DELETE requests
 9.3 Step three: Message Reconciliation
10 The Download Protocol
 10.1 Step One: establish feed URL
  10.1.1 Constraints on the Atom feed format
 10.2 Step Two: download a Message
  10.2.1 Rejection of out of order requests
 10.3 Step Three: Message Reconciliation
11 Security Considerations
12 Appendix A: HTTP Response Codes
13 References
§ Author's Address
§ Intellectual Property and Copyright Statements

Figure 1
Figure 2
Figure 3
Figure 4
Figure 5
Figure 6
Figure 7
Figure 8
Figure 9
Figure 10
Figure 11
Figure 12
Figure 13


1 Requirements notation

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].

A "Client" in this discussion is whoever begins the reliable exchange via an HTTP request, and a "Server" is the origin server responding to the request. A "Message" is the entity body sent by a request to a Server from a Client. The "Network" encompasses all things that lie between a Server and a Client, such as caches, proxies, ethernet cables, Internet backbones and so on.


2 Introduction

This document describes an application protocol for guaranteed once and only once transmission of messages using HTTP, something that HTTP alone does not guarantee. It describes a means for both downloading and uploading of messages. It is not concerned with endpoint availability, robustness of components, or details of persistent storage. It is not concerned with message order.

A characteristic of distributed systems is that senders and receivers of messages can't know with certainty what went wrong in the event of failure, and without catering for agreement, they might not know if anything did go wrong with a transmission. Our primary concern is partial failure, where one component in the system fails while the others continue to function. The HTTP client-server model has three parts that can fail: the Client, the Network, and the Server. For example, if the Network fails mid-transmission, a request might arrive at the Server but the response might never reach the Client. Or, if the Server's firewall rules are misconfigured, Client requests might be rejected out of hand.

The techniques described here provide a measure of reliability within the client-server model of HTTP. Reliable variants of HTTP or protocols layered upon HTTP often require a peer-to-peer model, where both communicators are HTTP servers.

The first published description of a reliable protocol using an HTTP client and server is attributed to Paul Prescod [Prescod]. That document, 'Reliable delivery in HTTP', along with the author's experiences implementing messaging systems with HTTP and anecdotal descriptions of ad-hoc applications created to handle reliability for HTTP, formed the basis for this protocol.


3 Requirements and Assumptions

3.1 Requirements

3.1.1 Agreement

The key requirement for the protocol is that the Client and Server can reach agreement on the success of a message exchange.

3.1.2 Message Duplication

The protocol MUST NOT result in duplicated messages. The Client and Server are required to hold state during the exchange, using a URL as a shared key, until they come to an agreement that the message has indeed been delivered.

3.1.3 Message Ordering

The protocol SHOULD NOT result in reordered messages.

3.1.4 Message Opacity

The protocol MUST NOT rely on the contents of a Message to complete an exchange.

3.1.5 Low impact on the Client

The Client MUST NOT need to run a web server to complete the reliable contract or expose a port to the outside world.

3.1.6 Fidelity with HTTP

The protocol MUST NOT be compromised by caching proxies or firewall rules. The protocol MUST NOT contravene existing HTTP specifications. Therefore it MUST be consistent with HTTP semantics rather than extending or abusing them.

3.2 Assumptions

3.2.1 Eventual Arrival

The protocol makes an important assumption - an infinite number of requests will result in an infinite number of responses. This assumption is known as 'eventual arrival'. It allows us to disregard arbitrary (often called 'Byzantine') failure modes for which a reliable protocol can never be modelled.

3.2.2 Regression

Theoretically, as HTTP is an asymmetric protocol, we keep needing the Client to send a request to the Server acknowledging receipt of the last response, so we regress infinitely. Given the eventual arrival assumption, we can ignore such regression.


4 URI state resources

HTTPLR manages message exchanges through state transitions associated with a URI called the 'exchange URL'. The following URIs denote exchange states:

The URIs are used as normative state identifiers in this memo's prose. Client and Server implementations MAY use these URIs to denote or share state information (for example as RDF statements or within an XML document) between themselves or other interested parties. Implementors MUST treat the trailing '/' on the URIs as significant.


5 HTTP Methods and message delivery

One way to make sure a message went through in HTTP is to resend until the Client gets an appropriate response from the Server. Some HTTP methods (GET, PUT, DELETE) enable this in principle through idempotence: repeating the action does not alter the result of the first successful action. The HTTP POST method, which is popular for use in message transmission, is not idempotent; a message sending strategy based purely on Client retries using POST is not guaranteed to be safe.
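The difference can be illustrated with a small in-memory model. This is a sketch only: the `Store` class, its method names, and the `/inbox` paths are hypothetical and are not part of HTTPLR. It shows a retried PUT to a known URL leaving one copy of a message, while a retried POST to a collection leaves two.

```python
class Store:
    """Hypothetical in-memory stand-in for a Server's resources."""

    def __init__(self):
        self.resources = {}  # URL -> message body
        self.counter = 0

    def put(self, url, body):
        # PUT names the target resource, so repeating it overwrites
        # the same entry: the end state matches a single request.
        self.resources[url] = body

    def post(self, collection_url, body):
        # POST lets the Server pick a new resource each time, so a
        # blind retry creates a second copy of the message.
        self.counter += 1
        url = f"{collection_url}/{self.counter}"
        self.resources[url] = body
        return url


# The Client retries PUT because the first response was lost.
put_store = Store()
put_store.put("/inbox/a", "order 17")
put_store.put("/inbox/a", "order 17")
put_copies = list(put_store.resources.values()).count("order 17")

# The Client retries POST for the same reason.
post_store = Store()
post_store.post("/inbox", "order 17")
post_store.post("/inbox", "order 17")
post_copies = list(post_store.resources.values()).count("order 17")
```

Here `put_copies` ends up as 1 and `post_copies` as 2, which is the duplication that the rest of the protocol is designed to prevent.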


6 Identification of exchanges

One way to avoid overwriting data with a sequence of HTTP methods is through the use of an identifier in a header or the message body. As HTTP is a stateless protocol, it does not require the use of message identifiers; use of identifiers thus requires additional mandatory constraints on the Client and Server to maintain state. Assigning such identifiers to message exchanges is a known networking idiom (for example it is used in TCP). [Lynch] formally describes the general process of using a shared identifier to reach agreement in the "FivePacketHandshakeProtocol".

6.1 Identifier generation

It is necessary for either the Client or the Server to generate a shared identifier. Implementers should note that the algorithms for generating highly unique identifiers (like GUIDs) are complex and can be difficult to get right. A published algorithm and implementation for GUIDs is described in [Leach]. As well as being unique, identifiers should be stable across time. As a result of the need for uniqueness and stability, this task is entrusted to the Server through its assignment of URLs, which are communicated using the Location header. Using only one generation source reduces the likelihood of error, particularly where there are multiple Clients.
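As an illustration of Server-side identifier generation, the following sketch builds exchange URLs from Python's standard `uuid` module. The choice of a version 4 (random) UUID, the `id` query parameter, and the `new_exchange_url` helper are assumptions of this example; this memo does not mandate any particular generation scheme.

```python
import uuid

WELL_KNOWN_URL = "http://www.example.org/rmservice"  # illustrative only


def new_exchange_url(issued):
    """Return a fresh exchange URL and remember it in `issued`.

    `issued` is a set of URLs already handed out. A real Server would
    keep this record in durable storage (an assumption of this sketch).
    """
    while True:
        candidate = f"{WELL_KNOWN_URL}?id={uuid.uuid4().hex}"
        if candidate not in issued:  # guard against reuse
            issued.add(candidate)
            return candidate


issued = set()
first = new_exchange_url(issued)
second = new_exchange_url(issued)
```

Keeping generation in one place, on the Server, means Clients never need to coordinate identifier spaces among themselves.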


7 Retries and timeouts

When requests or responses fail, the onus to continue the exchange is placed on the Client. The number of times a Client retries a request in order to obtain a Server response, and the interval between retries, are not defined here.
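Since the retry policy is left open, any concrete policy is an implementation choice. The sketch below shows one possibility, a bounded retry loop with exponential backoff. The function name, the attempt limit, and the delay values are all hypothetical and carry no normative weight.

```python
import time


def send_with_retries(send, max_attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call `send()` until it returns a response or attempts run out.

    `send` is any callable that returns a response, or raises OSError
    (which includes TimeoutError) on a Network failure.
    """
    for attempt in range(max_attempts):
        try:
            return send()
        except OSError:
            if attempt == max_attempts - 1:
                raise  # give up and let the caller decide
            sleep(base_delay * (2 ** attempt))  # exponential backoff


# Usage with a fake transport that fails twice, then succeeds.
attempts = []


def flaky():
    attempts.append(1)
    if len(attempts) < 3:
        raise TimeoutError("simulated timeout")
    return 201


delays = []  # record the delays instead of really sleeping
result = send_with_retries(flaky, sleep=delays.append)
```

Retrying blindly like this is only safe because the later steps of the protocol make repeated requests harmless.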


8 State management and duration

Once an exchange begins, the Server and Client MUST hold exchange state until one of the following happens:


9 The Upload Protocol

This section specifies the basic protocol for delivery of messages to a Server from a Client.

9.1 Step One: establish exchange URL

Message exchange state is coordinated through a shared resource called the "exchange URL". This resource is distinct from the exchanged message. The exchange URL MUST be unique. This implies that an exchange URL SHALL NOT be recycled.

A request-response exchange between the Client and Server at a well-known URL establishes the exchange URL. How the Client and Server determine the well-known URL is not specified here. The Client MUST initiate the exchange using POST. If the Server is willing to accept the exchange request it MUST use the 201 Created response code. The identifier supplied by the Server MUST be a URL, which MUST appear in the Location header of the response. This is the exchange URL. Here is an example:

C:  POST http://www.example.org/rmservice HTTP/1.1

S:  HTTP/1.1 201 Created
S:  Location:http://www.example.org/rmservice?id=249D6557

Figure 1

The Server MAY return a representation (entity body) in the response.

In the event that an opening request fails, the Client MAY repeatedly request an exchange URL until it receives a response.

After receiving the request and prior to sending the response, the Server MUST maintain state about the exchange URL accordingly:

After receiving the response, the Client MUST maintain state about this URL accordingly:

If, for some reason, the Server cannot durably record the state, it MUST inform the Client by returning a 500 Internal Server Error response code, and the response MUST NOT contain a Location header. 'Durably record' is understood to imply persistent storage outside working memory (for example, state can be expected to be maintained between Server reboots).
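The Server's side of step one might be sketched as follows. The `open_exchange` function and the `MemoryStore` test double are hypothetical, and the sketch assumes that the state recorded at this point is http://purl.oclc.org/httplr/state/created/, which is inferred from the later steps rather than stated here.

```python
import uuid

CREATED = "http://purl.oclc.org/httplr/state/created/"


def open_exchange(base_url, durable_store):
    """Handle the opening POST; return (status, headers)."""
    exchange_url = f"{base_url}?id={uuid.uuid4().hex}"
    try:
        # Record state before responding.
        durable_store.save(exchange_url, CREATED)
    except OSError:
        # State could not be durably recorded: 500, no Location header.
        return 500, {}
    return 201, {"Location": exchange_url}


class MemoryStore:
    """Test double; a real store would write to persistent storage."""

    def __init__(self, broken=False):
        self.broken = broken
        self.states = {}

    def save(self, url, state):
        if self.broken:
            raise OSError("storage unavailable")
        self.states[url] = state


base = "http://www.example.org/rmservice"
ok_status, ok_headers = open_exchange(base, MemoryStore())
bad_status, bad_headers = open_exchange(base, MemoryStore(broken=True))
```

Writing the state before building the response keeps the Server from handing out an exchange URL it could forget after a restart.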

9.1.1 Phantom Exchanges

If the Client requests an exchange URL but does not receive the response from the Server, the Server may be holding onto an exchange URL to which the Client will never send a Message (the Client will simply ask for another URL). These are "Phantom Exchanges".

As the existence of Phantom Exchanges is not actively harmful, a means for removing them is not specified. However, if the Client were allowed to examine a list of incomplete exchanges, it could identify Phantom Exchanges and terminate them. Following feedback from implementations, future HTTPLR versions MAY provide an exchange pattern as an optimisation to allow the Server to release resources associated with such exchange URLs.

9.2 Step two: send message

The Client MAY use one of PUT or POST to send its Message to the exchange URL supplied by the Server. The Client SHOULD prefer the PUT option. The Client request MUST contain an entity body (the Message). The Server MUST support both the PUT and POST delivery options. At this stage the Server response SHOULD include a Location header naming the exchange URL. Here is an example:

C:  PUT http://www.example.org/rmservice?id=249D6557 HTTP/1.1
C:  [crlf]
C:  [message body]

S:  HTTP/1.1 202 Accepted
S:  Location:http://www.example.org/rmservice?id=249D6557
S:  Allow: GET, HEAD, POST

Figure 2

After receiving the request and prior to sending the response, the Server MUST maintain state about this exchange URL accordingly:

After receiving the response, the Client MUST maintain state about this URL accordingly:

The Client MUST NOT send Messages to a URL it has recorded as http://purl.oclc.org/httplr/state/accepted/.

In the event that the Server is not responding as expected or there is a Network failure (such as a timeout), the Client MAY try to resend the Message in the absence of a response. In this case it is possible that the Server has already recorded the message as being in the http://purl.oclc.org/httplr/state/accepted/ state but that the response has not arrived at the Client. To avoid duplicate delivery, the Server MUST respond to further POST or PUT requests against an exchange URL recorded as being in the http://purl.oclc.org/httplr/state/accepted/ state with a 405 Method Not Allowed response. The Server SHALL NOT attempt to verify that the Message sent is the same as before. The Server SHOULD include the exchange URL in a Location header in its responses. For example:

C:  PUT http://www.example.org/rmservice?id=249D6557 HTTP/1.1
C:  [crlf]
C:  [message body]

S:  HTTP/1.1 405 Method Not Allowed
S:  Location:http://www.example.org/rmservice?id=249D6557
S:  Allow: GET, HEAD

Figure 3

The Client on seeing a 405 response code for a Message exchange in the http://purl.oclc.org/httplr/state/created/ state SHOULD assume the Message has been previously transmitted and SHOULD record the state as http://purl.oclc.org/httplr/state/accepted/.
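The duplicate-suppression behaviour of step two can be sketched as a pair of small functions. The names `server_receive` and `client_next_state`, and the use of a dictionary and a list for state and delivered messages, are assumptions of this example; only exchange URLs already known to the Server are handled.

```python
CREATED = "http://purl.oclc.org/httplr/state/created/"
ACCEPTED = "http://purl.oclc.org/httplr/state/accepted/"


def server_receive(states, inbox, url, body):
    """Server side of a PUT or POST carrying a Message."""
    headers = {"Location": url}
    if states[url] == CREATED:
        inbox.append(body)      # deliver the Message once
        states[url] = ACCEPTED  # record state before responding
        return 202, headers
    # Already accepted: refuse, without comparing Message contents.
    headers["Allow"] = "GET, HEAD"
    return 405, headers


def client_next_state(state, status):
    """A 202, or a 405 seen while in 'created', means accepted."""
    if state == CREATED and status in (202, 405):
        return ACCEPTED
    return state


url = "http://www.example.org/rmservice?id=249D6557"
states, inbox = {url: CREATED}, []
first, _ = server_receive(states, inbox, url, "hello")   # response lost
second, _ = server_receive(states, inbox, url, "hello")  # Client retry
client_state = client_next_state(CREATED, second)
```

Although the Client sent the Message twice, `inbox` holds a single copy, and the 405 on the retry is enough for the Client to move to the accepted state.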

9.2.1 Rejection of DELETE requests

A Client MAY send a DELETE request to an exchange URL it has recorded as being in the http://purl.oclc.org/httplr/state/accepted/ state. To ensure safety against out of order requests, the Server MUST respond to DELETE requests against an exchange URL it has recorded in the http://purl.oclc.org/httplr/state/created/ state with a 405 Method Not Allowed response. For example:

C: DELETE http://www.example.org/rmservice?id=249D6557 HTTP/1.1

S: HTTP/1.1 405 Method Not Allowed
S: Location: http://www.example.org/rmservice?id=249D6557
S: Allow: GET, HEAD

Figure 4

9.3 Step three: Message Reconciliation

Up to this point the Server knows the Message was sent to it, but does not know if the Client agrees it has been sent (as it does not know for certain if the Client received the response). The Client MUST inform the Server with a DELETE or POST request that it is in agreement the message was delivered. The Client SHOULD prefer the DELETE option:

C: DELETE http://www.example.org/rmservice?id=249D6557 HTTP/1.1

S: HTTP/1.1 200 OK
S: Location: http://www.example.org/rmservice?id=249D6557

Figure 5

The Client reconciliation request MUST NOT contain an entity body. In the case where the POST method is used for both the message delivery and the reconciliation request, the Server MUST use the absence of an entity to distinguish the request order. The Server MUST support DELETE and POST delivery options. When the Client receives the Server response, it MAY release recorded state about the exchange URI.
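The rule for telling a delivery from a reconciliation request can be sketched as a small classifier. The function name `classify` and its return labels are hypothetical.

```python
def classify(method, body):
    """Label a request to an exchange URL in the upload protocol.

    PUT always carries a Message and DELETE never does; POST is
    ambiguous, so the presence of an entity body decides.
    """
    if method == "PUT":
        return "delivery"
    if method == "DELETE":
        return "reconciliation"
    if method == "POST":
        return "delivery" if body else "reconciliation"
    return "other"


kinds = [
    classify("PUT", b"message"),
    classify("POST", b"message"),
    classify("POST", b""),
    classify("DELETE", b""),
]
```

One consequence of this rule is that a zero-length Message cannot be delivered with POST, which is consistent with the requirement that the delivery request contain an entity body.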

The Server response SHOULD contain a Location header to indicate the exchange URL. On receiving a reconciliation request the Server MUST respond to further such requests with a 410 Gone response code. For example:

C: DELETE http://www.example.org/rmservice?id=249D6557 HTTP/1.1

S: HTTP/1.1 410 Gone
S: Location: http://www.example.org/rmservice?id=249D6557
S: Allow: GET, HEAD

Figure 6

After receiving the request and prior to sending the response, the Server MUST maintain state about this exchange URL accordingly:

After receiving the response with the message, the Client MUST maintain state about this URL accordingly:

In the event that the Server is not responding as expected or there is a Network failure (such as a timeout), the Client MAY try to resend the reconciliation in the absence of a response. In this case it is possible that the Server has already recorded the message as being in the http://purl.oclc.org/httplr/state/finished/ state but that the response has not arrived at the Client. The Server MUST respond to further POST or PUT requests against an exchange URL recorded as being in the http://purl.oclc.org/httplr/state/finished/ state with a 405 Method Not Allowed response. The Server SHOULD include the exchange URL in a Location header in its responses. For example:

C:  DELETE http://www.example.org/rmservice?id=249D6557 HTTP/1.1
C:  [crlf]

S:  HTTP/1.1 405 Method Not Allowed
S:  Location:http://www.example.org/rmservice?id=249D6557
S:  Allow: GET, HEAD

Figure 7

The Client on seeing a 410 response code for a reconciliation request in the http://purl.oclc.org/httplr/state/accepted/ state SHOULD agree the Message was exchanged and SHOULD NOT send further reconciliation requests to the Server.
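Taken together, the upload steps describe a small state machine on the Server. The table below is one possible reading of the prose in sections 9.2 and 9.3, not a normative summary. It labels requests as "delivery" (PUT, or POST with an entity) or "reconciliation" (DELETE, or POST without an entity), and it follows Figure 6 in answering a repeated reconciliation with 410 Gone. The names `TRANSITIONS` and `run` are hypothetical.

```python
CREATED = "http://purl.oclc.org/httplr/state/created/"
ACCEPTED = "http://purl.oclc.org/httplr/state/accepted/"
FINISHED = "http://purl.oclc.org/httplr/state/finished/"

# (current state, request kind) -> (status code, next state)
TRANSITIONS = {
    (CREATED, "delivery"): (202, ACCEPTED),
    (CREATED, "reconciliation"): (405, CREATED),    # out of order
    (ACCEPTED, "delivery"): (405, ACCEPTED),        # duplicate Message
    (ACCEPTED, "reconciliation"): (200, FINISHED),
    (FINISHED, "delivery"): (405, FINISHED),
    (FINISHED, "reconciliation"): (410, FINISHED),  # already reconciled
}


def run(kinds):
    """Apply a sequence of request kinds to a new exchange."""
    state, statuses = CREATED, []
    for kind in kinds:
        status, state = TRANSITIONS[(state, kind)]
        statuses.append(status)
    return statuses, state


statuses, final_state = run([
    "reconciliation",  # too early
    "delivery",
    "delivery",        # retry after a lost response
    "reconciliation",
    "reconciliation",  # retry after a lost response
])
```

Every repeated or out-of-order request leaves the state unchanged, which is what lets the Client retry freely without causing a second delivery.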


10 The Download Protocol

10.1 Step One: establish feed URL

The Client, when downloading messages, needs to determine either the URL of a message to download or a list of such URLs. These URLs are called 'Message URLs'. They are to be found at another URL called the "feed URL". How the Client discovers a Server's feed URL is not specified here.

To fetch the list of available messages, the Client MUST use a GET request against the feed URL. The Server MUST at minimum support the Atom syndication format to publish available messages and MUST indicate that by setting the response Content-Type header to application/x.atom+xml. The Server MAY present other content formats. The Client MUST be able to process the Atom syndication format at minimum [ATOM]. Here's an example:

C:  GET http://www.example.org/e4/c234 HTTP/1.1

S: HTTP/1.1 200 OK
S: Content-Type: application/x.atom+xml
S: [crlf]
S: <feed version="draft-ietf-atompub-format-03: do not deploy"
S:   xmlns="http://purl.org/atom/ns#draft-ietf-atompub-format-03">
S:    <head>
S:      <title>Example Feed</title>
S:      <link href="http://www.example.org/e4/c234/"/>
S:      <updated>2004-12-17T00:00:00Z</updated>
S:      <author>
S:        <name>Lemmy</name>
S:      </author>
S:    </head>
S:    <entry>
S:      <title></title>
S:      <link href="http://www.example.org/e4/c234/item/01"/>
S:      <id>http://www.example.org/e4/c234/item/01</id>
S:      <updated>2004-12-17T00:00:00Z</updated>
S:    </entry>
S:  </feed>

Figure 8
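A Client could extract Message URLs from a feed like the one in Figure 8 along the following lines. This sketch uses Python's standard `xml.etree.ElementTree` module and assumes that each entry's `link` element carries the Message URL in its `href` attribute; this memo does not yet state which element is authoritative. The `message_urls` helper and the abbreviated `FEED` document are illustrative.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the example feed in Figure 8.
ATOM_NS = "http://purl.org/atom/ns#draft-ietf-atompub-format-03"

FEED = """<feed version="draft-ietf-atompub-format-03: do not deploy"
  xmlns="http://purl.org/atom/ns#draft-ietf-atompub-format-03">
  <head>
    <title>Example Feed</title>
    <link href="http://www.example.org/e4/c234/"/>
  </head>
  <entry>
    <title></title>
    <link href="http://www.example.org/e4/c234/item/01"/>
    <id>http://www.example.org/e4/c234/item/01</id>
    <updated>2004-12-17T00:00:00Z</updated>
  </entry>
</feed>"""


def message_urls(feed_xml):
    """Return the link href of every entry in the feed, in order."""
    root = ET.fromstring(feed_xml)
    urls = []
    # Only direct <entry> children are inspected, so the feed-level
    # link inside <head> is not mistaken for a Message URL.
    for entry in root.findall(f"{{{ATOM_NS}}}entry"):
        link = entry.find(f"{{{ATOM_NS}}}link")
        if link is not None and link.get("href"):
            urls.append(link.get("href"))
    return urls


urls = message_urls(FEED)
```

Matching on the namespace-qualified element names keeps the parser from picking up similarly named elements from other vocabularies that a feed might embed.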

10.1.1 Constraints on the Atom feed format

10.2 Step Two: download a Message

The Client MUST use GET to collect a Message from the Message URL supplied by the Server. The Server response MUST contain a Location header to indicate the exchange URL. The Message URL from which the Message representation is collected MUST be distinct from the exchange URL that will be used by the client to acknowledge message receipt.

C:  GET http://www.example.org/e4/c234/item/01 HTTP/1.1

S:  HTTP/1.1 200 OK
S:  Location:http://www.example.org/e4/c234/item/01?ack=
S:  Allow: GET, HEAD
S:  [crlf]
S:  [message body]

Figure 9

Prior to receiving a GET request against the Message URL, the Server MUST maintain state about the exchange URL accordingly:

After receiving the request and prior to sending the response, the Server MUST maintain state about the exchange URL accordingly:

After receiving the response with the message, the Client MUST maintain state about this URL accordingly:

In the event that the Server is not responding as expected or there is a Network failure (such as a timeout), the Client MAY retry collection of the Message at the supplied Message URI. In this case it is possible that the Server has already recorded the message as being in the http://purl.oclc.org/httplr/state/accepted/ state but that the Message has not arrived at the Client. The Server MUST respond to subsequent GET requests against a Message URL recorded as being in the http://purl.oclc.org/httplr/state/accepted/ state with a 202 Accepted response. The Server MUST include the exchange URL in the Location header in its response. For example:

C:  GET http://www.example.org/e4/c234/item/01 HTTP/1.1

S:  HTTP/1.1 202 Accepted
S:  Location:http://www.example.org/e4/c234/item/01?ack=
S:  Allow: GET, HEAD
S:  [crlf]
S:  [message body]

Figure 10

10.2.1 Rejection of out of order requests

To reconcile an exchange, a Client MUST send a DELETE or POST request to an exchange URL it has recorded as being in the http://purl.oclc.org/httplr/state/accepted/ state. To ensure safety against out of order requests, the Server MUST respond to DELETE or POST requests against an exchange URL it has recorded in the http://purl.oclc.org/httplr/state/created/ state with a 405 Method Not Allowed response and MUST provide the Message URL in the Location header. This informs the Client that it is attempting to complete an exchange before having downloaded the message. For example:

C: DELETE http://www.example.org/e4/c234/item/01?ack= HTTP/1.1

S: HTTP/1.1 405 Method Not Allowed
S: Location: http://www.example.org/e4/c234/item/01
S: Allow: GET, HEAD

Figure 11

On seeing a 405 response the Client SHOULD record the state as http://purl.oclc.org/httplr/state/created/ and attempt to collect the Message from the Message URL provided.

10.3 Step Three: Message Reconciliation

Reconciliation is an indication by the Client to the Server that it has agreed the Message was collected successfully. Up to this point the Server knows the Message was sent to the Client, but does not know if the Client agrees it has arrived (as it does not know for certain if the Client received the response). The Client MUST inform the Server with a DELETE or POST request to the exchange URI that it is in agreement the message was delivered:

C: DELETE http://www.example.org/e4/c234/item/01?ack= HTTP/1.1

S: HTTP/1.1 200 OK
S: Location: http://www.example.org/e4/c234/item/01

Figure 12

The Server MUST support DELETE and POST options. The Server MUST provide the Message URL in the Location header of the response. The Client SHOULD prefer the DELETE option.

After receiving the request and prior to sending the response, the Server MUST maintain state about this exchange URL accordingly:

After receiving the response with the message, the Client MUST maintain state about this URL accordingly:

In the event that the Server is not responding as expected or there is a Network failure (such as a timeout), the Client MAY try to resend the reconciliation in the absence of a response. In this case it is possible that the Server has already recorded the message as being in the http://purl.oclc.org/httplr/state/finished/ state but that the response has not arrived at the Client. The Server MUST respond to further POST or PUT requests against an exchange URL recorded as being in the http://purl.oclc.org/httplr/state/finished/ state as though it were in the http://purl.oclc.org/httplr/state/accepted/ state.

Once the Server has recorded an exchange URL as being in the http://purl.oclc.org/httplr/state/finished/ state it MUST respond to any further requests to the Message URL with a 410 Gone response and MUST NOT return an entity in the response.

C: GET http://www.example.org/e4/c234/item/01 HTTP/1.1

S: HTTP/1.1 410 Gone
S: Location: http://www.example.org/e4/c234/item/01?ack=

Figure 13

The Client on seeing a 410 response code for a Message URL request in the state SHOULD agree the Message was exchanged and SHOULD NOT send further reconciliation requests to the Server.
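The Server's side of the download protocol can be sketched as a small class holding one exchange. This is one reading of sections 10.2 and 10.3 under stated assumptions: the `DownloadExchange` class and its method names are hypothetical, the exchange URL is formed by appending `?ack=` as in the figures, and a repeated reconciliation is answered with 200 because a finished exchange is treated as though it were accepted.

```python
CREATED = "http://purl.oclc.org/httplr/state/created/"
ACCEPTED = "http://purl.oclc.org/httplr/state/accepted/"
FINISHED = "http://purl.oclc.org/httplr/state/finished/"


class DownloadExchange:
    """Server-side state for one downloadable Message."""

    def __init__(self, message_url, body):
        self.message_url = message_url
        self.exchange_url = message_url + "?ack="
        self.body = body
        self.state = CREATED

    def get_message(self):
        """GET on the Message URL; returns (status, headers, entity)."""
        headers = {"Location": self.exchange_url}
        if self.state == FINISHED:
            return 410, headers, None  # no entity once finished
        status = 200 if self.state == CREATED else 202
        self.state = ACCEPTED
        return status, headers, self.body

    def reconcile(self):
        """DELETE (or POST) on the exchange URL; returns (status, headers)."""
        headers = {"Location": self.message_url}
        if self.state == CREATED:
            return 405, headers  # Message not yet downloaded
        self.state = FINISHED
        return 200, headers


ex = DownloadExchange("http://www.example.org/e4/c234/item/01", "payload")
early = ex.reconcile()      # before any download
first_get = ex.get_message()
retry_get = ex.get_message()
done = ex.reconcile()
done_again = ex.reconcile()  # retry after a lost response
after = ex.get_message()
```

Note that the Location header switches role between the two operations: a GET response points at the exchange URL, while a reconciliation response points back at the Message URL.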


11 Security Considerations

@@@ SSL will layer without further specification; HTTP auth schemes @@@


12 Appendix A: HTTP Response Codes

The following error codes are used in the protocol:


13 References

[ATOM] Nottingham, M., "The Atom Syndication Format (draft-ietf-atompub-format-03)", October 2004.
[Leach] Leach, P., "UUIDs and GUIDs [http://www.dehora.net/doc/draft-leach-uuids-guids-01.txt]", August 1998.
[Lynch] Lynch, N., "Distributed Algorithms, ISBN 1-55860-348-4".
[Prescod] Prescod, P., "Reliable delivery in HTTP [http://www.prescod.net/reliable_http.html]".
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", RFC 2119, March 1997.
[RFC2616] Fielding, R., "Hypertext Transfer Protocol -- HTTP/1.1", RFC 2616, June 1999.

Author's Address

 Bill de hÓra
 Propylon Ltd.
 45 Blackbourne Square
Rathfarnham
 Dublin, D14 
 Ireland
Phone: 
EMail: bill@dehora.net
URI: http://www.dehora.net/
 

Intellectual Property Statement

The IETF takes no position regarding the validity or scope of any Intellectual Property Rights or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; nor does it represent that it has made any independent effort to identify any such rights. Information on the procedures with respect to rights in RFC documents can be found in BCP 78 and BCP 79.

Copies of IPR disclosures made to the IETF Secretariat and any assurances of licenses to be made available, or the result of an attempt made to obtain a general license or permission for the use of such proprietary rights by implementers or users of this specification can be obtained from the IETF on-line IPR repository at <http://www.ietf.org/ipr>.

The IETF invites any interested party to bring to its attention any copyrights, patents or patent applications, or other proprietary rights that may cover technology that may be required to implement this standard. Please address the information to the IETF at ietf-ipr@ietf.org.

Disclaimer of Validity

This document and the information contained herein are provided on an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Copyright Statement

Copyright (C) The Internet Society (2004). This document is subject to the rights, licenses and restrictions contained in BCP 78, and except as set forth therein, the authors retain all their rights.

Acknowledgement

Funding for the RFC Editor function is currently provided by the Internet Society.