  B. de hÓra 
  Propylon, Ltd. 
Expires: September 2004   June 2004 

HTTPLR (pre-draft)
draft-httplr-20040601.html

Abstract

This document describes HTTPLR version 20040601, an application protocol for reliable transmission of messages using HTTP, something that HTTP does not guarantee. The protocol provides a measure of reliability within the client server model of HTTP. Reliable variants of HTTP often require a peer to peer model, where both communicators are HTTP servers. These peer to peer models are termed heavyweight.

Editorial Note

This draft HAS NOT been submitted for publication, and does not have any status; it should be referred to as a "pre-draft."

This draft is Copyright © Bill de hÓra (2004). All Rights Reserved.

Please submit comments to the draft author.

Sections marked out, @@@like this@@@, indicate editorial notes that MUST be removed prior to publication.



Table of Contents

1  Requirements notation
2  Introduction
3  Requirements and Assumptions
 3.1  Requirements
  3.1.1  Agreement
  3.1.2  Message Duplication
  3.1.3  Message Ordering
  3.1.4  Message Opacity
  3.1.5  Low impact on the Client
  3.1.6  Fidelity with HTTP
 3.2  Assumptions
  3.2.1  Eventual Arrival
  3.2.2  Regression
4  HTTP Methods and message delivery
5  Identification of exchanges
 5.1  Identifier generation
6  The Protocol
 6.1  Step One: establish exchange URL
 6.2  Step two: send message
  6.2.1  Rejection of DELETE requests
  6.2.2  Retries and timeouts
  6.2.3  State management and duration
7  Message Reconciliation
8  Reporting
9  Security Considerations
10  Appendix A: HTTP Response Codes
11  Appendix B: URI state resources
12  Appendix C: Phantom Exchanges
13  References
§  Author's Address

   Figure 1
   Figure 2
   Figure 3
   Figure 4
   Figure 5
   Figure 6



1 Requirements notation

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].

A "Client" in this discussion is whoever begins the reliable exchange via a HTTP request and a "Server" is the origin server responding to the request. A "Message" is the entity body sent by a request to a Server from a Client. The "Network" encompasses all things that lie between a Server and a Client, such as caches, proxies, ethernet cables, internet backbones and so on.



2 Introduction

This document describes an application protocol for guaranteed once-and-only-once transmission of messages using HTTP, something that HTTP alone does not guarantee. It is not concerned with availability, implementation, robustness, or details of persistent storage. It is not concerned with message order.

A characteristic of distributed systems is that senders and receivers of messages cannot know with certainty what went wrong in the event of failure, and without catering for agreement, they might not know whether anything did go wrong with a transmission. Our primary concern is partial failure, where one component in the system fails while the others continue. The HTTP client-server model has three parts that can fail: the Client, the Network, and the Server. For example, if the Network fails mid-transmission, a request might arrive at the Server but the response might not reach the Client. Or if the Server's firewall rules are misconfigured, Client requests might be rejected out of hand.

The technique described here provides a measure of reliability within the client server model of HTTP. Reliable variants of HTTP or protocols layered upon HTTP often require a peer to peer model, where both communicators are HTTP servers. These peer to peer models are termed heavyweight.

The first published description of a reliable protocol using a HTTP client and server is attributed to Paul Prescod [Prescod]. That document, 'Reliable delivery in HTTP', along with (a) the author's experiences implementing messaging systems with HTTP and (b) anecdotal descriptions of ad-hoc applications created to handle reliability for HTTP, was the basis for this protocol.



3 Requirements and Assumptions

3.1 Requirements

3.1.1 Agreement

The key requirement for designing a reliable delivery protocol is agreement. The two parties involved in a delivery must agree that a message has been delivered.

3.1.2 Message Duplication

The protocol MUST NOT result in duplicated messages. This is achieved by enforcing certain constraints on the client and server. Specifically, they are required to hold minimal state during the exchange, using a URL identifier as a shared key, until they agree that the message has indeed been delivered.

3.1.3 Message Ordering

The protocol SHOULD NOT result in reordered messages.

3.1.4 Message Opacity

The protocol MUST NOT rely on the contents of a Message to complete an exchange.

3.1.5 Low impact on the Client

The Client MUST NOT need to run a web server to complete the reliable contract or expose a port to the outside world.

3.1.6 Fidelity with HTTP

The protocol MUST NOT be compromised by caching proxies or firewall rules. Therefore it MUST be consistent with HTTP semantics rather than extending or abusing them.

3.2 Assumptions

3.2.1 Eventual Arrival

The protocol makes one important assumption: an infinite number of requests will result in an infinite number of responses. This assumption is known as eventual arrival. It is seen in formal models of distributed systems and is implicit in most deployed reliable protocols. It allows us to disregard arbitrary (often called 'Byzantine') failure modes for which a reliable protocol can never be modelled.

3.2.2 Regression

Theoretically, as HTTP is an asymmetric protocol, the client would always need to send one more request acknowledging receipt of the last response, so we regress infinitely. However, given the assumption of eventual arrival, we can ignore such regression.



4 HTTP Methods and message delivery

One way to make sure a message went through in HTTP is to resend until the client gets some acknowledgment from the server. Some HTTP actions (GET, PUT, DELETE) allow this to happen safely, because they are idempotent - repeating the action does not alter the result of the first successful action.

However, this assumes applications are modelled precisely as intended by HTTP and Web architecture, that is, each item of interest is given its own URL, URLs are not recycled, and idempotent methods are implemented as just that. Not all HTTP applications are deployed this way. For example, GET can return different values over time, URL recycling is common enough, and a sequence of idempotent messages by more than one client can result in a non-idempotent outcome. [Naturally, it is not possible to ensure that a server does not generate internal side-effects as a result of performing an idempotent request; the important operational distinction here is that the client user, and very possibly the owner of the server, did not request the side-effects and so cannot be held accountable for them.]

The HTTP POST method is popular in web services for message transmission. Since a repeated POST is not guaranteed to be idempotent, a message sending strategy based purely on client retries is not guaranteed to be safe.
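
The difference between retrying PUT and retrying POST can be illustrated with a toy model. The following is only a sketch: the ToyServer class and its behaviour (POST appends, PUT stores under the request URL) are assumptions made for illustration, not part of this protocol.

```python
# Toy in-memory "server": POST appends a new entry, PUT stores at a fixed URL.
# ToyServer is a hypothetical illustration, not part of HTTPLR.
class ToyServer:
    def __init__(self):
        self.inbox = []      # entries created by POST
        self.resources = {}  # resources stored by PUT, keyed by URL

    def post(self, body):
        self.inbox.append(body)      # every POST creates a new entry

    def put(self, url, body):
        self.resources[url] = body   # repeating a PUT stores the same value again


server = ToyServer()

# The first attempt reaches the server but its response is lost,
# so the client sends the identical request a second time.
for _attempt in range(2):
    server.post("order #1")
    server.put("/orders/1", "order #1")

print(len(server.inbox))      # 2: the retried POST duplicated the message
print(len(server.resources))  # 1: the retried PUT left a single resource
```

The rest of this document describes how an exchange identifier lets a Client retry without causing the duplication shown for POST.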



5 Identification of exchanges

One way to avoid overwriting data with POST is to put a message identifier in a header or the message body. This is not required by HTTP. HTTP is a stateless protocol, and identity tracking requires (some) mandatory state on the server. Assigning such identifiers to message exchanges is a standard networking idiom (for example it is used in TCP). [Lynch] describes the general process as the "FivePacketHandshakeProtocol" and its formal properties are well-understood. This document's protocol is based on the FivePacketHandshakeProtocol.

5.1 Identifier generation

This task is entrusted to the Server through its assignment of exchange URLs via the Location header. This requires only one generator, reducing the likelihood of error, particularly where there are multiple clients. It should be noted that the algorithms for generating highly unique identifiers like GUIDs are complex and can be difficult to get right. A published algorithm and implementation for GUIDs unique until 3400AD is described in [Leach].
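
As a rough sketch of Server-side identifier generation, the following uses the random (version 4) UUIDs from Python's standard uuid module. This is an assumption made for brevity and is not the algorithm described in [Leach]. The new_exchange_url helper and the in-memory issued set are hypothetical; a real Server would keep the issued URLs in persistent storage.

```python
import uuid

BASE_URL = "http://www.example.org/rmservice"  # taken from the examples in this document


def new_exchange_url(issued):
    """Return an exchange URL that has not been handed out before.

    `issued` is the set of URLs already assigned (hypothetical; a real
    Server would keep this in persistent storage).
    """
    while True:
        url = f"{BASE_URL}?id={uuid.uuid4().hex}"
        if url not in issued:   # guard against recycling a URL
            issued.add(url)
            return url


issued = set()
first = new_exchange_url(issued)
second = new_exchange_url(issued)
```

Checking against the set of issued URLs is a belt-and-braces measure; it makes the "no recycling" property explicit rather than relying on the generator alone.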



6 The Protocol

This section specifies the basic protocol.

6.1 Step One: establish exchange URL

The protocol specifies that Message exchange state is managed through a shared resource called the "exchange URL". This resource is distinct from the exchanged message. The exchange URL MUST be unique. This implies that an exchange URL SHALL NOT be recycled.

A request-response exchange between the Client and Server at a well-known URL establishes the exchange URL as the place to send the Message to. How the Client determines the well-known URL is not defined here. The Client MUST initiate the exchange using POST. If the Server is willing to accept the exchange request it MUST use the 201 Created response code. The identifier supplied by the Server MUST be a URL, which MUST appear in the Location header of the response. For example:

C:  POST http://www.example.org/rmservice HTTP/1.1 

S:  HTTP/1.1 201 Created
S:  Location:http://www.example.org/rmservice?id=249D6557

Figure 1

In the event that the exchange request fails the Client MAY repeatedly request an exchange URL until it receives a response.
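
A Client-side sketch of this step might look as follows. The send callable is a hypothetical transport (it returns a status code and a header dictionary, or raises OSError on Network failure), and the max_attempts limit is a local policy choice rather than something this document defines.

```python
def request_exchange_url(send, well_known_url, max_attempts=5):
    """Ask the Server for an exchange URL, asking again on failure.

    `send(method, url)` is a hypothetical transport callable returning
    (status_code, headers_dict) or raising OSError on Network failure.
    """
    for _ in range(max_attempts):
        try:
            status, headers = send("POST", well_known_url)
        except OSError:
            continue                     # no response arrived: ask again
        if status == 201 and "Location" in headers:
            return headers["Location"]   # the exchange URL
    return None                          # caller decides what to do next


# Simulated transport: the first response is lost, the second arrives.
calls = []

def flaky_send(method, url):
    calls.append((method, url))
    if len(calls) == 1:
        raise TimeoutError("response lost")
    return 201, {"Location": "http://www.example.org/rmservice?id=249D6557"}


exchange_url = request_exchange_url(flaky_send, "http://www.example.org/rmservice")
```

Note that a request whose response is lost may still have been processed by the Server, which is how the Phantom Exchanges of Appendix C arise.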

After receiving the request and prior to sending the response, the Server MUST maintain state about this URL accordingly:

After receiving the response, the Client MUST maintain state about this URL accordingly:

If, for some reason, the Server cannot Record the state, it MUST inform the Client by returning a 500 Internal Server Error response code, and the response MUST NOT contain a Location header. 'Record' is understood to imply persistent storage outside working memory (for example, state will be maintained between Server reboots).
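
One way a Server might Record state before responding is sketched below using SQLite from the Python standard library. The table layout and function names are hypothetical. The sketch also assumes that the state recorded at this step is http://purl.oclc.org/httplr/state/created, which is inferred from the way later sections refer to that state.

```python
import sqlite3

CREATED = "http://purl.oclc.org/httplr/state/created"


def open_store(path):
    """Open (or create) the exchange table. The schema is hypothetical."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS exchange "
        "(url TEXT PRIMARY KEY, state TEXT NOT NULL)"
    )
    db.commit()
    return db


def handle_exchange_request(db, new_url):
    """Record `new_url` as created, then build (status_code, headers).

    The Location header is only returned once the state is committed.
    """
    try:
        with db:  # commits on success, rolls back on error
            db.execute(
                "INSERT INTO exchange (url, state) VALUES (?, ?)",
                (new_url, CREATED),
            )
    except sqlite3.Error:
        return 500, {}                   # no Location header on failure
    return 201, {"Location": new_url}


db = open_store(":memory:")  # use a file path for storage that survives reboots
url = "http://www.example.org/rmservice?id=249D6557"
ok = handle_exchange_request(db, url)
# Inserting the same URL again violates the primary key; here that stands
# in for any failure to Record the state.
failed = handle_exchange_request(db, url)
```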

6.2 Step two: send message

The Client MAY use one of PUT or POST to send its Message to the exchange URL supplied by the Server.

The Server response MUST include a Location header naming the exchange URL. If the Location header is not present, the Client MUST assume the URL is the same as the one the Message was sent to:

C:  PUT http://www.example.org/rmservice?id=249D6557 HTTP/1.1
C:  [crlf]
C:  [message body]

S:  HTTP/1.1 202 Accepted
S:  Location:http://www.example.org/rmservice?id=249D6557
S:  Allow: GET, HEAD, POST

Figure 2

After receiving the request and prior to sending the response, the Server MUST maintain state about this exchange URL accordingly:

After receiving the response, the Client MUST maintain state about this URL accordingly:

The Client MUST NOT send Messages to URLs it has recorded as http://purl.oclc.org/httplr/state/accepted.

In the event that the Server is not responding as expected or there is a Network failure (such as a timeout), the Client MAY try to resend the Message. In this case it is possible that the Server has already recorded the Message as being in the http://purl.oclc.org/httplr/state/accepted state but that the response has not arrived at the Client. To avoid duplicate delivery, the Server MUST respond to further POST or PUT requests against an exchange URL in the http://purl.oclc.org/httplr/state/accepted state with a 405 Method Not Allowed response and MUST include the exchange URL in a Location header. For example:

C:  PUT http://www.example.org/rmservice?id=249D6557 HTTP/1.1
C:  [crlf]
C:  [message body]

S:  HTTP/1.1 405 Method Not Allowed
S:  Location:http://www.example.org/rmservice?id=249D6557
S:  Allow: GET, HEAD, DELETE

Figure 3

The server SHOULD in this case send the Allow header in its responses.

The Client on seeing a 405 response code for a Message exchange in the http://purl.oclc.org/httplr/state/created state MUST assume the Message has been previously transmitted and MUST record the state as http://purl.oclc.org/httplr/state/accepted.
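
The Server side of this step can be sketched as a small state transition. The handle_message function, the in-memory states dictionary, and the 404 response for an unknown URL are assumptions made for illustration; this document does not specify the unknown-URL case, and a real Server would Record state persistently.

```python
CREATED = "http://purl.oclc.org/httplr/state/created"
ACCEPTED = "http://purl.oclc.org/httplr/state/accepted"


def handle_message(states, delivered, url, body):
    """Handle a PUT or POST of `body` to exchange URL `url`.

    `states` maps exchange URL -> state URI; `delivered` collects the
    Messages handed on to the application. Returns (status_code, headers).
    """
    state = states.get(url)
    if state == CREATED:
        delivered.append(body)
        states[url] = ACCEPTED   # record the new state before responding
        return 202, {"Location": url}
    if state == ACCEPTED:
        # A resend: refuse it so the Message is not delivered twice.
        return 405, {"Location": url, "Allow": "GET, HEAD, DELETE"}
    return 404, {}               # unknown URL: an assumption, not specified here


url = "http://www.example.org/rmservice?id=249D6557"
states, delivered = {url: CREATED}, []
first = handle_message(states, delivered, url, b"hello")
retry = handle_message(states, delivered, url, b"hello")
```

In this sketch the second call returns 405 and leaves delivered with a single entry, which is the duplicate-suppression behaviour the text requires.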

6.2.1 Rejection of DELETE requests

We will see shortly that a Client MAY send a DELETE request to an exchange URL it has recorded as being in the http://purl.oclc.org/httplr/state/accepted state. To ensure safety against out-of-order requests, the Server MUST respond to DELETE requests against an exchange URL it has recorded in the http://purl.oclc.org/httplr/state/created state with a 405 Method Not Allowed response. For example:

C: DELETE http://www.example.org/rmservice?id=249D6557 HTTP/1.1

S: HTTP/1.1 405 Method Not Allowed
S: Location: http://www.example.org/rmservice?id=249D6557 
S: Allow: GET, HEAD, POST

Figure 4

6.2.2 Retries and timeouts

Under failures in requests and responses, the onus to continue the exchange is placed on the Client. The number of times a Client retries to send a request in order to get back a Server response and the duration between retries is not defined here.
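
Since retry policy is left to the Client, the following is only one possible sketch. The send callable is a hypothetical transport, and the retry count and exponential backoff are local choices that this document does not define. The handling of 202 and 405 follows section 6.2.

```python
import time

CREATED = "http://purl.oclc.org/httplr/state/created"
ACCEPTED = "http://purl.oclc.org/httplr/state/accepted"


def send_message(send, url, body, retries=4, delay=0.5, sleep=time.sleep):
    """Send `body` to exchange URL `url` until the outcome is known.

    `send(method, url, body)` is a hypothetical transport returning
    (status_code, headers) or raising OSError on Network failure.
    Returns the state the Client should record for the exchange.
    """
    for attempt in range(retries):
        try:
            status, _headers = send("PUT", url, body)
        except OSError:
            sleep(delay * 2 ** attempt)   # back off, then try again
            continue
        if status in (202, 405):
            # 202: accepted now. 405: accepted on an earlier attempt whose
            # response was lost. Either way the Message has been delivered.
            return ACCEPTED
        sleep(delay * 2 ** attempt)       # any other status: try again later
    return CREATED                        # outcome unknown: keep the current state


# Simulated transport: the Server accepts the first PUT but the response
# is lost, so the resend is answered with 405.
attempts = []

def lossy_send(method, url, body):
    attempts.append(method)
    if len(attempts) == 1:
        raise ConnectionResetError("response lost")
    return 405, {"Location": url, "Allow": "GET, HEAD, DELETE"}


state = send_message(
    lossy_send,
    "http://www.example.org/rmservice?id=249D6557",
    b"hello",
    sleep=lambda seconds: None,   # skip real waiting in this demonstration
)
```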

6.2.3 State management and duration

Once an exchange begins, the Server and Client MUST hold exchange state until one of the following happens:



7 Message Reconciliation

Message reconciliation is an OPTIONAL indication by the Client to the Server that it agrees the Message was sent successfully. Up to this point the Server knows the Message was sent to it, but does not know whether the Client agrees it has been sent (as it does not know for certain whether the Client received the response). The Client MAY inform the Server with a DELETE request that it is in agreement the Message was delivered:

C: DELETE http://www.example.org/rmservice?id=249D6557 HTTP/1.1

S: HTTP/1.1 200 OK
S: Location: http://www.example.org/rmservice?id=249D6557 

Figure 5

When the Client receives the Server response, it MAY release any state it is recording. The Server response MUST contain a Location header to indicate the exchange URL. After receiving a reconciliation DELETE request, the Server MUST respond to further such requests with a 410 Gone response code. For example:

C: DELETE http://www.example.org/rmservice?id=249D6557 HTTP/1.1

S: HTTP/1.1 410 Gone
S: Location: http://www.example.org/rmservice?id=249D6557 
S: Allow: GET, HEAD

Figure 6

If the Location header is not present, the client MUST assume the URL is the same as the one the message was PUT to. The Client on seeing a 410 response code for a DELETE request in the http://purl.oclc.org/httplr/state/accepted state MUST agree the Message was exchanged and SHOULD NOT send further DELETE requests to the Server.

The server SHOULD in this case send the Allow header in its responses.
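
The Server's handling of DELETE across the states described in sections 6.2.1 and 7 can be sketched as below. The handle_delete function, the in-memory states dictionary, the RECONCILED marker, and the 404 response for an unknown URL are all assumptions made for illustration.

```python
CREATED = "http://purl.oclc.org/httplr/state/created"
ACCEPTED = "http://purl.oclc.org/httplr/state/accepted"
RECONCILED = "reconciled"   # hypothetical local marker used only in this sketch


def handle_delete(states, url):
    """Handle a DELETE against exchange URL `url`.

    `states` maps exchange URL -> state. Returns (status_code, headers).
    """
    state = states.get(url)
    if state == CREATED:
        # No Message has arrived yet, so this DELETE is out of order.
        return 405, {"Location": url, "Allow": "GET, HEAD, POST"}
    if state == ACCEPTED:
        states[url] = RECONCILED   # both parties now agree on delivery
        return 200, {"Location": url}
    if state == RECONCILED:
        return 410, {"Location": url, "Allow": "GET, HEAD"}
    return 404, {}                 # unknown URL: an assumption, not specified here


url = "http://www.example.org/rmservice?id=249D6557"
states = {url: CREATED}
too_early = handle_delete(states, url)
states[url] = ACCEPTED             # the Message has since been accepted
reconciled = handle_delete(states, url)
repeated = handle_delete(states, url)
```

Keeping an entry for a reconciled exchange, rather than forgetting the URL, is what lets this sketch answer repeated DELETE requests with 410 and avoid recycling the URL.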



8 Reporting

This feature is OPTIONAL. The Server MAY provide a list of current exchange URLs and their state: @@@ incomplete @@@



9 Security Considerations

@@@ SSL will layer without further specification; HTTP auth schemes @@@



10 Appendix A: HTTP Response Codes

The following error codes are used in the protocol:



11 Appendix B: URI state resources

The following URIs can be used to denote exchange state:

The URIs are used as normative state identifiers in this memo's prose. Client and Server implementations MAY use these URIs to denote or share state information (for example as RDF statements or within an XML document) between themselves or other interested parties.



12 Appendix C: Phantom Exchanges

If the Client requests an exchange URL but does not receive the response from the Server, the Server may be holding onto an exchange URL to which the Client will never send a Message (the Client will simply ask for another URL). These are "Phantom Exchanges". If the Client were allowed to examine a list of incomplete exchanges, it could identify Phantom Exchanges and terminate them.

As the existence of Phantom Exchanges is not actively harmful, HTTPLR does not describe an exchange pattern to remove Phantom Exchanges. We mention the possibility for completeness. Following feedback from implementations, future HTTPLR versions MAY provide an exchange pattern as an optimisation to allow the Server to release resources.



13  References

[Leach]  Leach, P., "UUIDs and GUIDs [http://www.dehora.net/doc/draft-leach-uuids-guids-01.txt]", August 1998.
[Lynch]  Lynch, N., "Distributed Algorithms", ISBN 1-55860-348-4.
[Prescod]  Prescod, P., "Reliable delivery in HTTP [http://www.prescod.net/reliable_http.html]".
[RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", RFC 2119, March 1997.
[RFC2616]  Fielding, R., "Hypertext Transfer Protocol -- HTTP/1.1", RFC 2616, June 1999.


Author's Address

  Bill de hÓra
  Propylon
  45 Blackbourne Square,
  Rathfarnham
  Dublin, D14
  Ireland
  EMail:  bill@dehora.net
  URI:  http://www.dehora.net/