WebDAVNamespacevsProperties
J. F. Reschke
greenbytes
December 2005
Behaviour of WebDAV properties under namespace operations
draft-reschke-webdav-namespace-vs-properties-latest
This document discusses the impact of WebDAV namespace operations on the behaviour of live properties defined in HTTP and WebDAV.
HTTP ([RFC2616]) defines a set of response header fields that can be used to detect changes, namely "ETag" (Section 14.19) and "Last-Modified" (Section 14.29). User agents can use request header fields to make method invocations conditional, such as
Note that HTTP defines the behaviour of these headers in terms of "variants" (i.e. the different representations that may be returned for a single resource; see Section 1.3). Furthermore, although HTTP distinguishes between the term "URI" (identifier) and "resource" (the object identified by the URI), the difference has little impact as the HTTP specification does not define any namespace operations that would change the mapping between URIs and resources. Thus, generic clients will rely on consistent behaviour of "Last-Modified" and "ETag" on a per-URI basis even in the presence of namespace operations.
>> Request (getting the content initially)
    GET /index.html HTTP/1.1
    Host: example.org
>> Response
    HTTP/1.1 200 OK
    Content-Type: text/html; charset="utf-8"
    Content-Length: xxxx
    Last-Modified: Sun, 20 Mar 2005 12:45:26 GMT

    ...body...

The user agent stores the response headers along with the content. When it needs to update the content (for instance, when the user initiates a refresh of the browser window), it uses that information to make the request conditional.
>> Request (refreshing the content)
    GET /index.html HTTP/1.1
    Host: example.org
    If-Modified-Since: Sun, 20 Mar 2005 12:45:26 GMT
>> Response
HTTP/1.1 304 Not Modified
Thus, if the content did not change, the user agent avoids getting the same content again.
>> Request (getting the content initially)
    GET /index.html HTTP/1.1
    Host: example.org
>> Response
    HTTP/1.1 200 OK
    Content-Type: text/html; charset="utf-8"
    Content-Length: xxxx
    Last-Modified: Sun, 20 Mar 2005 12:45:26 GMT
    ETag: "1"

    ...body...
>> Request (writing back the content)
    PUT /index.html HTTP/1.1
    Host: example.org
    If-Match: "1"
>> Response
HTTP/1.1 200 OK
However, had the content changed between the two requests, the response would have been:
>> Response
HTTP/1.1 412 Precondition Failed
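The two exchanges above can be summarised in a minimal sketch of how a server might evaluate the conditional headers. The function names `check_if_modified_since` and `check_if_match` are hypothetical, the resource is assumed to exist, and the comma split for "If-Match" is a simplification that assumes entity tags do not themselves contain commas.

```python
from email.utils import parsedate_to_datetime


def check_if_modified_since(last_modified, if_modified_since):
    """Return 304 if the resource is unchanged since the given date, else 200.

    Both arguments are HTTP-date strings; if_modified_since may be None.
    """
    if if_modified_since is None:
        return 200
    try:
        since = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        # An unparseable date is ignored, so the request is unconditional.
        return 200
    current = parsedate_to_datetime(last_modified)
    return 304 if current <= since else 200


def check_if_match(current_etag, if_match):
    """Return 200 if the write may proceed, else 412 (Precondition Failed).

    Assumes the resource exists. If-Match uses strong comparison, so a
    weak current entity tag never matches.
    """
    if if_match is None or if_match.strip() == "*":
        return 200
    if current_etag.startswith("W/"):
        return 412
    candidates = [tag.strip() for tag in if_match.split(",")]
    return 200 if current_etag in candidates else 412
```

With the values from the examples, a refresh carrying the stored "Last-Modified" date yields 304, and a PUT carrying `If-Match: "1"` succeeds only while the current entity tag is still `"1"`.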
Below is a list of requirements for the behaviour for 'Last-Modified' and 'ETag':
The requirements above seem to be straightforward to implement, but things get tricky as soon as namespace operations such as COPY ([RFC2518], Section 8.8) and MOVE (Section 8.9) are introduced.
For example, consider two resources identified by "index.html" and "index.html.bak" with last modified dates of "12:00:00 GMT" and "11:50:00 GMT" respectively.
A client may have retrieved the content for "index.html", remembering the first timestamp:
>> Request (getting the content initially)
    GET /index.html HTTP/1.1
    Host: example.org
>> Response
    HTTP/1.1 200 OK
    Content-Type: text/html; charset="utf-8"
    Content-Length: xxxx
    Last-Modified: Sun, 20 Mar 2005 12:00:00 GMT

    ...body...
Later, another user decides to restore the backup, using a WebDAV MOVE request.
>> Request (restoring the backup)
    MOVE /index.html.bak HTTP/1.1
    Host: example.org
    Destination: http://example.org/index.html
>> Response
    HTTP/1.1 204 No Content
Finally, the first user agent decides to refresh the content for "index.html". What value for "Last-Modified" should be returned?
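To make the problem concrete, the following sketch simulates the scenario with an in-memory namespace. The `Namespace` class and its `touch` flag are hypothetical and only illustrate two possible server behaviours: keeping the timestamp of the moved resource, or setting it to the time of the MOVE.

```python
from datetime import datetime, timezone


class Namespace:
    """Toy URL namespace: maps a path to (body, last_modified)."""

    def __init__(self):
        self.resources = {}

    def put(self, path, body, when):
        self.resources[path] = (body, when)

    def move(self, src, dst, now, touch):
        # touch=False keeps the source's timestamp; touch=True stamps the
        # destination with the time of the MOVE itself.
        body, modified = self.resources.pop(src)
        self.resources[dst] = (body, now if touch else modified)

    def get(self, path, if_modified_since=None):
        body, modified = self.resources[path]
        if if_modified_since is not None and modified <= if_modified_since:
            return 304, None, modified
        return 200, body, modified


def at(hour, minute):
    return datetime(2005, 3, 20, hour, minute, tzinfo=timezone.utc)


def simulate(touch):
    """Return the status code the first user agent sees on refresh."""
    ns = Namespace()
    ns.put("/index.html.bak", "backup", at(11, 50))
    ns.put("/index.html", "current", at(12, 0))
    _, _, cached = ns.get("/index.html")  # client remembers 12:00:00
    ns.move("/index.html.bak", "/index.html", now=at(12, 30), touch=touch)
    status, _, _ = ns.get("/index.html", if_modified_since=cached)
    return status
```

In this sketch `simulate(False)` returns 304 although the content at "/index.html" has changed, because the restored resource carries the older 11:50:00 timestamp; `simulate(True)` returns 200.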
The situation for "ETag" is only slightly different: the entity tag needs to be unique across all variants ever served for the same HTTP URL. The only difference is that entity tags have no inherent order that the conditional request headers would check. Thus, if a server can guarantee that no entity tag ever repeats for any URL within its namespace, namespace operations do not require any post-processing (otherwise, the same considerations as for "Last-Modified" apply).
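One way a server could obtain the guarantee described above is to allocate every entity tag from a single counter shared by the whole namespace. The sketch below uses a hypothetical `EntityTagAllocator` class; it keeps the counter in memory only, whereas a real server would have to persist it so that values are not reused after a restart.

```python
import itertools
import threading


class EntityTagAllocator:
    """Hands out entity tags that never repeat within one namespace."""

    def __init__(self):
        self._counter = itertools.count(1)
        self._lock = threading.Lock()

    def next_tag(self):
        # The lock makes allocation safe when requests are handled by
        # several threads at once.
        with self._lock:
            return '"%d"' % next(self._counter)
```

A server using such an allocator would assign a fresh tag on every content change and move the existing tag along with the resource on MOVE, since no other URL can ever have carried the same value.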
[rfc.comment.1: Mention the impact of depth=infinity namespace operations --reschke]
[draft-ietf-webdav-bind] defines a set of new namespace operations (BIND, UNBIND, REBIND). It's easy to see that for REBIND, the same considerations will apply as for MOVE, and that UNBIND will behave as DELETE. But what about BIND?
BIND creates a new URL mapping for a given resource. A server basically has two choices for implementing the "Last-Modified" for resources that support multiple bindings:
Note that unless a server implements namespace-wide unique entity tags, the same situation will apply to entity tags as well.
Client implementors will have to expect that HTTP response headers will vary for different URLs even though the underlying resource is the same. They will also have to expect that namespace operations such as MOVE, COPY, BIND or REBIND affect time stamps and entity tags in possibly surprising ways. The exact behaviour cannot be predicted, because these headers are defined by HTTP, not by WebDAV or BIND.
Looking at the properties defined in Section 13 of [RFC2518], only some of them are inherited from HTTP and thus will possibly behave as described above:
property | behaviour |
---|---|
creationdate | per resource |
displayname | per resource (some broken implementations behave differently) |
getcontentlanguage | potentially per URL as per HTTP |
getcontentlength | potentially per URL as per HTTP |
getcontenttype | potentially per URL as per HTTP |
getetag | potentially per URL as per HTTP |
getlastmodified | potentially per URL as per HTTP |
lockdiscovery | per resource |
resourcetype | per resource |
supportedlock | per resource |
source | per resource |
[draft-ietf-webdav-bind] | Clemm, G., Crawford, J., Reschke, J. F., and J. Whitehead, “Binding Extensions to Web Distributed Authoring and Versioning (WebDAV)”, Internet-Draft draft-ietf-webdav-bind-12, July 2005. |
[RFC2518] | Goland, Y., Whitehead, E., Faizi, A., Carter, S.R., and D. Jensen, “HTTP Extensions for Distributed Authoring -- WEBDAV”, RFC 2518, February 1999. |
[RFC2616] | Fielding, R., Gettys, J., Mogul, J., Frystyk, H., Masinter, L., Leach, P., and T. Berners-Lee, “Hypertext Transfer Protocol -- HTTP/1.1”, RFC 2616, June 1999. |
Example #1: a Java-based server maps filesystem objects to HTTP resources, and is stuck with what java.io.File supports (which only allows access to a very limited subset of the operating system's file information, see <http://java.sun.com/j2se/1.5.0/docs/api/java/io/File.html>). If the server doesn't fully control the filesystem, and unless it's prepared to store metadata out-of-band (outside the filesystem), it will have to compute entity tags based on file information such as the last-modified date and the length. The only robust alternative would be to compute a hash of the actual file's contents, but usually this is too expensive.
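The two approaches from Example #1 can be sketched as follows. Python is used here only for brevity; the function names are hypothetical, and the tag layout (hexadecimal modification time and length) is an assumption rather than a format used by any particular server.

```python
import hashlib
import os


def etag_from_file_info(path):
    """Cheap entity tag built from last-modified time and length.

    Marked weak because two different contents written within the
    timestamp granularity and with equal length would get the same tag.
    """
    st = os.stat(path)
    return 'W/"%x-%x"' % (st.st_mtime_ns, st.st_size)


def etag_from_content(path):
    """Robust but expensive alternative: hash the whole file."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return '"%s"' % digest.hexdigest()
```

The first function needs a single stat call per request, while the second reads the entire file, which is the cost the example refers to.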
Example #2: a module implementing WebDAV is just an add-on to the generic HTTP handler in a server (e.g., mod_dav inside Apache httpd server), and the server doesn't have any information except what it obtains from the underlying store (in this case the filesystem). Even if the server indeed has full access to the operating system's information, it may still not be able to use the file's inode information, for instance because it's a network drive.
HTTP distinguishes between "weak" and "strong" entity tags (see [RFC2616], Section 3.11). Only strong entity tags can be used in authoring scenarios such as the one described in Section 1.2. However, if an entity tag has been computed based on "last-modified" information, it only becomes a "strong" entity tag after a certain interval of inactivity on a resource. Thus, servers may return a weak entity tag as the result of a PUT operation, and only later "promote" it to a strong entity tag.
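The promotion described above can be sketched as a function of the modification time and the time of the request. The name `entity_tag`, the tag layout and the one-second default granularity are assumptions made for illustration.

```python
def entity_tag(mtime, size, now, granularity=1.0):
    """Return a weak tag while further changes could still share the same
    timestamp, and a strong tag once the granularity interval has passed.

    mtime and now are seconds since the epoch; size is the content length.
    """
    opaque = '"%x-%x"' % (int(mtime), size)
    if now - mtime < granularity:
        # Another write within this interval might not change mtime,
        # so the tag cannot yet be trusted for strong comparison.
        return "W/" + opaque
    return opaque
```

For example, `entity_tag(1000.0, 10, 1000.5)` yields `W/"3e8-a"`, while the same resource queried at `now=1002.0` yields the strong tag `"3e8-a"`.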
Requiring servers to always return strong entity tags in the first place will render Apache/mod_dav non-conformant.