HTTP Extensions for Web Distributed Authoring and Versioning (WebDAV)
Status of This Memo
This document specifies an Internet standards track protocol for the Internet community, and requests discussion and suggestions for improvements. Please refer to the current edition of the "Internet Official Protocol Standards" (STD 1) for the standardization state and status of this protocol. Distribution of this memo is unlimited.
Copyright (C) The IETF Trust (2007).
Web Distributed Authoring and Versioning (WebDAV) consists of a set of methods, headers, and content-types ancillary to HTTP/1.1 for the management of resource properties, creation and management of resource collections, URL namespace manipulation, and resource locking (collision avoidance).
RFC 2518 was published in February 1999, and this specification obsoletes RFC 2518 with minor revisions mostly due to interoperability experience.
This document describes an extension to the HTTP/1.1 protocol that allows clients to perform remote Web content authoring operations. This extension provides a coherent set of methods, headers, request entity body formats, and response entity body formats that provide operations for:
- Properties: The ability to create, remove, and query information about Web pages, such as their authors, creation dates, etc.
- Collections: The ability to create sets of documents and to retrieve a hierarchical membership listing (like a directory listing in a file system).
- Locking: The ability to keep more than one person from working on a document at the same time. This prevents the "lost update problem", in which modifications are lost as first one author, then another, writes changes without merging the other author's changes.
- Namespace Operations: The ability to instruct the server to copy and move Web resources, operations that change the mapping from URLs to resources.
Requirements and rationale for these operations are described in a companion document, "Requirements for a Distributed Authoring and Versioning Protocol for the World Wide Web" [RFC2291].
This document does not specify the versioning operations suggested by [RFC2291]. That work was done in a separate document, "Versioning Extensions to WebDAV" [RFC3253].
The sections below provide a detailed introduction to various WebDAV abstractions: resource properties (Section 4), collections of resources (Section 5), locks (Section 6) in general, and write locks (Section 7) specifically.
These abstractions are manipulated by the WebDAV-specific HTTP methods (Section 9) and the extra HTTP headers (Section 10) used with WebDAV methods. General considerations for handling HTTP requests and responses in WebDAV are found in Section 8.
While the status codes provided by HTTP/1.1 are sufficient to describe most error conditions encountered by WebDAV methods, there are some errors that do not fall neatly into the existing categories. This specification defines extra status codes developed for WebDAV methods (Section 11) and describes existing HTTP status codes (Section 12) as used in WebDAV. Since some WebDAV methods may operate over many resources, the Multi-Status response (Section 13) has been introduced to return status information for multiple resources. Finally, this version of WebDAV introduces precondition and postcondition (Section 16) XML elements in error response bodies.
WebDAV uses XML ([REC-XML]) for property names and some values, and also uses XML to marshal complicated requests and responses. This specification contains DTD and text definitions of all properties (Section 15) and all other XML elements (Section 14) used in marshalling. WebDAV includes a few special rules on extending WebDAV XML marshalling in backwards-compatible ways (Section 17).
Finishing off the specification are sections on what it means for a resource to be compliant with this specification (Section 18), on internationalization support (Section 19), and on security (Section 20).
2. Notational Conventions
Since this document describes a set of extensions to the HTTP/1.1 protocol, the augmented BNF used herein to describe protocol elements is exactly the same as described in Section 2.1 of [RFC2616], including the rules about implied linear whitespace. Since this augmented BNF uses the basic production rules provided in Section 2.2 of [RFC2616], these rules apply to this document as well. Note this is not the standard BNF syntax used in other RFCs.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].
Note that in natural language, a property like the "creationdate" property in the "DAV:" XML namespace is sometimes referred to as "DAV:creationdate" for brevity.
URI/URL - A Uniform Resource Identifier and Uniform Resource Locator, respectively. These terms (and the distinction between them) are defined in [RFC3986].
URI/URL Mapping - A relation between an absolute URI and a resource. Since a resource can represent items that are not network retrievable, as well as those that are, it is possible for a resource to have zero, one, or many URI mappings. Mapping a resource to an "http" scheme URI makes it possible to submit HTTP protocol requests to the resource using the URI.
Path Segment - Informally, the characters found between slashes ("/") in a URI. Formally, as defined in Section 3.3 of [RFC3986].
Collection - Informally, a resource that also acts as a container of references to child resources. Formally, a resource that contains a set of mappings between path segments and resources and meets the requirements defined in Section 5.
Internal Member (of a Collection) - Informally, a child resource of a collection. Formally, a resource referenced by a path segment mapping contained in the collection.
Internal Member URL (of a Collection) - A URL of an internal member, consisting of the URL of the collection (including trailing slash) plus the path segment identifying the internal member.
Member (of a Collection) - Informally, a "descendant" of a collection. Formally, an internal member of the collection, or, recursively, a member of an internal member.
Member URL (of a Collection) - A URL that is either an internal member URL of the collection itself, or is an internal member URL of a member of that collection.
Property - A name/value pair that contains descriptive information about a resource.
Live Property - A property whose semantics and syntax are enforced by the server. For example, the live property DAV:getcontentlength has its value, the length of the entity returned by a GET request, automatically calculated by the server.
Dead Property - A property whose semantics and syntax are not enforced by the server. The server only records the value of a dead property; the client is responsible for maintaining the consistency of the syntax and semantics of a dead property.
Principal - A distinct human or computational actor that initiates access to network resources.
State Token - A URI that represents a state of a resource. Lock tokens are the only state tokens defined in this specification.
4. Data Model for Resource Properties
4.1. The Resource Property Model
Properties are pieces of data that describe the state of a resource. Properties are data about data.
Properties are used in distributed authoring environments to provide for efficient discovery and management of resources. For example, a 'subject' property might allow for the indexing of all resources by their subject, and an 'author' property might allow for the discovery of what authors have written which documents.
The DAV property model consists of name/value pairs. The name of a property identifies the property's syntax and semantics, and provides an address by which to refer to its syntax and semantics.
There are two categories of properties: "live" and "dead". A live property has its syntax and semantics enforced by the server. Live properties include cases where a) the value of a property is protected and maintained by the server, and b) the value of the property is maintained by the client, but the server performs syntax checking on submitted values. All instances of a given live property MUST comply with the definition associated with that property name. A dead property has its syntax and semantics enforced by the client; the server merely records the value of the property verbatim.
4.2. Properties and HTTP Headers
Properties already exist, in a limited sense, in HTTP message headers. However, in distributed authoring environments, a relatively large number of properties are needed to describe the state of a resource, and setting/returning them all through HTTP headers is inefficient. Thus, a mechanism is needed that allows a principal to identify a set of properties in which the principal is interested and to set or retrieve just those properties.
4.3. Property Values
The value of a property is always a (well-formed) XML fragment.
XML has been chosen because it is a flexible, self-describing, structured data format that supports rich schema definitions, and because of its support for multiple character sets. XML's self-describing nature allows any property's value to be extended by adding elements. Clients will not break when they encounter extensions because they will still have the data specified in the original schema and MUST ignore elements they do not understand.
XML's support for multiple character sets allows any human-readable property to be encoded and read in a character set familiar to the user. XML's support for multiple human languages, using the "xml: lang" attribute, handles cases where the same character set is employed by multiple human languages. Note that xml:lang scope is recursive, so an xml:lang attribute on any element containing a property name element applies to the property value unless it has been overridden by a more locally scoped attribute. Note that a property only has one value, in one language (or language MAY be left undefined); a property does not have multiple values in different languages or a single value in multiple languages.
A property is always represented with an XML element consisting of the property name, called the "property name element". The simplest example is an empty property, which is different from a property that does not exist:
The value of the property appears inside the property name element. The value may be any kind of well-formed XML content, including both text-only and mixed content. Servers MUST preserve the following XML Information Items (using the terminology from [REC-XML-INFOSET]) in storage and transmission of dead properties:
For the property name Element Information Item itself:
[namespace name] [local name] [attributes] named "xml:lang" or any such attribute in scope [children] of type element or character
On all Element Information Items in the property value:
[namespace name] [local name] [attributes] [children] of type element or character
On Attribute Information Items in the property value:
[namespace name] [local name] [normalized value]
On Character Information Items in the property value:
Since prefixes are used in some XML vocabularies (XPath and XML Schema, for example), servers SHOULD preserve, for any Information Item in the value:
XML Infoset attributes not listed above MAY be preserved by the server, but clients MUST NOT rely on them being preserved. The above rules would also apply by default to live properties, unless defined otherwise.
Servers MUST ignore the XML attribute xml:space if present and never use it to change whitespace handling. Whitespace in property values is significant.
4.3.1. Example - Property with Mixed Content
Consider a dead property 'author' created by the client as follows:
<D:prop xml:lang="en" xmlns:D="DAV:"> <x:author xmlns:x='http://example.com/ns'> <x:name>Jane Doe</x:name> <!-- Jane's contact info --> <x:uri type='email' added='2005-11-26'>mailto:email@example.com</x:uri> <x:uri type='web' added='2005-11-27'>http://www.example.com</x:uri> <x:notes xmlns:h='http://www.w3.org/1999/xhtml'> Jane has been working way <h:em>too</h:em> long on the long-awaited revision of <![CDATA[<RFC2518>]]>. </x:notes> </x:author> </D:prop>
When this property is requested, a server might return:
<D:prop xmlns:D='DAV:'><author xml:lang='en' xmlns:x='http://example.com/ns' xmlns='http://example.com/ns' xmlns:h='http://www.w3.org/1999/xhtml'> <x:name>Jane Doe</x:name> <x:uri added="2005-11-26" type="email" >mailto:firstname.lastname@example.org</x:uri> <x:uri added="2005-11-27" type="web" >http://www.example.com</x:uri> <x:notes> Jane has been working way <h:em>too</h:em> long on the long-awaited revision of <RFC2518>. </x:notes> </author> </D:prop>
Note in this example:
- The [prefix] for the property name itself was not preserved, being non-significant, whereas all other [prefix] values have been preserved,
- attribute values have been rewritten with double quotes instead of single quotes (quoting style is not significant), and attribute order has not been preserved,
- the xml:lang attribute has been returned on the property name element itself (it was in scope when the property was set, but the exact position in the response is not considered significant as long as it is in scope),
- whitespace between tags has been preserved everywhere (whitespace between attributes not so),
- CDATA encapsulation was replaced with character escaping (the reverse would also be legal),
- the comment item was stripped (as would have been a processing instruction item).
Implementation note: there are cases such as editing scenarios where clients may require that XML content is preserved character by character (such as attribute ordering or quoting style). In this case, clients should consider using a text-only property value by escaping all characters that have a special meaning in XML parsing.
4.4. Property Names
A property name is a universally unique identifier that is associated with a schema that provides information about the syntax and semantics of the property.
Because a property's name is universally unique, clients can depend upon consistent behavior for a particular property across multiple resources, on the same and across different servers, so long as that property is "live" on the resources in question, and the implementation of the live property is faithful to its definition.
The XML namespace mechanism, which is based on URIs (), is used to name properties because it prevents namespace collisions and provides for varying degrees of administrative control.
The property namespace is flat; that is, no hierarchy of properties is explicitly recognized. Thus, if a property A and a property A/B exist on a resource, there is no recognition of any relationship between the two properties. It is expected that a separate specification will eventually be produced that will address issues relating to hierarchical properties.
Finally, it is not possible to define the same property twice on a single resource, as this would cause a collision in the resource's property namespace.
4.5. Source Resources and Output Resources
Some HTTP resources are dynamically generated by the server. For these resources, there presumably exists source code somewhere governing how that resource is generated. The relationship of source files to output HTTP resources may be one to one, one to many, many to one, or many to many. There is no mechanism in HTTP to determine whether a resource is even dynamic, let alone where its source files exist or how to author them. Although this problem would usefully be solved, interoperable WebDAV implementations have been widely deployed without actually solving this problem, by dealing only with static resources. Thus, the source vs. output problem is not solved in this specification and has been deferred to a separate document.
5. Collections of Web Resources
This section provides a description of a type of Web resource, the collection, and discusses its interactions with the HTTP URL namespace and with HTTP methods. The purpose of a collection resource is to model collection-like objects (e.g., file system directories) within a server's namespace.
All DAV-compliant resources MUST support the HTTP URL namespace model specified herein.
5.1. HTTP URL Namespace Model
The HTTP URL namespace is a hierarchical namespace where the hierarchy is delimited with the "/" character.
An HTTP URL namespace is said to be consistent if it meets the following conditions: for every URL in the HTTP hierarchy there exists a collection that contains that URL as an internal member URL. The root, or top-level collection of the namespace under consideration, is exempt from the previous rule. The top-level collection of the namespace under consideration is not necessarily the collection identified by the absolute path '/' -- it may be identified by one or more path segments (e.g., /servlets/webdav/...)
Neither HTTP/1.1 nor WebDAV requires that the entire HTTP URL namespace be consistent -- a WebDAV-compatible resource may not have a parent collection. However, certain WebDAV methods are prohibited from producing results that cause namespace inconsistencies.
As is implicit in [RFC2616] and [RFC3986], any resource, including collection resources, MAY be identified by more than one URI. For example, a resource could be identified by multiple HTTP URLs.
5.2. Collection Resources
Collection resources differ from other resources in that they also act as containers. Some HTTP methods apply only to a collection, but some apply to some or all of the resources inside the container defined by the collection. When the scope of a method is not clear, the client can specify what depth to apply. Depth can be either zero levels (only the collection), one level (the collection and directly contained resources), or infinite levels (the collection and all contained resources recursively).
A collection's state consists of at least a set of mappings between path segments and resources, and a set of properties on the collection itself. In this document, a resource B will be said to be contained in the collection resource A if there is a path segment mapping that maps to B and that is contained in A. A collection MUST contain at most one mapping for a given path segment, i.e., it is illegal to have the same path segment mapped to more than one resource.
Properties defined on collections behave exactly as do properties onnon-collection resources. A collection MAY have additional statesuch as entity bodies returned by GET.
For all WebDAV-compliant resources A and B, identified by URLs "U" and "V", respectively, such that "V" is equal to "U/SEGMENT", A MUST be a collection that contains a mapping from "SEGMENT" to B. So, if resource B with URL "http://example.com/bar/blah" is WebDAV compliant and if resource A with URL "http://example.com/bar/" is WebDAV compliant, then resource A must be a collection and must contain exactly one mapping from "blah" to B.
Although commonly a mapping consists of a single segment and a resource, in general, a mapping consists of a set of segments and a resource. This allows a server to treat a set of segments as equivalent (i.e., either all of the segments are mapped to the same resource, or none of the segments are mapped to a resource). For example, a server that performs case-folding on segments will treat the segments "ab", "Ab", "aB", and "AB" as equivalent. A client can then use any of these segments to identify the resource. Note that a PROPFIND result will select one of these equivalent segments to identify the mapping, so there will be one PROPFIND response element per mapping, not one per segment in the mapping.
Collection resources MAY have mappings to non-WebDAV-compliant resources in the HTTP URL namespace hierarchy but are not required to do so. For example, if resource X with URL "http://example.com/bar/blah" is not WebDAV compliant and resource A with "URL http://example.com/bar/" identifies a WebDAV collection, then A may or may not have a mapping from "blah" to X.
If a WebDAV-compliant resource has no WebDAV-compliant internal members in the HTTP URL namespace hierarchy, then the WebDAV- compliant resource is not required to be a collection.
There is a standing convention that when a collection is referred to by its name without a trailing slash, the server MAY handle the request as if the trailing slash were present. In this case, it SHOULD return a Content-Location header in the response, pointing to the URL ending with the "/". For example, if a client invokes a method on http://example.com/blah (no trailing slash), the server may respond as if the operation were invoked on http://example.com/blah/ (trailing slash), and should return a Content-Location header with the value http://example.com/blah/. Wherever a server produces a URL referring to a collection, the server SHOULD include the trailing slash. In general, clients SHOULD use the trailing slash form of collection names. If clients do not use the trailing slash form the client needs to be prepared to see a redirect response. Clients will
find the DAV:resourcetype property more reliable than the URL to find out if a resource is a collection.
Clients MUST be able to support the case where WebDAV resources are contained inside non-WebDAV resources. For example, if an OPTIONS response from "http://example.com/servlet/dav/collection" indicates WebDAV support, the client cannot assume that "http://example.com/servlet/dav/" or its parent necessarily are WebDAV collections.
A typical scenario in which mapped URLs do not appear as members of their parent collection is the case where a server allows links or redirects to non-WebDAV resources. For instance, "/col/link" might not appear as a member of "/col/", although the server would respond with a 302 status to a GET request to "/col/link"; thus, the URL "/col/link" would indeed be mapped. Similarly, a dynamically- generated page might have a URL mapping from "/col/index.html", thus this resource might respond with a 200 OK to a GET request yet not appear as a member of "/col/".
Some mappings to even WebDAV-compliant resources might not appear in the parent collection. An example for this case are servers that support multiple alias URLs for each WebDAV-compliant resource. A server may implement case-insensitive URLs, thus "/col/a" and "/col/A" identify the same resource, yet only either "a" or "A" is reported upon listing the members of "/col". In cases where a server treats a set of segments as equivalent, the server MUST expose only one preferred segment per mapping, consistently chosen, in PROPFIND responses.