Network Working Group                                      J. Urpalainen
Request for Comments: 5261                                         Nokia
Category: Standards Track                                 September 2008

   An Extensible Markup Language (XML) Patch Operations Framework
         Utilizing XML Path Language (XPath) Selectors

Status of This Memo

This document specifies an Internet standards track protocol for the Internet community, and requests discussion and suggestions for improvements. Please refer to the current edition of the "Internet Official Protocol Standards" (STD 1) for the standardization state and status of this protocol. Distribution of this memo is unlimited.

Abstract

Extensible Markup Language (XML) documents are widely used as containers for the exchange and storage of arbitrary data in today's systems. In order to send changes to an XML document, an entire copy of the new version must be sent, unless there is a means of indicating only the portions that have changed. This document describes an XML patch framework utilizing XML Path language (XPath) selectors. These selector values and updated new data content constitute the basis of patch operations described in this document. Combined with the basic <add>, <replace>, and <remove> directives, they allow a set of patches to be applied to update an existing XML document.

Table of Contents

1. Introduction
2. Conventions
3. Basic Features and Requirements
4. Patch Operations
   4.1. Locating the Target of a Patch
   4.2. Namespace Mangling
      4.2.1. Namespaces Used in Selectors
      4.2.2. Departures from XPath Requirements
      4.2.3. Namespaces and Added/Changed Content
   4.3. <add> Element
      4.3.1. Adding an Element
      4.3.2. Adding an Attribute
      4.3.3. Adding a Prefixed Namespace Declaration
      4.3.4. Adding Node(s) with the 'pos' Attribute
      4.3.5. Adding Multiple Nodes
   4.4. <replace> Element

Urpalainen                  Standards Track                     [Page 1]

RFC 5261                    Patch Operations              September 2008

1.  Introduction

Extensible Markup Language (XML) [W3C.REC-xml-20060816] documents are widely used as containers for the exchange and storage of arbitrary data in today's systems. In order to send changes to an XML document, an entire copy of the new version must be sent, unless there is a means of indicating only the portions that have changed (patches).

This document describes an XML patch framework that utilizes XML Path language (XPath) [W3C.REC-xpath-19991116] selectors. An XPath selector is used to pinpoint the specific portion of the XML that is the target for the change. These selector values and updated new data content constitute the basis of patch operations described in this document. Combined with the basic <add>, <replace>, and <remove> directives, they allow a set of patches to be applied to update an existing target XML document.

These patch operations provide simple semantics for data-oriented XML documents [W3C.REC-xmlschema-2-20041028]: modifications such as additions, removals, or substitutions of elements and attributes can easily be performed.

This document does not describe a full XML diff format, only basic patch operation elements that can be embedded within a full format that typically has additional semantics. As one concrete example, in the Session Initiation Protocol (SIP) [RFC3903] based presence system, a partial PIDF XML document format [RFC5262] consists of the existing Presence Information Data Format (PIDF) document format combined with the patch operations elements described in this document. In general, patch operations can be used in any application that exchanges XML documents, for example, within the SIP Events framework [RFC3265]. Yet another example is XCAP-diff [SIMPLE-XCAP], which uses this framework for sending partial updates of changes to XCAP [RFC4825] resources.

2.  Conventions

In this document, the key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" are to be interpreted as described in RFC 2119, BCP 14 [RFC2119] and indicate requirement levels for compliant implementations.

The following terms are used in this document:

Target XML document:  The XML document that is going to be updated with a set of patches.

3.  Basic Features and Requirements

In this framework, patch operation elements are described with XML Schema types [W3C.REC-xmlschema-1-20041028]. XPath selectors pinpoint the target for a change and they are expressed as attributes of these elements. The child node(s) of patch operation elements contain the new data content. In general when applicable, the new content SHOULD be moved unaltered to the patched XML document.

XML documents that are equivalent for the purposes of many applications MAY differ in their physical representation. The aim of this document is to describe a deterministic framework where the canonical form [W3C.REC-xml-c14n-20010315] of an XML document determines logical equivalence. For example, white space text nodes MUST be processed properly in order to fulfill this requirement as white space is by default significant [W3C.REC-xml-c14n-20010315].

The specifications referencing these element schema types MUST define the full XML diff format with an appropriate MIME type [RFC3023] and a character set, e.g., UTF-8 [RFC3629]. For example, the partial PIDF format [RFC5262] includes this schema and describes additional definitions to produce a complete XML diff format for partial presence information updates.

As the schema defined in this document does not declare any target namespace, the type definitions inherit the target namespace of the including schema. Therefore, additional namespace declarations within the XML diff documents can be avoided.

It is anticipated that applications using these types will define <add>, <replace>, and <remove> elements based on the corresponding type definitions in this schema. In addition, an application may reference only a subset of these type definitions. A future extension can introduce other operations, e.g., with document-oriented models [W3C.REC-xmlschema-2-20041028], a <move> operation and a text node patching algorithm combined with <move> would undoubtedly produce smaller XML diff documents.

The instance document elements based on these schema type definitions MUST be well formed and SHOULD be valid.

The following XPath 1.0 data model node types can be added, replaced, or removed with this framework: elements, attributes, namespaces, comments, text nodes, and processing instructions. The full XML prolog, including for example XML entities [W3C.REC-xml-20060816] and the root node of an XML document, cannot be patched according to this framework. However, patching of comments and processing instructions of the root node is allowed. Naturally, the removal or addition of a document root element is not allowed as any valid XML document MUST always contain a single root element. Also, note that support for external entities is beyond the scope of this framework.

4.  Patch Operations
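As a rough illustration of how the <add>, <replace>, and <remove> directives and their XPath selectors fit together, the following Python sketch applies a small set of patches to a target document. It is not a conforming implementation: it handles element targets only, relies on the limited XPath subset of Python's xml.etree.ElementTree, ignores namespaces and white space handling, and assumes that operations are applied in document order. The <diff> wrapper element, the sample documents, and the apply_patches name are hypothetical; 'sel' is the selector attribute carried by the patch operation elements.

```python
import xml.etree.ElementTree as ET


def apply_patches(target_xml: str, diff_xml: str) -> str:
    """Apply element-level <add>, <replace>, and <remove> operations (sketch)."""
    # A synthetic holder lets selectors address the root element by name,
    # e.g. sel="doc/foo"; it is an illustration device only.
    holder = ET.Element("holder")
    holder.append(ET.fromstring(target_xml))

    for op in ET.fromstring(diff_xml):
        sel = op.get("sel")
        matches = holder.findall(sel)
        # This sketch insists on exactly one target node per selector.
        if len(matches) != 1:
            raise ValueError(f"selector {sel!r} must locate exactly one node")
        target = matches[0]
        # ElementTree keeps no parent pointers, so rebuild a child -> parent map.
        parents = {child: parent for parent in holder.iter() for child in parent}
        parent = parents[target]

        if op.tag == "add":
            target.extend(list(op))      # new content becomes the last children
        elif op.tag == "replace":
            new = op[0]
            new.tail = target.tail
            parent[list(parent).index(target)] = new
        elif op.tag == "remove":
            if parent is holder:
                raise ValueError("the document root element cannot be removed")
            parent.remove(target)
        else:
            raise ValueError(f"unknown patch operation <{op.tag}>")

    return ET.tostring(holder[0], encoding="unicode")


# Hypothetical target and diff documents.
TARGET = '<doc><foo id="a">old</foo><bar/></doc>'
DIFF = """<diff>
  <add sel="doc"><baz/></add>
  <replace sel="doc/foo[@id='a']"><foo id="a">new</foo></replace>
  <remove sel="doc/bar"/>
</diff>"""

patched = ET.fromstring(apply_patches(TARGET, DIFF))
```

After the three operations, the patched <doc> element holds the replaced <foo> followed by the added <baz>, and the <bar> element is gone.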

4.1.  Locating the Target of a Patch

[W3C.REC-xml-names-20060816]. A "*" character selects all element children of the context node. Right after the node test, a location step can contain one or more predicates in any order. An attribute value comparison is one of the most typical predicates. The string value of the current context node or a child element may alternatively be used to identify elements in the tree. The character ".", which denotes a current context node selection, is an abbreviated form of "self::node()". Lastly, positional constraints like "[2]" can also be used as an additional predicate.

An XPath 1.0 "id()" node-set function MAY also be used to identify unique elements from the document tree. The schema that describes the content model of the document MUST then use an attribute with the type ID [W3C.REC-xmlschema-2-20041028] or with non-validating XML parsers, an "xml:id" [W3C.WD-xml-id-20041109] attribute MUST have been used within an instance document.

4.2.  Namespace Mangling
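The location-step features described in Section 4.1 (the "*" wildcard, attribute and string-value predicates, the "." abbreviation, and positional constraints) can be tried out with the limited XPath subset in Python's xml.etree.ElementTree. This is only an approximation of the XPath 1.0 behavior the framework relies on: the sample document and its element and attribute names are invented for illustration, the "[.='text']" form needs Python 3.7 or later, and the "id()" function is not available in this module.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, not taken from the specification.
DOC = """<doc>
  <entry id="a1"><name>alpha</name></entry>
  <entry id="a2"><name>beta</name></entry>
  <note>hello</note>
</doc>"""

root = ET.fromstring(DOC)

# "*" selects all element children of the context node.
all_children = root.findall("*")

# Attribute value comparison, one of the most typical predicates.
by_attribute = root.findall("entry[@id='a2']")

# String value of a child element used to identify its parent element.
by_child_value = root.findall("entry[name='beta']")

# String value of the selected node itself, written with ".".
by_own_value = root.findall("note[.='hello']")

# Positional constraint: the second <entry> child.
by_position = root.findall("entry[2]")

# "." on its own denotes the context node.
context = root.find(".")
```

Each of the predicate selectors above locates a single element, which is the situation a patch selector is meant to produce.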

4.2.1.  Namespaces Used in Selectors

4.2.2.  Departures from XPath Requirements

[W3C.REC-xpath20-20070123]. In XPath 1.0, a "bar" selector always locates an unqualified <bar> element. In XPath 2.0, a "bar" selector not only matches an unqualified <bar> element, but also matches a

4.2.3.  Namespaces and Added/Changed Content

(Section 4.3.3). A fairly difficult use case for these rules is found when the target document has several namespace declarations in scope for the same namespace. A target document might declare several different

(Section 4.3), it is the <bar> element of the root document element. With modifications of elements, the evaluation context node is the parent element of the modified element, which in the previous example is the root document element.

-  Secondly, the prefix (also empty) of the evaluation context node MUST be chosen if the namespace URIs are equal.

-  Lastly, if the above two rules still don't apply, first all in-scope namespace prefixes of the evaluation context node are arranged alphabetically in an ascending order. If a default namespace declaration exists, it is interpreted as the first entry in this list. The prefix from the list is then chosen that appears as the closest and just before the compared prefix if it were inserted into the list. If the compared prefix were to exist before the first prefix, the first prefix in the list MUST be selected (i.e., there's no default namespace).

For example, if the list of in-scope prefixes in the target document is "x", "y" and the compared prefix in the diff document is "xx", then the "x" prefix MUST be chosen. If an "a" prefix were evaluated, the "x" prefix, the first entry, MUST be chosen. If there were also an in-scope default namespace declaration, an evaluable "a" prefix would then select the default declaration.

Note that unprefixed attributes don't inherit the default namespace declaration. When adding qualified attributes, the default namespace declaration is then not on this matching list of prefixes (see Section 4.3.2).
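The last of the prefix-matching rules above can be sketched in Python as follows. The function name and parameters are hypothetical, the candidate list is assumed to hold the in-scope prefixes that the rule arranges alphabetically (with the empty string standing for a default namespace declaration), and "alphabetically" is assumed to mean plain code-point ordering of the prefix strings. The earlier rules are assumed to have been tried already without success.

```python
from bisect import bisect_right
from typing import Iterable


def choose_prefix(in_scope: Iterable[str], compared: str,
                  for_attribute: bool = False) -> str:
    """Pick an in-scope prefix according to the last matching rule (sketch).

    in_scope      -- candidate prefixes of the evaluation context node;
                     "" stands for a default namespace declaration
    compared      -- the prefix used in the diff document
    for_attribute -- qualified attributes never use the default namespace,
                     so "" is dropped from the list
    """
    candidates = sorted(set(in_scope))   # ascending order; "" sorts first
    if for_attribute:
        candidates = [p for p in candidates if p != ""]
    if not candidates:
        raise ValueError("no in-scope prefix to choose from")
    # Index at which the compared prefix would be inserted into the list.
    position = bisect_right(candidates, compared)
    # Take the entry just before that position, or the first entry when the
    # compared prefix would come before everything in the list.
    return candidates[max(position - 1, 0)]
```

With the examples from the text, choose_prefix(["x", "y"], "xx") and choose_prefix(["x", "y"], "a") both return "x", while choose_prefix(["", "x", "y"], "a") returns the empty string, that is, the default declaration. Passing for_attribute=True in the last case returns "x" instead, since the default declaration is left off the list for qualified attributes.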

[W3C.REC-xmlschema-2-20041028]. Note that with namespace qualified attributes, prefix matching within the 'type' attribute follows rules similar to those described in Section 4.2.3. Also, note that the possible default namespace declaration of the context element isn't then applicable.

Note: As the 'sel' selector value MAY contain quotation marks, the escaped forms "&quot;" or "&apos;" can be used within attribute values. However, it is often more appropriate to use the apostrophe (') character as shown in these examples. An alternative is also to interchange the apostrophes and quotation marks.

4.3.3.  Adding a Prefixed Namespace Declaration

4.3.4.  Adding Node(s) with the 'pos' Attribute

4.3.5.  Adding Multiple Nodes

Although XML entities [W3C.REC-xml-20060816] cannot be patched with this framework, references to entities other than the predefined internal entities can exist within text nodes or attributes when the XML prolog contains those declarations. These references may then be preserved if both the XML diff and the target XML document have identical declarations within their prologs. Otherwise, references may be replaced with identical text as long as the "canonically equivalent" rule is obeyed.

4.4.  <replace> Element

4.4.1.  Replacing an Element

4.4.2.  Replacing an Attribute Value

4.4.3.  Replacing a Namespace Declaration URI

4.4.4.  Replacing a Comment Node

4.4.5.  Replacing a Processing Instruction Node

4.4.6.  Replacing a Text Node

[W3C.REC-xmlschema-1-20041028] where several text node siblings typically exist. If a text node is updated and the <replace> element is empty, the text node MUST thus be removed as a text node MUST always have at least one character of data.

4.5.  <remove> Element

4.5.1.  Removing an Element

4.5.2.  Removing an Attribute

4.5.3.  Removing a Prefixed Namespace Declaration

4.5.4.  Removing a Comment Node

4.5.5.  Removing a Processing Instruction Node

4.5.6.  Removing a Text Node

6.  Usage of Patch Operations

7.  Usage of Selector Values

8.  XML Schema Types of Patch Operation Elements

9.  XML Schema of Patch Operation Errors

A.3.  Adding a Prefixed Namespace Declaration

A.4.  Adding a Comment Node with the 'pos' Attribute

A.5.  Adding Multiple Nodes

A.6.  Replacing an Element

A.7.  Replacing an Attribute Value

A.8.  Replacing a Namespace Declaration URI

A.9.  Replacing a Comment Node

A.10.  Replacing a Processing Instruction Node

A.11.  Replacing a Text Node

A.12.  Removing an Element

A.13.  Removing an Attribute

A.14.  Removing a Prefixed Namespace Declaration

A.15.  Removing a Comment Node

A.16.  Removing a Processing Instruction Node

A.17.  Removing a Text Node

A.18.  Several Patches With Namespace Mangling

Full Copyright Statement

Copyright (C) The IETF Trust (2008).

This document is subject to the rights, licenses and restrictions contained in BCP 78, and except as set forth therein, the authors retain all their rights.

This document and the information contained herein are provided on an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Intellectual Property

The IETF takes no position regarding the validity or scope of any Intellectual Property Rights or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; nor does it represent that it has made any independent effort to identify any such rights. Information on the procedures with respect to rights in RFC documents can be found in BCP 78 and BCP 79.

Copies of IPR disclosures made to the IETF Secretariat and any assurances of licenses to be made available, or the result of an attempt made to obtain a general license or permission for the use of such proprietary rights by implementers or users of this specification can be obtained from the IETF on-line IPR repository at http://www.ietf.org/ipr.

The IETF invites any interested party to bring to its attention any copyrights, patents or patent applications, or other proprietary rights that may cover technology that may be required to implement this standard. Please address the information to the IETF at ietf-ipr@ietf.org.