PROPOSED STANDARD

Errata Exist

Internet Engineering Task Force (IETF)                        C. Bormann
Request for Comments: 7049                       Universitaet Bremen TZI
Category: Standards Track                                     P. Hoffman
ISSN: 2070-1721                                           VPN Consortium
                                                            October 2013

             Concise Binary Object Representation (CBOR)

Abstract

The Concise Binary Object Representation (CBOR) is a data format whose design goals include the possibility of extremely small code size, fairly small message size, and extensibility without the need for version negotiation. These design goals make it different from earlier binary serializations such as ASN.1 and MessagePack.

Status of This Memo

This is an Internet Standards Track document.

This document is a product of the Internet Engineering Task Force (IETF). It represents the consensus of the IETF community. It has received public review and has been approved for publication by the Internet Engineering Steering Group (IESG). Further information on Internet Standards is available in Section 2 of RFC 5741.

Information about the current status of this document, any errata, and how to provide feedback on it may be obtained at http://www.rfc-editor.org/info/rfc7049.

Copyright Notice

Copyright (c) 2013 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

Bormann & Hoffman            Standards Track                    [Page 1]

RFC 7049                          CBOR                      October 2013

Table of Contents

   1.  Introduction
     1.1.  Objectives
     1.2.  Terminology
   2.  Specification of the CBOR Encoding
     2.1.  Major Types
     2.2.  Indefinite Lengths for Some Major Types
       2.2.1.  Indefinite-Length Arrays and Maps
       2.2.2.  Indefinite-Length Byte Strings and Text Strings
     2.3.  Floating-Point Numbers and Values with No Content
     2.4.  Optional Tagging of Items
       2.4.1.  Date and Time
       2.4.2.  Bignums
       2.4.3.  Decimal Fractions and Bigfloats
       2.4.4.  Content Hints
         2.4.4.1.  Encoded CBOR Data Item
         2.4.4.2.  Expected Later Encoding for CBOR-to-JSON Converters
         2.4.4.3.  Encoded Text
       2.4.5.  Self-Describe CBOR
   3.  Creating CBOR-Based Protocols
     3.1.  CBOR in Streaming Applications
     3.2.  Generic Encoders and Decoders
     3.3.  Syntax Errors
       3.3.1.  Incomplete CBOR Data Items
       3.3.2.  Malformed Indefinite-Length Items
       3.3.3.  Unknown Additional Information Values
     3.4.  Other Decoding Errors
     3.5.  Handling Unknown Simple Values and Tags
     3.6.  Numbers
     3.7.  Specifying Keys for Maps
     3.8.  Undefined Values
     3.9.  Canonical CBOR
     3.10. Strict Mode
   4.  Converting Data between CBOR and JSON
     4.1.  Converting from CBOR to JSON
     4.2.  Converting from JSON to CBOR
   5.  Future Evolution of CBOR
     5.1.  Extension Points
     5.2.  Curating the Additional Information Space
   6.  Diagnostic Notation
     6.1.  Encoding Indicators
   7.  IANA Considerations
     7.1.  Simple Values Registry
     7.2.  Tags Registry
     7.3.  Media Type ("MIME Type")
     7.4.  CoAP Content-Format

1.1.  Objectives

CNN-TERMS]).

   *  The format should use contemporary machine representations of data (for example, not requiring binary-to-decimal conversion).

3.  Data must be able to be decoded without a schema description.

   *  Similar to JSON, encoded data should be self-describing so that a generic decoder can be written.

4.  The serialization must be reasonably compact, but data compactness is secondary to code compactness for the encoder and decoder.

   *  "Reasonable" here is bounded by JSON as an upper bound in size, and by implementation complexity maintaining a lower bound. Using either general compression schemes or extensive bit-fiddling violates the complexity goals.

1.2.  Terminology

RFC 2119, BCP 14 [RFC2119] and indicate requirement levels for compliant CBOR implementations.

The term "byte" is used in its now-customary sense as a synonym for "octet". All multi-byte values are encoded in network byte order (that is, most significant byte first, also known as "big-endian").

This specification makes use of the following terminology:

Data item:  A single piece of CBOR data. The structure of a data item may contain zero, one, or more nested data items. The term is used both for the data item in representation format and for the abstract idea that can be derived from that by a decoder.

2.  Specification of the CBOR Encoding

Section 2.1) and additional information (the low-order 5 bits). When the value of the additional information is less than 24, it is directly used as a small unsigned integer. When it is 24 to 27, the additional bytes for a variable-length integer immediately follow; the values 24 to 27 of the additional information specify that its length is a 1-, 2-, 4-, or 8-byte unsigned integer, respectively. Additional information

Section 2.2. Additional information values 28 to 30 are reserved for future expansion.

In all additional information values, the resulting integer is interpreted depending on the major type. It may represent the actual data: for example, in integer types, the resulting integer is used for the value itself. It may instead supply length information: for example, in byte strings it gives the length of the byte string data that follows.

A CBOR decoder implementation can be based on a jump table with all 256 defined values for the initial byte (Table 5). A decoder in a constrained implementation can instead use the structure of the initial byte and following bytes for more compact code (see Appendix C for a rough impression of how this could look).

2.1.  Major Types
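The initial-byte structure described in Section 2 (a major type in the high-order 3 bits, additional information in the low-order 5 bits) can be sketched in a few lines of Python. This is a minimal illustration and not the pseudocode of Appendix C: the function name `decode_head` and its return shape are assumptions made for this example, and the treatment of additional information value 31 as an indefinite-length marker relies on Section 2.2.

```python
def decode_head(data: bytes, offset: int = 0):
    """Return (major_type, argument, next_offset) for the item head at offset.

    Hypothetical helper for illustration; `argument` is None when the
    additional information is 31 (indefinite length, see Section 2.2).
    """
    initial = data[offset]
    major_type = initial >> 5      # high-order 3 bits
    info = initial & 0x1F          # low-order 5 bits
    offset += 1

    if info < 24:
        # The additional information is itself the small unsigned integer.
        return major_type, info, offset

    if info <= 27:
        # 24..27 announce a 1-, 2-, 4-, or 8-byte unsigned integer
        # in network byte order.
        size = 1 << (info - 24)
        if offset + size > len(data):
            raise ValueError("incomplete CBOR data item")
        # For major type 7 these bytes hold a simple value or a float;
        # this sketch just returns them as a raw integer.
        argument = int.from_bytes(data[offset:offset + size], "big")
        return major_type, argument, offset + size

    if info == 31:
        return major_type, None, offset

    # 28..30 are reserved; a decoder cannot continue parsing.
    raise ValueError(f"reserved additional information value {info}")
```

For instance, `decode_head(bytes.fromhex("196ab3"))` yields major type 0 with the argument 27315 read from the two bytes that follow the initial byte.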

RFC3629]. The format of this type is identical to that of byte strings (major type 2), that is, as with major type 2, the length gives the number of bytes. This type is provided for systems that need to interpret or display human-readable text, and allows the differentiation between unstructured bytes and text that has a specified repertoire and encoding. In contrast to formats such as JSON, the Unicode characters in this type are never escaped. Thus, a newline character (U+000A) is always represented in a string as the byte 0x0a, and never as the bytes 0x5c6e (the characters "\" and "n") or as 0x5c7530303061 (the characters "\", "u", "0", "0", "0", and "a").

Major type 4:  an array of data items. Arrays are also called lists, sequences, or tuples. The array's length follows the rules for byte strings (major type 2), except that the length denotes the number of data items, not the length in bytes that the array takes up. Items in an array do not need to all be of the same type. For example, an array that contains 10 items of any type would have an initial byte of 0b100_01010 (major type of 4, additional information of 10 for the length) followed by the 10 remaining items.

Major type 5:  a map of pairs of data items. Maps are also called tables, dictionaries, hashes, or objects (in JSON). A map consists of pairs of data items, each pair consisting of a key that is immediately followed by a value. The map's length follows the rules for byte strings (major type 2), except that the length denotes the number of pairs, not the length in bytes that the map takes up. For example, a map that contains 9 pairs would have an initial byte of 0b101_01001 (major type of 5, additional information of 9 for the number of pairs) followed by the 18 remaining items. The first item is the first key, the second item is the first value, the third item is the second key, and so on. A map that has duplicate keys may be well-formed, but it is not valid, and thus it causes indeterminate decoding; see also Section 3.7.

Major type 6:  optional semantic tagging of other major types. See Section 2.4.
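The length rules for text strings, arrays, and maps can be illustrated with a small definite-length encoder. The sketch below is an assumption-laden example rather than a reference implementation: the names `encode_head` and `encode` are hypothetical, only Python `int`, `bytes`, `str`, `list`, and `dict` values are handled, and the integer and byte string cases (major types 0 to 2) follow the rules of Section 2 and Section 2.1.

```python
def encode_head(major_type: int, argument: int) -> bytes:
    """Encode the initial byte plus any following length/value bytes."""
    if argument < 24:
        return bytes([major_type << 5 | argument])
    for info, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if argument < 1 << (8 * size):
            return bytes([major_type << 5 | info]) + argument.to_bytes(size, "big")
    raise ValueError("argument does not fit in 8 bytes")


def encode(item) -> bytes:
    """Encode a Python value as a definite-length CBOR data item (sketch)."""
    if isinstance(item, bool):
        raise TypeError("simple values are outside the scope of this sketch")
    if isinstance(item, int):
        # Major type 0 for non-negative, major type 1 encodes -1 - n.
        return encode_head(0, item) if item >= 0 else encode_head(1, -1 - item)
    if isinstance(item, bytes):
        return encode_head(2, len(item)) + item
    if isinstance(item, str):
        raw = item.encode("utf-8")          # never escaped, unlike JSON
        return encode_head(3, len(raw)) + raw   # length counts bytes
    if isinstance(item, list):
        # Length counts data items, not bytes.
        return encode_head(4, len(item)) + b"".join(encode(x) for x in item)
    if isinstance(item, dict):
        # Length counts pairs; each key is immediately followed by its value.
        body = b"".join(encode(k) + encode(v) for k, v in item.items())
        return encode_head(5, len(item)) + body
    raise TypeError(f"unsupported type: {type(item).__name__}")
```

With this sketch, a 10-item list starts with the byte 0b100_01010 and a 9-pair dict starts with 0b101_01001, matching the two examples in the text; a newline in a text string is emitted as the single byte 0x0a.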

Section 2.3.

These eight major types lead to a simple table showing which of the 256 possible values for the initial byte of a data item are used (Table 5).

In major types 6 and 7, many of the possible values are reserved for future specification. See Section 7 for more information on these values.

2.2.  Indefinite Lengths for Some Major Types

2.2.1.  Indefinite-Length Arrays and Maps


2.2.2.  Indefinite-Length Byte Strings and Text Strings

2.3.  Floating-Point Numbers and Values with No Content


Appendix D for some information about 16-bit floating point.)

2.4.  Optional Tagging of Items

Section 2.4.2). This would be marked as 0b110_00010 (major type 6, additional information 2 for the tag) followed by 0b010_01100 (major type 2, additional information of 12 for the length) followed by the 12 bytes of the bignum.

Decoders do not need to understand tags, and thus tags may be of little value in applications where the implementation creating a particular CBOR data item and the implementation decoding that stream know the semantic meaning of each item in the data flow. Their primary purpose in this specification is to define common data types such as dates. A secondary purpose is to allow optional tagging when the decoder is a generic CBOR decoder that might be able to benefit from hints about the content of items.

Understanding the semantic tags is optional for a decoder; it can just jump over the initial bytes of the tag and interpret the tagged data item itself.

A tag always applies to the item that directly follows it. Thus, if tag A is followed by tag B, which is followed by data item C, tag A applies to the result of applying tag B on data item C. That is, a tagged item is a data item consisting of a tag and a value. The content of the tagged item is the data item (the value) that is being tagged.

IANA maintains a registry of tag values as described in Section 7.2. Table 3 provides a list of initial values, with definitions in the rest of this section.
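The tag layout in the bignum example above can be sketched as follows. This is an illustrative fragment under stated assumptions: the helper names `split_tags` and `decode_positive_bignum` are hypothetical, the reading of tag 2 as a positive bignum whose byte string is an unsigned integer in network byte order comes from Section 2.4.2, and only byte strings shorter than 24 bytes are handled.

```python
def split_tags(data: bytes):
    """Peel off leading tags (major type 6).

    Returns (tags, offset): tag numbers with the outermost tag first, and
    the offset at which the tagged content begins.
    """
    tags, offset = [], 0
    while data[offset] >> 5 == 6:
        info = data[offset] & 0x1F
        offset += 1
        if info < 24:
            tags.append(info)
        elif info <= 27:
            size = 1 << (info - 24)
            tags.append(int.from_bytes(data[offset:offset + size], "big"))
            offset += size
        else:
            raise ValueError("unsupported additional information in tag head")
    return tags, offset


def decode_positive_bignum(data: bytes) -> int:
    """Decode tag 2 applied to a short definite-length byte string (sketch)."""
    tags, offset = split_tags(data)
    if tags != [2]:
        raise ValueError("expected exactly tag 2")
    head = data[offset]
    if head >> 5 != 2 or head & 0x1F >= 24:
        raise ValueError("tag 2 must be followed by a short byte string")
    length = head & 0x1F
    payload = data[offset + 1:offset + 1 + length]
    if len(payload) != length:
        raise ValueError("incomplete CBOR data item")
    return int.from_bytes(payload, "big")
```

A decoder that does not understand a tag could call `split_tags` alone and continue with the content at the returned offset, which is the "jump over the initial bytes of the tag" behavior described in the text.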

|              |                  | Section 2.4.5                   |
|              |                  |                                 |
| 55800+       | (Unassigned)     | (Unassigned)                    |
+--------------+------------------+---------------------------------+

                      Table 3: Values for Tags

2.4.1.  Date and Time

RFC3339], as refined by Section 3.3 of [RFC4287].

Tag value 1 is for numerical representation of seconds relative to 1970-01-01T00:00Z in UTC time. (For the non-negative values that the Portable Operating System Interface (POSIX) defines, the number of seconds is counted in the same way as for POSIX "seconds since the epoch" [TIME_T].) The tagged item can be a positive or negative integer (major types 0 and 1), or a floating-point number (major type 7 with additional information 25, 26, or 27). Note that the number can be negative (time before 1970-01-01T00:00Z) and, if a floating-point number, indicate fractional seconds.

2.4.2.  Bignums

2.4.3.  Decimal Fractions and Bigfloats

Section 2.3). Bigfloats may also be used by constrained applications that need some basic binary floating-point capability without the need for supporting IEEE 754.

A decimal fraction or a bigfloat is represented as a tagged array that contains exactly two integer numbers: an exponent e and a mantissa m. Decimal fractions (tag 4) use base-10 exponents; the value of a decimal fraction data item is m*(10**e). Bigfloats (tag 5) use base-2 exponents; the value of a bigfloat data item is m*(2**e). The exponent e MUST be represented in an integer of major type 0 or 1, while the mantissa also can be a bignum (Section 2.4.2).

An example of a decimal fraction is that the number 273.15 could be represented as 0b110_00100 (major type of 6 for the tag, additional information of 4 for the type of tag), followed by 0b100_00010 (major type of 4 for the array, additional information of 2 for the length of the array), followed by 0b001_00001 (major type of 1 for the first integer, additional information of 1 for the value of -2), followed by 0b000_11001 (major type of 0 for the second integer, additional information of 25 for a two-byte value), followed by 0b0110101010110011 (27315 in two bytes). In hexadecimal:

   C4             -- Tag 4
      82          -- Array of length 2
         21       -- -2
         19 6ab3  -- 27315

An example of a bigfloat is that the number 1.5 could be represented as 0b110_00101 (major type of 6 for the tag, additional information of 5 for the type of tag), followed by 0b100_00010 (major type of 4 for the array, additional information of 2 for the length of the array), followed by 0b001_00000 (major type of 1 for the first integer, additional information of 0 for the value of -1), followed by 0b000_00011 (major type of 0 for the second integer, additional information of 3 for the value of 3). In hexadecimal:

   C5             -- Tag 5
      82          -- Array of length 2
         20       -- -1
         03       -- 3
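The two worked examples in this section can be checked with a short decoder that evaluates m*(10**e) and m*(2**e) exactly. The sketch below makes several simplifying assumptions: the names `read_int` and `decode_fraction` are hypothetical, the tag and array head are each expected to fit in one byte, and a bignum mantissa (which the text permits) is not handled.

```python
from fractions import Fraction


def read_int(data: bytes, offset: int):
    """Read one integer of major type 0 or 1; return (value, next_offset)."""
    major, info = data[offset] >> 5, data[offset] & 0x1F
    offset += 1
    if major not in (0, 1):
        raise ValueError("expected an integer (major type 0 or 1)")
    if info < 24:
        argument = info
    elif info <= 27:
        size = 1 << (info - 24)
        argument = int.from_bytes(data[offset:offset + size], "big")
        offset += size
    else:
        raise ValueError("unsupported additional information")
    # Major type 1 encodes the value -1 - argument.
    return (argument if major == 0 else -1 - argument), offset


def decode_fraction(data: bytes) -> Fraction:
    """Decode a tag 4 (base 10) or tag 5 (base 2) two-element array."""
    base = {0xC4: 10, 0xC5: 2}.get(data[0])
    if base is None:
        raise ValueError("expected tag 4 or tag 5")
    if data[1] != 0x82:
        raise ValueError("expected an array of exactly two items")
    exponent, offset = read_int(data, 2)
    mantissa, offset = read_int(data, offset)
    return mantissa * Fraction(base) ** exponent
```

Using `Fraction` keeps the result exact, so the decimal fraction example decodes to precisely 273.15 rather than to the nearest binary floating-point value.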

RFC 4648 or for other ways to encode binary data in strings.

2.4.4.3.  Encoded Text

RFC3986];

o  Tags 33 and 34 are for base64url- and base64-encoded text strings, as defined in [RFC4648];

o  Tag 35 is for regular expressions in Perl Compatible Regular Expressions (PCRE) / JavaScript syntax [ECMA262];

o  Tag 36 is for MIME messages (including all headers), as defined in [RFC2045].

Note that tags 33 and 34 differ from 21 and 22 in that the data is transported in base-encoded form for the former and in raw byte string form for the latter.

2.4.5.  Self-Describe CBOR

3.2.  Generic Encoders and Decoders

RFC3339]. There is no requirement that generic encoders and decoders make unnatural choices for their application interface to enable the processing of invalid data. Generic encoders and decoders are expected to forward simple values and tags even if their specific codepoints are not registered at the time the encoder/decoder is written (Section 3.5).

Generic decoders provide ways to present well-formed CBOR values, both valid and invalid, to an application. The diagnostic notation (Section 6) may be used to present well-formed CBOR values to humans.

Generic encoders provide an application interface that allows the application to specify any well-formed value, including simple values and tags unknown to the encoder.

3.3.  Syntax Errors

3.3.1.  Incomplete CBOR Data Items

3.3.2.  Malformed Indefinite-Length Items

3.3.3.  Unknown Additional Information Values

Section 5.2). Since the overall syntax for these additional information values is not yet defined, a decoder that sees an additional information value that it does not understand cannot continue parsing.

3.4.  Other Decoding Errors

Section 3.2) make data available to applications using the native CBOR data model. That data model includes maps (key-value mappings with unique keys), not multimaps (key-value mappings where multiple entries can have the same key). Thus, a generic decoder that gets a CBOR map item that has duplicate keys will decode to a map with only one instance of that key, or it might stop processing altogether. On the other hand, a "streaming decoder" may not even be able to notice (Section 3.7).

Inadmissible type on the value following a tag:  Tags (Section 2.4) specify what type of data item is supposed to follow the tag; for example, the tags for positive or negative bignums are supposed to be put on byte strings. A decoder that decodes the tagged data item into a native representation (a native big integer in this example) is expected to check the type of the data item being tagged. Even decoders that don't have such native representations available in their environment may perform the check on those tags known to them and react appropriately.

Invalid UTF-8 string:  A decoder might or might not want to verify that the sequence of bytes in a UTF-8 string (major type 3) is actually valid UTF-8 and react appropriately.
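Two of the checks described in this section, duplicate map keys and invalid UTF-8, are easy to express once the raw pieces of an item are available. The following is a minimal sketch under assumptions: `build_map` and `check_text` are hypothetical helper names, the key/value pairs are assumed to be already-decoded hashable Python values, and raising an error is only one of the reactions the text allows.

```python
def check_text(raw: bytes) -> str:
    """Return the text for a major type 3 payload, rejecting invalid UTF-8."""
    try:
        # Python's UTF-8 codec is strict by default.
        return raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        raise ValueError("invalid UTF-8 in text string") from exc


def build_map(pairs):
    """Build a dict from decoded (key, value) pairs, rejecting duplicates.

    A lenient generic decoder might instead keep a single instance of the
    key; this sketch chooses to stop processing.
    """
    result = {}
    for key, value in pairs:
        if key in result:
            raise ValueError(f"duplicate map key: {key!r}")
        result[key] = value
    return result
```

Note that Python treats some distinct values (for example `1` and `1.0`) as equal dictionary keys, so a real decoder would need its own notion of key equivalence; Section 3.7 discusses map keys in more detail.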

3.5.  Handling Unknown Simple Values and Tags

Section 2.3) that it does not recognize, such as a value that was added to the IANA registry after the decoder was deployed or a value that the decoder chose not to implement, might issue a warning, might stop processing altogether, might handle the error by making the unknown value available to the application as such (as is expected of generic decoders), or take some other type of action.

A decoder that comes across a tag (Section 2.4) that it does not recognize, such as a tag that was added to the IANA registry after the decoder was deployed or a tag that the decoder chose not to implement, might issue a warning, might stop processing altogether, might handle the error and present the unknown tag value together with the contained data item to the application (as is expected of generic decoders), might ignore the tag and simply present the contained data item only to the application, or take some other type of action.

3.6.  Numbers

3.7.  Specifying Keys for Maps

Section 3.10).

The CBOR data model for maps does not allow ascribing semantics to the order of the key/value pairs in the map representation. Thus, it would be a very bad practice to define a CBOR-based protocol in such a way that changing the key/value pair order in a map would change the semantics, apart from trivial aspects (cache usage, etc.). (A CBOR-based protocol can prescribe a specific order of serialization, such as for canonicalization.)

Applications for constrained devices that have maps with 24 or fewer frequently used keys should consider using small integers (and those with up to 48 frequently used keys should consider also using small negative integers) because the keys can then be encoded in a single byte.

3.8.  Undefined Values

Section 2.3) of Undefined might be used by an encoder as a substitute for a data item with an encoding problem, in order to allow the rest of the enclosing data items to be encoded without harm.

3.9.  Canonical CBOR


3.10.  Strict Mode

Section 3.9) but may require that different decoders reach the same (semantically equivalent) results, even in the presence of potentially malicious data. This can be required if one application (such as a firewall or other protecting entity) makes a decision based on the data that another application, which independently decodes the data, relies on.

Normally, it is the responsibility of the sender to avoid ambiguously decodable data. However, the sender might be an attacker specially making up CBOR data such that it will be interpreted differently by different decoders in an attempt to exploit that as a vulnerability. Generic decoders used in applications where this might be a problem need to support a strict mode in which it is also the responsibility of the receiver to reject ambiguously decodable data. It is expected that firewalls and other security systems that decode CBOR will only decode in strict mode.

A decoder in strict mode will reliably reject any data that could be interpreted by other decoders in different ways. It will reliably reject data items with syntax errors (Section 3.3). It will also expend the effort to reliably detect other decoding errors (Section 3.4). In particular, a strict decoder needs to have an API that reports an error (and does not return data) for a CBOR data item that contains any of the following:

o  a map (major type 5) that has more than one entry with the same key

o  a tag that is used on a data item of the incorrect type

o  a data item that is incorrectly formatted for the type given to it, such as invalid UTF-8 or data that cannot be interpreted with the specific tag that it has been tagged with

A decoder in strict mode can do one of two things when it encounters a tag or simple value that it does not recognize:

o  It can report an error (and not return data).

o  It can emit the unknown item (type, value, and, for tags, the decoded tagged data item) to the application calling the decoder with an indication that the decoder did not recognize that tag or simple value.
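The two permitted reactions of a strict-mode decoder to an unrecognized tag could be exposed through an interface along the following lines. Everything here is a hypothetical illustration: the `TaggedItem` class, the `present_tag` function, the `reject_unknown` switch, and the particular `KNOWN_TAGS` set are assumptions made for this sketch, not part of this specification.

```python
from dataclasses import dataclass
from typing import Any

# Hypothetical set of tag numbers that this particular decoder implements.
KNOWN_TAGS = frozenset({0, 1, 2, 3, 4, 5})


@dataclass(frozen=True)
class TaggedItem:
    number: int        # the tag number
    content: Any       # the decoded tagged data item
    recognized: bool   # False indicates the decoder did not know the tag


def present_tag(number: int, content: Any, *, reject_unknown: bool) -> TaggedItem:
    """Hand a decoded tag to the application in one of the two strict ways."""
    if number in KNOWN_TAGS:
        return TaggedItem(number, content, recognized=True)
    if reject_unknown:
        # First option: report an error and do not return data.
        raise ValueError(f"unrecognized tag {number}")
    # Second option: emit the unknown item with an explicit indication.
    return TaggedItem(number, content, recognized=False)
```

The explicit `recognized` flag is one way to carry the required "indication that the decoder did not recognize that tag"; silently dropping the tag and returning only the content would not satisfy strict mode.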

4.  Converting Data between CBOR and JSON

4.1.  Converting from CBOR to JSON

RFC 4627, Section 2.5): quotation mark (U+0022), reverse solidus (U+005C), and the "C0 control characters" (U+0000 through U+001F). All other characters are copied unchanged into the JSON UTF-8 string.

o  An array (major type 4) becomes a JSON array.

4.2.  Converting from JSON to CBOR

5.  Future Evolution of CBOR

5.1.  Extension Points

Section 7.1 is the appropriate way to address the extensibility of this codepoint space.

o  the "tag" space (values in major type 6). Again, only a small part of the codepoint space has been allocated, and the space is abundant (although the early numbers are more efficient than the later ones). Implementations receiving an unknown tag can choose to simply ignore it or to process it as an unknown tag wrapping the following data item. The IANA registry in Section 7.2 is the appropriate way to address the extensibility of this codepoint space.

o  the "additional information" space. An implementation receiving an unknown additional information value has no way to continue parsing, so allocating codepoints to this space is a major step. There are also very few codepoints left.

5.2.  Curating the Additional Information Space

6.  Diagnostic Notation

YAML].)

The diagnostic notation is loosely based on JSON as it is defined in RFC 4627, extending it where needed.

The notation borrows the JSON syntax for numbers (integer and floating point), True (>true<), False (>false<), Null (>null<), UTF-8 strings, arrays, and maps (maps are called objects in JSON; the diagnostic notation extends JSON here by allowing any data item in the key position). Undefined is written >undefined< as in JavaScript. The non-finite floating-point numbers Infinity, -Infinity, and NaN are written exactly as in this sentence (this is also a way they can be written in JavaScript, although JSON does not allow them). A tagged item is written as an integer number for the tag followed by the item in parentheses; for instance, an RFC 3339 (ISO 8601) date could be notated as:

6.1.  Encoding Indicators

Appendix A. (Note that the encoding indicator "_" is thus an abbreviation of the full form "_7", which is not used.)

As a special case, byte and text strings of indefinite length can be notated in the form (_ h'0123', h'4567') and (_ "foo", "bar").
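To make the chunked notation concrete, the sketch below serializes a sequence of chunks as an indefinite-length string. It relies on the encoding defined in Section 2.2.2 (an initial byte with additional information 31, then definite-length chunks of the same major type, then the "break" byte 0xff). The function name `encode_chunked` is hypothetical, and for brevity the sketch only accepts chunks shorter than 24 bytes.

```python
def encode_chunked(chunks) -> bytes:
    """Encode byte-string or text-string chunks as one indefinite-length string."""
    chunks = list(chunks)
    if not chunks:
        raise ValueError("this sketch needs a chunk to choose the major type")
    is_text = isinstance(chunks[0], str)
    major = 3 if is_text else 2

    out = bytearray([major << 5 | 31])      # 0x5f or 0x7f: indefinite length
    for chunk in chunks:
        if not isinstance(chunk, (str, bytes)) or isinstance(chunk, str) != is_text:
            raise TypeError("all chunks must be bytes, or all must be str")
        raw = chunk.encode("utf-8") if is_text else chunk
        if len(raw) >= 24:
            raise ValueError("this sketch only handles chunks under 24 bytes")
        out.append(major << 5 | len(raw))   # definite-length chunk head
        out += raw
    out.append(0xFF)                        # "break" stop code
    return bytes(out)
```

Under these assumptions, `(_ h'0123', h'4567')` corresponds to the bytes 5f 42 01 23 42 45 67 ff, and `(_ "foo", "bar")` to 7f 63 66 6f 6f 63 62 61 72 ff.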

7.  IANA Considerations

RFC5226]. IANA has also assigned a new MIME media type and an associated Constrained Application Protocol (CoAP) Content-Format entry.

7.1.  Simple Values Registry

7.2.  Tags Registry

7.3.  Media Type ("MIME Type")

RFC6838] for CBOR data is application/cbor.

Type name:  application

Subtype name:  cbor

Required parameters:  n/a

Optional parameters:  n/a

Encoding considerations:  binary

Security considerations:  See Section 8 of this document

Interoperability considerations:  n/a

Published specification:  This document

Applications that use this media type:  None yet, but it is expected that this format will be deployed in protocols and applications.

Additional information:
   Magic number(s):  n/a
   File extension(s):  .cbor
   Macintosh file type code(s):  n/a

Person & email address to contact for further information:
   Carsten Bormann
   cabo@tzi.org

Intended usage:  COMMON

Restrictions on usage:  none

Author:  Carsten Bormann <cabo@tzi.org>

Change controller:  The IESG <iesg@ietf.org>

7.4.  CoAP Content-Format

RFC7049]

7.5.  The +cbor Structured Syntax Suffix Registration

RFC7049]

Encoding Considerations:  CBOR is a binary format.

Interoperability Considerations:  n/a

Fragment Identifier Considerations:  The syntax and semantics of fragment identifiers specified for +cbor SHOULD be as specified for "application/cbor". (At publication of this document, there is no fragment identification syntax defined for "application/cbor".)

   The syntax and semantics for fragment identifiers for a specific "xxx/yyy+cbor" SHOULD be processed as follows:

   For cases defined in +cbor, where the fragment identifier resolves per the +cbor rules, then process as specified in +cbor.

   For cases defined in +cbor, where the fragment identifier does not resolve per the +cbor rules, then process as specified in "xxx/yyy+cbor".

   For cases not defined in +cbor, then process as specified in "xxx/yyy+cbor".

Security Considerations:  See Section 8 of this document

Contact:  Apps Area Working Group (apps-discuss@ietf.org)






Appendix E.  Comparison of Other Binary Formats to CBOR's Design

Section 1.1 is:

1.  unambiguous encoding of most common data formats from Internet standards

2.  code compactness for encoder or decoder

3.  no schema description needed

4.  reasonably compact serialization

5.  applicability to constrained and unconstrained applications

6.  good JSON conversion

7.  extensibility

E.1.  ASN.1 DER, BER, and PER

ASN.1] has many serializations. In the IETF, DER and BER are the most common. The serialized output is not particularly compact for many items, and the code needed to decode numeric items can be complex on a constrained device.

Few (if any) IETF protocols have adopted one of the several variants of Packed Encoding Rules (PER). There could be many reasons for this, but one that is commonly stated is that PER makes use of the schema even for parsing the surface structure of the data stream, requiring significant tool support. There are different versions of the ASN.1 schema language in use, which has also hampered adoption.

E.2.  MessagePack

MessagePack] is a concise, widely implemented counted binary serialization format, similar in many properties to CBOR, although somewhat less regular. While the data model can be used to represent JSON data, MessagePack has also been used in many remote procedure call (RPC) applications and for long-term storage of data.

MessagePack has been essentially stable since it was first published around 2011; it has not yet had a transition. The evolution of MessagePack is impeded by an imperative to maintain complete backwards compatibility with existing stored data, while only few bytecodes are still available for extension. Repeated requests over the years from the MessagePack user community to separate out binary

E.3.  BSON

BSON] is a data format that was developed for the storage of JSON-like maps (JSON objects) in the MongoDB database. Its major distinguishing feature is the capability for in-place update, forgoing a compact representation. BSON uses a counted representation except for map keys, which are null-byte terminated. While BSON can be used for the representation of JSON-like objects on the wire, its specification is dominated by the requirements of the database application and has become somewhat baroque. The status of how BSON extensions will be implemented remains unclear.

E.4.  UBJSON

UBJSON] has a design goal to make JSON faster and somewhat smaller, using a binary format that is limited to exactly the data model JSON uses. Thus, there is expressly no intention to support, for example, binary data; however, there is a "high-precision number", expressed as a character string in JSON syntax. UBJSON is not optimized for code compactness, and its type byte coding is optimized for human recognition and not for compact representation of native types such as small integers. Although UBJSON is mostly counted, it provides a reserved "unknown-length" value to support streaming of arrays and maps (JSON objects). Within these containers, UBJSON also has a "Noop" type for padding.

E.5.  MSDTP: RFC 713

RFC0713], written in 1976. It is included here for its historical value, not because it was ever widely used.

E.6.  Conciseness on the Wire

+---------------+-------------------------+-------------------------+
| RFC 713       | c2 05 81 c2 02 82 83    |                         |
|               |                         |                         |
| ASN.1 BER     | 30 0b 02 01 01 30 06 02 | 30 80 02 01 01 30 06 02 |
|               | 01 02 02 01 03          | 01 02 02 01 03 00 00    |
|               |                         |                         |
| MessagePack   | 92 01 92 02 03          |                         |
|               |                         |                         |
| BSON          | 22 00 00 00 10 30 00 01 |                         |
|               | 00 00 00 04 31 00 13 00 |                         |
|               | 00 00 10 30 00 02 00 00 |                         |
|               | 00 10 31 00 03 00 00 00 |                         |
|               | 00 00                   |                         |
|               |                         |                         |
| UBJSON        | 61 02 42 01 61 02 42 02 | 61 ff 42 01 61 02 42 02 |
|               | 42 03                   | 42 03 45                |
|               |                         |                         |
| CBOR          | 82 01 82 02 03          | 9f 01 82 02 03 ff       |
+---------------+-------------------------+-------------------------+

        Table 6: Examples for Different Levels of Conciseness

Authors' Addresses

   Carsten Bormann
   Universitaet Bremen TZI
   Postfach 330440
   D-28359 Bremen
   Germany

   Phone: +49-421-218-63921
   EMail: cabo@tzi.org

   Paul Hoffman
   VPN Consortium

   EMail: paul.hoffman@vpnc.org