PROPOSED STANDARD


Internet Engineering Task Force (IETF)                        S. Farrell
Request for Comments: 6920                        Trinity College Dublin
Category: Standards Track                                    D. Kutscher
ISSN: 2070-1721                                                      NEC
                                                            C. Dannewitz
                                                 University of Paderborn
                                                               B. Ohlman
                                                              A. Keranen
                                                                Ericsson
                                                         P. Hallam-Baker
                                                       Comodo Group Inc.
                                                              April 2013

                       Naming Things with Hashes

Abstract

   This document defines a set of ways to identify a thing (a digital object in this case) using the output from a hash function. It specifies a new URI scheme for this purpose, a way to map these to HTTP URLs, and binary and human-speakable formats for these names. The various formats are designed to support, but not require, a strong link to the referenced object, such that the referenced object may be authenticated to the same degree as the reference to it. The reason for this work is to standardise current uses of hash outputs in URLs and to support new information-centric applications and other uses of hash outputs in protocols.

Status of This Memo

   This is an Internet Standards Track document.

   This document is a product of the Internet Engineering Task Force (IETF). It represents the consensus of the IETF community. It has received public review and has been approved for publication by the Internet Engineering Steering Group (IESG). Further information on Internet Standards is available in Section 2 of RFC 5741.

   Information about the current status of this document, any errata, and how to provide feedback on it may be obtained at http://www.rfc-editor.org/info/rfc6920.

Farrell, et al.              Standards Track                    [Page 1]

RFC 6920                Naming Things with Hashes             April 2013

   This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Hashes Are What Count
   3.  Named Information (ni) URI Format
     3.1.  Content Type Query String Attribute
   4.  .well-known URI
   5.  URL Segment Format
   6.  Binary Format
   7.  Human-Speakable (nih) URI Format
   8.  Examples
     8.1.  Hello World!
     8.2.  Public Key Examples
     8.3.  nih Usage Example
   9.  IANA Considerations
     9.1.  Assignment of ni URI Scheme
     9.2.  Assignment of nih URI Scheme
     9.3.  Assignment of .well-known 'ni' URI
     9.4.  Creation of Named Information Hash Algorithm Registry
     9.5.  Creation of Named Information Parameter Registry
   10. Security Considerations
   11. Acknowledgments
   12. References
     12.1.  Normative References
     12.2.  Informative References

1.  Introduction

   [RFC3986] to a specific resource or to make URIs hard to guess for security reasons. Since there is no standard way to interpret those strings today, in general only the creator of the URI knows how to use the hash function output.

   Other protocols, such as application-layer protocols for accessing "smart objects" in constrained environments, also require more compact (e.g., binary) forms of such identifiers. In yet other situations, people may have to speak such values, e.g., in a voice call (see Section 8.3), in order to confirm the presence or absence of a resource.

   As another example, protocols for accessing in-network storage servers need a way to identify stored resources uniquely and in a location-independent way so that replicas on different servers can be accessed by the same name. Also, such applications may require verification that a resource representation that has been obtained actually corresponds to the name that was used to request the resource, i.e., verifying the binding between the data and the name, which is here termed "name-data integrity".

   Similarly, in the context of information-centric networking [NETINF-ARCHITECTURE] [CCN] and elsewhere, there is value in being able to compare a presented resource against the URI that was used to access that resource. If a cryptographically strong comparison function can be used, then this allows for many forms of in-network storage, without requiring as much trust in the infrastructure used to present the resource. The outputs of hash functions can be used in this manner, if they are presented in a standard way.

   [Magnet]), or using other mechanisms also defined herein. However it is represented, the Named Identifier *names* a resource, and the mechanism used to dereference the name and to *locate* the named resource needs to be known by the entity that dereferences it.

   Media content-type, alternative locations for retrieval, and other additional information about a resource named using this scheme can be provided using a query string. "The Named Information (ni) URI Scheme: Optional Features" [DECPARAMS] describes specific values that can be used in such query strings for these various purposes and other extensions to this basic format specification.

   In addition, we define a ".well-known" URL equivalent, a way to include a hash as a segment of an HTTP URL, a binary format for use in protocols that require more compact names, and a human-speakable text form that could be used, e.g., for reading out (parts of) the name over a voice connection.

   Not all uses of these names require use of the full hash output -- truncated hashes can be safely used in some environments. For this reason, we define a new IANA registry for hash functions to be used with this specification so as not to mix strong and weak (truncated) hash algorithms in other protocol registries.

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119].

   Syntax definitions in this memo are specified according to ABNF [RFC5234].

2.  Hashes Are What Count

   [SHA-256] is mandatory to implement; that is, implementations MUST be able to generate/send and to accept/process names based on a sha-256 hash. However, implementations MAY support additional hash algorithms and MAY use those for specific names, for example, in a constrained environment where sha-256 is non-optimal or where truncated names are needed to fit into corresponding protocols (when a higher collision probability can be tolerated).

   Truncated hashes MAY be supported. When a hash value is truncated, the name MUST indicate this. Therefore, we use different hash algorithm strings in these cases, such as sha-256-32 for a 32-bit truncation of a sha-256 output. A 32-bit truncated hash is essentially useless for security in almost all cases but might be useful for naming. With current best practices [RFC3766], very few, if any, applications making use of names with less than 100-bit hashes will have useful security properties.

   When a hash value is truncated to N bits, the leftmost N bits (that is, the most significant N bits in network byte order) from the binary representation of the hash value MUST be used as the truncated value. An example of a 128-bit hash output truncated to 32 bits is shown in Figure 2.

      128-bit hash: 0x265357902fe1b7e2a04b897c6025d7a2

      32-bit truncated hash: 0x26535790

                  Figure 2: Example of Truncated Hash

   When the input to the hash algorithm is a public key value, as may be used by various security protocols, the hash SHOULD be calculated over the public key in an X.509 SubjectPublicKeyInfo structure (Section 4.1 of [RFC5280]). This input has been chosen primarily for compatibility with the DANE TLSA protocol [RFC6698] but also includes any relevant public key parameters in the hash input, which is sometimes necessary for security reasons. This does not force use of X.509 or full compliance with [RFC5280] since formatting any public key as a SubjectPublicKeyInfo is relatively straightforward and well supported by libraries.

   Any of the formats defined below can be used to represent the resulting name for a public key.
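
   The truncation rule above can be sketched in Python as follows. This is a non-normative illustration only: the function names truncate_hash and sha256_truncated are assumptions made for this example, and the sketch handles only truncation lengths that are a multiple of 8 bits, which covers every length named in this document.

```python
import hashlib


def truncate_hash(digest: bytes, nbits: int) -> bytes:
    """Keep the leftmost (most significant) nbits of a hash output.

    Simplification for this sketch: nbits must be a whole number of octets.
    """
    if nbits <= 0 or nbits % 8 != 0 or nbits > len(digest) * 8:
        raise ValueError("unsupported truncation length")
    return digest[: nbits // 8]


def sha256_truncated(data: bytes, nbits: int = 256) -> bytes:
    """Hash data with SHA-256 and truncate the output to nbits."""
    return truncate_hash(hashlib.sha256(data).digest(), nbits)


# The values from Figure 2: a 128-bit hash output truncated to 32 bits.
figure2 = bytes.fromhex("265357902fe1b7e2a04b897c6025d7a2")
print("0x" + truncate_hash(figure2, 32).hex())  # 0x26535790
```

   Because the leftmost octets are kept, a truncated value is always a prefix of the full hash value, so a name such as sha-256-32 can be checked against data with the same code path as a full sha-256 name.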

3.  Named Information (ni) URI Format

   Section 3.2.2 of [RFC3986] for details.) While ni names with and without an authority differ syntactically from ni names with different authorities, all three refer to the same object if and only if the digest algorithm, length, and value are the same.

   One slash:  The literal "/"

   Digest Algorithm:  The name of the digest algorithm, as specified in the IANA registry defined in Section 9.4 below.

   Separator:  The literal ";"

   Digest Value:  The digest value MUST be encoded using the base64url [RFC4648] encoding, with no "=" padding characters.

   Query Parameter separator '?':  The query parameter separator acts as a separator between the digest value and the query parameters (if specified). For compatibility with Internationalized Resource Identifiers (IRIs), non-ASCII characters in the query part MUST be encoded as UTF-8, and the resulting octets MUST be percent-encoded (see [RFC3986], Section 2.1).

   Query Parameters:  A "tag=value" list of optional query parameters as are used with HTTP URLs [RFC2616] with a separator character '&' between each. For example, "foo=bar&baz=bat".

   It is OPTIONAL for implementations to check the integrity of the URI/resource mapping when sending, receiving, or processing ni URIs.

   Escaping of characters follows the rules in RFC 3986. This means that percent-encoding is used to distinguish between reserved and unreserved functions of the same character in the same URI component.
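
   As a non-normative illustration of the fields listed above, the following Python sketch assembles an ni URI from a sha-256 hash (optionally truncated) and compares two names for equality of algorithm and value. The function names make_ni_uri and same_object, the authority "example.com", and the sample input are assumptions made for this example only.

```python
import base64
import hashlib
from urllib.parse import urlsplit


def make_ni_uri(data: bytes, authority: str = "", nbits: int = 256) -> str:
    """Build ni://authority/alg;val for sha-256, optionally truncated.

    Sketch only: nbits is assumed to be a multiple of 8.
    """
    digest = hashlib.sha256(data).digest()[: nbits // 8]
    alg = "sha-256" if nbits == 256 else f"sha-256-{nbits}"
    # base64url encoding with the "=" padding characters removed.
    val = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return f"ni://{authority}/{alg};{val}"


def same_object(uri_a: str, uri_b: str) -> bool:
    """True if both names carry the same algorithm string and digest value.

    The authority and any query parameters are ignored for this comparison.
    """
    return urlsplit(uri_a).path == urlsplit(uri_b).path


full = make_ni_uri(b"Hello World!")
with_authority = make_ni_uri(b"Hello World!", authority="example.com")
print(full)
print(with_authority)
print(same_object(full, with_authority + "?ct=text/plain"))  # True
```

   Comparing only the path component works here because the algorithm string encodes the truncation length, so equal paths imply equal algorithm, length, and value.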

   [RFC2616], Section 3.2.3.

3.1.  Content Type Query String Attribute

   [RFC6838]. See Section 9.5 for the associated IANA registry for ni parameter names as shown in Figure 6. Implementations of this specification MUST support parsing the "ct=" query string attribute name.

      ni:///sha-256-32;f4OxZQ?ct=text/plain

              Figure 6: Example ni URI with Content Type

   Protocols making use of ni URIs will need to specify how to verify name-data integrity for the MIME Content Types that they need to process and will need to take into account possible Content-Transfer-Encodings and other aspects of MIME encoding.
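
   One possible way to parse the "ct=" attribute is sketched below in Python. It is not a definitive implementation: the function name ni_query_params is an assumption for this example, repeated parameter names simply overwrite earlier ones, and values are percent-decoded without treating "+" as a space so that content types containing "+" survive unchanged.

```python
from urllib.parse import unquote, urlsplit


def ni_query_params(ni_uri: str) -> dict:
    """Return the "tag=value" query parameters of an ni URI as a dict."""
    params = {}
    query = urlsplit(ni_uri).query
    for item in query.split("&"):
        if not item:
            continue  # no query at all, or an empty "&&" segment
        tag, _, value = item.partition("=")
        params[unquote(tag)] = unquote(value)
    return params


# The URI from Figure 6.
params = ni_query_params("ni:///sha-256-32;f4OxZQ?ct=text/plain")
print(params.get("ct"))  # text/plain
```

   An application would typically look up params.get("ct") and fall back to its own default when the attribute is absent.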

   [RFC2045], Section 5.1.

4.  .well-known URI

   [RFC2616] or HTTPS [RFC2818] URLs that makes use of the .well-known URI [RFC5785] by defining an "ni" suffix (see Section 9).

   The HTTP(S) mapping MAY be used in any context where clients with support for ni URIs are not available.

   Since the .well-known name-space is not intended for general information retrieval, if an application dereferences a .well-known/ni URL via HTTP(S), then it will often receive a 3xx HTTP redirection response. A server responding to a request for a .well-known/ni URL will often therefore return a 3xx response, and a client sending such a request MUST be able to handle that, as should any fully compliant HTTP [RFC2616] client.

   For an ni name of the form "ni://n-authority/alg;val?query-string" the corresponding HTTP(S) URL produced by this mapping is "http://h-authority/.well-known/ni/alg/val?query-string", where "h-authority" is derived as follows:

      If the ni name has a specified authority (i.e., the n-authority is non-empty), then the h-authority MUST have the same value.

      If the ni name has no authority specified (i.e., the n-authority string is empty), an h-authority value MAY be derived from the application context. For example, if the mapping is being done in the context of a web page, then the origin [RFC6454] for that web site can be used.

   Of course, in general there are no guarantees that the object named by the ni URI will be available via the corresponding HTTP(S) URL. But in the case that any data is returned, the retriever can determine whether or not it is content that matches the ni URI.
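
   The mapping described above can be sketched in Python as follows. This is a non-normative illustration: the function name ni_to_wellknown and its parameters context_authority and scheme are assumptions for this example, and the choice between "http" and "https" is left to the caller.

```python
from urllib.parse import urlsplit


def ni_to_wellknown(ni_uri: str, context_authority: str = "",
                    scheme: str = "http") -> str:
    """Map ni://n-authority/alg;val?query to a .well-known/ni URL."""
    parts = urlsplit(ni_uri)
    if parts.scheme != "ni":
        raise ValueError("not an ni URI")
    alg, sep, val = parts.path.lstrip("/").partition(";")
    if not (alg and sep and val):
        raise ValueError("expected a path of the form /alg;val")
    # A non-empty n-authority is copied as is; otherwise the caller may
    # supply one from the application context (e.g., a web origin host).
    authority = parts.netloc or context_authority
    if not authority:
        raise ValueError("no authority in the name or the context")
    url = f"{scheme}://{authority}/.well-known/ni/{alg}/{val}"
    if parts.query:
        url += "?" + parts.query
    return url


print(ni_to_wellknown("ni://example.com/sha-256-32;f4OxZQ?ct=text/plain"))
print(ni_to_wellknown("ni:///sha-256-32;f4OxZQ",
                      context_authority="example.org", scheme="https"))
```

   A client that fetches the resulting URL still has to follow any 3xx redirection and, if it wants name-data integrity, hash the returned bytes and compare them with the digest value in the name.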

   [RFC6797] may dictate that data only be available over "https". In general, however, whether to use "http" or "https" is something that needs to be decided by the application.

5.  URL Segment Format

   [RFC2616]. In such cases, there is nothing present in the URL that ensures that a client can depend on compliance with this specification, so clients MUST NOT assume that any URL with a pathname component that matches the "alg-val" production was in fact produced as a result of this specification. That URL might or might not be related to this specification; only the context will tell.

6.  Binary Format

   Section 9.4 for details.

   A hash value that is truncated to 120 bits will result in the overall name being a 128-bit value, which may be useful for protocols that can easily use 128-bit identifiers.

7.  Human-Speakable (nih) URI Format

   Section 8.3 is the main current use-case for Named Information for Humans (nih) URIs. ("nih" also means "Not Invented Here", which is clearly false, and therefore worth including [RFC5513]. :-)

   The ni URI format is not well-suited for this, as, for example, base64url uses both uppercase and lowercase, which can easily cause confusion. For this particular purpose ("speaking" the value of a hash output), the more verbose but less ambiguous (when spoken) nih URI scheme is defined.

   The justification for nih being a URI scheme is that it can help a user agent for the speaker to better display the value or help a machine to better speak or recognise the value when spoken.

   We do not include the query string since there is no way to ensure that its value might be spoken unambiguously and similarly for the authority, where, e.g., some internationalised forms of domain name might not be

   RFC 4648 [RFC4648] except using lowercase letters. Separators ("-" characters) MAY be interspersed in the hash value in any way to make those easier to read, typically grouping four or six characters with a separator between.

   The hash value MAY be followed by a semicolon ';' then a checkdigit. The checkdigit MUST be calculated using Luhn's mod N algorithm (with N=16) as defined in [ISOIEC7812] (see also [Luhn]). The input to the calculation is the ASCII hex-encoded hash value (i.e., "sepval" in the ABNF production below) but with all "-" separator characters first stripped out. This maps the ASCII hex so that '0'=0, ...'9'=9, 'a'=10, ...'f'=15. None of the other fields, nor any "-" separators, are input when calculating the checkdigit.

      humanname  = "nih:" alg-sepval [ ";" checkdigit ]
      alg-sepval = alg ";" sepval
      sepval     = 1*(ahlc / "-")
      ahlc       = DIGIT / "a" / "b" / "c" / "d" / "e" / "f"
                   ; DIGIT is defined in RFC 5234 and is 0-9
      checkdigit = ahlc

                   Figure 8: Human-Speakable Syntax

   For algorithms that have a Suite ID reserved (see Figure 11), the alg field MAY contain the ID value as an ASCII-encoded decimal number instead of the hash name string (for example, "3" instead of "sha-256-120"). Implementations MUST be able to match the decimal ID values for the algorithms and hash lengths that they support, even if they do not support the binary format.

   There is no such thing as a relative nih URI.
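
   The checkdigit calculation can be sketched in Python as follows. This is a non-normative illustration of the commonly described Luhn mod N procedure with N=16 (double every second character value starting from the rightmost, sum the base-16 digits of each result, and take the value that brings the total to a multiple of 16). The function names luhn_mod16_checkdigit and make_nih, the grouping of four characters, and the sample hash value are assumptions for this example.

```python
HEX = "0123456789abcdef"  # maps '0'=0 ... '9'=9, 'a'=10 ... 'f'=15


def luhn_mod16_checkdigit(sepval: str) -> str:
    """Compute the Luhn mod 16 checkdigit over a lowercase hex string.

    "-" separators are stripped first; any other character is rejected.
    """
    total = 0
    factor = 2  # the rightmost character is doubled
    for ch in reversed(sepval.replace("-", "")):
        addend = factor * HEX.index(ch)  # ValueError on non-hex input
        factor = 1 if factor == 2 else 2
        # Sum the two base-16 digits of the addend.
        total += addend // 16 + addend % 16
    return HEX[(16 - total % 16) % 16]


def make_nih(alg: str, digest: bytes, group: int = 4) -> str:
    """Format nih:alg;sepval;checkdigit with "-" between character groups."""
    hexval = digest.hex()
    sepval = "-".join(hexval[i:i + group] for i in range(0, len(hexval), group))
    return f"nih:{alg};{sepval};{luhn_mod16_checkdigit(sepval)}"


# Arbitrary sample value, used only to exercise the functions.
print(make_nih("sha-256-32", bytes.fromhex("53269057")))
# nih:sha-256-32;5326-9057;b
```

   Because separators are removed before the calculation, regrouping the characters (for example, in sixes rather than fours) does not change the checkdigit.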

8.3.  nih Usage Example

   Section 7.

   Encoding considerations:  See Section 7.

   Applications/protocols that use this URI scheme name:  General applicability.

   Interoperability considerations:  Defined here.

   Security considerations:  See Section 10.

   Contact:  Stephen Farrell, stephen.farrell@cs.tcd.ie

   Author/Change controller:  IETF

   References:  As specified in this document

9.3.  Assignment of .well-known 'ni' URI

   RFC 5785 [RFC5785]. The following assignment has been made.

   URI suffix:  ni

   Change controller:  IETF

   Specification document(s):  This document

   Related information:  None

9.4.  Creation of Named Information Hash Algorithm Registry

   [RFC5226]. This registry has five fields: the suite ID, the hash algorithm name string, the truncation length, the underlying algorithm reference, and a status field that indicates if the algorithm is current or deprecated and should no longer be used. The status field can have the value "current" or "deprecated". Other values are reserved for possible future definition.

   If the status is "current", then that does not necessarily mean that the algorithm is "good" for any particular purpose, since the cryptographic strength requirements will be set by other applications or protocols.

   [RFC6149], [RFC6150], and [RFC6151] for examples of some hash functions that are considered obsolete in this sense.

   The suite ID field ("ID") can be empty or can have values between 0 and 63, inclusive. Because there are only 64 possible values, this field is OPTIONAL (leaving it empty if omitted). Where the binary format is not expected to be used for a given hash algorithm, this field SHOULD be omitted. If an entry is registered without a suite ID, the Designated Expert MAY allow for later allocation of a suite ID, if that appears warranted. The Designated Expert MAY consult the community via a "call for comments" by sending a mail to the IETF discussion list before allocating a suite ID.

      ID  Hash Name String  Value Length  Reference  Status
      0   Reserved
      1   sha-256           256 bits      [SHA-256]  current
      2   sha-256-128       128 bits      [SHA-256]  current
      3   sha-256-120       120 bits      [SHA-256]  current
      4   sha-256-96        96 bits       [SHA-256]  current
      5   sha-256-64        64 bits       [SHA-256]  current
      6   sha-256-32        32 bits       [SHA-256]  current
      32  Reserved

                     Figure 11: Suite Identifiers

   The Suite ID value 32 is reserved for compatibility with IPv6 addresses from the Special Purpose Address Registry [RFC4773], such as Overlay Routable Cryptographic Hash Identifiers (ORCHIDs) [RFC4843].

   The referenced hash algorithm matching the Suite ID, truncated to the length indicated, according to the description given in Section 2, is used for generating the hash. The Designated Expert is responsible for ensuring that the document referenced for the hash algorithm meets the "specification required" rule.
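
   The registry contents in Figure 11 can be carried in code as a small lookup table, sketched below in Python. This is a non-normative illustration. The names SUITES, resolve_alg, and binary_name are assumptions for this example. The binary_name function additionally assumes that a binary name is a single header octet holding the suite ID followed by the truncated hash value; that layout is consistent with the remark in Section 6 that a 120-bit hash yields a 128-bit name, but the full definition of the binary format is not reproduced in this text.

```python
import hashlib

# Suite ID -> (hash name string, value length in bits), from Figure 11.
SUITES = {
    1: ("sha-256", 256),
    2: ("sha-256-128", 128),
    3: ("sha-256-120", 120),
    4: ("sha-256-96", 96),
    5: ("sha-256-64", 64),
    6: ("sha-256-32", 32),
}


def resolve_alg(alg: str) -> str:
    """Accept either a hash name string or a decimal suite ID (as in nih)."""
    if alg.isdigit():
        return SUITES[int(alg)][0]  # KeyError for reserved/unknown IDs
    if alg not in {name for name, _ in SUITES.values()}:
        raise KeyError(alg)
    return alg


def binary_name(data: bytes, suite_id: int) -> bytes:
    """Assumed layout: one octet of suite ID, then the truncated hash."""
    _, nbits = SUITES[suite_id]
    return bytes([suite_id]) + hashlib.sha256(data).digest()[: nbits // 8]


print(resolve_alg("3"))                            # sha-256-120
print(len(binary_name(b"Hello World!", 3)) * 8)    # 128
```

   Keeping the table keyed by suite ID makes it straightforward to honour the requirement that decimal ID values be matched even by implementations that never emit the binary format.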

9.5.  Creation of Named Information Parameter Registry

   Section 3.) The initial contents of the registry are:

      Parameter    Meaning                                       Reference
      -----------  --------------------------------------------  ---------
      ct           Content Type                                  [RFC6920]

10.  Security Considerations

   [RFC6454] for the bytes of the named object is really going to be the place from which you get the ni name and not the place from which you get the bytes of the object. This appears to offer a potential benefit if using ni names for scripts included from an HTML page accessed via server-authenticated https, for example. If name-data integrity is not validated (and it is optional) or fails, then the web origin is, as usual, the place from which the object bytes were received. Applications making use of ni names SHOULD take this into account in their trust models.
