Standard ECMA-262 6th Edition / June 2015 ECMAScript® 2015 Language Specification

This is the HTML rendering of ECMA-262 6th Edition, The ECMAScript 2015 Language Specification. The PDF rendering of this document is located at http://www.ecma-international.org/ecma-262/6.0/ECMA-262.pdf. The PDF version is the definitive specification. Any discrepancies between this HTML version and the PDF version are unintentional.

Ecma International Rue du Rhone 114 CH-1204 Geneva Tel: +41 22 849 6000 Fax: +41 22 849 6001 Web: http://www.ecma-international.org COPYRIGHT NOTICE © 2015 Ecma International This document may be copied, published and distributed to others, and certain derivative works of it may be prepared, copied, published, and distributed, in whole or in part, provided that the above copyright notice and this Copyright License and Disclaimer are included on all such copies and derivative works. The only derivative works that are permissible under this Copyright License and Disclaimer are: (i) works which incorporate all or portion of this document for the purpose of providing commentary or explanation (such as an annotated version of the document), (ii) works which incorporate all or portion of this document for the purpose of incorporating features that provide accessibility, (iii) translations of this document into languages other than English and into different formats and (iv) works by making use of this specification in standard conformant products by implementing (e.g. by copy and paste wholly or partly) the functionality therein. However, the content of this document itself may not be modified in any way, including by removing the copyright notice or references to Ecma International, except as required to translate it into languages other than English or into a different format. The official version of an Ecma International document is the English language version on the Ecma International website. In the event of discrepancies between a translated version and the official version, the official version shall govern. The limited permissions granted above are perpetual and will not be revoked by Ecma International or its successors or assigns. 
This document and the information contained herein is provided on an "AS IS" basis and ECMA INTERNATIONAL DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY OWNERSHIP RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE." Software License All Software contained in this document ("Software)" is protected by copyright and is being made available under the "BSD License", included below. This Software may be subject to third party rights (rights from parties other than Ecma International), including patent rights, and no licenses under such third party rights are granted under this license even if the third party concerned is a member of Ecma International. SEE THE ECMA CODE OF CONDUCT IN PATENT MATTERS AVAILABLE AT http://www.ecma-international.org/memento/codeofconduct.htm FOR INFORMATION REGARDING THE LICENSING OF PATENT CLAIMS THAT ARE REQUIRED TO IMPLEMENT ECMA INTERNATIONAL STANDARDS*. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: 1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. 2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. 3. Neither the name of the authors nor Ecma International may be used to endorse or promote products derived from this software without specific prior written permission. THIS SOFTWARE IS PROVIDED BY THE ECMA INTERNATIONAL "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL ECMA INTERNATIONAL BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

Introduction This Ecma Standard defines the ECMAScript 2015 Language. It is the sixth edition of the ECMAScript Language Specification. Since publication of the first edition in 1997, ECMAScript has grown to be one of the world’s most widely used general purpose programming languages. It is best known as the language embedded in web browsers but has also been widely adopted for server and embedded applications. The sixth edition is the most extensive update to ECMAScript since the publication of the first edition in 1997. Goals for ECMAScript 2015 include providing better support for large applications, library creation, and for use of ECMAScript as a compilation target for other languages. Some of its major enhancements include modules, class declarations, lexical block scoping, iterators and generators, promises for asynchronous programming, destructuring patterns, and proper tail calls. The ECMAScript library of built-ins has been expanded to support additional data abstractions including maps, sets, and arrays of binary numeric values as well as additional support for Unicode supplemental characters in strings and regular expressions. The built-ins are now extensible via subclassing. ECMAScript is based on several originating technologies, the most well-known being JavaScript (Netscape) and JScript (Microsoft). The language was invented by Brendan Eich at Netscape and first appeared in that company’s Navigator 2.0 browser. It has appeared in all subsequent browsers from Netscape and in all browsers from Microsoft starting with Internet Explorer 3.0. The development of the ECMAScript Language Specification started in November 1996. The first edition of this Ecma Standard was adopted by the Ecma General Assembly of June 1997. That Ecma Standard was submitted to ISO/IEC JTC 1 for adoption under the fast-track procedure, and approved as international standard ISO/IEC 16262, in April 1998. 
The Ecma General Assembly of June 1998 approved the second edition of ECMA-262 to keep it fully aligned with ISO/IEC 16262. Changes between the first and the second edition are editorial in nature. The third edition of the Standard introduced powerful regular expressions, better string handling, new control statements, try/catch exception handling, tighter definition of errors, formatting for numeric output, and minor changes in anticipation of future language growth. The third edition of the ECMAScript standard was adopted by the Ecma General Assembly of December 1999 and published as ISO/IEC 16262:2002 in June 2002. After publication of the third edition, ECMAScript achieved massive adoption in conjunction with the World Wide Web where it has become the programming language that is supported by essentially all web browsers. Significant work was done to develop a fourth edition of ECMAScript. However, that work was not completed and was not published as the fourth edition of ECMAScript, but some of it was incorporated into the development of the sixth edition. The fifth edition of ECMAScript (published as ECMA-262 5th edition) codified de facto interpretations of the language specification that have become common among browser implementations and added support for new features that had emerged since the publication of the third edition. Such features include accessor properties, reflective creation and inspection of objects, program control of property attributes, additional array manipulation functions, support for the JSON object encoding format, and a strict mode that provides enhanced error checking and program security. The fifth edition was adopted by the Ecma General Assembly of December 2009. The fifth edition was submitted to ISO/IEC JTC 1 for adoption under the fast-track procedure, and approved as international standard ISO/IEC 16262:2011. Edition 5.1 of the ECMAScript Standard incorporated minor corrections and is the same text as ISO/IEC 16262:2011.
The 5.1 Edition was adopted by the Ecma General Assembly of June 2011. Focused development of the sixth edition started in 2009, as the fifth edition was being prepared for publication. However, this was preceded by significant experimentation and language enhancement design efforts dating to the publication of the third edition in 1999. In a very real sense, the completion of the sixth edition is the culmination of a fifteen year effort. Dozens of individuals representing many organizations have made very significant contributions within Ecma TC39 to the development of this edition and to the prior editions. In addition, a vibrant informal community has emerged supporting TC39’s ECMAScript efforts. This community has reviewed numerous drafts, filed thousands of bug reports, performed implementation experiments, contributed test suites, and educated the world-wide developer community about ECMAScript. Unfortunately, it is impossible to identify and acknowledge every person and organization who has contributed to this effort. New uses and requirements for ECMAScript continue to emerge. The sixth edition provides the foundation for regular, incremental language and library enhancements. Allen Wirfs-Brock

ECMA-262, 6th Edition Project Editor

This Ecma Standard has been adopted by the General Assembly of June 2015.

ECMAScript 2015 Language Specification

1 This Standard defines the ECMAScript 2015 general purpose programming language.

2 A conforming implementation of ECMAScript must provide and support all the types, values, objects, properties, functions, and program syntax and semantics described in this specification. A conforming implementation of ECMAScript must interpret source text input in conformance with the Unicode Standard, Version 5.1.0 or later and ISO/IEC 10646. If the adopted ISO/IEC 10646-1 subset is not otherwise specified, it is presumed to be the Unicode set, collection 10646. A conforming implementation of ECMAScript that provides an application programming interface that supports programs that need to adapt to the linguistic and cultural conventions used by different human languages and countries must implement the interface defined by the most recent edition of ECMA-402 that is compatible with this specification. A conforming implementation of ECMAScript may provide additional types, values, objects, properties, and functions beyond those described in this specification. In particular, a conforming implementation of ECMAScript may provide properties not described in this specification, and values for those properties, for objects that are described in this specification. A conforming implementation of ECMAScript may support program and regular expression syntax not described in this specification. In particular, a conforming implementation of ECMAScript may support program syntax that makes use of the “future reserved words” listed in subclause 11.6.2.2 of this specification. A conforming implementation of ECMAScript must not implement any extension that is listed as a Forbidden Extension in subclause 16.1.

3 The following referenced documents are indispensable for the application of this document. For dated references, only the edition cited applies. For undated references, the latest edition of the referenced document (including any amendments) applies.

ISO/IEC 10646:2003: Information Technology – Universal Multiple-Octet Coded Character Set (UCS) plus Amendment 1:2005, Amendment 2:2006, Amendment 3:2008, and Amendment 4:2008, plus additional amendments and corrigenda, or successor

ECMA-402, ECMAScript 2015 Internationalization API Specification.
http://www.ecma-international.org/publications/standards/Ecma-402.htm

ECMA-404, The JSON Data Interchange Format.
http://www.ecma-international.org/publications/standards/Ecma-404.htm

4 This section contains a non-normative overview of the ECMAScript language. ECMAScript is an object-oriented programming language for performing computations and manipulating computational objects within a host environment. ECMAScript as defined here is not intended to be computationally self-sufficient; indeed, there are no provisions in this specification for input of external data or output of computed results. Instead, it is expected that the computational environment of an ECMAScript program will provide not only the objects and other facilities described in this specification but also certain environment-specific objects, whose description and behaviour are beyond the scope of this specification except to indicate that they may provide certain properties that can be accessed and certain functions that can be called from an ECMAScript program. ECMAScript was originally designed to be used as a scripting language, but has become widely used as a general purpose programming language. A scripting language is a programming language that is used to manipulate, customize, and automate the facilities of an existing system. In such systems, useful functionality is already available through a user interface, and the scripting language is a mechanism for exposing that functionality to program control. In this way, the existing system is said to provide a host environment of objects and facilities, which completes the capabilities of the scripting language. A scripting language is intended for use by both professional and non-professional programmers. ECMAScript was originally designed to be a Web scripting language, providing a mechanism to enliven Web pages in browsers and to perform server computation as part of a Web-based client-server architecture. ECMAScript is now used to provide core scripting capabilities for a variety of host environments. Therefore the core language is specified in this document apart from any particular host environment. 
ECMAScript usage has moved beyond simple scripting and it is now used for the full spectrum of programming tasks in many different environments and scales. As the usage of ECMAScript has expanded, so have the features and facilities it provides. ECMAScript is now a fully featured general purpose programming language.

Some of the facilities of ECMAScript are similar to those used in other programming languages; in particular C, Java™, Self, and Scheme as described in:

ISO/IEC 9899:1996, Programming Languages – C.
Gosling, James, Bill Joy and Guy Steele. The Java™ Language Specification. Addison Wesley Publishing Co., 1996.
Ungar, David, and Smith, Randall B. Self: The Power of Simplicity. OOPSLA '87 Conference Proceedings, pp. 227–241, Orlando, FL, October 1987.
IEEE Standard for the Scheme Programming Language. IEEE Std 1178-1990.

4.1 A web browser provides an ECMAScript host environment for client-side computation including, for instance, objects that represent windows, menus, pop-ups, dialog boxes, text areas, anchors, frames, history, cookies, and input/output. Further, the host environment provides a means to attach scripting code to events such as change of focus, page and image loading, unloading, error and abort, selection, form submission, and mouse actions. Scripting code appears within the HTML and the displayed page is a combination of user interface elements and fixed and computed text and images. The scripting code is reactive to user interaction and there is no need for a main program.

A web server provides a different host environment for server-side computation including objects representing requests, clients, and files; and mechanisms to lock and share data. By using browser-side and server-side scripting together, it is possible to distribute computation between the client and server while providing a customized user interface for a Web-based application.
Each Web browser and server that supports ECMAScript supplies its own host environment, completing the ECMAScript execution environment. 4.2 The following is an informal overview of ECMAScript—not all parts of the language are described. This overview is not part of the standard proper. ECMAScript is object-based: basic language and host facilities are provided by objects, and an ECMAScript program is a cluster of communicating objects. In ECMAScript, an object is a collection of zero or more properties each with attributes that determine how each property can be used—for example, when the Writable attribute for a property is set to false, any attempt by executed ECMAScript code to assign a different value to the property fails. Properties are containers that hold other objects, primitive values, or functions. A primitive value is a member of one of the following built-in types: Undefined, Null, Boolean, Number, String, and Symbol; an object is a member of the built-in type Object; and a function is a callable object. A function that is associated with an object via a property is called a method. ECMAScript defines a collection of built-in objects that round out the definition of ECMAScript entities. These built-in objects include the global object; objects that are fundamental to the runtime semantics of the language including Object, Function, Boolean, Symbol, and various Error objects; objects that represent and manipulate numeric values including Math, Number, and Date; the text processing objects String and RegExp; objects that are indexed collections of values including Array and nine different kinds of Typed Arrays whose elements all have a specific numeric data representation; keyed collections including Map and Set objects; objects supporting structured data including the JSON object, ArrayBuffer, and DataView; objects supporting control abstractions including generator functions and Promise objects; and, reflection objects including Proxy and Reflect. 
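As a rough illustration of the property attributes mentioned above, the following sketch defines a property whose Writable attribute is false and shows that an attempt to assign a different value fails. The object and property names (account, id) are hypothetical and chosen only for this example. How the failure surfaces depends on the code performing the assignment: here it is observed once through Reflect.set, which reports the outcome as a Boolean, and once through a plain assignment inside strict mode code, which throws.

```javascript
// Hypothetical object used only for illustration.
const account = {};

// Define a data property whose Writable attribute is false.
Object.defineProperty(account, "id", {
  value: 42,
  writable: false,
  enumerable: true,
  configurable: false
});

// Reflect.set reports the failed assignment by returning false.
const assigned = Reflect.set(account, "id", 99);

// In strict mode code, a plain assignment to the same property throws a TypeError.
function assignStrict(obj) {
  "use strict";
  try {
    obj.id = 99;
    return "assigned";
  } catch (e) {
    return e instanceof TypeError ? "TypeError" : "other error";
  }
}
const strictResult = assignStrict(account);

console.log(assigned);     // false
console.log(strictResult); // TypeError
console.log(account.id);   // 42
```

In both cases the stored value is unchanged; only the way the failure is reported differs.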
ECMAScript also defines a set of built-in operators. ECMAScript operators include various unary operations, multiplicative operators, additive operators, bitwise shift operators, relational operators, equality operators, binary bitwise operators, binary logical operators, assignment operators, and the comma operator. Large ECMAScript programs are supported by modules which allow a program to be divided into multiple sequences of statements and declarations. Each module explicitly identifies declarations it uses that need to be provided by other modules and which of its declarations are available for use by other modules. ECMAScript syntax intentionally resembles Java syntax. ECMAScript syntax is relaxed to enable it to serve as an easy-to-use scripting language. For example, a variable is not required to have its type declared nor are types associated with properties, and defined functions are not required to have their declarations appear textually before calls to them. 4.2.1 Even though ECMAScript includes syntax for class definitions, ECMAScript objects are not fundamentally class-based such as those in C++, Smalltalk, or Java. Instead objects may be created in various ways including via a literal notation or via constructors which create objects and then execute code that initializes all or part of them by assigning initial values to their properties. Each constructor is a function that has a property named "prototype" that is used to implement prototype-based inheritance and shared properties. Objects are created by using constructors in new expressions; for example, new Date(2009,11) creates a new Date object. Invoking a constructor without using new has consequences that depend on the constructor. For example, Date() produces a string representation of the current date and time rather than an object. Every object created by a constructor has an implicit reference (called the object’s prototype) to the value of its constructor’s "prototype" property. 
Furthermore, a prototype may have a non-null implicit reference to its prototype, and so on; this is called the prototype chain. When a reference is made to a property in an object, that reference is to the property of that name in the first object in the prototype chain that contains a property of that name. In other words, first the object mentioned directly is examined for such a property; if that object contains the named property, that is the property to which the reference refers; if that object does not contain the named property, the prototype for that object is examined next; and so on.

Figure 1: Object/Prototype Relationships

In a class-based object-oriented language, in general, state is carried by instances, methods are carried by classes, and inheritance is only of structure and behaviour. In ECMAScript, the state and methods are carried by objects, while structure, behaviour, and state are all inherited.

All objects that do not directly contain a particular property that their prototype contains share that property and its value. Figure 1 illustrates this:

CF is a constructor (and also an object). Five objects have been created by using new expressions: cf1, cf2, cf3, cf4, and cf5. Each of these objects contains properties named q1 and q2. The dashed lines represent the implicit prototype relationship; so, for example, cf3’s prototype is CFp. The constructor, CF, has two properties itself, named P1 and P2, which are not visible to CFp, cf1, cf2, cf3, cf4, or cf5. The property named CFP1 in CFp is shared by cf1, cf2, cf3, cf4, and cf5 (but not by CF), as are any properties found in CFp’s implicit prototype chain that are not named q1, q2, or CFP1. Notice that there is no implicit prototype link between CF and CFp.

Unlike most class-based object languages, properties can be added to objects dynamically by assigning values to them. That is, constructors are not required to name or assign values to all or any of the constructed object’s properties. In the above diagram, one could add a new shared property for cf1, cf2, cf3, cf4, and cf5 by assigning a new value to the property in CFp.

Although ECMAScript objects are not inherently class-based, it is often convenient to define class-like abstractions based upon a common pattern of constructor functions, prototype objects, and methods. The ECMAScript built-in objects themselves follow such a class-like pattern. Beginning with ECMAScript 2015, the ECMAScript language includes syntactic class definitions that permit programmers to concisely define objects that conform to the same class-like abstraction pattern used by the built-in objects.

4.2.2 The ECMAScript Language recognizes the possibility that some users of the language may wish to restrict their usage of some features available in the language. They might do so in the interests of security, to avoid what they consider to be error-prone features, to get enhanced error checking, or for other reasons of their choosing. In support of this possibility, ECMAScript defines a strict variant of the language. The strict variant of the language excludes some specific syntactic and semantic features of the regular ECMAScript language and modifies the detailed semantics of some features. The strict variant also specifies additional error conditions that must be reported by throwing error exceptions in situations that are not specified as errors by the non-strict form of the language.

The strict variant of ECMAScript is commonly referred to as the strict mode of the language. Strict mode selection and use of the strict mode syntax and semantics of ECMAScript is explicitly made at the level of individual ECMAScript source text units.
Because strict mode is selected at the level of a syntactic source text unit, strict mode only imposes restrictions that have local effect within such a source text unit. Strict mode does not restrict or modify any aspect of the ECMAScript semantics that must operate consistently across multiple source text units. A complete ECMAScript program may be composed of both strict mode and non-strict mode ECMAScript source text units. In this case, strict mode only applies when actually executing code that is defined within a strict mode source text unit.

In order to conform to this specification, an ECMAScript implementation must implement both the full unrestricted ECMAScript language and the strict variant of the ECMAScript language as defined by this specification. In addition, an implementation must support the combination of unrestricted and strict mode source text units into a single composite program.

4.3 For the purposes of this document, the following terms and definitions apply.

4.3.1 type
set of data values as defined in clause 6 of this specification

4.3.2 primitive value
member of one of the types Undefined, Null, Boolean, Number, Symbol, or String as defined in clause 6
NOTE A primitive value is a datum that is represented directly at the lowest level of the language implementation.

4.3.3 object
member of the type Object
NOTE An object is a collection of properties and has a single prototype object. The prototype may be the null value.

4.3.4 constructor
function object that creates and initializes objects
NOTE The value of a constructor’s prototype property is a prototype object that is used to implement inheritance and shared properties.

4.3.5 prototype
object that provides shared properties for other objects
NOTE When a constructor creates an object, that object implicitly references the constructor’s prototype property for the purpose of resolving property references. The constructor’s prototype property can be referenced by the program expression constructor.prototype, and properties added to an object’s prototype are shared, through inheritance, by all objects sharing the prototype. Alternatively, a new object may be created with an explicitly specified prototype by using the Object.create built-in function.

4.3.6 ordinary object
object that has the default behaviour for the essential internal methods that must be supported by all objects

4.3.7 exotic object
object that does not have the default behaviour for one or more of the essential internal methods that must be supported by all objects
NOTE Any object that is not an ordinary object is an exotic object.

4.3.8 standard object
object whose semantics are defined by this specification

4.3.9 built-in object
object specified and supplied by an ECMAScript implementation
NOTE Standard built-in objects are defined in this specification. An ECMAScript implementation may specify and supply additional kinds of built-in objects. A built-in constructor is a built-in object that is also a constructor.

4.3.10 undefined value
primitive value used when a variable has not been assigned a value

4.3.11 Undefined type
type whose sole value is the undefined value

4.3.12 null value
primitive value that represents the intentional absence of any object value

4.3.13 Null type
type whose sole value is the null value

4.3.14 Boolean value
member of the Boolean type
NOTE There are only two Boolean values, true and false.

4.3.15 Boolean type
type consisting of the primitive values true and false

4.3.16 Boolean object
member of the Object type that is an instance of the standard built-in Boolean constructor
NOTE A Boolean object is created by using the Boolean constructor in a new expression, supplying a Boolean value as an argument. The resulting object has an internal slot whose value is the Boolean value. A Boolean object can be coerced to a Boolean value.

4.3.17 String value
primitive value that is a finite ordered sequence of zero or more 16-bit unsigned integer values
NOTE A String value is a member of the String type. Each integer value in the sequence usually represents a single 16-bit unit of UTF-16 text. However, ECMAScript does not place any restrictions or requirements on the values except that they must be 16-bit unsigned integers.

4.3.18 String type
set of all possible String values

4.3.19 String object
member of the Object type that is an instance of the standard built-in String constructor
NOTE A String object is created by using the String constructor in a new expression, supplying a String value as an argument. The resulting object has an internal slot whose value is the String value. A String object can be coerced to a String value by calling the String constructor as a function (21.1.1.1).

4.3.20 Number value
primitive value corresponding to a double-precision 64-bit binary format IEEE 754-2008 value
NOTE A Number value is a member of the Number type and is a direct representation of a number.

4.3.21 Number type
set of all possible Number values including the special “Not-a-Number” (NaN) value, positive infinity, and negative infinity

4.3.22 Number object
member of the Object type that is an instance of the standard built-in Number constructor
NOTE A Number object is created by using the Number constructor in a new expression, supplying a number value as an argument. The resulting object has an internal slot whose value is the number value. A Number object can be coerced to a number value by calling the Number constructor as a function (20.1.1.1).

4.3.23 Infinity
number value that is the positive infinite number value

4.3.24 NaN
number value that is an IEEE 754-2008 “Not-a-Number” value

4.3.25 Symbol value
primitive value that represents a unique, non-String Object property key

4.3.26 Symbol type
set of all possible Symbol values

4.3.27 Symbol object
member of the Object type that is an instance of the standard built-in Symbol constructor

4.3.28 function
member of the Object type that may be invoked as a subroutine
NOTE In addition to its properties, a function contains executable code and state that determine how it behaves when invoked. A function’s code may or may not be written in ECMAScript.

4.3.29 built-in function
built-in object that is a function
NOTE Examples of built-in functions include parseInt and Math.exp. An implementation may provide implementation-dependent built-in functions that are not described in this specification.

4.3.30 property
part of an object that associates a key (either a String value or a Symbol value) and a value
NOTE Depending upon the form of the property the value may be represented either directly as a data value (a primitive value, an object, or a function object) or indirectly by a pair of accessor functions.

4.3.31 method
function that is the value of a property
NOTE When a function is called as a method of an object, the object is passed to the function as its this value.

4.3.32 built-in method
method that is a built-in function
NOTE Standard built-in methods are defined in this specification, and an ECMAScript implementation may specify and provide other additional built-in methods.

4.3.33 attribute
internal value that defines some characteristic of a property

4.3.34 own property
property that is directly contained by its object

4.3.35 inherited property
property of an object that is not an own property but is a property (either own or inherited) of the object’s prototype

4.4 The remainder of this specification is organized as follows:

Clause 5 defines the notational conventions used throughout the specification.
Clauses 6–9 define the execution environment within which ECMAScript programs operate.
Clauses 10–16 define the actual ECMAScript programming language including its syntactic encoding and the execution semantics of all language features.
Clauses 17–26 define the ECMAScript standard library. It includes the definitions of all of the standard objects that are available for use by ECMAScript programs as they execute.
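The object/prototype relationships described in 4.2.1, together with the distinction drawn in 4.3 between own and inherited properties, can be sketched roughly as follows. The identifiers loosely mirror those of Figure 1 (CF, its prototype object, and two of the constructed objects); the property values and the names added and viaCreate are invented for illustration and are not part of this specification.

```javascript
// Constructor CF; the values assigned here are invented for illustration.
function CF(q1, q2) {
  this.q1 = q1; // each constructed object directly contains its own q1 and q2
  this.q2 = q2;
}
CF.P1 = "on the constructor"; // property of CF itself, not visible to constructed objects
const CFp = CF.prototype;     // the object that constructed objects implicitly reference
CFp.CFP1 = "shared";          // shared by every object created with new CF(...)

const cf1 = new CF(1, 2);
const cf2 = new CF(3, 4);

console.log(Object.getPrototypeOf(cf1) === CFp);                // true
console.log(cf1.CFP1, cf2.CFP1);                                // shared shared
console.log(Object.prototype.hasOwnProperty.call(cf1, "q1"));   // true  (own property)
console.log(Object.prototype.hasOwnProperty.call(cf1, "CFP1")); // false (inherited property)
console.log(cf1.P1);                                            // undefined
console.log(Object.getPrototypeOf(CF) === CFp);                 // false: no prototype link from CF to CFp

// Adding a property to CFp later makes it visible through existing objects.
CFp.added = "new shared value";
console.log(cf2.added);                                         // new shared value

// An object can also be given an explicitly specified prototype with Object.create.
const viaCreate = Object.create(CFp);
console.log(viaCreate.CFP1);                                    // shared
```

Property lookup walks the prototype chain, so CFP1 is found through cf1 even though cf1 does not directly contain it, while P1 is never found because CF is not on that chain.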

5 5.1 5.1.1 A context-free grammar consists of a number of productions. Each production has an abstract symbol called a nonterminal as its left-hand side, and a sequence of zero or more nonterminal and terminal symbols as its right-hand side. For each grammar, the terminal symbols are drawn from a specified alphabet.

A chain production is a production that has exactly one nonterminal symbol on its right-hand side along with zero or more terminal symbols.

Starting from a sentence consisting of a single distinguished nonterminal, called the goal symbol, a given context-free grammar specifies a language, namely, the (perhaps infinite) set of possible sequences of terminal symbols that can result from repeatedly replacing any nonterminal in the sequence with a right-hand side of a production for which the nonterminal is the left-hand side.

5.1.2 A lexical grammar for ECMAScript is given in clause 11. This grammar has as its terminal symbols Unicode code points that conform to the rules for SourceCharacter defined in 10.1. It defines a set of productions, starting from the goal symbol InputElementDiv, InputElementTemplateTail, InputElementRegExp, or InputElementRegExpOrTemplateTail, that describe how sequences of such code points are translated into a sequence of input elements.

Input elements other than white space and comments form the terminal symbols for the syntactic grammar for ECMAScript and are called ECMAScript tokens. These tokens are the reserved words, identifiers, literals, and punctuators of the ECMAScript language. Moreover, line terminators, although not considered to be tokens, also become part of the stream of input elements and guide the process of automatic semicolon insertion (11.9). Simple white space and single-line comments are discarded and do not appear in the stream of input elements for the syntactic grammar.
A MultiLineComment (that is, a comment of the form /* … */ regardless of whether it spans more than one line) is likewise simply discarded if it contains no line terminator; but if a MultiLineComment contains one or more line terminators, then it is replaced by a single line terminator, which becomes part of the stream of input elements for the syntactic grammar. A RegExp grammar for ECMAScript is given in 21.2.1. This grammar also has as its terminal symbols the code points as defined by SourceCharacter. It defines a set of productions, starting from the goal symbol Pattern, that describe how sequences of code points are translated into regular expression patterns. Productions of the lexical and RegExp grammars are distinguished by having two colons “::” as separating punctuation. The lexical and RegExp grammars share some productions. 5.1.3 Another grammar is used for translating Strings into numeric values. This grammar is similar to the part of the lexical grammar having to do with numeric literals and has as its terminal symbols SourceCharacter. This grammar appears in 7.1.3.1. Productions of the numeric string grammar are distinguished by having three colons “:::” as punctuation. 5.1.4 The syntactic grammar for ECMAScript is given in clauses 11, 12, 13, 14, and 15. This grammar has ECMAScript tokens defined by the lexical grammar as its terminal symbols (5.1.2). It defines a set of productions, starting from two alternative goal symbols Script and Module, that describe how sequences of tokens form syntactically correct independent components of ECMAScript programs. When a stream of code points is to be parsed as an ECMAScript Script or Module, it is first converted to a stream of input elements by repeated application of the lexical grammar; this stream of input elements is then parsed by a single application of the syntactic grammar. 
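The treatment of a MultiLineComment described at the start of this passage can be observed from ordinary code. The following is a minimal sketch, not part of the specification; the function names sameLine and acrossLines are hypothetical. It relies on return being a restricted production, so a line terminator after it triggers automatic semicolon insertion.

```javascript
// A /* ... */ comment with no line terminator inside is simply discarded,
// so `return` still sees the expression that follows it.
function sameLine() {
  return /* discarded */ 42;
}

// This comment contains a line terminator, so the syntactic grammar sees a
// single line terminator in its place. A semicolon is then automatically
// inserted after `return`, and the expression `42` is never returned.
function acrossLines() {
  return /* the comment starts here
  and ends here */ 42;
}

console.log(sameLine());    // 42
console.log(acrossLines()); // undefined
```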
The input stream is syntactically in error if the tokens in the stream of input elements cannot be parsed as a single instance of the goal nonterminal (Script or Module), with no tokens left over. Productions of the syntactic grammar are distinguished by having just one colon “:” as punctuation. The syntactic grammar as presented in clauses 12, 13, 14 and 15 is not a complete account of which token sequences are accepted as a correct ECMAScript Script or Module. Certain additional token sequences are also accepted, namely, those that would be described by the grammar if only semicolons were added to the sequence in certain places (such as before line terminator characters). Furthermore, certain token sequences that are described by the grammar are not considered acceptable if a line terminator character appears in certain “awkward” places. In certain cases in order to avoid ambiguities the syntactic grammar uses generalized productions that permit token sequences that do not form a valid ECMAScript Script or Module. For example, this technique is used for object literals and object destructuring patterns. In such cases a more restrictive supplemental grammar is provided that further restricts the acceptable token sequences. In certain contexts, when explicitly specified, the input elements corresponding to such a production are parsed again using a goal symbol of a supplemental grammar. The input stream is syntactically in error if the tokens in the stream of input elements parsed by a cover grammar cannot be parsed as a single instance of the corresponding supplemental goal symbol, with no tokens left over. 5.1.5 Terminal symbols of the lexical, RegExp, and numeric string grammars are shown in fixed width font, both in the productions of the grammars and throughout this specification whenever the text directly refers to such a terminal symbol. These are to appear in a script exactly as written. 
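Two of the points above, the acceptance of token sequences that only become grammatical once semicolons are added and the reparsing of a generalized production with a supplemental grammar, can be probed from a running program. This is a hedged sketch: the helper parses is hypothetical, and it assumes the host permits the Function constructor, which is used here only to parse source text without running it.

```javascript
// Hypothetical helper: reports whether `source` parses as a function body.
// The function created by `new Function` is never called.
function parses(source) {
  try {
    new Function(source);
    return true;
  } catch (e) {
    if (e instanceof SyntaxError) return false;
    throw e; // anything else is unexpected
  }
}

// Accepted only because a semicolon is supplied at the line terminator.
console.log(parses("var x = 1\nvar y = 2")); // true

// `{ a = 1 }` is permitted by the generalized production. It is valid when
// reparsed as an object destructuring pattern ...
console.log(parses("({ a = 1 } = {});")); // true

// ... but it is a syntax error when it must stand as an object literal.
console.log(parses("({ a = 1 });")); // false
```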
All terminal symbol code points specified in this way are to be understood as the appropriate Unicode code points from the Basic Latin range, as opposed to any similar-looking code points from other Unicode ranges. Nonterminal symbols are shown in italic type. The definition of a nonterminal (also called a “production”) is introduced by the name of the nonterminal being defined followed by one or more colons. (The number of colons indicates to which grammar the production belongs.) One or more alternative right-hand sides for the nonterminal then follow on succeeding lines. For example, the syntactic definition: WhileStatement : while ( Expression ) Statement states that the nonterminal WhileStatement represents the token while , followed by a left parenthesis token, followed by an Expression, followed by a right parenthesis token, followed by a Statement. The occurrences of Expression and Statement are themselves nonterminals. As another example, the syntactic definition: ArgumentList : AssignmentExpression ArgumentList , AssignmentExpression states that an ArgumentList may represent either a single AssignmentExpression or an ArgumentList, followed by a comma, followed by an AssignmentExpression. This definition of ArgumentList is recursive, that is, it is defined in terms of itself. The result is that an ArgumentList may contain any positive number of arguments, separated by commas, where each argument expression is an AssignmentExpression. Such recursive definitions of nonterminals are common. The subscripted suffix “ opt ”, which may appear after a terminal or nonterminal, indicates an optional symbol. The alternative containing the optional symbol actually specifies two right-hand sides, one that omits the optional element and one that includes it. 
This means that: VariableDeclaration : BindingIdentifier Initializer opt is a convenient abbreviation for: VariableDeclaration : BindingIdentifier BindingIdentifier Initializer and that: IterationStatement : for ( LexicalDeclaration Expression opt ; Expression opt ) Statement is a convenient abbreviation for: IterationStatement : for ( LexicalDeclaration ; Expression opt ) Statement for ( LexicalDeclaration Expression ; Expression opt ) Statement which in turn is an abbreviation for: IterationStatement : for ( LexicalDeclaration ; ) Statement for ( LexicalDeclaration ; Expression ) Statement for ( LexicalDeclaration Expression ; ) Statement for ( LexicalDeclaration Expression ; Expression ) Statement so, in this example, the nonterminal IterationStatement actually has four alternative right-hand sides. A production may be parameterized by a subscripted annotation of the form “ [parameters] ”, which may appear as a suffix to the nonterminal symbol defined by the production. “ parameters ” may be either a single name or a comma separated list of names. A parameterized production is shorthand for a set of productions defining all combinations of the parameter names, preceded by an underscore, appended to the parameterized nonterminal symbol. This means that: StatementList [Return] : ReturnStatement ExpressionStatement is a convenient abbreviation for: StatementList : ReturnStatement ExpressionStatement StatementList_Return : ReturnStatement ExpressionStatement and that: StatementList [Return, In] : ReturnStatement ExpressionStatement is an abbreviation for: StatementList : ReturnStatement ExpressionStatement StatementList_Return : ReturnStatement ExpressionStatement StatementList_In : ReturnStatement ExpressionStatement StatementList_Return_In : ReturnStatement ExpressionStatement Multiple parameters produce a combinatory number of productions, not all of which are necessarily referenced in a complete grammar. 
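The expansion of a parameterized production into its set of plain productions can be illustrated with a short sketch. The function name expandParameterized is hypothetical, and the ordering of the results is chosen only so that it matches the StatementList examples above.

```javascript
// Returns the names of every production that a parameterized nonterminal
// stands for: one name per subset of the parameter list, with each chosen
// parameter appended after an underscore.
function expandParameterized(nonterminal, parameters) {
  const names = [];
  const combinations = 1 << parameters.length; // 2^n subsets
  for (let mask = 0; mask < combinations; mask++) {
    const suffix = parameters
      .filter((_, i) => (mask & (1 << i)) !== 0)
      .map((p) => "_" + p)
      .join("");
    names.push(nonterminal + suffix);
  }
  return names;
}

console.log(expandParameterized("StatementList", ["Return"]));
// [ 'StatementList', 'StatementList_Return' ]

console.log(expandParameterized("StatementList", ["Return", "In"]));
// [ 'StatementList', 'StatementList_Return',
//   'StatementList_In', 'StatementList_Return_In' ]
```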
References to nonterminals on the right-hand side of a production can also be parameterized. For example: StatementList : ReturnStatement ExpressionStatement [In] is equivalent to saying: StatementList : ReturnStatement ExpressionStatement_In A nonterminal reference may have both a parameter list and an “ opt ” suffix. For example: VariableDeclaration : BindingIdentifier Initializer [In] opt is an abbreviation for: VariableDeclaration : BindingIdentifier BindingIdentifier Initializer_In Prefixing a parameter name with “ ? ” on a right-hand side nonterminal reference makes that parameter value dependent upon the occurrence of the parameter name on the reference to the current production’s left-hand side symbol. For example: VariableDeclaration [In] : BindingIdentifier Initializer [?In] is an abbreviation for: VariableDeclaration : BindingIdentifier Initializer VariableDeclaration_In : BindingIdentifier Initializer_In If a right-hand side alternative is prefixed with “[+parameter]” that alternative is only available if the named parameter was used in referencing the production’s nonterminal symbol. If a right-hand side alternative is prefixed with “[~parameter]” that alternative is only available if the named parameter was not used in referencing the production’s nonterminal symbol. This means that: StatementList [Return] : [+Return] ReturnStatement ExpressionStatement is an abbreviation for: StatementList : ExpressionStatement StatementList_Return : ReturnStatement ExpressionStatement and that StatementList [Return] : [~Return] ReturnStatement ExpressionStatement is an abbreviation for: StatementList : ReturnStatement ExpressionStatement StatementList_Return : ExpressionStatement When the words “one of” follow the colon(s) in a grammar definition, they signify that each of the terminal symbols on the following line or lines is an alternative definition. 
For example, the lexical grammar for ECMAScript contains the production: NonZeroDigit :: one of 1 2 3 4 5 6 7 8 9 which is merely a convenient abbreviation for: NonZeroDigit :: 1 2 3 4 5 6 7 8 9 If the phrase “[empty]” appears as the right-hand side of a production, it indicates that the production's right-hand side contains no terminals or nonterminals. If the phrase “[lookahead ∉ set ]” appears in the right-hand side of a production, it indicates that the production may not be used if the immediately following input token sequence is a member of the given set . The set can be written as a comma separated list of one or two element terminal sequences enclosed in curly brackets. For convenience, the set can also be written as a nonterminal, in which case it represents the set of all terminals to which that nonterminal could expand. If the set consists of a single terminal the phrase “[lookahead ≠ terminal ]” may be used. For example, given the definitions DecimalDigit :: one of 0 1 2 3 4 5 6 7 8 9 DecimalDigits :: DecimalDigit DecimalDigits DecimalDigit the definition LookaheadExample :: n [lookahead ∉ { 1 , 3 , 5 , 7 , 9 }] DecimalDigits DecimalDigit [lookahead ∉ DecimalDigit ] matches either the letter n followed by one or more decimal digits the first of which is even, or a decimal digit not followed by another decimal digit. If the phrase “[no LineTerminator here]” appears in the right-hand side of a production of the syntactic grammar, it indicates that the production is a restricted production: it may not be used if a LineTerminator occurs in the input stream at the indicated position. For example, the production: ThrowStatement : throw [no LineTerminator here] Expression ; indicates that the production may not be used if a LineTerminator occurs in the script between the throw token and the Expression. 
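Both notations above can be tried out in code. The sketch below is illustrative only: it models the LookaheadExample production with a regular expression (a swapped-in technique, since the specification defines the production by grammar rather than by regular expression), and it uses a hypothetical parses helper built on the Function constructor to show the restricted ThrowStatement production.

```javascript
// LookaheadExample, modelled with negative lookahead assertions:
//   n [lookahead ∉ {1,3,5,7,9}] DecimalDigits
//   DecimalDigit [lookahead ∉ DecimalDigit]
const lookaheadExample = /^(?:n(?![13579])[0-9]+|[0-9](?![0-9]))/;

// Returns the matched prefix of `text`, or null if neither alternative applies.
function matchLookaheadExample(text) {
  const m = lookaheadExample.exec(text);
  return m ? m[0] : null;
}

console.log(matchLookaheadExample("n2468")); // "n2468"
console.log(matchLookaheadExample("n135"));  // null (first digit is odd)
console.log(matchLookaheadExample("7x"));    // "7"
console.log(matchLookaheadExample("42"));    // null (digit followed by a digit)

// Hypothetical helper: does `source` parse as a function body?
function parses(source) {
  try {
    new Function(source); // parsed, never called
    return true;
  } catch (e) {
    if (e instanceof SyntaxError) return false;
    throw e;
  }
}

// ThrowStatement : throw [no LineTerminator here] Expression ;
console.log(parses("throw new Error('x');"));  // true
console.log(parses("throw\nnew Error('x');")); // false
```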
Unless the presence of a LineTerminator is forbidden by a restricted production, any number of occurrences of LineTerminator may appear between any two consecutive tokens in the stream of input elements without affecting the syntactic acceptability of the script. When an alternative in a production of the lexical grammar or the numeric string grammar appears to be a multi-code point token, it represents the sequence of code points that would make up such a token. The right-hand side of a production may specify that certain expansions are not permitted by using the phrase “but not” and then indicating the expansions to be excluded. For example, the production: Identifier :: IdentifierName but not ReservedWord means that the nonterminal Identifier may be replaced by any sequence of code points that could replace IdentifierName provided that the same sequence of code points could not replace ReservedWord. Finally, a few nonterminal symbols are described by a descriptive phrase in sans-serif type in cases where it would be impractical to list all the alternatives: SourceCharacter :: any Unicode code point 5.2 The specification often uses a numbered list to specify steps in an algorithm. These algorithms are used to precisely specify the required semantics of ECMAScript language constructs. The algorithms are not intended to imply the use of any specific implementation technique. In practice, there may be more efficient algorithms available to implement a given feature. Algorithms may be explicitly parameterized, in which case the names and usage of the parameters must be provided as part of the algorithm’s definition. In order to facilitate their use in multiple parts of this specification, some algorithms, called abstract operations, are named and written in parameterized functional form so that they may be referenced by name from within other algorithms. 
Abstract operations are typically referenced using a functional application style such as operationName(arg1, arg2). Some abstract operations are treated as polymorphically dispatched methods of class-like specification abstractions. Such method-like abstract operations are typically referenced using a method application style such as someValue.operationName(arg1, arg2). Algorithms may be associated with productions of one of the ECMAScript grammars. A production that has multiple alternative definitions will typically have a distinct algorithm for each alternative. When an algorithm is associated with a grammar production, it may reference the terminal and nonterminal symbols of the production alternative as if they were parameters of the algorithm. When used in this manner, nonterminal symbols refer to the actual alternative definition that is matched when parsing the source text. When an algorithm is associated with a production alternative, the alternative is typically shown without any “[ ]” grammar annotations. Such annotations should only affect the syntactic recognition of the alternative and have no effect on the associated semantics for the alternative. Unless explicitly specified otherwise, all chain productions have an implicit definition for every algorithm that might be applied to that production’s left-hand side nonterminal. The implicit definition simply reapplies the same algorithm name with the same parameters, if any, to the chain production’s sole right-hand side nonterminal and then returns the result. For example, assume there is a production: Block : { StatementList } but there is no corresponding Evaluation algorithm that is explicitly specified for that production. If in some algorithm there is a statement of the form: “Return the result of evaluating Block” it is implicit that an Evaluation algorithm exists of the form: Runtime Semantics: Evaluation Block : { StatementList } Return the result of evaluating StatementList. 
For clarity of expression, algorithm steps may be subdivided into sequential substeps. Substeps are indented and may themselves be further divided into indented substeps. Outline numbering conventions are used to identify substeps with the first level of substeps labelled with lower case alphabetic characters and the second level of substeps labelled with lower case roman numerals. If more than three levels are required these rules repeat with the fourth level using numeric labels. For example:
1. Top-level step
   a. Substep.
   b. Substep.
      i. Subsubstep.
         1. Subsubsubstep
            a. Subsubsubsubstep
               i. Subsubsubsubsubstep
A step or substep may be written as an “if” predicate that conditions its substeps. In this case, the substeps are only applied if the predicate is true. If a step or substep begins with the word “else”, it is a predicate that is the negation of the preceding “if” predicate step at the same level. A step may specify the iterative application of its substeps. A step that begins with “Assert:” asserts an invariant condition of its algorithm. Such assertions are used to make explicit algorithmic invariants that would otherwise be implicit. Such assertions add no additional semantic requirements and hence need not be checked by an implementation. They are used simply to clarify algorithms. Mathematical operations such as addition, subtraction, negation, multiplication, division, and the mathematical functions defined later in this clause should always be understood as computing exact mathematical results on mathematical real numbers, which unless otherwise noted do not include infinities and do not include a negative zero that is distinguished from positive zero. Algorithms in this standard that model floating-point arithmetic include explicit steps, where necessary, to handle infinities and signed zero and to perform rounding.
If a mathematical operation or function is applied to a floating-point number, it should be understood as being applied to the exact mathematical value represented by that floating-point number; such a floating-point number must be finite, and if it is +0 or −0 then the corresponding mathematical value is simply 0. The mathematical function abs(x) produces the absolute value of x , which is −x if x is negative (less than zero) and otherwise is x itself. The mathematical function sign(x) produces 1 if x is positive and −1 if x is negative. The sign function is not used in this standard for cases when x is zero. The mathematical function min(x 1 , x 2 , ..., x n ) produces the mathematically smallest of x 1 through x n . The mathematical function max(x 1 , x 2 , ..., x n ) produces the mathematically largest of x 1 through x n . The domain and range of these mathematical functions include +∞ and −∞. The notation “x modulo y” ( y must be finite and nonzero) computes a value k of the same sign as y (or zero) such that abs(k) < abs(y) and x−k = q × y for some integer q . The mathematical function floor(x) produces the largest integer (closest to positive infinity) that is not larger than x . NOTE floor(x) = x−(x modulo 1). 5.3 Context-free grammars are not sufficiently powerful to express all the rules that define whether a stream of input elements form a valid ECMAScript Script or Module that may be evaluated. In some situations additional rules are needed that may be expressed using either ECMAScript algorithm conventions or prose requirements. Such rules are always associated with a production of a grammar and are called the static semantics of the production. Static Semantic Rules have names and typically are defined using an algorithm. 
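The definitions of “x modulo y”, sign(x) and floor(x) above can be sketched in ECMAScript itself. This is only an approximation under stated assumptions: the specification's functions operate on exact mathematical real numbers, whereas the code below uses Number values and is therefore subject to floating-point rounding. The function names are hypothetical.

```javascript
// "x modulo y": the result has the same sign as y (or is zero).
// Note that the built-in % operator takes the sign of x instead.
function modulo(x, y) {
  if (!Number.isFinite(x) || !Number.isFinite(y) || y === 0) {
    throw new RangeError("x must be finite; y must be finite and nonzero");
  }
  let k = x % y;
  if (k !== 0 && (k < 0) !== (y < 0)) k += y; // move k onto y's side of zero
  return k + 0; // normalises -0 to +0; mathematical values have no -0
}

// sign(x) is not used for zero, so zero is rejected here.
function sign(x) {
  if (x === 0) throw new RangeError("sign is not used when x is zero");
  return x > 0 ? 1 : -1;
}

// NOTE floor(x) = x − (x modulo 1)
function floor(x) {
  return x - modulo(x, 1);
}

console.log(modulo(-5, 3), -5 % 3); // 1 -2
console.log(modulo(5, -3), 5 % -3); // -1 2
console.log(floor(-1.5));           // -2
console.log(floor(2.25));           // 2
console.log(sign(-0.5));            // -1
```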
Named Static Semantic Rules are associated with grammar productions and a production that has multiple alternative definitions will typically have for each alternative a distinct algorithm for each applicable named static semantic rule. Unless otherwise specified every grammar production alternative in this specification implicitly has a definition for a static semantic rule named Contains which takes an argument named symbol whose value is a terminal or nonterminal of the grammar that includes the associated production. The default definition of Contains is:
1. For each terminal and nonterminal grammar symbol, sym, in the definition of this production do
   a. If sym is the same grammar symbol as symbol, return true.
   b. If sym is a nonterminal, then
      i. Let contained be the result of sym Contains symbol.
      ii. If contained is true, return true.
2. Return false.
The above definition is explicitly overridden for specific productions. A special kind of static semantic rule is an Early Error Rule. Early error rules define early error conditions (see clause 16) that are associated with specific grammar productions. Evaluation of most early error rules is not explicitly invoked within the algorithms of this specification. A conforming implementation must, prior to the first evaluation of a Script or Module, validate all of the early error rules of the productions used to parse that Script or Module. If any of the early error rules are violated the Script or Module is invalid and cannot be evaluated.
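The default definition of Contains given above can be sketched over a toy parse tree. The node shape used here is an assumption made purely for illustration: a nonterminal is an object with a symbol name and a children array, and a terminal is an object with only a symbol. Real implementations are free to represent parse results in any way.

```javascript
// Default Contains: examine each symbol in the production's definition,
// recursing into nonterminals. The node itself is not tested.
function contains(node, symbol) {
  for (const sym of node.children || []) {
    if (sym.symbol === symbol) return true;
    if (Array.isArray(sym.children)) {   // sym is a nonterminal
      const contained = contains(sym, symbol);
      if (contained) return true;
    }
  }
  return false;
}

// Hypothetical tree for:  while ( x ) return ;
const whileStatement = {
  symbol: "WhileStatement",
  children: [
    { symbol: "while" },
    { symbol: "(" },
    { symbol: "Expression", children: [{ symbol: "IdentifierReference", children: [] }] },
    { symbol: ")" },
    {
      symbol: "Statement",
      children: [
        { symbol: "ReturnStatement", children: [{ symbol: "return" }, { symbol: ";" }] },
      ],
    },
  ],
};

console.log(contains(whileStatement, "ReturnStatement")); // true
console.log(contains(whileStatement, "return"));          // true
console.log(contains(whileStatement, "YieldExpression")); // false
console.log(contains(whileStatement, "WhileStatement"));  // false
```

A production that overrides Contains, as the text notes some do, would replace this default traversal with its own rule.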

10 10.1 Syntax SourceCharacter :: any Unicode code point ECMAScript code is expressed using Unicode, version 5.1 or later. ECMAScript source text is a sequence of code points. All Unicode code point values from U+0000 to U+10FFFF, including surrogate code points, may occur in source text where permitted by the ECMAScript grammars. The actual encodings used to store and interchange ECMAScript source text are not relevant to this specification. Regardless of the external source text encoding, a conforming ECMAScript implementation processes the source text as if it were an equivalent sequence of SourceCharacter values, each SourceCharacter being a Unicode code point. Conforming ECMAScript implementations are not required to perform any normalization of source text, or behave as though they were performing normalization of source text. The components of a combining character sequence are treated as individual Unicode code points even though a user might think of the whole sequence as a single character. NOTE In string literals, regular expression literals, template literals and identifiers, any Unicode code point may also be expressed using Unicode escape sequences that explicitly express a code point’s numeric value. Within a comment, such an escape sequence is effectively ignored as part of the comment. ECMAScript differs from the Java programming language in the behaviour of Unicode escape sequences. In a Java program, if the Unicode escape sequence \u000A, for example, occurs within a single-line comment, it is interpreted as a line terminator (Unicode code point U+000A is LINE FEED (LF)) and therefore the next code point is not part of the comment. Similarly, if the Unicode escape sequence \u000A occurs within a string literal in a Java program, it is likewise interpreted as a line terminator, which is not allowed within a string literal; one must write \n instead of \u000A to cause a LINE FEED (LF) to be part of the String value of a string literal. In an ECMAScript program, a Unicode escape sequence occurring within a comment is never interpreted and therefore cannot contribute to termination of the comment. Similarly, a Unicode escape sequence occurring within a string literal in an ECMAScript program always contributes to the literal and is never interpreted as a line terminator or as a code point that might terminate the string literal.
10.1.1 UTF16Encoding ( cp ) The UTF16Encoding of a numeric code point value, cp, is determined as follows:
1. Assert: 0 ≤ cp ≤ 0x10FFFF.
2. If cp ≤ 65535, return cp.
3. Let cu1 be floor((cp – 65536) / 1024) + 0xD800.
4. Let cu2 be ((cp – 65536) modulo 1024) + 0xDC00.
5. Return the code unit sequence consisting of cu1 followed by cu2.
10.1.2 Two code units, lead and trail, that form a UTF-16 surrogate pair are converted to a code point by performing the following steps:
1. Assert: 0xD800 ≤ lead ≤ 0xDBFF and 0xDC00 ≤ trail ≤ 0xDFFF.
2. Let cp be (lead – 0xD800) × 1024 + (trail – 0xDC00) + 0x10000.
3. Return the code point cp.
10.2 There are four types of ECMAScript code:

Global code is source text that is treated as an ECMAScript Script. The global code of a particular Script does not include any source text that is parsed as part of a FunctionDeclaration, FunctionExpression, GeneratorDeclaration, GeneratorExpression, MethodDefinition, ArrowFunction, ClassDeclaration, or ClassExpression.

Eval code is the source text supplied to the built-in eval function. More precisely, if the parameter to the built-in eval function is a String, it is treated as an ECMAScript Script . The eval code for a particular invocation of eval is the global code portion of that Script .

Function code is source text that is parsed to supply the value of the [[ECMAScriptCode]] and [[FormalParameters]] internal slots (see 9.2) of an ECMAScript function object. The function code of a particular ECMAScript function does not include any source text that is parsed as the function code of a nested FunctionDeclaration , FunctionExpression , GeneratorDeclaration , GeneratorExpression , MethodDefinition , ArrowFunction, ClassDeclaration , or ClassExpression .

Module code is source text that is code that is provided as a ModuleBody. It is the code that is directly evaluated when a module is initialized. The module code of a particular module does not include any source text that is parsed as part of a nested FunctionDeclaration, FunctionExpression, GeneratorDeclaration, GeneratorExpression, MethodDefinition, ArrowFunction, ClassDeclaration, or ClassExpression. NOTE Function code is generally provided as the bodies of Function Definitions (14.1), Arrow Function Definitions (14.2), Method Definitions (14.3) and Generator Definitions (14.4). Function code is also derived from the arguments to the Function constructor (19.2.1.1) and the GeneratorFunction constructor (25.2.1.1). 10.2.1 An ECMAScript Script syntactic unit may be processed using either unrestricted or strict mode syntax and semantics. Code is interpreted as strict mode code in the following situations: Global code is strict mode code if it begins with a Directive Prologue that contains a Use Strict Directive (see 14.1.1).

Module code is always strict mode code.

All parts of a ClassDeclaration or a ClassExpression are strict mode code.

Eval code is strict mode code if it begins with a Directive Prologue that contains a Use Strict Directive or if the call to eval is a direct eval (see 12.3.4.1) that is contained in strict mode code.

Function code is strict mode code if the associated FunctionDeclaration, FunctionExpression, GeneratorDeclaration, GeneratorExpression, MethodDefinition, or ArrowFunction is contained in strict mode code or if the code that produces the value of the function’s [[ECMAScriptCode]] internal slot begins with a Directive Prologue that contains a Use Strict Directive.

Function code that is supplied as the arguments to the built-in Function and Generator constructors is strict mode code if the last argument is a String that when processed is a FunctionBody that begins with a Directive Prologue that contains a Use Strict Directive. ECMAScript code that is not strict mode code is called non-strict code. 10.2.2 An ECMAScript implementation may support the evaluation of exotic function objects whose evaluative behaviour is expressed in some implementation defined form of executable code other than via ECMAScript code. Whether a function object is an ECMAScript code function or a non-ECMAScript function is not semantically observable from the perspective of an ECMAScript code function that calls or is called by such a non-ECMAScript function.
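As a worked example for this clause, the UTF16Encoding steps of 10.1.1 and the surrogate pair conversion of 10.1.2 can be transcribed almost directly. This is a minimal sketch rather than a normative implementation. The name utf16Decode is chosen for the sketch, the assertions are turned into thrown errors, and utf16Encoding always returns an array of code units (including for a single code unit) purely for convenience.

```javascript
// 10.1.1: code point -> one or two UTF-16 code units.
function utf16Encoding(cp) {
  if (!Number.isInteger(cp) || cp < 0 || cp > 0x10FFFF) {
    throw new RangeError("cp must satisfy 0 <= cp <= 0x10FFFF");
  }
  if (cp <= 65535) return [cp];
  const cu1 = Math.floor((cp - 65536) / 1024) + 0xD800;
  const cu2 = ((cp - 65536) % 1024) + 0xDC00; // operand is non-negative here
  return [cu1, cu2];
}

// 10.1.2: surrogate pair -> code point.
function utf16Decode(lead, trail) {
  const ok = lead >= 0xD800 && lead <= 0xDBFF && trail >= 0xDC00 && trail <= 0xDFFF;
  if (!ok) throw new RangeError("lead and trail must form a surrogate pair");
  return (lead - 0xD800) * 1024 + (trail - 0xDC00) + 0x10000;
}

const units = utf16Encoding(0x1F600);
console.log(units.map((u) => u.toString(16))); // [ 'd83d', 'de00' ]
console.log(utf16Decode(units[0], units[1]).toString(16)); // 1f600

// Cross-check against the built-in string representation.
const s = String.fromCodePoint(0x1F600);
console.log(s.charCodeAt(0) === units[0] && s.charCodeAt(1) === units[1]); // true
```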

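The strict mode rules of 10.2.1 can be observed with a small probe. The sketch below is illustrative only and relies on one assumption worth stating: a function called without a receiver sees this as undefined when its code is strict mode code, and as an object in non-strict code. It also assumes the host permits the Function constructor and eval.

```javascript
const probe = "return (function () { return this === undefined; })();";

// Function code supplied to the Function constructor is strict only if the
// body itself begins with a Use Strict Directive.
const sloppy = new Function(probe)();
const strict = new Function("'use strict'; " + probe)();

// The directive must be part of the Directive Prologue; after another
// statement it is just an expression statement with no effect.
const tooLate = new Function("var x; 'use strict'; " + probe)();

// All parts of a ClassDeclaration are strict mode code.
class C {
  static check() {
    return (function () { return this === undefined; })();
  }
}

// Eval code is strict if the direct eval is contained in strict mode code.
const viaEval = (function () {
  "use strict";
  return eval("(function () { return this === undefined; })()");
})();

console.log(sloppy);    // false
console.log(strict);    // true
console.log(tooLate);   // false
console.log(C.check()); // true
console.log(viaEval);   // true
```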
11 The source text of an ECMAScript Script or Module is first converted into a sequence of input elements, which are tokens, line terminators, comments, or white space. The source text is scanned from left to right, repeatedly taking the longest possible sequence of code points as the next input element. There are several situations where the identification of lexical input elements is sensitive to the syntactic grammar context that is consuming the input elements. This requires multiple goal symbols for the lexical grammar. The InputElementRegExpOrTemplateTail goal is used in syntactic grammar contexts where a RegularExpressionLiteral, a TemplateMiddle, or a TemplateTail is permitted. The InputElementRegExp goal symbol is used in all syntactic grammar contexts where a RegularExpressionLiteral is permitted but neither a TemplateMiddle, nor a TemplateTail is permitted. The InputElementTemplateTail goal is used in all syntactic grammar contexts where a TemplateMiddle or a TemplateTail is permitted but a RegularExpressionLiteral is not permitted. In all other contexts, InputElementDiv is used as the lexical goal symbol. NOTE The use of multiple lexical goals ensures that there are no lexical ambiguities that would affect automatic semicolon insertion. For example, there are no syntactic grammar contexts where both a leading division or division-assignment, and a leading RegularExpressionLiteral are permitted. This is not affected by semicolon insertion (see 11.9); in examples such as the following: a = b

/hi/g.exec(c).map(d); where the first non-whitespace, non-comment code point after a LineTerminator is U+002F (SOLIDUS) and the syntactic context allows division or division-assignment, no semicolon is inserted at the LineTerminator. That is, the above example is interpreted in the same way as: a = b / hi / g.exec(c).map(d); Syntax InputElementDiv :: WhiteSpace LineTerminator Comment CommonToken DivPunctuator RightBracePunctuator InputElementRegExp :: WhiteSpace LineTerminator Comment CommonToken RightBracePunctuator RegularExpressionLiteral InputElementRegExpOrTemplateTail :: WhiteSpace LineTerminator Comment CommonToken RegularExpressionLiteral TemplateSubstitutionTail InputElementTemplateTail :: WhiteSpace LineTerminator Comment CommonToken DivPunctuator TemplateSubstitutionTail 11.1 The Unicode format-control characters (i.e., the characters in category “Cf” in the Unicode Character Database such as LEFT-TO-RIGHT MARK or RIGHT-TO-LEFT MARK) are control codes used to control the formatting of a range of text in the absence of higher-level protocols for this (such as mark-up languages). It is useful to allow format-control characters in source text to facilitate editing and display. All format control characters may be used within comments, and within string literals, template literals, and regular expression literals. U+200C (ZERO WIDTH NON-JOINER) and U+200D (ZERO WIDTH JOINER) are format-control characters that are used to make necessary distinctions when forming words or phrases in certain languages. In ECMAScript source text these code points may also be used in an IdentifierName (see 11.6.1) after the first character. U+FEFF (ZERO WIDTH NO-BREAK SPACE) is a format-control character used primarily at the start of a text to mark it as Unicode and to allow detection of the text's encoding and byte order. <ZWNBSP> characters intended for this purpose can sometimes also appear after the start of a text, for example as a result of concatenating files. 
In ECMAScript source text <ZWNBSP> code points are treated as white space characters (see 11.2). The special treatment of certain format-control characters outside of comments, string literals, and regular expression literals is summarized in Table 31.

Table 31: Format-Control Code Point Usage
Code Point   Name                        Abbreviation   Usage
U+200C       ZERO WIDTH NON-JOINER       <ZWNJ>         IdentifierPart
U+200D       ZERO WIDTH JOINER           <ZWJ>          IdentifierPart
U+FEFF       ZERO WIDTH NO-BREAK SPACE   <ZWNBSP>       WhiteSpace

11.2 White space code points are used to improve source text readability and to separate tokens (indivisible lexical units) from each other, but are otherwise insignificant. White space code points may occur between any two tokens and at the start or end of input. White space code points may occur within a StringLiteral, a RegularExpressionLiteral, a Template, or a TemplateSubstitutionTail where they are considered significant code points forming part of a literal value. They may also occur within a Comment, but cannot appear within any other kind of token. The ECMAScript white space code points are listed in Table 32.

Table 32: White Space Code Points
Code Point            Name                                              Abbreviation
U+0009                CHARACTER TABULATION                              <TAB>
U+000B                LINE TABULATION                                   <VT>
U+000C                FORM FEED (FF)                                    <FF>
U+0020                SPACE                                             <SP>
U+00A0                NO-BREAK SPACE                                    <NBSP>
U+FEFF                ZERO WIDTH NO-BREAK SPACE                         <ZWNBSP>
Other category “Zs”   Any other Unicode “Separator, space” code point   <USP>

ECMAScript implementations must recognize as WhiteSpace code points listed in the “Separator, space” (Zs) category by Unicode 5.1. ECMAScript implementations may also recognize as WhiteSpace additional category Zs code points from subsequent editions of the Unicode Standard. NOTE Other than for the code points listed in Table 32, ECMAScript WhiteSpace intentionally excludes all code points that have the Unicode “White_Space” property but which are not classified in category “Zs”.
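The white space and format-control rules above can be checked against a running implementation. The following sketch is illustrative; the helpers run and parses are hypothetical, and they assume the host permits the Function constructor. U+0085 (NEXT LINE) is used as an example of a code point that has the Unicode White_Space property but is not in category Zs.

```javascript
// Hypothetical helpers: evaluate or merely parse a function body.
const run = (source) => new Function(source)();
function parses(source) {
  try {
    new Function(source); // parsed, never called
    return true;
  } catch (e) {
    if (e instanceof SyntaxError) return false;
    throw e;
  }
}

// <ZWNBSP>, <NBSP> and a category Zs code point (U+2000 EN QUAD) all
// separate tokens like an ordinary space.
const spaced = ["return 1", "\uFEFF", "+", "\u00A0", "\u2000", "1;"].join("");
console.log(run(spaced)); // 2

// <ZWNJ> may appear in an IdentifierName after the first code point.
console.log(run("var a\u200Cb = 7; return a\u200Cb;")); // 7

// U+0085 is not WhiteSpace in ECMAScript, so it cannot separate tokens.
console.log(parses("return 1\u0085+ 1;")); // false
```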
Syntax

WhiteSpace ::
  <TAB>
  <VT>
  <FF>
  <SP>
  <NBSP>
  <ZWNBSP>
  <USP>

11.3

Like white space code points, line terminator code points are used to improve source text readability and to separate tokens (indivisible lexical units) from each other. However, unlike white space code points, line terminators have some influence over the behaviour of the syntactic grammar. In general, line terminators may occur between any two tokens, but there are a few places where they are forbidden by the syntactic grammar. Line terminators also affect the process of automatic semicolon insertion (11.9).

A line terminator cannot occur within any token except a StringLiteral, Template, or TemplateSubstitutionTail. Line terminators may only occur within a StringLiteral token as part of a LineContinuation.

A line terminator can occur within a MultiLineComment (11.4) but cannot occur within a SingleLineComment.

Line terminators are included in the set of white space code points that are matched by the \s class in regular expressions.

The ECMAScript line terminator code points are listed in Table 33.

Table 33 — Line Terminator Code Points

Code Point  Unicode Name          Abbreviation
U+000A      LINE FEED (LF)        <LF>
U+000D      CARRIAGE RETURN (CR)  <CR>
U+2028      LINE SEPARATOR        <LS>
U+2029      PARAGRAPH SEPARATOR   <PS>

Only the Unicode code points in Table 33 are treated as line terminators. Other new line or line breaking Unicode code points are not treated as line terminators but are treated as white space if they meet the requirements listed in Table 32. The sequence <CR><LF> is commonly used as a line terminator. It should be considered a single SourceCharacter for the purpose of reporting line numbers.
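The line-number guidance above can be sketched as follows. The helper name countLines is hypothetical. It recognizes only the four code points of Table 33 and matches <CR><LF> first, so that the pair counts as one line terminator.

```javascript
// Hypothetical helper, not defined by the specification.
// <CR><LF> is tried before the single code points so the pair is counted once.
const LINE_TERMINATOR_SEQUENCE = /\r\n|[\n\r\u2028\u2029]/g;

function countLines(sourceText) {
  const terminators = sourceText.match(LINE_TERMINATOR_SEQUENCE);
  return (terminators ? terminators.length : 0) + 1;
}

console.log(countLines("a\r\nb"));      // 2: <CR><LF> is a single terminator
console.log(countLines("a\u2028b\nc")); // 3: <LS> and <LF> both terminate lines
console.log(countLines("a\u0085b"));    // 1: NEXT LINE (U+0085) is not in Table 33

// Line terminators are also matched by the \s class in regular expressions.
console.log(/\s/.test("\u2029"));       // true
```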
Syntax LineTerminator :: <LF> <CR> <LS> <PS> LineTerminatorSequence :: <LF> <CR> [lookahead ≠ <LF> ] <LS> <PS> <CR> <LF> 11.5 Syntax CommonToken :: IdentifierName Punctuator NumericLiteral StringLiteral Template NOTE The DivPunctuator, RegularExpressionLiteral, RightBracePunctuator, and TemplateSubstitutionTail productions derive additional tokens that are not included in the CommonToken production. 11.6 IdentifierName and ReservedWord are tokens that are interpreted according to the Default Identifier Syntax given in Unicode Standard Annex #31, Identifier and Pattern Syntax, with some small modifications. ReservedWord is an enumerated subset of IdentifierName. The syntactic grammar defines Identifier as an IdentifierName that is not a ReservedWord (see 11.6.2). The Unicode identifier grammar is based on character properties specified by the Unicode Standard. The Unicode code points in the specified categories in version 5.1.0 of the Unicode standard must be treated as in those categories by all conforming ECMAScript implementations. ECMAScript implementations may recognize identifier code points defined in later editions of the Unicode Standard. NOTE 1 This standard specifies specific code point additions: U+0024 (DOLLAR SIGN) and U+005F (LOW LINE) are permitted anywhere in an IdentifierName, and the code points U+200C (ZERO WIDTH NON-JOINER) and U+200D (ZERO WIDTH JOINER) are permitted anywhere after the first code point of an IdentifierName. Unicode escape sequences are permitted in an IdentifierName, where they contribute a single Unicode code point to the IdentifierName. The code point is expressed by the HexDigits of the UnicodeEscapeSequence (see 11.8.4). The \ preceding the UnicodeEscapeSequence and the u and { } code units, if they appear, do not contribute code points to the IdentifierName. A UnicodeEscapeSequence cannot be used to put a code point into an IdentifierName that would otherwise be illegal. 
In other words, if a \ UnicodeEscapeSequence sequence were replaced by the SourceCharacter it contributes, the result must still be a valid IdentifierName that has the exact same sequence of SourceCharacter elements as the original IdentifierName. All interpretations of IdentifierName within this specification are based upon their actual code points regardless of whether or not an escape sequence was used to contribute any particular code point.

Two IdentifierName that are canonically equivalent according to the Unicode standard are not equal unless, after replacement of each UnicodeEscapeSequence, they are represented by the exact same sequence of code points.

Syntax

IdentifierName ::
  IdentifierStart
  IdentifierName IdentifierPart

IdentifierStart ::
  UnicodeIDStart
  $
  _
  \ UnicodeEscapeSequence

IdentifierPart ::
  UnicodeIDContinue
  $
  _
  \ UnicodeEscapeSequence
  <ZWNJ>
  <ZWJ>

UnicodeIDStart ::
  any Unicode code point with the Unicode property “ID_Start”

UnicodeIDContinue ::
  any Unicode code point with the Unicode property “ID_Continue”

The definition of the nonterminal UnicodeEscapeSequence is given in 11.8.4.

NOTE 2 The sets of code points with Unicode properties “ID_Start” and “ID_Continue” include, respectively, the code points with Unicode properties “Other_ID_Start” and “Other_ID_Continue”.

11.6.1

11.6.1.1

IdentifierStart :: \ UnicodeEscapeSequence

It is a Syntax Error if SV(UnicodeEscapeSequence) is none of "$", or "_", or the UTF16Encoding (10.1.1) of a code point matched by the UnicodeIDStart lexical grammar production.

IdentifierPart :: \ UnicodeEscapeSequence

It is a Syntax Error if SV(UnicodeEscapeSequence) is none of "$", or "_", or the UTF16Encoding (10.1.1) of either <ZWNJ> or <ZWJ>, or the UTF16Encoding of a Unicode code point that would be matched by the UnicodeIDContinue lexical grammar production.

11.6.1.2 StringValue

See also: 11.8.4.2, 12.1.4.
IdentifierName ::
  IdentifierStart
  IdentifierName IdentifierPart

Return the String value consisting of the sequence of code units corresponding to IdentifierName. In determining the sequence, any occurrences of \ UnicodeEscapeSequence are first replaced with the code point represented by the UnicodeEscapeSequence, and then the code points of the entire IdentifierName are converted to code units by applying UTF16Encoding (10.1.1) to each code point.

11.6.2

A reserved word is an IdentifierName that cannot be used as an Identifier.

Syntax

ReservedWord ::
  Keyword
  FutureReservedWord
  NullLiteral
  BooleanLiteral

NOTE The ReservedWord definitions are specified as literal sequences of specific SourceCharacter elements. A code point in a ReservedWord cannot be expressed by a \ UnicodeEscapeSequence.

11.6.2.1

The following tokens are ECMAScript keywords and may not be used as Identifiers in ECMAScript programs.

Syntax

Keyword :: one of
  break     do        in          typeof
  case      else      instanceof  var
  catch     export    new         void
  class     extends   return      while
  const     finally   super       with
  continue  for       switch      yield
  debugger  function  this
  default   if        throw
  delete    import    try

NOTE In some contexts yield is given the semantics of an Identifier. See 12.1.1. In strict mode code, let and static are treated as reserved keywords through static semantic restrictions (see 12.1.1, 13.3.1.1, 13.7.5.1, and 14.5.1) rather than the lexical grammar.

11.6.2.2

The following tokens are reserved for use as keywords in future language extensions.

Syntax

FutureReservedWord ::
  enum
  await

await is only treated as a FutureReservedWord when Module is the goal symbol of the syntactic grammar.

NOTE Use of the following tokens within strict mode code (see 10.2.1) is also reserved. That usage is restricted using static semantic restrictions (see 12.1.1) rather than the lexical grammar: implements package protected interface private public

11.7

Syntax

Punctuator :: one of { ( ) [ ] . ... ; , < > <= >= == != === !== + - * % ++ -- << >> >>> & | ^ ! ~ && || ?
: = += -= *= %= <<= >>= >>>= &= |= ^= => DivPunctuator :: / /= RightBracePunctuator :: } 11.8 11.8.1 Syntax NullLiteral :: null 11.8.2 Syntax BooleanLiteral :: true false 11.8.3 Syntax NumericLiteral :: DecimalLiteral BinaryIntegerLiteral OctalIntegerLiteral HexIntegerLiteral DecimalLiteral :: DecimalIntegerLiteral . DecimalDigits opt ExponentPart opt . DecimalDigits ExponentPart opt DecimalIntegerLiteral ExponentPart opt DecimalIntegerLiteral :: 0 NonZeroDigit DecimalDigits opt DecimalDigits :: DecimalDigit DecimalDigits DecimalDigit DecimalDigit :: one of 0 1 2 3 4 5 6 7 8 9 NonZeroDigit :: one of 1 2 3 4 5 6 7 8 9 ExponentPart :: ExponentIndicator SignedInteger ExponentIndicator :: one of e E SignedInteger :: DecimalDigits + DecimalDigits - DecimalDigits BinaryIntegerLiteral :: 0b BinaryDigits 0B BinaryDigits BinaryDigits :: BinaryDigit BinaryDigits BinaryDigit BinaryDigit :: one of 0 1 OctalIntegerLiteral :: 0o OctalDigits 0O OctalDigits OctalDigits :: OctalDigit OctalDigits OctalDigit OctalDigit :: one of 0 1 2 3 4 5 6 7 HexIntegerLiteral :: 0x HexDigits 0X HexDigits HexDigits :: HexDigit HexDigits HexDigit HexDigit :: one of 0 1 2 3 4 5 6 7 8 9 a b c d e f A B C D E F The SourceCharacter immediately following a NumericLiteral must not be an IdentifierStart or DecimalDigit. NOTE For example:

3in

is an error and not the two input elements 3 and in . A conforming implementation, when processing strict mode code (see 10.2.1), must not extend, as described in B.1.1, the syntax of NumericLiteral to include LegacyOctalIntegerLiteral, nor extend the syntax of DecimalIntegerLiteral to include NonOctalDecimalIntegerLiteral. 11.8.3.1 A numeric literal stands for a value of the Number type. This value is determined in two steps: first, a mathematical value (MV) is derived from the literal; second, this mathematical value is rounded as described below. The MV of NumericLiteral :: DecimalLiteral is the MV of DecimalLiteral .

The MV of NumericLiteral :: BinaryIntegerLiteral is the MV of BinaryIntegerLiteral .

The MV of NumericLiteral :: OctalIntegerLiteral is the MV of OctalIntegerLiteral .

The MV of NumericLiteral :: HexIntegerLiteral is the MV of HexIntegerLiteral .

The MV of DecimalLiteral :: DecimalIntegerLiteral . is the MV of DecimalIntegerLiteral .

The MV of DecimalLiteral :: DecimalIntegerLiteral . DecimalDigits is the MV of DecimalIntegerLiteral plus (the MV of DecimalDigits × $10^{-n}$), where n is the number of code points in DecimalDigits .

The MV of DecimalLiteral :: DecimalIntegerLiteral . ExponentPart is the MV of DecimalIntegerLiteral × $10^{e}$, where e is the MV of ExponentPart .

The MV of DecimalLiteral :: DecimalIntegerLiteral . DecimalDigits ExponentPart is (the MV of DecimalIntegerLiteral plus (the MV of DecimalDigits × $10^{-n}$)) × $10^{e}$, where n is the number of code points in DecimalDigits and e is the MV of ExponentPart .

The MV of DecimalLiteral :: . DecimalDigits is the MV of DecimalDigits × $10^{-n}$, where n is the number of code points in DecimalDigits .

The MV of DecimalLiteral :: . DecimalDigits ExponentPart is the MV of DecimalDigits × $10^{e-n}$, where n is the number of code points in DecimalDigits and e is the MV of ExponentPart .

The MV of DecimalLiteral :: DecimalIntegerLiteral is the MV of DecimalIntegerLiteral .

The MV of DecimalLiteral :: DecimalIntegerLiteral ExponentPart is the MV of DecimalIntegerLiteral × $10^{e}$, where e is the MV of ExponentPart .

The MV of DecimalIntegerLiteral :: 0 is 0.

The MV of DecimalIntegerLiteral :: NonZeroDigit is the MV of NonZeroDigit.

The MV of DecimalIntegerLiteral :: NonZeroDigit DecimalDigits is (the MV of NonZeroDigit × $10^{n}$) plus the MV of DecimalDigits , where n is the number of code points in DecimalDigits .

The MV of DecimalDigits :: DecimalDigit is the MV of DecimalDigit .

The MV of DecimalDigits :: DecimalDigits DecimalDigit is (the MV of DecimalDigits × 10) plus the MV of DecimalDigit .

The MV of ExponentPart :: ExponentIndicator SignedInteger is the MV of SignedInteger .

The MV of SignedInteger :: DecimalDigits is the MV of DecimalDigits .

The MV of SignedInteger :: + DecimalDigits is the MV of DecimalDigits .

The MV of SignedInteger :: - DecimalDigits is the negative of the MV of DecimalDigits .

The MV of DecimalDigit :: 0 or of HexDigit :: 0 or of OctalDigit :: 0 or of BinaryDigit :: 0 is 0.

The MV of DecimalDigit :: 1 or of NonZeroDigit :: 1 or of HexDigit :: 1 or of OctalDigit :: 1 or of BinaryDigit :: 1 is 1.

The MV of DecimalDigit :: 2 or of NonZeroDigit :: 2 or of HexDigit :: 2 or of OctalDigit :: 2 is 2.

The MV of DecimalDigit :: 3 or of NonZeroDigit :: 3 or of HexDigit :: 3 or of OctalDigit :: 3 is 3.

The MV of DecimalDigit :: 4 or of NonZeroDigit :: 4 or of HexDigit :: 4 or of OctalDigit :: 4 is 4.

The MV of DecimalDigit :: 5 or of NonZeroDigit :: 5 or of HexDigit :: 5 or of OctalDigit :: 5 is 5.

The MV of DecimalDigit :: 6 or of NonZeroDigit :: 6 or of HexDigit :: 6 or of OctalDigit :: 6 is 6.

The MV of DecimalDigit :: 7 or of NonZeroDigit :: 7 or of HexDigit :: 7 or of OctalDigit :: 7 is 7.

The MV of DecimalDigit :: 8 or of NonZeroDigit :: 8 or of HexDigit :: 8 is 8.

The MV of DecimalDigit :: 9 or of NonZeroDigit :: 9 or of HexDigit :: 9 is 9.

The MV of HexDigit :: a or of HexDigit :: A is 10.

The MV of HexDigit :: b or of HexDigit :: B is 11.

The MV of HexDigit :: c or of HexDigit :: C is 12.

The MV of HexDigit :: d or of HexDigit :: D is 13.

The MV of HexDigit :: e or of HexDigit :: E is 14.

The MV of HexDigit :: f or of HexDigit :: F is 15.

The MV of BinaryIntegerLiteral :: 0b BinaryDigits is the MV of BinaryDigits .

The MV of BinaryIntegerLiteral :: 0B BinaryDigits is the MV of BinaryDigits .

The MV of BinaryDigits :: BinaryDigit is the MV of BinaryDigit .

The MV of BinaryDigits :: BinaryDigits BinaryDigit is (the MV of BinaryDigits × 2) plus the MV of BinaryDigit .

The MV of OctalIntegerLiteral :: 0o OctalDigits is the MV of OctalDigits .

The MV of OctalIntegerLiteral :: 0O OctalDigits is the MV of OctalDigits .

The MV of OctalDigits :: OctalDigit is the MV of OctalDigit .

The MV of OctalDigits :: OctalDigits OctalDigit is (the MV of OctalDigits × 8) plus the MV of OctalDigit .

The MV of HexIntegerLiteral :: 0x HexDigits is the MV of HexDigits .

The MV of HexIntegerLiteral :: 0X HexDigits is the MV of HexDigits .

The MV of HexDigits :: HexDigit is the MV of HexDigit .

The MV of HexDigits :: HexDigits HexDigit is (the MV of HexDigits × 16) plus the MV of HexDigit.

Once the exact MV for a numeric literal has been determined, it is then rounded to a value of the Number type. If the MV is 0, then the rounded value is +0; otherwise, the rounded value must be the Number value for the MV (as specified in 6.1.6), unless the literal is a DecimalLiteral and the literal has more than 20 significant digits, in which case the Number value may be either the Number value for the MV of a literal produced by replacing each significant digit after the 20th with a 0 digit or the Number value for the MV of a literal produced by replacing each significant digit after the 20th with a 0 digit and then incrementing the literal at the 20th significant digit position. A digit is significant if it is not part of an ExponentPart and

it is not 0; or

there is a nonzero digit to its left and there is a nonzero digit, not in the ExponentPart, to its right.

11.8.4

NOTE 1 A string literal is zero or more Unicode code points enclosed in single or double quotes. Unicode code points may also be represented by an escape sequence. All code points may appear literally in a string literal except for the closing quote code points, U+005C (REVERSE SOLIDUS), U+000D (CARRIAGE RETURN), U+2028 (LINE SEPARATOR), U+2029 (PARAGRAPH SEPARATOR), and U+000A (LINE FEED). Any code points may appear in the form of an escape sequence. String literals evaluate to ECMAScript String values. When generating these String values Unicode code points are UTF-16 encoded as defined in 10.1.1. Code points belonging to the Basic Multilingual Plane are encoded as a single code unit element of the string. All other code points are encoded as two code unit elements of the string.

Syntax

StringLiteral ::
  " DoubleStringCharacters opt "
  ' SingleStringCharacters opt '

DoubleStringCharacters ::
  DoubleStringCharacter DoubleStringCharacters opt

SingleStringCharacters ::
  SingleStringCharacter SingleStringCharacters opt

DoubleStringCharacter ::
  SourceCharacter but not one of " or \ or LineTerminator
  \ EscapeSequence
  LineContinuation

SingleStringCharacter ::
  SourceCharacter but not one of ' or \ or LineTerminator
  \ EscapeSequence
  LineContinuation

LineContinuation ::
  \ LineTerminatorSequence

EscapeSequence ::
  CharacterEscapeSequence
  0 [lookahead ∉ DecimalDigit ]
  HexEscapeSequence
  UnicodeEscapeSequence

A conforming implementation, when processing strict mode code (see 10.2.1), must not extend the syntax of EscapeSequence to include LegacyOctalEscapeSequence as described in B.1.2.
CharacterEscapeSequence ::
  SingleEscapeCharacter
  NonEscapeCharacter

SingleEscapeCharacter :: one of ' " \ b f n r t v

NonEscapeCharacter ::
  SourceCharacter but not one of EscapeCharacter or LineTerminator

EscapeCharacter ::
  SingleEscapeCharacter
  DecimalDigit
  x
  u

HexEscapeSequence ::
  x HexDigit HexDigit

UnicodeEscapeSequence ::
  u Hex4Digits
  u{ HexDigits }

Hex4Digits ::
  HexDigit HexDigit HexDigit HexDigit

The definition of the nonterminal HexDigit is given in 11.8.3. SourceCharacter is defined in 10.1.

NOTE 2 A line terminator code point cannot appear in a string literal, except as part of a LineContinuation to produce the empty code points sequence. The proper way to cause a line terminator code point to be part of the String value of a string literal is to use an escape sequence such as \n or \u000A .

11.8.4.1

UnicodeEscapeSequence :: u{ HexDigits }

It is a Syntax Error if the MV of HexDigits > 1114111.

11.8.4.2 StringValue

See also: 11.6.1.2, 12.1.4.

StringLiteral ::
  " DoubleStringCharacters opt "
  ' SingleStringCharacters opt '

Return the String value whose elements are the SV of this StringLiteral.

11.8.4.3 SV

A string literal stands for a value of the String type. The String value (SV) of the literal is described in terms of code unit values contributed by the various parts of the string literal. As part of this process, some Unicode code points within the string literal are interpreted as having a mathematical value (MV), as described below or in 11.8.3.

The SV of StringLiteral :: "" is the empty code unit sequence.

The SV of StringLiteral :: '' is the empty code unit sequence.

The SV of StringLiteral :: " DoubleStringCharacters " is the SV of DoubleStringCharacters .

The SV of StringLiteral :: ' SingleStringCharacters ' is the SV of SingleStringCharacters .

The SV of DoubleStringCharacters :: DoubleStringCharacter is a sequence of one or two code units that is the SV of DoubleStringCharacter .

The SV of DoubleStringCharacters :: DoubleStringCharacter DoubleStringCharacters is a sequence of one or two code units that is the SV of DoubleStringCharacter followed by all the code units in the SV of DoubleStringCharacters in order.

The SV of SingleStringCharacters :: SingleStringCharacter is a sequence of one or two code units that is the SV of SingleStringCharacter .

The SV of SingleStringCharacters :: SingleStringCharacter SingleStringCharacters is a sequence of one or two code units that is the SV of SingleStringCharacter followed by all the code units in the SV of SingleStringCharacters in order.

The SV of DoubleStringCharacter :: SourceCharacter but not one of " or \ or LineTerminator is the UTF16Encoding (10.1.1) of the code point value of SourceCharacter .

The SV of DoubleStringCharacter :: \ EscapeSequence is the SV of the EscapeSequence .

The SV of DoubleStringCharacter :: LineContinuation is the empty code unit sequence.

The SV of SingleStringCharacter :: SourceCharacter but not one of ' or \ or LineTerminator is the UTF16Encoding (10.1.1) of the code point value of SourceCharacter .

The SV of SingleStringCharacter :: \ EscapeSequence is the SV of the EscapeSequence .

The SV of SingleStringCharacter :: LineContinuation is the empty code unit sequence.

The SV of EscapeSequence :: CharacterEscapeSequence is the SV of the CharacterEscapeSequence .

The SV of EscapeSequence :: 0 is the code unit value 0.

The SV of EscapeSequence :: HexEscapeSequence is the SV of the HexEscapeSequence .

The SV of EscapeSequence :: UnicodeEscapeSequence is the SV of the UnicodeEscapeSequence .

The SV of CharacterEscapeSequence :: SingleEscapeCharacter is the code unit whose value is determined by the SingleEscapeCharacter according to Table 34.

Table 34 — String Single Character Escape Sequences

Escape Sequence  Code Unit Value  Unicode Character Name  Symbol
\b               0x0008           BACKSPACE               <BS>
\t               0x0009           CHARACTER TABULATION    <HT>
\n               0x000A           LINE FEED (LF)          <LF>
\v               0x000B           LINE TABULATION         <VT>
\f               0x000C           FORM FEED (FF)          <FF>
\r               0x000D           CARRIAGE RETURN (CR)    <CR>
\"               0x0022           QUOTATION MARK          "
\'               0x0027           APOSTROPHE              '
\\               0x005C           REVERSE SOLIDUS         \

The SV of CharacterEscapeSequence :: NonEscapeCharacter is the SV of the NonEscapeCharacter .

The SV of NonEscapeCharacter :: SourceCharacter but not one of EscapeCharacter or LineTerminator is the UTF16Encoding (10.1.1) of the code point value of SourceCharacter .

The SV of HexEscapeSequence :: x HexDigit HexDigit is the code unit value that is (16 times the MV of the first HexDigit ) plus the MV of the second HexDigit .

The SV of UnicodeEscapeSequence :: u Hex4Digits is the SV of Hex4Digits.

The SV of Hex4Digits :: HexDigit HexDigit HexDigit HexDigit is the code unit value that is (4096 times the MV of the first HexDigit ) plus (256 times the MV of the second HexDigit ) plus (16 times the MV of the third HexDigit ) plus the MV of the fourth HexDigit .

The SV of UnicodeEscapeSequence :: u{ HexDigits } is the UTF16Encoding (10.1.1) of the MV of HexDigits. 11.8.5 NOTE 1 A regular expression literal is an input element that is converted to a RegExp object (see 21.2) each time the literal is evaluated. Two regular expression literals in a program evaluate to regular expression objects that never compare as === to each other even if the two literals' contents are identical. A RegExp object may also be created at runtime by new RegExp or calling the RegExp constructor as a function (see 21.2.3). The productions below describe the syntax for a regular expression literal and are used by the input element scanner to find the end of the regular expression literal. The source text comprising the RegularExpressionBody and the RegularExpressionFlags are subsequently parsed again using the more stringent ECMAScript Regular Expression grammar (21.2.1). An implementation may extend the ECMAScript Regular Expression grammar defined in 21.2.1, but it must not extend the RegularExpressionBody and RegularExpressionFlags productions defined below or the productions used by these productions. 
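The identity rule in NOTE 1 can be observed directly. This is a minimal sketch; the function name makePattern is hypothetical.

```javascript
// Each evaluation of a regular expression literal creates a new RegExp object.
function makePattern() {
  return /ab+c/g; // the literal is evaluated again on every call
}

const first = makePattern();
const second = makePattern();

console.log(first === second);               // false: two distinct objects
console.log(first.source === second.source); // true: identical contents
console.log(/x/ === /x/);                    // false

// A RegExp object may also be created at runtime, with or without "new".
const constructed = new RegExp("ab+c", "g");
const called = RegExp("ab+c", "g");
console.log(constructed.source === first.source); // true
console.log(called.global);                       // true
```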
Syntax RegularExpressionLiteral :: / RegularExpressionBody / RegularExpressionFlags RegularExpressionBody :: RegularExpressionFirstChar RegularExpressionChars RegularExpressionChars :: [empty] RegularExpressionChars RegularExpressionChar RegularExpressionFirstChar :: RegularExpressionNonTerminator but not one of * or \ or / or [ RegularExpressionBackslashSequence RegularExpressionClass RegularExpressionChar :: RegularExpressionNonTerminator but not one of \ or / or [ RegularExpressionBackslashSequence RegularExpressionClass RegularExpressionBackslashSequence :: \ RegularExpressionNonTerminator RegularExpressionNonTerminator :: SourceCharacter but not LineTerminator RegularExpressionClass :: [ RegularExpressionClassChars ] RegularExpressionClassChars :: [empty] RegularExpressionClassChars RegularExpressionClassChar RegularExpressionClassChar :: RegularExpressionNonTerminator but not one of ] or \ RegularExpressionBackslashSequence RegularExpressionFlags :: [empty] RegularExpressionFlags IdentifierPart NOTE 2 Regular expression literals may not be empty; instead of representing an empty regular expression literal, the code unit sequence // starts a single-line comment. To specify an empty regular expression, use: /(?:)/ . 11.8.5.1 RegularExpressionFlags :: RegularExpressionFlags IdentifierPart It is a Syntax Error if IdentifierPart contains a Unicode escape sequence . 11.8.5.2 BodyText RegularExpressionLiteral :: / RegularExpressionBody / RegularExpressionFlags Return the source text that was recognized as RegularExpressionBody. 11.8.5.3 FlagText RegularExpressionLiteral :: / RegularExpressionBody / RegularExpressionFlags Return the source text that was recognized as RegularExpressionFlags. 
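The empty-pattern advice in NOTE 2 and the split between body and flags can be tried from script code. This is an informal sketch: the source and flags properties of RegExp objects are used here only as a rough stand-in for BodyText and FlagText, and they are not guaranteed to reproduce the literal's source text character for character.

```javascript
// "//" would start a single-line comment, so the empty pattern is written /(?:)/.
const emptyPattern = /(?:)/;
console.log(emptyPattern.test(""));    // true
console.log(emptyPattern.test("abc")); // true: the empty pattern matches at index 0

// Body and flags of a literal, as seen through the resulting RegExp object.
const pattern = /ab+c/ig;
console.log(pattern.source); // "ab+c"
console.log(pattern.flags);  // "gi": the getter reports flags in a fixed order

// The constructor applied to an empty string also yields the (?:) form.
console.log(new RegExp("").source); // "(?:)"
```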
11.8.6 Syntax Template :: NoSubstitutionTemplate TemplateHead NoSubstitutionTemplate :: ` TemplateCharacters opt ` TemplateHead :: ` TemplateCharacters opt ${ TemplateSubstitutionTail :: TemplateMiddle TemplateTail TemplateMiddle :: } TemplateCharacters opt ${ TemplateTail :: } TemplateCharacters opt ` TemplateCharacters :: TemplateCharacter TemplateCharacters opt TemplateCharacter :: $ [lookahead ≠ { ] \ EscapeSequence LineContinuation LineTerminatorSequence SourceCharacter but not one of ` or \ or $ or LineTerminator A conforming implementation must not use the extended definition of EscapeSequence described in B.1.2 when parsing a TemplateCharacter. NOTE TemplateSubstitutionTail is used by the InputElementTemplateTail alternative lexical goal. 11.8.6.1 TV and TRV A template literal component is interpreted as a sequence of Unicode code points. The Template Value (TV) of a literal component is described in terms of code unit values (SV, 11.8.4) contributed by the various parts of the template literal component. As part of this process, some Unicode code points within the template component are interpreted as having a mathematical value (MV, 11.8.3). In determining a TV, escape sequences are replaced by the UTF-16 code unit(s) of the Unicode code point represented by the escape sequence. The Template Raw Value (TRV) is similar to a Template Value with the difference that in TRVs escape sequences are interpreted literally. The TV and TRV of NoSubstitutionTemplate :: `` is the empty code unit sequence.

The TV and TRV of TemplateHead :: `${ is the empty code unit sequence.

The TV and TRV of TemplateMiddle :: }${ is the empty code unit sequence.

The TV and TRV of TemplateTail :: }` is the empty code unit sequence.

The TV of NoSubstitutionTemplate :: ` TemplateCharacters ` is the TV of TemplateCharacters .

The TV of TemplateHead :: ` TemplateCharacters ${ is the TV of TemplateCharacters .

The TV of TemplateMiddle :: } TemplateCharacters ${ is the TV of TemplateCharacters .

The TV of TemplateTail :: } TemplateCharacters ` is the TV of TemplateCharacters .

The TV of TemplateCharacters :: TemplateCharacter is the TV of TemplateCharacter .

The TV of TemplateCharacters :: TemplateCharacter TemplateCharacters is a sequence consisting of the code units in the TV of TemplateCharacter followed by all the code units in the TV of TemplateCharacters in order.

The TV of TemplateCharacter :: SourceCharacter but not one of ` or \ or $ or LineTerminator is the UTF16Encoding (10.1.1) of the code point value of SourceCharacter .

The TV of TemplateCharacter :: $ is the code unit value 0x0024.

The TV of TemplateCharacter :: \ EscapeSequence is the SV of EscapeSequence .

The TV of TemplateCharacter :: LineContinuation is the TV of LineContinuation .

The TV of TemplateCharacter :: LineTerminatorSequence is the TRV of LineTerminatorSequence .

The TV of LineContinuation :: \ LineTerminatorSequence is the empty code unit sequence.

The TRV of NoSubstitutionTemplate :: ` TemplateCharacters ` is the TRV of TemplateCharacters .

The TRV of TemplateHead :: ` TemplateCharacters ${ is the TRV of TemplateCharacters .

The TRV of TemplateMiddle :: } TemplateCharacters ${ is the TRV of TemplateCharacters .

The TRV of TemplateTail :: } TemplateCharacters ` is the TRV of TemplateCharacters .

The TRV of TemplateCharacters :: TemplateCharacter is the TRV of TemplateCharacter .

The TRV of TemplateCharacters :: TemplateCharacter TemplateCharacters is a sequence consisting of the code units in the TRV of TemplateCharacter followed by all the code units in the TRV of TemplateCharacters, in order.

The TRV of TemplateCharacter :: SourceCharacter but not one of ` or \ or $ or LineTerminator is the UTF16Encoding (10.1.1) of the code point value of SourceCharacter .

The TRV of TemplateCharacter :: $ is the code unit value 0x0024.

The TRV of TemplateCharacter :: \ EscapeSequence is the sequence consisting of the code unit value 0x005C followed by the code units of TRV of EscapeSequence .

The TRV of TemplateCharacter :: LineContinuation is the TRV of LineContinuation .

The TRV of TemplateCharacter :: LineTerminatorSequence is the TRV of LineTerminatorSequence .

The TRV of EscapeSequence :: CharacterEscapeSequence is the TRV of the CharacterEscapeSequence .

The TRV of EscapeSequence :: 0 is the code unit value 0x0030.

The TRV of EscapeSequence :: HexEscapeSequence is the TRV of the HexEscapeSequence .

The TRV of EscapeSequence :: UnicodeEscapeSequence is the TRV of the UnicodeEscapeSequence .

The TRV of CharacterEscapeSequence :: SingleEscapeCharacter is the TRV of the SingleEscapeCharacter .

The TRV of CharacterEscapeSequence :: NonEscapeCharacter is the SV of the NonEscapeCharacter .

The TRV of SingleEscapeCharacter :: one of ' " \ b f n r t v is the SV of the SourceCharacter that is that single code point.

The TRV of HexEscapeSequence :: x HexDigit HexDigit is the sequence consisting of code unit value 0x0078 followed by TRV of the first HexDigit followed by the TRV of the second HexDigit .

The TRV of UnicodeEscapeSequence :: u Hex4Digits is the sequence consisting of code unit value 0x0075 followed by TRV of Hex4Digits.

The TRV of UnicodeEscapeSequence :: u{ HexDigits } is the sequence consisting of code unit value 0x0075 followed by code unit value 0x007B followed by TRV of HexDigits followed by code unit value 0x007D.

The TRV of Hex4Digits :: HexDigit HexDigit HexDigit HexDigit is the sequence consisting of the TRV of the first HexDigit followed by the TRV of the second HexDigit followed by the TRV of the third HexDigit followed by the TRV of the fourth HexDigit.

The TRV of HexDigits :: HexDigit is the TRV of HexDigit .

The TRV of HexDigits :: HexDigits HexDigit is the sequence consisting of TRV of HexDigits followed by TRV of HexDigit .

The TRV of a HexDigit is the SV of the SourceCharacter that is that HexDigit .

The TRV of LineContinuation :: \ LineTerminatorSequence is the sequence consisting of the code unit value 0x005C followed by the code units of TRV of LineTerminatorSequence .

The TRV of LineTerminatorSequence :: <LF> is the code unit value 0x000A.

The TRV of LineTerminatorSequence :: <CR> is the code unit value 0x000A.

The TRV of LineTerminatorSequence :: <LS> is the code unit value 0x2028.

The TRV of LineTerminatorSequence :: <PS> is the code unit value 0x2029.

The TRV of LineTerminatorSequence :: <CR><LF> is the sequence consisting of the code unit value 0x000A.

NOTE TV excludes the code units of LineContinuation while TRV includes them. <CR><LF> and <CR> LineTerminatorSequences are normalized to <LF> for both TV and TRV. An explicit EscapeSequence is needed to include a <CR> or <CR><LF> sequence.

11.9 Automatic Semicolon Insertion

Certain ECMAScript statements (empty statement, let, const, import, and export declarations, variable statement, expression statement, debugger statement, continue statement, break statement, return statement, and throw statement) must be terminated with semicolons. Such semicolons may always appear explicitly in the source text. For convenience, however, such semicolons may be omitted from the source text in certain situations. These situations are described by saying that semicolons are automatically inserted into the source code token stream in those situations.

11.9.1 Rules of Automatic Semicolon Insertion

In the following rules, “token” means the actual recognized lexical token determined using the current lexical goal symbol as described in clause 11.

There are three basic rules of semicolon insertion:

1. When, as a Script or Module is parsed from left to right, a token (called the offending token) is encountered that is not allowed by any production of the grammar, then a semicolon is automatically inserted before the offending token if one or more of the following conditions is true:

   - The offending token is separated from the previous token by at least one LineTerminator.

   - The offending token is }.

   - The previous token is ) and the inserted semicolon would then be parsed as the terminating semicolon of a do-while statement (13.7.2).

2. When, as the Script or Module is parsed from left to right, the end of the input stream of tokens is encountered and the parser is unable to parse the input token stream as a single complete ECMAScript Script or Module, then a semicolon is automatically inserted at the end of the input stream.

3. When, as the Script or Module is parsed from left to right, a token is encountered that is allowed by some production of the grammar, but the production is a restricted production and the token would be the first token for a terminal or nonterminal immediately following the annotation “[no LineTerminator here]” within the restricted production (and therefore such a token is called a restricted token), and the restricted token is separated from the previous token by at least one LineTerminator, then a semicolon is automatically inserted before the restricted token.

However, there is an additional overriding condition on the preceding rules: a semicolon is never inserted automatically if the semicolon would then be parsed as an empty statement or if that semicolon would become one of the two semicolons in the header of a for statement (see 13.7.4).
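The three rules and the overriding condition can be exercised informally by asking an implementation whether a piece of source text parses. The sketch below is illustrative only: the helper name parses is hypothetical, it assumes a host that permits new Function and indirect eval, and new Function parses its argument as a function body rather than as a Script or Module.

```javascript
// Hypothetical helper: true if `body` parses as a function body, false if it
// is rejected with a SyntaxError. new Function parses the text without running it.
function parses(body) {
  try {
    new Function(body);
    return true;
  } catch (e) {
    if (e instanceof SyntaxError) return false;
    throw e;
  }
}

// Rule 1: an offending token receives a semicolon only if it follows a
// LineTerminator, is }, or follows the ) of a do-while statement.
const rule1 = {
  sameLine: parses("var a = 1 var b = 2"),          // false: no condition holds
  newLine: parses("var a = 1\nvar b = 2"),          // true
  closingBrace: parses("{ var a = 1 }"),            // true
  doWhile: parses("do {} while (false) var a = 1"), // true
};

// Rule 2: a semicolon is inserted at the end of the input stream.
const rule2 = (0, eval)("1 + 1"); // 2; the text is parsed as a Script

// Rule 3: a restricted token after a LineTerminator receives a semicolon.
// "return <LF> 42" is parsed as "return; 42;".
const returned = new Function("return\n42")(); // undefined
// "a <LF> ++ <LF> b" is parsed as "a; ++b;".
const counters = new Function("var a = 1, b = 1\na\n++\nb\nreturn [a, b]")(); // [1, 2]

// Overriding condition: no insertion when the semicolon would be parsed as an
// empty statement or would become part of a for statement header.
const emptyStatement = parses("if (true)\nelse var a = 1"); // false
const forHeader = parses("for (var i = 0; i < 3\n) {}");    // false
```

The return case is the one most often met in practice: an expression placed on the line after return is never part of the return statement.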
NOTE The following are the only restricted productions in the grammar:

PostfixExpression[Yield] :
    LeftHandSideExpression[?Yield] [no LineTerminator here] ++
    LeftHandSideExpression[?Yield] [no LineTerminator here] --

ContinueStatement[Yield] :
    continue ;
    continue [no LineTerminator here] LabelIdentifier[?Yield] ;

BreakStatement[Yield] :
    break ;
    break [no LineTerminator here] LabelIdentifier[?Yield] ;

ReturnStatement[Yield] :
    return [no LineTerminator here] Expression ;
    return [no LineTerminator here] Expression[In, ?Yield] ;

ThrowStatement[Yield] :
    throw [no LineTerminator here] Expression[In, ?Yield] ;

ArrowFunction[In, Yield] :
    ArrowParameters[?Yield] [no LineTerminator here] => ConciseBody[?In]

YieldExpression[In] :
    yield [no LineTerminator here] * AssignmentExpression[?In, Yield]
    yield [no LineTerminator here] AssignmentExpression[?In, Yield]

The practical effect of these 