rfc:variable_syntax_tweaks

PHP RFC: Variable Syntax Tweaks

Date: 2020-01-07

Author: Nikita Popov nikic@php.net

Target Version: PHP 8.0

Status: Implemented

Implementation: https://github.com/php/php-src/pull/5061

Introduction

The Uniform Variable Syntax RFC resolved a number of inconsistencies in PHP's variable syntax. This RFC intends to address a small handful of cases that were overlooked.

Proposal

Terminology

PHP supports four principal types of “dereferencing” operations: Array: $foo[$bar] , $foo{$bar}

Object: $foo->bar , $foo->bar()

Static: Foo::$bar , Foo::bar() , Foo::BAR

Call: foo() A fully dereferencable construction is dereferencable by all four operations.

Interpolated and non-interpolated strings

Non-interpolated strings "foo" are currently considered fully dereferencable, i.e. constructions such as "foo"[0] or "foo"->bar() are considered legal (syntactically at least). However, interpolated strings "foo$bar" are non-dereferencable. This RFC proposed to treat both types of strings consistently, i.e. "foo$bar"[0] , "foo$bar"->baz() etc become legal. This inconsistency was originally reported in the context of scalar objects.

Constants and magic constants

Constants are currently array-dereferencable, that is FOO[0] is legal. However, magic constants are non-dereferencable. This RFC proposes to treat magic constants the same way as constants, that is, writing __FUNCTION__[0] etc becomes legal.

Constant dereferencability

Constants (and class constants) are currently special cased and only dereferencable under the [] operator (and do not even support the nominally equivalent {} operator). This RFC proposes to make constants (and class constants) fully array and object dereferencable, i.e. to allow both FOO{0} and FOO->length() (the latter only being relevant in conjunction with scalar objects). This makes the set of array-dereferencable and object-dereferencable operations strictly identical.

Class constant dereferencability

Currently static property accesses are array, object and static dereferencable. However, even with the change from the previous section, class constant accesses are only array and object dereferencable. This means that while Foo::$bar::$baz is legal, Foo::BAR::$baz is not. This RFC proposes to make class constant accesses static derefencable as well, so that Foo::BAR::$baz and Foo::BAR::BAZ become legal. It should be noted that the same is not possible for plain constants, as FOO::$bar already interprets FOO as a class name, not a constant name. Similarly, both types of constants cannot be call dereferenced, as FOO() is interpreted as a function name and Foo::BAR() as a static method call. This inconsistency was originally reported on Twitter.

Arbitrary expression support for new and instanceof

Most syntactic constructions that normally require an identifier/name also accept a syntax variation that allows an arbitrary expression. Depending on the case, this uses either {expr} or (expr) syntax. Places that accept a simple (non-namespaced) identifier use curly braces, such as ${expr} , $foo->{expr} , and FOO::{expr}() . Places that accept a namespaced name use parentheses, such as (expr)() and (expr)::FOO . These cases also accept more than just strings, in particular (expr)() additionally also allows other callables, and (expr)::FOO also allows objects. One place where arbitrary expressions are currently not supported are class names in new and instanceof (which are currently treated as syntactically the same). In line with the considerations above, this RFC proposes to introduce the syntax new (expr) and $x instanceof (expr) respectively.

Backward Incompatible Changes

There are not backwards incompatible changes, this RFC allows strictly more syntax than before.

Future Scope

It is in principle possible to relax the limitations on the right-hand-side of instanceof entirely: It could be treated as a normal expression, with plain constant accesses being reinterpreted as class name references instead (as this is the only ambiguity between them). While I think that this is the best option in principle, I am concerned that it will cause issues in the future if we introduce generic types, in which case a class name will no longer necessarily coincide with an ordinary expression.

Vote