hphp/hack/doc/HIPs/int_intish_shape_keys.md
Shape keys are currently required to be:
This proposal is to permit any arraykey literal, in addition to class
constants. More specifically:
Generally as already-present for Regex\Match; additionally, this will now be
possible:
// Takes any regex, returns the entire matched string
function helper<T as Regex\Pattern<shape(0 => string, ...)>>(
  T $pattern,
): string {
  return Regex\Match($pattern);
}
Currently, it can only be typed as Regex\Match, which is declared as
shape(...) and it is impossible to refine it or redeclare it to say that
0 (entire string) or any specific numbered captures are present.
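For comparison, a minimal sketch of what already works today, assuming the HSL Regex\first_match function and an re"" literal. The function name first_word is hypothetical. The typechecker accepts $match[0] here only because it infers the capture keys from the literal; that inferred type cannot be written out by hand.

```Hack
use namespace HH\Lib\Regex;

// Hypothetical helper: returns the first run of lowercase letters, if any.
function first_word(string $text): ?string {
  // The pattern is a literal, so the typechecker infers its capture keys.
  $match = Regex\first_match($text, re"/(?<word>[a-z]+)/");
  if ($match === null) {
    return null;
  }
  // Key 0 is the entire matched string; 'word' is the named capture.
  return $match[0];
}
```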
Already present for Regex\Match; no change expected.
Largely covered by 'motivation' above. No alternatives have been considered
for removing the ban on 'int-like' string keys; these are a PHPism.
The rest of this section addresses alternatives for actual int shape keys.
For example, $shape['0']. The main problem is potential future issues: PCRE
currently bans named capture groups that start with a number, but because the
syntax for referencing named groups differs from that for numbered groups, it
appears that PCRE could remove this restriction without breaking
compatibility. That would no longer hold if we introduced this syntax.
This would also have a minor usability and familiarity drawback, as it would differ from the representation used in all other languages.
e.g. ->getNameCapture(string $name), ->getPositionalCapture(int $idx)
This is the approach taken by most other languages/libraries that support named captures.
This would remove the need for any changes to shapes and tuples. However, to
maintain the same static safety that we currently have (i.e. we know which
named and positional captures are valid), these objects would in turn need
to be special-cased. For example, perhaps re"/(foo(?<bar>baz))/" is inferred
to be a RegexpPattern<tuple(string, string), shape('bar' => string)>. If
tuples are used as part of the generic, though, changes will be needed to
support subtyping.
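The object-based alternative could look roughly like the following sketch. Only the two method names come from the text above; the class name RegexMatch, its constructor and its backing containers are hypothetical and exist purely for illustration.

```Hack
// Hypothetical match object; not part of the HSL or of this proposal.
final class RegexMatch {
  public function __construct(
    private vec<string> $positional,
    private dict<string, string> $named,
  ) {}

  // Throws OutOfBoundsException at runtime if $idx is not a valid group.
  public function getPositionalCapture(int $idx): string {
    return $this->positional[$idx];
  }

  // Throws OutOfBoundsException at runtime if $name is not a valid group.
  public function getNameCapture(string $name): string {
    return $this->named[$name];
  }
}
```

Written this way, invalid names and indices are only caught at runtime, which is why the typechecker would need special-casing to keep the static safety described above.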
These natively support sequential integer keys; however, they have several drawbacks here:

| With int shape keys | With a tuple and a shape |
|---|---|
| shape(0 => string, 'foo' => string, ...) | ((string, ...), shape('foo' => string, ...)) |
| shape(1 => string, ...) | ((string, string, ...), shape(...)) |

Combined with the fact that all elements are the same type, this problem feels
like it would be better solved by bounded-size vecs - i.e. a vec<string> with
at least n elements - however, in the regexp case, the user normally cares
about presence of a specific n, not 0..=n, which is a problem already addressed
by shapes.