Short and safe array iteration

Submitted by Larry on 22 October 2017 - 5:50am

One reason to follow development mailing lists is you sometimes pick up on some very neat tricks. Here's one that I spotted on the PHP Internals list recently to simplify array iteration in PHP 7.

PHP's largely loose, dynamic typing has plenty of both pros and cons. One con in particular is that you don't always know for sure if a value you're trying to use has been set yet, or is non-null. PHP will dutifully whine at you if you try to use a null value, sometimes fatally. (Yet another reason to structure your code to avoid nulls, period.)

One place this comes up in particular is in foreach() loops, especially when working with nested array structures. (PHP lacks a struct type, but makes anonymous hash maps so easy that they get used as the uber data type, for better or worse.) Consider the following:

<?php
foreach ($definition['keys'] as $id => $val) {
  // ...
}
?>

This not-at-all uncommon code has a problem: There's no guarantee that $definition['keys'] has been set. Ideally it has been, and if it hasn't there's a good chance there's a bug elsewhere, but the array could also come from user-supplied data, say, a YAML or JSON file. To be safe, you really ought to check it:

<?php
if (!empty($definition['keys'])) {
  foreach ($definition['keys'] as $id => $val) {
    // ...
  }
}
?>

That avoids any whining from PHP, but at the cost of more annoying boilerplate code. When you're parsing through a large and complex data structure (the aforementioned YAML or JSON data), that can add up to a lot of irritating extra code, especially if you also need to check even deeper array levels.

Fortunately, PHP 7 introduces a null-coalesce operator: ??. The ?? operator (pronounced "g'WHAT??") acts as a shorthand ternary for isset(). That is, the following two statements are equivalent.

<?php
$a = isset($b) ? $b : $default;
$a = $b ?? $default;
?>

Note that this is not quite the same as an is_null() check: is_null($b) complains if $b is undefined, whereas isset() and ?? quietly treat an undefined variable or array key the same as null. Which means $definition['keys'] ?? $default will evaluate to $default if there is no "keys" index.

Which in turn means that this works:

<?php
foreach ($definition['keys'] ?? [] as $id => $val) {
  // ...
}
?>

That is, if $definition['keys'] is null or not set at all, the expression evaluates to an empty array. foreach() knows how to iterate an empty array (by doing nothing), and thus the code simply skips the loop body entirely when "keys" is undefined. Just as we were trying to do in the longer version.

The same holds for a bare variable. This snippet raises no undefined-variable notice even if $definition is not set:

<?php
foreach ($definition ?? [] as $id => $val) {
  // ...
}
?>

It won't help in every situation, though: ?? only guards against null and unset values, so a value that is set but not iterable (a string, say) will still make foreach() complain. Favoring explicitly defined data structures over anonymous arrays is always a good idea when possible. But sometimes you just gotta iterate what you gotta iterate, and this is a neat trick to keep in your back pocket.
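
To make the nested-array case concrete, here is a minimal sketch. The $definition array and its 'keys', 'options', and 'schema' entries are made up for illustration (standing in for parsed YAML or JSON), and collectIds() is a hypothetical helper, not part of any library.

```php
<?php
// Hypothetical parsed configuration: 'keys' is present, 'indexes' is missing,
// and 'options' is explicitly null.
$definition = [
    'keys' => ['id' => 'int', 'name' => 'string'],
    'options' => null,
];

/**
 * Collects the IDs found under one section of a definition.
 * A missing or null section is treated as an empty list.
 */
function collectIds(array $definition, string $section): array
{
    $ids = [];
    foreach ($definition[$section] ?? [] as $id => $val) {
        $ids[] = $id;
    }
    return $ids;
}

$fromKeys = collectIds($definition, 'keys');       // section present
$fromIndexes = collectIds($definition, 'indexes'); // section missing
$fromOptions = collectIds($definition, 'options'); // section null

// The same guard reaches into deeper levels without intermediate checks.
$deep = [];
foreach ($definition['schema']['columns'] ?? [] as $name => $column) {
    $deep[] = $name;
}

echo implode(', ', $fromKeys), "\n"; // prints "id, name"
?>
```

None of the three calls, nor the nested loop, emits a notice; only the loop over the 'keys' section ever runs its body. A section that is set to something other than an array would still need its own check.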

A major hat tip to Niklas Keller in this comment, who tipped me off to this technique.

The @ operator suppresses all error *display*, but not error *generation*.  It still causes error handlers to fire, while hiding actual errors.  It's pretty much never the right way to do anything, unless you're dealing with a badly written API (such as PHP's LDAP library).

What your code sample does is actively ignore all undefined and type errors.  That is wrong 99% of the time. The code in the article properly handles type and undefined issues in a safe manner. That's right 99% of the time.

Please, no one use @, ever.
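
To see the difference between hiding an error and never generating one, here is a minimal sketch. The $seen log and the empty $definition array are made up for illustration; set_error_handler() and restore_error_handler() are the standard PHP functions.

```php
<?php
// Hypothetical log of every message the custom handler receives.
$seen = [];

set_error_handler(function (int $errno, string $errstr) use (&$seen): bool {
    $seen[] = $errstr;
    return true; // Mark the error as handled so PHP's built-in handler stays quiet.
});

$definition = [];

// @ hides the diagnostic from display, but the handler above is still invoked.
$keys = @$definition['keys'];

// ?? does not generate a diagnostic in the first place.
$safe = $definition['keys'] ?? [];

restore_error_handler();

echo count($seen), "\n";
?>
```

Here the handler records exactly one message, from the @ line; the ?? line generates nothing for it to record.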

r-j (not verified)

24 October 2017 - 6:20am

I don't see any advantages of the short form over the long code (if (!empty...) other than that it is shorter, but I have learned not to over-optimize for the sake of readability.
What I like is that it avoids another level of indentation, which might be taken as an advantage over the "longer" code, but that's it.

Looking at the short form, I read "if null then loop over empty array", which leaves me confused. Why should looping over an empty array be default behaviour? Every line of code costs time and if there is some code executed, there should be a reason for it. In this case, the developer already considered that there might not be any $definition['keys'] (wouldn't $definitions['key'] make more sense?) set, but instead of handling that properly and skipping further processing of that nonexistent value, a default value is passed to the foreach loop in order to trick the loop into doing what the developer was too lazy to implement.

Looking at the long code, I read "if not empty then foreach". That is precise. When reading that code I know exactly what is happening. No guessing, no wondering. It is not only much easier to understand, it is also more responsible with resources by not wasting time on a superfluous loop. It does a) what it says and b) what it has to do. Nothing more, nothing less.

Coming back to indentation, the only advantage of the shortened code, I must say that I happily accept the disadvantage of more whitespace in code I want to read, if the code is expressive and not tuned for shortness.

Oleh (not verified)

26 October 2017 - 10:53am

The last example works well without any notices, because ?? is equivalent to
<?php !empty($var) ? $var : $somethingElse ?>
and not to what you wrote (is_null).
But, anyways, thanks for the idea how to use it.