PHP Generators From Scratch

astorm

Generators landed back in PHP 5.5 and I’ve mostly ignored them. I had a vague understanding that they were a feature that allowed you to build iterators that didn’t require loading up a huge data structure with all your information. This also seemed to be the gist of most online generator tutorials. So, in the practical world of business programming where jamming everything into a giant PHP array is usually good enough, there wasn’t much of a need to understand generators.

So imagine my surprise when I discovered that generators are actually an alternative to linear code flow. Or maybe you don’t need to imagine any surprise and are thinking

Alternative to linear code flow — what does that even mean?

Today we’re going to cover generators “from scratch”. By the end of this article you should be able to reason about any generator function in PHP and understand the flow of code when a generator is invoked.

Generator Functions

For now, pretend we didn’t tell you that generators are for building iterators.

New Definition: A generator function is a special type of function in PHP that always returns a Generator object. Generator function definitions look like regular function definitions, with one exception: instead of relying on the return keyword, they use the yield keyword. Here’s a simple example program that demonstrates this.

#File: generator-example.php
<?php
function myGeneratorFunction()
{
    yield;
}

$returnValue = myGeneratorFunction();
echo get_class($returnValue),"\n";

This program defines a function named myGeneratorFunction. This function doesn’t have a return statement, but it does include the keyword yield. (We’re not quite ready to explain what yield does, but if you like to read ahead, it’s similar to the return keyword. We’ll get to those details momentarily.)

Next, we call myGeneratorFunction, and assign its return value to a variable named (appropriately) $returnValue. Finally, we pass $returnValue into the get_class function and echo the output.

If you aren’t familiar with generators, you might expect $returnValue to contain the value null. After all, myGeneratorFunction didn’t return anything. However, if you run the above program, you’ll see our function returned an object instantiated from the built-in Generator class.

$ php generator-example.php
Generator

Although our function is defined with the regular old function keyword, PHP’s internals treat it differently because the function includes the yield keyword. PHP will always treat a function that includes the yield keyword as a generator function, and a generator function will always return a Generator object.
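
One way to convince yourself of this rule is to write a function whose yield can never be reached. The sketch below uses a hypothetical function name, neverYields, which isn’t part of the examples in this article. The mere presence of the keyword should still be enough for PHP to hand back a Generator object.

```php
<?php
// Hypothetical example: the yield below can never run, but its
// presence alone makes this a generator function.
function neverYields()
{
    if (false) {
        yield;
    }
}

$returnValue = neverYields();
var_dump($returnValue instanceof Generator); // bool(true)
```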

Yield and Program Flow

Generator objects are PHP iterators. If you haven’t used iterators before, they are (from one point of view) objects that let you loop over a set of values one at a time. This sample program demonstrates the built-in array iterator.

#File: generator-example.php
<?php
$values = [1,2,3,4,5];
// using foreach
foreach($values as $number) {
    echo $number, "\n";
}

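// using an iterator
$iterator = new ArrayIterator($values);
// valid() is the safe loop condition: testing current() directly
// would end the loop early on a falsy value such as 0
while($iterator->valid()) {
    $number = $iterator->current();
    echo $number, "\n";
    $iterator->next();
}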

In PHP an array iterator is a bit more verbose than a foreach statement. Syntactic sugar is popular when it creates less code, so iterators aren’t often used in day-to-day PHP code.

However, PHP also includes a special built-in Iterator interface. This interface allows end-user-programmers (you!) to define their own objects with rules for how a set of data is traversed, and you can use these objects as iterators or directly in PHP’s foreach loops. If you’ve ever used a collection object in a framework like Magento, under the hood these collections implement one of PHP’s built-in traversal interfaces (Iterator or its sibling IteratorAggregate).
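
To make the class-based approach concrete, here’s a minimal sketch of a hand-written iterator. The Countdown class is a hypothetical example, not something from Magento or PHP itself, and the code assumes PHP 8.0 or later (for the mixed return type and constructor property promotion). The five methods are the ones the Iterator interface requires.

```php
<?php
// Hypothetical example class; assumes PHP 8.0+.
final class Countdown implements Iterator
{
    private int $current;

    public function __construct(private int $start)
    {
        $this->current = $start;
    }

    // Called once when a loop begins
    public function rewind(): void { $this->current = $this->start; }

    // Called before each pass to decide whether to keep looping
    public function valid(): bool { return $this->current > 0; }

    // The value and key handed to the loop body
    public function current(): mixed { return $this->current; }
    public function key(): mixed { return $this->start - $this->current; }

    // Called after each pass to move forward
    public function next(): void { $this->current--; }
}

foreach (new Countdown(3) as $position => $number) {
    echo $position, ' => ', $number, "\n";
}
// 0 => 3
// 1 => 2
// 2 => 1
```

That’s a fair amount of ceremony for a simple loop, which is part of why hand-written iterators tend to stay inside frameworks and libraries.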

Generators are another special type of iterator object. However, instead of relying on a defined class for their functionality, they rely on generator functions and the special properties of the yield keyword.
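
If you’d like to check that claim yourself, the following short sketch asks PHP directly whether a generator object satisfies the iterator interfaces. The letters function is a hypothetical example name.

```php
<?php
// Hypothetical generator function used only for this check.
function letters()
{
    yield 'a';
    yield 'b';
}

$generator = letters();
var_dump($generator instanceof Iterator);    // bool(true)
var_dump($generator instanceof Traversable); // bool(true)
```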

Yield as Return

In PHP, the yield keyword tells PHP to pause the current function execution and hand a value to the generator/iterator object. The function body doesn’t start running until the generator is first used (for example, the first time the generator’s current method is called), and it then runs until it reaches a yield. When an end-user-programmer calls the generator object’s next method, PHP will return to the generator function and continue execution immediately after the point where it yielded.

If you’re a little confused by that, don’t worry. It involves breaking a bunch of base assumptions about how PHP code flows. This quick test program should help clear things up.

#File: generator-example.php
<?php
function myGeneratorFunction()
{
    echo "One","\n";
    yield;

    echo "Two","\n";
    yield;

    echo "Three","\n";
    yield;
}

// get our Generator object (remember, all generator functions return
// a generator object, and a generator function is any function that
// uses the yield keyword)
$iterator = myGeneratorFunction();

// get the current value of the iterator
$value = $iterator->current();

// advance the iterator to the next yield
// (next() itself returns nothing)
// $value = $iterator->next();

// and advance it again, to the yield after that
// $value = $iterator->next();

The output of this first program will be

$ php generator-example.php
One

When we called current on the iterator object, PHP began executing the code in the myGeneratorFunction function, and stopped when it reached the first yield.

You probably noticed a few lines commented at the bottom of our test program. If we uncomment the first call to next

#File: generator-example.php
<?php
function myGeneratorFunction()
{
    echo "One","\n";
    yield;

    echo "Two","\n";
    yield;

    echo "Three","\n";
    yield;
}

// get our Generator object (remember, all generator functions return
// a generator object, and a generator function is any function that
// uses the yield keyword)
$iterator = myGeneratorFunction();

// get the current value of the iterator
$value = $iterator->current();

// advance the iterator to the next yield
// (next() itself returns nothing)
$value = $iterator->next();

// and advance it again, to the yield after that
// $value = $iterator->next();

we’ll see the following output

$ php generator-example.php
One
Two

When we called next, PHP resumed executing myGeneratorFunction at the point it had previously stopped. That’s what we mean when we say the yield keyword pauses the function. Uncomment the last $value = $iterator->next(); and you’ll see that execution resumes after the second yield.

So that explains yield’s power to pause a function. But what about when we said

[The yield keyword is] similar to the return keyword

Here’s another sample program that demonstrates this.

#File: generator-example.php
<?php

function myGeneratorFunction()
{
    echo "One","\n";
    yield 'first return value';

    echo "Two","\n";
    yield 'second return value';

    echo "Three","\n";
    yield 'third return value';
}

// get our Generator object (remember, all generator functions return
// a generator object, and a generator function is any function that
// uses the yield keyword)
$iterator = myGeneratorFunction();

// get the current value of the iterator
$value = $iterator->current();
echo 'The value returned: ', $value, "\n";

// advance to the next yield, then get its value
$iterator->next();
$value = $iterator->current();
echo 'The value returned: ', $value, "\n";

// and do the same for the yield after that
$iterator->next();
$value = $iterator->current();
echo 'The value returned: ', $value, "\n";

Run this program and you’ll see the following output.

$ php generator-example.php
One
The value returned: first return value
Two
The value returned: second return value
Three
The value returned: third return value

This program is very similar to our first, with two exceptions

  1. We’ve included string values after the yield keywords (yield "a string value";)
  2. After calling next on the iterator object, we fetch the iterator’s current value with the current method

In addition to pausing a function, the yield keyword also hands back a value that the generator/iterator object will use as its current value.
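
One consequence worth seeing for yourself: current only reports the value sitting at the paused yield. Calling it twice in a row shouldn’t run any more of the function. Here’s a small sketch, using a hypothetical greetings function.

```php
<?php
// Hypothetical generator function for this sketch.
function greetings()
{
    echo "Running","\n";
    yield 'hello';

    echo "Resumed","\n";
    yield 'goodbye';
}

$generator = greetings();

// runs the function up to the first yield (prints "Running")
$first = $generator->current();

// no code runs this time; we get the same value back
$firstAgain = $generator->current();

// only next moves us forward (prints "Resumed")
$generator->next();
$second = $generator->current();

echo $first, "\n";      // hello
echo $firstAgain, "\n"; // hello
echo $second, "\n";     // goodbye
```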

All this next/current business may seem verbose. Don’t forget that PHP knows how to handle an iterator in a foreach loop. Give the following program a try

#File: generator-example.php
<?php

function myGeneratorFunction()
{
    yield 'first return value';
    yield 'second return value';
    yield 'third return value';
}

$generator = myGeneratorFunction();
foreach($generator as $value) {
    echo 'My Value Is: ', $value, "\n";
}

Run it, and you’ll get the following output

$ php generator-example.php
My Value Is: first return value
My Value Is: second return value
My Value Is: third return value

Under the hood, when you use an iterator object in a foreach loop, PHP is making the same calls to that iterator’s next and current methods (plus rewind to start the loop, valid to detect the end, and key when the loop asks for keys).
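
If you’re curious what that looks like written out by hand, here’s a rough sketch of the foreach loop above expressed as explicit method calls. It’s an approximation of what PHP does for us, not a claim about the engine’s exact internals.

```php
<?php
function myGeneratorFunction()
{
    yield 'first return value';
    yield 'second return value';
    yield 'third return value';
}

$generator = myGeneratorFunction();
$seen = [];

// rewind starts the generator, valid tells us when it has finished
$generator->rewind();
while ($generator->valid()) {
    $value = $generator->current();
    echo 'My Value Is: ', $value, "\n";
    $seen[] = $value;

    // resume the function until the next yield (or until it ends)
    $generator->next();
}
```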

Pausing State

So far we’ve discussed generators and yield as though they were just a fancy version of the goto statement. There’s one key piece of information we’ve left out. When you yield inside a generator function and return control to the other part of your program, PHP pauses everything about that function. This includes the state of any variables inside the generator function.

The implications of that might not be immediately obvious. Let’s use the classic generator example (reimplementing the range function) to demonstrate them.

#File: generator-example.php
<?php

#1. Define a generator function
function generator_range($min, $max)
{
    #3b. Start executing once `current` has been called
    for($i=$min;$i<=$max;$i++) {
        echo "Starting Loop","\n";
        yield $i;   #3c. Pause and return execution to the main program
                    #4c. Pause and return execution to the main program again
        #4b. Resume execution here when `next` is called
        echo "Ending Loop","\n";
    }
}

#2. Call the generator function
$generator = generator_range(1, 5);

#3a. Call the `current` method on the generator object
echo $generator->current(), "\n";

#4a. Call `next` on the generator object to resume execution
$generator->next();

#5. Echo out the value yielded during the call to `next` above
echo $generator->current(), "\n";

// give this a try when you have some free time
// foreach(generator_range(1, 5) as $value) {
//    echo $value, "\n";
// }

If we run this program we’ll see the following output.

Starting Loop
1
Ending Loop
Starting Loop
2

In plain English, this program

  1. Defines a generator function
  2. Calls that generator function to get a generator object
  3. Starts executing the generator function when the program calls current, which yields a value
  4. Returns to the generator function when we call $generator->next() and makes another trip through the loop until yield is reached again
  5. Echoes out the value of the second yield when we call current again

The most important step is #4. When we call next and return execution to the generator function, the values of $i, $min, and $max are all the same as when we left the function in step #3. PHP held on to these values when it paused the function. That’s the magic of generators, and it’s what allows them to be more memory efficient than building and storing a full set of values in an array: only the current value and the function’s paused state need to exist at any one time.
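
Here’s a rough way to see the memory difference on your own machine. This sketch uses a hypothetical quiet_range function (our generator without the echo statements) and PHP’s memory_get_usage function. The exact byte counts will vary by PHP version and platform, so treat the numbers as an illustration and not a benchmark.

```php
<?php
// Same shape as generator_range above, minus the echo statements.
function quiet_range($min, $max)
{
    for ($i = $min; $i <= $max; $i++) {
        yield $i;
    }
}

// Build every value up front with PHP's built-in range function.
$before = memory_get_usage();
$array = range(1, 100000);
$arrayBytes = memory_get_usage() - $before;

// Create a generator that will produce the same values on demand.
$before = memory_get_usage();
$generator = quiet_range(1, 100000);
$generatorBytes = memory_get_usage() - $before;

echo 'Array:     ', $arrayBytes, " bytes\n";
echo 'Generator: ', $generatorBytes, " bytes\n";

// Both approaches still visit the same values.
$total = 0;
foreach ($generator as $number) {
    $total += $number;
}
echo 'Totals match: ', var_export($total === array_sum($array), true), "\n";
```

The array has to hold all 100,000 integers at once, while the generator only needs to remember $i, $min, and $max between trips through the loop.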

Wrap Up

There’s a lot more to learn about generators. Here are just a few topics

  1. The yield from statement, which lets a generator delegate to another generator (or to any array or Traversable)
  2. Sending values and throwing exceptions back into the generator function
  3. The effect of a return statement inside a generator function

but I think we’ll wrap things up here today. The main thing I wanted to get across, which I think too many generator articles skip, is how generator code flow actually works. Once you understand that, generators become just another piece of code to reason about.
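
If you’d like a small taste of those three features before moving on, here’s a brief sketch. The function names (inner, outer, receiver) are hypothetical, the code assumes PHP 7.0 or later, and it only scratches the surface of each feature.

```php
<?php
// Hypothetical helper names throughout; assumes PHP 7.0 or later.
function inner()
{
    yield 1;
    yield 2;
    return 'inner finished';      // becomes the value of `yield from`
}

function outer()
{
    $result = yield from inner(); // delegate to another generator
    yield 3;
    return $result;               // retrievable later via getReturn()
}

$delegating = outer();
$values = [];
foreach ($delegating as $value) {
    $values[] = $value;
}
echo implode(',', $values), "\n";     // 1,2,3
echo $delegating->getReturn(), "\n";  // inner finished

function receiver()
{
    $received = yield 'ready';    // a sent value becomes the result of yield
    yield "got: $received";
}

$receiving = receiver();
$receiving->current();                // run to the first yield
echo $receiving->send('hello'), "\n"; // got: hello
```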

In all honesty, you can probably get by as a PHP programmer without ever touching generators. In practice, when they’re used by other developers, it tends to be behind the scenes and transparent to end users of a library or API. However, as PHP starts to evolve towards providing support for asynchronous programming features, you’ll be hearing a lot more about generators. Generators are an example of a coroutine, and when combined with asynchronous PHP frameworks like React (the PHP project, not the UI framework of the same name) they unlock a lot of new, powerful programming metaphors and techniques.

Copyright © Alan Storm 1975 – 2018 All Rights Reserved

Originally Posted: 5th November 2018