Async PHP: Symfony’s HttpClient 4.3

astorm

This entry is part 1 of 1 in the series Async PHP. This is the first post in the series.

One luxury afforded to the average PHP programmer is never having to worry about threads or asynchronous programming. Unlike other dynamic languages of its era (Ruby, Python, etc.), PHP has no built-in concept of a thread. PHP also came along early enough that “asynchronous programming” wasn’t really a thing, and the language wasn’t built with any asynchronous primitives. Those that were added later (like generators) haven’t seen widespread adoption. For most PHP use cases “async” programming has meant sending a message to a job queue and running work offline.

This is, however, starting to change. Programming environments like Node.js are making asynchronous programming techniques increasingly mainstream. This mainstream exposure has pushed other programming environments to explore async concepts. PHP is no exception.

In this new series we’ll explore how async concepts are leaking into PHP from the edges, explain how these APIs work, and describe the challenges involved in programming asynchronously in an environment that’s not designed for it.

Today we’ll start small and discuss the Symfony project’s HttpClient library and its asynchronous features.

Threads and Asynchronous Programming

Before we start, let’s talk about what we mean when we say asynchronous programming.

In its most primitive form, a program is a simple series of steps.

  1. Go to the store
  2. Get loaf of bread
  3. Get bottle of milk
  4. Get stick of butter
  5. Pay for groceries
  6. Go home

Most operating systems (and the programming languages used inside of them) have the concept of a thread. In oversimplified terms, a thread lets you run two sets of code inside a single computer program at once. Another way to describe this is running code in parallel.

If we were to describe threads in terms of the above steps, a thread might be an adult (the main thread) sending three of their kids (the child threads) to the store to get the three items — or it might be a parent going to the store with three children and sending each one to a different aisle for the three items.

Threads help programmers access the full processing power of their computer, but threaded programming is hard to get right. Two threads of execution are not guaranteed to execute identically every time. Once a thread is started there’s no direct way to control it; it’s up to the end-user-programmer to coordinate between threads through shared global state. One technique for this is called mutex locking. The unpredictability of threads creates the sort of hard-to-reproduce bugs and behavior that are the bane of any software project.

Newer languages and systems have tried to improve on threads or force end-user-programmers down threaded paths that are less unpredictable. One example of this is Go’s goroutines and channels. Another is Apple’s Grand Central Dispatch.

Asynchronous Programming vs. Threads

When we talk about asynchronous programming in this series, we’re talking about how single programs can execute in an order other than the order the code’s written in without directly using threads.

So while external job-queue systems could accurately be described as asynchronous programming, that’s not what we’re interested in today.
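To make that definition concrete, here is a minimal sketch that uses PHP’s generators to run two pieces of code in an interleaved order on a single thread. The task and runInterleaved functions are hypothetical names made up for this illustration. This is not how Symfony’s HttpClient works internally; it only shows code running in an order other than the one it was written in.

```php
<?php
// Each task is a generator. Every `yield` marks a point where the task
// is willing to pause and let something else run.
function task(string $name, int $steps): \Generator
{
    for ($i = 1; $i <= $steps; $i++) {
        yield "$name step $i";
    }
}

// A tiny round-robin scheduler: on each pass, advance every unfinished
// task by one step and record what it did.
function runInterleaved(array $tasks): array
{
    $order = [];
    while ($tasks) {
        foreach ($tasks as $key => $task) {
            if (!$task->valid()) {
                unset($tasks[$key]); // this task has finished
                continue;
            }
            $order[] = $task->current();
            $task->next();
        }
    }
    return $order;
}

$order = runInterleaved([task('A', 2), task('B', 3)]);
echo implode("\n", $order), "\n";
```

Running this prints A step 1, B step 1, A step 2, B step 2, B step 3, each on its own line. Neither task ran from start to finish before the other began, and no threads were involved.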

Synchronous HTTP Requests

Now that we’re constrained by a few definitions, let’s get started. We’re going to write a small PHP program that will

  1. Output some text
  2. Fetch the https://httpstat.us/200?sleep=5000 URL
  3. Output some more text

At the time of this writing httpstat.us is a publicly available but privately run web-service that lets you

  1. Make an HTTP request
  2. Tell that request what status code it should return
  3. And how long to sleep before returning it

The URL https://httpstat.us/200?sleep=5000 will sleep for 5 seconds and then return a 200 OK status code.

Let’s create the program to make that request

#File: working.php
<?php
namespace Pulsestorm\Tutorials\AsyncPhp;

function main($argv) {
    echo "About to make the request","\n";
    echo "...","\n";
    echo file_get_contents('https://httpstat.us/200?sleep=5000');
    echo "Made the request!";
}
main($argv);

and then run it.

$ php working.php
About to make the request
...
// 5 second pause ... not actually printed
Made the request!

Pay attention to the output of your program as it runs. You’ll notice that it paused for around 5 seconds after outputting .... That’s because your program was waiting for the HTTPS request to finish. You’ll get the same results if you try to use PHP’s curl_* functions

<?php
namespace Pulsestorm\Tutorials\AsyncPhp;

function downloadPage($path){
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL,$path);
    curl_setopt($ch, CURLOPT_FAILONERROR,1);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION,1);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER,1);
    curl_setopt($ch, CURLOPT_TIMEOUT, 15);
    $retValue = curl_exec($ch);
    curl_close($ch);
    return $retValue;
}

function main($argv) {
    echo "About to make the request","\n";
    echo "...","\n";
    $test = downloadPage('https://httpstat.us/200?sleep=5000');
    echo "Made the request!";
}
main($argv);

Both the file_get_contents URL wrappers and the curl_* functions operate synchronously. Once you make an HTTP request your program needs to wait until it finishes before moving on to the next line.

Symfony HttpClient

Now we’re going to compare the above to Symfony’s HttpClient library. We’ll use Composer to install the latest 4.3.x version of this library

$ composer require symfony/http-client:~4.3.0

and then rewrite our small program to use Symfony’s HTTP client.

#File: working.php
<?php
namespace Pulsestorm\Tutorials\AsyncPhp;
use Symfony\Component\HttpClient\HttpClient;
require 'vendor/autoload.php';

function main($argv, $httpClient) {
    echo "About to make the request","\n";

    $response = $httpClient->request('GET', 'https://httpstat.us/200?sleep=5000');

    echo "Made the request!","\n";
    echo "...","\n";
}
main($argv, HttpClient::create());

If we run this program, we’ll see the following output.

$ php working.php
About to make the request
Made the request!
...
// 5 second pause ... not actually printed

This time our program printed out About to make the request and Made the request! immediately. Unlike the curl and file_get_contents examples, our call to HttpClient’s request method returned immediately and appears to have made the request in the background. Our program still halted at the end to finish the request, but while our program was running it behaved like an async program would.
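One place this behavior could pay off is when a program needs several responses at once. The sketch below assumes $httpClient is the object returned by HttpClient::create(); fetchStatusCodes is a hypothetical helper name. It starts every request first and only then asks each response for its status code.

```php
<?php
// Hypothetical helper: $httpClient is expected to be whatever
// HttpClient::create() returns (anything with a compatible request() works).
function fetchStatusCodes($httpClient, array $urls): array
{
    $responses = [];
    foreach ($urls as $url) {
        // request() returns right away. Storing each response in the array
        // keeps it alive while the remaining requests are started.
        $responses[$url] = $httpClient->request('GET', $url);
    }

    $codes = [];
    foreach ($responses as $url => $response) {
        // getStatusCode() waits until this particular response is ready.
        $codes[$url] = $response->getStatusCode();
    }
    return $codes;
}
```

Called with a few slow URLs, this should take roughly as long as the slowest request rather than the sum of all of them, though the exact timing depends on the server and on which underlying client is in use.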

How is this possible in a language without async or threading primitives?

PHP is Just C

Readers of the There’s no Such Thing as PHP article may know where this is going. While PHP has no threads and (as of this writing) no async primitives, the PHP “runtime” is just a collection of compiled C code (called extensions). The distributed nature of how the PHP project is run means there isn’t even “one true PHP”; there are just different vendors and package maintainers putting together a version of PHP.

There’s nothing stopping a PHP extension developer from using a thread to build async or async-like behavior into their functions and classes. In fact, a number of extensions do this. The Symfony HttpClient authors have identified these hidden places where there’s async-like behavior in PHP and used them to build an HTTP client library that doesn’t block the main execution of your program.

Tracking down how the Symfony HttpClient library manages this, and how the C code that implements this works, is beyond the scope of this article. If you’re the curious sort, though, the two places to get started are the two client classes the library chooses between.

The Symfony HttpClient library will, by default, attempt to use the Symfony\Component\HttpClient\CurlHttpClient to make your HTTP request. If the curl_* functions are not available in your particular installation of PHP, HttpClient will fall back to using the Symfony\Component\HttpClient\NativeHttpClient class. This “native” client uses PHP’s stream context functionality to ensure the HTTP requests made are not synchronous.
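For a feel of the kind of async-like behavior the curl extension already exposes, here is a rough sketch built on PHP’s curl_multi_* functions, which let several transfers make progress without blocking on any single one. As far as I can tell this is the same family of functions CurlHttpClient builds on, but the code below is an illustration and not Symfony’s implementation. The fetchAll function name is hypothetical.

```php
<?php
// Hypothetical helper: start every transfer first, then drive them together.
// Returns the response bodies keyed the same way as $urls.
function fetchAll(array $urls): array
{
    $multi   = curl_multi_init();
    $handles = [];
    foreach ($urls as $key => $url) {
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_setopt($ch, CURLOPT_TIMEOUT, 15);
        curl_multi_add_handle($multi, $ch);
        $handles[$key] = $ch;
    }

    do {
        // Moves every transfer forward a little and returns without
        // waiting for any of them to complete.
        $status = curl_multi_exec($multi, $running);
        // A program could do unrelated work here between calls.
        if ($running > 0) {
            // Sleep until there is network activity, or for up to a second.
            curl_multi_select($multi, 1.0);
        }
    } while ($running > 0 && $status === CURLM_OK);

    $bodies = [];
    foreach ($handles as $key => $ch) {
        $bodies[$key] = curl_multi_getcontent($ch);
        curl_multi_remove_handle($multi, $ch);
    }
    return $bodies;
}
```

A call like fetchAll(['https://httpstat.us/200?sleep=5000', 'https://httpstat.us/200?sleep=5000']) starts both transfers before waiting on either one.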

Awkward APIs

While this is an interesting use of PHP, it’s not without its downsides. Consider the following program, which is very similar to the one we ran above.

<?php
namespace Pulsestorm\Tutorials\AsyncPhp;
use Symfony\Component\HttpClient\HttpClient;
require 'vendor/autoload.php';

function main($argv, $httpClient) {
    echo "About to make the request","\n";

    $httpClient->request('GET', 'https://httpstat.us/200?sleep=5000');

    echo "Made the request!","\n";
    echo "...","\n";
}
main($argv, HttpClient::create());

Run this program,

$ php working.php
About to make the request
// 5 second pause ... not actually printed
Made the request!
...

and you’ll see the 5 second pause happens between About to make the request and Made the request!. This tells us our program stopped and waited for the HTTP request to finish.

The problem is this line.

$httpClient->request('GET', 'https://httpstat.us/200?sleep=5000');

We never assigned the return value of the $httpClient->request call to a variable. With nothing holding a reference to it, PHP decided it was OK to free the object this method returns as soon as the statement finished, and Symfony’s HTTP client library made sure to finish the request before the object went away forever. So the request finishes, but finishes synchronously.
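We can reproduce the shape of this behavior without any network calls. The sketch below uses two hypothetical classes, EventLog and FakeResponse, made up for this illustration. FakeResponse stands in for the response object, and its destructor marks the point where a real client would block and finish the request.

```php
<?php
// Hypothetical helper that records events in the order they happen.
final class EventLog
{
    public $events = [];

    public function add(string $event): void
    {
        $this->events[] = $event;
    }
}

// Hypothetical stand-in for a lazy response object.
final class FakeResponse
{
    private $log;

    public function __construct(EventLog $log)
    {
        $this->log = $log;
        $this->log->add('request started');
    }

    public function __destruct()
    {
        // A real client would block here until the transfer completes.
        $this->log->add('request finished (destructor)');
    }
}

function startRequest(EventLog $log): FakeResponse
{
    return new FakeResponse($log);
}

function discarded(EventLog $log): void
{
    startRequest($log);              // return value dropped: destroyed right away
    $log->add('next line');
}

function assigned(EventLog $log): void
{
    $response = startRequest($log);  // reference kept until the function returns
    $log->add('next line');
}

$discardedLog = new EventLog();
discarded($discardedLog);
$assignedLog = new EventLog();
assigned($assignedLog);

echo implode(' -> ', $discardedLog->events), "\n";
echo implode(' -> ', $assignedLog->events), "\n";
```

The first line of output is request started -> request finished (destructor) -> next line, and the second is request started -> next line -> request finished (destructor). The only difference between the two functions is whether the return value was assigned to a variable.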

It’s an odd bug, and not the sort of thing you want to track down when all you’re trying to do is make some HTTP requests efficiently. It’s also inevitable when you’re trying to bring new concepts into a language that’s not designed for them.

To really make use of this asynchronous behavior, PHP programmers need an extension that will give them first-class asynchronous primitives. Our next task for this series? Find one of those extensions and give it a try.

Copyright © Alan Storm 1975 – 2020 All Rights Reserved

Originally Posted: 13th February 2020