Bypassing a Slow Composer Repository

astorm


This week I had a long-standing to-do item I was looking forward to tackling:

Write a dead simple “Getting Started” tutorial for Magento/Composer/FireGento

FireGento are a group of European-centric Magento developers who run a hackathon, and cultivate several projects to help them all do their jobs. One of those projects is the ability to manage Magento modules via PHP Composer, and a corresponding Composer repository dedicated to those same Magento modules.

Unfortunately, when I set out to write my dead simple “Getting Started” tutorial, I ran into a major roadblock. Slowness abounded. PHP Composer was hanging after I ran every command. It was borderline unusable, and a far cry from how things should work.

Complicating things further, for all its revolutionizing of PHP package/dependency management, there aren’t many developers out there who understand Composer’s internals, and Composer’s documentation has that certain “we assume you’ve been coming to the meetings” feel about it.

Stack Overflow was no help, and I opened bugs in both the Composer and FireGento issue trackers with little activity or actionable advice.

So, instead of a dead simple “Getting Started” tutorial, I decided to write up how I’m working around the performance problems I’ve been having with Composer. While this tutorial specifically covers the FireGento repository, the concepts can apply to any Composer repository that’s suffering from network related latency.

The Problem

Composer repositories are lists of package names that link to git repositories (or subversion, mercurial, etc.). It works something like this:

Composer: Hey, repository? Do you know about package foo?

Repository: Yes, I do know about package foo.

Composer: Do you know where I can find package foo?

Repository: Yes, it’s at this git/subversion/mercurial/archive URI

Composer: Thanks repository!

Each repository has a packages.json file that contains a list of every package name with a URI for the package’s real location. You can see the FireGento packages here.

If you tried clicking on that link you may have seen the problem. There are over 3,000 unique packages in the FireGento repository, and over 11,000 if you count the different versions. That makes for a packages.json file that’s over 12 MB in size. The performance problem I ran into was this: every time Composer needed information in this file, it would download a new copy.

Surprisingly, Composer doesn’t appear to use gzip compression. A quick check of both FireGento’s packages.json and the packagist packages.json (the main Composer repository) revealed neither returned gzip encoded content. There was, however, one seemingly odd thing about the packagist packages.json. Despite being responsible for every general PHP package distributed by Composer, it’s tiny.

#https://packagist.org/packages.json
{
"packages": [],
"notify": "\/downloads\/%package%",
"notify-batch": "\/downloads\/",
"providers-url": "\/p\/%package%$%hash%.json",
"search": "\/search.json?q=%query%",
"provider-includes": {
    "p\/provider-active$%hash%.json": {
        "sha256": "05eeca5abedcc69d4bde0e4ec9da117ee6db21b82966a2880b92d715250bbff3"
    },
    "p\/provider-archived$%hash%.json": {
        "sha256": "e3c4ecbf14703c7c27715c722166bfc415de2bd3b6e362d202cde52f2c0e6486"
    },
    "p\/provider-latest$%hash%.json": {
        "sha256": "12394fec0600f586c0da9aa32ae50920fd9eee6bb3ec2335d0245d677c432c3f"
    },
    "p\/provider-stale$%hash%.json": {
        "sha256": "faab0b5a39c5becd7d6beda2ba118484e5f1dad4b46449cbf0bb3d41dee41296"
    }
}}
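
A gzip check like the one mentioned above can be approximated with curl. This is only a rough sketch under stated assumptions: the helper name is_gzip_encoded is hypothetical, it inspects whatever response headers it receives on standard input, and a server may answer a HEAD request differently than a full download.

```shell
# Hypothetical helper: succeeds if the HTTP response headers read from
# standard input declare a gzip Content-Encoding.
is_gzip_encoded() {
    grep -qi '^content-encoding:.*gzip'
}

# Example against a live server (assumes curl and network access):
#   curl -sI -H 'Accept-Encoding: gzip' http://packages.firegento.com/packages.json \
#       | is_gzip_encoded && echo "gzip" || echo "not gzip"
```

Keeping the header test separate from the network call makes it easy to point the same check at any repository URL.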

Doing some reading, it turns out Composer has an undocumented system for splitting packages.json into multiple files to avoid large file downloads. Because of this, it may be the Composer developers never saw the need to implement a caching system for the main packages.json, or to use gzip compression. Put another way, the lead developer may have specifically decided against a caching system to ensure the team was forced to implement a file splitting system that worked.

Regardless, as a third (or fourth?) party, I’m left with a FireGento packages.json that’s over 12 MB in size, and taking minutes to download each time. While that wouldn’t be a big deal for something that happens once in a great while, it means adding Composer packages to my project, or removing them, is a tedious, time-consuming affair.

Decentralized Repositories

While it’d be ideal if Composer had a better caching strategy, or FireGento had a better implementation of packages.json, the creators of Composer’s architecture have given us a way out. Composer’s repository system is decentralized. While packagist is pushed as the main PHP repository, Composer was very deliberately built to support anyone operating a repository. That’s why FireGento can create and manage their own Composer repository of Magento packages.

It also means we can host a repository on our local machine, and make that repository a mirror of the FireGento repository. Hosting the packages.json file on a local machine (or local network) means the download time for that single, enormous packages.json file will be dramatically reduced.

The simplest solution? We set up a website/domain on our local machine with a copy of packages.json, such that we can access it via the development URL http://packages.pulsestorm.dev/packages.json. The rough steps I took to do this were:

  1. Set up an Apache virtual host for a domain you control, or a fake development domain (packages.pulsestorm.dev for me)

  2. Set up DNS for that host to point to your computer (for example, add 127.0.0.1 packages.pulsestorm.dev to your hosts file)

  3. Download the FireGento packages.json to the root of your new virtual host. I grabbed it with the following curl command: curl -LO http://packages.firegento.com/packages.json
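
Steps 2 and 3 above can be sketched as a pair of small shell helpers. Treat this as an illustration rather than the exact setup used here: the function names add_hosts_entry and fetch_index and the document root /var/www/packages are assumptions, and the Apache virtual host from step 1 still has to be configured by hand.

```shell
# Placeholder values; adjust for your own machine.
MIRROR_HOST="packages.pulsestorm.dev"
DOCROOT="/var/www/packages"   # assumed document root of the virtual host

# Step 2: append a hosts entry only if the host name is not already listed.
add_hosts_entry() {
    hosts_file="$1"
    host_name="$2"
    if ! grep -qF -- "$host_name" "$hosts_file" 2>/dev/null; then
        printf '127.0.0.1 %s\n' "$host_name" >> "$hosts_file"
    fi
}

# Step 3: download the repository index into the given directory.
fetch_index() {
    (cd "$1" && curl -LO http://packages.firegento.com/packages.json)
}

# Typical use (needs root for /etc/hosts, plus network access):
#   add_hosts_entry /etc/hosts "$MIRROR_HOST"
#   fetch_index "$DOCROOT"
```

Making the hosts edit idempotent means the script can be re-run whenever you want a fresh copy of packages.json without piling up duplicate entries.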

With the above in place, let’s consider the performance differences. First, here’s a dead simple composer.json pointing to the official repository.

#File: composer.json
{
    "require": {
        "magento-hackathon/magento-composer-installer": "*"
    },
    "repositories": [
        {
            "type": "composer",
            "url": "http://packages.firegento.com"
        }
    ],    
    "extra":{
        "magento-root-dir": "magento/"
    }
}

If we run this with Composer’s --profile and -vvv verbose options, we see the performance problem. It takes Composer around 94 seconds to download the entire packages.json file.

$ composer.phar install --profile -vvv
[3.7MB/0.01s] Reading ./composer.json
...
[73.2MB/94.84s] Writing /path/to/.composer/cache/repo/http---packages.firegento.com/packages.json into cache

However, if we change the composer.json to point to our local mirror (the url field under the repositories field)

#File: composer.json
{
    "require": {
        "magento-hackathon/magento-composer-installer": "*"
    },
    "repositories": [
        {
            "type": "composer",
            "url": "http://packages.pulsestorm.dev"
        }
    ],    
    "extra":{
        "magento-root-dir": "magento/"
    }
}

we end up with dramatically different numbers.

$ composer.phar install --profile -vvv
[3.7MB/0.01s] Reading ./composer.json
...
[73.2MB/1.29s] Writing /path/to/.composer/cache/repo/http---packages.pulsestorm.dev/packages.json into cache

Here it only took Composer 1.29 seconds to download the repository file. A dramatic improvement, and one that makes the project usable again.
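
To put a number on the difference, the two profile lines quoted above can be compared directly. This is a small sketch: the helper name profile_seconds is hypothetical, and it assumes the bracketed prefix always has the memory/time shape shown in the output above.

```shell
# Hypothetical helper: pull the elapsed seconds out of a Composer
# --profile line such as "[73.2MB/94.84s] Writing ... into cache".
profile_seconds() {
    sed -n 's|^\[[^/]*/\([0-9.]*\)s].*|\1|p'
}

remote=$(echo '[73.2MB/94.84s] Writing /path/to/.composer/cache/repo/http---packages.firegento.com/packages.json into cache' | profile_seconds)
local_mirror=$(echo '[73.2MB/1.29s] Writing /path/to/.composer/cache/repo/http---packages.pulsestorm.dev/packages.json into cache' | profile_seconds)

# awk does the floating point division that plain sh arithmetic cannot.
awk -v r="$remote" -v l="$local_mirror" 'BEGIN { printf "speedup: %.1fx\n", r / l }'
# prints "speedup: 73.5x"
```

The same helper can be fed live output by piping a real composer.phar install --profile -vvv run through grep for the "into cache" line.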

Official Mirror

While curling the packages.json file down to a local machine scratches my personal nerdy-http itch, it does present one problem — staying up to date with changes to the official FireGento repository. If you want to create a more official/stable mirror, you’ll need to familiarize yourself with the FireGento composer-repository project.

This repository is what the FireGento team uses to manage http://packages.firegento.com. It has two branches. The gh-pages branch is the public repository website branch. If you clone this branch

git clone -b gh-pages git@github.com:magento-hackathon/composer-repository.git

you’ll end up with the full contents of the existing packages.firegento.com website.

In other words, a full mirror of the repository.

$ ls -1
CNAME
connect20
index.html
packages.json
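
One way to address the staying-up-to-date problem is to refresh such a clone periodically. The following is a sketch under assumptions, not something the FireGento project prescribes: the helper name refresh_mirror and the crontab path are placeholders, and it presumes the clone doubles as the document root of the local mirror.

```shell
# Hypothetical helper: fast-forward an existing gh-pages clone so the
# served packages.json matches the upstream repository.
refresh_mirror() {
    git -C "$1" pull --quiet --ff-only origin gh-pages
}

# Possible crontab entry (path is a placeholder), refreshing once an hour:
#   0 * * * * git -C /path/to/composer-repository pull --quiet --ff-only origin gh-pages
```

Using --ff-only keeps the mirror a plain copy of upstream: if the local clone has diverged, the pull fails instead of creating a merge.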

If you’re curious how this is generated, try checking out the master branch of this project:

git clone -b master git@github.com:magento-hackathon/composer-repository.git

When you take a look at the project’s master branch, you’ll see

$ ls -1
README.md
repairedConnectPackages
satis.json
script

This is a Satis project. Satis is a “Simple static Composer repository generator”. In other words, it lets users generate a packages.json for their own repository (using the metadata in satis.json), but lacks the logic that splits packages.json into multiple files like the official packagist repository. If you want to generate this file yourself, you’ll also need to clone the Satis repository and install it on your computer.

Wrap Up

The FireGento Composer repository, like so much of open source developer infrastructure these days, suffers from lack of resources. Make no mistake, it’s awesome that it exists, but a two day hackathon project is never going to take scaling, usability, and user-adoption concerns into account. On paper (or in bits), a one to two minute download for packages.json doesn’t seem like a big deal, but in practice it makes working with the project a horrible tedium that only the most dedicated/vested developers will stick around for.

For us, this is another great example of how knowing and understanding the underlying concepts behind our tools can make us much more efficient developers. Realizing that Composer isn’t magic, and being able to diagnose a problem without Google, means you can solve problems that third parties like FireGento haven’t gotten around to solving yet. As “open source” and software development become more and more fragmented, these skills will become more valuable than ever.

Originally published April 23, 2014