In Depth Magento Dispatch: Rewrites

astorm

We’re in the middle of a series covering the various Magento sub-systems responsible for routing a URL to a particular code entry point. So far we’ve covered the general front controller architecture and four router objects that ship with the system. Today we’ll be covering Magento’s two rewrite systems. You may want to bone up on parts one, two, and three before continuing.

Notice: While the specifics of this article refer to the 1.6 branch of Magento Community Edition, the general concepts apply to all versions.

Rewrite, Rewrite, Rewrite

Step one to understanding Magento’s request rewrites is to understand what they’re not. For a working web developer there are two other things called rewrites that have very little to do with Magento’s request rewrites and will often confuse conversations about the system.

First, there’s plain old Apache (or nginx, etc.) URL rewriting. At one time, working knowledge of how Apache handled rewriting URLs (via mod_alias, mod_rewrite, and their ilk) was required knowledge for a working web developer. It’s become less important for web application developers to understand their intricacies, as most PHP frameworks rewrite every request to index.php and then programmatically parse out the URL.

RewriteRule .* index.php [L]

Interestingly, this change in methodology has been slower to happen in ecommerce. You’ll still find lots of IT and SEO centric professionals who swear by an .htaccess file full of hard coded rewrites.

Magento’s request rewrite system is often used for similar purposes as the web server based rewrite system, but it’s a completely separate system.

The second system that’s often confused with Magento’s request rewrite system is Magento’s infamous class rewrite system. As any new convert quickly becomes aware, Magento allows you to swap in your own classes for Magento model, helper, and block classes. Part of this process includes using a <rewrite> tag in Magento’s merged config.xml tree. As we’ll learn later, request rewrites may also involve using a <rewrite> tag, but in a completely different context. Extra confusingly, Magento doesn’t allow class rewrites of controller classes, but you can achieve similar results using the request rewrite system.

What is a Request Rewrite

Now that you know what a request rewrite isn’t, we can tell you what a request rewrite is.

The purpose of a Magento request rewrite is to change the request object prior to looping over the various router objects such that the request is interpreted differently. Essentially, a Magento request rewrite allows you to change the path information seen by the router objects, which in turn means you can use the rewrite system to send a URL that would eventually be routed one place (a 404 page) to somewhere else (a product landing page).

If that’s a little esoteric, let’s consider a stock Magento system with the sample data installed. When you access the following URL

http://magento.example.com/electronics/cameras/accessories/universal-camera-case.html

Magento will apply a request rewrite such that the path information extracted in a router’s match method looks like this

catalog/product/view/id/133/category/25

instead of

electronics/cameras/accessories/universal-camera-case.html  

So, rather than returning a 404 page for a request to the electronics module, cameras controller and accessories action, the request is rewritten to the product controller’s view action in the catalog module.

That’s the core mission of a rewrite. In turn, this behavior is used for a variety of things, including the aforementioned SEO, as well as an earlier version of “controller overriding” which proved vexing enough that the core team added support for the <modules> tag, covered previously. Also, and unsurprisingly, as we dive deeper into the systems we’ll see where it has picked up the unavoidable feature related cruft of agile startup development.

Where are Rewrites Applied

Earlier articles in this series covered the main routing foreach loop in the front controller object’s dispatch method.

#File: app/code/core/Mage/Core/Controller/Varien/Front.php
while (!$request->isDispatched() && $i++<100) {
    foreach ($this->_routers as $router) {
        if ($router->match($this->getRequest())) {
            break;
        }
    }
}

If you look above that block of code in Front.php, you’ll see where Magento applies its rewrites.

#File: app/code/core/Mage/Core/Controller/Varien/Front.php
if (!$request->isStraight()) {
    Mage::getModel('core/url_rewrite')->rewrite();
}
$this->rewrite();

We’ve removed the calls to Varien_Profiler to make the flow clearer. As you can see, there are two separate method calls related to rewrites. That’s because there are two different request rewrite systems in Magento. The first rewrite system is based on a set of rewrite rules located in the Magento database/model-layer, accessed by the core/url_rewrite model class. The second rewrite system is based on a set of rewrite rules added to the combined config.xml tree.

While each of these rewrite systems has a similar aim (allowing users to change the request object’s path information before routing) each system achieves these goals in a different way. This is another one of those situations where the abundance of options makes Magento more confusing to newcomers, and confuses conversation about the system online. Open up your Magento toolkit, pull out your deep breaths, and remember it’s all just code.

Model Based Rewrites

We’re going to consider the model based rewrites first.

#File: app/code/core/Mage/Core/Controller/Varien/Front.php
if (!$request->isStraight()) {
    Mage::getModel('core/url_rewrite')->rewrite();
}

This system is kicked off by instantiating a core/url_rewrite model, and then calling its rewrite method. The rewrite method will examine the request, compare it to a list of rewrites in the database and, if appropriate, change the path information. Before we get to that, let’s consider the conditional it’s wrapped in

#File: app/code/core/Mage/Core/Controller/Varien/Front.php    
if (!$request->isStraight()) {

If we take a look at the isStraight method definition, we see it’s a simple property setter/getter

#File: app/code/core/Mage/Core/Controller/Request/Http.php
public function isStraight($flag = null)
{
    if ($flag !== null) {
        $this->_isStraight = $flag;
    }
    return $this->_isStraight;
}

Before we get to the intent of this method, it’s worth noting the pattern used here. This method is both the getter and setter for the _isStraight property. If you use a method parameter, it acts as a setter. If you don’t use a parameter, it simply returns the value. Not rocket science, but worth noting if you’re used to the different semantics of Varien_Object.
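If the pattern is new to you, here’s a minimal standalone sketch of it. The SketchRequest class is hypothetical and exists only for illustration, and the false default for the property is an assumption; only the accessor body mirrors the method shown above.

```php
<?php
// Hypothetical stand-in for the request object; only the flag accessor is modeled.
class SketchRequest
{
    // Assumed default for illustration.
    protected $_isStraight = false;

    // With an argument this acts as a setter; with none it is a plain getter.
    // Either way the current value is returned.
    public function isStraight($flag = null)
    {
        if ($flag !== null) {
            $this->_isStraight = $flag;
        }
        return $this->_isStraight;
    }
}

$request = new SketchRequest();
var_dump($request->isStraight());      // getter call
var_dump($request->isStraight(true));  // setter call, hands back the new value
var_dump($request->isStraight());      // the flag sticks
```

One consequence of this style is that you can’t pass null to clear the value, since null is what signals “no argument”.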

As for the method’s intent, it appears to be a mechanism that would allow earlier system/observer code to indicate the rewrite system should be skipped for this particular request. In practice this mechanism isn’t used anywhere in the core system, and it can be a confusing mechanism, as it’s only the database rewrites that are skipped for a “straight” request, not the configuration based rewrites. While it’s important to understand how this property interacts with the system, I’d avoid using or relying on its value for anything until a clearer definition of a “straight” request emerges.

Anatomy of a Rewrite Model

Before we examine the rewrite method, it’s worth looking at the structure of the core/url_rewrite model itself. If you run the following code (say, in an empty controller action), it will dump every database rewrite in your system out to the browser

$c = Mage::getModel('core/url_rewrite')->getCollection();
foreach($c as $item)
{
    var_dump($item->getData());
}

Here’s one example, again from the sample data

array(
    'url_rewrite_id' => string '213' (length=3)
    'store_id' => string '1' (length=1)
    'category_id' => string '25' (length=2)
    'product_id' => string '133' (length=3)
    'id_path' => string 'product/133/25' (length=14)
    'request_path' => string 'electronics/cameras/accessories/universal-camera-case.html' (length=58)
    'target_path' => string 'catalog/product/view/id/133/category/25' (length=39)
    'is_system' => string '1' (length=1)
    'options' => null
    'description' => null
);

The core/url_rewrite model has a number of data properties that suggest the rewrite system is about more than just the request object. We’ll get to them eventually, but for now concentrate on the two most important properties, request_path and target_path.

'request_path' => string 'electronics/cameras/accessories/universal-camera-case.html' (length=58)
'target_path' => string 'catalog/product/view/id/133/category/25' (length=39)

At its most basic level, the job of the rewrite method is to look for a request whose path information is the value in request_path, and then alter the request object such that its path information is the value in target_path. Keeping this in mind, let’s take a look at the model’s rewrite method

Loading the Rewrite Model, Applying the Rewrite

Step one in our model based rewrite process is determining if this particular request needs to be rewritten. This is done by attempting to load the current model (remember, this rewrite method is on a core/url_rewrite object) by its request_path property, using the request object’s path information as the value.

The way Magento goes about this is a little verbose, but at the end of the day it’s nowhere near as complex as it looks. Here’s the start of the method

#File: app/code/core/Mage/Core/Model/Url/Rewrite.php
public function rewrite(Zend_Controller_Request_Http $request=null, Zend_Controller_Response_Http $response=null)
{
    if (!Mage::isInstalled()) {
        return false;
    }
    if (is_null($request)) {
        $request = Mage::app()->getFrontController()->getRequest();
    }
    if (is_null($response)) {
        $response = Mage::app()->getFrontController()->getResponse();
    }
    if (is_null($this->getStoreId()) || false===$this->getStoreId()) {
        $this->setStoreId(Mage::app()->getStore()->getId());
    }
    ...
}

As you can see above, the method starts with a few sanity checks. If Magento hasn’t been installed yet we bail by returning false. Then, we ensure we have a reference to both the global/singleton request and response objects, and default the model’s store ID to the current store if none has been set. Pretty standard stuff so far. Next up is the following block of code

#File: app/code/core/Mage/Core/Model/Url/Rewrite.php
$requestCases = array();
$pathInfo = $request->getPathInfo();
$origSlash = (substr($pathInfo, -1) == '/') ? '/' : '';
$requestPath = trim($pathInfo, '/');

$altSlash = $origSlash ? '' : '/'; // If there were final slash - add nothing to less priority paths. And vice versa.
$queryString = $this->_getQueryString(); // Query params in request, matching "path + query" has more priority
if ($queryString) {
    $requestCases[] = $requestPath . $origSlash . '?' . $queryString;
    $requestCases[] = $requestPath . $altSlash . '?' . $queryString;
}
$requestCases[] = $requestPath . $origSlash;
$requestCases[] = $requestPath . $altSlash;

This is the code that pulls the path information from the request object, which we’re going to use to load this rewrite model. This ends up getting a little complex, because we want to load up the $requestCases array with two versions of the path information. One with a trailing slash, the other without

sony-vaio-vgn-txn27n-b-11-1-notebook-pc.html/
sony-vaio-vgn-txn27n-b-11-1-notebook-pc.html

foo/baz/bar
foo/baz/bar/

On the database rewrite level, these URIs are considered (mostly) interchangeable. This means a user may enter either of them into the browser, and a store owner could set up a custom rewrite with either form. The correct way to handle this long standing URL issue isn’t always clear, so Magento attempts to handle either case seamlessly for you. Once the $requestCases array is populated, we attempt to load the model

#File: app/code/core/Mage/Core/Model/Url/Rewrite.php
$this->loadByRequestPath($requestCases);

We’ll cover the actual logic involved in loading a rewrite model in a future article, for now just assume Magento queries the database for a request_path that matches one of our $requestCases.
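To make the candidate list concrete, here’s a rough standalone sketch of the same slash handling, paired with a lookup against an in-memory array. The function names and the $table array are hypothetical stand-ins for illustration; the real lookup is a database query, and its matching rules are more involved than a simple first hit.

```php
<?php
// Build the candidate request paths in priority order, mirroring the excerpt above.
function sketchRequestCases($pathInfo, $queryString = '')
{
    $origSlash   = (substr($pathInfo, -1) == '/') ? '/' : '';
    $altSlash    = $origSlash ? '' : '/';
    $requestPath = trim($pathInfo, '/');

    $cases = array();
    if ($queryString) {
        // "path + query" candidates come first, so they win over bare paths.
        $cases[] = $requestPath . $origSlash . '?' . $queryString;
        $cases[] = $requestPath . $altSlash . '?' . $queryString;
    }
    $cases[] = $requestPath . $origSlash;
    $cases[] = $requestPath . $altSlash;
    return $cases;
}

// Hypothetical in-memory stand-in for the rewrite table: request_path => target_path.
function sketchFindTarget(array $cases, array $table)
{
    foreach ($cases as $candidate) {
        if (isset($table[$candidate])) {
            return $table[$candidate];
        }
    }
    return null;
}

$table = array(
    'electronics/cameras/accessories/universal-camera-case.html'
        => 'catalog/product/view/id/133/category/25',
);

// The stray trailing slash is tolerated because the slash-less form is also tried.
$cases = sketchRequestCases('/electronics/cameras/accessories/universal-camera-case.html/');
print_r($cases);
var_dump(sketchFindTarget($cases, $table));
```

Note that the leading slash is always trimmed off, so the stored request_path values never start with one.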

The next important bit of code is this

#File: app/code/core/Mage/Core/Model/Url/Rewrite.php
if (!$this->getId()) {
    return false;
}

$request->setAlias(self::REWRITE_REQUEST_PATH_ALIAS, $this->getRequestPath());

If we’ve failed to load a rewrite (that is, there’s no ID set on the core/url_rewrite model), we bail from the method by returning false. Otherwise, our next step is to set an alias on the request object using the request_path property of our core/url_rewrite model. For our previous example, that’s the string

electronics/cameras/accessories/universal-camera-case.html

A request object can hold an unlimited number of aliases, each a key/value pair. The aliases are a mechanism that signals to someone inspecting the request object that although it has a particular value set for its path information, this information isn’t the original path information from the HTTP request.

Ghosts of Apache

Next up we have this bit of code

#File: app/code/core/Mage/Core/Model/Url/Rewrite.php
$external = substr($this->getTargetPath(), 0, 6);
$isPermanentRedirectOption = $this->hasOption('RP');

if ($external === 'http:/' || $external === 'https:') {
    if ($isPermanentRedirectOption) {
        header('HTTP/1.1 301 Moved Permanently');
    }
    header("Location: ".$this->getTargetPath());
    exit;
} else {
    $targetUrl = $request->getBaseUrl(). '/' . $this->getTargetPath();
}

For the example we’ve been considering so far (rewriting /electronics/cameras/accessories/universal-camera-case.html to catalog/product/view/id/133/category/25) this bit of code doesn’t apply. It is, however, worth examining as it reveals more advanced functionality of the rewrite system.

If your rewrite object’s target_path starts with http:/ or https: (that is, it’s a full URL), Magento will treat the rewrite as an HTTP redirect rather than an internal rewriting of the request path. It’s interesting to note that Magento’s response object isn’t used for the redirect here; instead it’s a simple call to PHP’s header function and an explicit exit. Possibly legacy, possibly a difference of opinion within the core team, probably both. The explicit exit is also important: system code halts execution right there, which means no further events will fire for this request.

You also may have noticed the call to the rewrite object’s hasOption method. Each core/url_rewrite model has a data parameter named options. If we look at the definition of hasOption

#File: app/code/core/Mage/Core/Model/Url/Rewrite.php
public function hasOption($key)
{
    $optArr = explode(',', $this->getOptions());
    return array_search($key, $optArr) !== false;
}

we can take an educated guess that this field is intended to be a comma separated list of values. If one of those values is the string 'RP', then we use a HTTP status code of 301 (instead of the default 302) for the request.
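Here’s a small standalone sketch of that check, useful for seeing how the two option strings relate. The sketchHasOption function is a hypothetical free-function version of the method; the string cast is our own addition so a null options value (as in the sample row earlier) is handled quietly on newer PHP versions.

```php
<?php
// Standalone version of the option check; $options stands in for the model's
// "options" column and may be null.
function sketchHasOption($options, $key)
{
    $optArr = explode(',', (string) $options);
    return array_search($key, $optArr) !== false;
}

var_dump(sketchHasOption('RP', 'RP'));  // the permanent redirect option is present
var_dump(sketchHasOption('RP', 'R'));   // whole values are compared, not substrings
var_dump(sketchHasOption(null, 'R'));   // no options stored at all
```

The second call is the one to remember: a row whose options value is RP does not also count as having the R option, which is why the code that follows checks for both.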

Before we explain what these options mean, let’s jump to the next bit of code.

#File: app/code/core/Mage/Core/Model/Url/Rewrite.php
...
else {
    $targetUrl = $request->getBaseUrl(). '/' . $this->getTargetPath();
}

$isRedirectOption = $this->hasOption('R');
if ($isRedirectOption || $isPermanentRedirectOption) {
    if (Mage::getStoreConfig('web/url/use_store') && $storeCode = Mage::app()->getStore()->getCode()) {
        $targetUrl = $request->getBaseUrl(). '/' . $storeCode . '/' .$this->getTargetPath();
    }
    if ($isPermanentRedirectOption) {
        header('HTTP/1.1 301 Moved Permanently');
    }
    header('Location: '.$targetUrl);
    exit;
}    

If our rewrite model indicated an external redirect, this code never runs as we’ve already exited. If we’re still here, we can see that the code starts building an alternate $targetUrl using the base request URL as well as the target_path property of our rewrite object. Then we check for another option, this one named R, and if it’s present (or our previous check for RP was true), we again treat this rewrite as an HTTP redirect, using our constructed $targetUrl as the ultimate destination. The big difference here is that we build the Location value from the request object’s base URL ($request->getBaseUrl()) plus the target_path, rather than using a full URL stored in the rewrite itself.
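The branching across these two blocks can be summarized as a pure function. This is only a sketch under a few assumptions: sketchRewriteOutcome is a hypothetical name, $baseUrl stands in for $request->getBaseUrl(), the store code prefix is left out, and nothing is actually sent to the browser. The function just reports what would happen.

```php
<?php
// Returns array(kind, HTTP status or null, destination) for a given rewrite row.
function sketchRewriteOutcome($targetPath, $options, $baseUrl)
{
    $opts        = explode(',', (string) $options);
    $isPermanent = in_array('RP', $opts, true);
    $isRedirect  = in_array('R', $opts, true);

    $prefix = substr($targetPath, 0, 6);
    if ($prefix === 'http:/' || $prefix === 'https:') {
        // Full URL: always an HTTP redirect, 301 only with the RP option.
        return array('redirect', $isPermanent ? 301 : 302, $targetPath);
    }

    $targetUrl = $baseUrl . '/' . $targetPath;
    if ($isRedirect || $isPermanent) {
        // Internal target, but flagged as a redirect via R or RP.
        return array('redirect', $isPermanent ? 301 : 302, $targetUrl);
    }

    // No redirect: the request path is rewritten internally.
    return array('rewrite', null, $targetPath);
}

print_r(sketchRewriteOutcome('https://example.com/sale', 'RP', ''));
print_r(sketchRewriteOutcome('catalog/product/view/id/133', 'R', '/shop'));
print_r(sketchRewriteOutcome('catalog/product/view/id/133', null, '/shop'));
```

Only the last case ever reaches the router objects; the first two end the request.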

Rewrite Options

In addition to automatically treating a full URL as an HTTP redirect, Magento will also look for these mysterious option strings (R and RP). If they’re present, the rewrite will become an HTTP redirect.

So what are these mysterious and cryptic option strings? The syntax is borrowed from the Apache web server’s mod_rewrite flag syntax. To be clear, this has nothing to do with Apache itself. Instead, the letter codes for indicating that a rewrite should be a redirect (R) or a permanent redirect (RP) are modeled on Apache’s flags, and only loosely: in mod_rewrite, [R] means redirect, but a permanent redirect is written [R=301] and [P] means proxy. This helps increase the confusion level around what rewrites do, and what they’re intended for.

Earlier in this article we said that the only thing a rewrite does is change the request object that’s sent to the routers. As you can see, we went a little bit Obi-Wan Kenobi on you to help reduce the complexity of what happens in a rewrite. From the point of view of a router object, the only thing we need to be concerned with is the changing of the path information, as the redirects will cause the system to exit before getting to a router. It’s arguable that the redirects don’t belong at this level or in this system, but those are thoughts better suited for a discussion of alternate realities where Magento development took a different direction.

Back on the Path (Information)

With that little detour into redirect land out of the way, there’s one more small bit of code to consider before we get back on the path information path.

#File: app/code/core/Mage/Core/Model/Url/Rewrite.php
if (Mage::getStoreConfig('web/url/use_store') && $storeCode = Mage::app()->getStore()->getCode()) {
    $targetUrl = $request->getBaseUrl(). '/' . $storeCode . '/' .$this->getTargetPath();
}

$queryString = $this->_getQueryString();
if ($queryString) {
    $targetUrl .= '?'.$queryString;
}

If we haven’t redirected and exited, the above code executes and continues to build our $targetUrl string. If Magento is configured to add the store code to the URL, the string is rebuilt with that code included; then any existing query string is appended. Finally, the method ends with the following.

#File: app/code/core/Mage/Core/Model/Url/Rewrite.php
$request->setRequestUri($targetUrl);
$request->setPathInfo($this->getTargetPath());

return true;

Before returning true (indicating a rewrite has taken place), we set the request object’s path information by consulting the target_path property of the core/url_rewrite model. We also call the setRequestUri method on the request object, which sets the request object’s protected _requestUri property with the value of our $targetUrl.

Coming from the context of routing, the purpose of the rewrite system is to set a different path information variable. However, in the interest of being complete, the rewrite system also jiggers the request object in such a way that it completely resembles a request that was naturally made with the new URI. The one difference is the alias information we set earlier in the request object.

$request->setAlias(self::REWRITE_REQUEST_PATH_ALIAS, $this->getRequestPath());

By setting an alias, we ensure the original request information is available should a need for it arise.

Our duties done, we return true and allow normal system functions to resume.

Configuration Based Rewrites

Those are the basics of database based rewrites. There’s a lot more to discuss and explore there, primarily how the loading mechanism works with regards to multiple matches and priorities, as well as the meaning of all those extra data parameters. We’re going to save that for the advanced article and go back to the front controller’s dispatch method.

#File: app/code/core/Mage/Core/Controller/Varien/Front.php
if (!$request->isStraight()) {
    Mage::getModel('core/url_rewrite')->rewrite();
}
$this->rewrite();

We’ve just finished calling the core/url_rewrite model’s rewrite method. At this point we may, or may not, have had our request object’s path information rewritten. If the database/model rewrite indicated an HTTP redirect was appropriate, we’re not here, as that code kicks off the Location header redirect and exits. Otherwise, our next method call is $this->rewrite().

This is the front controller’s rewrite method, and it kicks off the rewrite process for configuration based rewrites. These are the rewrites most commonly advertised as being used for creating “controller overrides”. While that’s certainly one use for them, all they’re really doing is changing a request object’s path information, a side-effect of which is the loading of a different controller class to handle the request.

A few things to stress before we begin. Although configuration based rewrites perform a task similar to the core/url_rewrite models, the systems are separate and share no code. The only way they interact is that the request object may have already been rewritten by the database rewrite system when it gets to the configuration based system. This won’t stop the configuration based system from performing its tasks, but it will mean it’s working with an already jiggered request object. It’s also worth noting that, unlike the database rewrites, there’s no capacity in the configuration based system for HTTP 301/302 redirects.

Let’s start by taking a look at the front controller’s rewrite method

#File: app/code/core/Mage/Core/Controller/Varien/Front.php
public function rewrite()
{
    $request = $this->getRequest();
    $config = Mage::getConfig()->getNode('global/rewrite');
    if (!$config) {
        return;
    }
    foreach ($config->children() as $rewrite) {
        ...
    }
}

This method starts simply enough. First, we fetch a reference to the global/singleton request object. Then, we fetch a list of nodes from the combined config.xml tree at the path global/rewrite. Using the de-facto canonical Magento Wiki example, this would fetch the sub-nodes mynamespace_mymodule_checkout_cart and another_rewrite_node_with_a_unique_name from the configuration below

#File: app/code/local/Package/Module/etc/config.xml
<global>        
    <rewrite>
        <mynamespace_mymodule_checkout_cart>
            <from><![CDATA[#^/checkout/cart/#]]></from>
            <to>/mymodule/checkout_cart/</to>
        </mynamespace_mymodule_checkout_cart>

        <another_rewrite_node_with_a_unique_name>
            <from><![CDATA[#^/checkout/cart/#]]></from>
            <to>/mymodule/checkout_cart/</to>
        </another_rewrite_node_with_a_unique_name>            
    </rewrite>
</global>

Then, having fetched the list of nodes, we foreach over them. The foreach loop itself looks like this

#File: app/code/core/Mage/Core/Controller/Varien/Front.php
foreach ($config->children() as $rewrite) {
    $from = (string)$rewrite->from;
    $to = (string)$rewrite->to;
    if (empty($from) || empty($to)) {
        continue;
    }
    $from = $this->_processRewriteUrl($from);
    $to   = $this->_processRewriteUrl($to);

    $pathInfo = preg_replace($from, $to, $request->getPathInfo());

    if (isset($rewrite->complete)) {
        $request->setPathInfo($pathInfo);
    } else {
        $request->rewritePathInfo($pathInfo);
    }
}

First, we extract values from the <from/> and <to/> nodes. If these nodes don’t exist we continue on to the next loop iteration. Otherwise, $from and $to are processed via the _processRewriteUrl method (more on this later). Then we reach the meat of this loop. We create a new value for the request object’s path information by using $from and $to as parameters for a call to preg_replace.

#File: app/code/core/Mage/Core/Controller/Varien/Front.php
$pathInfo = preg_replace($from, $to, $request->getPathInfo());

With our example above, that would look like

#File: app/code/core/Mage/Core/Controller/Varien/Front.php
$pathInfo = preg_replace('#^/checkout/cart/#', '/mymodule/checkout_cart/', $request->getPathInfo());

Notice there’s no resolving the trailing / as there was with the database rewrites. This is a brute force regular expression applied directly to the $pathInfo. Next, we need to assign our new path information back to the request object

#File: app/code/core/Mage/Core/Controller/Varien/Front.php
if (isset($rewrite->complete)) {
    $request->setPathInfo($pathInfo);
} else {
    $request->rewritePathInfo($pathInfo);
}

This code checks our rewrite node for a <complete/> node. If this node exists, setPathInfo is called. If it doesn’t exist, then rewritePathInfo is called instead. If we take a look at the definition of the rewritePathInfo method,

#File: app/code/core/Mage/Core/Controller/Request/Http.php
public function rewritePathInfo($pathInfo)
{
    if (($pathInfo != $this->getPathInfo()) && ($this->_rewritedPathInfo === null)) {
        $this->_rewritedPathInfo = explode('/', trim($this->getPathInfo(), '/'));
    }
    $this->setPathInfo($pathInfo);
    return $this;
}

we can see that it ultimately calls setPathInfo as well, after conditionally doing something with the _rewritedPathInfo property. We’ll come back to that in a bit, but right now your main takeaway should be that these configuration rewrites can be used to apply a regular expression to the request object’s path information variable.

After looping through each rewrite rule $from and $to pair, the rewrite method ends. At this point, both rewrite systems have had their chance at the request object, and the front controller is free to start its main router foreach loop.
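To see the whole loop in miniature, here’s a standalone sketch that runs a list of rules against a path string. The $rules array is a hypothetical stand-in for the children of the global/rewrite node, and the request object is reduced to a plain string, so the <complete/> distinction and the brace substitution are left out.

```php
<?php
// Hypothetical rule list standing in for the children of global/rewrite.
$rules = array(
    array('from' => '#^/checkout/cart/#', 'to' => '/mymodule/checkout_cart/'),
    array('from' => '', 'to' => '/ignored/'), // skipped: empty <from>
);

// Apply every rule in order to the path information, as the loop above does.
// Each rule sees the result of the rules before it.
function sketchApplyConfigRewrites($pathInfo, array $rules)
{
    foreach ($rules as $rule) {
        if (empty($rule['from']) || empty($rule['to'])) {
            continue;
        }
        $pathInfo = preg_replace($rule['from'], $rule['to'], $pathInfo);
    }
    return $pathInfo;
}

var_dump(sketchApplyConfigRewrites('/checkout/cart/add/product/27', $rules));
var_dump(sketchApplyConfigRewrites('/checkout/cart', $rules)); // no trailing slash, so the pattern misses
```

The second call shows the brute force nature of the regular expression: without the trailing slash the pattern doesn’t match, and the path passes through untouched.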

Processing $from and $to

Earlier we glossed over the _processRewriteUrl method that both $from and $to were passed into. Before we dive in deep, I’ll tell you that I’ve never seen this functionality used in the wild, but you need to be aware of it as it impacts the sorts of regular expressions you can use in your configuration rewrites. Here’s the method definition

#File: app/code/core/Mage/Core/Controller/Varien/Front.php
protected function _processRewriteUrl($url)
{
    $startPos = strpos($url, '{');
    if ($startPos!==false) {
        $endPos = strpos($url, '}');
        $routeName = substr($url, $startPos+1, $endPos-$startPos-1);
        $router = $this->getRouterByRoute($routeName);            
        if ($router) {
            $fronName = $router->getFrontNameByRoute($routeName);
            $url = str_replace('{'.$routeName.'}', $fronName, $url);
        }
    }
    return $url;
}

The _processRewriteUrl method applies a simple variable substitution to the $from and $to parameters. The strings are scanned once for a substring surrounded by braces, such as

{catalog}/baz/bar

If this braced template string is found, we use its value (catalog) to fetch a router object. This object is the Standard, Admin, etc. router object that would handle a URL with the specified route name.

If we managed to fetch a router, we again use the $routeName to fetch the actual frontName that applies to that particular route name. Then we do a string replace call to replace our bracketed template string ({catalog}) with the frontName we found. The intention here appears to be providing a “configuration user” the ability to say “this request should use the front name in router node X”, where router node X is configured in a separate module.

Unfortunately, because of some mushiness and lack of decisive pattern around route names, front names, and module names, we can only speculate as to the need for this bit of code. Like a lot of the routing system, this appears to be a bit of scaffolding for bigger ideas that were never implemented. If anyone has any insight into the use-case for this feature, please get in touch.

The reason you need to care is, since the ad-hoc template variables use the {} characters, that means you can’t safely use a regular expression pattern in <from> that uses these meta-characters.
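Here’s a standalone sketch of the substitution so you can watch it work. The sketchProcessRewriteUrl function and the $frontNames map (route name to frontName) are hypothetical; in Magento the lookup goes through the router objects, and what the real lookup does with an unknown name may differ from this sketch, which simply leaves the string alone.

```php
<?php
// Replace the first {routeName} token with its configured frontName.
function sketchProcessRewriteUrl($url, array $frontNames)
{
    $startPos = strpos($url, '{');
    if ($startPos !== false) {
        $endPos    = strpos($url, '}');
        $routeName = substr($url, $startPos + 1, $endPos - $startPos - 1);
        if (isset($frontNames[$routeName])) {
            $url = str_replace('{' . $routeName . '}', $frontNames[$routeName], $url);
        }
    }
    return $url;
}

// Hypothetical configuration: route name "mymodule" is served under "my-front-name".
$frontNames = array('mymodule' => 'my-front-name');

var_dump(sketchProcessRewriteUrl('#^/{mymodule}/cart/#', $frontNames));
// A regex quantifier is also picked up as a "route name" (here, the string "4").
var_dump(sketchProcessRewriteUrl('#^/sku/[0-9]{4}/#', $frontNames));
```

The second call is the trouble spot: the quantifier is parsed as a template variable before your regular expression is ever compiled.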

(In) Complete

Another often confusing, underutilized, or misused feature of the configuration rewrites is the <complete> node. Without this node, the rewritePathInfo method on the request object is called; with it, the setPathInfo method is called.

#File: app/code/core/Mage/Core/Controller/Varien/Front.php
if (isset($rewrite->complete)) {
    $request->setPathInfo($pathInfo);
} else {
    $request->rewritePathInfo($pathInfo);
}

As mentioned, the rewritePathInfo method

#File: app/code/core/Mage/Core/Controller/Request/Http.php
public function rewritePathInfo($pathInfo)
{
    if (($pathInfo != $this->getPathInfo()) && ($this->_rewritedPathInfo === null)) {
        $this->_rewritedPathInfo = explode('/', trim($this->getPathInfo(), '/'));
    }
    $this->setPathInfo($pathInfo);
    return $this;
}

ultimately calls setPathInfo, but not before doing the following

#File: app/code/core/Mage/Core/Controller/Request/Http.php
if (($pathInfo != $this->getPathInfo()) && ($this->_rewritedPathInfo === null)) {
    $this->_rewritedPathInfo = explode('/', trim($this->getPathInfo(), '/'));
}

This code stashes the original, pre-rewritten path information in the _rewritedPathInfo array the first time rewritePathInfo is called. Subsequent calls (for multiple rewrite nodes) will skip this if branch.

If present, the information in _rewritedPathInfo is used when calling the request object’s getRequestedRouteName, getRequestedControllerName, and getRequestedActionName methods. This is hugely important for two reasons.

The first is these methods are used to generate the full controller action name, which in turn dictates which blocks are loaded for a particular request. Secondly, these methods are also referenced by Magento’s URL helper, which means all URLs generated in the system will use the original path information values unless a <complete/> node is present. So, the semantics here are “Are we completely rewriting the request, or just partially rewriting it”.
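The difference is easier to see side by side. The following is a heavily simplified sketch: SketchRewriteRequest and its getRequestedFirstSegment method are hypothetical, and the real getRequested* methods do more work (such as mapping front names to route names). Only rewritePathInfo is copied from the code above.

```php
<?php
// Hypothetical, stripped-down request object: only the path information and
// the stashed original segments are modeled.
class SketchRewriteRequest
{
    protected $_pathInfo;
    protected $_rewritedPathInfo = null;

    public function __construct($pathInfo) { $this->_pathInfo = $pathInfo; }
    public function getPathInfo() { return $this->_pathInfo; }
    public function setPathInfo($pathInfo) { $this->_pathInfo = $pathInfo; return $this; }

    public function rewritePathInfo($pathInfo)
    {
        if (($pathInfo != $this->getPathInfo()) && ($this->_rewritedPathInfo === null)) {
            $this->_rewritedPathInfo = explode('/', trim($this->getPathInfo(), '/'));
        }
        $this->setPathInfo($pathInfo);
        return $this;
    }

    // Simplified: the first segment of whichever path counts as "requested".
    public function getRequestedFirstSegment()
    {
        if ($this->_rewritedPathInfo !== null && isset($this->_rewritedPathInfo[0])) {
            return $this->_rewritedPathInfo[0];
        }
        $parts = explode('/', trim($this->getPathInfo(), '/'));
        return $parts[0];
    }
}

$partial = new SketchRewriteRequest('/checkout/cart/index');
$partial->rewritePathInfo('/mymodule/checkout_cart/index');   // rule without <complete/>

$complete = new SketchRewriteRequest('/checkout/cart/index');
$complete->setPathInfo('/mymodule/checkout_cart/index');      // rule with <complete/>

var_dump($partial->getRequestedFirstSegment());
var_dump($complete->getRequestedFirstSegment());
```

Both requests end up with the same path information for the routers; they differ only in what they report as the originally requested segment.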

With the advantage of hindsight, it seems like the configuration based rewrite system was originally designed to be a general way of altering the request’s path information for the router objects. Later, people started using it for controller overrides, which necessitated adding confusing features like the one above to solve problems people were having with their usage of the system.

Other Problems

Two other quick things to note about the configuration rewrite system, both related to how it compares with the database rewrite system.

First, you’ll notice that unlike the database rewrites, we never set an alias on the request object or change its request URI. This functionality is partially duplicated by the above mentioned _rewritedPathInfo property, although that serves a different purpose. This is just something to be aware of, and another example of the difficulties which arise when your project is organized as a system of sub-systems, each sub-system managed by different people with different visions for the final system.

The second bit has to do with starting and ending slashes on URLs. The canonical wiki example of a rewrite rule includes a forward slash

<from><![CDATA[#^/checkout/cart/#]]></from>

This is technically correct. The path information for a standard request starts with this forward slash, so you’ll want to include it

<from><![CDATA[#^/catalog/product/#]]></from>
/catalog/product/view/id/27

However, configuration rewrite rules are applied after database rewrite rules. It’s very possible that a database rewrite rule could leave the path information in the form of

catalog/product/view/id/27

That is, without the leading slash. This means the above <from> would fail, and fail in a hard to diagnose way. A better <from> regular expression would make the slash optional using the ? meta character

<!-- untested -->
<from><![CDATA[#^/?catalog/product/#]]></from>

This demonstrates one of the most common problems people have using the rewrite system: writing regular expressions against a source path information they have no exposure to. Keep this in mind when you’re creating your own rewrites.
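If you want to check a pattern before committing it to config.xml, a few lines of plain PHP will do. This sketch just runs the two patterns from above against both forms of the path; the variable names are ours.

```php
<?php
$strict  = '#^/catalog/product/#';   // requires the leading slash
$lenient = '#^/?catalog/product/#';  // leading slash optional

$paths = array('/catalog/product/view/id/27', 'catalog/product/view/id/27');
foreach ($paths as $path) {
    // preg_match returns 1 on a match and 0 on a miss.
    printf(
        "%-30s strict=%d lenient=%d\n",
        $path,
        preg_match($strict, $path),
        preg_match($lenient, $path)
    );
}
```

Feeding your patterns the path information Magento actually hands them (log $request->getPathInfo() if you’re not sure) is the quickest way to avoid the hard to diagnose failure described above.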

Rewrite Wrap Up

The two Magento rewrite systems appear to have started with a plain and simple goal: Provide a mechanism for altering the request object’s path information, allowing developers to implement their own URL rewrites any way they saw fit. However, after digging into the systems a bit, it’s clear that over the years more and more functionality was bolted on that, in hindsight, may have belonged elsewhere.

Next time, in our final article of the series, we’ll dive deep into the logic behind loading a database rewrite rule, as well as explore how the Magento frontend cart application starts to get its claws and special case code into the request rewrite system. Once you’re through that, you’ll be a true Magento routing black belt.

Originally published September 18, 2011