Regex Bug in Apache's mod_include (SSI)
Filed on Jul 14, 2006 by Anthony
Apache’s Server-Side Includes (SSI) feature, provided by mod_include, is a fantastic tool. Not only does it let you pull content from other files (to include a standard header & footer across an entire site, for example), but it lets you use regular expressions to control flow and determine output.
Today I got bitten by a nasty bug in Apache’s regex implementation, though. In standard regular expression syntax, if you wanted to match this:
...then you would need to escape the question-mark in your regex, like this:
However, that does not match in Apache’s SSI expr tests.
I spent a couple hours debugging the problem, trying different combinations of query-string, no query-string, trailing dollar-sign, escaped trailing dollar-sign, all to no avail. I wrote up a test page that showed some base cases with an output of "matches" or "does not match" and it almost seemed that somehow the $REQUEST_URI didn’t actually include the $QUERY_STRING (even though echoing it did show the QS).
I only discovered the solution as I was preparing to file a bug in Apache’s Bugzilla. Someone else had had a related problem, and in the course of discussing it, an Apache developer revealed that all backslashes must be escaped within the regex portions of expr statements. So instead of this:
<!--#if expr="$REQUEST_URI = /foo.shtml\?1-5/" -->
...you need to use this:
<!--#if expr="$REQUEST_URI = /foo.shtml\\?1-5/" -->
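For anyone who wants to see why the doubled backslash makes a difference, here is a small Python sketch of the behavior as I understand it. It assumes the expr parser strips one level of backslash escaping before the regex engine ever sees the pattern. The `ssi_unescape` and `ssi_match` helpers are hypothetical names of my own, a model of the symptom and not Apache's actual code.

```python
import re


def ssi_unescape(raw: str) -> str:
    """Hypothetical model: drop one level of backslash escaping, as the
    expr parser appears to do before handing the pattern to the regex engine."""
    out = []
    i = 0
    while i < len(raw):
        if raw[i] == "\\" and i + 1 < len(raw):
            # A backslash is consumed; the following character is kept as-is.
            out.append(raw[i + 1])
            i += 2
        else:
            out.append(raw[i])
            i += 1
    return "".join(out)


def ssi_match(pattern_as_written: str, subject: str) -> bool:
    """Match the way the modeled parser would: unescape first, then search."""
    return re.search(ssi_unescape(pattern_as_written), subject) is not None


uri = "/foo.shtml?1-5"

# Pattern text as typed between the slashes in the expr attribute.
single = r"foo.shtml\?1-5"   # regex engine receives: foo.shtml?1-5
double = r"foo.shtml\\?1-5"  # regex engine receives: foo.shtml\?1-5

print(ssi_match(single, uri))  # False: "l?" now means "optional l"
print(ssi_match(double, uri))  # True: "\?" is a literal question mark
```

With a single backslash, the question mark reaches the regex engine unescaped and turns into a quantifier on the preceding "l", so the pattern can never match a URI that contains a literal "?". Doubling the backslash leaves exactly one for the regex engine to see.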
Now that’s buggy and ridiculous, but that in itself isn’t a huge deal. What is a huge deal, though, is that nowhere in the mod_include documentation is this glaring flaw ever mentioned.