Regex Bug in Apache's mod_include (SSI)


Apache’s Server-Side Includes (SSI) feature, provided by mod_include, is a fantastic tool.  Not only does it let you pull content from other files (to include a standard header & footer across an entire site, for example), but it lets you use regular expressions to control flow and determine output.

Today I got bitten by a nasty bug in Apache’s regex handling, though.  In standard regular expression syntax, if you wanted to match this:

foo.shtml?1-5

...then you would need to escape the question mark in your regex (an unescaped ? is a quantifier that makes the preceding character optional), like this:

/foo.shtml\?1-5/

However, that does not match in Apache’s SSI expr tests.

I spent a couple of hours debugging the problem, trying different combinations of query string, no query string, trailing dollar sign, and escaped trailing dollar sign, all to no avail.  I wrote up a test page that showed some base cases with an output of "matches" or "does not match", and it almost seemed that $REQUEST_URI somehow didn’t include the $QUERY_STRING (even though echoing it did show the query string).

I only discovered the solution as I was preparing to file a bug in Apache’s bugzilla.  Someone else had had a related problem and in the course of discussing it, an Apache developer revealed that all backslashes must be escaped within regex portions of expr statements.  So instead of this:

<!--#if expr="$REQUEST_URI = /foo.shtml\?1-5/" -->

...you need to use this:

<!--#if expr="$REQUEST_URI = /foo.shtml\\?1-5/" -->
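Here is a rough sketch, in Python rather than SSI, of why the single-backslash version fails. It assumes the expr parser strips one level of backslash escaping before the pattern reaches the regex engine. That matches what the Apache developer described, but the helper below (unescape_once, a name I made up) is only a model of that behavior, not Apache's actual code, and Python's re module stands in for whatever regex library your Apache build uses.

```python
import re


def unescape_once(pattern: str) -> str:
    """Hypothetical model of the expr parser: drop one level of backslash
    escaping, so a backslash plus any character becomes that character."""
    return re.sub(r"\\(.)", r"\1", pattern)


uri = "/foo.shtml?1-5"

# What the regex engine would see for each version of the expr (assumed).
single = unescape_once(r"foo.shtml\?1-5")   # backslash consumed by the parser
double = unescape_once(r"foo.shtml\\?1-5")  # one backslash survives

print(single)  # foo.shtml?1-5
print(double)  # foo.shtml\?1-5

# With the backslash gone, "l?" means "an optional l", so the pattern
# no longer matches a literal question mark in the URI.
print(bool(re.search(single, uri)))              # False
print(bool(re.search(single, "/foo.shtml1-5")))  # True
print(bool(re.search(double, uri)))              # True
```

If that model is right, it also explains why my test page made it look as though $REQUEST_URI had no query string: the pattern was silently looking for "foo.shtml1-5" or "foo.shtm1-5" instead.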

Now that’s buggy and ridiculous, but that in itself isn’t a huge deal.  What is a huge deal, though, is that nowhere in the mod_include documentation is this glaring flaw ever mentioned.


