Tag: HTTP

Setting up a Zimbra authenticated proxy

June 11, 2019 / halfgaar

On March 18th, Synacor posted about a critical Zimbra security vulnerability (CVE 2019 9670), which was quick to be exploited in the wild, and subsequently evolved to be harder to erradicate.

I’ve always had a weariness of authentication implementations by hosted applications, so I decided to block the Zimbra web mail interface using iptables (firewall), and only allow access through a separately hosted HTTP proxy which requires authentication. This way, no stray requests to API endpoints accidentally left open will be allowed. That is, almost none: I had to add exceptions to allow webdav traffic for contact and calendar synchronization. If you don’t use that, the exceptions can be left out.

Below is an example Apache configuration. Apache requires several modules to be enabled, which is an exercise left to the reader. Also, a similar proxy is easily implemented in Nginx; I just happened to have a spare Apache server.

Note that it’s best to not make the proxy the default virtual host on the web server. This avoids it being seen by IP probes. If set up properly, there is no trace visible from the outside that you’re using this proxy, and if you make it such that access to it requires the actual domain name (like mywebmail.example.net), it’s very hard for bots to see it (especially if you make the domain name a bit more unguessable).

When you access the web mail page, first you have to authenticate using old style HTTP authentication:

zimbra-pre-login

Anyway, here’s the proxy config:

<VirtualHost *:80>
        RewriteEngine on
        RewriteRule ^/(.*) https://%{HTTP_HOST}/$1 [L,R]
        ServerName webmail.example.net
</VirtualHost>
 
<VirtualHost *:443>
        ServerName webmail.example.net
        ServerAdmin webmaster@localhost
 
        SSLEngine on
        SSLCertificateFile    /etc/letsencrypt/live/webmail.example.net/cert.pem
        SSLCertificateKeyFile /etc/letsencrypt/live/webmail.example.net/privkey.pem
        SSLCertificateChainFile /etc/letsencrypt/live/webmail.example.net/chain.pem
 
        SSLProxyEngine On
        ProxyPass        / https://mail.example.net/
        ProxyPassReverse / https://mail.example.net/
 
        # For Webdav/carddav/caldav
        <Location /dav>
                Satisfy any
                Require all granted
        </Location>
 
        # For Let's Encrypt
        <Location /.well-known/>
                Satisfy any
                Require all granted
        </Location>
 
        # For Webdav/carddav/caldav
        <Location /principals/>
                Satisfy any
                Require all granted
        </Location>
 
        # For Webdav/carddav/caldav
        <Location /SOGo/>
                Satisfy any
                Require all granted
        </Location>
 
        # For Webdav/carddav/caldav
        <Location /groupdav.php>
                Satisfy any
                Require all granted
        </Location>
 
        <Location />
                AuthType Basic
                AuthName "Zimbra webmail pre-login"
                AuthUserFile /etc/apache2/htpasswd/webmail
                Require valid-user
 
                # Exception IPs: no auth needed (for monitoring for instance)
                Require ip 1.2.3.4
        </Location>
 
        ErrorLog ${APACHE_LOG_DIR}/webmail.example.net/error.log
        CustomLog ${APACHE_LOG_DIR}/webmail.example.net/access.log combined
</VirtualHost>

How to test payformystay.com

May 19, 2012 / Rowan Rodrik

I haven’t got much experience when it comes to testing web applications. Instead (and more so out of apathy than belief), I’ve always adhered to the ad-hoc test approach. However, the usage of pure Posgres unit tests back when I worked on a complicated investment database with Halfgaar did teach me the advantages of test-driven development.

For payformystay, though, unit tests simply won’t cut it. The database design is quite straight-forward with not that many relationships and the schema’s only complexities arise from it being remarkably denormalized and full of duplication. Besides and contrary to mine and Halfgaar’s PostgreSQL project for Sicirec, the business logic doesn’t live all neatly and contained on the database level. And I’m not using a clean ORM wrapper either, which I could use as a unit test target. And what would be the point, since in typical MySQL/PHP fashion it would be much too easy to circumvent for a particular function.

What I want for this application is full functional test coverage so that I know that all parts of the website function correctly in different browser versions across operating systems. In other words: I want to know that the various parts are functioning correctly as implied by the fact that the whole is functioning correctly.

But how do you do automated tests from a browser?

At first, I thought I should probably cobble something together myself with jQuery, maybe even using a plugin such as Qunit with the composite addon.

But how was I going to run the tests for JavaScript independence then? Using curl/wget or one of these hip, headless browsers which seem to be bred for this purpose?

Choises, choises…

Selenium

Then, there’s Selenium which is a pretty comprehensive set of test tools, meant precisely for what I need. Sadly my wants weren’t easily aligned with my needs. Hence, it took me some time (months, actually) before I was sure that Selenium was right for me.

Selenium provides the WebDriver API—implemented in a number programming languages—that lets you steer all popular browsers either through the standalone Selenium Server or Selenium Grid. The server executes and controls the browser. Since Selenium 2, it doesn’t even need a JavaScript injection in the browser to do this, which is very interesting for my tests related to my desire to make my AJAX-heavy toy also available to browsers with JavaScript disabled for whatever reason.

Selenium versus my pipe dream

Selenium IDE is a Firefox extension which lets you develop Selenium scripts by recording your interactions with the browser. It stores its script in “Selenese”. This held quite some appeal to me, because my initial testing fantasy revolved around doing it all “client-side”, in the sense that I wouldn’t have to leave my browser to do the testing. I wanted to be able to steer any browser on any machine that I happened to stumble upon at my test site and fire those tests.

Well, Selenese can be interpreted by some WebDriver API implementations to remotely steer the browser, but it can’t be executed from within the browser, except by loading it into the Selenium IDE, which is a Firefox-only extension. Also, driving the browser through JavaScript has been abandoned by Selenium with the move away from Selenium-RC to WebDriver (which they’re currently trying to push through the W3C standardization process).

With everyone moving away from my precious pipe-dream, I remained clinging to some home-grown jQuery fantasy. But, how was I going to test my JavaScript-free version? Questions.

I had to eventually replace my pipe dream with the question of which WebDriver implementation to use and which testing framework to use to wrap around it.

PHPUnit

I thought PHPUnit had some serious traction, but seeing that it had “unit” in its name, I thought it might not be suitable for functional testing. The documentation being unit-test-centric, in the sense of recommending you to name you test cases “[ClassYouWannaTest]Test” didn’t help in clearing the confusion.

Luckily, I came across an article about acceptance testing using Selenium/PHPUnit [acceptence test = functional test].

I’ve since settled on PHPUnit by Sebastian Bergmann with the Selenium extension also by Bergmann. His Selenium extension provides two base TestCase classes: PHPUnit_Extensions_SeleniumTestCase and PHPUnit_Extensions_Selenium2TestCase. I chose to use the latter. I hope I won’t be sorry for it, since it uses Selenium 2’s Selenium 1 backward compatible API. Otherwise, they’ll probably have me running for Facebook’s PHP-WebDriver in the end. (PHP-Webdriver also has a nice feature that it allows you to distribute a FF profile to Selenium Server/Grid.)

But what about my pipe dream?

If only I’d be able to visit my test site from any browser, click a button and watch all the test scripts run, the failures being filed into the issue tracker (with error log + screenshot) and a unicorn flying over the rainbow…

Anyway, it’s a pipe dream and the best way to deal with it is probably to put the pipe away, smoothen the sore and scratch the itch.

PEAR pain

As customary for PEAR projects, PHPUnit and its Selenium extension have quite a number of dependencies, meaning that installing and maintaining them manually in my project repo would be quite a pain. I’ve used the pear command to install everything locally, but my hosting provider doesn’t have all these packages intalled, so if I want to run tests from there (calling Selenium Server here), I’ll have to manage all that pear pain along with my project files.

Doesn’t PEAR offer some way to manage packages in any odd location? I’m not interested in what’s in /usr/share/php/. I want my stuff in ~/php-project-of-the-day/libs/

Process pain

So far, I’ve remotely hosted both the production and the development version of payformystay, which is specially nice if you want to share development features with others. Now, it’s difficult to decide what’s more annoying:

Creating a full-fledged, locally hosted version of the website, so that I can execute the tests locally as well as host the testing version (Apache+PHP+MySQL) locally. A lot of misleading positive test results assured due to guaranteed differences between software versions and configurations
Installing all the PEAR packages remotely so that I can run the test from my hosting provider’s shell. This implies having to punch a hole through the NAT wall at home or anywhere I happen to be testing at any moment. Bad idea. I don’t even have the password to all the routers that I pass during the year.
Running the development version of the website remotely, but running the tests locally so that there are no holes to punch, except that I’ll have to tunnel to my host’s MySQL process because my tests need to setup, look-up and check stuff in the database. At least, now I don’t have to install server software on my development machine and need only the php-cli stuff.

Safari: don’t give gzipped content a .gz extension

January 16, 2012 / Rowan Rodrik

Yesterday, while helping Caloe with the website for her company De Buitenkok, I came across the mother of all stupid bugs in Safari. Me having recently announced payformystay.com, I loaded it up in Apple’s hipster browser only to notice that the CSS wasn’t loaded. Oops!

Reloading didn’t help, but … going over to the development version, everything loaded just fine. Conclusion? My recent optimizations—concatenating + gzipping all javascript and css—somehow fucked up payformystay for Safari users. The 14 Safari visitors (16.28% of our small group of alpha users) I received since the sixth must have gotten a pretty bleak image of the technical abilities of payformystay.com’s Chief Technician (me). 😥

The old `cat | gzip`

So, what happened?

To reduce the number of HTTP requests per page for all the JavaScript/CSS stuff (especially when none of it is in the browser cache yet), I made a few changes to my build file to scrape the <head> of my layout template (layout.php), which I made to look something like this:

<?php if (DEV_MODE): ?>
  <link rel="stylesheet" type="text/css" href="/layout/jquery.ui.selectmenu.css" />                                   <!--MERGE ME-->
  <link rel="stylesheet" type="text/css" href="/layout/fancybox/jquery.fancybox-1.3.4.css" />                         <!--MERGE ME-->
  <link rel="stylesheet" type="text/css" href="/layout/style.css" />                                                  <!--MERGE ME-->
 
  <script src="/layout/jquery-1.4.4.min.js" type="text/javascript"></script>                                          <!--MERGE ME-->
  <script src="/layout/jquery.base64.js" type="text/javascript"></script>                                             <!--MERGE ME-->
  <script src="/layout/jquery-ui-1.8.10.custom.min.js" type="text/javascript"></script>                               <!--MERGE ME-->
  <script src="/layout/jquery.ui.selectmenu.js" type="text/javascript"></script>                                      <!--MERGE ME-->
  <script src="/layout/jquery.cookie.js" type="text/javascript"></script>                                             <!--MERGE ME-->
  <script src="/layout/fancybox/jquery.fancybox-1.3.4.js" type="text/javascript"></script>                            <!--MERGE ME-->
  <script src="/layout/jquery.ba-hashchange.min.js" type="text/javascript"></script>                                  <!--MERGE ME-->
  <script src="/layout/jquery.writeCapture-1.0.5-min.js" type="text/javascript"></script>                             <!--MERGE ME-->
<?php else: # if (!DEV_MODE) ?>
  <link href="/layout/motherofall.css.gz?2" rel="stylesheet" type="text/css" />
  <script src="/layout/3rdparty.js.gz?2" type="text/javascript"></script>
<?php endif ?>

It’s very simple: All the files with a “” comment on the same line got concatenated and gzipped into motherofall.css.gz and 3rdparty.js.gz respectively, like so:

MERGE_JS_FILES := $(shell grep '<script.*<!--MERGE ME-->' layout/layout.php|sed -e 's/^.*<script src="\/\([^"]*\)".*/\1/')
MERGE_CSS_FILES := $(shell grep '<link.*<!--MERGE ME-->' layout/layout.php|sed -e 's/^.*<link .*href="\/\([^"]*\)".*/\1/')
 
all: layout/3rdparty.js.gz layout/motherofall.css.gz
 
layout/3rdparty.js.gz: layout/layout.php $(MERGE_JS_FILES)
        cat $(MERGE_JS_FILES) | gzip > $@
 
layout/motherofall.css.gz: layout/layout.php $(MERGE_CSS_FILES)
        cat $(MERGE_CSS_FILES) | gzip > $@

Of course, I simplified away the rest of my Makefile. You may notice that I could have used yui-compressor or something alike to minify the concatenated files before gzipping them, but yui-compressor chokes on some of the third-party stuff. I am using it for optimizing my own css/js (again, only in production).

Safari ignores the `Content-Type` for anything ending in `.gz`

As far as the HTTP spec is concerned, “file” extensions mean absolutely nothing. They’re trivial drivel. Whether an URL ends in .gz, .css, .gif or .png, what it all comes down to is what the Content-Type header tells the browser about the response being sent.

You may have noticed me being lazy in the layout template above when I referenced the merged files:

<link href="/layout/motherofall.css.gz?2" rel="stylesheet" type="text/css" />
  <script src="/layout/3rdparty.js.gz?2" type="text/javascript"></script>

I chose to directly reference the gzipped version of the css/js, even though I had a .htaccess files in place (within /layout/) which was perfectly capable of using the right Content-Encoding for each Accept-Encoding.

`$ cat /layout/.htaccess`

AddEncoding gzip .gz
 
RewriteEngine On
 
RewriteCond %{HTTP:Accept-Encoding} gzip
RewriteCond %{REQUEST_FILENAME}.gz -f
RewriteRule ^(.*)$ $1.gz [QSA,L]
 
<Files *.css.gz>
ForceType text/css
</Files>
 
<Files *.js.gz>
ForceType application/javascript
</Files>

You may notice that the .htaccess file contains some configuration to make sure that the .gz files are not served as something like application/gzip-compressed.

Anyway, I went to see if there were any browsers left that do not yet Accept-Encoding: gzip and could find none. When, yesterday, I was faced with an unstyled version of my homepage, my first reaction was (after the one where I was like hitting reload 20 times, embarrassedly mumbling something about “those damn browser-caches!”): “O then, apparently, Safari must be some exception to the rule that browsers have all been supporting gzip encoding for like forever!”

No, it isn’t so. Apparently Safari ignores the Content-Type header for any resource with an URL ending in .gz. Yes, that’s right. Safari understands Content-Encoding: gzip just fine. No problem. Just don’t call it .gz.

The new `cat ; gzip`

So, let’s remove the .gz suffix from these files and be done with it. The .htaccess was already capable of instructing all necessary negotiations to be able to properly serve the gzipped version only when it’s accepted (which is always, but I digress).

A few adjustments to my Makefile:

MERGE_JS_FILES := $(shell grep '<script.*<!--MERGE ME-->' layout/layout.php|sed -e 's/^.*<script src="\/\([^"]*\)".*/\1/')
MERGE_CSS_FILES := $(shell grep '<link.*<!--MERGE ME-->' layout/layout.php|sed -e 's/^.*<link .*href="\/\([^"]*\)".*/\1/')
 
all: layout/3rdparty.js.gz layout/motherofall.css.gz layout/pfms.min.js.gz
 
layout/3rdparty.js: layout/layout.php $(MERGE_JS_FILES)
	cat $(MERGE_JS_FILES) > $@
 
layout/motherofall.css: layout/layout.php $(MERGE_CSS_FILES)
	cat $(MERGE_CSS_FILES) > $@
 
%.gz: %
	gzip -c $^ > $@

And here’s the simple change to my layout.php template:

<link href="/layout/motherofall.css?2" rel="stylesheet" type="text/css" />
  <script src="/layout/3rdparty.js?2" type="text/javascript"></script>

That’s it. I welcome back all 14 Safari users looking for paid work abroad! Be it that you’re looking for international work in Africa, in America, in Asia or in Europe, please come visit and have a look at what we have on offer. 😉

Atom and REST

October 9, 2009 / Rowan Rodrik

There’s is an interesting discussion about when (not) to use the Atom protocol for your REST API.

While looking at the trackbacks of that post, I came across a very nice post by Subbu detailing the differences between and uses of the Location and Content-Location HTTP headers.

XMLHttpRequest caching behaviour

October 9, 2009 / Rowan Rodrik

I was suprised to learn that the caching mechanism in the web browser often behaves differently for AJAX requests than it does for regular HTTP requests. Someone has created a set of functional tests to analyze differences in browser behaviour.

www.stichting-ecosafe.org

September 16, 2009 / Rowan Rodrik

Stichting EcoSafe is a Dutch foundation for the safe-keeping of the funds that are necessary for the maintenance of hardwood plantations. In July of 2006, together with Johan Ockels, I created a website for the Foundation. Johan was responsible for the organization of the whole process. This went very smooth and the website ended up being an emblem of simplicity and clarity. That’s why I wanted to blog a bit about it now, even though there are a few things that I’d probably end up doing different if I were to start from scratch. [There’s actually a disturbing number of things for which this is true, I’m coming to notice.]

The Welcome page of the EcoSafe website

EcoSafe page for plantations

EcoSafe cost structure

File structure

Like with most websites, I started with creating an SVN repo so that I wouldn’t have to be afraid of ever losing anything.

The file structure was pretty standard:

a css dir for stylesheets;
img for images;
inc for shared PHP and mod_include stuff and for AJAX partials;
jot for to-do’s and other notes;
and js for JavaScript files and libraries.

Possible file structure improvements

If I were to redesign this file structure, I’d collapse css, img and js into one directory called layout, because these are typically things that require the same robots.txt and caching policy. Also, it is meaningless to organize things by file extension. If you want to sort something by file extension, use ls -X (or ls --sort=extension if you’re on GNU).

Server-side includes

The site would be so simple that I felt that any type of CMS or content transformation would be completely unnecessary. Instead, I decided to rely on Apache’s mod_include and just use a few partials for repeating page elements such as the left sidebar containing the logo and the menu.

Also, because I didn’t need to transform the HTML files, I decided I could use good ol’ HTML 4 instead of XHTML 1 (which I’d have to send to the browser with the wrong mime-type anyway).

This is the HTML for contact.nl.shtml:

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/loose.dtd">
 
<html lang="en">
  <head>
    <title>Contact EcoSafe</title>
 
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
 
    <link rel="stylesheet" type="text/css" href="/css/style.css"></link>
  </head>
 
  <body>
    <!--#include virtual="/inc/left-side.en.html"-->
 
    <!--#include virtual="/inc/alt-lang.phtml"-->
 
    <div id="content">
      <h1>Contact</h1>
 
      <p>Your email to EcoSafe kan be sent to the following address:
      <a href="mailto:service@stichting-ecosafe.org">service@stichting-ecosafe.org</a>.
      Or, alternatively, you can fax us at +31 50 - 309 66 58.</p>
 
      <h2>About this website</h2>
 
      <p>For comments and/or suggestions concerning this website,
      you can direct an email message at:
      <a href="mailto:webmaster@stichting-ecosafe.org">webmaster@stichting-ecosafe.org</a>.</p>
    </div>
  </body>
</html>

Alternative language selection

I use  to include the repeating parts.  has several advantages over  in that it allows for content-negotiation, execution of dynamic content etc., but here the only place were it holds an advantage is in the inclusion of /inc/alt-lang.phtml. alt-lang.phtml is a messy PHP script that figures out which language variants of a page are available and displays a selection of alternative language versions (variants with a language different from the current).

SSI and caching

Without the XBitHack directive set to full, all content handled by mod_include is sent without a Last-Modified header. However, I don’t want to use XBitHack at all, because I don’t want just any executable file to be handled by mod_include; that just too much of a … hack.

If I were to do something similar now, I’d use some kind of (sed) substitution to pre-process the includes locally so that more of what I end up uploading is simple static content. The dynamic part of the included PHP script, I would simply replace with JavaScript.

Visual design

As you can see in the HTML example, there’s hardly anything layout oriented in the HTML source. This is good, and means that I have to touch only the CSS for most minor and major lay-out modifications. (It is a pipe-dream to think that you only need to change the CSS to make the same HTML page look however you want as long as that HTML is rich enough in meaning, but for a site with pages of such simple structure, it’s a dream that comes pretty close to reality.)

I’m not much of a designer, but I think design is overrated anyway. Actually, I think that most website suffer from too much design.

The EcoSafe logo

To start the design, I got a logo made by Huite Zijlstra. Because the logo was pretty big and didn’t look good scaled down, I decided to put it at the left of the content area instead of at the top. This would still leave enough room for the menu (which actually takes less space horizontally than the logo).

Colors

For the color scheme, I just picked a few colors from the logo. As always, the base of the scheme would be black text on a white background for maximum readability. The print version hardly uses any colors.

@media screen {
body            {:;  }
*               {:;             }
a:link          {: #585;              }
h1              {: #880;              }
h2              {: #888;              }
strong          {: #a62;              }
#menu li a      {: #660;              }
}

Underlines

I wanted an underline below the level 1 and 2 headings. Because I didn’t like the effect of text-decoration:underline (too thick for <h2>s, too dark for <h1>s and different from browser to browser) and because border-bottom was set too far from the text, I made two simple PNG images that I could repeat-x along the bottom edge.

@media screen {
h1 {:('/img/h1-border-bottom.png'); }
h2 {:('/img/hx-border-bottom.png'); }
}

The menu is very simple. The markup is part of inc/left-side.en.html for the English version and inc/left-side.nl.html for the Dutch version:

cat inc/left-side.en.html

<div id="left" lang="en">
  <a class="logo" href="/index.en"><img class="logo" alt="[Logo]" src="/img/logo.jpg"></img></a>
 
  <ul id="menu" class="menu">
    <li><a href="/index.en" rel="start">Start page</a></li>
    <li><a href="/plantations.en">For plantations</a></li>
    <li><a href="/investors.en">For investors</a></li>
    <li><a href="/history.en">History</a></li>
    <!--<li><a href="/goals">Goals</a></li>-->
    <li><a href="/methods.en">How it works</a></li>
    <li><a href="/cost-structure.en">Cost structure</a></li>
    <li><a href="/cost-calculator.en">Cost calculator</a></li>
    <!--<li><a href="/clients.en">Clients</a></li>-->
    <li><a href="/contact.en">Contact</a></li>
  </ul>
</div>
 
<script type="text/javascript" src="/js/menu.js"></script>

The EcoSafe menu (in English)

As is customary, I started by removing all the default list styles and made the anchors behave as block-level elements. I used the big O from the logo for bullets in the list (using background-image instead of list-style-image because the latter gives unpredictable cross-browser results and doesn’t make the bullet clickable).

#menu {
 : 2em;
 : 2em;
 :;
 : 0;
}
 
#menu li {
 : 0;
}
 
#menu li a {
 :;
 :('/img/o-21x16.png');
 :;
 :;
 : 30px;
 :;
 :;
 :;
 : #660;
}
 
#menu li a:hover,
#menu li.active a {
 :('/img/oSafe-21x16.png');
}
 
#menu a:hover {
 : #787800;
}

JavaScript menu item activation

To add the active class to the currently active list item (<li>), I used a client-side solution using JavaScript. After all, it’s proper use of JavaScript to enhance your user interface with it (as long as, as many would say, it isn’t required for the UI to function (as it is in the Cost Calculator)).

// menu.js
 
var menu = document.getElementById('menu');
var anchors = menu.getElementsByTagName('a');
var locationHref = window.location.pathname.toString();
  
for (i = anchors.length - 1; i >= 0; i--) {
  a = anchors[i];
  aHref = a.href;
    
  // Does this menu item link to the current page?
  // We find out by looking if the window location contains the URL in the anchor
  // or the other way arround. The reason to look at both is content-negotiation.
  // It's also true if the location is just '/' and we're looking at the anchor of
  // the 'start' page.
  if ( (locationHref === '/' && a.rel === 'start') ||
       (locationHref !== '/' && ( locationHref.indexOf(aHref) !== -1 ||
                                  aHref.indexOf(locationHref) !== -1 ) ) ) {
    a.parentNode.className = 'active';
    break;
  }
}

I actually just fixed a long-standing bug that was caused by me not being able to fully rely on HTTP language negotiation for the selection of the appropriate language variant, which made me change all links from being language-neutral to including the language in the link target (e.g.: http:///history became http:///history.en and http:///history.nl), the problem with this being that, instead of being able to link to link to http:/// (http://www.stichting-ecosafe.org/), I had to link to http:///index.en or http:///index.nl, making it more difficult to detect the active anchor if we’re requesting the home page through http:/// instead of on of its language-specific URLs.

The JavaScript rested on the assumption that by reverse iterating through all the anchors in the menu and thus processing the link to http:/// as last, I’d know that I had struck the home page and wouldn’t need to worry that any of the links contain a slash. (I don’t know if I intended it to work this way, but it sure seems to me now that the only way this could ever have worked was as an apparent side-effect of the looping order; the SVN logs seem to agree.)

I could have solved this by redirecting all requests for http:/// to the appropriate variant. Maybe I should have (to avoid duplicate content). Instead I chose to add a rel="start" attribute to the links to the home page, as can be deduced from the JavaScript above. (To resolve the duplicate content issue, I could also add a canonical link to the header of the two language variants.)

Anyway, all this brings me to the messy subject of content negotiation.

Content and language negotiation

The EcoSafe website would be bi-lingual (English and Dutch) from the onset. Initially, I wanted to use language negotiation to the extend of having completely language-neutral URLs. For example: http:///cost-calculator instead of http:///cost-calculator.en and http:///cost-calculator.nl. In the end, you can make this work properly in the browser with the help of a cookie, but it’s still a pipe-dream because nothing else will work if you do not also offer another navigational path to the different variants. Maybe, we’ll revisit this topic for a later experiment.

Content-type negotiation is almost effortless with Apache thanks to mod_negotiation. If, like me, you despise to have .html, .htm, .xhtml, .phtml, .pxhtml. .sxhtml, .php, .xml in your URL (I actually used all of these at some time or other), you only have to make sure that MultiViews is in your options.

I’ve configured SSI by means of the following instead of a “magic mime-type”:

AddType         text/html       .shtml
AddHandler      server-parsed   .shtml
AddCharset      UTF-8           .shtml
AddOutputFilter Includes        .shtml

For PHP I couldn’t do the same because my web host was still at Apache 1.3. Otherwise, the following should have worked equally well for PHP:

# This doesn't work with Apache 1.3
AddType        text/html       .phtml
AddHandler     php-script      .phtml
AddCharset     UTF-8           .phtml

Configuring language priority is easy with Apache:

Integrating PHP and SSI

The integration of PHP with all the weirdness that I had configured and created around SSI took some figuring out. Luckily, PHP offers a virtual() function that works roughly the same as mod_include's . Here’s an example:

<body>
  <?php virtual('/inc/left-side.en.html'); ?>
  <?php $uri = '/cost-calculator.en.phtml'; include('inc/alt-lang.phtml'); ?>

In retrospect, it’s pretty much bullshit to use it. I could have just as well require()d the partials (which I actually did for the alternate language selection), but I probably started out using virtual on a more generic URL without language and content-type selection in it.

406 handling

Because I deployed on Apache 1.3 and the ForceLanguagePriority directive was only introduced with Apache 2.0.30, I had to write an ugly hack to avoid visitors getting 406 errors. To that end, I added a 406 handler to my .htaccess file:

LanguagePriority en nl
ForceLanguagePriority Prefer Fallback # This doesn't work with 1.3
 
ErrorDocument 406 /error-406.php # Luckily, this does

error-406.php is a PHP file that figures out the available variants based on $_SERVER['REQUEST_URI']. Then, it simply picks the first one (which works because, accidentally, that’s the one I’ve given priority using the LanguagePriority directive as well), outputs a 200 OK header instead of the 406, and virtual()s the file of the variant. The code looks somewhat like this:

<?php
chdir($_SERVER['DOCUMENT_ROOT']);
$filenames = glob(basename($_SERVER['REQUEST_URI']) . ".*");
 
$filename = $filenames[0];
 
apache_setenv('DOCUMENT_URI', "/$filename");
 
header('HTTP/1.1 200 OK');
virtual("$filename");

EcoSafe Cost Calculator

EcoSafe Cost Calculator results

The Cost Calculator

The EcoSafe Cost Calculator is some of the least neatly contained and most procedurally oriented PHP code I’ve ever produced while knowing full well what I was doing. It does almost everything it does in global scope. Yet, it does it well.

The thing is designed as a dynamic web page rather than a web application. What I mean by this is that it’s simply two pages (one for English and one for Dutch) using PHP among a number of pages using SSI. In an application, it’s usual to have just one ‘view’ that is the same for all languages, but here I chose to put the different language versions in different language pages and then include everything reusable (and language-neutral) from within these files.

Most of the actual processing and calculating is done in inc/costs-functions.php. (The part about gotos is a joke. (Labeled blocks would have been quite sufficient. 😉 ))

<?php # costs-functions.php - Stuff that's includes by cost-calculator.{nl,en}.phtml
/**
 * Just remember that this code was never meant to be general purpose or anything.
 * So, relaxeeee and keep your OO-axe burried where it belongs.
 * Oh, if only PHP would support GOTO's ... Sigh ...
 */

The rest of the file is just a whole lot of processing of form data and turning it into something that can be easily traversed for display to the user. There are even the function calls without arguments doing all their work on globals. These are actually only added to make it clearer em a piece of code is doing. And—I must say—after a few years it’s still remarkably clear to me what each part of the code is doing. There’s no deep, confusing nesting structures or anything. There’s just a whole lot of very simple code.

Some simple AHAH increases form interactivity

Users of the calculator can add any number of plantings and locations. When the user decides to add a planting or a location, the onClick event triggers the execution of addExtraPlanting() or addExtraLocation(). Here’s how addExtraPlanting() looks:

function addExtraPlanting() {
  lang = document.documentElement.lang;
 
  new Ajax.Updater(
    'plantings', '/inc/planting.' + lang, {
      method: 'get',
      insertion: Insertion.Bottom
    }
  );
}

Ajax.Updater comes from the Prototype JavaScript framework.

Here’s what inc/planting.en.phtml looks like. The same file is also included in a loop to rebuild the form’s state after submitting.

<li>
  <input name="num_hectares[]" type="text" size="5" value="<?php echo $num_hectares ?>" />
 
  hectares have been planted in
 
  <select name="plant_years[]"><?php require('planting_options.php') ?></select>
 
  (<a title="Remove this planting" href="#" onclick="removePlanting(this); return false;">x</a>)
</li>

I think that I’ve gone into small enough detail by now to get to the conclusion. Also showing the contents of planting_options.php would be pushing it. Ah, well…

<?php
 
if ( !isset($this_year) ) $this_year = intval(date('Y'));
if ( !isset($plant_year) ) $plant_year = $this_year;
 
for ($i = $this_year; $i >= $this_year - 20; $i--)
  echo "<option" . ($i == $plant_year ? " selected='1'" : "") . ">$i</option>\n";

(Yesterday, I couldn’t resist the temptation of turning this into a simple file to require() instead of the function definition it was. I think it’s funny to refactor something to remove encapsulation.)

Conclusion

As is usual when looking at old code, I see many things that I’d do (even just a little) different today, but I saw a surprising number of solutions that I actually still like now that I see them back after three years. Removing some of the remaining warts probably won’t do much good besides the masturbatory satisfaction it could give me. (It’s likely that the website won’t live much longer, making such extra attention very undeserved.) But, nevertheless, I’ve enjoyed blogging about it now to recoup the whole experience and to at least look at what I’d do different now and what I learned in the meantime.

Some links

Separate development/production environments for WordPress

July 12, 2008 / Rowan Rodrik

When you’re out Googling on how to maintain a separate development environment for a WordPress installation, you will only stumble across information about how to install all kinds of WAMPP packages. Well, I don’t care about WAMP (or WAMPP). I want to be able to edit my theme, change my plugins, mess with my database locally and then deploy my changes when they’re ready and well-tested (as if I ever…)

Rails was the obvious inspiration for how to do this properly. In Rails, the whole development and deployment process is very intuitive and powerful. In WordPress documentation I never even see the awareness of the need for this separation. They usually tell you to download stuff, upload it and muck about with it on the life production server. But, I’m not the mucking-about-in-live-configurations type. I’m the I-fucked-this-up-so-often-I-want-a-staging-area type. This post is about how I managed to fulfill this wish with WordPress.

Changing the environment

The first thing I had to do was to find some way to decide which environment to go into. For some reason I decided to use Apache’s mod_rewrite to set an environment variable based on the HTTP Host header. This is in fact very illogical, but we’ll get to that later.

<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteCond %{HTTP_HOST} =bsblog [OR]
RewriteCond %{HTTP_HOST} =bsblog.molensteen
RewriteRule . - [env=WP_ENV:development]
RewriteCond %{HTTP_HOST} =blog.bigsmoke.us
RewriteRule . - [env=WP_ENV:production]
</IfModule>
 
# Keep out of WP's own block of rewrite rules below
# BEGIN WordPress

. - looks like a needle because it’s voodoo. The dot says I match anything and the dash says I change nothing of what I match. I do set an environment variable to whether I want to be in development or in production.

So I now have an Apache environment variable available for querying from within PHP (as if PHP doesn’t have a $_SERVER['HTTP_HOST'] variable:-? ) and I can make use of that in my wp-config.php.

Multiple configurations in one file

I love configuration files that share the program’s language; wp-config.php being simple PHP code is what make this whole thing so easy:

<?php
 
if ( getenv('WP_ENV') == 'production' ) {
  // ** MySQL settings ** //
  define('DB_NAME', 'blog');             // The name of the database
  define('DB_USER', 'wordpress');        // Your MySQL username
  define('DB_PASSWORD', '[my password]'); // ...and password
  define('DB_HOST', 'bigsmoke.db');        // 99% chance you won't need to change this value
  define('WP_SITEURL', 'http://blog.bigsmoke.us');
}
elseif ( getenv('WP_ENV') == 'development' ) {
  // ** MySQL settings ** //
  define('DB_NAME', 'bsblog');          // The name of the database
  define('DB_USER', 'root');            // Your MySQL username
  define('DB_PASSWORD', '[my password]'); // ...and password
  define('DB_HOST', '127.0.0.1');       // 99% chance you won't need to change this value
  define('WP_SITEURL', 'http://bsblog');
  //define('WP_DEBUG', true);
}
 
define('WP_HOME', WP_SITEURL);
 
// You can have multiple installations in one database if you give each a unique prefix
$table_prefix  = 'wp_';   // Only numbers, letters, and underscores please!
 
// The rest of the stuff in this config file just isn't interesting
?>

There’s a few things to note here. You have to use getenv() or $_SERVER instead of $_ENV because variables set by Apache end up in the former two. Another thing to note is that I should have just checked $_SERVER['HTTP_HOST'] instead of resorting to mod_rewrite voodoo. For the rest it’s all very straight-forward: I make some database settings depend on which environment I’m in and I set the URL accordingly.

Development URLs

I had some trouble putting the pieces back together when newer WordPress versions started doing automatic redirects for URL that didn’t match siteurl in the wp_options. This change meant that when going to http://bsblog/ (the development URL for this weblog) for example, I’d inevitably end up at http://blog.bigsmoke.us/.

Links had always been constructed according to this setting, so I had already been planning a plug-in to transform production URLs to development URLs. But, I learned (a little late) that this is completely unnecessary since wp-config.php supports the configuration of a base URL. I had wrongly assumed that settings that weren’t in the sample config file, simply didn’t exist.

Thus, after adding WP_SITEURL and WP_SITEHOME to wp-config.php, everything was working.

Ideas to further enhance your configuration

Don’t limit yourself to one development environment if you have more than one development server.
Automate your deployment process. I use rsync for this.
Write a script to clone your production database to your development database. There’s no substitute for actual data.

Native PostgreSQL authentication in Rails with rails-psql-auth

July 16, 2007 / Rowan Rodrik

A while ago, I wrote a PostgreSQL auth plugin for Rails. The plugin basically defers all authentication and authorization worries to the database layer where they are supposed to be taken care of anyway.

Using this plugin, the user is asked for his or her credentials using a HTTP Basic authentication challenge. (The code for this is adapted from Coda Hale‘s Basic HTTP authentication plugin.) It’s possible to specify a guest_username in the database.yml which will be used as a fall-back if no credentials are supplied. After successful login or if a guest user is found, the plugin will make sure that all database operations run as that user. If any operation fails due to insufficient user rights, the user will be prompted for a username/password pair again.

Detailed and up-to-date documentation for the plugin can always be found at the plugin’s homepage. Go to the plugin’s project page for getting help or for reporting issues with the plugin.

Apache’s `ForceType` directive overrides `AddCharset` directives

February 26, 2007 / Rowan Rodrik

Yesterday, after uploading a refreshed www.sicirec.org, some character encoding issues popped up because I had converted the website’s content from ISO-8859-1 (Latin 1) to UTF-8. (I wanted to be able to type and paste special characters from PuTTY into VIM without worrying about the particular encoding of each file.)

The Apache HTTPD at InitFour, our webhosting provider, is configured to send ISO-8859-1 by default, while the one on our test server is configured for UTF-8. This caused a little bit of a surprise when I uploaded the refreshed website and saw all characters outside the ASCII range mangled on the life website!

I quickly dug into my .htaccess file to add the AddCharset utf-8 .xhtml directive. To my surprise, this didn’t do squat. A lot of fiddling, reloading and researching later, I realized that the following section in my .htaccess file rendered the AddCharset directive irrelevant:

<Files *.xhtml>
ForceType text/html
</Files>

I had to change the ForceType directive to include the charset as a MIME parameter:

<Files *.xhtml>
ForceType 'text/html; charset=UTF-8'
</Files>

Now, it all seemed to work. (Except that it didn’t really because I do some ridiculously complex content negotiation stuff involving a 406 handler in PHP that virtuals the most appropriate variant when no match is found. This script didn’t send a useful Content-Type header. After first adding it to the script, I noticed that the AddDefaultCharset is actually allowed in .htaccess context—a discovery which luckily rendered the other hacks useless.)