/usr/portage

Antipattern: chaining stateless protocol requests

As we all know, HTTP is a stateless protocol. We use all sorts of hacks to add state, like ext/session in PHP. While such hacks work great for a lot of use cases, we should remind ourselves that they are hacks. There is a phenomenon of state creep: coupling unrelated HTTP requests. Think of a page that references a thumbnail in an <img/> tag, where the picture is generated as needed: it is possible to generate that image in the context of the request for the page that embeds it. So the template calls a helper to generate the thumbnail, and the thumbnail is written to the file system.

While this works well for a single host, say your personal weblog about cooking and cats, it won’t work for something serious. Once you start load balancing between two webserver nodes you get burned, because you can’t guarantee that the image is present on the node that receives the image request (besides, you end up generating the image n times, where n is the number of nodes). The solution is not that hard: pregenerate all the images with a queuing system and display “This image is currently not available” placeholders as long as they are not ready or, if there are only a few image uploads, generate them at upload time. The other option is to generate them on the fly when they are requested. If you do the latter, do it in the context of the request that retrieves the image, not in the embedding context (the page that embeds the image). Generating on the fly means that you deliver your files through PHP or something similar: this is fine as long as you have an HTTP accelerator in place.
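To make the difference between the two contexts concrete, here is a minimal sketch in Python rather than PHP. Everything in it is an assumption for illustration: the /thumbnails/ URL scheme, the function names, and the resize callable, which stands in for whatever image library you use. The cache directory is assumed to live on storage that all nodes share.

```python
from pathlib import Path
from typing import Callable


def thumbnail_url(image_name: str) -> str:
    """Embedding context: the page template only emits a URL.

    No image work and no file system access happens while the page renders.
    """
    return f"/thumbnails/{image_name}"


def serve_thumbnail(
    image_name: str,
    originals: Path,
    cache: Path,
    resize: Callable[[bytes], bytes],
) -> bytes:
    """Image context: called by the handler for GET /thumbnails/<image_name>.

    The thumbnail is generated by the request that actually wants it.
    """
    # Refuse names with directory components (e.g. "../secret").
    if Path(image_name).name != image_name:
        raise ValueError(f"invalid image name: {image_name!r}")

    cached = cache / image_name
    if cached.exists():
        return cached.read_bytes()

    data = resize((originals / image_name).read_bytes())

    # Write to a temporary name first and rename afterwards, so a
    # concurrent request never reads a half-written thumbnail.
    cache.mkdir(parents=True, exist_ok=True)
    tmp = cached.with_name(cached.name + ".tmp")
    tmp.write_bytes(data)
    tmp.replace(cached)
    return data
```

The page request stays cheap, and whichever node receives the image request can answer it on its own. With an HTTP accelerator in front, serve_thumbnail should only be hit on a cache miss.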

One of the systems that does it in the way described above is Drupal. I’ve implemented MogileFS-based image storage and retrieval for Drupal and, let me say, it was not a pleasure.

On a side note: HTTP 1.1 clients typically fetch a page’s resources in parallel, which makes generating images in the wrong context even worse from a user experience point of view. Instead of each thumbnail being generated in its own parallel request, the page will not show up until every thumbnail has been generated.


8 Hints out of Testing-Turmoil

  1. Have a continuous integration solution in place. Really. If you don’t, you just burn money by writing tests. I would go so far as to say that if you don’t have continuous integration, you should stop writing unit tests and do click testing. Let your CI system generate API docs, high-level docs, code coverage reports, testdox and every bit of static analysis info you generate.
  2. The definition of “tests pass” is “tests pass on the continuous integration system”. “Works for me” has no place in the bugtracker or anywhere else.
  3. If you can’t test it, the architecture is most likely wrong (exceptions are session- and caching-related code, which is generally hard to test). Testability should be your main concern when writing code. What’s the use of fast or wonderful-looking code if you can’t repeatably prove it is working?
  4. Prefer method calls over annotations. A typo in a call to setExpectedException() will trigger an obvious error, while a typo in an @expectedException annotation is silently ignored and will lead to an Obscure Test, and most likely a Mystery Guest.
  5. Run the whole test harness twice. This will help to identify setup/teardown bugs. Create a randomized test suite to identify the hard-to-track mistakes.
  6. Run your test suite really often. We run it every minute with a 15-second delay, and I’m pretty happy with it.
  7. Use good test names that describe the behavior of the unit. The behavior is not the name of the unit under test, which I can see in the code anyway; it is something like “calling register changes the status of the user to foobar”, so a good test name would be “testRegisterChangesTheStatus …”.
  8. Aim for 100% code coverage. 95% is nothing to be proud of; I can guarantee that the missing 5% will be the hardest part.
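Hint 5 can be sketched in a few lines. The following uses Python’s unittest rather than PHPUnit, purely as an illustration; run_twice_shuffled, RegisterTest and REGISTRY are hypothetical names made up for this example. The test case contains a deliberate teardown bug that a single run does not reveal.

```python
import random
import unittest

# Hypothetical shared state that a test fixture modifies.
REGISTRY = []


class RegisterTest(unittest.TestCase):
    def setUp(self):
        # Deliberate bug: nothing ever removes this entry again.
        REGISTRY.append("user")

    def test_register_adds_exactly_one_user(self):
        self.assertEqual(len(REGISTRY), 1)


def run_twice_shuffled(case_classes, seed):
    """Run every test of the given TestCase classes twice, in random order.

    The seed is passed in explicitly so a failing order can be reproduced.
    """
    loader = unittest.defaultTestLoader
    rng = random.Random(seed)
    suite = unittest.TestSuite()
    for _ in range(2):
        # Load fresh test instances for each pass and shuffle them.
        tests = [
            test
            for cls in case_classes
            for test in loader.loadTestsFromTestCase(cls)
        ]
        rng.shuffle(tests)
        suite.addTests(tests)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling run_twice_shuffled([RegisterTest], seed=1) passes the test on the first pass and fails it on the second, because the entry left behind by the first setUp is still in REGISTRY. Log the seed on the CI system so that an order-dependent failure can be replayed.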


Recovering a software RAID

The scenario: my RAID crashed because I messed around with the partition table of one of its disks. As a result, the RAID array cannot assemble itself, because the superblock of the messed-up device is invalid. The trick is pretty easy: just recreate the whole RAID with mdadm. The data already on the disks is not wiped; mdadm writes new RAID metadata and then resyncs the array. I used to have a simple RAID1, but I’ve now recreated it as an incomplete RAID5 (--level=5, --raid-devices=2), as the missing disk will be bought soon.

$ mdadm --create /dev/md0 --level=5 --raid-devices=2 /dev/<original> /dev/<crashed>

If you would rather stick with RAID1 and not migrate to RAID5 along the way, just use --level=1 instead. I’m not really sure whether the order of the disks matters, and I’m not brave enough to find out.

Tomorrow I’m going to buy the next disk for the RAID to make sure the redundancy level is alright. Generally I’m pretty amazed that this kind of setup is so robust. Even me messing around with it can’t bring it down.
