Posts

Problem Solving

June 25, 2009

It occurred to me recently that I have this notion of programming as a process that involves breaking a problem down into sets of smaller and smaller problems until I have something I know how to solve. (I mentioned this to Steve yesterday which reminded him of a joke about an engineer and a mathematician.)

I have previously just assumed that I therefore follow this process when I’m actually problem solving; however, on reflection, I’m not so sure. More specifically, I’m either not doing it at all or, at the very least, I’m doing it intuitively.

I wonder how many people do (or have done) this as an explicit part of their own problem solving and if so, what effects they’ve noticed as a consequence.

No, Sleep, Till Bedtime

By Simon Harris

June 14, 2009

Or at least until all those Twitter client developers have fixed their Twitpocalypse bugs. In case you didn’t know, a few days ago the ID range used for Twitter’s messages exceeded 2^31 (approximately 2 billion) causing any apps that stored them as 32-bit integers to think they were really small negative numbers.
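To make the wrap-around concrete, here is a minimal Ruby sketch. It assumes nothing about how any particular Twitter client actually stored its ids; as_signed_32_bit is a hypothetical helper that reinterprets the low 32 bits of a number as a signed integer, which is effectively what a 32-bit integer field does.

```ruby
# Hypothetical helper: reinterpret the low 32 bits of an id as a signed
# 32-bit integer, mimicking what a 32-bit integer field would hold.
def as_signed_32_bit(id)
  [id & 0xFFFFFFFF].pack("L").unpack("l").first
end

largest_safe_id = 2**31 - 1
first_broken_id = 2**31

puts as_signed_32_bit(largest_safe_id)  # still the id we were given
puts as_signed_32_bit(first_broken_id)  # wraps around to a negative number
puts first_broken_id.to_s               # stored as text, the id is untouched
```

The last line is the whole argument for text columns in miniature: a string has no range to overflow.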

My usual – and I say usual because I don’t always adhere to it – policy for storing external identifiers is to treat them as text, even when I know they are numbers. Why? Essentially because I consider it a coincidence that they’re numbers. That identifier, number though it may be, has no special significance to me over and above being an opaque handle to some entity in another system. As such, I like to treat them as text.

Discussing this with a good friend and colleague of mine, the question of column width came up. I.e., if you’re going to make it text, how long should the column be? If you’re lucky enough to be using a database such as PostgreSQL, then the answer is: it doesn’t matter – there’s no performance benefit to artificially limiting the size of the column. For other databases, the common practice is to use something like VARCHAR(255). Think about it: even if the identifier is a number, 255 digits leaves room for values approaching 10^255!

Twitter claims that its API is RESTful. And if to you, REST means nice, predictable URLs with some semantic path possibly followed by a numeric id and returning numeric ids in search results, then yes, it’s RESTful. Want to see the most recent messages for a user? There’s a simple HTTP request you can make to a nice, semantic (if you speak English) URL that returns a list of them and their identifiers. And, as expected, our Twitter clients have been dutifully squirrelling these ids away in integer fields (probably because that’s the default) and all has been well until 2 days ago.

Now, without going into too much of an ideological rant, I happen to subscribe to the principle that RESTful URLs should be opaque. That is, a URL is a URL is a URL. No slicing, no dicing, no assembling, no joining. If I have a URL to a resource then that’s what I use. Period. End of story. (You can find plenty of discussion on this by Roy Fielding using Google.)

So, back to our column widths. Assuming we have a text field in our database large enough to accommodate a URL, we could go one step further. Rather than treat the identifier as text and storing that, why not go the whole hog and store the URL instead?

As far as I can tell, the only reason is that Twitter’s API, RESTful though they may claim, sends back numeric identifiers rather than URLs which in turn leads developers to incorrectly assume that they should be storing them as numbers.

On their own, identifiers are meaningless and in fact, useless. To utilise an identifier requires us to know the system in which it is stored and the collection in which it belongs. If instead each piece of information was identified by a URL we get all that context for free and the power to share information grows phenomenally.

To me, the beauty and power of the internet is the ability to link together disparate systems in ways no one had previously imagined. More specifically, in ways the publishers of the information never considered.

Opaque URLs combined with idiomatic use of HTTP verbs can help reduce the coupling between producers and consumers by giving back control to producers in how and where they store information and at the same time increasing the freedom for others to share and use that information.

(That last paragraph reads like an Amnesty International commercial!)

Shameless Self Promotion

By Simon Harris

May 15, 2009

So the past couple of months, I’ve finally had the luxury of starting to realise my (and Cogent’s) dream of doing product development.

We just recently launched what we hope is a very simple, easy to use and somewhat opinionated web application for Getting Things Done™ (GTD). It’s a crowded market to be sure but we really believe we understand GTD well enough to deliver a system that is more than just a to-do list with GTD inspired keywords.

Runway is still in the early stages of feature development. For those that know anything about GTD, you’ll be happy to hear that we’re working on delivering Projects, Artifacts, Agendas and of course an Inbox, to name but a few, in the very near future.

What you see now is, and will always be, free. At some point we’ll be adding pay-for features but we’ll also be doing the right thing by all our early adopters. So, if you have 5 or so minutes, we’d love for you to sign up, have a play, and of course, tell us what you think.

Web standards and all I got was this lousy website

By Simon Harris

April 17, 2009

Over the Easter long weekend, I had a great break from work and a great opportunity to think about and reflect on my career, my job, and my profession as a whole. It’s safe to say I became a bit disheartened and disillusioned. The one striking conclusion I kept arriving at is that we are so technology focused that we spend too much time, money and effort building things that the customer is “happy” with but not blown away by. That we artificially constrain the end user experience based on our notions of “correctness”. In particular, web application development is largely a bunch of dick-pulling technical masturbation, forever re-inventing the wheel at a ridiculously low level of abstraction, shoving our technological solutions down users’ throats in the name of “software engineering”. What’s worse is that I’ve been complicit. Not only by buying the hype but often by trying to do “the right thing” even when I felt as though I was bashing my head against a brick wall.

Rewind the clock to somewhere between 1996 and 1999. During those years I, along with a good friend and colleague, built a desktop application that was delivered to thousands of users across Australia using nothing more than good old-fashioned client-server SQL written in, of all things, PowerBuilder – kinda like VisualBasic. More than 10 years ago, the user experience was compelling and sophisticated, it performed exceptionally well over 2400 baud dialup modems, and we built the initial release with only 2 people over 3 months from scratch. As shameful as it is, especially coming from one so vocal about automated testing as I, we had nothing but manual testing but we also had few bugs and when users did find a problem, we fixed and redeployed within 24 hours – mostly because we didn’t want to interrupt users as they worked and so waited until after hours. Over the next 12 months, we were able to adapt to the user’s needs immediately. Rarely did we add the features as requested but we always managed to produce a solution they actually needed. Fast forward a decade and I feel like I suck because I honestly don’t believe I could do the same thing again today. In fact, I challenge any of us to build the same user experience with our existing technology stack.

To those that know me well, I will no doubt sound like a broken record but I can’t help but feel we’ve been trying to coerce HTML & CSS into something they just aren’t and doing so for a decade now.

Think about it, HTML: HyperText Markup Language. Does that sound like it has anything to do with layout and design? In fact do you know any designers, even those that call themselves web designers, that do any of their design work in HTML/CSS? No – well none that I’ve ever heard of. The closest I can think of is a colleague who does his wireframes in OmniGraffle and then generates HTML/CSS. Why? I put it to you it’s because we don’t think in HTML/CSS. You CAN’T effectively think in HTML/CSS and if a guy whose expertise lies in designing user interfaces can’t think in terms of HTML/CSS why the hell do we think we should?

HTML was designed for linking documents with a modicum of layout and has served that purpose admirably. As a result, the web browser largely won the battle for desktop supremacy and almost everyone has a web browser and regularly uses a number of web sites. Similarly, pretty much everyone has a computer running an 80x86 based CPU and runs dozens of applications built specifically for it. HTML/CSS are the machine language of the web.

For those of us lucky enough to have done any assembler programming, we’ve also been lucky enough not to have had to do any for a very long time. Instead, we chose to move away from assembler to other languages. C, C++, Java, Smalltalk, Python, Perl, Ruby, literally dozens of other programming languages that have systematically improved the level of abstraction. Many of these languages now run on top of the JVM, LLVM, CLR, etc. themselves abstractions on top of the underlying CPU.

Did we move because the runtime was faster? Hardly. In fact in almost all cases outrageous claims were made early on that poor performance would be the undoing of these languages and in almost all cases these claims ultimately proved unfounded. No, we moved to these languages because we hoped they would give us a better level of abstraction. That we could code more closely to the way we think. That we would one day realise the dream of literally thinking in code.

Even within languages we constantly strive to improve the level of abstraction. In many cases we’ve created Domain-Specific-Languages in order that we are better able to think IN the language most appropriate to the task at hand rather than needing to perform some contorted mapping process. This is the reason the Ruby community has slowly moved from Test::Unit to RSpec/Shoulda: Test::Unit does the job just fine but it’s verbose and “too close to the metal”. Just like assembler. When I’m the most productive I’m literally thinking in code.

We’ve largely sorted the back-end problems: database access layers, routing, data format conversion, validation, you name it, it’s all been largely worked out in whatever framework and language combination you can imagine. The same cannot be said of the front-end WHERE IT ACTUALLY MATTERS.

Granted, HTML/CSS has undergone change but to what extent and to what end? We have JSP, ASP, ERB, HAML, SASS, Liquid, blueprint, jQuery, Prototype, MooTools, Dojo, YUI, etc. but none of them appreciably raises the level of abstraction. Most advances in the world of HTML/CSS are lipstick. They’re all constrained by the fallacy that HTML/CSS is the holy grail of web design. No, the whole problem with web development is that we haven’t abstracted away the underlying technology, instead we’ve been conned by a bunch of HTML/CSS gurus and boffins who think that designing the perfect machine code is all the world needs. There is nothing more primitive than HTML+CSS when it comes to the web.

HTML & CSS try to be all things to all people and by doing so, much like J2EE, we ended up with a set of primitive tools that are repetitive, verbose, hard to test, maintain and refactor and ultimately provide a user experience that can best be described as a tarted up, 24-bit 3270 terminal. Don’t believe me? Point me at a website where the user experience feels liquid and natural. Where it literally gets out of your way so that you never even realise you’re using it? For the most part you can’t. The poster children of the Rails world provide at best a rudimentary user experience. I suspect people use them because there is no alternative, not because it’s actually a great UX. Why? IMHO because the technology choices are just plain awful. If you can find a website with a rich user experience that just melts away, you’ll likely find a bunch of developers who either had nervous breakdowns or spent many years building some superduper framework, or both!

To be fair I’m no doubt coming across as though HTML/CSS is to blame for all the world’s problems. Not at all. We suffer from similar problems across the board in software development. It just so happens that I’ve been in the world of web development for a long time now and feeling the effects.

I’m not advocating the use of any particular technology – that would kinda defeat the purpose of my argument. What I am saying is that I believe we’re stuck in a mindset that only allows us to think inside the incredibly narrow bounds of something we’re used to, IMHO, only because it’s all we’re used to.

Rather than embracing the “web paradigm” how about we embrace the user and their experience and decide what technology would best enable us to deliver that.

A Title Case Gem for Ruby

By Simon Harris

February 2, 2009

A project I’m working on called for some “smart” capitalisation of page titles. Essentially I wanted to take a URL slug and generate a page title.

Rails comes with a built-in String#titleize method that capitalises every word but that looked a little odd when the title was something like: “My Hovercraft Is Full Of Eels”. So I went on a hunt for something “smarter”.

After a little search I stumbled upon Marshall Elfstrand’s JavaScript, Ruby, and Objective-C ports of John Gruber’s “Title Case” algorithm and decided to turn it into a Gem that adds String#titleize and String#titleize! (aliased as #titlecase, and #titlecase! respectively). When used in a Rails environment, this effectively replaces the Rails versions.

Now my page titles look a little more human-like: “My Hovercraft is Full of Eels”.
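For a feel of what “smarter” means here, the following is a much-simplified sketch of the idea, not the gem’s code and not a faithful port of Gruber’s algorithm. The method name smart_titlecase and the SMALL_WORDS list are assumptions made for illustration; “is” is in the list only so that the result matches the example above.

```ruby
# Illustrative only: small words stay lowercase unless they start or end
# the title. The word list is an assumption, not the gem's actual list.
SMALL_WORDS = %w[a an and as at but by for if in is of on or the to via].freeze

def smart_titlecase(text)
  words = text.split(" ")
  titled = words.each_with_index.map do |word, position|
    first_or_last = position.zero? || position == words.length - 1
    if !first_or_last && SMALL_WORDS.include?(word.downcase)
      word.downcase
    else
      word[0].upcase + word[1..-1]
    end
  end
  titled.join(" ")
end

slug = "my-hovercraft-is-full-of-eels"
puts smart_titlecase(slug.tr("-", " "))
```

The real algorithm handles a good deal more (punctuation, words with internal capitals and so on), which is exactly why it is worth reaching for the gem rather than a sketch like this.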

Plugins move

By Simon Harris

January 18, 2009

Following hot on the heels of my blog move, I’ve finally moved all my Rails plugins off the venerable RubyForge and onto GitHub.

Since I started working at CogentConsulting – no we’re not “The Company of ex-ThoughtWorkers” unless you count all 3 of us as somehow statistically significant – I’ve had less and less time and less and less inclination to spend any appreciable effort on RedHill related stuff to the point where the company really exists just to support and market Simian.

As a consequence, I’ve also dropped the RedhillOnRails moniker in favour of publishing the plugins under my personal account.

Blog move

By Simon Harris

January 17, 2009

If you’re reading this then the move of my blog was successful, and thank you for putting up with a screwy RSS feed during the transition. No doubt you received double or possibly even triple posts.

Why the move? Well, even though GeekISP have been a fantastic hosting provider over the years and MovableType has been pretty reliable as a blogging platform, in my never ending quest to Do Less Stuff, I figured it was time to move the pain somewhere else.

From a technical perspective, the move was fairly easy though not without some pain. There is no direct way to import from MT to Blogger; however, I did find a tool that helped convert the MT export file into something Blogger could import.

I also wrote a quick script to replace all internal references with new links as well as generating a new .htaccess file for any links from the outside world. This step was pretty easy although it took some trial and error to work out how Blogger converts titles into URLs – as near as I can tell it truncates to a maximum of 40 characters with a bias towards word boundaries. The duplicate posts appearing in the RSS feed were a direct result of me re-creating the entire blog several times fixing little things here and there.
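As a rough illustration of that guess, and only the guess, here is a sketch that builds a slug word by word and stops before it would exceed 40 characters. The method name approximate_slug and the character-stripping rules are assumptions; Blogger’s real behaviour is not documented here and may well differ.

```ruby
# Hypothetical approximation of the observed behaviour: lowercase the
# title, drop punctuation, then add whole words until the limit is hit.
def approximate_slug(title, max_length = 40)
  words = title.downcase.gsub(/[^a-z0-9\s-]/, "").split
  slug = ""
  words.each do |word|
    candidate = slug.empty? ? word : "#{slug}-#{word}"
    break if candidate.length > max_length
    slug = candidate
  end
  # Fall back to a hard cut if even the first word is too long.
  slug.empty? ? words.first.to_s[0, max_length] : slug
end

puts approximate_slug("Web standards and all I got was this lousy website")
```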

And so it is that my blog comes to be here on Blogger. The next step is to move all my domain hosting to Google Sites but that’s for another day. Hopefully this will be the last move for some time and, with someone else maintaining my blogging software, hopefully less stuffing around on my part.

Acts As Teapot

By Simon Harris

January 12, 2009

No, it’s not April Fools yet but I thought I’d get in early this year. Acts As Teapot is a Ruby on Rails plugin that ensures your Ruby on Rails applications conform to RFC2324. My assumption here is that your application is not a coffee pot and therefore does not understand the Hyper Text Coffee Pot Control Protocol (HTCPCP/1.0). Thus, if ever a BREW request or any other request with the Content-Type set to “application/coffee-pot-command” is received, the server will respond with 418 I’m a teapot.
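The behaviour is simple enough to sketch as a Rack-style middleware. This is not the plugin’s actual code; TeapotSketch is a made-up name and the plain-text response body is an assumption. It relies only on the standard Rack environment keys REQUEST_METHOD and CONTENT_TYPE.

```ruby
# Illustrative Rack-style middleware, not the Acts As Teapot source.
class TeapotSketch
  COFFEE_POT_TYPE = "application/coffee-pot-command"

  def initialize(app)
    @app = app
  end

  def call(env)
    if coffee_request?(env)
      [418, { "content-type" => "text/plain" }, ["I'm a teapot"]]
    else
      @app.call(env)
    end
  end

  private

  # A BREW request, or any request carrying a coffee pot command.
  def coffee_request?(env)
    env["REQUEST_METHOD"] == "BREW" ||
      env["CONTENT_TYPE"].to_s.start_with?(COFFEE_POT_TYPE)
  end
end

# A stand-in application that answers everything else with 200.
app = ->(_env) { [200, { "content-type" => "text/plain" }, ["OK"]] }
teapot = TeapotSketch.new(app)

puts teapot.call("REQUEST_METHOD" => "BREW").first
puts teapot.call("REQUEST_METHOD" => "GET").first
```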

Rails, meet Drupal.

By Simon Harris

January 12, 2009

If you’ve been considering integrating (or replacing) your Drupal application with a Rails application, then Drupal Fu may come in handy.

It’s pretty rough-and-ready – I essentially just ripped the code out of an existing application and cobbled it together – with, as yet, no plugin infrastructure, Rakefile, or anything else that might give you a degree of confidence in the quality of the code :)

That said, the code has been working in a production application for a while and we figured it might help out some others going through the same pain.

TimeMachine FTW!

By Simon Harris

July 25, 2008

Notwithstanding the fact that I needed to restore my operating system in the first place – due to an inexplicable and catastrophic failure of the Java installation resulting in segfaults – I was able to restore my entire 100GB system in around 4 hours. For posterity:

1. Boot off the OS X System Install DVD – hold down option while the system starts
2. Connect the external drive with the TimeMachine backup – in my case a TimeCapsule attached via ethernet
3. Select “Restore from TimeMachine backup” in the Utilities menu
4. Select the specific backup (by timestamp) from which to restore
5. And away you go!

The disk is then automatically erased and a fully bootable system is restored sans temp directories and cache files. It even managed to restore my PostgreSQL databases that were running at the time – which probably says more about PostgreSQL than anything.

The one grumble I do have is that the timestamps in the name of the backups were some non-obvious period relative to the actual date the backup was made. The difference wouldn’t have been much of an issue had I simply needed to restore the most recent backup but as it turned out I needed to go back a couple of days in order to get a clean system. Thankfully I got lucky on the second attempt :)

Once I had restored the system I took a look at the backup folders and sure enough there are two timestamps: the one in the folder name, and the created date. The created timestamp was spot on but the one in the folder name – the one presented to you when restoring – was whacky. I honestly didn’t spend long enough to calculate if the difference was consistent.

What is really interesting is that I had SuperDuper! on my list of software to start using but it would appear there is little need – at least in my case.

Generating lots of little test cases

By Simon Harris

June 7, 2008

When writing code that is largely algorithmic, I find I end up writing specs that sit in a loop repeating the same operations over a set of data.

This works well enough but it has the downside that the tests abort as soon as a single failing case is detected which can lead to vicious cycles of fixing one case only to find you’ve broken another.

The solution is of course to break out each expectation – inputs and expected outputs – so they are all run and reported individually. Doing so by hand however, is tedious to say the least so why not generate them on the fly instead:

describe Fibonacci do
  [[0, 0], [1, 1], [2, 1], [3, 2], [4, 3], [5, 5], [6, 8]].each do |input, output|
    it "should generate #{output} for #{input}" do
      Fibonacci.calculate(input).should == output
    end
  end
end

This is such an elegant solution I’m not sure why it only just occurred to me.

Prevent accidental deployment with a prompt

By Simon Harris

June 7, 2008

This morning I went to push out a new version of an application to our staging environment on an Engine Yard slice. I knew I had done exactly that last night so I navigated through my bash history and hit enter. Two minutes later the new version had been deployed and I was about to walk out the door to do some chores before coming back to start playing with the app. Thankfully, Steve came online and informed me that production was broken.

A quick look through my bash history and it seemed I’d used the deploy to production rather than deploy to staging but, being in a hurry, hadn’t looked carefully enough. Of course some might argue that I should have looked more carefully. That I shouldn’t have deployed before heading out. All valid points but I very rarely have full control over what’s going on around me. So, while it’s all very well and good to hope that I will be more careful next time, that’s a bit like hoping global warming isn’t a reality: we all hope it’s not but maybe we should do something about it just in case?

And so I added the following at the start of the :production task in my deploy.rb file:

unless Capistrano::CLI.ui.agree("Are you sure you want to deploy to production? (yes/no): ")
  puts "Phew! That was a close call."
  exit
end

For all other environments, the deployment goes through without question. Attempt to deploy to production however, and I’m now forced to be explicit about my intentions.

Quickly Migrate all database times to UTC

By Simon Harris

June 3, 2008

If you’re thinking about updating to Rails 2.1 to get the timezone support, you’ll need to update all database records to UTC. Here’s a quick migration script to do just that:

class ConvertTimestampsToUtc < ActiveRecord::Migration
  # Assume all times were in UTC+10:00
  OFFSET = "interval '10 hours'"

  # Adjust any date/time column
  COLUMN_TYPES = [:datetime, :timestamp]

  def self.up
    adjust("-")
  end

  def self.down
    adjust("+")
  end

  private

  def self.adjust(direction)
    connection = ActiveRecord::Base.connection
    connection.tables.each do |table|
      columns = connection.columns(table).select { |column| COLUMN_TYPES.include?(column.type) }
      updates = columns.map { |column| "#{column.name} = #{column.name} #{direction} #{OFFSET}"}.join(", ")
      execute("UPDATE #{table} SET #{updates}") unless updates.blank?
    end
  end
end

As you can see, I’ve assumed that the dates were previously stored as AEST (UTC+10:00) so you’ll likely need to adjust that, and I’m also assuming PostgreSQL for date manipulation though it should be pretty simple to convert to run under MySQL. It may even work as is.
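If the direction of the adjustment seems backwards, a quick sanity check in plain Ruby may help. The sample date is arbitrary; the only assumption carried over from the migration is the fixed +10:00 offset.

```ruby
# 9am on 3 June in AEST (UTC+10:00) ...
local = Time.new(2008, 6, 3, 9, 0, 0, "+10:00")

# ... is 11pm on 2 June in UTC.
utc = local.getutc
puts utc

# The migration does the same thing to a "naive" stored value:
# it subtracts ten hours on the way up and adds them back on the way down.
stored = Time.utc(2008, 6, 3, 9, 0, 0)
migrated = stored - (10 * 60 * 60)
puts migrated == utc
```

Note that a fixed offset ignores daylight saving, so records written while a different offset was in force would need separate handling.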

Deploying branches with Capistrano

By Simon Harris

May 30, 2008

This morning I had occasion to deploy a branch of a git repository to a staging server but hadn’t the foggiest idea how. A quick search through the Capistrano source code revealed that I could use set :branch, "branch_name" in my deploy script. I tried it and it worked. I then figured I would need to make a similar change across all my branches. Of course, I’m a lazy sod and wondered if there wasn’t a better way.

If you’re not familiar with git, the output of the git branch command is a list of branches with an asterisk marking the one currently checked out on your local machine. For example:

> git branch
* drupal_authentication
  fragment_caching
  master

So, I figured, what if I just parsed the output and searched for the branch marked as current:

set :branch, $1 if `git branch` =~ /\* (\S+)\s/m

Now I’m able to deploy whatever branch is current on my local machine from a single, shared, deploy script.
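To see the parsing in isolation, here is a small sketch that runs the same idea against a canned copy of the output rather than shelling out. The helper name current_branch is made up for the example.

```ruby
# Hypothetical helper: pull the branch marked with an asterisk out of
# the text printed by `git branch`.
def current_branch(branch_output)
  branch_output[/^\* (\S+)/, 1]
end

sample = "* drupal_authentication\n  fragment_caching\n  master\n"
puts current_branch(sample)
```

If parsing human-readable output feels fragile, git rev-parse --abbrev-ref HEAD prints just the name of the current branch and could be used in the backticks instead.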

Giving the Anchor tag some Ajax Lov'n

By Simon Harris

May 13, 2008

It seems I’m forever needing to submit links using an XMLHttpRequest rather than the default full-page refresh. One approach commonly used in the Rails community is to render each link with the JavaScript already in place. My preferred approach however is to keep the HTML as free from JavaScript as possible and unobtrusively add behaviour using LowPro.

LowPro already comes with a built-in behaviour for links but sometimes I need something a little more complex than simply submitting the request and so I usually end up doing the following:

anchor = ...;
new Ajax.Request(anchor.href, { method: "get", parameters: ... });

Granted that’s not a lot of effort but it still felt as though I were repeating myself and that the overall intention of my code was largely obscured by the infrastructure. It then struck me that submitting a form using the Prototype JavaScript framework is almost trivial:

form = ...;
form.request({ parameters: ... });

So I cooked up a version for anchors as well:

Element.addMethods("A", {
  request: function(anchor, options) {
    new Ajax.Request(anchor.href, Object.extend({ method : "get" }, options || {}));
  }
});

Now I can submit links in pretty much the same way as I do forms:

anchor = ...;
anchor.request({ parameters: ... });

I’m wondering what other possibilities might occur were I to add a serialize() method to extract the request parameters.

JavaScript Date Helpers

By Simon Harris

May 13, 2008

It’s all the rage these days to have timestamps displayed in words to indicate how long ago some event occurred. You know something like “less than a minute ago” or “about 2 months ago”, etc. You’ll see plenty of examples on news sites, blog entries, and bug tracking tickets to name but a few.

If you’ve ever had to build this kind of thing in Rails you’ll be familiar with all the Date Helper methods that make the task pretty trivial. The problem is that the result is fixed to whatever the date was when the page was rendered and as a result these timestamps go stale very quickly.

Save refreshing the page every minute – or hour or whatever – just to update the times, I figured what was needed was a little bit of client-side action. Rather than send the text in the HTML, I decided to instead send the raw timestamps and have the browser periodically generate the textual representation.

To this end, I blatantly copied two methods from the afore-mentioned Rails helper – distance_of_time_in_words(from, to), and time_ago_in_words(from) – and, taking some liberties along the way, converted them to JavaScript:

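// Defined on Date.prototype so that `this` is the date being compared
// and timeAgoInWords() below can call it as a method.
// Number#round() is provided by the Prototype library.
Date.prototype.distanceOfTimeInWords = function(to) {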
  var distance_in_milliseconds = to - this;
  var distance_in_minutes = Math.abs(distance_in_milliseconds / 60000).round();
  var words = "";

  if (distance_in_minutes == 0) {
    words = "less than a minute";
  } else if (distance_in_minutes == 1) {
    words = "1 minute";
  } else if (distance_in_minutes < 45) {
    words = distance_in_minutes + " minutes";
  } else if (distance_in_minutes < 90) {
    words = "about 1 hour";
  } else if (distance_in_minutes < 1440) {
    words = "about " + (distance_in_minutes / 60).round() + " hours";
  } else if (distance_in_minutes < 2160) {
    words = "about 1 day";
  } else if (distance_in_minutes < 43200) {
    words = (distance_in_minutes / 1440).round() + " days";
  } else if (distance_in_minutes < 86400) {
    words = "about 1 month";
  } else if (distance_in_minutes < 525600) {
    words = (distance_in_minutes / 43200).round() + " months";
  } else if (distance_in_minutes < 1051200) {
    words = "about 1 year";
  } else {
    words = "over " + (distance_in_minutes / 525600).round() + " years";
  }

  return words;
};

Date.prototype.timeAgoInWords = function() {
  return this.distanceOfTimeInWords(new Date());
};

Now all I do is periodically invoke a function that calls one or other of these methods and updates the text of whatever display element is appropriate. Even better, because the raw timestamps have timezone information in them, the display doesn’t suffer from, in my case here in Australia, always being 10 hours out because the server is sitting in the US with a US date/time.

HTTP Ranges for Pagination?

By Simon Harris

April 22, 2008

Would it be a gross perversion to use HTTP ranges for pagination?:

Client asks the server what range types it accepts for people:

HEAD /people HTTP/1.1

Server responds:

Status: 200
Accept-Ranges: pages; records

Client requests the first page of people:

GET /people HTTP/1.1
Range: pages=1-1

Server Responds:

Status: 206
Content-Range: pages 1-1/13
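For what it’s worth, the server side of such an exchange would not need much. The sketch below assumes the made-up “pages” unit from the exchange above; parse_range and content_range are hypothetical helpers, not part of any framework.

```ruby
# Hypothetical helper: split a Range header such as "pages=1-1" into
# its unit and bounds. Returns nil for anything it does not understand.
def parse_range(header)
  match = /\A(\w+)=(\d+)-(\d+)\z/.match(header.to_s.strip)
  return nil unless match
  { unit: match[1], first: match[2].to_i, last: match[3].to_i }
end

# Hypothetical helper: build the matching Content-Range value.
def content_range(unit, first, last, total)
  "#{unit} #{first}-#{last}/#{total}"
end

range = parse_range("pages=1-1")
puts content_range(range[:unit], range[:first], range[:last], 13)
```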

Finding the index of an item using a block

By Simon Harris

April 18, 2008

Ruby 1.9 has it but if you’re not that bleeding edge, you can have it now:

class Array
  def index_with_block(*args)
    return index_without_block(*args) unless block_given?
    each_with_index do |entry, index|
      return index if yield(entry)
    end
    nil
  end
  alias_method :index_without_block, :index
  alias_method :index, :index_with_block

  def rindex_with_block(*args)
    return rindex_without_block(*args) unless block_given?
    index = size
    reverse_each do |entry|
      index -= 1
      return index if yield(entry)
    end
    nil
  end
  alias_method :rindex_without_block, :rindex
  alias_method :rindex, :rindex_with_block
end

If you’re using Rails you can substitute the two calls each to alias_method with a single call to alias_method_chain.

UPDATE: Ruby 1.8.7 also has this.
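On any Ruby that has the block form built in, usage looks like this. The sample data is invented for the example.

```ruby
names = %w[alice bob carol bob]

# First and last positions of an entry matching the block.
first_match = names.index { |name| name.start_with?("b") }
last_match = names.rindex { |name| name.start_with?("b") }
no_match = names.index { |name| name.start_with?("z") }

puts first_match
puts last_match
puts no_match.inspect
```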

It's OK for GET Requests to Update the Database

By Simon Harris

April 16, 2008

We’ve all been indoctrinated into associating the HTTP request methods POST, GET, PUT and DELETE with the standard database (aka CRUD) operations Create (INSERT), Read (SELECT), Update and Delete respectively. For the most part the analogy holds. When we make a GET request, our intention is to read whatever the server hands back. When we POST some data, our intention is to update something.

The general view, however, seems to be that the HTTP methods relate directly to database operations. In fact many developers seem to think that they are one and the same thing: POST is for INSERTing data, GET for SELECTing, etc. This view seems to have gained popularity with the growing interest in REST and the widespread adoption of Rails 2. I don’t mean to imply that Rails is the culprit here. Nothing in Rails explicitly makes these assertions. However the fact that the idiom is explicitly referred to as CRUD resources certainly doesn’t help. In fact, the HTTP/1.1 Method Definitions specification explicitly states that so-called “Safe Methods” such as GET:

… SHOULD NOT have the significance of taking an action other than retrieval.

But what happens when we have a site that tracks where you have visited, updates the “last read date” on each record retrieved, or remembers the last search criteria you used? Each of these features requires recording information into some kind of database – be it relational or simply a log file.

What people seem to overlook is the paragraph that follows in the same document:

Naturally, it is not possible to ensure that the server does not generate side-effects … The important distinction here is that the user did not request the side-effects, so therefore cannot be held accountable for them.

The HTTP methods should be used to indicate the user’s intention without regard to the underlying implementation. The web application is an abstraction so we need to model the interaction on that abstraction. If the user’s intention is to make a change to something then go ahead and use a PUT but if they’re only reading some data use a GET even if you know it involves some database writes.

It may seem somewhat esoteric but spending a bit of time thinking about what the user’s intention is exactly has helped me better flesh out an application’s API.
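A toy example may make the distinction clearer. Everything here is hypothetical (the ArticleResource class, its in-memory storage, the method names); the point is only that the read records a side-effect the user never asked for, while the write is something the user explicitly intended.

```ruby
# Hypothetical in-memory resource; names are illustrative only.
class ArticleResource
  attr_reader :last_read_at

  def initialize(articles)
    @articles = articles
    @last_read_at = {}
  end

  # GET: the user's intention is only to read, even though we record
  # the access as a side-effect they did not request.
  def get(id, now = Time.now)
    body = @articles.fetch(id)
    @last_read_at[id] = now
    body
  end

  # PUT: the user's intention is to change the article.
  def put(id, body)
    @articles[id] = body
  end
end

resource = ArticleResource.new({ 1 => "Hello" })
resource.get(1)
resource.put(1, "Hello, world")
puts resource.get(1)
```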

Getting Too Fancy with HTTP Response Codes

By Simon Harris

April 15, 2008

As part of my adoption of REST and all its goodness, I’ve started using HTTP response codes more responsibly ;-) So, for example, instead of always returning 200 (OK) for just about everything, I’m using 201 (Created) with a Location header set to the new URL after a POST. For PUT, I send back 204 (No Content), a 404 (Not Found) after a GET for a resource that no longer exists, and a good old 200 (OK) after a successful DELETE or GET.

Interestingly, in the system I’m developing at present, an update (PUT) might actually cause a resource to move due to the application of server-side business rules. In this case, the 204 response also sets the Location header so that the client knows where it can be found.
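Written down as code, the scheme described so far looks something like the sketch below. The response_for helper and its keyword arguments are invented for illustration and are not taken from the application in question.

```ruby
# Hypothetical helper returning [status, headers] for the scheme above.
def response_for(method, found: true, location: nil)
  return [404, {}] unless found

  case method
  when :post
    [201, { "Location" => location }]             # Created, here is the URL
  when :put
    [204, location ? { "Location" => location } : {}]  # No Content, maybe moved
  when :get, :delete
    [200, {}]
  else
    raise ArgumentError, "unsupported method: #{method}"
  end
end

p response_for(:post, location: "/people/42")
p response_for(:put)
p response_for(:put, location: "/archive/people/42")
p response_for(:get, found: false)
```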

All this was working beautifully on my local machine using both Safari and Firefox so once I was happy with the result I deployed it into the remote test site and started playing in Firefox. So far so good. Everything checks out. Next let’s try Safari…not so great.

Some bits of the application worked just fine but others seemed to have no effect. Then mysteriously things would start working again. Even stranger was the fact that hitting the browser’s refresh button had no effect either.

At first I suspected that nesting Ajax calls might be to blame but as everything seemed to work perfectly in Firefox and a Google search turned up nothing, I decided to do some more investigation.

I logged in to the server box and tailed the logs for signs of life. Everything looked normal. All the expected requests and responses were there but still nothing client side. Using Safari’s new Network Timeline I could see what the browser thought was going on. All the requests and responses were there but something was odd. In all but a few cases, the response code was 204 (No Content). I double checked the server logs but no, the server was definitely sending back the correct responses; a mixture of 204, 200 and 404 as appropriate.

On a hunch I went back and re-read the HTTP Status Codes document and in particular the definition for 204:

The server has fulfilled the request but does not need to return an entity-body … the client SHOULD NOT change its document view from that which caused the request to be sent …

That might actually explain it. If Safari received a 204 and interpreted that to mean “Don’t change anything” then hitting refresh would indeed have no effect even if my code subsequently went on to perform more asynchronous requests as a result.

So, I dutifully changed all the 204s to 200s and voila! Safari started to behave just as expected and Firefox continued to work as it had previously.

I’ve also noticed a difference in the way both browsers handle 303 (See Other) from within an XML HTTP Request: Safari performs the redirect and keeps all the headers as per the original, whereas Firefox seems to essentially construct an entirely new request. The upshot is that you can’t actually detect (server-side) an XML HTTP Request from Firefox if it is the result of a redirect.

I’m really not sure why the two browsers have such differing opinions of what the appropriate behaviour should be in either case but I hope this helps some other poor sod keep from pulling their hair out.