haruki zaemon

Conventions Were Made to be Broken

By Simon Harris

December 2, 2006

The Rails convention for naming foreign key fields is to use the singular name of the table with a suffix of _id.

In the past few days a number of people using the Foreign Key Migrations plugin together with ActiveRecord session store have discovered that the exception to the rule is session_id in the sessions table. In this case, session_id is not a recursive relationship but an unfortunately named field.

The solution is to update the migration script and add :references => nil to the line that adds the session_id. The plugin will then ignore the column and not attempt to generate a database foreign key constraint.
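To make the convention and its escape hatch concrete, here is a minimal sketch in plain Ruby. It is not the plugin's code: `referenced_table` is a hypothetical helper, and the pluralisation is deliberately naive. It only shows the idea that a column ending in _id is assumed to point at a table unless an explicit :references option says otherwise.

```ruby
# Hypothetical helper, not the plugin's real code: guess which table a
# column refers to from its name, unless the caller overrides the guess.
def referenced_table(column_name, options = {})
  # An explicit :references option always wins, including an explicit nil.
  return options[:references] if options.key?(:references)

  match = /\A(.+)_id\z/.match(column_name.to_s)
  return nil unless match

  # Naive pluralisation (append "s"); real Rails inflection is much richer.
  "#{match[1]}s"
end

puts referenced_table(:customer_id).inspect
puts referenced_table(:session_id).inspect
puts referenced_table(:session_id, :references => nil).inspect
```

Without the override, session_id looks like a reference back to the sessions table, which is exactly the trap described above.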

RedHill on Rails Plugins 1.2

By Simon Harris

December 2, 2006

With the pending release of Rails 1.2, I’ve taken the opportunity to begin updating the plugins to take advantage of various 1.2 features such as alias_method_chain and to remove, where possible, work-arounds for bugs that have been fixed. So, as of today, the trunk will only run against Rails 1.2.

For those not lucky or perhaps not foolish enough to start using the latest and greatest that Rails has to offer, you can find all the Rails 1.1.6 compatible versions of the plugins at: svn://rubyforge.org/var/svn/redhillonrails/tags/release-1.1.6/vendor/plugins

Enjoy!

Aikido: Saito Hitohiro Soke Seminar, Melbourne, April 2007

By Simon Harris

November 17, 2006

I suspect many of you aren’t interested in Aikido nor do you live in Australia. However, for those few that are and do, I will be hosting my teacher from Japan for a Seminar over the Easter Holiday weekend (6 ~ 9 April) 2007 here in Melbourne. All the details can be found in the flyer.

Email Signatures

By Simon Harris

November 14, 2006

I’ve always liked the subtitle to Charles Miller’s blog:

> tail -f /dev/mind > blog

Today, my brother was reading through some KDE newsgroups and came across this as the signature to one of the messages:

> # cd /usa/whitehouse
> # rm -rf *

Ahhh geek humour!

Feeling Poor?

By Simon Harris

November 14, 2006

Visit the Global Rich List; enter your annual income – I used Canadian Dollars as a close approximation for Australian Dollars; and click “show me the money!”

Changed your mind yet?

How to Entice People to Vote

By Simon Harris

November 9, 2006

Turn it into a lottery:

Voters weren’t keen about another, more quirky Arizona measure: They defeated a proposal that would have awarded $1 million to a randomly selected voter in each general election.

Rails Housekeeping

By Simon Harris

October 28, 2006

Since moving from lighttpd+FastCGI to Apache 2.2+mongrel our production rails application has been rock solid – one unexplained ruby core-dump notwithstanding.

To keep everything humming along, we run a few cron jobs which I thought I’d share.

The first is to ensure the application starts on boot. There is an rc-script to do this but I never bothered to get it running on FreeBSD. Instead, we use the @reboot keyword built into vixie-cron:

cd ~/www/production/current; mongrel_rails cluster::stop; mongrel_rails cluster::start

Next, session expiration. Even though plenty have argued against them in favour of memcached, we’ve found file-system sessions to be just fine for our relatively low-traffic application. To time out sessions after one hour – with a margin of error of an extra hour – we run an hourly cron job to delete session files that haven’t been accessed since it last ran:

cd ~/www/production/current; find tmp/sessions -name 'ruby_sess.*' -amin +60 -exec rm -rf {} \;
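For anyone who would rather do the clean-up from Ruby, the same idea can be sketched as below. This is an approximation of the find command rather than a drop-in replacement: `expire_sessions` is a hypothetical helper, and it compares access times in seconds instead of find's rounded minutes.

```ruby
require 'fileutils'

# Hypothetical helper: delete session files in dir whose last access time
# is more than max_age seconds before now. Returns the paths it removed.
def expire_sessions(dir, max_age = 60 * 60, now = Time.now)
  stale = Dir.glob(File.join(dir, 'ruby_sess.*')).select do |path|
    File.file?(path) && (now - File.atime(path)) > max_age
  end
  FileUtils.rm_f(stale)
  stale
end

# Example call, left commented out so the sketch deletes nothing by itself:
# expire_sessions('tmp/sessions')
```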

Next, to keep log file sizes manageable, we run a cron job once a day to rotate the log files using logrotate, followed by a re-cycle of the mongrel cluster:

cd ~/www/production/current; logrotate -s log/logrotate.status config/logrotate.conf; mongrel_rails cluster::restart

And here’s config/logrotate.conf:

"log/*.log" {
  compress
  daily
  delaycompress
  missingok
  notifempty
  rotate 7
}

And finally, just because, we run another daily cron job to vacuum the PostgreSQL database:

cd ~/www/production/current; psql cjp_production -c 'vacuum full'

Plugging a Team City Security Hole with a Little Obfuscation

By Simon Harris

October 27, 2006

If you’re not sure what I’m talking about, have a quick read of my earlier post.

The trick – as far as I can tell – to plugging the hole is to: disable guest logins to the server; and ensure each build configuration requires each agent to have a secret environment variable set to a secret value – something like a generated WEP key for example.

This seems to be a reasonable solution until JetBrains figures out a better mechanism. That said, in many respects, it’s not that different to the way WEP authentication works anyway.

Oh, and why the need to disable guest login? TeamCity shows all logged-in users – yes, even guests – which agents aren’t compatible and, in particular, why.

Of course there may well be a way to interrogate programmatically what the requirements are, in which case you’re hosed anyway :(

A City Full of Code Thieves

By Simon Harris

October 26, 2006

After nearly falling over at the ease with which I could use TeamCity to crush helpless machines, it suddenly occurred to me that I may have found yet another security hole.

As you probably already know, the TeamCity server doesn’t do the builds itself; rather, it farms the work off to build-agents. You can connect as many build-agents as you like to a server: simply download (or otherwise obtain) a copy of the build-agent code; configure it to point at the server; and start it up. No server-side credentials are necessary.

When a build is required, the server checks out the source code and sends a delta to the agent. This has a number of benefits, one being that you can securely configure the source-code-repository credentials in one place – the server – and the agents will be sent the source code as needed. This also poses a potential security risk.

Let’s imagine that disgruntled developer X from our previous exploit wants to obtain a copy of source code from a repository for which he has no access but for which he knows there is a TeamCity build. He simply configures his build-agent to connect to the server and waits. When the server decides it’s his machine’s turn to do a build, the server dutifully sends him a copy of the source-code…!

A City of Trojan Ants

By Simon Harris

October 20, 2006

As much as I’ve bitched about IntelliJ performance, and as much as I wish I didn’t have to do Java development on a regular basis, the fact of the matter is I do and IntelliJ just rocks my world in terms of features – I guess I’ll just have to wait for the new Core 2 Duo MacBook Pro to address the performance issue – so today I purchased a copy of the latest version.

IntelliJ 6.0 comes with TeamCity, a continuous-integration/build-server tool. It would appear that TeamCity isn’t just restricted to building applications, but could really do anything, assuming you do it in an Ant script.

TeamCity has the concept of build agents so all the real work is farmed off to other machines, so I set up my machine as a build-agent, which required opening up a port – 9090 is the default – through the firewall on my machine. This immediately rang warning bells in my head: I had just run the agent using sudo and any application run using sudo that needs ports opened up scares me. In the end I realised I could easily run the agent as a non-root user and all was happy. Excellent!

That did get me thinking, however, as to what would happen on, say, M$ Windoze machines where developers typically run with administrator privileges, or even to poor saps like me who had stupidly run using sudo or even just as my own login.

Imagine a team of 20 developers, all of whom have obligingly run a build-agent so that their spare CPU cycles don’t go to waste. One day developer X becomes disgruntled with his employer and decides to run wild. Before he goes home at night, he checks in a change to build.xml. Not a large change, nothing special, just a one line change that recursively deletes everything starting at the root directory or even just the user’s home directory.

This change quickly makes it to the TeamCity server and then out to a build machine which dutifully executes the script, destroying everything and taking the machine down with it!

Thankfully, the TeamCity server detects that the agent has gone down and, doing its best to utilise the resources at its disposal, re-schedules the build on the next available agent…

To be fair, TeamCity isn’t really to blame: if a developer runs build.xml without first ensuring that it does nothing malicious, it’s really no different to running an un-verified shell script. TeamCity just makes it really easy to set up agents and ensure that the script is executed.

Methinks it’s time to make a separate, locked-down account for running the build-agent!

Lies, Damned Lies and Statistics

By Simon Harris

October 12, 2006

Whilst reading the latest news headlines on the ABC (Australian Broadcasting Corporation) web site just now, I happened upon what seemed like a rather interesting article entitled Brain fluid draining eases dementia: research. Fascinated, I read on.

…The study investigated 20 patients…71 per cent of our patients improved in memory and mental function and 94 per cent improved in balance and walking…

Hang on a second…71 percent of 20 patients would be…14.2 patients; 94 percent of 20 patients would be…18.8 patients. Huh?
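The arithmetic is easy to check by brute force. The sketch below lists which whole-number patient counts would round to a reported percentage for a given group size; `counts_matching` is a made-up helper name. The group of 17 is purely an arithmetic comparison and not something the quoted excerpt states.

```ruby
# For a group of n patients, list the whole-number counts whose
# percentage rounds to the reported figure.
def counts_matching(percent, n)
  (0..n).select { |k| (100.0 * k / n).round == percent }
end

puts counts_matching(71, 20).inspect # no whole number of 20 patients gives 71%
puts counts_matching(94, 20).inspect # nor 94%
puts counts_matching(71, 17).inspect # a group of 17 would allow 71%
puts counts_matching(94, 17).inspect # and 94%
```

With 20 patients every result is a multiple of 5 percent, so neither figure can come from the full group.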

Abstract ActiveRecord Classes by Convention

By Simon Harris

October 11, 2006

Ruby on Rails provides a very simple mechanism for specifying that a model class is an abstract base class and therefore has no corresponding database table:

class MyAbstractClass < ActiveRecord::Base
  self.abstract_class = true
  ...
end

Code can then interrogate a model class to see if it is abstract:

puts "it's abstract" if MyAbstractClass.abstract_class?

Not so hard, however I pretty much always prefix the name of my abstract classes with, you guessed it, 'Abstract'. So, I added some code to the RedHill on Rails Core Plugin the other day to extend the definition of an abstract class to include the name:

def abstract_class?
  @@abstract_class || !(name =~ /^Abstract/).nil?
end

With that simple change, I no longer need to explicitly set self.abstract_class = true; it just works by magic, er, convention.
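To try the naming convention outside Rails, here is a self-contained sketch. `FakeBase` is a stand-in I made up for ActiveRecord::Base, and it stores the flag per class rather than in a class variable, so treat it as an illustration of the idea and not as the plugin's implementation.

```ruby
# Stand-in for ActiveRecord::Base (an assumption so this runs without Rails).
class FakeBase
  class << self
    attr_accessor :abstract_class

    # Abstract if flagged explicitly, or if the class name starts with "Abstract".
    def abstract_class?
      abstract_class == true || !(name =~ /^Abstract/).nil?
    end
  end
end

class AbstractVehicle < FakeBase; end   # abstract by naming convention
class Car < AbstractVehicle; end        # concrete subclass

class Shape < FakeBase                  # abstract by explicit flag
  self.abstract_class = true
end

puts AbstractVehicle.abstract_class?
puts Car.abstract_class?
puts Shape.abstract_class?
```

Note that the pattern only looks at the start of the name, so a namespaced class such as Foo::AbstractBar would not match.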

I suppose I could/should have created a plugin for it but I was feeling lazy :)

Perforce Client Setup

By Simon Harris

October 2, 2006

For anyone who is unfortunate enough to work with Perforce – and so I don’t have to remember – here’s a quick-and-dirty kick-start guide for setting up a client workstation. (Note I’m on a Mac so your mileage may vary.)

First things first, install the perforce software available from http://www.perforce.com/perforce/downloads.

Next, in order to avoid various command-line arguments, I have the following environment variables set:

export P4EDITOR="$EDITOR"
export P4USER="username"
export P4PORT="1666"
export P4HOST="clientname.local"
export P4CLIENT="clientname"

If you’re using SSH, you’ll need to create a tunnel to the server. Something like this should work:

ssh -L1666:server:1666 -p 22 -N -t -x username@server

This sets things up so all requests to localhost:1666 are routed over ssh to the remote server. You can then set up the client:

p4 client

This will launch your default editor – in my case that’s TextMate but nano/pico/emacs/vi/etc will do – and allow you to modify the following fields:

Client:  clientname
Owner:  username
Host:  clientname.local
Root:  /path/to/projects/
View:

See the documentation for an explanation on how to set-up the View. In my case, I’m running a rails application, so I have some rules to exclude various generated and client specific directories:

-//depot/projectname/config/database.yml //clientname/projectname/config/database.yml
-//depot/projectname/db/schema.rb //clientname/projectname/db/schema.rb
-//depot/projectname/log/... //clientname/projectname/log/...
-//depot/projectname/tmp/... //clientname/projectname/tmp/...

Finally, to get a copy of the latest source code in /path/to/projects/projectname, run:

p4 sync

And because I just can’t help myself, by way of comparison, here are the equivalent instructions for subversion:

svn co svn+ssh://username@repositoryurl/trunk/projectname

Gosh, wasn’t that difficult.

Zip and Preserve File Permissions with Ant

By Simon Harris

October 1, 2006

Yes, it’s been a while since I posted an entry related to Java! Believe it or not, we still do Java development, lots of it in fact, but it’s mostly large-scale re-factoring and cleanup work on what can best be described as “legacy” applications so there’s rarely much if anything to write home about. That said, I have a couple of posts just itching to be written when I find some time. Until then, a relatively short entry will have to do :)

A client distributes one particular Java-based web application to hundreds of customers using a zip file. The distribution contains, among other things, the war file and some scripts for database migration, etc. It’s these scripts that cause us some headaches as they need to have execute permission. The problem arises because Ant’s built-in zip task specifically doesn’t handle file permissions. So, naturally, we concocted our own using macrodef:

<macrodef name="zipdir">
  <attribute name="destfile"/>
  <attribute name="sourcedir"/>
  <sequential>
    <echo>Building zip: @{destfile}</echo>
    <exec executable="zip" dir="@{sourcedir}">
      <arg value="-qR"/>
      <arg value="@{destfile}"/>
      <arg value="*"/>
      <arg value="-x"/>
      <arg value="*.svn*"/>
    </exec>
  </sequential>
</macrodef>

This simply calls the operating system’s – read *nix – zip command to compress the specified directory thus preserving all the file permissions that SVN lovingly maintains.

Deploying to Multiple Rails Environments

By Simon Harris

September 29, 2006

On one Rails project, we have two deployment environments: production and UAT. Using the default Capistrano configuration makes deploying to these two environments rather difficult, so I thought I’d share our deploy.rb with a bit of explanation along the way. Ok, here goes:

For a start, we deploy to a directory that includes the environment as part of the path:

set :deploy_to, lambda { "/home/#{user}/www/#{rails_env}" }

For subversion, we check out the code as the user who is running the deployment, making sure not to cache authentication details on the server:

set :svn_user, ENV['USER']
set :svn_password, lambda { Capistrano::CLI.password_prompt('SVN Password: ') }
set :repository, lambda { "--username #{svn_user} --password #{svn_password} --no-auth-cache svnurl/trunk/#{application}" }

In both cases, we run a mongrel cluster. Because the mongrel configuration files share a lot in common and because they largely duplicate information contained within the deployment script, we generate an appropriate configuration on deployment. More of that in a bit but for now, the common bits look like:

set :mongrel_address, "127.0.0.1"
set :mongrel_environment, lambda { rails_env }
set :mongrel_conf, lambda { "#{current_path}/config/mongrel_cluster.yml" }

Now, for the environment specific portions. For each environment we have a task that simply sets variables appropriately – I toyed with using an environment variable such as RAILS_ENV rather than the pseudo-tasks but it was more typing and I’m allergic to typing :).

For production, we want 3 mongrel instances in the cluster, listening on ports 8000-8002:

desc "Production specific setup"
task :production do
  set :rails_env, :production
  set :mongrel_servers, 3
  set :mongrel_port, 8000
end

For UAT, we want 2 mongrel instances in the cluster, listening on ports 8010-8011:

desc "UAT specific setup"
task :uat do
  set :rails_env, :uat
  set :mongrel_servers, 2
  set :mongrel_port, 8010
end
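The port ranges follow directly from the starting port and the number of servers. As a rough sketch in plain Ruby, outside Capistrano: the numbers are the ones from the two tasks, while the `ENVIRONMENTS` hash and the `mongrel_ports` helper are my own illustration.

```ruby
# Settings copied from the two tasks; the hash layout is an assumption.
ENVIRONMENTS = {
  :production => { :mongrel_servers => 3, :mongrel_port => 8000 },
  :uat        => { :mongrel_servers => 2, :mongrel_port => 8010 }
}

# Each instance in a cluster listens on the next port up from the first.
def mongrel_ports(settings)
  first = settings[:mongrel_port]
  (first...(first + settings[:mongrel_servers])).to_a
end

ENVIRONMENTS.each do |name, settings|
  puts "#{name}: #{mongrel_ports(settings).join(', ')}"
end
```

Keeping the two ranges ten ports apart leaves room to grow either cluster without the environments colliding.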

And finally, a custom deployment script based almost entirely on the built-in deploy_with_migrations with the major difference being the configuration of the mongrel cluster just prior to restart:

desc "Generic deployment"
task :deploy do
  update_code
  begin
    old_migrate_target = migrate_target
    set :migrate_target, :latest
    migrate
  ensure
    set :migrate_target, old_migrate_target
  end
  symlink
  configure_mongrel_cluster
  restart
end

That’s it really. Now whenever we need to deploy to a particular environment, say for example UAT, we do something like:

cap uat deploy

UPDATE: By request, here is our database.yml file:

common: &common
  adapter: postgresql
  username: <%= ENV['USER'] %>
development:
  database: foo_development
  <<: *common
test:
  database: foo_test
  <<: *common
uat:
  database: foo_uat
  <<: *common
production:
  database: foo_production
  <<: *common

As you can probably tell, we’re lucky enough that the database user is always the same as the user under which the application will be run and that the database itself is named according to the environment. That makes it very easy to wrap up most of the common parts – thanks go to Jon Tirsen for that YAML tip.
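If the YAML tip is new to you, this small sketch shows what the anchor and merge key expand to once parsed. It uses a cut-down copy of the file with only two environments, and a literal username ("deploy", a placeholder of mine) in place of the ERB expression, since plain YAML parsing would not evaluate it.

```ruby
require 'yaml'

# Cut-down copy of database.yml; "deploy" is a placeholder username.
DATABASE_YML = <<~CONFIG
  common: &common
    adapter: postgresql
    username: deploy

  uat:
    database: foo_uat
    <<: *common

  production:
    database: foo_production
    <<: *common
CONFIG

# aliases: true lets the parser follow the *common references.
DATABASE_CONFIG = YAML.safe_load(DATABASE_YML, aliases: true)

puts DATABASE_CONFIG['uat'].inspect
puts DATABASE_CONFIG['production'].inspect
```

Each environment ends up with its own database name plus the shared adapter and username.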

This could also easily be generated. I guess it just hasn’t needed any attention since it was created so YAGNI overrode DRY ;-)

No Really, Perforce Does Suck

By Simon Harris

September 29, 2006

Ok, so after my rant yesterday I was feeling a bit better. So many people rushed to the defence of Perforce and on the authority of people I know, respect and work for – not mutually exclusive roles – I thought I’d get stuck into it and read the manuals, read news groups and even rushed out to buy a copy of Practical Perforce.

The documentation is plentiful and very informative and the support groups are very helpful. As for the book, well, the book is most excellent, a very easy read indeed and full of tonnes of really great tips – recipes, idioms, patterns, hacks, call them what you will – which just about sums up my experience thus far: Lots and lots of rather involved processes to do what I consider to be normal everyday activities. (At this point I feel compelled to direct you to an excellent article on why patterns are indicative of unsophisticated systems.)

To give you a 100% practical example, just today I committed 1600 files which I had to back-out almost immediately because I realised I had broken something. Now, ignoring the why’s and how’s I managed to get myself into such a pickle, the fact is I needed to rollback a commit. Here’s what I did:

> svn merge -c -27289 svn+ssh://me@therepositoryurl
> svn commit

Tricky stuff that!

So then on my way home I picked up the book mentioned earlier and went straight to the index to find “Backing out a recent change”. Whoot! Just what I wanted to know. So here’s the deal:

> p4 files @=27289         # This lists all the files that have changed
> p4 sync @27288
> p4 add ...               # For each deleted file
> p4 edit ...              # For each changed file
> p4 sync
> p4 delete ...            # For each added file
> p4 resolve -ay
> p4 submit

Yes! Pretty impressive! And, straight from the book, re-printed without any permission whatsoever (emphasis added by yours-truly):

When a change involves a lot of files, you can filter the output of the files command to produce a list of files to open. Unfortunately, files can’t be piped directly to other p4 commands because its format isn’t acceptable to them. This can be easily fixed by using a filter; namely sed.

Wow. Cool! Just what I wanted to have to do. Ok, so let’s try that:

> p4 sync @27288
> p4 files @=27289 | sed -n -e "s/#.* - delete .*//p" | p4 -x- add
> p4 files @=27289 | sed -n -e "s/#.* - edit .*//p" | p4 -x- edit
> p4 sync
> p4 files @=27289 | sed -n -e "s/#.* - add .*//p" | p4 -x- delete
> p4 resolve -ay
> p4 submit

Awesome! That’s sooooo much better. Sheesh, I might even be able to script it, fan-bloody-tastic. Thankfully, Perforce is touted as being lightning fast because unless I’m very much mistaken, that’s seven, count ’em, seven calls to the server!
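For the curious, here is roughly what each of those sed filters is doing, sketched in Ruby. The sample lines are made up and only assume the general shape of the `p4 files` listing (depot path, `#` revision, then the action); `paths_for` is a hypothetical helper.

```ruby
# Made-up sample in the shape `p4 files` is assumed to print.
SAMPLE_OUTPUT = [
  "//depot/projectname/app/models/customer.rb#4 - edit change 27289 (text)",
  "//depot/projectname/app/models/order.rb#1 - add change 27289 (text)",
  "//depot/projectname/lib/legacy.rb#7 - delete change 27289 (text)"
]

# Keep only the lines for one action and strip everything from "#" onwards,
# which is what sed -n -e "s/#.* - ACTION .*//p" does.
def paths_for(action, lines)
  pattern = /#.* - #{Regexp.escape(action)} .*/
  lines.select { |line| line =~ pattern }.map { |line| line.sub(pattern, '') }
end

%w[delete edit add].each do |action|
  puts "#{action}: #{paths_for(action, SAMPLE_OUTPUT).join(' ')}"
end
```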

So, what have we learned so far? We’ve learned that at precisely the scenario I’ve been told Perforce is great at handling, it really, really, really, ok once more, really, sucks!

Oh, but there’s more. I forgot to mention that I was also working offline before I committed the original sin. When I eventually connected this is what I did:

> svn commit

Ok, so technically I did:

> svn up
> svn commit

So, what would have been the equivalent if I had been using Perforce you might ask?

> p4 sync
> p4 diff -se | p4 -x- edit
> p4 diff -sd | p4 -x- delete
> p4 submit

(As a side note, adding new files in both systems is about the same amount of work. That said, at least with subversion a simple svn stat will show me which files are not yet under version control. For the life of me I can’t seem to find an easy way to do this with Perforce.)

Not too bad but technically, three times as many commands. And yes, again, I could script it but why should I need to? This is something I, as a developer, do every day. Am I mistaken for thinking that developers are by far the largest users of a tool such as this? Perhaps.

It’s no wonder Google want people to know how to use Perforce; it pretty much proves the candidate has a brain large enough to even feel like working out how to use it.

Perforce: Just A Faster CVS?

By Simon Harris

September 27, 2006

So, it’s 7am-ish and I’ve had 6 or so hours of sleep to ruminate on this but yup, from a developer’s perspective, I still think Perforce sucks.

Can anyone tell me why they believe it seems like a good idea to:

Require an ssh tunnel to have encrypted communication;
Keep a secondary workspace to enable offline revert;
Have a command-line tool that uses environment variables – or command-line arguments – to specify connection details;
Display a diff of which files changed as a tree – I just want to see the individual files not my entire project;
The list goes on…

I like to work offline, a lot, on planes, trains and in taxi-cabs; I like to be able to see immediately what’s changed; and I like to be able to revert everything (or only some things) several times while I’m prototyping.

With subversion I get a lot out-of-the-box and while there will always be nice to have features such as “add all unknown files” it does pretty much everything I need.

As I moved from C to C++ to Java and then to Ruby, I felt empowered each step of the way. I had a similar experience moving from CVS to SVN. Perforce seems like a step backwards.

Google may use and recommend Perforce but when the answer to “why can’t I do …” is “you can, just write a script to …” I’m not sure I’m convinced.

ActiveRecord Identity Map for Rails Transactions

By Simon Harris

September 19, 2006

I happened to be reading a blog entry last night that mentioned some “shortcomings” in Rails’ ActiveRecord and its handling of record loading. Specifically, AR will load the same record twice, into two different instances, within the same transaction; i.e. the following test fails:

Customer.transaction do
  c = Customer.find_by_name('RedHill Consulting, Pty. Ltd.')
  assert_same c, Customer.find(c.id)
end

To be honest, I’ve not yet been burned by this but it may just catch out some, so I quickly whipped up a very basic plugin to see how difficult it would be to solve:

module RedHillConsulting
  module IdentityMap
    class Cache
      def initialize
        @objects = {}
      end
      def put(object)
        objects = @objects[object.class] ||= {}
        objects[object.id] ||= object
      end
    end
    module Base
      def self.included(base)
        base.extend(ClassMethods)
        base.class_eval do
          alias_method_chain :create, :identity_map
        end
      end
      module ClassMethods
        def self.extended(base)
          class << base
            [:instantiate, :increment_open_transactions, :decrement_open_transactions].each do |method|
              alias_method_chain method, :identity_map
            end
          end
        end
        def instantiate_with_identity_map(record)
          enlist_in_transaction(instantiate_without_identity_map(record))
        end
        def enlist_in_transaction(object)
          identity_map = Thread.current['identity_map']
          return object unless identity_map
          identity_map.put(object)
        end
        private
        def increment_open_transactions_with_identity_map
          increment_open_transactions_without_identity_map
          Thread.current['identity_map'] ||= Cache.new
        end
        def decrement_open_transactions_with_identity_map
          Thread.current['identity_map'] = nil if decrement_open_transactions_without_identity_map < 1
        end
      end
      def create_with_identity_map()
        create_without_identity_map
        self.class.enlist_in_transaction(self)
        id
      end
    end
  end
end

The code essentially interferes with create and instantiate (called from find) and ensures that, within a transaction, the same record will always be returned for the same id (IdentityMap).
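Stripped of the Rails hooks, the heart of the idea is just a cache keyed by class and id. The sketch below runs without ActiveRecord; `IdentityCache` mirrors the Cache class in the plugin, while the `Customer` struct is a hypothetical stand-in for a model.

```ruby
# Minimal identity map: one canonical object per (class, id) pair.
class IdentityCache
  def initialize
    @objects = {}
  end

  # Return the object already stored for this class and id, storing the
  # given one if it is the first we have seen.
  def put(object)
    by_id = (@objects[object.class] ||= {})
    by_id[object.id] ||= object
  end
end

# Hypothetical record type standing in for an ActiveRecord model.
Customer = Struct.new(:id, :name)

cache  = IdentityCache.new
first  = cache.put(Customer.new(1, 'RedHill Consulting'))
second = cache.put(Customer.new(1, 'RedHill Consulting'))
puts first.equal?(second)
```

The second put hands back the first instance, which is what lets assert_same pass inside a transaction.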

As I mentioned, unlike all my other plugins, I’ve never used nor needed to use this one – and I’m not sure I will unless it proves to be a problem for me – but it’s yet another example of how easy it is to extend Rails to do pretty much whatever you might imagine.

Automatically Validate Uniqueness of Columns with Scope

By Simon Harris

September 15, 2006

The first cut at Schema Validations only applied validates_uniqueness_of for single-column unique indexes. This removed 80% of the cases in my code base but there were still cases where a scope was specified that lingered. Not any more.

The plugin now automatically generates validates_uniqueness_of with scope for multi-column unique indexes as well.

As always, there are some assumed conventions – which I believe will handle close to 99% of cases – around how to decide which column to validate versus which columns to consider part of the scope. The column to validate is chosen to be either:

With all remaining columns considered part of the scope, following what I believe to be a typical composite unique index column ordering.

So, for example, given either of the following two statements in your schema migration:

add_index :states, [:country_id, :name], :unique => true
add_index :states, [:name, :country_id], :unique => true

The plugin will generate:

validates_uniqueness_of :name, :scope => [:country_id]

My next stop is to have a look at simple column constraints such as IN('male', 'female') and turn them into validates_inclusion_of :gender, :in => ['male', 'female'].

Perhaps tomorrow :)

Procrastinating in Ruby is Delicious

By Simon Harris

September 14, 2006

As I was bookmarking something on del.icio.us today, I noticed the dates on which I had bookmarked the last couple of times and wondered if there was any correlation between frequency and day of the week. So, I downloaded a summary using https://api.del.icio.us/v1/posts/all? and whipped up a little ruby script to compile some statistics:

Wednesday = 41
Tuesday = 39
Thursday = 37
Friday = 32
Monday = 26
Saturday = 24
Sunday = 12

Looks like Wednesday is the biggest day for bookmarking – also known as procrastinating – and what do you know? Today is…Wednesday!

So then I thought I’d see if there was anything interesting in the time of day:

12 = 26
13 = 20
4 = 17
22 = 15
0 = 14
23 = 12
5 = 12
2 = 12
20 = 10
11 = 10
1 = 10
7 = 10
3 = 9
6 = 7
21 = 7
9 = 6
15 = 4
14 = 3
8 = 3
10 = 2
19 = 2

Phew! Most of my bookmarking is done around lunchtime although an awful lot were done at 4am!
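The script itself isn’t shown here, but a tally like this takes only a few lines of Ruby. The sketch below is a reconstruction under assumptions, not the original: the timestamps are made up, they are assumed to be ISO 8601 strings in UTC, and `tally_by` is a helper name of my own.

```ruby
require 'time'

# Made-up timestamps in the ISO 8601 form the export is assumed to use.
TIMESTAMPS = %w[
  2006-09-13T12:05:00Z
  2006-09-13T04:30:00Z
  2006-09-12T12:45:00Z
  2006-09-10T22:10:00Z
]

# Count occurrences of whatever the block extracts, most frequent first.
def tally_by(timestamps)
  counts = Hash.new(0)
  timestamps.each { |stamp| counts[yield(Time.iso8601(stamp).utc)] += 1 }
  counts.sort_by { |_key, count| -count }
end

tally_by(TIMESTAMPS) { |t| t.strftime('%A') }.each { |day, n| puts "#{day} = #{n}" }
tally_by(TIMESTAMPS) { |t| t.hour }.each { |hour, n| puts "#{hour} = #{n}" }
```

If the timestamps really are UTC, the hours would need shifting to local time before reading too much into that 4am spike.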