haruki zaemon

South Africa

I’m back, for all that is worth, from 2 amazing weeks in South Africa. It is a truly amazing place. I have travelled a lot over the years but S.A. just blew me away. Words simply cannot adequately describe the beauty and splendour of the country.

The people are a little like Australians or possibly even New Zealanders in temperament and sense of humour. They are incredibly kind. The difference is the racial makeup: the large African population makes the place just amazing. The people are so happy in what can often be dire circumstances, especially in the shanty towns (or “townships” as they are known) where literally millions of displaced people “live”.

Thankfully, for most of the time at least, there was hardly a hint of racism. Unfortunately it does still exist, especially the further out from a major city you get. When I did encounter it I was nearly brought to tears, followed by rage and then a sense of helplessness as I realised there was nothing I could do. It goes both ways and whatever justification or pseudo-scientific mumbo jumbo either side can concoct, when it comes down to it, it is 100% cultural bias, ingrained from birth and for which the only cure is time.

I started to pick up a little Afrikaans. “Who undid my bra?” is a pretty close approximation for the pronunciation of “How are you?”. Well at least now you’ll remember it :). I even started to sound like a local. It’s a great language with an unfortunate heritage. I would still like to learn more.

The food is just incredible and a downright steal for anyone with foreign currency - I don’t think we had one bad meal anywhere. Boerewors (long beef sausages), ostrich steaks, biltong (dried meat like jerky), bobotie (like shepherd’s pie), you name it.

The wines are pretty good, especially the whites and cabs. Can’t say I enjoyed the local Merlot or Shiraz much. Whatever your tastes it’s certainly worth checking out the wineries of Stellenbosch, etc.

I spent most of my time in Cape Town, which is in itself an amazing place to visit. You could spend a year there and still not get through all the things to do - most of which are free or as close to free as you can get. The weather, in Cape Town at least, is a lot like Perth. In fact The Waterfront can be compared to Fremantle, only a lot bigger. The further north you go the more humid it becomes but I seemed to acclimatise pretty quickly. Lucky really as no one seems to have heard of fans, let alone air-conditioning :-)

We hired a car which for us was the ideal way to get around. It just gave us the freedom to do whatever we wanted, when we wanted. So some tips for the roads:

  • Speeding is the norm and fines do apply however they are rarely followed up on. We met people who had outstanding fines from 10 years ago;
  • Drink driving is not tolerated and will likely land you in jail;
  • Watch out for mini-bus taxis as they are generally over-filled, are rarely road worthy and seem to have a total disregard for anyone else on the road;
  • Watch out for expensive Mercedes-Benz cars as they are usually doing about twice the legal limit and have a total disregard for anyone else on the road;
  • It is normal to move over to let other cars pass you, which happens all too often on blind corners, on hills, over solid white lines and sometimes in both directions at once!;
  • If you overtake someone it is customary to flash your hazard lights a couple of times to say “thank-you”;
  • If you are overtaken and someone flashes their hazard lights, it is customary to flash your headlights once or twice in acknowledgement;
  • And finally, the annual road toll for South Africa is…a staggering 10,000!

Just some of the things I saw and did:

  • Hiked up Table Mountain starting at Kirstenbosch. Takes anywhere between 2-8 hours depending on your level of fitness and the route you take. Be careful of the “Table Cloth” that rolls in to engulf the entire mountain top. If high winds are predicted, you can pretty much forget climbing it as you won’t be able to see a thing;
  • Took the cable car up Table Mountain;
  • Lion’s Head sits beside Table Mountain and is a gentle 45 minute walk;
  • Kirstenbosch is quite an amazing botanic garden at the foot of Table Mountain. It’s the biggest garden I think I’ve ever seen and home to some of the most amazing Proteas. It has to be seen to be believed;
  • The Waterfront as I mentioned earlier. Shops, cafes, bars, more shops, boat rides, restaurants, German beer gardens at which MANY MANY pints of “anniversary ale” were consumed;
  • Mountain biked around the Cape Of Good Hope, Cape Point, etc. through amazing national parks. Rode past snakes, turtles, lizards, and baboons all within metres of us;
  • The wineries of course!;
  • The open top double decker bus tour of Cape Town;
  • Green Point market where you’ll find just about every kind of hand-made souvenir or gift you could ever hope for;
  • Camps Bay where all the youngsters hang out on the beach. Mind the water though as the temperature sits somewhere between 11C and 13C;
  • The wineries of course!;
  • Visited a private game reserve. Name an animal and I saw it, up close. Our guide, and UNIMOG driver, Nadia was cause for a mild case of khaki fever ;-)
  • Rode an African Elephant;
  • Swam in a lagoon in the middle of nowhere. It reminded me of something out of Tarzan :).

And that was day one! Just kidding but seriously there is so much to do. I didn’t get to do anywhere near as much as I would have liked. Some of the things I want to do next time I visit include:

  • Rock climb Table Mountain;
  • Visit Robben Island where Nelson Mandela was imprisoned;
  • Hike some more mountains - there are too many to name!;
  • Mountain bike through Kruger national park;
  • Visit Pretoria and Sun City;

To top it all off, I was caught in a natural disaster. The small town I was staying in, Heidelberg, about 4 hours out of Cape Town, suffered the worst flooding ever recorded - 189mm of rain in 24 hours. The rivers swelled and took out bridges and roads. Farms were flooded and crops destroyed. I saw cattle drown and two-storey buildings submerged. Phones and electricity were cut off. Amazingly none of it managed to fall into the dams, currently sitting at 15% capacity at the start of what is usually the driest season of the year. And all in front of my eyes. If you’ve ever seen a flood on TV, it really doesn’t do justice to the awesome power of Mother Nature at her worst.

If you’re thinking of visiting Cape Town, make sure you call Sally de Jager. We found her details on the ’net and from there organised a guided mountain bike tour of The Cape. She is a magnificent guide and a wonderful person. She is very knowledgeable and will literally organise whatever tour at whatever price you can afford. For us she organised a hike, followed by biking around the cape, then it was off to sit on one of the many beaches to drink wine and watch the sun go down. Oh and her boyfriend is an Australian so she understands us :-).

As I mentioned at the start, it would take me a month to describe all that I experienced in 2 weeks. Certainly a life changing trip. So if you’ve ever even contemplated a visit, do it!

JRules Memory Leak Gotcha

About 6 months ago we were profiling our application to ensure we had no memory leaks, etc. We did find some and we were able to fix them pretty much immediately. However, today I happened to be chatting with a colleague who is investigating a memory leak in another application and it sounded scarily similar. So in the interests of all you JRules developers, here’s a little gotcha.

JRules maintains a binary, 1-to-many association between the rule set (IlrRuleSet) and the possibly many instances of the working memory (IlrContext), also referred to as a “rule engine” by the JRules documentation. I’ll spare you my diatribe on binary associations for now; suffice to say that if you weren’t aware of this little “feature” (or if you were and simply hadn’t given it much thought) you’re in for a nasty surprise.

When you’re done with an IlrContext the natural thing to do would be to simply remove all application references to it and let it be garbage collected. Unfortunately, due to the two-way nature of the relationship, this doesn’t have the expected effect. Instead, because the rule set still holds a reference to the context, it will NEVER be garbage collected.

To combat this problem, ILOG thankfully provided a somewhat innocuous looking method IlrContext.end(). To quote from the documentation:

Prepares this rule engine instance for garbage collection. After this call, the engine will not keep any reference to this rule engine instance. The rule engine instance will be detached from the ruleset and will no longer be notified of modifications on the rules. The rule engine instance will also disconnect all its tools and all the related resources will be released. If the application does not keep this object, it is then subject to garbage collection.

In other words, anytime you’ve finished with a context and wish it to become a candidate for garbage collection, be sure to call end() or be prepared for a slow and painful application death as the heap runs out.

One final tip, if you make use of context pooling, be sure to also call IlrContext.reset() before returning it to the pool. This will remove all references to your application objects within the context.

<blatant-plug>If you’re in the market for a cheaper alternative, you might like to try out the latest version of Drools.</blatant-plug>

P.S. If anyone from ILOG is listening, this is exactly the kind of problem WeakReferences (and WeakHashMaps in particular) are designed to prevent :)
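The shape of the leak, and of the fix hinted at in the P.S., can be sketched in a few lines. Everything below is a hypothetical stand-in written for illustration: SketchRuleSet, SketchContext and WeakSketchRuleSet are invented names, not JRules classes, and nothing here claims to show how JRules is implemented internally.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.WeakHashMap;

// Hypothetical stand-ins for illustration only; these are NOT the JRules classes.
class SketchRuleSet {
    // Strong references: every attached context stays reachable via the rule set.
    private final Set<SketchContext> contexts = new HashSet<>();

    void attach(SketchContext context) { contexts.add(context); }

    void detach(SketchContext context) { contexts.remove(context); }

    int attachedCount() { return contexts.size(); }
}

class SketchContext {
    private final SketchRuleSet ruleSet;

    SketchContext(SketchRuleSet ruleSet) {
        this.ruleSet = ruleSet;
        ruleSet.attach(this); // the back-reference that keeps this object alive
    }

    // Plays the role end() plays in the post: break the back-reference
    // so the context can become a candidate for garbage collection.
    void end() { ruleSet.detach(this); }
}

// The alternative suggested in the P.S.: hold contexts weakly, so that a
// forgotten end() no longer pins them in memory.
class WeakSketchRuleSet {
    private final Set<Object> contexts =
            Collections.newSetFromMap(new WeakHashMap<Object, Boolean>());

    void attach(Object context) { contexts.add(context); }

    int attachedCount() { return contexts.size(); }
}
```

In the strong version, dropping your own reference to a SketchContext changes nothing until end() is called, because the rule set still holds it. In the weak version the collector is free to reclaim a context once the application lets go of it, at the cost of attachedCount() becoming a moving target.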

Assistant Orange Peelers

My father is a commercial airline pilot. He’s been flying since around the time I was born, 32 or so years now. He’s flown everything from light aircraft through 707’s, 767’s right up to the latest 747-400 - the ones with the winglets and beds for the 2 crews, enabling them to fly non-stop Sydney to London.

Pilots must undergo medical and practical examinations each year in order to maintain their rating on a specific aircraft and, apart from actually flying, he has at various times been a training captain working out of Boeing’s Washington base. From there he teaches and re-trains pilots on simulators to either attain accreditation on an aircraft they haven’t flown before, or simply to renew their existing accreditation. Obviously he has lots of stories to tell but by far and away the most interesting group of pilots he’s ever had to train were Russian commercial airline pilots.

Back in the bad old cold-war days, defections to The West were commonplace and of great concern to Russian authorities, especially when it came to pilots. You can imagine that it wouldn’t take much for a commercial airliner to continue on to Japan or parts of Northern Europe. So, no doubt in an effort to counter this, and also possibly to give as many people as possible a job, Russian commercial airliners would sometimes have as many as 6 crew members in the cockpit with each one assigned and trained for a specific task. My father used to joke that they would have a Navigator; Radio Operator; Flight Engineer; Pilot; Co-Pilot; Orange Peeler; and an Assistant Orange Peeler. Re-training them on aircraft that require at most a 2-man crew was challenging to say the least.

This kind of “siloing” occurs all too often in IT shops as well, where each person has a specific task: The GUI Guy; Miss Middleware; Database Dude; Build Master; Application Deployer. Not such a bad thing at first glance I suppose - it’s always good to have someone in the team who knows about these things. For example, if you ever need to know something about how the Object-Relational-Mapping works, go see Billy Bob.

Another argument put forward is that by concentrating responsibility and accountability it removes the “burden” from the rest of the team. In practice though, this approach seems to lead inexorably to demarcation disputes, further resulting in much finger-pointing and scape-goating. Cries of “Can you fix that, it’s in your code” or “Who’s been making changes to Blah.jsp without asking me?” can often be heard. What’s worse is how this can materially affect the productivity of the whole team - “She’s not in today so we’ll have to wait until tomorrow to make those changes” or “You’ll have to wait until the Build Master gets back from lunch before we can do another drop for you.”

It seems odd that a modern software developer cannot or is unwilling to become multi-skilled. The days of being solely a JSP guru or COM expert are gone and have been replaced with collective code ownership and the idea that moving people around is a good thing.

Or perhaps it is management worried about losing control? Perhaps a team whose members have become self-reliant may decide that they can make it on their own and “defect”?

Using Lucene To Find A Date

For the next 3 weeks (and for the past few), I’m the DefectController. I get to watch the defects roll in, assess them, and hand them out to the appropriate developer (which may be me). Last week I saw a rather odd defect pass by:

org.apache.lucene.queryParser.ParseException: Too many boolean clauses when performing date range search.

My first reaction was puzzlement, replaced shortly thereafter with shock as I thought through the problem. It occurred to me that the most obvious cause would be the unthinkable: the developer must have enumerated every possible date in the range and included them ALL in one gigantic OR condition.

A bit of grokking later and shock turned to horror. Fortunately, the developer had not done as I suspected. They had done the correct thing and generated the correct Lucene criteria in the form:

dateOfBirth:[19700801 TO 20030615]

Unfortunately, that left only one option: It must be Lucene!

Two minutes on Google and the BuildController turned up, among others, this link. Yes indeed, it seems, Lucene does enumerate ALL possible dates. In fact depending on the granularity, it will end up enumerating all possible seconds! Apparently this is not a bug nor even a feature but a “known behaviour”.

So now here’s the thing that puzzles me. It would appear, from the documentation, that string ranges are also supported allowing us to find say, people where:

name:[Albert TO Betty]

This being the case, does Lucene enumerate ALL possible names? I find that hard to fathom. If it does, then I give up now. If not, then couldn’t we just encode the dates as unambiguous, comparable strings? Something like:

dateOfBirth:[19700801 TO 20030615]

Look familiar? It should. I just copied and pasted the original example. But if this time around we consider the dates as strings of the form yyyyMMdd instead of attributing any special notion of date, wouldn’t that solve the problem? Wouldn’t that also easily allow us to perform partial range searches that include say only the year or year and month?
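To make the idea concrete, here is a minimal sketch of the encoding on its own, with no Lucene calls at all. DateKeys and its method names are made up for illustration; the only claim is that fixed-width yyyyMMdd strings sort in the same order as the dates they represent, so a plain string comparison (or a prefix match for a partial date) is enough. Whether Lucene evaluates a string range any more cheaply than a date range is exactly the open question, and this sketch does not answer it.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

// Hypothetical helper for illustration; not part of Lucene.
class DateKeys {
    // Fixed width, most significant field first, so lexicographic
    // order matches chronological order (for four-digit years).
    private static final DateTimeFormatter KEY = DateTimeFormatter.ofPattern("yyyyMMdd");

    static String toKey(LocalDate date) {
        return date.format(KEY);
    }

    // Inclusive range check using nothing but string comparison.
    static boolean inRange(String key, String from, String to) {
        return key.compareTo(from) >= 0 && key.compareTo(to) <= 0;
    }

    // Partial search: every key for a given year shares the same 4-character prefix.
    static boolean inYear(String key, int year) {
        return key.startsWith(String.format("%04d", year));
    }
}
```

The same trick extends to year-and-month searches by matching on the first six characters instead of the first four.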

A Lucene expert I am not but all the links we found suggesting various other “work-arounds” (one of which suggested upping the limit on the number of clauses!) seemed little more than hacks. So, please, please, please tell me I’ve missed something obvious because the solution really does seem that simple to my feeble bwain.

Where's The Problem?

One of the biggest problems I see over and over again is the difficulty support and maintenance teams have diagnosing problems. It’s unfortunate but developers have a knack of writing code for Happy Days scenarios - when the software works, it works well; when it fails, it can be a disaster.

A cow-orker and I were recently discussing the use of assertions in production code. He had previously been discussing the topic with one of his colleagues who had suggested that their existence was a smell; that it indicated a lack of testing.

Now, enough has been said on the topic of unit testing so suffice it to say that the great thing about unit testing is that it’s easy to ensure your components work as advertised. You can even, and in many cases should, test what the behaviour would be given invalid input such as NULLs, etc. Then of course we move into integration, functional, acceptance, regression, etc. testing to prove that the application hangs together as a whole.

The problem is that tests don’t necessarily prove that the software does what it’s supposed to GASP!. Rather, tests prove that software works for the given scenarios and the assumptions made and the plain fact is that these do not always match reality. We may have the greatest, most comprehensive test suite in the known universe, but if it’s testing the wrong things, it matters little. Sure, in a system over which you have complete control, high levels of functional/integration test coverage can compensate but even then, even 100% code coverage doesn’t equate to 100% accuracy. In particular, at best, it is difficult to test for and therefore prevent clients of your code from passing invalid parameters. In fact if you’ve ever written and published a public API you’ll know that, by definition, it’s impossible.

The further into a system a problem propagates, the more difficult it becomes to diagnose and the greater the likelihood of “damage”. One of the major benefits of production code (as opposed to test code) assertions is that at run-time we can detect and prevent unexpected scenarios as early as possible, thereby preventing them from propagating. Maintenance developers and those familiar with the Fail Fast axiom will appreciate how important this is in a production environment.
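As a rough illustration of what failing fast looks like in code, here is a small sketch. LedgerAccount and its rules are invented for the example. It uses explicit checks that throw rather than Java’s assert keyword, on the assumption that the checks should stay switched on in production (assert statements are disabled unless the JVM is started with -ea).

```java
// Hypothetical example class; the names and rules are made up for illustration.
class LedgerAccount {
    private long balanceInCents;

    LedgerAccount(long openingBalanceInCents) {
        // Reject bad state at the boundary instead of letting it propagate.
        if (openingBalanceInCents < 0) {
            throw new IllegalArgumentException(
                    "openingBalanceInCents must not be negative: " + openingBalanceInCents);
        }
        this.balanceInCents = openingBalanceInCents;
    }

    void withdraw(long amountInCents) {
        if (amountInCents <= 0) {
            throw new IllegalArgumentException(
                    "amountInCents must be positive: " + amountInCents);
        }
        if (amountInCents > balanceInCents) {
            throw new IllegalStateException(
                    "Insufficient funds: balance=" + balanceInCents + ", requested=" + amountInCents);
        }
        balanceInCents -= amountInCents;
    }

    long balanceInCents() {
        return balanceInCents;
    }
}
```

The point is less the specific checks than where they sit: the caller who passes the bad value gets the exception, with the offending value in the message, rather than someone three layers further in getting a mysterious negative balance a week later.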

Environmentally-Friendly Configuration

Dave came over to me the other day and said he wanted a new build check written that searches the entire source base for "C:". It’s a familiar problem where supposedly automated scripts refer to specific directories, paths, etc. Move the script, run it in another environment, and BLAMMO!

I recall some colleagues telling me of a problem they had once where the system had been running fine for months in development and then one day it mysteriously started failing, every time, on every developer’s machine. It turned out that ALL developer machines had been configured to communicate with the message queue on another developer’s machine because that’s what had been checked in to CVS.

More recently, we encountered a problem where the UAT environment worked but not System Test. Again, it turned out that the default configuration for a remote URL was, lo and behold, set for use in a UAT environment. Each time the System Test deployment was run, it was communicating with the wrong server. No one noticed it at first because, aside from the server name, the string of request params is the same in all cases.

These days, as a bare minimum, we strive to have ALL “default” configuration values set to something along the lines of "THIS_IS_WHERE_THE_VALUE_OF_X_NEEDS_TO_GO". This way it sticks out like the proverbial Dogs Bits when you’ve forgotten to tailor something for a specific environment.

Then we use build properties such as configuration=development, configuration=uat, etc. specified on the command-line that allow the build scripts to substitute in all the various values appropriate for the intended target environment. This, coupled with some Build Watermarking, almost guarantees that this class of configuration problem is a thing of the past.

That is of course until one of the developers comes to you and proudly explains that they have “discovered the problem. There were these strange values in the reference data scripts, so I changed them all to sensible defaults and checked it in.”
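The loud-placeholder convention described above lends itself to an automated check. The following is only a sketch under assumptions: ConfigCheck is a made-up class, the configuration is assumed to be available as java.util.Properties, and the THIS_IS_WHERE_ prefix is taken from the example value quoted earlier.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;
import java.util.TreeSet;

// Hypothetical build/startup check; names are assumptions for illustration.
class ConfigCheck {
    // Assumed convention: every untailored default starts with this prefix.
    static final String PLACEHOLDER_PREFIX = "THIS_IS_WHERE_";

    // Returns the keys (sorted) whose values are still placeholders.
    static List<String> untailoredKeys(Properties config) {
        List<String> offenders = new ArrayList<>();
        for (String key : new TreeSet<>(config.stringPropertyNames())) {
            if (config.getProperty(key).startsWith(PLACEHOLDER_PREFIX)) {
                offenders.add(key);
            }
        }
        return offenders;
    }

    // Fails loudly rather than letting a deployment talk to the wrong server.
    static void requireTailored(Properties config) {
        List<String> offenders = untailoredKeys(config);
        if (!offenders.isEmpty()) {
            throw new IllegalStateException("Untailored configuration values: " + offenders);
        }
    }
}
```

Run against the substituted configuration for a target environment, a check like this would catch a forgotten value. It would not, of course, catch the developer who helpfully replaces the placeholders with “sensible defaults”.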

Unit Tests As Complexity Sponge

A number of people have variously commented that unit tests may in fact be more about design than actual testing. Many others (the links elude me at present) have also complained about the undue burden imposed by a large number of unit tests and that because of this, and other very sound reasons, they prefer acceptance tests. If I was ever in any doubt about the importance of acceptance tests, I was certainly convinced after the last project where acceptance tests would fail where no unit test had, due no doubt in large part to the fact that the acceptance tests also acted as integration tests.

One thing I did notice however was that over and above their usefulness as a design tool, unit tests seemed to act as yet another positive constraint helping reduce the overall complexity of the code. Because developers were forced to write unit tests, they were forced to produce relatively simple, testable code. Much simpler, I believe, than would have been the case otherwise. The down side to this testability was that in many cases, the corresponding unit tests were rather more complex than we would have liked. And, as noted previously, complex tests tend to be brittle and this has a knock-on effect with respect to maintenance. But does this really matter?

Ultimately what is important is working software (for which you have acceptance tests) and clean, easy to understand code that is hopefully cheaper to maintain. You could choose to throw away all those “dirty” unit tests once you reach production and rely solely on your acceptance tests; or you might choose to buy new ones through refactoring/re-writing; or you may decide that the unit tests are worth the extra effort to maintain.

Whatever the course of action, it seems to me that, yet again, unit tests have benefits beyond simply (or not as the case may be) producing “correct” code.

Something Tells Me This Could Be Bad

ADGU3163I: Suppressing console message display from server because the arrival rate of 38.76per second exceeds the threshhold rate of 10

CVS Saves My Life Once Again

Some time ago I trashed my Linux machine by running rm -rf on a logical mount that, for reasons too mundane to discuss, was pointed at the root partition. Yes yes, snigger snigger hehe but tell me you’ve never done something similar :P. Anyway, a day or three later, I had resurrected my machine and restored all my files from those non-existent backups that we all vow to make…one day.

I deploy stuff to my various websites using ant scripts. Each time I deploy a new version of a product or project or even just make a change to some static HTML, it’s automatically shipped using JSch, then some shell commands are run using SSH to move stuff around/configure things as necessary.

Up until about an hour ago, I had been using a semi-colon (;) as the command delimiter with no issues at all. There is a problem with this that I had never considered: if any of the commands fails, the shell keeps on executing the remaining commands! Now in my case, one of those commands changes directories and is followed immediately by, you guessed it, rm -rf *. Not a problem if everything goes to plan but I had just recently renamed said directory! Needless to say, it wasn’t pretty.

It then occurred to me that what I should have been using were double-ampersands (&&), which terminate execution at the first failure. Even better would be to simply rename (move) the existing directory and create a new one: my newly adopted strategy.

Thankfully, all my projects and web sites are in CVS so restoring them is never a problem which got me thinking about all the bits of code I’ve seen commented out or the unused classes left lying around because, like all those off-cuts of building material you’re keeping in the shed, “who knows when I might need that.”

More often than not CVS is used purely as a central repository that all developers can access but it can and should be more than that. Having everything in CVS allows greater fluidity in development. It allows developers to try something out and if it ends up completely borked, well, we can always just roll back. James even suggested (now that he’s a CVS guru along with Jon :P) using CVS branching to do some speculative changes without disturbing the guys on the trunk whilst still allowing us to check-in the code. This is certainly something I would have been extremely reluctant to do even 3 months ago but having seen it work out (so far) for one of the projects on which we depend, I’m rather more inclined to give it a go.

As developers, we need to feel comfortable with and trust that all of this is possible. I doubt that many people I’ve met actually understand all the subtleties and features of CVS, myself included. One thing I know for sure is that resurrecting dead files in CVS isn’t nearly as simple as it sounds. I’m hoping Subversion will address this but right now, something tells me that after everything I’ve just said, using beta-software for SCM on my critical projects is, perhaps, not the smartest thing one could do.

McAppy Time

Something appeals to me about the way Mac OS X applications are distributed. Sometimes there is an installer (boo!) but more often than not there is only one “file” to drag into /Applications. I say “file” because although it looks like a single file, it’s actually an entire directory structure with a special folder Contents containing meta-info that the Mac OS X GUI understands. Sort of like an unpacked Jar file but for native applications. Best of all, if I decide to remove the application (I’ve played with about 10 IM clients so far) just delete the application and whoosh, it’s gone. In most cases about the only thing left lying around are preferences in ~/Library/Preferences.

So, finally understanding all that, I thought I’d try my hand at building a deployable application - called a bundle. I found a web site that documented most of the steps required. I also found an Ant target that allows you to generate a bundle from a build. The next step was to build an application.

Being the Java Snob that I apparently am, I naturally decided to try deploying a Swing app. The fact that Mac OS X comes with Java out-of-the-box is pretty cool. What’s more, the deployment mechanism supports Java apps directly, meaning that besides some minor L&F issues here and there, Java applications slide right in amongst native applications.

So, I thought I’d put a “Launcher” GUI on Simian just for fun. Nothing special. Certainly nothing to replace the IDE plugins others have written for it. And here’s a picture. Makes me want to work on a Swing project even more. Though I believe JavaScript is all the rage now ;-)

One thing I didn’t do was specify an application icon but that’s as simple as converting an image and dropping it into the Resources folder inside the bundle, something that can be achieved by using the icon parameter on the ant task.

The other thing I need to do is to find an ant task that can create disk images - the standard way Mac applications seem to be distributed.

And finally, just as I was about to post this, I stumbled across a three-part series on making your swing applications play nicely on the Mac.

Not sure if any of this is any better, worse or really just the same as a Jar file. But it’s kinda fun in that geeky kinda way.

The Sound Of One Man Snapping

Nothing like waking up after a night of disturbing dreams of zombies drinking bottles of warmed-up coca-cola. It’s official. Last night’s blog entry was me losing the plot. It’s happened twice now in the last couple of weeks and is a sign of me becoming someone I despise. It’s an indication that my ability to cope with being asked to be responsible for things over which I have little or no control is non-existent. In all my years of software development, I’ve honestly never felt this way. It’s certainly not in my nature. I woke up this morning wishing I could have the last week all over again.

So to all those who were offended by it, I unreservedly apologise. I have deleted the entry and will make sure I go and have a beer instead next time.

FWIW, I make mistakes. Everyone makes mistakes. Almost everything I’ve ever blogged about I’ve done at some stage as well. That’s why I write about it. I grew up with the belief that it’s not a person that is wrong/stupid/whatever but the things they do. This is why I try so hard to document here all the stupid things I have done in the hope that others won’t repeat them.

Don't Panic!

Apparently, “after hours” batch jobs don’t require load testing. Yes, you heard it. Supposedly jobs that run when no users are logged in are pretty much free to do whatever they like, all 57 of them! Is it some weird side effect of Heisenberg’s Uncertainty Principle that I’ve never heard of whereby it’s possible to be either using the system interactively; or have limitless computing power; but not both at the same time? Who knows but excuse me for suggesting otherwise.

You’ll also be relieved to learn that there is no need to load test your applications together on the same box even if they will be co-located in production because we can extrapolate from the results obtained by running a single application stand-alone. That’ll be a good cost saving I’m sure.

Oh and as for including the generation and downloading of PDF documents, bah! That does nothing more than test all that pesky “network bandwidth stuff”. There’s nothing we can do about that anyway so why bother testing it right?

Phew! That’s a load off my mind (no pun intended). I had thought that we might end up doubling the load on the production box but it seems I was somewhat misguided. Glad they’ve got it all sorted out. At less than 6 weeks ’till go live and with the application only just now limping into System Test, I was beginning to worry. Silly me. What was I thinking?

Now where did I put my double pair of Joo Janta 200 Super-Chromatic Peril Sensitive Sunglasses? I’m sure they’re around here somewhere…

Mac OS X House Keeping

Having been a Linux weenie for a few years now I had become accustomed to running various house keeping jobs on a regular basis and I wanted to do the same thing on my new PowerBook.

In particular, I use locate for quickly finding files, which, to be of any use, requires the indexer (updatedb) to be run periodically. A quick grep through the man pages and I discovered the OS X version was /usr/libexec/locate.updatedb so the next step was to get it to run as a batch job.

Whilst searching for the appropriate place to put my daily system cron jobs (/etc/daily.local), I ran across this little gem in /etc/daily: # Clean up NFS turds. May be useful on NFS servers.

Occam Need Not Apply

Wanted: Software developers for long-term, large-scale enterprise application project. Complex solutions to complex problems. Ability to justify largely redundant framework development to senior management a must.

Why is it that when left to their own devices, and given more than one way to implement something, developers will almost certainly undertake the most complicated?

Speculative Optimisation

or pre-factoring as Dave likes to call it, is a common practice. It’s an easy trap to fall into. Take a look at any piece of code and I’m sure you will see a way to make it run faster. The problem is that performance bottlenecks are almost never where you would expect them to be. Sure, we might be able to double the speed of a piece of code but if it only accounts for <1% of the overall running time, then it doesn’t really matter. Just recently I had someone recommend that we add in some caching of database results because “It will be a performance problem.” The question I had was, when compared with what?
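The arithmetic behind the 1% remark is worth spelling out. The sketch below uses the standard Amdahl’s law formula, which is my framing rather than anything from the conversation above; AmdahlSketch is a made-up name. With a fraction of 0.01 and a local speed-up of 2, the whole run takes 0.995 of its original time: a saving of about half a percent.

```java
// Hypothetical helper illustrating Amdahl's law; not tied to any profiler.
class AmdahlSketch {
    /**
     * Overall speed-up when a part of the program taking `fraction` of the
     * total running time (0.0 to 1.0) is made `localSpeedup` times faster.
     */
    static double overallSpeedup(double fraction, double localSpeedup) {
        return 1.0 / ((1.0 - fraction) + fraction / localSpeedup);
    }
}
```

Plug in the numbers from a profiler run rather than a hunch: doubling the speed of something that accounts for half the running time is a very different proposition from doubling the speed of something that accounts for a hundredth of it.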

Performance optimisation often (but not always) involves obfuscating the code in some way to achieve the desired performance. Maybe we need to inline some code or unroll a loop here or there. Whatever it is can lead to code that is hard to read and hard to understand and, as we have discussed before, therefore hard to maintain. Ironically, our so-called optimisations can potentially lead to worse performance. If the algorithm is difficult to understand or the code simply hard to follow, we might actually introduce unnecessary overhead without even realising it. If we have no base-line, no benchmark with which to compare our results, we will never know if we are improving or degrading the performance.

For this we need a profiler. There are plenty around, some free and some you’d have to sell the kids to afford. Quest have a free version of JProbe for use with Linux and Windows that James and I have been using to profile Drools. It’s missing some features but certainly nothing we can’t live without (how many negatives can a man use in one sentence?). There really is no magic involved. Run it, see where the biggest slice of the pie is and start there. Keep doing that until you’ve knocked off all the big ticket items. Chances are that’ll get you most of the way. Anything beyond that probably requires a fundamental shift in the design. But hopefully, because you have a clean design, that shouldn’t be too much of a problem ;-)

Interestingly, one of the simplest things you can do with your design is to make things as close to immutable as possible. So, for example, rather than have lots of JavaBeans with setters, use constructors. Mark your fields final. Not so much because that in itself is a performance enhancement (although it may be?) but to ensure that the state of your objects is as stable as possible. It also makes it much easier to find out who’s messing with the state. To achieve this, you may find you need to decompose those monolithic classes into smaller ones. I’ve found it helpful to introduce Builders to accumulate state before constructing your objects. You can think of mutable objects as having many moving parts and the more moving parts to a system, the harder it is to work out what’s happening and the harder it will be to re-factor when you finally perform your profiling.
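As a minimal sketch of that idea (Customer and CustomerBuilder are made-up names for illustration, not taken from any real project), an immutable class with a separate Builder might look something like this:

```java
// Illustrative only: an immutable value class. All state is set once in the
// constructor and every field is final, so nothing can change it afterwards.
final class Customer {
    private final String name;
    private final String email;

    Customer(String name, String email) {
        if (name == null || email == null) {
            throw new IllegalArgumentException("name and email are required");
        }
        this.name = name;
        this.email = email;
    }

    String getName() { return name; }
    String getEmail() { return email; }
}

// The Builder is the only mutable piece. It accumulates state and hands it
// over to the constructor in one go.
final class CustomerBuilder {
    private String name;
    private String email;

    CustomerBuilder name(String name) { this.name = name; return this; }
    CustomerBuilder email(String email) { this.email = email; return this; }

    Customer build() { return new Customer(name, email); }
}
```

Usage would be along the lines of new CustomerBuilder().name("Bob").email("bob@example.com").build(). Once built, the only way to "change" a Customer is to construct a new one, which leaves the constructor as the single place to look when the state is wrong.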

Experience has taught me over and over again that correct code is much easier to optimise than clever code. This is why I’m a firm believer in Make It Work, Make It Right, Then Make It Fast.

Death To Blog Spam Arrgghhh

By

I’ve been using MT-Blacklist for some time now and while it does a good job of moderating the spam, I’d rather it didn’t even get that far. So in a last ditch effort to eradicate comment spam altogether, I’ve just installed a different kind of solution. This plugin puts up a security code graphic that you must enter in order to submit the comment. Although there have been some complaints about this technique on the grounds that it is discriminatory towards people with impaired vision, I’m going to give it a whirl anyway and see how it goes. Apparently the guy who wrote the plugin has also recently written a Bayesian filter but personally, like with MT-Blacklist, I don’t have the time to sift through all the comments, deleting the spam.

UPDATE 1: Seems to be working a treat. I’ve had not one blog spam comment in the last 24 hours but people have successfully commented manually. I usually get around 6-10 spam comments in the same period.

UPDATE 2: It’s amusing to look at my web logs and see all the access attempts from dodgy sites, no doubt attempting to post comment spam and failing dismally!

When Corporates Embrace Open Source

By

It is common for organisations to justify the use of popular Open Source Frameworks on the basis that developers with these skills are easy to come by. In addition, because the source code is readily accessible, it’s easy to make bug fixes and patches whenever needed. This is clearly justification enough that no analysis need be performed in order to ascertain if said framework actually fits the technical requirements of the application.

The next step always seems to be to download the source code and check it into a local repository. Then, have a core group of developers maintain it internally. This team will be responsible for checking out the source code, building it and distributing it to all the other teams ensuring that changes are controlled and all teams keep up to date with the correct version.

After using the framework for a few months, it becomes obvious that the way the code was originally written is either: broken; wrong; or doesn’t quite fit with The Way We Do Projects Here ™. This then requires massive changes to “simplify” the design and add enhancements wherever “necessary” - Like masking all those pesky exceptions that get thrown and instead returning null.

Of course now that so many changes have been made, and coupled with the requirement that all projects be uniform in quality, it becomes necessary to ensure that project teams cannot and will not use the version(s) available from the original project site but instead are forced to use the highly tailored internal version. In fact it’s probably a good idea to make the framework a “black box”. I mean, why would the non-core developers need or want access to the source code? The core team are providing a service after all and that is all that’s important, so access to the internal repository must be on an as-needs basis.

And finally, after 12 months of development and hard work, it is customary to allow The Architect who made ALL the proprietary changes (to the supposedly open framework) to go on 4 weeks holiday, just prior to delivery to System Test, leaving the project team to fend for themselves so that when a bug is found, the only solution is to fork the code (again) and check it in to the project repository on the proviso that the changes make their way back into the core ASAP.

Why Type When I Can Skype

By

Throw out Yahoo! Messenger, if you’re not using Skype, I’m no longer your “buddy” :P. I’ve tried voice chat before but nothing even close to as good as this. I can’t believe I’ve never heard (pardon the pun) of it before.

I plugged my headphones in, “called up” a friend and started speaking at my laptop (there’s a mic there somewhere though I’ve no idea where). The sound quality is astonishingly good. My friend might as well have been sitting next to me.

So I started calling up everyone I could. My brother travels a lot for work and has two kids and I figured it would be really useful for him. “brb (be right back)” he says so I start playing some of my newly ripped CDs as “hold music”. When he got back I asked him what the sound quality was like. “I thought a CD had started playing on my computer” he replied.

Apparently you can have up to a 4-way chat. Though I’ve not tried, if you’re prepared to pay, it allows you to make international, and possibly even local calls. I’d be interested to hear from anyone that has. And even better it has a text based IM client built in so if you like to type instead you can; though why would you bother, Jon? :P

The next bit is to work out how to get VNC working over a VPN across The ‘Net so that James and I can do a bit of remote pair programming…mmmm

Now if only I could find a way to have multiple voice chat conversations going at once without having my brain explode. Oh well I guess it’s a little too early to throw away the IM client after all. DOH!

Beware The Cross-Product Join

By

An interesting discussion started on the Drools user mailing list regarding some problems writing a rule. The particular problem is not unique to business rules though. RETE-based inference engines share much in common with relational databases and in fact this particular problem can affect SQL queries in the same way as it affects business rules.

Let’s say we wanted to find all pairs of people that were maternal siblings (i.e. that had the same mother). In SQL we could write a query like this*:

SELECT * FROM Child c1, Child c2 WHERE c1.motherId = c2.motherId

If we imagine we have only two children in our database, Bob (childId = 1) and Mary (childId = 2), both having the same mother, this query would generate four rows:

  • Bob, Mary
  • Mary, Bob
  • Bob, Bob
  • Mary, Mary

This is called a cross-product; every row is joined to every other row. This results in rows we’re not interested in: Bob, Bob and Mary, Mary. So the first thing we would do is try and ignore rows where the child was the same:

SELECT * FROM Child c1, Child c2 WHERE c1.motherId = c2.motherId **AND c1.childId != c2.childId**

Which results in:

  • Bob, Mary
  • Mary, Bob

The next thing you’ll notice is that we still have redundant rows - rows that mean the same thing. There are a few “tricks” to avoiding this and they really come down to a knowledge of the underlying attributes of the tables involved. The simplest in our case is to change the condition:

SELECT * FROM Child c1, Child c2 WHERE c1.motherId = c2.motherId **AND c1.childId < c2.childId**

By imposing an arbitrary ordering, we prevent rows being joined to themselves and ensure that for any two siblings, we only get one row. Best of all, this technique translates directly into the implementation of business rules.
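The ordering trick is not specific to SQL. As a rough sketch in plain Java (the Child and SiblingPairs classes are hypothetical, made up for illustration, and are not Drools API), the same condition applied to an in-memory nested loop looks something like this:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical fact class mirroring the Child table in the SQL above.
final class Child {
    final int childId;
    final int motherId;
    final String name;

    Child(int childId, int motherId, String name) {
        this.childId = childId;
        this.motherId = motherId;
        this.name = name;
    }
}

final class SiblingPairs {
    private SiblingPairs() {}

    // Joins every child against every other child (the cross-product), but
    // keeps a pair only when the mothers match AND c1 sorts before c2.
    // The "<" rules out self-pairs (Bob, Bob) and mirror pairs (Mary, Bob).
    static List<String> find(List<Child> children) {
        List<String> pairs = new ArrayList<>();
        for (Child c1 : children) {
            for (Child c2 : children) {
                if (c1.motherId == c2.motherId && c1.childId < c2.childId) {
                    pairs.add(c1.name + ", " + c2.name);
                }
            }
        }
        return pairs;
    }
}
```

Given Bob (childId = 1) and Mary (childId = 2) with the same mother, this yields the single pair "Bob, Mary". The loop still visits every combination; the condition only cuts down what comes out the other end.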

Not only do cross-products produce redundant and possibly incorrect results, the extra tuples (rows) generated as a consequence can cause your rule engine to grind to a halt.

* I realise that no one is going to model Children and Mothers in different tables but please cut me some creative slack ;-)

Project Risks

By

A few weeks ago I gave a lecture to some second year university students here in Melbourne. The talk was titled “e-Business In The Real World” but really it was me yabbering on about my experiences delivering software. Anyway, a couple of people have asked me to publish the slides so [here they are](/blog/archives/RMIT eBusiness Lecture.pdf), all done using NeoOffice/J on my brand spanking new PowerBook. They’re not much, nothing fancy, but they really summarise the risks associated with delivering software.

If I could sum it all up I would say that if your problems are largely imposed by entities external to The Team then that’s about normal; you just have to identify the risks and mitigate them somehow. If, on the other hand, your major problems are technical, i.e. within The Team, you’re in deep doggie doodoo; fire them all and start again ;-P

UPDATE: Having been asked to present again, I revised the slides slightly using Keynote. The content may be much the same (a few changes here and there) but it sure does look sexier now ;-). Unfortunately Keynote produces an enormous PDF so I actually exported to PPT, then imported into NeoOffice/J and re-exported to PDF, producing a file that is less than 10% the size!