Posts

Drools Schmokes! - Part II

October 6, 2004

So once we’d worked out what the major hot spot in drools was, it was time to find an alternative method of conflict resolution.

As a bit of background, in simple terms, as facts are asserted, new items (or activations) are added to the agenda. In the general sense, all agenda items are equal. But some are more equal than others.

Although you should stay away from attempting to infer or impose ordering on rules, sometimes it is necessary. Sometimes you just need a couple of “cleanup” or “setup” rules, that are guaranteed to fire before or after all others. In Drools (and JESS) this is known as salience. In JRules it’s called priority.

There are other reasons to order the agenda and Drools has a number of different strategies: Random; Complexity; Load Order; etc. These are then chained together. Each Resolver then gets a chance to add the item to the agenda. If it succeeds, no more resolvers are called. If however the item conflicts with one or more existing ones, all are returned and passed to the next resolver to, well, resolve LOL.

Confused? Here’s a better explanation.

Looking at the implementation it was apparent that the complexity was O(n^2). Each resolver seemed to be doing a similar thing. It was also optimised quite a bit meaning there was necessarily duplicated code.

My initial gut feeling was that a priority queue was what we needed but how would we do the chaining of the different concerns?

Maybe something like a Red-Black Tree would be useful. Maybe we could implement a comparator for each strategy. Conceptually at least, we would use the first comparator to insert into the tree until we found items that were equal. From then on we would continue to insert using the next comparator, etc. This seemed too complicated and I don’t do complicated very well. Makes my head hurt.

It seemed that each of the strategies was really just using a different dimension or aspect of the item to perform a sort. It was like a composite key. So what’s the easiest way to sort on a composite key? Use a composite comparator. Something like:

    public class CompositeComparator implements Comparator {
        private final Comparator[] _comparators;

        public CompositeComparator(List comparators) {
            this((Comparator[]) comparators.toArray(new Comparator[comparators.size()]));
        }

        public CompositeComparator(Comparator[] comparators) {
            _comparators = comparators;
        }

        // Ask each comparator in turn; the first non-zero answer wins.
        public int compare(Object o1, Object o2) {
            int result = 0;
            for (int i = 0; result == 0 && i < _comparators.length; ++i) {
                result = _comparators[i].compare(o1, o2);
            }
            return result;
        }
    }

I tried it out using a TreeSet but it performed just as badly. Maybe I was wrong I thought to myself. So I jumped online and chatted to some of the Drools guys, Mark Proctor in particular. I described my ideas and he seemed to like them.

We did a bit of searching around for implementations we could use. I found one here but the license wasn’t right. Next we thought of Doug Lea’s stuff but it was overkill. Finally [Peter Royal](http://fotap.org/~osi/) suggested looking at the commons-collections stuff and voila, there it was - PriorityBuffer - and it took a Comparator!

Hackedy, hackedy, hack and we’d replaced the original stuff with the priority queue. Time to give it a whirl.

The first step was to run the queue with a simple Comparator. Although it doesn’t really do anything much, it would at least allow us to see what the basic overhead of the queue implementation was:

    public class ApatheticComparator implements Comparator {
        // Always claims o1 sorts before o2, so no real ordering work is done.
        public int compare(Object o1, Object o2) {
            return -1;
        }
    }

Hit run. Damn that’s quick! Once more to be sure. Yup. Hmmm. Still not convinced. Add a breakpoint and run in the debugger. Sure enough it’s being called. Cool! Ok now to try LoadOrder and Salience.

    public class SalienceComparator implements Comparator {
        public int compare(Object o1, Object o2) {
            return ((Activation) o1).getRule().getSalience()
                    - ((Activation) o2).getRule().getSalience();
        }
    }

Same deal. All works just fine and after implementing a few more I was convinced that this was going to be a winner.

Now we have O(n log n). Even with all the comparators chained in, the performance doesn’t change one bit. What’s more, the different strategies are simple one-liners, making implementing new strategies almost trivial!
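To make the idea concrete, here is a minimal sketch of a chained comparator driving a priority queue. It is not the Drools code: it swaps in `java.util.PriorityQueue` for the commons-collections `PriorityBuffer`, and `AgendaItem`, `AgendaSketch` and the "higher salience fires first, then load order" policy are assumptions made purely for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical stand-in for an agenda item; not the Drools Activation class.
final class AgendaItem {
    final String rule;
    final int salience;
    final int loadOrder;

    AgendaItem(String rule, int salience, int loadOrder) {
        this.rule = rule;
        this.salience = salience;
        this.loadOrder = loadOrder;
    }
}

final class AgendaSketch {
    // Assumed policy: higher salience first, ties broken by earlier load order.
    // comparingInt also sidesteps the overflow risk of subtracting two ints.
    static final Comparator<AgendaItem> ORDER =
            Comparator.comparingInt((AgendaItem a) -> a.salience).reversed()
                      .thenComparingInt(a -> a.loadOrder);

    // Pushes everything onto a heap, then drains it in firing order.
    static List<String> firingOrder(List<AgendaItem> items) {
        PriorityQueue<AgendaItem> agenda = new PriorityQueue<>(ORDER);
        agenda.addAll(items);                  // each insert is O(log n)
        List<String> fired = new ArrayList<>();
        while (!agenda.isEmpty()) {
            fired.add(agenda.poll().rule);     // each removal is O(log n)
        }
        return fired;
    }

    public static void main(String[] args) {
        List<AgendaItem> items = Arrays.asList(
                new AgendaItem("cleanup", -10, 0),
                new AgendaItem("b", 0, 2),
                new AgendaItem("setup", 10, 3),
                new AgendaItem("a", 0, 1));
        System.out.println(firingOrder(items)); // prints [setup, a, b, cleanup]
    }
}
```

Adding another strategy to a chain like this is one more `thenComparing` call, which is the "one-liner" property in a nutshell.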

So once more I must applaud the Drools guys for a flexible and performant design!

Paste Your Code

By Simon Harris

October 6, 2004

Anyone who’s used TinyURL will understand how cool this is. One of the guys (Mark Proctor) over at the [haus](http://www.haus.org) put me on to it.

As the title suggests, it allows you to paste your code and generate a unique URL for it. You can select a language, choose a “nickname”, enter a description and even convert tabs to spaces if you so desire.

The result is formatted code, with line numbers, that you can easily share with others. Pretty neat.

Drools Schmokes!

By Simon Harris

October 5, 2004

We’re about to open source a new rule-based project and up until now, we’d been using various closed source rule engines to get us going. Of course this won’t cut-it once we open source so we hoped that Drools would come to our rescue.

And it did. With some caveats, I can safely say that Drools is incredibly fast. Not bad for a code base that by their own admission has, quite rightly, favoured stability over performance and as such has had little or no profiling done.

Luckily we had built joodi, short for Java-Based O-O Design Inferometer (just had to get the word Inferometer into a project somehow!), test-first and as such the guts of the app was based on interfaces so cutting over to Drools was pretty easy. It took me about an hour I guess to convert the application, rules, tests and all, to run with Drools. We fired it up. All tests passed. Hooray! How happy were we!?

Next to run a “benchmark”. We ran the application over the struts classes using the closed source engine first and it finished in around 9 seconds. COOL! Performance had been one of our unknowns and this was certainly well within tolerances.

Then we switched over to Drools and ran the same test. 20 minutes later it still hadn’t finished. Another ten minutes I’d say and I was fast asleep. So when morning came around I leapt up and ran into the lounge to see if it had finished. It had. In 78 minutes!!!

Yikes we thought. This ain’t going to cut it. Elation turned to dismay. But no real profiling of Drools had been done so surely there was room for improvement?

After a bit of chatting with the peeps in da haus, I decided to check-out the source and use JMP to do some profiling. Run it, we thought, find the lowest hanging fruit, fix it, then keep doing that until we’ve done all the obvious stuff.

So I cranked it up and it didn’t take long to find a hot-spot. In fact it appeared that nearly 50% of the time was being spent in one small area - conflict resolution. A quick look at the source code was all that was needed to confirm my suspicions. Lots of unnecessary iteration. But again, I’m not taking anyone to task over it. I’d rather it was stable and functional first.

Looking more closely at the code, I realised that the functionality provided by the classes under scrutiny was not actually necessary, yet, for me to get joodi running. Thankfully due to the thoughtful design it was pretty easy to stub out, without even touching the Drools source-code.

Time to run again…holy-cow! 5 seconds! That can’t be right. Run it again. Nope 5 seconds again. Quick look at the output to verify it was actually working correctly. Yup. Run all the joodi unit tests just to be sure. Yup they run just fine. It had gone from being 300 times slower to almost twice as fast!

Damn I’ll try running joodi against another, bigger, project - xerces. With Drools plugged in, joodi ran in around 9 seconds. With the closed source product I gave up after 5 minutes and stopped it.

So hats off to the Drools team. Damn fine job! I’ll be submitting my patches ASAP and hope to see some of that other code re-factored soon :-)

Care-Factor Nine Mister Spock

By Simon Harris

October 3, 2004

This started as a reply to a very pertinent comment on a blog entry of mine but it grew to the point where I thought it deserved an entry of its own.

First to the original comment, I always appreciate a good rant. How could I not LOL. And I agree whole-heartedly with the sentiment. I don’t tend to blog about my personal life because, well, it’s personal hehehe. I don’t really get much from writing about my life experiences, yet. Maybe one day but until then I do get a lot from writing about software development. It’s an area of my life where lots of discussion and debate seems to make a big difference.

So for the curious, I teach and train martial arts most week nights. I spend most weekends with my family except for the occasional geek session here and there. I work for 9 months of the year and take 3 months off mostly to travel - I’ve lived a total of 3 years in Japan off and on over the past 17 years. I speak Japanese. I ride my motorbike whenever I can. I ride my mountain bike whenever the weather permits…. But rather than bore you with my “I’m a Leo I enjoy cooking and dancing” story, let me summarise by saying that I do believe that life is about living and NOT about software development.

Don’t get me wrong, I don’t dislike software development. As far as a job goes it’s the best one I could hope for right now. It’s interesting. It’s challenging. It keeps my mind active. And I get to meet loads of interesting people in the process. But every year I go to Japan to train or I go hiking in New Zealand and I don’t miss the internet nor email nor mobile phones nor any technology to be honest. When it comes down to it, if I were independently wealthy I could turn my back on computers and never look back.

But that was not and is not the point. The point is that no matter whether it be software development, house keeping, whatever, all I ask is that you GIVE A SHIT about what it is you are doing and that you take some care and some responsibility. If you don’t, won’t or can’t, then STOP, CEASE, DESIST! You will do more harm than good so please go away, we don’t need nor want you.

My Aikido instructor is famous for ripping shreds through students correcting their technique. Hearing him scream “DAME!” (Japanese for “wrong”) across the mat can be a bit much for some students. But he once said to us that “there are only two reasons you’ll never receive a DAME from me. Either you’re so good that you don’t need it; or you’re so bad I’ve given up and I don’t care about you anymore.”

So I hope you’ll understand that I intend to continue ranting and writing about software development, and anything else I feel passionate and enthusiastic about, BECAUSE I GIVE A SHIT. :-)

Stop Calling Me Shirley

By Simon Harris

October 1, 2004

The lack of documentation is disturbing. Requirements in the form of code or often, reverse engineered from the code. Phooey! Seemingly adhoc changes to the spec by the architects. Cowboy developers making changes here and there whenever they feel like it to hack in some new feature. Dependencies between developers forcing them to pair up to write code. What a ludicrous idea! Nothing seems to get done until the last minute. We’ll be lucky to limp across the line. Whatever that line may be. With no real acceptance criteria, how does anyone know when we’re finished?

But wait a minute…it’s a waterfall project. Oops! XP was used on the previous project. Let’s try that again shall we?

All those tests slowing down my build. How dare they make me ensure my code works. All those story cards on the Wiki to read. Boring! Would you believe I was even gasp forced to understand what I was doing by consulting with the business rep. Sheesh gimme a break. Imagine allowing the customer to change their mind at the last minute and still delivering on time. Bah! And what’s with asking me for revised estimates every day? I signed up for anarchy. Instead I got micro-management!

Programmeurs Sans Responsabilité

By Simon Harris

September 24, 2004

In all my years of software development, I have honestly never encountered a developer who really just wanted to be a drone. Someone who wanted to code directly off the spec without a care for what they were doing or why. Until today.

But then it’s not so surprising when they’re given advice like “Learn from me. Always remember my 3 rules: I didn’t do it; It’s not my area; and I don’t know anything about it.”

Surely it’s not too much to ask that a developer actually understands the rationale behind the code they are producing. Surely it is reasonable to assume that a developer has questioned the design and implementation to the point they are at least happy that it conforms to their understanding of the problem domain and that it is traceable to some functional spec.

How can so many people have such a low care-factor for the work they do and the software they produce?

Build Watermarking

By Simon Harris

September 22, 2004

Desktop software products, especially of the Windows variety, invariably come with an “About” Dialog listing, among other things, the version of the software. The product version number helps support staff and developers solve problems when they occur out in the wild. Without a version number, tracking down a problem can often be rather difficult.

Especially on web-based applications, making the product build number, build date, and other configuration information (was it a production or a training build, etc.) accessible to the end user is an invaluable aid to developers, testers and support staff. In fact if you look to the side-bar on this blog, you’ll find the version of MovableType used.

On our last project we made this meta-info available as, funnily enough, META tags (though we could just as easily have used comments) in our JSPs and HTML files. To source the info we simply passed the CruiseControl build number and date through into our Ant scripts to use as replacement parameters when copying the Struts (sigh) application.properties file into the web archive. The person responsible for deployment can always be sure that the correct version of the application has actually been deployed. Then, when testers and users need to report a problem, they simply view the source for the page they are on and, hey-presto, there it is.
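As a rough sketch of the META tag approach, assuming the build details have already been filtered into a properties file at build time: the property keys (`build.number`, `build.date`, `build.type`) and the `BuildWatermark` class are hypothetical names, not the ones used on that project.

```java
import java.util.Properties;

final class BuildWatermark {
    // Hypothetical keys; a real build would substitute these at copy time.
    private static final String[] KEYS = {"build.number", "build.date", "build.type"};

    // Renders the build details as HTML META tags, one per line.
    static String metaTags(Properties build) {
        StringBuilder html = new StringBuilder();
        for (String key : KEYS) {
            String value = build.getProperty(key, "unknown");
            html.append("<meta name=\"").append(key)
                .append("\" content=\"").append(escape(value)).append("\">\n");
        }
        return html.toString();
    }

    // Minimal attribute escaping so a stray quote cannot break the tag.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("\"", "&quot;").replace("<", "&lt;");
    }

    public static void main(String[] args) {
        Properties build = new Properties();
        build.setProperty("build.number", "142");
        build.setProperty("build.type", "training");
        System.out.print(metaTags(build));
    }
}
```

Anything missing from the properties shows up as `unknown`, which is itself a useful signal that a build was assembled by hand.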

On a project I worked on with Dave some time back, we stored the current build number in the database as well. The build number was inserted into a table at the end of the database schema update scripts. A DBA could then visually inspect the data in the table to ensure the correct updates had been applied. Our update scripts also checked this table to ensure a script could not be applied again. At runtime, the application would also double check that it was running against the expected database schema ensuring we never had a, possibly catastrophic, mismatch.
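The two checks described above (no script applied twice, and the application refusing to run against the wrong schema) might look something like the following. This is only an in-memory sketch: `SchemaHistory` and its method names are made up, and a real version would read and write a database table rather than a `TreeSet`.

```java
import java.util.SortedSet;
import java.util.TreeSet;

// In-memory stand-in for the build number table (hypothetical).
final class SchemaHistory {
    private final SortedSet<Integer> applied = new TreeSet<>();

    // Called at the end of an update script; refuses to apply a build twice.
    void recordUpdate(int buildNumber) {
        if (!applied.add(buildNumber)) {
            throw new IllegalStateException("Update " + buildNumber + " already applied");
        }
    }

    // Called at application start-up; fails fast on a schema mismatch.
    void requireSchema(int expectedBuild) {
        int actual = applied.isEmpty() ? -1 : applied.last();
        if (actual != expectedBuild) {
            throw new IllegalStateException(
                    "Expected schema build " + expectedBuild + " but found " + actual);
        }
    }

    public static void main(String[] args) {
        SchemaHistory history = new SchemaHistory();
        history.recordUpdate(41);
        history.recordUpdate(42);
        history.requireSchema(42);   // passes silently
        System.out.println("schema matches build 42");
    }
}
```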

You can add build watermarks to templates used to generate documents such as PDFs, to XML messages sent between systems, in emails as header tags, you name it. In fact anytime traceability back to a particular version of an application might be useful, consider adding some kind of meta-information about your application. Your support staff will love you for it!

Business Rules != Scripting

By Simon Harris

September 22, 2004

As Business Rules come into vogue (again?) and the tools proliferate, there will be the usual fumbling about as many come to terms with what it all really means. How do we use these things? What should I look out for, the pitfalls, the traps? Are there any “patterns”? But above all, the greatest difficulty it seems, is coming to terms with the idea that Rule Engines ARE NOT procedural scripting languages.

The Rete Algorithm (pronounced REE-tee and Latin for net) was developed by Charles L. Forgy at Carnegie-Mellon University in the 1970’s and is used in most modern high-performance rule engines. Rete is able to efficiently handle very large numbers of rules.

One of the most important features of the Rete algorithm lies in its ability to identify and subsume rules with similar predicates. Because of this, predicates need only be evaluated once. This differs from procedural (Java-coded) rules where every predicate in every rule must be independently evaluated, regardless of whether the same predicate might already have been evaluated in another rule. It can also locate conflicting rules, something that’s almost impossible in traditional, procedural, languages.
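The following is not Rete, just a small memoisation sketch of the sharing idea: when several rules use the same condition, it is evaluated once per fact and the answer is reused. All names (`SharedPredicateSketch`, `Counted`, `matchNaive`, `matchShared`) are invented for the illustration.

```java
import java.util.Arrays;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

final class SharedPredicateSketch {
    // Wraps a condition and counts how often the underlying test actually runs.
    static final class Counted implements Predicate<Integer> {
        final Predicate<Integer> test;
        int calls = 0;

        Counted(Predicate<Integer> test) { this.test = test; }

        public boolean test(Integer fact) {
            calls++;
            return test.test(fact);
        }
    }

    // Procedural style: every rule re-evaluates each of its own conditions.
    static int matchNaive(List<List<Counted>> rules, Integer fact) {
        int matched = 0;
        for (List<Counted> conditions : rules) {
            boolean all = true;
            for (Counted c : conditions) {
                all &= c.test(fact);   // deliberately no short-circuit
            }
            if (all) matched++;
        }
        return matched;
    }

    // Shared style: each distinct condition runs once per fact, then is reused.
    static int matchShared(List<List<Counted>> rules, Integer fact) {
        Map<Counted, Boolean> memo = new IdentityHashMap<>();
        int matched = 0;
        for (List<Counted> conditions : rules) {
            boolean all = true;
            for (Counted c : conditions) {
                all &= memo.computeIfAbsent(c, p -> p.test(fact));
            }
            if (all) matched++;
        }
        return matched;
    }

    public static void main(String[] args) {
        Counted adult = new Counted(n -> n >= 18);
        Counted senior = new Counted(n -> n >= 65);
        List<List<Counted>> rules = Arrays.asList(
                Arrays.asList(adult), Arrays.asList(adult, senior), Arrays.asList(adult));
        System.out.println(matchShared(rules, 30) + " rules matched, 'adult' ran "
                + adult.calls + " time(s)");
    }
}
```

A real Rete network goes much further (it shares nodes across rules at compile time and remembers partial matches between assertions), so treat this only as a picture of why shared predicates save work.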

When it comes to codifying business rules, even well factored Java code can be rather difficult to understand. After a couple of weeks away, it can often take the original developer some time to get back up to speed with their own code, let alone someone else’s. On the other hand, Rules are declarative statements of fact. That means no trudging through tens or even hundreds of lines of procedural code to understand what will happen under various conditions. Weeks, months or even years later you can go back to the rule definitions and immediately understand their meaning and intent.

Rule engines share much in common with Relational Databases. They are based on tuples and predicate calculus. You don’t navigate Relations (Tables), you join them. Similarly you don’t navigate facts, you join them. Both suffer (or at least have suffered) similar problems in terms of performance and optimisation.

Business Rules should be simple and atomic. They should make inferences. They should not be calling out to databases nor making countless remote calls. That’s what application code is for. Much like the difference between queries and stored procedures.

Analogies aside, the fact remains (no pun intended) that rules are not procedural, they are declarative statements of fact! Writing business rules requires very clear, concise and logical thought, as much if not more so than procedural code.

Rule-flow, priority, salience, etc. are mechanisms that allow some degree of procedural control and should therefore be considered a last resort, not the basis for a rule engine framework. While sometimes useful, all are frowned upon by rule advocates in much the same way as OO design frowns upon public variables.

If you can’t or won’t make the necessary shift from a procedural to a declarative mind set then I suggest you try BeanShell, Rhino, Groovy or any of the myriad scripting languages available. There is nothing to be ashamed of with this approach but it is most certainly NOT the same thing.

java.util.ThrashMap

By Simon Harris

September 22, 2004

I received a very interesting post from the JESS mailing list last night and thought it was worth a mention.

There’s a weird threshold that can occur with any Java data structure that uses large arrays of Object references (Object[]). If the size of the Object array exceeds the maximum size of objects that can be placed in the “new generation”, garbage collection performance can be severely impacted…

I’m not sure if I’m likely to see this problem really but I do use HashMaps a fair bit so I thought it was interesting. In general I don’t find the need to use arrays much if at all these days. In fact, for some inexplicable reason, I tend to use LinkedList over ArrayList and, except for Simian which uses lovingly hand-crufted data structures, I can’t recall ever holding maps of data large enough to exhibit this behaviour. But then again I’m not implementing a Rules Engine.

How The Other Half Work

By Simon Harris

September 15, 2004

It’s amazing how after just 3 days on a motorbike, out on the open road, breathing in clean ocean air, all the worries of the world seem to disappear; how the old “care-factor” rapidly falls away to hover tantalisingly above zero; only to be brutally re-awakened by my brain hitting the ground with a thud as I begin day one of a new, 3-month long, project. And what a project it is!

The last project I was on kept my brain working overtime. I had to push myself to stay afloat, to keep up with the brains on the team. It was fantastic. We all pushed each other, striving to get as close to the “ideal” solution, continually honing and refining the design to remove all, seemingly, unnecessary fluff. The Simplest Thing That Could Possibly Work While Still Being Good Design ™ was King.

But now, as if by way of some bizarre super-natural yin-yang thing struggling to balance the forces of nature, I find myself on a project where the Powers That Be have skillfully managed to find complexity where there rightfully should be none.

And yet I’m baffled that, at some level, it’s a very simplistic design. A design that even with its strange technology choices and inexplicable coding “standards”, requires very little brain power to comprehend. I suppose that by adhering to all those wonderful™ J2EE Core Patterns, it can’t help but be simplistic. It really is the epitome of a Simple Arrangement of Complex Things. Need a service? JNDI is your friend. And for your troubles you’ll get back an EJB. Need to extract a piece of data from some XML? Here’s a static helper class we prepared earlier.

The other developers on the team (I’ve come in 3/4 of the way through) have no trouble wading through the hundred+ line methods that make up the half dozen or so EJBs mixed in amongst the various classes that together comprise the 20 or so WSAD “modules”. In fact I found myself in a meeting honestly feeling stupid that I just didn’t “get” the seemingly overly complex application architecture. I figured I must be missing something, something significant. And maybe I have.

With the exception of my good buddy Phil, it made perfect sense to all the rest to store XML as a CLOB in a relational database only to unpack it, extract a subset of the fields and use Lucene to index the records. No one else seemed to think that spawning threads inside the app server was a bad thing. I mean “we don’t want to re-invent MQ”. Of course not. Silly me. Are you sure 15 days isn’t too fine-grained an estimate for you? We still have 6 weeks left, surely your Gantt chart would look a lot simpler if we just made everything finish then. :-P

I’m not sure if I simply suffer from a particularly severe form of geek snobbery (in the same way that Linux users look down on Windows users) or perhaps I expect too much from my work. Whatever it is, it’s interesting to see another perspective on J2EE development. I will try to suspend for a while my preconceptions and at least give it a go. But I’m not giving up IntelliJ…yet!

My only consolation is that it will surely be an intellectual walk in the park and give me more than enough food for blog, not to mention oodles of free cerebral cycles to work on my various out-of-hours pet projects - the ones I had been neglecting for the past 6 or so months.

You Are What You Measure

By Simon Harris

September 9, 2004

James mentioned that one of his colleagues had been compiling stats on various projects and how they rated in terms of Cyclomatic Complexity (A.K.A. McCabe’s Complexity). Quite rightly, everyone on the team was ecstatic to find that the project ranked top of the list (ie. lowest average complexity) in a field of various open and closed source projects.

Tools such as Checkstyle and PMD (to name but a few) make collecting these kinds of statistics very easy. What’s more, incorporating these tools into your developer and continuous integration builds makes enforcing limits on whatever metrics you like almost trivial. And that is exactly what we had done.

Comprehension and testability of code vary with complexity. Overly complex code is difficult to maintain, difficult to test and more likely to contain bugs (hidden flaws). Experience on previous projects (rather than just theory) had taught us that a high value for cyclomatic complexity, or its cousin NPath complexity, was indeed a good predictor of problematic code. On these projects, we had always cranked the threshold right the way down and had found this to have a positive influence on the code base.

In true Pavlovian style, and with nothing but altruistic intentions, we decided to make cyclomatic complexity (along with dozens of others) a constraint on this project as well. It obviously follows then that our project would place at or near the top of the list. We had made it a constraint, enforced it in our build and therefore the only possible outcome was a low average.

But as always, the devil’s in the detail. It soon became apparent that like so many other checks, it is possible to subvert the process. The worst part is that few developers do so out of malice. More often than not a check will fail and the developer will (often quite ingeniously I might add) code their way around it. Without knowing any better, it is quite easy to inadvertently increase actual complexity whilst adhering to simple metrics.
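A contrived illustration of coding around a check, assuming a (hypothetical) per-method limit of 2 and counting cyclomatic complexity as decision points plus one. `ShippingRates` and its methods are invented; the point is only that the second version satisfies the metric while being harder to follow.

```java
final class ShippingRates {
    // Three nested decisions: cyclomatic complexity 4, so the check fails.
    static int rate(boolean local, boolean heavy, boolean express) {
        if (local) {
            if (heavy) {
                if (express) {
                    return 30;
                }
                return 20;
            }
            return 10;
        }
        return 40;
    }

    // "Fixed" by slicing the same logic into a chain of helpers. Each method
    // now scores 2, yet the reader has to chase three hops to see one rule.
    static int rateA(boolean local, boolean heavy, boolean express) {
        if (local) {
            return rateB(heavy, express);
        }
        return 40;
    }

    static int rateB(boolean heavy, boolean express) {
        if (heavy) {
            return rateC(express);
        }
        return 10;
    }

    static int rateC(boolean express) {
        if (express) {
            return 30;
        }
        return 20;
    }

    public static void main(String[] args) {
        System.out.println(rate(true, true, false) == rateA(true, true, false));
    }
}
```

Both versions return identical results for every input; the per-method average improves while the actual amount of branching has not changed at all.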

I have a love-hate relationship with development tools. In the right hands they are invaluable but you can’t just give inexperienced developers tools and expect them to do as well. Tools + Monkeys != success. In fact as we discovered, left unchecked the opposite is much more likely to occur. Luckily for us we had a much better ratio of experienced/inexperienced developers so we caught most of the problems early and entered into intense education (ala Clockwork Orange - just kidding). I think it’s safe to say that in the end, the benefits certainly outweighed the potential dangers.

While certainly interesting, relying on a single metric can be very dangerous. Blind adherence to any number of metrics can be disastrous. In general it’s often far safer to let tools do the grunt work to find the smells and even make recommendations, then employ the expertise of more experienced developers to deodorise. It’s also a good idea to use several metrics in combination that address different, sometimes competing, concerns.

Sunshine And Party Pies

By Simon Harris

September 7, 2004

All good things must come to an end they say. Tomorrow (or today I guess) is the last day on my current project. It is no exaggeration to say that it’s been one of the two best projects I’ve ever worked on.

After six months of development I’ve managed to see it through to code freeze and although I’d have liked to see it go live, that’s a luxury I will unfortunately miss out on this time. As strange as that sounds, there is nothing like nursing your own creation into production. I’ve always found production support of my own baby far easier than the actual development effort and certainly easier than supporting someone else’s. Watching real, live users pounding away at your product is always good for the soul in what can often be a soul destroying lead-up hehehe.

I’ve laid my fair share of turds in the code base, I just hope I’ve obfuscated them well enough. Everyone has worked really hard to keep the code as clean as possible, employing as many checks and tests as we could. There are always things you wished you could have done differently but all in all I think it will survive the ultimate test - maintenance.

Technically, we used constructor based IoC really effectively. We have pretty good unit test coverage. Integration tests are up there too. Functional test coverage is arguably too good but hey I’ll live with that. We have out of container tests, in container tests, automated builds and deployments, code checks, cruise control, you name it. Thankfully no one managed to find a way to squeeze any AOP in though they gave it their best shot :-P. In the last couple of days I’ve even been seen partaking in the writing of perl scripts. Oh and of course nothing gave me more pleasure than to see all 10 developers happily writing declarative business rules!

The people I’ve worked with have been just sensational. I’ve certainly learned a lot more than I’ve given especially from people like Mad Dog Murdoch, Uncle Dave with his little nuggets of wisdom (or as he would put it “corn in the stool”) and of course JRO who never ceases to make me realise why I’m just not cut out to be a “Technical Architect”. Marcus, Trashy, Triggles, Phil, Jimmy, Happy, Big Daz, Bazza, Victor and Vorgahan not to mention my backup singers Andwea and Neats. I don’t know how they did it but they all managed to endure my endless singing (if you can call it that) interrupted by bouts of ranting and the occasional tap dance.

It was gratifying to see customers, testers, BAs, developers and management all working together to get this thing out the door. It hasn’t always been easy but then “nuffin is”. In the end though I think pragmatism (along with threats of violence - just kidding) won through and we all just worked to ensure we got the job done.

I’ve quite a bit to digest and hopefully blog about in the coming days and weeks but before that I’m going to get my last 8 (ahem 10?) hours out the way then jump on my motorcycle for a couple of days and enjoy the sunshine that seems to be blessing this lovely city. Come Monday it’s back to work on a new project. For as a wise developer once said “It ain’t all sunshine and party pies.”

If Code Could Speak

By Simon Harris

September 1, 2004

The first time I met James I was amazed to see him sitting mere inches from the monitor. As I moved closer to see what it was all about I saw the text and graphics on the screen zooming in and out. It was enough to make my stomach turn.

James is legally blind. It’s actually not as bad as it sounds (it does come with some perks like free public transport hahaha) and he uses a great bit of software called ZoomText to magnify the display. But the best bit is the Text-To-Speech (TTS) feature. Highlight a paragraph of text (for which there is a short-cut key combination) and Davros reads the text back to you. You can also adjust the speed at which text is read back. I can just understand it at about medium pace but James has it turned all the way up to Chipmunks.

Recently, James wondered what source code would sound like and, to his delight, discovered that not only did ZoomText read the source code back as intelligibly as (perhaps more so than) a human, it even understood CamelCase text, separating the words appropriately.

Like most developers, I guess, I’ve always tried to adhere to one of the longest serving mantras of the development community - to write readable code. As you’ve probably guessed, James and I pair program quite a bit and now that he uses ZoomText to read back our code, never has this meant as much to me as it does now.

As I’m coding away now, there is this little voice in the back of my head (yes I am nuts but that’s a different voice) constantly reading back the statements, constantly reminding me of what it will sound like when James uses his TTS. And I think it’s made a difference. I think because of it I’ve changed the way I name methods, classes, etc., and even the way I structure statements, hopefully, for the better.

JSR-94 Not Useless But Certainly Trivial

By Simon Harris

August 21, 2004

I watch with interest as the Rule Engine chatter begins to increase. I truly believe it’s an area much ignored by the great majority of developers.

If you’re not aware, there is a JSR in the works to provide a common interface for integrating rule engines. In its current form, JSR-94 provides little more than a common interface for creating a context/rule engine and marshalling objects in and out. It is trivial to implement this yourself.

The fact is (pun intended) that the JSR provides little more than you would end up with anyway when developing a system that makes use of a business rules engine, keeping in mind requirements for testability of rules and loose coupling (a.k.a. good design?). Only the JSR is considerably more verbose.

To illustrate, we have a large number of rules in our current system and a very small but useful set of interfaces:

public interface RuleEngineFactory {public RuleEngine createRuleEngine();}public interface RuleEngine {public void reset();public void addFact(Object fact);public void addFacts(Set facts);public void execute();public Set getFacts(Class type);}public interface RuleEnginePool {public RuleEngine getRuleEngine();public void returnRuleEngine(RuleEngine ruleEngine);}

Add to these a few very light-weight implementation classes (and some Dependency Injection for good measure) and you have pretty much everything you could need from an integration standpoint.

public class JRulesRuleEngineFactory implements RuleEngineFactory {
    private final IlrRuleset _ilrRuleset = new IlrRuleset();

    public JRulesRuleEngineFactory(Reader reader) {
        if (!_ilrRuleset.parseReader(reader)) {
            throw new RuntimeException("Error parsing rules");
        }
    }

    public RuleEngine createRuleEngine() {
        return new JRulesRuleEngine(new IlrContext(_ilrRuleset));
    }
}

public class ThreadLocalRuleEnginePool implements RuleEnginePool {
    private final ThreadLocal _engines = new ThreadLocal() {
        protected Object initialValue() {
            return _ruleEngineFactory.createRuleEngine();
        }
    };

    private final RuleEngineFactory _ruleEngineFactory;

    public ThreadLocalRuleEnginePool(RuleEngineFactory ruleEngineFactory) {
        assert ruleEngineFactory != null : "ruleEngineFactory can't be null";
        _ruleEngineFactory = ruleEngineFactory;
    }

    public RuleEngine getRuleEngine() {
        return (RuleEngine) _engines.get();
    }

    public void returnRuleEngine(RuleEngine ruleEngine) {
        assert ruleEngine == getRuleEngine() : "ruleEngine not allocated from the current thread";
        ruleEngine.reset();
    }
}
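To give a feel for how little a caller needs to know, here is a minimal sketch of client code written purely against those interfaces. The interfaces are repeated so the sketch compiles on its own; `StubRuleEngine`, `SingleEnginePool` and `RuleEngineClient` are hypothetical names invented for illustration, and the stub's single hard-coded "rule" merely stands in for a real engine such as JRules.

```java
import java.util.HashSet;
import java.util.Set;

// The interfaces from the post, repeated so the sketch is self-contained.
interface RuleEngineFactory {
    RuleEngine createRuleEngine();
}

interface RuleEngine {
    void reset();
    void addFact(Object fact);
    void addFacts(Set facts);
    void execute();
    Set getFacts(Class type);
}

interface RuleEnginePool {
    RuleEngine getRuleEngine();
    void returnRuleEngine(RuleEngine ruleEngine);
}

// Hypothetical stand-in for a real engine: its one hard-coded "rule"
// infers an Integer fact (the length) from every String fact.
class StubRuleEngine implements RuleEngine {
    private final Set<Object> facts = new HashSet<Object>();

    public void reset() { facts.clear(); }

    public void addFact(Object fact) { facts.add(fact); }

    public void addFacts(Set newFacts) {
        for (Object fact : newFacts) {
            facts.add(fact);
        }
    }

    public void execute() {
        // Collect inferences separately so we never modify the set while iterating.
        Set<Object> inferred = new HashSet<Object>();
        for (Object fact : facts) {
            if (fact instanceof String) {
                inferred.add(Integer.valueOf(((String) fact).length()));
            }
        }
        facts.addAll(inferred);
    }

    public Set getFacts(Class type) {
        Set<Object> matches = new HashSet<Object>();
        for (Object fact : facts) {
            if (type.isInstance(fact)) {
                matches.add(fact);
            }
        }
        return matches;
    }
}

// Hypothetical pool that hands out one shared engine; fine for a single-threaded demo only.
class SingleEnginePool implements RuleEnginePool {
    private final RuleEngine engine;

    SingleEnginePool(RuleEngineFactory factory) { engine = factory.createRuleEngine(); }

    public RuleEngine getRuleEngine() { return engine; }

    public void returnRuleEngine(RuleEngine ruleEngine) { ruleEngine.reset(); }
}

class RuleEngineClient {
    // Borrow an engine, assert facts, execute, copy the results, and always hand the engine back.
    static Set<Object> inferLengths(RuleEnginePool pool, Set<?> words) {
        RuleEngine engine = pool.getRuleEngine();
        try {
            engine.addFacts(words);
            engine.execute();
            Set<Object> lengths = new HashSet<Object>();
            for (Object fact : engine.getFacts(Integer.class)) {
                lengths.add(fact);
            }
            return lengths;
        } finally {
            pool.returnRuleEngine(engine);
        }
    }
}
```

Note that the client copies the results before returning the engine, since returning it resets the working memory.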

We theoretically have pluggability of rule engines, but to believe that this might be useful or even practical is naive at best.

Unfortunately, (or fortunately depending on your perspective) the biggest part of using a rule engine is in analysing and writing the rules themselves. Granted, many engines use a Rete Algorithm but to suggest that all Rete-based rule engines are the same is akin to saying that two Java applications are the same. JRules and JESS both use a Rete network and both are implemented in Java but the language, tools and behaviour (not to mention performance characteristics) of each differs sufficiently to render the conversion process rather less than trivial.

Surely, few of you would imagine that a switch from using JSPs to, say, Velocity in a system of any significant size would be an overnight job. Similarly, a switch from Struts to, say, Tapestry or JSF would be non-trivial. All of these technologies attempt to solve essentially the same problem but all come with a slightly different design philosophy. No matter how standardised the interface, if the behaviour of the system on the other side is different, the illusion breaks down.

Anything that lowers the barrier to entry for those wishing to explore the use of declarative rules is to be applauded; however, there are far more important problems for an organisation to solve than transparency of the underlying rule engine implementation: atomicity, testing, manageability, education, analysis, configuration, understandability, to list but a few. It is no coincidence that these are largely non-technical.

What Does Business Analysis Really Mean

By Simon Harris

August 12, 2004

Many years ago now I read the first edition of About Face by Alan Cooper. At the time, Dave and I were working on an HR application which was subsequently sold to another company and is still being sold and (I presume) used today.

Not to ding my own chain (much!) but it still rates as one of the best apps I’ve ever built. I’m sure if I looked at the source code these days I would wince, but I still think we made some pretty good technical achievements. However, technical merit aside, the one thing that really made an impression on me, and continues to make an impression, was the usability of the application: not only in terms of business functionality but also in the way it simplified how users performed their day-to-day tasks.

Alan Cooper’s most excellent insights were enlightening to me at the time and certainly influenced a lot of the design. But I can’t take credit for the usability or functionality of the application. No, for that we have Dave to thank. Besides having a brain the size of a planet, he is an exceptional business analyst. “Oh, we have really good business analysts,” I hear you cry. I’m sure you do, in which case you’ll appreciate my definition of a business analyst.

Picking on Dave once again, he has an amazing ability to actually analyse a business. By this I mean really trying to understand what the customer does; why they do it; whether their current business practices even make sense; how their shiny new software might actually make their life easier; and, most importantly, conveying to (AKA convincing) the customer why his ideas will work. I’ve heard of CEOs walking away from meetings asking their staff how this guy knows so much about their business. And I know for a fact that he had very little prior knowledge. He just knows how to ask the right questions to get to the heart of their business.

Traditionally (though I have little data to suggest this isn’t still the norm) business analysts will sit with the customers and essentially document what the customer does: workflow, day-to-day tasks, etc. From this they then write story cards or use cases (whatever is flavour du jour) that form the basis of the application design. These then go to the developers, who consult with the customers on what exactly needs to be done, screen designs, etc. and then off they go to build the software. Unfortunately, the net result is usually a computerised version of some ancient manual system that is barely better than what they had and in many cases worse!

Maybe it’s because the skills of which I write are rare but I’m not sure where the notion that customers should design the software comes from. The idea that customers know what they need (or even want) is just plain ludicrous. In most cases, business people understand what drives their business. They understand what their competitive advantage is and where they could gain new business if only they could do X or had Y. Surely it is the BA’s job, nay duty, to come up to speed with the business and from that explain to the customers what would make their life easier. Surely that is where BAs add value. They understand software AND the business.

Even as a developer, I see it as my responsibility to go and talk to the BAs and customers if I see inconsistencies or if I think the application flow or business rules can be improved. I’m sure there are those who wish I’d shut up sometimes but I still think it’s worth it. Which brings us right back to where we started. It’s very rare that the end users can design a piece of software that actually does what they need, but it’s just as likely that developers will, on their own, design totally unusable software. So go read the book :-).

Oh, and this paper if you have the time.

The Lost Art of Database Design

By Simon Harris

August 7, 2004

Have you ever read any C. J. Date? No? Hardly surprising. In fact most of my colleagues couldn’t give two hoots when it comes to data modelling (aka ER modelling) as distinct from object modelling. I’ll admit that for many it’s not the sexiest of topics.

Some months ago, Dave and I were discussing object modelling and database design. Dave’s assertion was that you need to do a good data model before your object model. I countered that I believed a good object model necessarily produced a well-formed and normalised data model. On reflection it seems we were both right.

You see, I was under the illusion that a good understanding of relational theory and algebra was ubiquitous. I naively believed that all software developers knew what a normal form was, what a primary key actually meant and why relational integrity enforced via foreign key constraints IS important! WRONG!

Relational database management systems (RDBMS) are based on mathematically provable theory. SQL for example is a bastardisation (though not Date’s exact words) of the relational algebra which in turn is derived from predicate calculus. Interestingly, RDBMS and Business Rule Engines have much in common so it’s not surprising that Date, an amazing logician, seems to spend a bit of his time these days writing on business rules. But I digress.

Given this basis in rigorous theory and mathematics, I find it amazing that as an industry of self-proclaimed engineers so few of us seem to have any clues on the matter, preferring instead to stay in the relative adhocery of object modelling where we can all become pioneers by inventing and re-inventing this month’s rules of thumb or best practices for good OO design.

Many of the available ORM tools force you to either screw your object model or screw your database model and being the anal retentive person that I am when it comes to both topics I find this rather discouraging. Mostly because I’d be willing to bet my left nut that most developers out there don’t even realise that they’ve ended up with either an object model that violates some pretty basic principles of good OO design or ended up with a data model that is just plain wrong.

Granted, DBAs are supposed to look at our data model and find the flaws but so far as I can tell, most DBAs these days are there to enforce BLLSHT NMG STNDRDS THT N N CN NDRSTND or to tell us what columns need indexing.

I’m all for surrogate keys over business keys, etc. but the fact remains that until the world chooses object oriented database management systems (OODBMS) in favour of the good old RDBMS don’t delude yourself into believing that relational theory isn’t important.

Covariant Return Types

By Simon Harris

August 1, 2004

Now for something completely geeky so as to fly directly in the face of my previous post, I quite literally just received notification that a Java feature request I had voted for (along with 861 others lol) has finally been closed off.

It may not seem like much but I’ve quite often felt the desire to further restrict the return type of an overriding method and soon, according to the release notes at least, I’ll be able to do just that in the up and coming 1.5 ahem 5.0 release of J2SE.

A clear candidate for this is clone() meaning I will finally be able to do:

public ThisClass clone() {
    return (ThisClass) super.clone();
}

No longer requiring callers to perform the cast themselves.
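For a slightly fuller picture, here is a minimal sketch of a complete class using a covariant `clone()`. `Money` is a hypothetical class made up for illustration. When the superclass is `Object`, the checked `CloneNotSupportedException` still has to be dealt with, which the sketch does by catching it inside the method.

```java
// Hypothetical value class, used only to illustrate a covariant clone() override.
class Money implements Cloneable {
    private final long cents;

    Money(long cents) {
        this.cents = cents;
    }

    long getCents() {
        return cents;
    }

    // Java 5 and later: the override narrows the return type from Object to Money.
    @Override
    public Money clone() {
        try {
            return (Money) super.clone();
        } catch (CloneNotSupportedException e) {
            // Should not happen, because Money implements Cloneable.
            throw new AssertionError(e);
        }
    }
}
```

A caller can then simply write `Money copy = original.clone();` with neither a cast nor a try/catch.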

Of course removing redundant casts is but the simplest and most obvious use. Anyone who’s ever been forced to code on top of a “generic framework” implementing/overriding Object doStuff(Object) throws Exception methods will know what I’m talking about :-).

I guess it comes down to your preference for static versus dynamic typing in many cases and while I’m not averse to a degree of dynamic typing, I do like it when my APIs can be more explicit without the need for all that useful™ JavaDoc.

Don't Just Think, Feel It

By Simon Harris

August 1, 2004

I recently attended a half-day seminar thingy put on by Enterprise Java Victoria. During the day we had presentations from Mark Hapner and Gavin King along with a few vendors and some large-scale J2EE shops. Robert Watkins, James and I (apparently I talk too much and need to learn to shut up - like you didn’t know that already) were among a dozen or so people fortunate enough to be invited to sit on various panels. All in all it was well attended and I think most people got something out of it.

Most of the talk centred around EJB Three-Dot-Oh and Es-Oh-Ay but there was also discussion on topics such as developer training, heavy versus light-weight containers, Java Web Start, etc. During all this, the one thing that struck me was the distinct lack of audience participation. We (the panels) were pretty much there to give our views (such that they are) on the future of J2EE but no matter how much I or Gavin or anyone else ranted, I kept wanting to ask the audience “so what problems are you having with J2EE?” I’m astonished the question never came up. Then it occurred to me that these people were here to find out what the next big thing in Java was going to be. And therein lies the problem.

If you’ve realised anything from reading this blog (apart from the fact that I’m a loud-mouthed opinionated git) it’s that I really do feel we as an industry (being IT) do more to justify our own existence and “innovate” than to actually think about what our customers need. I love playing with new tools and frameworks and whatever else seems to be in vogue at the moment as much as the next developer but I also feel we (yes that includes me) persist in creating the technology equivalent of whiter toothpaste and softer toilet paper. FUD drives our industry as it does fashion. Vendor X or “Thought Leader” Y come up with some new fandangled tools/methodologies/practices. Quite often it’s not even new. Then these ideas are disseminated to the masses (being we the developers) in IT shops who then, upon seeing this bright shiny new toy, go about selling it to the business to justify the R&D required to work out how we might actually use it. What’s more, we often shoot ourselves in the foot by over-promising and under-delivering.

Pause to blow off steam…

Why does the customer care about the latest technology unless it specifically addresses some business need? “Why do they even care if it runs on Java or not?” was one of my questions for the throng and didn’t that go down like the Mars Beagle Lander on a bad day! Surely the customers have problems not to do with technology but to do with the way they conduct business. They want to do business more effectively. They want to be more flexible. Shouldn’t we be trying to find out what they do and help them do it better? Feel where their pain is, then go away and work out how we might make it go away. In doing so we would encounter a world of hurt and pain which should lead us to ask the tool vendors to help us out. Instead we end up with BEA telling us that what we really need is to turn everything into a web service and let business users clickedy-click to put together an application.

“So,” I asked the audience, “who here has built their own house or their own car?” Amazingly, one person replied “yes” to both questions. What about built your own furniture or made your own clothes or even serviced your own car? The materials are all readily accessible. Do commercial pilots construct their own planes or astronauts their own rockets? Surely they could, they’re some of the most highly trained people around, but that’s not their job, that’s not where they add value.

Just take a look at SourceForge. We are very good at building IDEs and frameworks, etc. Well maybe not the last but certainly IDEs and tools in general. Whether you like it or not, Visual Studio really set the benchmark that all subsequent IDEs had to match and, thankfully, have now surpassed. It’s no surprise that IntelliJ and Eclipse, etc. are so good. We as developers understand developers. We understand how we work. IntelliJ doesn’t get in my way. It does just what I need and no more. It takes away the tedious tasks without trying to do my job for me. What we need is to build end-user software with as much thought as this. Not try to solve cool networking and infrastructure issues over and over and over again.

Sure, SOA and interoperability are laudable goals but they’re hardly new. Whether you like it or not, EDI has been around for decades, CORBA for longer than RMI, telnet longer than HTTP. I’m happy to listen to our Thought Leaders tell me their vision for computing but don’t try and convince me it’ll solve a bunch of problems that quite frankly I just don’t have right now.

I’ve taught martial arts for over a decade now and it’s only just occurred to me that some people turn up because they want to be told what to do. Maybe they lack the capacity to learn for themselves or maybe they just don’t want the responsibility but if I spend all my time teaching and none of it training, my reflexes get slow and I forget all the little things that one learns when actually forced to use technique in a meaningful way.

Necessity, they say, is the mother of invention. If our Leaders long ago stopped actually dealing with customers and developing real code then they also stopped feeling the “pain” associated with developing business software. No amount of thinking makes up for this.

Rule Engine Notifications

By Simon Harris

August 1, 2004

I was interested to see Martin Fowler’s recent entry on Notifications. If you’ve ever used Struts (gasp) or similar “framework” you’ll already be familiar with the concept so it’s certainly nothing new but Martin has a fantastic ability to document and explain things in clear, unambiguous terms.

The most interesting thing to me was this statement “You should use Notification whenever validation is done by a layer of code that cannot have a direct dependency to the module that initiates the validation.” and how this relates to the use of a rule engine within an application.

One of the biggest mistakes we’ve seen in using rule engines is to allow the business rules to become dependent on other than the business domain. For example, allowing rules to know or depend on what screen is currently displayed. Business rules should be statements of fact about business information not application workflow or navigation state. As much as possible, we want business rules to survive changes to the form, flow, layout and even number of application/s that depend on them.

Business rules make inferences about the business information (facts) presented to them. Some of these inferences will be new facts for other rules to consume and some will be facts for the caller to consume. It is this second class of facts which we classify as Notifications and which the application collects and processes. At any point in time, some of these notifications will be relevant to the application and some will not. Some will cause the application to alter its workflow, screen layouts, etc. and some may safely be ignored. The critical thing to understand is that it is the application’s responsibility to filter the notifications.

For example, imagine we have an application that collects data on a customer. The data is collected over N (where typically N > 1) screens according to the business workflow requirements. After each screen of data is collected, the user hits Next to proceed, at which time the state of the domain is asserted into the rule engine, the rules are executed, and the notifications processed. Now let’s imagine that one of the rules states that a customer’s date of birth is required. You’ll note there is no mention of a screen here, meaning that until the date of birth is filled out, the application will receive a notification indicating some problem with that field. Rather than embed knowledge of the application into the rules, the application instead filters out any notifications that are not relevant, in this case by checking to see if the field specified in the notification exists on the current page or not. If, after filtering, there are no notifications, the application can proceed to the next screen; otherwise a message is displayed and the user cannot proceed. Finally, on the last page, the application can check to ensure that there are no unfiltered messages before allowing the user to save. You can even get fancy and have the application take the user directly to the appropriate screen, something that would be difficult to achieve if the business rules were dependent on navigation state.
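The filtering step described above might be sketched roughly as follows. `Notification`, `NotificationFilter` and the field names are all hypothetical, invented here for illustration; the only idea taken from the example is that a notification names a field and the application decides whether that field matters on the current screen.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical notification: records which field a rule complained about and why.
class Notification {
    private final String field;
    private final String message;

    Notification(String field, String message) {
        this.field = field;
        this.message = message;
    }

    String getField() {
        return field;
    }

    String getMessage() {
        return message;
    }
}

class NotificationFilter {
    // Keep only the notifications whose field appears on the current screen.
    static List<Notification> relevantTo(Set<String> fieldsOnScreen, List<Notification> all) {
        List<Notification> relevant = new ArrayList<Notification>();
        for (Notification notification : all) {
            if (fieldsOnScreen.contains(notification.getField())) {
                relevant.add(notification);
            }
        }
        return relevant;
    }

    // The user may move on only when nothing relevant to this screen remains.
    static boolean canProceed(Set<String> fieldsOnScreen, List<Notification> all) {
        return relevantTo(fieldsOnScreen, all).isEmpty();
    }
}
```

The rules never learn which screen is showing; on the final page the application would simply check the unfiltered list instead.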

Another area where we have used this approach is authorisation. We have rules that assert Permissions (a type of Notification) based on the business data that the application can use to determine what a user is or isn’t allowed to do. Again, the rules make no reference to screens or assume anything about the calling application for that matter. They simply state the facts as presented and inferred. The application then checks the results for the existence of the desired permission and if present the user is allowed to proceed; if not the user is denied access to that particular function. This also makes rendering links, enabling/disabling buttons etc. very easy while maintaining the ability to define the rules in purely business domain (i.e. application-agnostic) terms.
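As a rough sketch of the application side of that check, assuming the rule results come back as a plain collection of fact objects: `Permission`, `Authorisation` and the permission names below are hypothetical and purely illustrative.

```java
import java.util.Collection;

// Hypothetical permission fact asserted by the rules.
final class Permission {
    private final String name;

    Permission(String name) {
        this.name = name;
    }

    String getName() {
        return name;
    }
}

class Authorisation {
    // True when the rule results contain a Permission with the requested name.
    static boolean isAllowed(Collection<?> ruleResults, String permissionName) {
        for (Object fact : ruleResults) {
            if (fact instanceof Permission
                    && ((Permission) fact).getName().equals(permissionName)) {
                return true;
            }
        }
        return false;
    }
}
```

The same call can drive whether a link is rendered or a button enabled, without the rules knowing that either exists.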

The concerns of a client application are typically to do with workflow and appropriate use of screen real estate. Business rules on the other hand are concerned with statements of fact about the underlying business data. Notifications allow the business rules and application workflow to vary independently according to these different concerns.

Websphere 4 Lock-In

By Simon Harris

July 28, 2004

Lately I’d come to think of Websphere 4 as a steaming pile of crap but then the other day I read an article which changed my mind; Human waste is actually useful for something.

We have an application that runs within the web container against a DB2 database. For development, we use OrionServer which, sensibly, leaves the isolation level at READ_COMMITTED by default. For production, we’ll be using Websphere 4 which gives a default transaction isolation level of REPEATABLE_READ.

Sidebar: There are four basic transaction isolation levels:

READ_UNCOMMITTED - Let me read anything even if it’s not been committed yet; Actually you should be committed for using this;
READ_COMMITTED - Only allow me to read stuff that’s been committed; The sensible choice for the discerning developer;
REPEATABLE_READ - I want to lock every row I’ve read because somehow I think I might want to read the same stuff again with the exact same results in the exact same transaction; Also known as please Sir can I have a deadlock; and;
SERIALIZABLE - My users are nostalgic for the response times they enjoyed during the good old days of 1200/300 baud acoustic couplers. Yay! Bring on those deadlocks.

And the only way to change the isolation level in Websphere 4 is to deploy an EJB and specify the isolation level in the custom descriptor (I assume this is pretty common but no doubt not specified in the J2EE standard - anyone?). All very well and dandy IF YOU HAPPEN TO HAVE AN EJB FROM WHICH TO DO THIS!
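For reference, the four levels in the sidebar correspond to constants on the standard `java.sql.Connection` interface, and plain JDBC lets you ask for one with `setTransactionIsolation`. The sketch below only shows that mapping; `IsolationLevels` is a hypothetical helper, and nothing here is a claim about what a container-managed connection in Websphere 4 will actually honour.

```java
import java.sql.Connection;
import java.sql.SQLException;

// Hypothetical helper mapping the sidebar's names onto the java.sql.Connection constants.
class IsolationLevels {
    static int toJdbcConstant(String name) {
        switch (name) {
            case "READ_UNCOMMITTED":
                return Connection.TRANSACTION_READ_UNCOMMITTED;
            case "READ_COMMITTED":
                return Connection.TRANSACTION_READ_COMMITTED;
            case "REPEATABLE_READ":
                return Connection.TRANSACTION_REPEATABLE_READ;
            case "SERIALIZABLE":
                return Connection.TRANSACTION_SERIALIZABLE;
            default:
                throw new IllegalArgumentException("Unknown isolation level: " + name);
        }
    }

    // Ask the driver for READ_COMMITTED unless it is already in effect.
    // Assumption: the caller owns the connection and is free to change its settings.
    static void useReadCommitted(Connection connection) throws SQLException {
        if (connection.getTransactionIsolation() != Connection.TRANSACTION_READ_COMMITTED) {
            connection.setTransactionIsolation(Connection.TRANSACTION_READ_COMMITTED);
        }
    }
}
```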

breathe…in…out

Unable to get Websphere to cooperate in any sensible fashion (no I don’t count EJBs as sensible) we turn to the database. Ahh good old DB2 you old thing you..hey…hey…nudge nudge…wink wink. What can you do for us?

Bugger all that’s what! Well, not quite. A good-old Google search reveals some interesting stuff. It seems I can go in and set a server-level parameter DB2_RR_TO_RS which basically says “hey mister DB2, if that Websphere kid comes around here asking for REPEATABLE_READ, just smile and say yes yes and send him blissfully on his way but don’t actually do anything. ok?” Even more interesting, we found this solution right on IBM’s website for their own Portal Server Software! And what do you know? It worked. No more deadlocks.

So that should be it, right? Wrong! The problem with this little solution is two-fold: firstly, it’s at the server instance, not database instance, level so it affects EVERY database; and secondly, it’s not a supported option in DB2 version 8.1 (works in 7.1), which we’ll be using for production. Instead, 8.1 supports so-called Type 2 indexes which supposedly solve the same problem. The index solution seems a little fragile to me though. I bet my bottom dollar someone will add a new table and forget to add the appropriate indexes. BLAMMO!

Back to square one. We try every combination of locklists and maxlock parameters, you name it. None work and frankly although they would have decreased the likelihood of deadlocks, they don’t actually solve the underlying problem, namely that REPEATABLE_READ is a retarded default.

But what can we expect? I mean we aren’t doing it the J2EE way. “Please Sir, can I have READ_COMMITTED” we feebly plead. “How can you have any READ_COMMITTED if you don’t have any EJBs?” Websphere replies cruelly.

In the end it looks like I’ll need to dust off the old train book, wrap our Hibernate persistence stuff in a session bean and hopefully get back to doing some load testing.

Even after all that, compared with the performance of OrionServer, Websphere seems to mysteriously turn my P4 into a Casio wristwatch. Given the amount of CPU power required to run it, I think we’re going to need to find alternative means of power.

UPDATE: Having jadded the Websphere code, it seems that the default isolation level for Sybase and Oracle is a more sensible READ_COMMITTED. That of course doesn’t help us. It also turns out that once you’ve called an EJB, even a nop method, the database connection is then set to whatever isolation level was set for the EJB! Oh we’re having fun!