haruki zaemon

IKVM

As you’re probably aware, some time ago I ported Simian to C# and ever since I’ve been maintaining two versions of the source code. This was becoming tiresome to say the least and recently I’ve been toying with installing VisualStudio.NET under VMware to see if I can use J# instead. Not only did this idea irk me “just because”, but the thought of installing anywhere up to 3GB worth of software that I would be forced to run under Windows just didn’t impress me in the slightest.

If you’re not up with the latest on mono you’re probably unaware that the site has been overwhelmed for days now since the release of 1.0. Almost hidden within the distribution is IKVM.NET “an implementation of Java for Mono and the Microsoft .NET Framework.” It includes the following components:

  • A Java Virtual Machine implemented in .NET
  • A .NET implementation of the Java class libraries
  • Tools that enable Java and .NET interoperability

So, having just recently emerged the latest packages under Gentoo, I figured what the heck, I’ll give it a go.

First off, I tried ikvm. It’s like a java.exe replacement. Where I would usually run:

java -jar simian.jar

I can instead run:

**ikvm** -jar simian.jar

Which will run Simian under the .NET Framework instead. And it worked. My jar was happily running under .NET with no code changes! But it gets better. The next step was to try ikvmc, the compiler. Yes you heard it, a “static compiler used to compile Java classes and jars into a .NET assembly.” So I gave it a whirl:

ikvmc simian.jar

It read the manifest and determined where my main method was, “converted” all my classes and spat out an executable: simian.exe. So I ran it and once again, it worked first time. It just blew me away! No need to convert to C#. No need to install gigabytes of wizard-ware and best of all, the tools run under Linux as well as Windows.

So for a bit of extra bandwidth required to download the supporting DLLs, I can finally ditch all that lovingly hand-crufted C# code and deliver on just about any platform you can imagine!

I can’t vouch for its suitability with regards to your project but it’s definitely worth a look if only for its pure geek value. I’m sure yet another layer of indirection is exactly what you architects out there have been looking for. Just think, you too could have a Java application running under IKVM under .NET under Windows under VMware under Linux ;-P

Suite Memories

One of the managers at the client I’m currently working for observed that I hadn’t blogged quite so much recently as in the past and wondered if I was, perhaps, suffering blogstipation. But today I received a good dose of editorial cod-liver oil.

I mentioned some time ago our success in speeding the build with respect to JUnit execution times. That has worked sensationally well for us for some time now. Unfortunately, as more and more tests were added we started getting OutOfMemoryErrors forcing us to generate multiple suites which in turn slowed the build.

Our short term solution was to stop instrumenting the build. This bought us some time but that was in no way considered a viable long-term solution. So finally today, after much disruption to the cruise build, we decided to get to the bottom of it. We already knew that JRules classes have some quirky behaviour that meant special care was needed to ensure it can free all working memory before being garbage collected. So we took a punt and looked at the tests involving rules.

On inspection, it did indeed look as if we had forgotten to clean up, so we added the missing clean-up. Voila! No more memory errors. But then something struck me. I had just recently implemented a fix for this very problem in our mainline code, one that ensured we were cleaning up in the finalize method of the same class we were using in our tests. This implied there was something else going on. Something most peculiar.

Unconvinced we had found the real cause of the problem, we decided to back-out the change and try something else. This time we made sure that the objects being used were candidates for garbage collection by explicitly setting the instance variables to null in the tearDown() method. We ran the tests again and bingo! Problem solved. But hang on a second. Surely the test classes themselves are being garbage collected? Surely clearing the instance variable was redundant?

A bit of digging around and we were soon scrutinising the JUnit TestSuite source code. And most fascinating reading it was too. Right there in plain Java for all the world to see: a Vector (how Java 1.1); a constructor that creates and stores (in said Vector) an instance of the test class for each test method it finds; and a run method that simply iterates over an Enumeration of the instances, running each one in turn.

No wonder we were running out of memory! The bloody test class instances, all 2000+ of them, and the data they were holding on to were never released until the suite had finished.

A bit of decoration and a custom test suite class later and it’s bye bye to our OutOfMemoryErrors. I wonder what the next gotcha will be?
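The post doesn't show the custom suite itself, so the following is only a rough sketch of the underlying idea rather than the actual class. To keep it self-contained it uses plain Runnable objects as stand-ins for JUnit test instances, and the name ReleasingSuite is made up for illustration. The point is simply that the suite lets go of each instance as soon as it has run, instead of holding all of them until the end.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical stand-in for a test suite; Runnable stands in for a test instance. */
class ReleasingSuite {
    private final Deque<Runnable> pending = new ArrayDeque<>();

    void add(Runnable test) {
        pending.addLast(test);
    }

    /** Runs every pending test once and returns how many were executed. */
    int runAll() {
        int executed = 0;
        Runnable test;
        // pollFirst() removes the test from the queue, so once it has run the
        // suite holds no reference to it and it can be garbage collected.
        while ((test = pending.pollFirst()) != null) {
            test.run();
            executed++;
        }
        return executed;
    }

    int pendingCount() {
        return pending.size();
    }
}
```

Compare this with iterating over a collection that is never emptied: there, every instance (and whatever its fields reference) stays reachable until the whole suite is done.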

Network That Printer

Today I shall depart from my usual software development rants. It’s not often that a piece of hardware tickles my geek bone but I found something recently that did just that.

I have a 1.5Mbps DSL connection to the internet and run everything on a wireless laptop running Gentoo Linux, with Windows under VMware for the times when I absolutely need to test something on Windows.

The only piece of equipment I need to connect to my laptop is my ancient HP LaserJet 5MP printer. Or should I say, needed. On Thursday I picked up a NetGear PS101 Mini PrintServer. A tiny (half-cigarette packet-sized) device that attaches directly to the printer and then via a CAT-3 cable to my hub. Granted it’s not wireless (though NetGear and LinkSys do have them) but at AU$115 it’s sensational value for money.

It supports both static and DHCP (the default) IP address assignment. It comes with Windows drivers (if you really feel the need hehe) but importantly took no more than about 5 minutes to get working with my CUPS installation on Gentoo. I just needed to work out its IP address and internal print server name and set CUPS to print to: lpd://ip_address/internal_print_server_name

Don’t forget to set your iptables configuration to allow printing on port 515 (printer):

> iptables -A OUTPUT -p tcp --dport printer  -m state --state NEW -j ACCEPT

And you’re done. Sweeeet.

Well Behaved Rules

I have previously made a comparison between rule engines (and the RETE algorithm in particular) and SQL databases. Business rule languages are declarative as is SQL, both being based on predicate calculus. Both suffer (or at least have suffered) similar problems in terms of performance and optimisation.

I recall many years ago, tuning my queries to within an inch of their (or more likely my) life. Re-ordering the WHERE clause, changing JOIN conditions, even changing the order in which columns were returned.

Thankfully these days, even the simplest of SQL database engines have some form of optimisation built-in. High-end systems such as Oracle have very sophisticated optimisation techniques. I can pretty much write any old SQL (with caveats) and know that I’ll get at least acceptable performance in most cases.

The RETE algorithm (and its successor RETE-II) is amazingly good and rule engines have also come a long way but certainly not as much as ye-olde RDBMS. So there are still some things you need to consider when writing rules.

Without going into too much detail, the RETE algorithm builds a network of nodes representing the conditions of your rules and the matching facts. In general, the smaller the network, the better the performance.

The first thing to note is that any rules sharing common conditions are optimised into a single node. However, with many rule engines, this is sometimes only possible if the conditions are listed in the same order. So for any N rules having M conditions in common, order the conditions so that the first M are the same.

Now that your conditions are in the same order, you’ll be interested to know that the exact order is in fact important. Because each condition is like an SQL JOIN, you need to place the MOST restrictive conditions first. That is, place the condition that is LEAST likely to be matched FIRST. This is no doubt familiar to anyone who has ever tuned SQL.
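To see why order matters, here is a deliberately simplified Java sketch. It is a plain linear scan, not a RETE network, and the class and method names are made up for illustration. It only counts how many condition checks are needed when one condition is tested before another.

```java
import java.util.function.IntPredicate;

/** Illustrative only: a linear scan over facts, not how a rule engine is implemented. */
class ConditionOrder {
    /** Returns the number of predicate evaluations when 'first' is tested before 'second'. */
    static int evaluations(int[] facts, IntPredicate first, IntPredicate second) {
        int count = 0;
        for (int fact : facts) {
            count++;                  // 'first' is evaluated for every fact
            if (first.test(fact)) {
                count++;              // 'second' only sees facts that passed 'first'
                second.test(fact);
            }
        }
        return count;
    }
}
```

With the 100 facts 0 to 99, a rare condition such as `n % 50 == 0` (2 matches) and a common one such as `n % 2 == 0` (50 matches), testing the rare condition first costs 102 evaluations while testing the common one first costs 150. The gap grows quickly once each condition is a join against other facts rather than a simple test.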

Imagine we’re trying to find two people with the same parent. We could do this (JRules code examples, just ask me if you want to see JESS as well):

?a: Person()
Person(getParent() == ?a.getParent())

This has one glaring problem: It’s essentially a cross product! So we need to fix it:

?a: Person()
?b: Person(getParent() == ?a.getParent())
evaluate(?a != ?b)

Now, as we’ve shown above, the number of conditions evaluated is also important. Anytime we can short-circuit the conditions, we save ourselves another join. So once again, we can re-write our conditions:

?a: Person()
Person(?this != ?a; getParent() == ?a.getParent())

I’ve found these simple techniques can result in the difference between rules running in seconds versus OutOfMemoryErrors!

To be continued…

Trivia

I’ve just found out that Amazonian women cut off their right breast so as to be able to use the bow better. Guess I’ll stop lusting after Amazonian women from now on!

In other news, an old friend of mine, Mike Lee, was down from Sydney today and passed on a piece of obscure Java knowledge.

He had spent some considerable time debugging a junior developer’s code, trying to work out what was going on with a static initialiser. He couldn’t work out why it was executing just prior to, and every time, the constructor of the class was called.

Eventually he worked it out. The developer had mistakenly omitted the static keyword from the initialiser block.

Well blow me down, Java has anonymous constructors (well that’s the name I’ve given them at least)! Yup, that’s right, this is perfectly legal Java:

class Foo {
    private final String _bar;
    {
        _bar = "Hello, World";
    }
}

In fact, just like static initialisers, you can declare any number of anonymous constructors and they will all be run (in declaration order) prior to invoking a “regularly” declared constructor.
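For what it's worth, the Java Language Specification calls these blocks instance initializers. A small sketch of the ordering described above (the class name InitOrder and the log field are made up for illustration):

```java
import java.util.ArrayList;
import java.util.List;

class InitOrder {
    // Field initialisers and initialiser blocks run in textual order,
    // so the list exists before the first block uses it.
    final List<String> log = new ArrayList<>();

    { log.add("first initialiser"); }

    InitOrder() {
        // The constructor body runs only after every initialiser block.
        log.add("constructor");
    }

    // Declared after the constructor, yet it still runs before the constructor body.
    { log.add("second initialiser"); }
}
```

Creating a new InitOrder leaves log holding "first initialiser", "second initialiser", "constructor", in that order.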

So, to all you JLS weenies out there who already knew this, you don’t deserve to get a life. For the rest of us mere mortals, WTF?! :-)

Shake The Etch-A-Sketch

Recently I’ve been treating my ideas as if they were drawn on an Etch-A-Sketch. A deceptively simple concept that has helped me let go of ideas that were leading nowhere but to which I had an emotional attachment that prevented me from considering anything else. Just shake and you’re back with a clean slate. :-)

When Only An Example Will Do

Ever waded through seemingly useless JDK JavaDoc screaming “I know getBlah() returns a Blah but what do I actually do with one once I have it?!” How do I use the swags of java.io classes? And what in the world would I ever want a StreamTokenizer for?

So maybe you’re a smarty-wishbone-legs and know everything there is to know about using the JDK APIs but, there’s always something new to learn.

Sometimes forums and newsgroups can be a great help but more often than not I don’t feel like wading through 18 pages of discussion to find the answer. Sometimes I just want a succinct example.

Enter the Java Almanac. I’ve found it to be a good source of JDK examples. It generally seems to be pretty up-to-date with new examples added all the time and is a great starting point for developers who are new to Java and struggling to come to grips with the vast number of APIs.

Business Rules Fallacy #1

Remember the good old days when Crystal Reports was going to save the world? That’s right. By about now (2004) every user on the planet would be writing up their own ad-hoc reports straight from the database. All they needed was the database schema and away they would go.

What? You mean your business users aren’t doing this? Really? Say it ain’t so!

No, the truth is it never really eventuated the way we (the IT industry) had envisaged. End users just don’t get fully normalised data structures. FWIW, most people I work with probably don’t understand why 25,000,000 + NULL == NULL so how did we ever expect users to? Oh and let’s not forget the IT manager who has enough knowledge of the system to be dangerous. He has a big picture of the database schema on his wall and knows just enough SQL to build queries that do multiple table scans over the millions of rows of inventory data, causing the DBMS to not so quietly tell every other user of the system to please get nicked :-) The plain fact is that writing reports typically requires so much help from IT as to make it an IT task.

So now that Business Rules are looking more cylindrical with that silver sheen each day, some of us naively believe that our so called “Business Users” should be able to code up/modify rules for direct inclusion into a production system all by themselves. To me, this is an even bigger problem than reporting.

All the problems that plague end-user reporting apply. Users don’t understand our lovely, normalised, domain model. They surely don’t understand why they get NullPointerExceptions when adding numbers. And when it comes to knowing the difference between “and” and “or” you can forget it!

What’s worse is that Business Rules are used directly by the application to “reason” on appropriate behaviour. By comparison, with the exception of the table scan problem and of course any bad decisions that might be made based on incorrect data, reporting seems rather innocuous.

Now, if I said to the man (and in this case yes it is a man) who writes the cheques, hey how about we don’t test any of this code before we put it into production, I’d get the sack immediately. And quite rightly so I might add. Application components interact in subtle and non-obvious ways that necessitate large-scale unit/functional and integration testing. Business rules are no different. Actually they can be worse. We have many years of collective experience managing essentially procedural languages such as Java, C, C++, etc. Most developers I know think a lisp is a speech impediment and surely wouldn’t know a prolog if they tripped over one :-)

Just like with reporting, we can try going down the path of writing views and building neato tools to try and make this stuff more like human readable languages and structures but when it comes down to it, like reporting, business rules require about the same (if not more) intervention from IT departments when end users write them as when the developers themselves write them. The “best” tools in the world won’t solve real problems with allowing end users the ability to directly modify business rules. I mean, glasses aren’t much good if the patient is blind.

Business rule maintenance is and should be an IT responsibility. However, the rules must be representable in a way that makes it easy for an end user to verify the translation from written/spoken languages. Appropriate use of the lower-level JRules or OPSJ languages makes this a reality without the need for tools to render rules from/to “plain English”. With a tiny bit of coaching, our business users are finding they can understand the rules sufficiently to know when we’ve made a mistake. In fact this iteration, our business rep has indicated he’d like to try his hand at writing one. No points for picking the irony in that.

re: Appeal To Authority

I admit to having been influenced greatly by much of Martin Fowler’s writing over the years. It’s always thought provoking at the very least but this entry on his bliki made me do a “What The?”

There is no way I (or any other software loud-mouth) has that much influence.

There is no way I can believe that he actually believes this. Clearly, people such as Martin do have that much influence. That’s why they are employed by organisations such as ThoughtWorks. That’s why they draw huge crowds at conferences. That’s why developers the world over buy their books.

So maybe we could re-phrase that sentence? Perhaps it should read:

There is no way I (or any other software loud-mouth) should have that much influence.

But the fact remains that he (and other software loud-mouths) do and as a consequence lots of people will blindly do exactly what they say. The operative word here being blindly.

He is quite right though. People are selective when appealing to authority and people as a whole should take more responsibility for their own thinking.

Encapsulation vs Hiding

It’s becoming dull explaining the difference between Encapsulation and Hiding. Maybe universities need to teach English as part of their computer science courses?

**v. encapsulated, encapsulating, encapsulates**
**v. tr.**
1. To encase in or as if in a capsule.
2. To express in a brief summary; epitomize: headlines that encapsulate the news.

**v. hide, hidden, hides**
**v. tr.**
1. To put or keep out of sight; secrete.
2. To prevent the disclosure or recognition of; conceal: tried to hide the facts.

Encapsulation does not mean that my classes need to have a high proportion of private methods to demonstrate good OO. It means the methods and data on a class should be cohesive - share and participate in common purpose and responsibility. Encapsulation has little to do with making all fields private and adding respective getters and setters.

A class has certain well defined and cohesive responsibilities - think normalisation (1NF, 2NF, etc.) but for classes - and those responsibilities are, by and large, public. Read encapsulation. The implementation of those responsibilities is hidden. The fact that some of the implementation lies in private methods is incidental and occurs for clarity and maintainability. Too many private methods is often an indication that a class has too much or wildly varying responsibility.

Business Rules Goodness - Continued

Continuing with the business rules examples thread, James looked at the examples of using logical to achieve a “compensating retraction” and asked me “so that’s all very well and good but what happens if I’m not asserting a fact? What happens if instead, I’m sending an email to my broker?”

My first reaction was that this is a separate problem. The fact that I was potentially sending an email based on the SellOrder seemed like an implementation detail that I didn’t want clouding the simple fact that, under certain conditions, I wanted to indicate my desire to sell.

After a little discussion, we came up with the following solution which, IMHO, elegantly maintains the atomicity of rules and the separation of concerns. I’ve taken some liberties with the syntax and I’ve not actually tried this in JRules but it does serve to demonstrate the concept:

rule BrokerInformedOnNewSellOrder {
    when {
        ?order: SellOrder();
        not SellOrderActive(order == ?order);
    } then {
        sendMessage("Sell stock");
        assert SellOrderActive(?order);
    }
}

As the name suggests, this simply informs the broker on any new SellOrder. The key here is that we introduce a new fact SellOrderActive. If we see an order that isn’t active, we’ll send a message and assert that it is now active.

(NB. sendMessage() isn’t syntactically correct for JRules but you get the idea.)

Next we need to know what to do if the SellOrder is retracted (either explicitly or implicitly):

rule BrokerInformedOnRetractionOfSellOrder {
    when {
        ?active: SellOrderActive();
        ?order: SellOrder() from ?active;
        not SellOrder(?this == ?order);
    } then {
        sendMessage("No longer sell stock");
        retract ?active;
    }
}

This rule says that anytime we think we have an active order but the SellOrder itself no longer exists, retract it and send another message to the broker indicating that we no longer wish to sell.

As usual, I’ll re-write this rule in JESS. Thanks go to the creator Ernest Friedman-Hill for clarification on the exact syntax:

(defrule broker-informed-on-new-sell-order
    ?order <- (sell-order)
    **(not (sell-order-active ?order))**
    =>
    (sendMessage "Sell stock")
    (assert sell-order-active ?order))

(defrule broker-informed-on-retraction-of-sell-order
    ?active <- (sell-order-active ?order)
    **(not ?order <- (sell-order))**
    =>
    (sendMessage "No longer sell stock")
    (retract ?active))
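For anyone more comfortable reading Java than rule languages, the two rules behave roughly like the sketch below. This is only a procedural approximation under stated assumptions: there is no rule engine involved, orders are identified by a made-up string id, and the messages list stands in for sendMessage(). The class name BrokerRules is hypothetical.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

/** Procedural approximation of the two rules; not rule engine code. */
class BrokerRules {
    final Set<String> sellOrders = new HashSet<>();   // SellOrder facts, keyed by a hypothetical order id
    final Set<String> active = new HashSet<>();       // SellOrderActive facts
    final List<String> messages = new ArrayList<>();  // stands in for sendMessage()

    void fire() {
        // BrokerInformedOnNewSellOrder: an order with no active marker is new.
        for (String order : sellOrders) {
            if (active.add(order)) {
                messages.add("Sell stock");
            }
        }
        // BrokerInformedOnRetractionOfSellOrder: a marker whose order has gone.
        for (Iterator<String> it = active.iterator(); it.hasNext();) {
            if (!sellOrders.contains(it.next())) {
                it.remove();
                messages.add("No longer sell stock");
            }
        }
    }
}
```

Firing repeatedly while an order exists sends "Sell stock" only once, because the active marker remembers that the broker has already been told. Removing the order and firing again sends the compensating message and clears the marker.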

I'll have to start blocking email from myself now!

From: [email protected]
To: [email protected]
Subject: Re: Question
Date: Sat, 1 May 2004 09:12:29 -0400

Here is my icq list.
+++ Attachment: No Virus found
+++ MC-Afee AntiVirus - www.mcafee.com
archive.zip

Planned Obsolescence Is Poor Design

Or, interfaces are no excuse to build broken software.

I’ve ranted about this previously and I see the problem recurring so often I feel compelled to have another go but this time from a slightly different perspective.

Let’s start by agreeing that it’s best not to be constrained by someone else’s API. They introduce constructs and ways of “doing stuff” that don’t conform to my ideal. Assuming we do agree, it then follows that mocking out someone else’s API is not only a bad idea but usually plain pointless.

Enter Java’s interfaces and C++ pure-virtual classes. They are truly wonderful things. You don’t have to write any “real” code just to have one. No code means no tests required. And best of all, they provide functionality just the way you want it. They’re the ideal black box. Not as ideal but potentially almost as good, are the methods themselves. Even concrete implementations can and should provide me the ideal functionality.

So what do I mean by “ideal”? Well, say I’m implementing a bit of code and all of a sudden I realise “doh! here I’m going to need to do X” where X is functionality that is conceptually simple but complex to implement. I have a few choices: I could go straight into implementing X; or I could choose to “pretend” that X is actually implemented exactly the way I want it to be. To achieve this I can either add a method to an existing class or I could introduce an interface. Either way, the important point is that I want this thing to do what I need in the ideal way. I don’t care that it doesn’t exist yet. But I don’t want my code polluted with knowledge of how it’s actually going to be implemented. I just want to call it and have it do its magic.

So we “stub” it out and push on. Eventually we finish our lovely piece of code. We stand back and it looks good. It’s readable. It’s understandable. But there’s still that X we haven’t yet nutted out. No problem. We can still test the class using mock objects so we can still maintain the illusion that X actually exists.

At some point we will feel the pain. We need to actually implement X so it can be deployed as part of the application. So now let’s say that X really has the potential to be stupidly complex but for now we don’t actually need to solve every possible case. We create a SimpleX or DefaultX or my much despised XImpl class. What we don’t do is create a MockX class simply because it’s “not the real thing”. What does “not the real thing” mean anyway? It’s there. It’s working code. It does what I need. It may not be as flexible as I want it to be but it’s NOT A MOCK!
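As a small, made-up illustration of the pattern (none of these names come from a real project): the calling code depends only on the “ideal” interface, and the first implementation is deliberately limited but is still real, working code.

```java
import java.util.Locale;

/** Hypothetical "X": the caller only knows it can turn an amount into text. */
interface PriceFormatter {
    String format(long cents);
}

/**
 * A limited but fully working implementation, not a mock.
 * It handles non-negative amounts in a single dollar-style format and nothing else.
 */
class SimplePriceFormatter implements PriceFormatter {
    public String format(long cents) {
        return String.format(Locale.ROOT, "$%d.%02d", cents / 100, cents % 100);
    }
}

/** The calling code is written against the ideal interface only. */
class Invoice {
    private final PriceFormatter formatter;

    Invoice(PriceFormatter formatter) {
        this.formatter = formatter;
    }

    String totalLine(long cents) {
        return "Total: " + formatter.format(cents);
    }
}
```

If currencies or negative amounts are needed later, a richer implementation can be added behind the same interface without Invoice changing at all.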

And this is where the title of the entry hopefully becomes apparent.

As far as I’m concerned, every line of code I write IS PRODUCTION CODE. Period! If the customer was happy with the functionality of the system, they could disband the team and deploy the application right now and it would be production code. Let me repeat. EVERYTHING IS PRODUCTION CODE.

Code I write is rarely written with the intention of one day throwing it away! It’s a production implementation of the limited functionality that I need this minute. If I need more functionality at some point, I’ll add it. This may require me to introduce more abstractions, more interfaces, more methods that potentially do as much as required to get the job done. I’m pushing the unimplemented bits out as far as needed so that my overall design continues to adhere to my ideal. This is not to say it won’t or can’t be binned at some point but that occurs as a consequence of an evolving application. It happens to any piece of code whether I actually decided upfront that it was likely to be binned or not.

There is always The Law of Leaky Abstractions to consider. But, by way of the obligatory analogy, just because there is the chance someone might run into my car, that shouldn’t prevent me from actually driving. It just means I need to be vigilant and aware of what’s going on around me when I do so.

And because I’m sure I won’t have made any sense whatsoever up till now, I’ll confuse you even more with a real example.

Take pico container. Hold for the moment your prejudices on IoC/Dependency Injection/etc. That’s irrelevant to the point. The point is that pico container provides a very simple implementation of an interface. No bells, no whistles, no loading configuration from files. Everything is configured by code at run-time. But it does the job. It’s just not as flexible as maybe I’ll need down the track. But if it’s all I need right now, then it’s, well, all I need.

Then I decide that what I need is something that can be configured from an XML file. Do I a) throw away all of pico container because it doesn’t do what I need; b) cut and paste the pico container code into a new class, re-using as much as I can, or c) just simply add the new functionality (in this case by way of nano container)?

I’ll let you be the judge but anyone who would have written pico container as throw-away code, simply because they knew that one day they’d actually need to load configuration from files, deserves to be forced to program in binary for the rest of their natural life.

(BTW, I didn’t write pico or nano container I’m just using them as part of an example. That’s why it’s called an example)

Yes, it is true that things aren’t always ideal. But this is very rare in my experience and will only occur at the hopefully thin boundaries between your code and someone else’s API! If this isn’t the case, then WTF are you doing? It’s your software. You created it. How is it that it doesn’t at least pretend to do exactly what you want?

FWIW, this discussion is really a demonstration of how I choose to interpret XP’s “do the simplest thing that could possibly work”.

The Best Approximation Of The Least Crappy Solution We Could Possibly Build

We (all?) strive to build great software but how many of us look back on code we wrote even as recently as 6 months ago and still think “wow that’s some great software”?

Open up that CVS client and download some source code you haven’t looked at for a while. Make sure to start with the project you thought was the “best thing you’d ever written.” I’d be very surprised if your thinking on software development hasn’t changed sufficiently in that time to make you wince when you see the “cool” things you were doing back then.

I look back on videos of me performing Aikido technique and the same thing happens. “Yikes!” I think to myself “I can’t believe I used to think that was good! And to think I was teaching others that at the time!” My students learned long ago not to treat anything I say as gospel. To critically analyse what I’m teaching and understand it for themselves, not just parrot what I say. They know full well that it will probably change after a long weekend of contemplation.

We grow, we learn. Our thinking changes. In some cases it swings back and forth like a pendulum trying to find that “sweet spot”. So don’t get down on others because they don’t produce the code that you’d like. Similarly, don’t be perturbed when your mentor/coach/teacher suddenly starts explaining to you, with all the enthusiasm of a child with a new toy, that yellow is the new brown.

Because we don’t live in an ideal world, when it comes down to it, at any point in time we really are just doing the best approximation of the least crappy solution we could possibly build.

BAL Cancer

Today I have to vent my spleen in the hope that I’ll save some poor hapless souls from venturing down the frivolous path that is the JRules Business Action Language (BAL).

First up, a recap. If you haven’t read any of my previous blogs on the subject, we’re in the process of converting all our validation to JRules and we have some, IMHO, pretty cool stuff happening now but I’ll blog about that another time. The rules are in a repository because realistically the only way to write them using the BAL is using the repository (and associated tools). We need to use the BAL because it’s the business user friendly language and without it most if not all the business case for using JRules in the first place goes out the window. I estimate that we would be in the order of 5 times more productive if we could write these by hand using IRL in a text file!

Having all our rules in the repository makes it almost impossible to be agile/iterative/whatever you want to call it. Forget having multiple developers modify them because the rules are not maintained on an individual basis (ie they’re essentially all in one dirty great big file). Forget being able to apply domain refactoring (namely rename and move). And who the hell wants to write a custom Ant task to get the damn things out of the repository to do any kind of testing on them? Believe me when I tell you that we tried every which way to skin this little cat. Every time we came up with a solution to one problem, another presented itself. It’s not that we couldn’t go ahead and write the rules mind you. It’s just that we knew if we did we would be heading down a path that would ultimately violate the entire premise on which we had based our approach. The greatest minds on the planet couldn’t have helped get around the plain and simple fact that the BAL should be classified as legally dead!

And so it was with great pleasure that we called in the project oncologist and removed the festering pustule that is BAL, hopefully, once and for all! From now on we will be maintaining the rules by hand in a text file (just like java). Because it’s a text file, CVS and IntelliJ can do their merging magic (just like java). Because it’s a text file, we can rename methods, classes, etc. with ease (just like java). The list goes on but I’ll spare you :-)

The upshot is that I managed to get 2 days worth of (previously estimated) work done in around 30 minutes! Just to be sure, I passed around a print-out of the rules (now in IRL) to garner people’s opinions and EVERYONE agreed: It was good! Oh, and readable. Which was actually going to be my point :-) What’s even better is that almost straight away, people noticed a few mistakes in the rules that nobody had noticed when they were written in the “nice and fluffy” language.

I mean, when it comes down to it, the idea that end users will modify these rules is about as likely as the idea that end users can just create their own Crystal Reports straight from the database - YEAH RIGHT!!! Even if they do want to, I reckon 30 minutes of training would bring our business analysts up to speed with writing IRL. In the worst case, we can always convert rules from IRL to BAL on an as-needed basis if/when there is a requirement to do so.

In summary, JRules as an engine is great but the BAL sucks the big one! IRL IS readable (with a teeny bit of extra work) by end users and works great in an agile environment such as ours. BAL doesn’t. It (BAL) might work once you have a mature application but we don’t. Tools such as Eclipse and IntelliJ are making agile development that much easier. JRules IDE (as distinct from the engine which is just fine) is definitely a step backwards.

The day I got the metaphor

When I started on this project (as is usually the case when starting on any project) there was much to learn and many of the design decisions were unclear at best. Some seemed downright ludicrous.

Unfortunately for my team members, I’m not one to just start copying what everyone else has done - I need to understand why. After 6 weeks or so I started to feel that some of the stuff I was seeing was deliberate and some was just plain wrong. But that still didn’t explain to me why the deliberate stuff was the way it was.

This week James Ross returned to the project (YAY!) after being seconded by another team (BOO!) on the same floor. James is the technical architect for the project and therefore, IMHO, the one charged with having the overall “vision” for the design.

So, all week the focus of my constant ranting and questioning was turned fairly and squarely on him (poor bastard!). Still, no matter how many questions I asked or how much I ranted, every answer seemed only to address a single question, which felt uncomfortable to me because James is one of the best technical architects I’ve ever come across.

And then as we (James and I) walked out the office door on Friday night still in the thick of an argy bargy, he made a seemingly off the cuff and devastatingly simple remark: “it’s an insurance application form. I’ve tried to get that across to everyone on this project so many times now” he exclaimed with a sense of exasperation “except for you I guess?” Things immediately began to make sense! In one fell swoop, he had addressed almost all of my issues.

XP has as one of its core practices The Metaphor. It had never been so clear to me how important the metaphor really is and the reason nothing seemed to gel was that I didn’t have it.

It’s been a while since I was on a project where I didn’t have the metaphor in my head. What also became blatantly obvious was that when I have been the lead on a project, that’s pretty much where it’s stayed (in my head) and I haven’t communicated the metaphor effectively to every team member. It’s something I’ve usually written down as a series of “principles” that, whenever we have a design question, we can refer back to for guidance.

The metaphor is the highest level of abstraction in the application design fractal and therefore it’s paramount that every member of the team, either existing or new, fully understand what the metaphor is. Of course this is predicated on there actually being a metaphor in the first place! ;-)

The Irksome Power of Ignorance

Scarcely any degree of judgment is sufficient to restrain the imagination from magnifying that on which it is long detained. – Samuel Johnson

So “why then do I need all these unit tests for my code?” I read with amusement. Why can’t I just write functional/acceptance/integration tests and be done with it? The arguments are ludicrous. Saying that there are people we know who obsess about unit testing yet seem to write crap software and/or go overboard with seemingly useless tests in no way proves that it’s a flawed concept. Holy crap! If we applied that kind of logic to the entire industry we’d have stopped doing software development years ago.

We have to be very careful what we think we’re arguing about. I’d probably get upset if a Windoze weenie told me that Java was slow when what they really mean is that Java seems slow on Windoze! It may well be the case that Java is slow. But the fact that it’s slow on Windoze hardly justifies saying that Java as a whole is slow.

The fact remains that what most people do IS integration and functional testing and therefore most people know very well how to do this. Just because most people don’t know how to do proper unit-testing doesn’t mean unit-testing is wrong. It just means we work in a very immature field. Hell most people (myself included) don’t know how to do good software design. But we’re getting better at it!

Unit testing has proved itself time and time again in other fields. Do you think Ferrari needs an alternator to prove its engine design? Does the alternator manufacturer need an engine to prove its design? No. Likewise, every component (unit) of your computer has been individually tested before it’s placed into the PC you’re using to read this. Every single one. And then some! The components that make up the hard disk were individually tested before being used to assemble the hard disk. But does the PC manufacturer care about the components inside the hard disk? No. Of course not. Why? Because to them the hard disk is the smallest unit.

And so it is for software. Each “layer” in a system can be thought of as both a higher level of abstraction than the layer below it, yet a lower level of abstraction than the layer above. Sounds obvious right? Because software is and should be fractal in nature, it looks essentially the same no matter the magnification or level of granularity.

Do you test whether the JVM can actually increment a variable using the pre-increment operation? No! Do you test that the String class behaves correctly? Do you test to see if “your favourite framework” does its job? Probably not. At best you can safely believe it works as advertised. At worst, you write some integration tests to prove (or not) that your understanding of the tool matches reality.

Go have a quick search on Google and see what types of testing go on in electronic, automotive, aerospace and just about every other field of real engineering I can think of. Not this trumped up Research and Development we like to call Software Engineering. Here’s a completely random start.

So when it comes down to it, the same people that tell me that “the network will never be transparent” because if I’m building a nuclear power plant, I can’t afford to just “ignore” network failures, also want me to believe that I should overlook a development and testing regimen that has proven itself time and time again in other, more rigorous, disciplines!

The project I’m on has 1200+ unit-tests that take around 11 seconds to run. Then we have a smaller yet substantial number of end-to-end regression tests, an even smaller set of end-to-end acceptance tests for each iteration and a tiny number of integration tests with external systems. So maybe we’re not your beloved “Thought Leaders” but we definitely have plenty of experience doing other than unit-testing.

We have zero bugs that we know of. We have things the users don’t like because they changed their mind or the developer didn’t understand the requirement. But we have rarely, if ever, encountered unexpected behaviour in the system. When new code is added, it quite clearly breaks the build if the developer has made a mistake. I can’t tell you how many times it’s saved my ass. We’ve definitely not encountered the types of problems usually associated with complex systems whereby strange interactions between objects cause software failure. I put this down to well defined and tested interfaces, whether they be “artificial” or not.

We perform around 20 or so automated integration builds a day running the full suite of tests and each developer runs the unit tests countless times each day.

The best part is, our system is nicely de-coupled. We can quite happily replace bits of our system with little or no impact because the unit-tests defined our interfaces.

Oh and as for writing bazillions of lines of code, actually my experience is that we end up with less code. In fact I find myself deleting more code every day as we refine our abstractions, usually driven by testability. But that’s just my experience.

I know the software I produce today is, at best, average but the average has gone up since I started working - at least in the product development (as opposed to consulting) circles I keep. And it will continue to go up as we mature as an industry.

Comparing Collections

After a long week, Achilles finds he has too much time on his hands. His friend the Tortoise takes pity and indulges him with a bit of IM’ing.

Achilles: I’ve done nothing but read blog entries this weekend.

Tortoise: You must be bored! Anything interesting?

Achilles: I just read an entry that reminded me of some stuff I refactored during the week.

Tortoise: Do you ever get any real work done?

Achilles: Now that Java has a LinkedHashSet can you think of any reason to use a simple List except for “performance” reasons?

Tortoise: Won’t it just look like a List?

Achilles: Sort of but, importantly, it’s also a Set. Since when do you actually mean to add the same item to a List more than once? I’m being pedantic here.

Tortoise: But it happens, life is full of duplicates.

Achilles: I’m sure it does but I can’t think of many examples where that’s actually what you want. It just seems too often people use Lists when they should actually be using a Set. Clearly, Lists are useful but ArrayList has to be the most abused Collection class around.

Tortoise: People generally think in terms of Lists - it’s a simple concept.

Achilles: Yes, but people also think that AND and OR mean exactly the opposite. What we think and what we mean aren’t always the same and programming is about expressing what you mean.

Tortoise: Do people really think about the correct Collection type to use?

Achilles: No, they probably don’t but they should.

Tortoise: I try to but I can’t guarantee that I won’t be lazy and default to ArrayList.

Achilles: Exactly! And then the code ends up iterating over stuff and assuming a particular order on things that have no order. I see it all the time and this damn CollectionUtils.isEquals(Collection, Collection) just makes it worse. It’s ludicrous. It basically allows you to compare a List with a Set and see if the contents are the same. Which is just wrong! A List and a Set are not the same thing. They are semantically very different and thinking that it’s just a matter of comparing the contents is, IMHO, flawed.

Tortoise: Which takes us back to your original question - if you want to allow duplicates then you can’t use a Set, so when would you want to allow dups?

Achilles: Very rarely I suspect. In fact how often do you ever want to allow duplicates and how often does order really matter? Part of the problem I think is a misunderstanding of what equals(Object) actually means. It implies substitutability and therefore must be symmetric. But many people don’t realise that their equals(Object) method isn’t, so we end up with a.equals(b) being true but b.equals(a) being false.

Tortoise: I’ve not seen that happen.

Achilles: It usually happens with inheritance and using instanceof instead of class comparison.

Tortoise: You must have looked at a lot of shitty code!

Achilles: You mean you can’t tell? Why do you think I bitch so much :-)

Tortoise: Ok what if I have a situation where it is possible to have more than one object of the same type and content? That Collection could not be stored in a Set, correct?

Achilles: Correct. So you just want a Collection, not a List. I repeat NOT A List.

Tortoise: Then what implementation class do I use?

Achilles: The implementation can be a List but the variable should be a Collection, as in Collection things = new ArrayList(); because a List implies ordering and so far you haven’t mentioned anything about order being important.

Tortoise: Ok so then I decide that ordering is important.

Achilles: Sure, make it a List, but the key thing is that you don’t just assume that order is important because then people will try and write tests assuming something about the order and then they’ll build screens assuming something about the order, etc. etc.

Tortoise: I’ve just remembered…I added a method to compare Collections (for that domain object) to see if there had been any changes - there is no check to see if they are the same implementation of Collection so I could be iterating over a List and a Set.

Achilles: Why can’t you just call Collection.equals(Object)? That’s what it’s for.

Tortoise: On the Collection?

Achilles: Yes. I see people writing “convenience” methods for comparing Collections all the time when they already have an equals(Object) method that does a perfectly good job.

Tortoise: I assumed that it wouldn’t do a deep comparison.

Achilles: It iterates over the contents, calling equals(Object) and/or checking object identity (whatever is appropriate for the Collection). I use assertEquals(Object, Object) on Collections all the time.

Tortoise: Hmmm, that didn’t get picked up in the tech review.

Achilles: Probably because everyone on the project uses CollectionUtils.isEqual(Collection, Collection)!
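
For anyone who wants to see Achilles’ points run, here is a minimal sketch in plain Java. It is only an illustration: Point, LabelledPoint and orderedUnique are names I’ve made up for the example and have nothing to do with the project being discussed. It shows a LinkedHashSet dropping duplicates while keeping insertion order, a List and a Set refusing to be equal even with the same contents, and an instanceof-based equals(Object) that stops being symmetric once a subclass gets involved.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class CollectionSemantics {

    // Hypothetical value class whose equals() uses instanceof.
    static class Point {
        final int x;

        Point(int x) { this.x = x; }

        @Override
        public boolean equals(Object o) {
            return o instanceof Point && ((Point) o).x == x;
        }

        @Override
        public int hashCode() { return x; }
    }

    // Hypothetical subclass that adds a field and tightens equals().
    static class LabelledPoint extends Point {
        final String label;

        LabelledPoint(int x, String label) {
            super(x);
            this.label = label;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof LabelledPoint
                    && super.equals(o)
                    && ((LabelledPoint) o).label.equals(label);
        }
    }

    // Drops duplicates but keeps first-insertion order.
    static Set<String> orderedUnique(Collection<String> items) {
        return new LinkedHashSet<>(items);
    }

    public static void main(String[] args) {
        Set<String> set = orderedUnique(Arrays.asList("b", "a", "b"));
        System.out.println(set);                                  // [b, a]

        // A List and a Set are never equal, even with the same contents.
        List<String> sameContents = new ArrayList<>(set);
        System.out.println(sameContents.equals(set));             // false
        System.out.println(set.equals(sameContents));             // false

        // Like-for-like comparisons need no helper method.
        System.out.println(sameContents.equals(Arrays.asList("b", "a")));       // true
        System.out.println(set.equals(orderedUnique(Arrays.asList("a", "b")))); // true

        // instanceof plus inheritance breaks symmetry.
        Point p = new Point(1);
        Point lp = new LabelledPoint(1, "one");
        System.out.println(p.equals(lp));                         // true
        System.out.println(lp.equals(p));                         // false
    }
}
```

Comparing getClass() rather than using instanceof in both equals(Object) methods would be one way to restore symmetry here, at the cost of a subclass never being equal to its parent.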

Fatally flawed password schemes #27

I feel so stupid! I honestly can’t believe it’s taken me this long to realise what a drongo I’ve been.

Many years ago (well probably 3 or 4 years ago) a work colleague of mine at a new job was using a password scheme that, at that time at least, was quite common amongst his colleagues (yes I know that by inference, they were now my colleagues also hehe). Anyway, this scheme involved replacing vowels with numbers. So, for example, “Hello” would become H3ll0.

At the time this seemed to me like a sensible thing to do so I blindly followed along. But just now it suddenly occurred to me how crack-headed (no pun intended) and brain-dead that was. I mean if the whole point was to prevent brute-force dictionary attacks then it’s clear (to all but me it seems) that attempting to obfuscate a password with a deterministic algorithm such as this is just pointless. An attacker simply applies the same substitution to every word in the dictionary.
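
To make the weakness concrete, here is a rough sketch of what an attacker could do. Only the e to 3 and o to 0 substitutions come from the “Hello” example above; the a to 4 and i to 1 mappings, the class and method names, and the three-word dictionary are all assumptions of mine for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

class VowelSubstitution {

    // Deterministic vowel-to-digit substitution.
    // Assumed mapping: a->4, e->3, i->1, o->0; 'u' is left alone.
    static String obfuscate(String word) {
        StringBuilder out = new StringBuilder(word.length());
        for (char c : word.toCharArray()) {
            switch (Character.toLowerCase(c)) {
                case 'a': out.append('4'); break;
                case 'e': out.append('3'); break;
                case 'i': out.append('1'); break;
                case 'o': out.append('0'); break;
                default:  out.append(c);
            }
        }
        return out.toString();
    }

    // Dictionary attack: try each word as-is and with the substitution applied.
    static Optional<String> crack(String password, List<String> dictionary) {
        for (String word : dictionary) {
            if (word.equals(password) || obfuscate(word).equals(password)) {
                return Optional.of(word);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        List<String> dictionary = Arrays.asList("Goodbye", "Hello", "World");
        System.out.println(obfuscate("Hello"));          // H3ll0
        System.out.println(crack("H3ll0", dictionary));  // Optional[Hello]
    }
}
```

The substitution at most doubles the number of guesses per dictionary word, which is nothing to a brute-force tool. A real attack would compare password hashes rather than plain strings, but the extra work is the same.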

Fortunately I think I only ever used it on one machine I had at home those many years ago. But I wonder how many others continue to use it?

Unit tests should play nice

I’ve seen a few blog entries around of late demonstrating nifty things you can do to achieve setup/cleanup in your unit tests using statics or classloaders etc. And whilst I admire the creativity and ingenuity, I have to say that just like testing private methods, to me it smells.

Firstly, if unit tests rely on static state and class loaders for setup and cleanup then the tests are just plain broken and wrong. The ultimate goal for me is to be able to run test classes in parallel. Just splitting them up to achieve this won’t help because you likely have no way of knowing which test classes interact with which static parts of the system. The good old abstract static factory is a prime example of this. Unit tests should probably create any data and mock any infrastructure they need for the test. They certainly shouldn’t be relying on the behaviour of a class loader! If there’s too much infrastructure to mock, that’s a smell too!
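
Here is a small sketch of why static state bites, written as plain Java rather than against any particular test framework. The REGISTRY field and both “test” methods are hypothetical stand-ins for whatever a static factory might be hiding.

```java
import java.util.ArrayList;
import java.util.List;

class SharedStateDemo {

    // Hypothetical global state of the kind a static factory tends to hide.
    static final List<String> REGISTRY = new ArrayList<>();

    // Depends on static state: only passes if nothing has touched REGISTRY yet.
    static boolean startsEmptyUsingStatic() {
        boolean ok = REGISTRY.isEmpty();
        REGISTRY.add("left over");
        return ok;
    }

    // Creates the data it needs: passes regardless of what ran before it.
    static boolean startsEmptyUsingOwnFixture() {
        List<String> registry = new ArrayList<>();
        boolean ok = registry.isEmpty();
        registry.add("left over");
        return ok;
    }

    public static void main(String[] args) {
        System.out.println(startsEmptyUsingStatic());      // true
        System.out.println(startsEmptyUsingStatic());      // false
        System.out.println(startsEmptyUsingOwnFixture());  // true
        System.out.println(startsEmptyUsingOwnFixture());  // true
    }
}
```

The first version gives a different answer depending on what ran before it, which is exactly the property that makes running test classes in parallel, or in a different order, a gamble.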

Next, ideally we’d like to run test methods in a single test class on a single instance of said test class, implementing setUp() and tearDown() as appropriate. Maybe the idea of a single instance isn’t always practical (I can’t think why) but just assuming the tests will be run in a new instance isn’t acceptable either. What could I possibly want to do in a constructor that I couldn’t do in setUp()? Again, relying on the behaviour of the test runner to create new instances or the class loader to unload static state is just wrong!

There should be no assertions in tearDown(). Assertion failures in tearDown() will mask any failures in the test itself, making it all but impossible to track down the actual problem.
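
The masking is easy to reproduce without a test framework at all. The sketch below assumes a runner that simply calls tearDown() in a finally block; the run method is my own minimal stand-in, not any real runner’s implementation, and the failure messages are invented.

```java
class TearDownMasking {

    // Minimal stand-in for a test runner: tearDown() runs in a finally block.
    static String run(Runnable test, Runnable tearDown) {
        try {
            try {
                test.run();
            } finally {
                // An exception thrown here replaces one thrown by the test.
                tearDown.run();
            }
            return "passed";
        } catch (AssertionError e) {
            return "failed: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        Runnable failingTest = () -> { throw new AssertionError("expected 2 but was 3"); };
        Runnable assertingTearDown = () -> { throw new AssertionError("connection still open"); };
        Runnable quietTearDown = () -> { };

        System.out.println(run(failingTest, assertingTearDown)); // failed: connection still open
        System.out.println(run(failingTest, quietTearDown));     // failed: expected 2 but was 3
    }
}
```

With the asserting tearDown(), the report points at the cleanup and the real failure is lost.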

And lastly, avoid extending other test classes as much as possible. Remember, any tests that are defined in the super class will also be run in the subclass. Not only do we end up redundantly running the super class tests but if you have inadvertently overridden some public or protected methods (say setUp() or tearDown() for example) and forgotten to call the super method of the same signature, all hell breaks loose.
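
A sketch of the forgotten super call, again in plain Java so it runs on its own. BaseTest, DerivedTest and the run method are hypothetical and only mimic the setUp-then-test sequence a runner would follow.

```java
import java.util.ArrayList;
import java.util.List;

class InheritedTests {

    static class BaseTest {
        protected List<String> fixture;

        protected void setUp() {
            fixture = new ArrayList<>();
        }

        public void testFixtureStartsEmpty() {
            if (!fixture.isEmpty()) {
                throw new AssertionError("fixture not empty");
            }
        }
    }

    static class DerivedTest extends BaseTest {
        protected int limit;

        @Override
        protected void setUp() {
            // Bug: super.setUp() is never called, so 'fixture' stays null.
            limit = 10;
        }
    }

    // Mimics a runner: set up, then run the (inherited) test method.
    static String run(BaseTest test) {
        test.setUp();
        try {
            test.testFixtureStartsEmpty();
            return "passed";
        } catch (RuntimeException | AssertionError e) {
            return "failed: " + e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println(run(new BaseTest()));     // passed
        System.out.println(run(new DerivedTest()));  // failed: NullPointerException
    }
}
```

The inherited test is run a second time in the subclass and blows up for a reason that has nothing to do with the code under test.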

As a general rule, excessive refactoring of unit tests can make them difficult to understand. Sure, the result might still be testing the code, but it WILL make maintenance of the tests very difficult and it surely hinders the ability of anyone to get a handle on the expected behaviour/role of the class under test.