haruki zaemon

TDD and Genetic Programming

By

TADAIMA! (I’m home!)

It’s been some time since my last entry (feels like the start of a confession) but I’ve been in Japan for the past 2 weeks. For various reasons I really needed to get away from computers for a bit (no pun intended) and what better way than to immerse myself in my other passion.

As I was sitting in the plane on the 9 hour leg from Sydney to Tokyo, I was thinking about genetic programming (GP) and good old fashioned artificial intelligence (AI). I remember being fascinated by AI when I was a kid only to discover that in essence it was all about fitting a curve to a line. In more recent years, I stumbled on GP which sparked my interest once again especially when I read about people taking out patents on the basis of genetic algorithms (GA) they had developed specifically for the purpose of producing patentable designs.

So anyway, I started to think back to simple (and not so simple) neural nets that I had worked with. The kind where you create a back-propagating net, feed it the expected inputs and outputs and watch it reconfigure the weights, etc. accordingly.

This got me to thinking about my old friend test driven development (TDD). Neural nets “evolve” by reconfiguring themselves to satisfy the old requirements AND the new requirements. TDD encourages a similar behaviour.

One of the criticisms of neural nets is that they can be quite brittle. That is, they may very well solve the problems they have been trained for but can often fail dismally on problems that to us seem almost exactly the same yet obviously differ enough to “confuse” the net. Usually it’s just a matter of re-training the ’net with all the old data AND the new data (just like running your tests!). The trick is to understand the problem sufficiently to create useful training data.

One of the criticisms of TDD has been that it creates brittle applications. Now my personal experience is that this is just rubbish. If anything, it encourages designs that are more flexible/extensible in the long run. But it is true to say that a purely TDD application may not cope with a scenario that hasn’t been tested for. In fact I argue that anything that hasn’t been tested for is by definition not a feature of the system. Not only is it not a feature of the system now, it is definitely not guaranteed to work in any future release even if it does work by coincidence now.

But the real thing I liked about comparing TDD with AI is that it’s evolutionary. The system is continually adapted to suit the changing needs. This naturally led me to think about GP and to wonder whether it was possible or even practical (let alone useful) to write a GA that could take a suite of failing JUnit tests and generate an application that would make the tests pass. Then we would simply add some more tests and re-run the GA to produce a new system adapted to the new requirements.
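To make the idea a little more concrete, here is a toy sketch of "tests as the fitness function". It is nowhere near generating an application from JUnit tests: the "program" is just the pair of coefficients in f(x) = a*x + b, the "tests" are hypothetical input/output pairs, and the search is a simple mutate-and-select loop rather than a full GA with a population and crossover. Every name in it is invented for illustration.

```java
import java.util.Random;

/** Toy sketch: evolve f(x) = a*x + b until a fixed set of "tests" passes. */
final class TestDrivenEvolution {
    // Hypothetical test suite: each row is {input, expectedOutput}.
    static final int[][] TESTS = {{0, 3}, {1, 5}, {2, 7}, {10, 23}};

    /** Fitness is simply the number of passing tests. */
    static int fitness(int[] candidate) {
        int passed = 0;
        for (int[] test : TESTS) {
            if (candidate[0] * test[0] + candidate[1] == test[1]) {
                passed++;
            }
        }
        return passed;
    }

    /** Returns a copy of the parent with one coefficient nudged by -1, 0 or +1. */
    static int[] mutate(int[] parent, Random random) {
        int[] child = parent.clone();
        child[random.nextInt(child.length)] += random.nextInt(3) - 1;
        return child;
    }

    /** Keeps a child only if it passes at least as many tests as the current best. */
    static int[] evolve(int[] start, int generations, Random random) {
        int[] best = start.clone();
        for (int g = 0; g < generations && fitness(best) < TESTS.length; g++) {
            int[] child = mutate(best, random);
            if (fitness(child) >= fitness(best)) {
                best = child;
            }
        }
        return best;
    }
}
```

Adding a row to TESTS and calling evolve again from the previous best is the "add some more tests and re-run" step. Whether the loop finds a passing candidate in a given number of generations depends on the random seed, which is rather the point of the "is this even practical?" question.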

As I’ve already stated, I’m not sure how practical this is let alone useful even if it is possible. But it was the last thought on computers I had until I arrived back in Melbourne.

Now to find a good Japanese restaurant…

Dead code elimination

I recently added jCoverage (http://www.joverage.org) to the build of a “mostly” TDD project. The bits that weren’t TDD are classes that we either “hacked” together for a quick-and-dirty prototype (and that will ultimately be removed) or used for domain modelling.

I like test coverage results, mainly because they give me a warm fuzzy feeling. I can see how well or how poorly the developers are going WRT test coverage. But I’ve often wondered if they have much benefit over and above the warm fuzzies.

It has occurred to me many times that in many cases (especially on a TDD project) the coverage analysis really shows me potential areas of dead code. That is, code that almost never (if at all) gets executed. Especially on large projects with lots of refactoring going on, it can be hard to keep track of which methods are no longer used.

IDEs such as IntelliJ IDEA and Eclipse and even Checkstyle and I think PMD can show you visually which private members aren’t being used. The IDEs can also show you which public and protected methods aren’t being used but you have to run the search manually. Granted, it is also possible to write some code that loads all your classes and builds dependency graphs to do the same. But why bother when I already have a tool that does it for me?

And so it came to pass that this morning I was looking over a coverage report and found a few areas where the coverage was a big fat zero. Intrigued, I opened the code in my IDE and did a search for all usages. What do you know. None. Zero. Nada. Bupkis. Zilch. You get the idea :-). I did this on the next few untested methods and I’d say that around 25% of the untested code was actually dead code!
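The first half of that procedure (find what never ran) boils down to a set difference, which can be sketched as below. This is only an illustration: the `executed` set is a hand-written stand-in for whatever a coverage run reports, the sketch does not read jCoverage's actual report format, and `Sample` is a made-up class. A name that survives the filter is only a candidate; it still needs the "find usages" check described above.

```java
import java.lang.reflect.Method;
import java.util.Set;
import java.util.TreeSet;

final class DeadCodeCandidates {
    /** Hypothetical class standing in for production code. */
    static final class Sample {
        public void used() { }
        public void unused() { }
    }

    /** Names of methods declared on the class but absent from the executed set. */
    static Set<String> neverExecuted(Class<?> type, Set<String> executed) {
        Set<String> candidates = new TreeSet<>();
        for (Method method : type.getDeclaredMethods()) {
            // Skip compiler- or tool-generated methods; they are not ours to delete.
            if (!method.isSynthetic() && !executed.contains(method.getName())) {
                candidates.add(method.getName());
            }
        }
        return candidates;
    }
}
```

Matching on simple method names ignores overloads, so a real version would want to compare full signatures.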

Just for curiosity, I’d really like to put this into a live system and see which parts of the code aren’t used. I’m sure some of it will turn out to be old code that handles scenarios that never eventuate anymore. Who knows? Worth a try at least?

Testing thread safety - updated

Not much this time except to say that I took the previous examples and made them a bit more generic. The example provided shows the simplest method of using the classes but it can easily be extended for more complex requirements. In fact, I’ve so far used these classes to successfully test some in-memory database code I’d been writing so it definitely works for more than trivial examples.

/**
 * Based on article http://www.npac.syr.edu/projects/cps615fall95/students/jgyip5/public_html/cps616/conflict.html
 */
public final class SimpleSample {
    private int _common;

    /**
     * Increment the common variable
     * @return true if we managed to update it "atomically", otherwise false to indicate failure
     */
    public synchronized boolean increment() {
        int common = _common;
        Thread.yield();
        boolean successfull = (_common == common);
        _common = common + 1;
        return successfull;
    }
}

import java.util.LinkedList;
import java.util.List;

public class SimpleSampleTest extends CustomTestCase {
    public SimpleSampleTest() {
        super(SimpleSample.class);
    }

    public void test() {
        final SimpleSample sample = new SimpleSample();

        List targets = new LinkedList();
        for (int i = 0; i < 5; ++i) {
            targets.add(new CustomRunnable() {
                public boolean run() {
                    return sample.increment();
                }
            });
        }

        execute(targets);
    }
}

import org.objectweb.asm.Attribute;
import org.objectweb.asm.ClassAdapter;
import org.objectweb.asm.ClassVisitor;
import org.objectweb.asm.CodeVisitor;
import org.objectweb.asm.Constants;

/**
 * Removes synchronization from a method. Currently only removes the `synchronized` access flag from the
 * method declaration.
 */
final class ClassModifier extends ClassAdapter {
    public ClassModifier(ClassVisitor visitor) {
        super(visitor);
    }

    public CodeVisitor visitMethod(int access, String name, String desc, String[] exceptions, Attribute attributes) {
        int newAccess = access;
        if ((access & Constants.ACC_SYNCHRONIZED) != 0) {
            newAccess -= Constants.ACC_SYNCHRONIZED;
        }
        return super.visitMethod(newAccess, name, desc, exceptions, attributes);
    }
}

import org.objectweb.asm.ClassReader;
import org.objectweb.asm.ClassVisitor;
import org.objectweb.asm.ClassWriter;

import java.io.IOException;

/**
 * Forces certain classes to be loaded into this class loader. In addition, performs byte-code modification to remove
 * synchronization from specific classes.
 */
final class CustomClassLoader extends ClassLoader {
    /** Name of the class under test */
    private final String _subjectClassName;

    /** Name of the test class */
    private final String _testClassName;

    /**
     * Constructor.
     * @param subjectClassName Name of the class under test
     * @param testClassName Name of the test class
     */
    public CustomClassLoader(String subjectClassName, String testClassName) {
        _subjectClassName = subjectClassName;
        _testClassName = testClassName;
    }

    public synchronized Class loadClass(String name) throws ClassNotFoundException {
        Class c = null;
        if (name.startsWith(_subjectClassName)) {
            c = defineClass(name, true);
        } else if (name.startsWith(_testClassName)) {
            c = defineClass(name, false);
        } else {
            c = super.loadClass(name);
        }
        return c;
    }

    /**
     * Forces the loading of a class into this class loader
     * @param name The fully qualified class name
     * @param modify true to strip synchronization from the class, false to load it unchanged
     * @return The newly defined class
     * @throws ClassNotFoundException If an error occurs during loading
     */
    private Class defineClass(String name, boolean modify) throws ClassNotFoundException {
        // Setup the class file to read
        ClassReader reader = null;
        try {
            reader = new ClassReader(getResourceAsStream(name.replace('.', '/') + ".class"));
        } catch (IOException e) {
            throw new ClassNotFoundException(name, e);
        }

        // Setup an in-memory writer for the byte-code
        ClassWriter writer = new ClassWriter(false);

        // Determine if we need to modify the class
        ClassVisitor visitor = writer;
        if (modify) {
            visitor = new ClassModifier(writer);
        }

        // And load it
        reader.accept(visitor, false);
        byte[] byteCode = writer.toByteArray();
        return defineClass(name, byteCode, 0, byteCode.length);
    }
}

/**
 * Convenience interface for constructing simple threaded tests.
 * @see CustomTestCase#execute(java.util.List)
 */
public interface CustomRunnable {
    /**
     * Performs the test.
     * @return true to indicate success, otherwise false to indicate failure
     */
    public boolean run();
}

import junit.framework.TestCase;
import junit.framework.TestResult;
import junit.framework.TestSuite;

import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;

/**
 * Base class for testing thread safety. Extend this class and simply implement any tests you need. Each test will be
 * run once ASIS and then again with synchronisation removed to, hopefully, induce failure. When run with synchronisation
 * removed, each test is checked to ensure that it failed.
 * <br>
 * There is also a convenience method `execute(List)` to assist in writing simple tests.
 * @see #execute(java.util.List)
 */
public class CustomTestCase extends TestCase {
    /** Name of the class to modify */
    private final String _subjectClassName;

    /**
     * Constructor.
     * @param subjectClass The class under test
     */
    public CustomTestCase(Class subjectClass) {
        _subjectClassName = subjectClass.getName();
    }

    /**
     * As the name implies, re-runs all tests (except itself) with synchronisation removed.
     */
    public final void testAllUnsynchronized() throws Exception {
        // Ignore recursive invocations of this particular test
        if (getClass().getClassLoader() instanceof CustomClassLoader) {
            return;
        }

        // We'll need our custom class loader to perform byte-code manipulation
        CustomClassLoader loader = new CustomClassLoader(_subjectClassName, getClass().getName());

        // We need an instance of this test within the new class loader
        TestSuite suite = new TestSuite(loader.loadClass(getClass().getName()));

        // Run the tests
        TestResult result = new TestResult();
        suite.run(result);

        // And ensure they ALL failed
        // TODO: This is not sufficient for reporting purposes. We need to check and report on each test
        assertFalse(result.wasSuccessful());
    }

    /**
     * Convenience method if a simple threaded model is all you require. This method executes the specified
     * `CustomRunnable`s and asserts that each thread completed successfully.
     * @param targets Collection of `CustomRunnable`s to execute in parallel
     */
    protected final void execute(List targets) {
        // All threads are to share the same thread group
        final ThreadGroup group = new ThreadGroup(getName());

        // Create a bunch of threads, one for each target
        final List threads = new LinkedList();
        for (Iterator i = targets.iterator(); i.hasNext();) {
            threads.add(new CustomThread(group, (CustomRunnable) i.next()));
        }

        // Start up the threads
        for (Iterator i = threads.iterator(); i.hasNext();) {
            ((CustomThread) i.next()).start();
        }

        // Wait for them to finish and determine if they were all successful or not
        boolean successful = true;
        for (Iterator i = threads.iterator(); i.hasNext();) {
            CustomThread thread = (CustomThread) i.next();
            try {
                thread.join();
            } catch (InterruptedException e) {
                // Ignore it
            }
            successful &= thread.wasSuccessful();
        }

        assertTrue(successful);
    }
}

/**
 * Executes a `CustomRunnable` and reports on the success or failure.
 * @see CustomRunnable
 */
final class CustomThread extends Thread {
    private final CustomRunnable _target;
    private boolean _successful = false;

    public CustomThread(ThreadGroup group, CustomRunnable target) {
        super(group, "");
        _target = target;
    }

    public void run() {
        try {
            _successful = _target.run();
        } finally {
            if (!_successful) {
                // Fail-fast - interrupts all sleeping threads
                Thread.currentThread().getThreadGroup().interrupt();
            }
        }
    }

    public boolean wasSuccessful() {
        return _successful;
    }
}

Adding a comment feed

I found an article detailing how to add a comment feed to a Movable Type blog. I made a few changes (as one does) and now you can subscribe to the comments on this blog as well as the main feed. So for anyone who’s interested, here’s the template:

<?xml version="1.0" encoding="<$MTPublishCharset$>"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:admin="http://webns.net/mvcb/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <channel>
    <title><$MTBlogName remove_html="1" encode_xml="1"$> - Comments</title>
    <link><$MTBlogURL$></link>
    <description>Latest comments on <$MTBlogDescription remove_html="1" encode_xml="1"$></description>
    <dc:language>en-us</dc:language>
    <dc:lastBuildDate><$MTDate format="%Y-%m-%dT%H:%M:%S"$><$MTBlogTimezone$></dc:lastBuildDate>
    <admin:generatorAgent rdf:resource="http://www.movabletype.org/?v=<$MTVersion$>" />
    <sy:updatePeriod>hourly</sy:updatePeriod>
    <sy:updateFrequency>1</sy:updateFrequency>
    <sy:updateBase>2000-01-01T12:00+00:00</sy:updateBase>
    <MTComments lastn="20">
      <item>
        <title>Comment on "<MTCommentEntry><$MTEntryTitle remove_html="1" encode_xml="1"$></MTCommentEntry>"</title>
        <link><MTCommentEntry><$MTEntryLink$></MTCommentEntry>#comments</link>
        <description><$MTCommentBody remove_html="1" encode_xml="1"$><p>- <$MTCommentAuthor remove_html="1" encode_xml="1"$></p></description>
        <guid isPermaLink="false">comment<$MTCommentID pad="1"$>@<$MTBlogURL$></guid>
        <dc:pubDate><$MTCommentDate format="%Y-%m-%dT%H:%M:%S"$> <$MTBlogTimezone no_colon="1"$></dc:pubDate>
      </item>
    </MTComments>
  </channel>
</rss>

Testing thread safety revisited

At a loose end today I turned my attention back to a recent blog of mine on testing for thread safety. Spurred on by your feedback and possibly just to prove a point ;-), I decided I’d spend a few hours and see what I could come up with.

Before delving into the code, I’ll set out the scope or terms of reference for the exercise:

  • I wanted to test a very simple class for thread-safety;
  • The class should be written (designed) with thread-safety in mind, ie. with synchronization in place;
  • I will only deal with synchronization at the method declaration level;
  • The tests should prove that the class works correctly with synchronization in place;
  • The tests should also prove that the class fails when synchronization is removed; and finally;
  • I want it to be automated so I need a way to have a synchronized and unsynchronized version of the class.

The last point here could be handled by using an interface much like the synchronized collection wrappers but I felt this was kind of cheating. Instead I decided to use the ASM byte-code library to do some magic on the class files.
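For comparison, the interface approach I decided against would look roughly like the sketch below, modelled on the way Collections.synchronizedList wraps a plain list. The names `Counter`, `PlainCounter` and `SynchronizedCounter` are made up for illustration; they are not part of the code that follows.

```java
/** Hypothetical interface extracted from the class under test. */
interface Counter {
    boolean increment();
    int value();
}

/** Unsynchronized implementation: safe only when confined to one thread. */
final class PlainCounter implements Counter {
    private int common;

    public boolean increment() {
        int seen = common;
        Thread.yield();                      // invite another thread to interfere
        boolean undisturbed = (common == seen);
        common = seen + 1;
        return undisturbed;
    }

    public int value() {
        return common;
    }
}

/** Wrapper that serialises every call, in the style of Collections.synchronizedList. */
final class SynchronizedCounter implements Counter {
    private final Counter delegate;

    SynchronizedCounter(Counter delegate) {
        this.delegate = delegate;
    }

    public synchronized boolean increment() {
        return delegate.increment();
    }

    public synchronized int value() {
        return delegate.value();
    }
}
```

A test could then run the same threaded scenario against both implementations and expect only the wrapped one to survive. The catch is that the production class has to be split in two purely for the benefit of the test, which is why it felt like cheating.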

Now onto the code and some brief explanation of the classes. You should be able to copy and paste the code into your favourite IDE, compile and run. You’ll obviously need JUnit and the ASM byte-code library.

First the class under test which I based on some code I’d seen when researching threading:

import java.util.Random;

public final class ThreadSafe {
    private int _common;

    /**
     * Increment the common variable
     * @return true if we managed to update it "atomically", otherwise false to indicate failure
     */
    public synchronized boolean increment() {
        int common = _common;
        spin();
        boolean successfull = (_common == common);
        _common = common + 1;
        return successfull;
    }

    private void spin() {
        Random random = new Random();
        try {
            Thread.sleep(random.nextInt(500));
        } catch (InterruptedException e) {
            // Ignore it
        }
    }
}

As you can see, very very simple. Its sole reason for existing is to demonstrate how multi-threading can cause conflicts. The class is correctly synchronized. That is, if left unmodified it should behave correctly and with no failures in a multi-threaded environment. However, if multiple threads were to execute through a single unsynchronized instance, we would hopefully end up with a failure. The call to sleep() with a random duration is a means to that end.

A slight variation on this would be to update an unsynchronized Collection instead of a simple counter. The iterators of the Collection classes typically throw a ConcurrentModificationException when they detect concurrent modification (on a best-effort basis only, so it isn’t guaranteed), which we could either catch and return false or modify our test to catch the exception itself.

Next, the somewhat large test class which is broken up into a few inner classes for simplicity. Don’t believe that it’s simpler this way? Try splitting them out and see what happens. It’s possible, but some of the logic becomes too complex for my liking.

Anyway back to the code:

import junit.framework.TestCase;
import org.objectweb.asm.Attribute;
import org.objectweb.asm.ClassAdapter;
import org.objectweb.asm.ClassReader;
import org.objectweb.asm.ClassVisitor;
import org.objectweb.asm.ClassWriter;
import org.objectweb.asm.CodeVisitor;
import org.objectweb.asm.Constants;

import java.io.IOException;
import java.lang.reflect.Method;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;

public class ThreadSafeTest extends TestCase {
    public ThreadSafeTest(String name) {
        super(name);
    }

    public final void testSucceedsAsis() {
        assertTrue(execute(getName()));
    }

    public final void testFailsWithoutSynchronization() throws Exception {
        // We'll need our custom class loader to perform byte-code manipulation
        CustomClassLoader loader = new CustomClassLoader();

        // We need an instance of this test within the new class loader
        Class testClass = loader.loadClass(getClass().getName());

        // Run it
        Method method = testClass.getMethod("execute", new Class[] {String.class});
        assertFalse(((Boolean) method.invoke(null, new Object[] {getName()})).booleanValue());
    }

    /**
     * Runs the actual test.
     * @param name The name of the test
     * @return true on success, otherwise false to indicate failure
     */
    public static boolean execute(String name) {
        // The class under test
        final ThreadSafe target = new ThreadSafe();

        // All threads are to share the same thread group
        final ThreadGroup group = new ThreadGroup(name);

        // Create a bunch of threads
        final List threads = new LinkedList();
        for (int i = 0; i < 5; ++i) {
            threads.add(new CustomThread(group, target));
        }

        // Start up the threads
        for (Iterator i = threads.iterator(); i.hasNext();) {
            ((CustomThread) i.next()).start();
        }

        // Wait for them to finish and determine if we were successful or not
        boolean successfull = true;
        for (Iterator i = threads.iterator(); i.hasNext();) {
            CustomThread thread = (CustomThread) i.next();
            try {
                thread.join();
            } catch (InterruptedException e) {
                // Ignore it
            }
            successfull &= thread.successfull();
        }

        return successfull;
    }

    /**
     * Runs the class under test and reports on the success or failure.
     */
    private static final class CustomThread extends Thread {
        private final ThreadSafe _target;
        private boolean _successfull = false;

        public CustomThread(ThreadGroup group, ThreadSafe target) {
            super(group, "");
            _target = target;
        }

        public void run() {
            try {
                _successfull = _target.increment();
            } finally {
                if (!successfull()) {
                    // Fail-fast - interrupts all sleeping threads
                    Thread.currentThread().getThreadGroup().interrupt();
                }
            }
        }

        public boolean successfull() {
            return _successfull;
        }
    }

    /**
     * Performs byte-code modification to remove synchronization from the class under test.
     */
    private static final class CustomClassLoader extends ClassLoader {
        public Class loadClass(String name) throws ClassNotFoundException {
            if (name.equals(ThreadSafe.class.getName())
                    || name.startsWith(ThreadSafeTest.class.getName())) {
                return defineClass(name);
            } else {
                return super.loadClass(name);
            }
        }

        private Class defineClass(String name) throws ClassNotFoundException {
            // Setup the class file to read
            ClassReader reader = null;
            try {
                reader = new ClassReader(getResourceAsStream(name.replace('.', '/') + ".class"));
            } catch (IOException e) {
                throw new ClassNotFoundException(name, e);
            }

            // Setup an in-memory writer for the byte-code
            ClassWriter writer = new ClassWriter(false);

            // Determine if we need to modify the class
            ClassVisitor visitor = writer;
            if (name.equals(ThreadSafe.class.getName())) {
                visitor = new ClassModifier(writer);
            }

            // And load it
            reader.accept(visitor, false);
            byte[] byteCode = writer.toByteArray();
            return defineClass(name, byteCode, 0, byteCode.length);
        }
    }

    /**
     * Removes synchronization from a method. Currently only removes the `synchronized` access flag
     * from the method declaration.
     */
    private static final class ClassModifier extends ClassAdapter {
        public ClassModifier(ClassVisitor visitor) {
            super(visitor);
        }

        public CodeVisitor visitMethod(int access, String name, String desc, String[] exceptions,
                Attribute attributes) {
            int newAccess = access;
            if ((access & Constants.ACC_SYNCHRONIZED) != 0) {
                newAccess -= Constants.ACC_SYNCHRONIZED;
            }
            return super.visitMethod(newAccess, name, desc, exceptions, attributes);
        }
    }
}

There are two tests defined. One for success and one for failure as I mentioned at the start.

The aptly named testSucceedsAsis method executes a number of threads through a single instance of an unmodified class and hopes for the best :-).

Whilst probably non-obvious at first, the method testFailsWithoutSynchronization ensures that the test executes against an unsynchronized version of the class. It does this by creating a custom class loader that removes method synchronization through byte-code manipulation.

Some points of note off the top of my head:

  • Although it would be almost trivial to add support for removing all synchronization from the code, for purposes of this exercise that wasn’t necessary;
  • It may be necessary to inject random sleeps into code to try and get it to break as quickly as possible;
  • Class loading is notoriously problematic and may become too difficult for more complex classes with many dependencies; and;
  • I’m almost positive some AOP weenies will have something to add ;-).

This was largely an academic exercise and even though I clearly made a lot of assumptions, overall I was pretty happy with the outcome. We have demonstrated that it IS possible to test the thread-safety of a class. Whether this approach can be extended usefully to real-world examples remains to be seen. I leave that as an exercise for the reader ;-)

Don't touch my privates

I was giving a talk today on design and testing, refactoring, tools etc. and a question regarding the testing of private methods came up.

We were discussing reducing cyclomatic complexity by encapsulating conditional statements in private methods. I showed a block of fictional code that included a test that the customer was at least 18 years old and contained a number of conditionals including the 2 or 3 checks required to test the age.

So we refactored the code to extract the age check and demonstrated that the code was now much more readable and understandable. The newly created isAtLeast18YearsOld(Date dob) method made it clear what the calling method was trying to achieve.

The next question was “should we write a test for this?” and if so, “how are we going to do that if it’s private?”

Now, I almost never test private methods. I say almost, just in case I’ve either done it once before and forgotten or I need to change my mind at some point in the future. But as far as I’m aware, I never test private methods.

Private methods are private for a reason. They’re implementation detail. Yet another reason I dislike Java Beans so much - they simply expose the innards of classes such that the private instance fields might as well have been marked as public. “Guns don’t kill people, people kill people.” True enough but give a developer a getter and it’s death to good design.

Naturally, my first reaction is that I do believe there is enough logic in there to warrant a test but that it’s private and I don’t test private methods. Instead, maybe the method deserves to be public and therefore testable and if so where does it belong?

“How about we put it into the Person class?” someone asks. “What a sensational idea!” I replied. No more passing around a Date.
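A minimal sketch of where that leaves us, assuming a `Person` that owns its date of birth. It uses java.time from later Java versions rather than java.util.Date, and passing `today` in as a parameter is my own choice here so that the behaviour can be tested without depending on the system clock.

```java
import java.time.LocalDate;
import java.time.Period;

/** Hypothetical Person: the age check lives next to the data it needs. */
final class Person {
    private final LocalDate dateOfBirth;

    Person(LocalDate dateOfBirth) {
        this.dateOfBirth = dateOfBirth;
    }

    /** True if this person has completed the given number of years by the given day. */
    boolean isAtLeast(int years, LocalDate today) {
        return Period.between(dateOfBirth, today).getYears() >= years;
    }
}
```

The calling code now reads something like `customer.isAtLeast(18, today)`, the method is public, and a plain unit test on Person covers the logic without any tricks to reach a private method.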

Sometimes it’s more natural to place the logic in say a strategy, making it pluggable. In this case you may choose to make the method more general such as meetsAgeRequirements(). Then, once it’s pluggable you could move the real implementation into a rules engine. Whatever you do, resist the temptation to put it into a static helper! IMHO statics are the last resort of the scoundrel programmer ;-).

In one simple example we’d managed to:

  • greatly simplify our code;
  • extract and make obvious some business logic;
  • put that logic back into the class where it’s closest to the data on which it operates;
  • justify making the method publicly accessible and therefore testable; and;
  • remove (from the class we’re implementing) a dependency on data (ie date of birth) contained within another class.

We’ve given our classes behaviour!

Private methods exist primarily to reduce the complexity of other methods and/or to remove code duplication. Either way, they are incidental to the implementation detail.

If you feel you need to test a private method (maybe because it’s complex or contains some kind of business logic), rather than subverting Java’s access protection mechanisms or perverting your code, have a think about what the method really does and where it belongs. Chances are, you’ve missed an important abstraction or concept.

Lots of little classes

I remember having a heated discussion many years ago over the use of Hungarian notation. The other side’s argument went something like:

…If I don’t call it pLAmount, how do I know it’s a long?

To which I replied:

You don’t have to know if you create a Money class!

Whenever I see variables with type information embedded in their name or logically grouped by some prefix or suffix I immediately think there must be a missing class. An abstraction or simple concept that we haven’t expressed in code yet.
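As a rough illustration of the missing class in that old argument, here is what a bare-bones `Money` might look like. The representation (a long count of cents plus a currency code) and the `plus` method are assumptions of mine for the sake of the sketch, not code from that discussion.

```java
import java.util.Objects;

/** Hypothetical value class: the type carries the meaning, so names need no prefix. */
final class Money {
    private final long cents;
    private final String currency;

    Money(long cents, String currency) {
        this.cents = cents;
        this.currency = Objects.requireNonNull(currency, "currency");
    }

    /** Adds two amounts of the same currency, refusing to mix currencies. */
    Money plus(Money other) {
        if (!currency.equals(other.currency)) {
            throw new IllegalArgumentException("currency mismatch");
        }
        return new Money(Math.addExact(cents, other.cents), currency);
    }

    @Override
    public boolean equals(Object object) {
        if (!(object instanceof Money)) {
            return false;
        }
        Money other = (Money) object;
        return cents == other.cents && currency.equals(other.currency);
    }

    @Override
    public int hashCode() {
        return Objects.hash(cents, currency);
    }
}
```

A variable declared as `Money amount` says everything pLAmount was trying to say, and the fact that there is a long inside becomes an implementation detail nobody outside the class needs to know.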

As an example, we’re in the process of building a booking system and we obviously needed a way to represent date ranges for various things such as, you guessed it, Bookings.

The first thing that emerged was a Booking containing a fromDate and toDate. This had two aspects (no, not AOP-speak) that annoyed me. One was the fact that the variable names all had suffixes of Date. The other, anytime we need to pass these values around, we need two, count ’em, 2 parameters for every date range!

Oh one more thing. I loathe and detest the java.util.Date classes. Many people have commented on the problems with them so I’ll say no more.

Instead we created a TimePeriod holding the start and end of the period represented as milliseconds GMT.

First the tests:

import junit.framework.TestCase;

public class TimePeriodTest extends TestCase {
  public TimePeriodTest(String name) {
    super(name);
  }

  public void testSameRangeOverlaps() {
    assertTrue(overlaps(1, 2, 1, 2));
  }

  public void testRangeBeforeDoesntOverlap() {
    assertFalse(overlaps(1, 2, 10, 11));
  }

  public void testRangeAfterDoesntOverlap() {
    assertFalse(overlaps(10, 11, 1, 2));
  }

  public void testRangeInsideOverlaps() {
    assertTrue(overlaps(5, 6, 1, 10));
  }

  public void testRangeOverlapsAll() {
    assertTrue(overlaps(1, 10, 5, 6));
  }

  public void testRangeOverlapsStartOnly() {
    assertTrue(overlaps(1, 6, 4, 10));
  }

  public void testRangeOverlapsEndOnly() {
    assertTrue(overlaps(4, 10, 1, 6));
  }

  private boolean overlaps(long f1, long t1, long f2, long t2) {
    return new TimePeriod(f1, t1).overlaps(new TimePeriod(f2, t2));
  }
}

And here’s the class itself:

import java.io.Serializable;

public final class TimePeriod implements Serializable {
  public static final long FOREVER = Long.MAX_VALUE;

  private final long _from;
  private final long _to;

  public TimePeriod(long from, long to) {
    Assert.isTrue(from <= to, "from can't be > to");
    _from = from;
    _to = to;
  }

  public boolean overlaps(TimePeriod other) {
    Assert.isTrue(other != null, "other can't be null");
    return _to >= other._from && _from <= other._to;
  }

  public int hashCode() {
    return (int) (_from ^ _to);
  }

  public boolean equals(Object object) {
    if (object == this) {
      return true;
    } else if (object == null || !getClass().equals(object.getClass())) {
      return false;
    }

    TimePeriod other = (TimePeriod) object;
    return _from == other._from && _to == other._to;
  }

  public String toString() {
    return getClass().getName() + "[from=" + _from + ", to=" + _to + ']';
  }
}

You’ll note, there is no getFrom() nor getTo() method. Not because we won’t eventually need them but because so far we don’t need them for the user stories we have finished. More importantly, not having getters and setters forces us to think about classes in terms of behaviour not data. A topic I’ve ranted on previously.

So, we implement the class and now a Booking has a TimePeriod during which it is active.

Searching for available resources becomes pretty easy. We simply search for resources with no booking that overlaps a nominated period.

Instead of placing a getPeriod() on Booking, we choose instead to add an overlaps(TimePeriod) method. Again, it’s not because we think we won’t need to eventually get/set the period but instead we feel it’s more meaningful to add this kind of behaviour.
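In code, the delegation and the availability search might look something like this. It is a sketch only: the `TimePeriod` here is a trimmed copy repeated so the example compiles on its own, and `Resource` with its `isAvailable` method is a hypothetical name, not our real class.

```java
import java.util.List;

/** Trimmed copy of TimePeriod, repeated so the sketch compiles on its own. */
final class TimePeriod {
    private final long from;
    private final long to;

    TimePeriod(long from, long to) {
        if (from > to) {
            throw new IllegalArgumentException("from can't be > to");
        }
        this.from = from;
        this.to = to;
    }

    boolean overlaps(TimePeriod other) {
        return to >= other.from && from <= other.to;
    }
}

/** The booking answers the question itself instead of handing out its period. */
final class Booking {
    private final TimePeriod period;

    Booking(TimePeriod period) {
        this.period = period;
    }

    boolean overlaps(TimePeriod other) {
        return period.overlaps(other);
    }
}

/** Hypothetical resource: available when none of its bookings overlaps the period. */
final class Resource {
    private final List<Booking> bookings;

    Resource(List<Booking> bookings) {
        this.bookings = bookings;
    }

    boolean isAvailable(TimePeriod period) {
        for (Booking booking : bookings) {
            if (booking.overlaps(period)) {
                return false;
            }
        }
        return true;
    }
}
```

Nothing outside Booking ever sees the period, so if the representation changes later only Booking and TimePeriod need to know.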

Let me re-iterate: Getters and setters are EVIL!

The next thing we needed to do was retrieve bookings. Aside from using a referenceId, the users want to search by name and booking date. The trouble is, sometimes they (the customers) don’t remember the exact date:

Um…it was sometime in January when I made the booking

says Mr. Smith. So how do we do this?

At first we thought we might need some specialised class that magically knows to ignore days when comparing itself to a TimePeriod. It then struck us that we already had everything we needed.

Given our example, the month of January can be represented by a TimePeriod where the from is assigned 2004/01/01 00:00:00.000 and the to is assigned 2004/01/31 23:59:59.999. This necessarily overlaps any TimePeriod that falls within or around January 2004.
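Working out those two boundary values is mechanical. The sketch below uses java.time (which arrived in later Java versions) purely to show the arithmetic; `MonthBounds` is a hypothetical helper, and the last millisecond of the month is taken as one less than the first millisecond of the next.

```java
import java.time.YearMonth;
import java.time.ZoneOffset;

/** Hypothetical helper that turns a calendar month into millisecond bounds, GMT. */
final class MonthBounds {
    /** First millisecond of the month. */
    static long from(YearMonth month) {
        return month.atDay(1).atStartOfDay(ZoneOffset.UTC).toInstant().toEpochMilli();
    }

    /** Last millisecond of the month: one less than the start of the next month. */
    static long to(YearMonth month) {
        return from(month.plusMonths(1)) - 1;
    }
}
```

The values from(YearMonth.of(2004, 1)) and to(YearMonth.of(2004, 1)) are then what would be passed to the TimePeriod constructor for Mr. Smith’s “sometime in January”.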

So now we can search for any booking that overlaps the specified period.

“Nothing staggering here” you might say. “I’ve seen all this before” mutters the crowd. Well true. But it serves as the basis to illustrate something I feel is very important in software design.

“Which would be?”

Continually simplify your code by introducing abstractions whenever and wherever possible. Abstractions, often at what appear to be ludicrously fine levels of granularity, almost always lead to better quality code. It’s much easier to remove an abstraction than to introduce one later on!

Ignorance was once bliss

It’s always nice to find something about which I know bugger all. So today, instead of my usual rhetoric, magic answers and dubious words of advice, I roll on my back and display my soft underbelly.

You see, I’m trying to make a class thread-safe. Almost all of my classes are inherently thread-safe by virtue of the fact they are immutable. This class however updates internal data structures such as Maps and Sets.

Although Maps and Sets can be made internally thread-safe (by calling Collections.synchronizedX(anX)), that doesn’t really help when performing multiple calls (add(), remove(), etc.) on multiple structures (including instance variables) and treating them as one atomic-ish operation.
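To illustrate the difference, here is a made-up `Registry` (not the class I’m actually working on) whose Map and Set must always change together. Wrapping each collection with Collections.synchronizedMap or Collections.synchronizedSet would make every individual call atomic but would still let another thread observe the moment between the two calls; guarding the whole compound operation with one lock does not.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical registry whose two structures must always change together. */
final class Registry {
    private final Map<String, String> ownerByItem = new HashMap<>();
    private final Set<String> owners = new HashSet<>();

    /** Both updates happen under one lock, so no thread sees only one of them. */
    synchronized void assign(String item, String owner) {
        ownerByItem.put(item, owner);
        owners.add(owner);
    }

    /** Removes the item and, if its owner has nothing left, the owner too. */
    synchronized void release(String item) {
        String owner = ownerByItem.remove(item);
        if (owner != null && !ownerByItem.containsValue(owner)) {
            owners.remove(owner);
        }
    }

    /** True when the set of owners matches the owners actually present in the map. */
    synchronized boolean isConsistent() {
        return owners.equals(new HashSet<>(ownerByItem.values()));
    }

    synchronized int ownerCount() {
        return owners.size();
    }
}
```

Writing this is the easy part. Proving that no method was left unguarded is the bit I don’t yet know how to test.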

Now, I don’t want to just put synchronization blocks everywhere and hope for the best. I want to validate that the class is actually thread-safe. But I soon realised I have very little idea how to unit test for concurrency and thread-safety.

A quick search of the ‘Net turns up a few interesting links:

But these only address the problems associated with running multiple threads within a test. Unfortunately, simply running multiple threads doesn’t actually prove that we have achieved thread-safety.

Because of the non-deterministic nature of thread scheduling, it is very difficult to ensure that we have had multiple threads accessing a single object concurrently, save adding hooks into the object itself to make it sleep or give up control at strategic points in the code.
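The closest thing to a hook-free approach I can think of is to line up a bunch of threads behind a latch and let them all go at once. To be clear, this is a sketch that only raises the odds of real contention; it proves nothing about thread-safety. ConcurrentRunner and runConcurrently are hypothetical names of my own.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical test helper: releases all worker threads at the same moment.
final class ConcurrentRunner {
    // Runs 'task' once on each of 'threads' threads and returns how many runs threw.
    static int runConcurrently(int threads, Runnable task) {
        CountDownLatch ready = new CountDownLatch(threads);
        CountDownLatch start = new CountDownLatch(1);
        CountDownLatch done = new CountDownLatch(threads);
        AtomicInteger failures = new AtomicInteger();

        for (int i = 0; i < threads; i++) {
            new Thread(() -> {
                ready.countDown();          // signal that this worker is parked
                try {
                    start.await();          // wait for the starting gun
                    task.run();
                } catch (Throwable t) {
                    failures.incrementAndGet();
                } finally {
                    done.countDown();
                }
            }).start();
        }

        try {
            ready.await();                  // every worker is waiting on 'start'
            start.countDown();              // release them together
            done.await();                   // wait for all of them to finish
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return failures.get();
    }
}
```

A test would call something like runConcurrently(16, () -> registry.login(...)) many times over and then check the object's invariants. A pass still only means "didn't break this time".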

Immediately, my brain kicks into golden hammer mode and decides that this problem looks like a BCEL or ASM nail. But, being the naturally lazy git that I am, writing code or having to deal with AspectJ doesn’t really appeal to me right now. Nor can I come up with any heuristics to determine where I would inject code anyway.

As you all know by now, I scoff at the idea that something is too hard to test. That said, I’m still left pondering how the hell I’m going to validate that my code is thread-safe? More to the point, how am I going to (re)design my class so that it’s easy to test?

Exceptionally Challenged

I’ve finally finished my foray into C# and I suppose it would be obvious to all that I was rather less than impressed by some “decisions” that were made by the C#/.Net development team(s). Most notably, the lack of checked exceptions.

So it was with much joy that I stumbled upon this article on the Artima web site: an interview with Anders Hejlsberg, the lead C# architect and a distinguished engineer at Microsoft, on “The Trouble with Checked Exceptions”.

Fantastic I thought! At last I’ll get some sensible, logical, coherent and rational explanation for some of the stuff that feels so uncomfortable to a Java weenie such as myself.

If only.

Thinking that maybe it was just me, I forwarded the link on to a good friend of mine, James Ross, who knows quite a lot about the Microsoft world. But alas, he drew similar conclusions. (Portions of our conversation have been included here.)

No, it is sad to say but Mr. Hejlsberg shows his true colours and in the process makes me even less impressed with .Net.

I’m usually not fond of taking an argument and dissecting it line-by-line. It’s often too easy to take stuff out of context, and it often leaves one open to a counter-argument in a similar vein, ultimately leading to a flame war. But on the basis that Mr. Hejlsberg wouldn’t know me from a bar of soap, let alone read my blog, why not.

So without further ado:

…I completely agree that checked exceptions are a wonderful feature. It’s just that particular implementations can be problematic. …I think you just take one set of problems and trade them for another set of problems.

So he likes them but thinks that it’s the implementation of them that’s wrong. Um…I must be really stupid but HOW ELSE DO YOU IMPLEMENT CHECKED EXCEPTIONS? I can’t think of a simpler way than in Java. In fact I can’t think of any other way really. Please enlighten me!

Skipping forward a bit, he reckons:

The concern I have about checked exceptions is the handcuffs they put on programmers … It is sort of these dictatorial API designers telling you how to do your exception handling.

Well how about being dictatorial about what methods you can override? In C# a method cannot be overridden unless you declare it as virtual.

He then goes on to say:

Let’s start with versioning, because the issues are pretty easy to see there. Let’s say I create a method foo that declares it throws exceptions A, B, and C. In version two of foo, I want to add a bunch of features, and now foo might throw exception D. It is a breaking change for me to add D to the throws clause of that method, because existing caller of that method will almost certainly not handle that exception.

Dude, try making an API that will be able to respond, like wrapping those FOUR DIFFERENT EXCEPTIONS into one abstract package-level exception, you clown! This argument doesn’t stand up to scrutiny because the same thing can be said about method parameters and return types, so why not get rid of them too!?

In fact he says:

C# is basically silent on the checked exceptions issue. Once a better solution is known - and trust me we continue to think about it - we can go back and actually put something in place…And so, when you take all of these issues, to me it just seems more thinking is needed before we put some kind of checked exceptions mechanism in place for C#.

Right! You’re somehow going to go back and change the way exceptions work!? Get real my dear architect. Whatever happened to your argument about versioning issues?

…in a lot of cases, people don’t care. They’re not going to handle any of these exceptions. There’s a bottom level exception handler around their message loop. That handler is just going to bring up a dialog that says what went wrong and continue. The programmers protect their code by writing try finally’s everywhere, so they’ll back out correctly if an exception occurs, but they’re not actually interested in handling the exceptions.

AHA! Now we begin to see the truth. He doesn’t actually like catching exceptions after all. I don’t know what planet he is on but really.

I can just see my nuclear power station monitoring system catching a CoreMeltDownException in the “main message pump” and bringing up a dialog kindly informing the operator he might have a problem. Meanwhile, blissfully unaware, the rest of the system continues on removing the control rods from the core.

And what’s with the finally's everywhere to “protect their code”? Holy cow. Don’t let this man near any system I’m likely to work on. I don’t know about you but I rarely need to write finally statements except maybe in some integration code. Who manages resources this way these days? I’m not saying I don’t use them but my code surely isn’t as littered with them as he implies it would be.

Then he loses all credibility by delving into some spurious arguments on “scalability”:

The scalability issue is somewhat related to the versionability issue…Each subsystem throws four to ten exceptions. Now, each time you walk up the ladder of aggregation, you have this exponential hierarchy below you of exceptions you have to deal with. You end up having to declare 40 exceptions that you might throw. And once you aggregate that with another subsystem you’ve got 80 exceptions in your throws clause. It just balloons out of control.

Get out of my face! I’ve never EVER seen this, even in the worst code bases I’ve had the misfortune to work on. For a start, the numbers he quotes are just ludicrous. But again I say, how about wrapping those FOUR DIFFERENT EXCEPTIONS into one abstract package-level exception!? How about designing a language and libraries that allow smart people to do smart things instead of one that makes it even easier for stupid people to do stupid things?

What’s probably even worse is that now my SQLException propagates all the way from my database layer to my GUI layer. Some “clever” developer realises that he can catch the SQLException, check the ErrorCode for 9901 which he happens to know means a key violation (or whatever) and display some nice message to the user. Whatever happened to encapsulation and abstraction? I mean, I know abstractions can leak but this is Niagara Falls baby!

But that said, there’s certainly tremendous value in knowing what exceptions can get thrown…

Excellent. Well at least we agree on something. Damn shame that because C# has no way of declaring what exceptions are thrown, I have no way of knowing if there are any to be caught, let alone what they might be. Unless of course it says so in the documentation. Documentation gets out of date VERY QUICKLY.

When I started my porting effort, I was using the mono doco, which is still incomplete. This meant that in some cases I had no idea whether, or which, exceptions might need to be attended to. What’s even worse, none of the exceptions seem to extend any sane base class, so if I decide I need to catch more than one but treat them all the same way (say some kind of IOException) I need 3 catch clauses!

But I think we can certainly do a lot with analysis tools that detect suspicious code, including uncaught exceptions, and points out those potential holes to you.

Ahhh. Of course (slap myself on the head) why didn’t I think of that? After all what we really need is yet another tool! In fact let’s create an AOP Library for C#. Yeah. Then we can inject code into existing libraries to catch exceptions…the possibilities for this are endless ;-)

But seriously, exceptions form part of your API just like methods, interfaces, parameters, abstract data-types, etc. If you think about them in this way, they stop being scary and start being useful.

Some things to remember when using exceptions:

  • Don’t use exceptions for flow control;
  • Create sensible exception hierarchies;
  • Never throw more than one class of exception from a method unless you are forced to. That is, if you throw more than one type of exception, make sure they all extend a common base class. That way clients can catch them and/or re-throw them easily;
  • Avoid throwing someone else’s exceptions. E.g. don’t throw SQLExceptions from your middle-tier. Wrap them. There is usually no reason for the GUI to know there was an SQLException.
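The last two points can be sketched in a few lines of Java. Everything named here (PersistenceException, DuplicateKeyException, CustomerStore, SqlAction) is hypothetical, and 9901 is simply the made-up error code from the rant above, not a real vendor code.

```java
import java.sql.SQLException;

// Hypothetical base class for everything the persistence package throws.
class PersistenceException extends Exception {
    PersistenceException(String message, Throwable cause) {
        super(message, cause);
    }
}

// Hypothetical subclass; clients may catch it specifically or just catch the base class.
class DuplicateKeyException extends PersistenceException {
    DuplicateKeyException(String message, Throwable cause) {
        super(message, cause);
    }
}

final class CustomerStore {
    // Assumed vendor error code for a key violation, borrowed from the example above.
    private static final int DUPLICATE_KEY = 9901;

    // Stand-in for whatever JDBC work the store really does.
    interface SqlAction {
        void run() throws SQLException;
    }

    // The throws clause names one package-level exception; SQLException never escapes.
    static void save(SqlAction action) throws PersistenceException {
        try {
            action.run();
        } catch (SQLException e) {
            throw translate(e);
        }
    }

    // The knowledge of vendor error codes stays inside the persistence layer.
    static PersistenceException translate(SQLException e) {
        if (e.getErrorCode() == DUPLICATE_KEY) {
            return new DuplicateKeyException("Customer already exists", e);
        }
        return new PersistenceException("Could not save customer", e);
    }
}
```

The GUI can now catch DuplicateKeyException to show its nice message without ever importing java.sql, and adding a new subclass later does not change the throws clause of save().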

Simian bake-off

Well you asked for it and here it is. Results from running the native, C#, flavour of Simian versus the Java flavour.

As I mentioned earlier, I had originally run the comparison on my linux machine using mono. As many people had pointed out, this was far from a “fair” comparison. Some people even suggested that purely porting the code would result in poor performance. To this I reply fooey.

The test (and I use the term loosely) was performed on a DELL 2.0 GHz Inspiron 4150 with 512MB RAM running Microsoft Windows XP Pro against the JDK 1.4.1_01 source.

And the winner is…I’ll let you be the judge:

The java version using the Sun JDK 1.4.2_03 ran in 64MB:

> java -jar simian.jar -recurse=*.java > java.txt
Similarity Analyser 2.1.0 - http://harukizaemon.com/simian
Copyright (c) 2003-11 Simon Harris.  All rights reserved.
Simian is not free unless used solely for non-commercial or evaluation purposes.
Loading (recursively) *.java from C:\jdk1.4.1_01\src
{ignoreCurlyBraces=true, ignoreModifiers=true, ignoreStringCase=true, threshold=9}
...
Found 40880 duplicate lines in 2339 blocks in 872 files
Processed a total of 369957 significant (1187603 raw source) lines in 3889 files
Processing time: 18.337sec

The C# version ran natively in 61MB:

> simian.exe -recurse=*.java > csharp.txt
Similarity Analyser 2.1.0 - http://harukizaemon.com/simian
Copyright (c) 2003-11 Simon Harris.  All rights reserved.
Simian is not free unless used solely for non-commercial or evaluation purposes.
Loading (recursively) *.java from C:\jdk1.4.1_01\src
{ignoreCurlyBraces=True, ignoreModifiers=True, ignoreStringCase=True, threshold=9}
...
Found 40880 duplicate lines in 2339 blocks in 872 files
Processed a total of 369957 significant (1187603 raw source) lines in 3889 files
Processing time: 12.628sec

Running with -server gains us about an 8% improvement in performance for the Java version but certainly still nothing like the nearly 30% needed to catch up to the native .Net version.

Surprisingly, running under BEA JRockit 1.4.2_03 used 235MB and took around 35 seconds using all default settings. The disk seemed to be thrashing but we made no attempt to tune the performance using JRockit options.

The C# version running under mono on the same hardware ran in 250MB and took around 78 seconds. Unfortunately we couldn’t get any of the optimize features to work on the windows version of mono. Besides, we figure this comparison is rather moot. Rather it is better to compare Java versus Mono on linux.

So here are the results on a DELL 1.8GHz Inspiron 8200 with 1GB RAM running Gentoo Linux (2.4 kernel) against the JDK 1.4.2_03 source.

The java version using the Sun JDK 1.4.2_03 ran in around 60MB and took 25 seconds.

The C# version under mono (with -O=all) ran in around 90MB and took 34 seconds.

Amusingly, nay astoundingly, the .Net version runs natively faster under windows+VMWare+linux than the mono on straight windows or straight linux!! Go figure?

Interesting to say the least. I wait with bated breath for the ensuing storm of abuse from the Java community this entry generates. Hehehe. Though it’ll make a change from receiving a serve from the .Net community.

Now the task is to see if we can pin-point what accounts for the difference. Unfortunately, I fear that because it’s a direct port (i.e. line by line), any improvements I make to the Java version will likely carry forward into the .Net version.

Performance aside, .Net is still not my bag baby. It still feels a little clumsy. But then I’ve had years of practise getting my Java up to scratch.

My foray into C# - Part III

Something I forgot to mention that was very cool about all this was the fact that I developed everything under Linux and the binaries run ASIS, under windows. That, to me, speaks volumes. It’s a credit to the guys on the mono project.

I also feel compelled to answer some of the suggestions that I’m biased. Me? Never! Nor am I opinionated, loud, prone to ranting….hehehehe

I too found the process to be very smooth and very easy. In no way did I try and suggest otherwise. At the end of both blogs I clearly stated how easy it was.

I was really trying to give an accurate account of what it was like to start from knowing nothing, along with all the annoyances and frustrations that comes with it. Some comparisons may not be valid from the perspective of a born and bred C#/Microsoft developer but you have to remember that it’s only natural to make comparisons. That’s how we learn. We try and compare it with something we already know to give it some context, some meaning.

I hear from a mate that C# version 2 will have generics and Iterators and a few other things.

Regarding string versus String, you can’t (at least using mono) use any of the System classes without “using” the System namespace. This was the thing that seemed ridiculous to me. Unfortunately String (big S) lives in the System namespace. I only found this out after I had converted lots of code so I kept on with the string (little s) convention. This was how I had seen examples written up on the ’net that I was using as a guide.

As for performance, again, in no way did I try and make out that .Net was necessarily slower. All I reported was what I had found running under mono on gentoo linux. As stated in the blog, hardly a reasonable comparison hehehe.

The only thing I really can’t get used to are the libraries. I really do find them awkward. I used to be a C++ developer before I moved to Java and I really don’t see the C# libraries as a step forwards. Oh and unchecked exceptions.

In no way do I despise C#. Okay so the quip about compsci students may have been a little harsh… I got over all the extra keywords, maybe you can forgive me in return hehehe. C# and .Net may not be my cup of tea but they no longer appear too different and too scary to at least try out.

Thank-you linesmen. Thank-you ball boys. It’s back to watching the Australian Open…

My foray into C# - Part II

Well it’s just gone 7pm and I’ve pretty much finished the conversion of Simian to C#.

All in all it took around 8 hours to convert 2000 lines of real source (around 6,500 raw source lines) of Java code to C#.

I haven’t quite finished due to the blatantly stupid file/directory handling in .Net. But I’m almost there. I’ve hard-coded it to run from the current directory for testing purposes. But 99% of it has been completed. Even the output is identical which is nice to see.

Performance? Well again, I stress I’m running under mono on gentoo so that’s probably not the best test but it seems to take around 50% longer to run and consume around 50% more memory than the Java version. I’ll have to run some tests on windows and see how that performs.

Some, hopefully, interesting bits to add to my last blog…

Case statements don’t allow you to fall through to the next one if you have defined any code for it. Instead you have to explicitly say you want to goto the particular case you need. Hmmm…not sure about that.

C# doesn’t allow you to put array specifiers [] after the variable name. It only allows them to go after the type. This is good. Only because it’s the way I code anyway but I did find some code that somehow managed to slip through unnoticed.

What’s up with structs? Sheesh! How are they in any way different to an Object except in the underlying implementation (ie they’re probably allocated on the stack or something)? More warm and fuzzies for the C fraternity? Get rid of them I say.

Marking something as readonly doesn’t actually seem to mean you must have assigned it a value. This may be a quirk of the mono compiler. I don’t know. But if not, that’s just wrong. The thing I love about final in Java is that it stops me from accidentally forgetting to assign a value to something. Took me 10 minutes to track down a NullReferenceException because I had accidentally deleted a line (an assignment) from a constructor. Using const is sometimes an option but it has a few caveats: const can only be used if you want a static variable; and, more importantly, the value has to be known at compile time! This pretty much means it’s only useful for simple, primitive values assigned in the declaration. I can’t use const for say a collection or an instance of any object that requires new. Nor can I use it for values assigned in constructors. Which is exactly the problem I had.
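For anyone who hasn't leaned on this in Java, here is a tiny sketch of what I mean. The Options class is hypothetical; the point is only that a final field without an initializer has to be assigned by every constructor, or javac refuses to compile the class and complains that the field might not have been initialized.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical class with a "blank final" field holding a collection.
final class Options {
    // No initializer here, so every constructor has to assign it exactly once.
    private final List<String> names;

    Options(List<String> names) {
        // Delete this assignment and the class no longer compiles, so the
        // mistake is caught long before any null can turn up at runtime.
        this.names = Collections.unmodifiableList(new ArrayList<>(names));
    }

    List<String> names() {
        return names;
    }
}
```

Unlike const, the value is built with new at construction time, which is exactly the case described above.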

instanceof becomes is. How nice of them to simplify something I hardly ever use and generally consider poor practice anyway LOL.

I started off thinking that having to specify override for things was a good thing. Now I’m not convinced. Basically, if I haven’t thought that someone might want to override a method I’m pretty much screwed. Well they are. Luckily this is my own code base so I can refactor all I like. Shame about all those libraries and frameworks out there. If in doubt, you could always mark the methods virtual. Now that’s all very well and good but if the majority of the time I want stuff marked as virtual, wouldn’t it make sense to have that be the default? I think I’ll need a macro in my IDE for all these keywords: v[TAB]; ro[TAB]; etc. hehehe. But again, maybe it’s just what I’m used to?

Don’t get me started on the FileInfo, DirectoryInfo, File, Directory, yada yada yada. Even the regular expression libraries seem to have all these extra, essentially static, classes. The .Net libraries look like they were designed (and I use the term loosely) by high-school students. First year CompSci students at best.

Readers become TextReaders, Writers become TextWriters. In and out are reserved words! Probably should have named my parameters something more meaningful anyway?

Auto-boxing? Didn’t use it. Well not that I know of. That’s the problem. I have no idea. Scares the hell out of me. Let’s see, we want a language with lots of optimizations to keep C/C++ developers happy so they can feel like they’re writing on to the metal, but for everyone else, let’s make it really easy to not know you’re creating bazillions of objects all over the place. Hmmm…

All in all it was really easy and relatively painless. I started from nothing and probably still know nothing but it all seems to work. I can’t help but think that even in the early days of Java, the libraries may have been less functional, but they weren’t all over the place. More to the point, remember that C# came from Java (oh go on tell me it didn’t) so in my opinion there are no excuses. What’s more, since C# started, a myriad of languages have popped up that are at least as functional (probably more so) and far easier to use.

So, whilst it’s an interesting thing that we’ll all have to get used to, I’m happy in my Java land for the foreseeable future.

Time for a G&T to top off a good afternoon.

My foray into C#

Well after a relaxing 10 days riding my motorbike around Tasmania I return to Melbourne, chilled (literally), relaxed and raring to go. I think it’s about time to convert Simian to .net as had been suggested by many users. So here is a little blog of my experiences.

To start with, I’ve never written a line of C#. EVER. So I’m totally ignorant about all language constructs (except that having read the CLR book it looks like Java with uppercase method names) including, importantly, the libraries.

I’m also a linux weenie so I’m using mono under Gentoo. The combination is pretty good. Mono comes with mcs the compiler, monodoc, a purpose built browser for the framework libraries and mono for running your assemblies. At least I think that’s the new fandangled name for executables. Anyone?

I had first investigated writing an automated tool. The process seemed mechanical enough to automate yet difficult enough to end up doing some curly stuff. I also looked at some off the shelf converters but in the end decided doing it myself by hand was probably a good way to learn the ins and outs of the C# language and at the same time indulge my years of Java bias :-D.

My approach is simple: Open my project in IntelliJ (my editor of choice) and one by one, copy the .java files to corresponding .cs files, run mcs and hack away until it compiles cleanly. I have to also admit that I’m not going to turn my code into C# style stuff during this conversion. All my method names, variable names, class names, interfaces are remaining untouched. I’m not adding an I to my interfaces nor am I giving all my method names an uppercase first letter. Stuff that LOL.

So here we go…

The first thing is the package declaration. This becomes a namespace. Only, some crack smoking monkey decided that it should be a proper block which means I have to have open and closing curly braces around my entire source file. Not so bad but that means everything gets indented one more level. YUK! So I choose to just leave the indenting ASIS. Not so bad. Still seems unnecessarily verbose to me.

Next. String becomes string. What’s with that? Method names get uppercase but String (a class) gets a lowercase? I got over it. Find+Replace is my friend.

Once again, Find+Replace for boolean. It’s a bool in C#. I’m guessing a hang-over from C++. Again, I can live with this.

What’s the problem with declaring a class as final? Ahhh. A quick search of the ’net reveals that it’s sealed. Ok. Another keyword to remember. Not too bad I guess. On we go…

Crap. How do I declare a constant? In Java it’s static final. In C# it’s, quite logically I guess, const. Bulk Find+Replace for that. Done.

Ok. Now I want an immutable (readonly) instance field. In Java I just mark something as final. In C# it turns out, I have to use readonly. Egads! Another keyword to remember.

It’s interesting. When I first started using Java I thought it odd that a few keywords, such as final, seemed to be used for slightly different meaning but it soon became apparent that really the meaning was the same: It’s final; Unchangeable; Immutable; Not modifiable. C# seems to want me to use a different keyword for every little thing. This is beginning to irritate me but it’s still not difficult to do.

Starting to lose enthusiasm for this. I see all this syntactic sugar entering the Java language, obviously driven by C# and I wonder if my already less than perfect Java will become a quagmire of lexical sludge.

On I go…

Syntax Error. Hmmm… Maybe it’s not called throws in C#? Quick IM to Mike Melia. C# has no checked exceptions. I knew this. But what I didn’t know is that you can’t even declare that a method throws an exception. Any exception. Hmmm. Oh well. Believe it or not, it’s my custom Assert class (starting simple) and it’s only IllegalStateException anyway so strictly speaking it doesn’t have to be declared. I’ll just delete it.

Ok what’s the problem now? class IllegalStateException not found. What’s an equivalent in C# I wonder. Trudging through the documentation I give up. This doco is really hard to follow. JavaDoc seems much easier to read. Maybe it’s just a matter of what I’m used to. I’ll just throw ApplicationException instead.

Still giving me grief. I prefix the class name with System.. That does the trick. Hmmm. That’s a bit stupid. Having to import the System namespace? Go figure. I choose to add a using for it. I don’t like putting package/namespace names into my code.

I discover that C# doesn’t allow importing classes. Only namespaces. This is ok by me. I always figure that once you have a dependency on a class in a package, you really have a dependency on all classes in the package in a way. What I don’t like is that now I don’t have any idea, just by looking at the top of the class, what other classes it depends on.

Woohoo. A clean compile. It’s complaining that there is no entry point but I can live with that. I haven’t specified a main anywhere. At least we got a compile.

It’s now a little after 12.30am. A bit of success has brought back my enthusiasm. I’ll keep going.

Time for an interface. Bleh. Can’t have public keyword on interfaces. Yeah Yeah I know. They’re all public anyway. But as in Java, it annoys me that a missing visibility modifier means one thing for a class and another for an interface. At least in Java I can safely put the public there and it doesn’t complain. Find+Replace. Done.

Now time to do one of the classes that implements the interface.

Ok. This is ridiculous. If I have an abstract class that implements an interface but doesn’t implement all the methods, I still have to define the methods in the abstract class as abstract. I’ve defined the class as abstract. Can’t the compiler work this out? Sheesh. Talk about needless typing.

Uh. Now I discover I have to explicitly say virtual on all my methods to allow them to be overridden. I have no problem with saying override when I actually do override something, that’s kinda cool, but I’m guessing the virtual bit is a hang-over from C++ and/or it makes it easier to optimize the output of the compiler because I the developer have to tell the compiler that there is no need to create jump-vector-tables for the method. Whatever the reason, combined with the previous stuff to do with interfaces, it’s really starting to drive me nuts.

Ok. It’s now 2:45am and I’m honking along. Thankfully the compiler catches all the bits I forget to change. Sometimes the messages aren’t particularly helpful but that may just be the mono compiler. Nothing to do with C# per-se.

I’ve done all the simple classes I could find. All the ones with few dependencies. Now it’s time for some of the meatier ones. I think I’ll keep going until 3.30am and then call it a night.

I have a bunch of decorators (I love decorators) but boy is it a pain in the butt to implement in C#. It’s like alphabet soup after I’ve added all the necessary keywords: virtual, override, sealed, etc. Do Microsoft developers get paid by keystrokes? Anyway I’m getting there. It really is becoming quite mechanical now.

C# uses C++ style class extension. So instead of saying extends you use a colon (:). You also call the super-class constructor using the C++ style colon after the constructor name instead of on the first line of the constructor. That’s not too bad. Pretty much the same thing anyway. Oh and just to be different, super becomes base.

I’ve just spent the last 20 minutes learning about the collections classes. I’m beginning to think the .Net libraries were an R&D project that made it into the wild a little too early. 57 (I exaggerate a little hehe) different collection classes and none seem to do what I want.

Aha! There it is. StringCollection. Basically a Set implementation for strings. Ok now to iterate over them…

Grrrr. No Iterators. IEnumerator? Gimme a break. Instead of hasNext() followed by next() they use MoveNext() which returns true if it was successful and a Current property that is null if there isn’t a current value. Ok. But I can change my for loops into foreach which is kinda cool and surely makes up for it. As I mentioned before, that’s one of the features I’m looking forward to in J2SDK 1.5.

I’ve just converted some code that parses numbers and some that performs file I/O. Do you think that I could work out what exceptions might be thrown? I’m going head-first into the unchecked exceptions debate here and state outright that it is plain broken and wrong that the only way I can find out what exceptions can be thrown is to pray and hope that they were documented. My god! Not only that, but the I/O exceptions don’t seem to extend any sensible base class. So now instead of hoping I’ve caught all the necessary exceptions, I’m forced to catch Exception.

I’m very glad I’m converting from fully tested, well designed (if I do say so myself, which I do hehehe) Java code because I truly believe at this stage that C# and the .Net libraries are woeful.

I admit, I’ve now around 3 1/2 hours of C# experience which clearly makes me an expert, NOT, but I struggle to see how you could take Java, and make it worse and less mature. They managed it though.

Ok, well now I’m screwed. IEnumerator doesn’t allow you to remove from a collection whilst iterating. In fact it expressly says this isn’t allowed. I’ve written a bunch of custom collection classes (for performance reasons) that I was hoping I could just ignore for now but looks like I’m going to have to convert them as well just to get the behaviour I need.
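For the record, this is the Java behaviour I've been relying on: java.util.Iterator has a remove() method that takes out the element last returned by next(). The sketch below is a hypothetical example (IteratorRemoval and removeBlankLines are names made up for illustration), not code lifted from Simian.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical example of removing elements while iterating.
final class IteratorRemoval {
    // Strips blank lines in place, using the iterator that is doing the traversal.
    static void removeBlankLines(List<String> lines) {
        for (Iterator<String> i = lines.iterator(); i.hasNext(); ) {
            if (i.next().trim().isEmpty()) {
                i.remove(); // removes the element just returned by next()
            }
        }
    }

    public static void main(String[] args) {
        List<String> lines = new ArrayList<>(Arrays.asList("int a;", "", "   ", "int b;"));
        removeBlankLines(lines);
        System.out.println(lines);
    }
}
```

Calling lines.remove(...) directly inside the loop instead would normally end in a ConcurrentModificationException for ArrayList, so going through the iterator matters in Java too.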

It’s now 3.45am. Time for bed. My brain hurts. Back into it tomorrow me thinks. I’ve done around 45% of the code base in a couple of hours. Not bad. It’s pretty easy. I wonder what the Microsoft conversion tool is like :-)

It’s not too bad so far. A few quirky things here and there. It surely looks like they’ve tried to make all their existing developers happy by keeping lots of language constructs the same as those in Microsoft flavours of C/C++, VB, etc. I can understand that. In fact I think in some ways it’s remarkable that they think that way about their developers. But still it’s a bit of a Heinz 57 varieties in places.

Generics

Now I’m sure you’re all bored to tears with yet another blog on generics but I felt it was only fitting to convert all my example code.

So, here is what the immutable collection examples turned out like:

public class Component {
  private final Collection<Component> _subComponents;
  public Component(Collection<Component> subComponents) {
    Assert.isTrue(subComponents != null, "subComponents can't be null!");
    _subComponents = Collections.unmodifiableCollection(
        Arrays.asList(subComponents.toArray(new Component[subComponents.size()])));
  }
  public Collection<Component> getSubComponents() {
    return _subComponents;
  }
}

Interesting. Because I wanted to use Collections.unmodifiableCollection(Collection<T>) it now needs to know the types. So I had to use the T[] Collection.toArray(T[]) method instead.

Or we could make it even clearer:

public class Component {
  private final Collection<Component> _subComponents;
  public Component(Collection<Component> subComponents) {
    Assert.isTrue(subComponents != null, "subComponents can't be null!");
    List<Component> temp = new ArrayList<Component>(subComponents.size());
    temp.addAll(subComponents);
    _subComponents = Collections.unmodifiableCollection(temp);
  }
  public Collection<Component> getSubComponents() {
    return _subComponents;
  }
}
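If you want to see why the copy-then-wrap dance matters, here is a small self-contained sketch. Part is a hypothetical stand-in for Component (so there's no dependency on my Assert class) and it uses the ArrayList copy constructor as a shortcut for the copy step.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;

// Hypothetical stand-in for Component, trimmed to the collection handling.
final class Part {
    private final Collection<Part> subParts;

    Part(Collection<Part> subParts) {
        if (subParts == null) {
            throw new IllegalArgumentException("subParts can't be null!");
        }
        // Copy first, then wrap: the copy cuts the link to the caller's collection,
        // and the unmodifiable wrapper blocks changes made through the getter.
        this.subParts = Collections.unmodifiableCollection(new ArrayList<>(subParts));
    }

    Collection<Part> getSubParts() {
        return subParts;
    }
}
```

After construction, clearing the caller's original list leaves getSubParts() untouched, and calling add() or clear() on the returned collection throws UnsupportedOperationException.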

And of course my original bitch about custom type-safe collections:

Collection<Customer> customers = ...;
for (Iterator<Customer> i = customers.iterator(); i.hasNext(); ) {
  doSomething(i.next());
}

Not bad. Feels like C++ again ;-). Not sure if that’s such a good thing or not.

As Damian Guy points out, we could do even better using the new for construct provided by 1.5:

for (Customer c : customers) {
  doSomething(c);
}

Now that definitely looks much nicer! Unfortunately I don’t have a runtime that will support this - I can’t seem to get J2SDK 1.5 to install on Gentoo :-(

Looking over all my newly re-written (that’s re-factored for you buzzword bingo players) code, I have to say it reads pretty well. Time will tell if it really does add much more than using the old collection classes.

One thing I will note is that it’s nearly impossible to copy and paste the examples for publishing as HTML. I have to change all the < and > to &lt; and &gt; respectively. Just putting that previous sentence in was a job and a half! hehehe

And lastly, I’m using IntelliJ’s support for generics with the 2.2 early access compiler from Sun. So no doubt things will change by the time they’re released in J2SDK 1.5 proper.

Be gone ye foul smelling custom type-safe collections!

I used to use STL type-safe collections in C++ all the time and when I moved to using generic collections in Java something just felt wrong. How was I to ensure my collections contained the correct types?

But you know what? Much to my surprise over the years, I’ve never really had a problem. I can’t even remember the last time (if ever) I had a class-cast exception from accidentally adding the wrong type of object to a collection.

Which one of these would you rather?

The generic collections approach:

Collection customers = ...;
for (Iterator i = customers.iterator(); i.hasNext(); ) {
  doSomething((Customer) i.next());
}

Versus the custom type-safe collection:

public class CustomerCollection extends AbstractCollection {
  private final Collection _customers = new HashSet();
  public boolean add(Object object) {
    return addCustomer((Customer) object);
  }

  public boolean addCustomer(Customer customer) {
    return _customers.add(customer);
  }

  public int size() {
    return _customers.size();
  }

  public Iterator iterator() {
    return customerIterator();
  }

  public CustomerIterator customerIterator() {
    return new CustomerIterator(_customers.iterator());
  }

  public final class CustomerIterator implements Iterator {
    private final Iterator _iterator;
    private CustomerIterator(Iterator iterator) {
      _iterator = iterator;
    }

    public void remove() {
      _iterator.remove();
    }

    public boolean hasNext() {
      return _iterator.hasNext();
    }

    public Object next() {
      return nextCustomer();
    }

    public Customer nextCustomer() {
      return (Customer) _iterator.next();
    }
  }
}

CustomerCollection customers = ...;
for (CustomerCollection.CustomerIterator i = customers.customerIterator(); i.hasNext(); ) {
  doSomething(i.nextCustomer());
}

Not so bad, you say? Well, try doing this for every type for which you need to hold collections. Then try implementing List or Set instead of the basic Collection. Don’t even think about what it takes to implement Map. I admit, you might end up re-factoring some of this into a generic base class. You could even write one without implementing any of the collection interfaces. But then, why bother?

Sure, they might have saved me a few keystrokes here and there, but because most people (as I have done in my example) give the methods on their type-safe collections names like getCustomer(), etc., it saves me all of 2 keystrokes over using a simple cast like (Customer). In fact, in the example given, I actually ended up with more keystrokes!

“But it’s not about keystrokes, it’s about type-safety” I hear you exclaim. Well yes, you may be right. But as I mentioned earlier, I don’t recall this ever being a problem on any projects I’ve worked on. Oh, now that I think about it, there may have been one project but I’m pretty sure we simply shot the offending developer for being such an imbecile ;-)

I’m currently enduring the pain of working on a project where the consensus was that we should have type-safe collections. I seem to spend half my day implementing these instead of letting IntelliJ and Eclipse do a fine job of adding the casts in for me.

And so it is that I eagerly await the release of JDK 1.5 and language supported type-safe collections in the hope that it’ll once and for all stop people writing these useless custom implementations. Egads!

Immutable Collections

Oh what joy. Back on the last week of my project after a brief break for Christmas and straight to the inevitable CVS update. Yikes!

This is probably old hat to a lot of you but for all others, Collections.unmodifiableCollection() and its siblings don’t actually make a collection immutable!

By way of example, the following code is broken:

public class Component  {
  private final Collection _subComponents;

  public Component(Collection subComponents) {
    Assert.isTrue(subComponents != null, "subComponents can't be null!");
    _subComponents = Collections.unmodifiableCollection(subComponents);
  }

  public Collection getSubComponents() {
    return _subComponents;
  }
}

Looks pretty good but unfortunately, the following test breaks:

public void testImmutability() {
  Collection originalSubComponents = new HashSet();
  Component component = new Component(originalSubComponents);
  Collection returnedSubComponents = component.getSubComponents();
  assertNotNull(returnedSubComponents);

  // First, we test to see if the returned collection is immutable
  try {
    returnedSubComponents.add(new Object());
    fail("sub components should be unmodifiable");
  } catch (UnsupportedOperationException e) {
    // Pass
  }

  // Next, we test to see if we can subvert the immutability
  // by adding to the original collection
  assertTrue(returnedSubComponents.isEmpty());
  originalSubComponents.add(new Object());
  assertTrue("sub components should be a defensive copy", returnedSubComponents.isEmpty());
}

We haven’t made a defensive copy! So let’s add a little more code to get the test to pass:

public class Component  {
  private final Collection _subComponents;

  public Component(Collection subComponents) {
    Assert.isTrue(subComponents != null, "subComponents can't be null!");
    _subComponents = Collections.unmodifiableCollection(Arrays.asList(subComponents.toArray()));
  }

  public Collection getSubComponents() {
    return _subComponents;
  }
}

Remember, anytime a caller passes you a mutable object (such as Collection, Date, Calendar, etc.), you need to make a defensive copy if you’re going to hold on to it. This will ensure no one accidentally subverts your own immutability.
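The same rule applies to the other mutable types I mentioned. As a minimal sketch (the Appointment class and its field are hypothetical, invented purely for illustration), here is a defensive copy of a java.util.Date, made both on the way in and on the way out:

```java
import java.util.Date;

// Hypothetical example class: holds a Date without letting callers mutate it.
final class Appointment {
    private final Date _start;

    Appointment(Date start) {
        if (start == null) {
            throw new IllegalStateException("start can't be null!");
        }
        // Copy on the way in: later changes to the caller's Date don't affect us
        _start = new Date(start.getTime());
    }

    Date getStart() {
        // Copy on the way out: callers can't reach our internal Date either
        return new Date(_start.getTime());
    }
}
```

Copying in the getter is only needed because Date itself is mutable. For the collection case, the unmodifiable wrapper already takes care of the way out.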

Why did I use Arrays.asList() to create the copy? Well I’m glad you asked:

  • Because I know the collection will never be modified (that’s what immutable means after all), I don’t need to maintain the semantics of the original collection (which may have been a Set for example);
  • The resulting List will have been created with exactly the right size to accommodate the contents of the collection, meaning no re-allocating buffers. As a reader points out, the standard ArrayList constructor that takes a collection as an argument will allocate an additional 10%;
  • The List implementation returned by Arrays.asList() performs well when iterating; and
  • The iteration order is preserved (if that was important).

It is also possible to use new ArrayList(int) followed by ArrayList.addAll(Collection), which is probably easier to read:

public class Component  {
  private final Collection _subComponents;

  public Component(Collection subComponents) {
    Assert.isTrue(subComponents != null, "subComponents can't be null!");
    List temp = new ArrayList(subComponents.size());
    temp.addAll(subComponents);
    _subComponents = Collections.unmodifiableCollection(temp);
  }

  public Collection getSubComponents() {
    return _subComponents;
  }
}

As one reader has demonstrated, on his version of the JDK addAll() uses an iterator, which is the way I had assumed it would be implemented. I was thrown off when I looked at the source code for the version of the JDK I have and discovered that it creates a temporary array. Either way, this post wasn’t really about performance, so I guess I’m getting a little off track now.

P.S. Anyone see the funny side of this test given my previous post?

Be Assertive

I recently discovered a bug in an Ant task I had written a while back. A customer was running the code and it barfed with an IllegalStateException telling them proudly that “formatterType cannot be null”. Nothing special about that I suppose. An NPE would have done much the same thing. The only difference is that the code is obfuscated. That means the stack trace is totally useless. Not only that, but they were running an older version of the software, which would have meant checking out that particular version from CVS, building it and seeing if I could work out what was going on.

As it turned out, there were only two places in the code that performed a null pointer check with this particular message and only one of them is called from the Ant task. A bit of detective work later (like 5 mins tops) and I had a failing unit test written and running. The fix was one line. Actually half a line, as it required assigning a default value to an instance variable in the declaration :-).

I program defensively. I check parameters for null pointers, I assert that objects are in the correct state at the start of the method etc. I’ve found I catch strange bugs much, much earlier than if I simply wait for the NPE. I also have a preference for immutable objects, which substantially reduces the set of things I need to check for. I’ve had countless debates with, mostly junior, developers who believe that it’s better just to wait for the NPE. But fundamentally, I put assertions into my code so that out in the field my software fails in predictable ways. Ways that will ultimately help me to identify the nature and cause of a problem as easily as possible.

I even have an IntelliJ macro set up so that it’s as painless as possible to add a check to my code. And just to be sure, James Ross and I even wrote a custom Checkstyle check to ensure that an object reference is always checked for null before being used.

There are also a number of libraries around such as iContract and jContractor, that go way beyond simple assertions and allow you to add Design by Contract conditions to your code via javadoc comments and byte-code modification, special methods, macros that a pre-processor inlines, etc. Though I’ve never used them, I’d be interested in comments from anyone that has.

The thing is, I’ve never used the assert statement in Java. That doesn’t mean I don’t believe in assertions, clearly. I’ve just never used the built-in support. I’ve never liked the fact that you can turn them off. To me, this defeats the reason for having them there in the first place. In fact the JDK assertions (at least in 1.4) have to be explicitly turned on. I’m not so much interested in catching NPEs etc. in development as preventing really bad things happening in production, so to me it seems rather pointless to turn them off.

Instead, I use a custom Assert class that works in any JDK and can’t be turned off.

public final class Assert {
    private Assert() {
        throw new UnsupportedOperationException("Constructor should not be called");
    }

    public static void isTrue(boolean condition, String message) throws IllegalStateException {
        if (!condition) {
            fail(message);
        }
    }

    public static void fail(String message) throws IllegalStateException {
        throw new IllegalStateException(message);
    }
}

(This is one of those times where I feel it’s just fine to have a bunch of static methods. Imagine butchering your code to have an Assert object passed into the constructor of every class you ever created? YUK! ;-) All you AOP weenies get back in your box before things get ugly!)

I also noticed commons-lang has a Validate class that provides the same functionality (and then some). Though, again, I’ve never used it myself.
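To make the Ant task story a little more concrete, here is a rough sketch of how such a check might look. The ReportTask class, its setter and the "plain" default are all hypothetical (the real task isn't shown here), and a compact copy of Assert is repeated only so the sketch compiles on its own:

```java
// Compact copy of the Assert class above, repeated so the sketch compiles on its own.
final class Assert {
    private Assert() {
        throw new UnsupportedOperationException("Constructor should not be called");
    }

    static void isTrue(boolean condition, String message) {
        if (!condition) {
            throw new IllegalStateException(message);
        }
    }
}

// Hypothetical class, loosely modelled on the Ant task bug described earlier.
final class ReportTask {
    // Assigning a default in the declaration means the field never starts out null
    private String _formatterType = "plain";

    void setFormatterType(String formatterType) {
        // Fail at the point of the mistake, with a message that survives obfuscation
        Assert.isTrue(formatterType != null, "formatterType cannot be null");
        _formatterType = formatterType;
    }

    String getFormatterType() {
        return _formatterType;
    }
}
```

The message is what matters here: even with a useless stack trace, a distinctive string narrows the search to a handful of places in the code.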

I can think of one possible reason why I might want to turn off not only an AssertionError being thrown but the condition check itself: if you’re using asserts for checks such as “when the method ends, there should be at most 2 address records for the customer”. This may well be an expensive operation that you don’t want performed every time in production, but I don’t tend to check these sorts of things. Now I’m thinking maybe I should? Or is that what my unit tests are for? But then, using asserts, I could turn them on in production if I needed to catch some bug I was having trouble replicating. Having said that of course, if I had a sufficient test suite, I bet I could find the bug just as easily without resorting to debugging in production. Hmmmm

Best Practices

I happened upon this post on the Artima web site, which reminded me that I had been wanting to rant a little about best practices for a while.

It has always seemed to me that so called “industry best practices” are really only ever identified at the end of a cycle of innovation. Or in other words, when many in the industry have moved on to developing “better” practices. My observation has been that companies striving to adhere to “best practices” are often at least 12-18 months behind the leaders.

Some (read large, conservative, glacial, etc.) companies take so long to adopt advances in technologies, methodologies, etc. that by the time they do, they’re really using legacy* practices. They are highly risk averse. They can’t, won’t or don’t employ talented developers. Instead they hire an army of over-paid whiteboard architects who couldn’t build a software system if they tried. These whiteboard architects, with little or no understanding of what really goes on in their own industry, need to protect their backsides by showing little or no innovation and by promoting whatever methodology sees most of a project’s time and budget spent justifying their employment and ensuring there is always someone else to blame if/when something goes wrong.

Now, don’t misunderstand me. I’m in no way suggesting you jump on the next AOP, IoC, whatever wagon that rolls into town. But don’t be fooled that simply automating manual processes or creating a web site (inter-, intra- or extranet) will suddenly make your business more efficient. We should be striving to deliver high-quality, adaptable systems that allow the business to realise competitive advantage as early as possible.

* For clarification, I’ll define legacy to mean anything that clearly doesn’t work to all but the people who try so hard to believe that it does, not least because their jobs probably depend upon it. A kind of managed failure. As a reader suggests, if it works, it’s not legacy.

I'd like to coin a phrase...

… for software that bears not the hallmarks of great craftsmanship so much as lovingly “hand crufted code” ;-)

Testing with factories

A long-time colleague of mine asked me yesterday about IoC. After explaining about constructors, etc. we then discussed factories. He understood the way you would pass a single instance of, say, a widget to an object via the constructor, but what about something more complex that doesn’t know which widget it wants, or needs multiple widgets? Enter the factory.

Factories are really no different from other Java objects that require some kind of lookup mechanism to find, such as JDBC connections, JNDI contexts, etc. One of the most common ways to implement factories to facilitate runtime lookup is the so-called abstract static factory “pattern”. (I use the term pattern simply because others have. In general I find most people abuse the use of the word, but that’s a rant for another time).

For those not familiar with abstract static factory, the general idea is you create an abstract base class with a static method, say newInstance() or getInstance() for example, that returns the concrete implementation of the factory, and an abstract method, say createWidget(), that the concrete class implements. Typically, the concrete class is determined at runtime by using something as simple as a system property or something a little more sophisticated (à la META-INF/services). If you’ve ever used JAXP, you’ve been using one, possibly without realising it.

public abstract class WidgetFactory {
  public static final WidgetFactory newInstance() {
    try {
      return (WidgetFactory) Class.forName(System.getProperty(WidgetFactory.class.getName())).newInstance();
    } catch (Exception e) {
      throw new IllegalStateException("Can't instantiate concrete factory: " + e.getMessage());
    }
  }

  public abstract Widget createWidget();
}
public final class WidgetFactoryImpl extends WidgetFactory {
  public Widget createWidget() {
    return new WidgetImpl();
  }
}
public final class Window {
  public Window() {
    add(WidgetFactory.newInstance().createWidget());
    ...
  }
  ...
}

When it comes to testing Window, I will usually have to set a system property to tell the abstract factory which concrete implementation to use. Apart from the fact that this destroys my ability to run multiple tests in parallel (system properties are global!), you could also think of it as a kind of violation of encapsulation - my test class has to know how the abstract factory is implemented so that it can tell it to return my mock factory instead.

public void testSomething() {
  System.setProperty(WidgetFactory.class.getName(), MockWidgetFactory.class.getName());
  Window window = new Window();
  ...
}

So anyway, given my love of TDD, which seems to lead me to pass implementations of interfaces into constructors (à la IoC), I have a dislike of most things static. Even the abstract static factory.

Instead, I prefer to have the factory defined by an interface, then pass an instance of the interface to the class that depends on the factory (i.e. no more newInstance()). In my test this can be a mock implementation, and at runtime this can be configured in a number of ways to pass a real implementation.

public interface WidgetFactory {
  public Widget createWidget();
}
public final class WidgetFactoryImpl implements WidgetFactory {
  public Widget createWidget() {
    return new WidgetImpl();
  }
}
public final class Window {
  public Window(WidgetFactory factory) {
    add(factory.createWidget());
    ...
  }
  ...
}
public void testSomething() {
  Window window = new Window(new MockWidgetFactory());
  ...
}
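I haven’t shown MockWidgetFactory itself, so here is a minimal hand-rolled sketch of what one might look like. The empty Widget interface, the way Window stores the widget, and the counting accessors are all assumptions made so the example stands on its own:

```java
// Minimal stand-ins for the types used above; their bodies are assumptions.
interface Widget {
}

interface WidgetFactory {
    Widget createWidget();
}

final class Window {
    private final Widget _widget;

    Window(WidgetFactory factory) {
        // The window only knows the interface, never the concrete factory
        _widget = factory.createWidget();
    }

    Widget getWidget() {
        return _widget;
    }
}

// Hand-rolled mock: hands out a known widget and counts how often it was asked.
final class MockWidgetFactory implements WidgetFactory {
    private final Widget _widget = new Widget() { };
    private int _createCount;

    public Widget createWidget() {
        _createCount++;
        return _widget;
    }

    Widget getWidget() {
        return _widget;
    }

    int getCreateCount() {
        return _createCount;
    }
}
```

A test can then construct a Window with the mock and check both that createWidget() was called once and that the window ended up holding the mock’s widget, with no system properties involved.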

Sometimes this is difficult to achieve, especially when you have to use a 3rd-party API that only has an abstract static factory (such as JAXP). In this case, you really have a few possibilities:

  • Call newInstance() externally and pass the resulting concrete factory into the constructor. At least this way, you should be able to implement some kind of mock implementation, even if that means extending the abstract factory;
  • In the case of frameworks that force you to have a default constructor, consider a hybrid where your objects have both a constructor that accepts an instance of a factory for your own use, and a default constructor that falls back on some kind of lookup mechanism (such as abstract static factory) and simply passes the factory instance to the other constructor;
  • In the case of Struts, as James Ross has pointed out, you can hang the factories out of the session or the application context and let Struts set them for you; or lastly
  • Continue using the factory as is from within the class. In the case of JAXP I can’t see a big problem with this (one hopes that it is already fully tested!).

Naturally, I have a preference for the first approach if I can get away with it as it makes my life much simpler when it comes time to test and debug. No more flipping system properties, tests that can be run independently and in parallel, etc.
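As a rough sketch of the first two options using JAXP, something like the following might do. ConfigReader and rootElementName() are hypothetical names invented for this example; only the javax.xml.parsers and org.w3c.dom calls are real API:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical class, for illustration only.
final class ConfigReader {
    private final DocumentBuilderFactory _factory;

    // Hybrid: the default constructor does the static lookup and delegates
    ConfigReader() {
        this(DocumentBuilderFactory.newInstance());
    }

    // Preferred: the caller (or a test) decides which factory to pass in
    ConfigReader(DocumentBuilderFactory factory) {
        if (factory == null) {
            throw new IllegalStateException("factory can't be null!");
        }
        _factory = factory;
    }

    // Returns the name of the root element of the given XML text
    String rootElementName(String xml) {
        try {
            Document document = _factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return document.getDocumentElement().getNodeName();
        } catch (Exception e) {
            throw new IllegalStateException("Can't parse XML: " + e.getMessage());
        }
    }
}
```

Only the default constructor ever touches the static lookup, so a test can hand in whatever DocumentBuilderFactory subclass it likes through the other one.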

I can think of reasons for using the last method that are mainly to do with resource issues like “but I don’t want an instance of the factory lying around if I don’t need it”. If that really is going to cause you resource issues (which I very much doubt), then at least wrap the factory in a class that lazily creates a factory for you.
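Such a wrapper might look something like this minimal sketch. LazyWidgetFactory and isInitialised() are names I’ve made up, the stand-in types exist only so the example compiles on its own, and creating WidgetFactoryImpl directly is an assumption (the real lookup could just as easily be an abstract static factory):

```java
// Minimal stand-ins so the sketch compiles on its own; their bodies are assumptions.
interface Widget {
}

interface WidgetFactory {
    Widget createWidget();
}

final class WidgetFactoryImpl implements WidgetFactory {
    public Widget createWidget() {
        return new Widget() { };
    }
}

// Wrapper that defers creating the real factory until a widget is first requested.
final class LazyWidgetFactory implements WidgetFactory {
    private WidgetFactory _delegate;

    public synchronized Widget createWidget() {
        if (_delegate == null) {
            // First use: only now do we pay for the real factory
            _delegate = new WidgetFactoryImpl();
        }
        return _delegate.createWidget();
    }

    synchronized boolean isInitialised() {
        return _delegate != null;
    }
}
```

The class that depends on the factory still just receives a WidgetFactory through its constructor and never needs to know that creation is deferred.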