Virtual by default is better

February 28th, 2008 No comments

I’ve taken quite a few paradigm-shifting journeys in my life. Some have been quick insights, instant moments of clarity. Others have been slow processes, spanning several years. One such slow journey for me has been how I look upon virtual methods in object-oriented languages.

Java methods are virtual by default. Unless explicitly marked as final, a method may be overridden in a derived class.

class A {

  void a() { … }

  final void b() { … }

}

class B extends A {

  // B::a overrides A::a
  void a() { … }

  // Compile error: A::b is final (non-virtual)
  void b() { … }

}

C#, on the other hand, takes the opposite approach. There you have to explicitly mark a method as virtual, and an overriding method must be marked with the override keyword.

class A {

  public virtual void a() { … }

  public void b() { … }

}

class B : A {

  public override void a() { … }

  // This won’t compile since A::b is non-virtual.
  // Without the override keyword though,
  // B::b would only hide A::b, which produces
  // a warning instead of an error.

  public override void b() { … }

}

In that sense C# is more expressive: intention is communicated better through the explicit keywords. On the other hand, Java is more flexible and less restrictive.

I used to favor the non-virtual by default approach. Partly because I was kind of born with it – my first OOP environment was Object Pascal and Delphi – but also because I appreciate having better control over how my components are used. The ability to restrict polymorphism is good, because usage in ways that weren’t intended can create a lot of mess.
To give one non-obvious example, consider a library class A with a method a.

class A {

  void a() { … }

}

A library user decides he needs to add functionality to A, so he creates a descendant B with a method b.

class B extends A {

  void b() { … }

}

Everything works just fine, but two years later a new version of the library is released. In this new version a method b has been added to class A. Suddenly B::b overrides A::b, although this was never the intention. In a language like C# the compiler catches the conflict – recompiling B yields at least a method-hiding warning, and since A::b is non-virtual no accidental override takes place – but in Java and most other object-oriented languages the upgrade may have introduced a subtle bug.
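
For concreteness, version 2 of the library class might look like this:

class A {

  void a() { … }

  // New in version 2. In Java, B::b now
  // silently overrides this method.
  void b() { … }

}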

My standpoint regarding virtual methods used to be that the need for control outweighs the need for flexibility, but some five years ago I started to re-evaluate that opinion. The reason was automated testing. I have slowly come to realize that the C# object model is a pain, at least when it comes to putting a piece of legacy code under a testing harness. While there are plenty of situations where subclassing is a bad idea, unit-testing is not one of them. Subclassing can be a powerful dependency-breaking technique, and it allows for testing of code that isn’t otherwise testable without refactoring.

One (contrived) example:

class SpecialConnection {

  //…

  void send_string(String s) {
    //…
  }

  //…
}

class SomeLegacyClass {

  SpecialConnection _connection;

  SomeLegacyClass(SpecialConnection connection) {
    _connection = connection;
  }

  //…

  void create_and_send_string() {
    String s;

    // some code that builds
    // the string to send

    _connection.send_string(s);
  }

  //…
}

How can I test the create_and_send_string method? Well, if SpecialConnection::send_string is virtual the answer is easy: just subclass SpecialConnection and stub out the method.

class FakeSpecialConnection extends SpecialConnection {
  void send_string(String s) {
    if (!s.equals("expected value"))
      throw new RuntimeException(…);
  }
}

And the test code could be like this:

SomeLegacyClass c = new SomeLegacyClass(new FakeSpecialConnection());
c.create_and_send_string();

On the other hand, if send_string is not a virtual method, we must do some refactoring (like extracting an interface from SpecialConnection) before we can test without a real connection. One could argue that the code in the example is poorly designed and that we should refactor it anyway. That’s true, but a prerequisite for safe refactoring is thorough unit-testing – something we don’t have at this point, which is why we’re trying to get this piece of code into a testing harness in the first place: so that we can make safe refactorings.
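
For reference, here is a minimal sketch of what that extract-interface refactoring might look like (the Connection name is made up for illustration):

interface Connection {
  void send_string(String s);
}

class SpecialConnection implements Connection {
  public void send_string(String s) {
    //…
  }
}

class SomeLegacyClass {
  Connection _connection;

  SomeLegacyClass(Connection connection) {
    _connection = connection;
  }

  //…
}

A test can now pass in any fake Connection implementation. But notice that this touches SpecialConnection, the constructor, and every declaration in between – exactly the kind of edit we would rather make with tests already in place.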

When it comes to testing, virtual methods mean more options, better options, safer options. That is why I have converted, and now value that flexibility more than I fear possible regressions.

Cheers!

Categories: programming Tags:

OT: Al Gore and “The Climate Crisis”

February 19th, 2008 4 comments

Sorry for the off-topic.

I saw Al Gore on Swedish television last week. He had received the Gothenburg Sustainable Development Award and gave a speech at the ceremony. As usual, Al Gore captured his audience with a mix of frightening facts and a call for action. This time he conveyed the new prediction that the arctic ice could be gone in as little as five years. While that could very well be a grave exaggeration, we can’t deny that something is terribly wrong with our climate.

Where I live, a coastal town in the middle of Sweden, the rising temperature has been particularly noticeable. The first sign came in the late 1980s when I experienced my first non-white Christmas. Now, having no snow in December has become normal.

This year, the year 2008 – a year I’ll never forget – we witnessed something I thought was impossible: rain in January. Where I live, January is the winter month. It’s supposed to be cold, and the only thing coming down from the skies should be snow. In my life – and I’ve been on this planet for 37 years – I have never experienced rain in January before. But this year it rained for ten whole days.

So what can we do? Well, Al Gore’s mantra is that “we can do something about it,” and I agree. There are definitely things I can do to reduce my share of released greenhouse gases. I can use my car much less (we have great public transportation here), and I can, with little effort, save energy. Simple things would be a good place to start: really turning things off when they’re not used (computers, the TV, lamps), turning the heat down (human beings sleep best at 18 degrees Celsius), or replacing old energy-devouring machines (like our freezer from 1980).

My promise to you, my dear reader, and to the rest of the world, is that 2008 will be the year when I start to implement changes to become a more optimized human being. I will make sacrifices, my life will be less convenient, but I’ll do it happily for the sake of my children and for future generations.

My intentions are not entirely unselfish, though; I expect the changes to have a significant and positive effect on my personal finances as well.

Cheers!

P.S. I can’t help wondering what the world would be like if the outcome of the election had been different.

Categories: off-topic Tags:

The Open-Closure-Close Idiom in D

February 13th, 2008 5 comments

In a Reddit discussion following my last post on object lifetime management in D, bonzinip wondered why the D closures aren’t used with the open-closure-close idiom that is common, for instance, in Ruby.

bonzinip: languages with closures (Ruby, Smalltalk) use them to ensure that the file is closed when you get out of scope, and that does not require new keywords and does not require to bypass GC and possibly get dangling pointers.

NovaProspekt: D actually has full closures

bonzinip: so why don’t they use them with a “try { open; run closure; } finally { close; }” idiom instead?

Well, let’s take a look at how that could be accomplished in D.

To get a picture of the open-closure-close idiom, we can look at Ruby’s File class, particularly the open method.

IO.open(fd, mode_string="r") => io
IO.open(fd, mode_string="r") {|io| block } => obj

With no associated block, open is a synonym for IO::new. If the optional code block is given, it will be passed io as an argument, and the IO object will automatically be closed when the block terminates. In this instance, IO::open returns the value of the block.

What this means is that a File can be opened, either in the normal way with explicit closing (begin-ensure corresponds to try-finally in D)

f = File.open("c:\\log.txt", "w")
begin
  f.puts "Log this"
ensure
  f.close
end

or with a code block (closure)

File.open("c:\\log.txt", "w") do |f|
  f.puts "Log this"
end

In this last example the file is both opened and closed by the open method; in between, the method invokes the associated code block with the file object reference as a parameter (f). In essence, that is the open-closure-close idiom.

How could we do this in D? Well, as I’ve written before, D’s delegates in combination with function literals can be used to mimic the code blocks of Ruby. So, a D version of the File::open method could look something like this:

import std.stdio;
import std.stream;

void open_run_close(char[] fn, FileMode fm,
  void delegate(Stream) closure)
{
  Stream f = new File(fn, fm);
  try {
    closure(f);
  } finally {
    f.close();
  }
}

and be invoked like this:

open_run_close("c:\\log.txt", FileMode.Out, (Stream f) {
  OutputStream os = f;
  os.writefln("Log this");
});

To make it a little more like the original Ruby version, let’s make it support the non-closure variant as well.

Stream open_run_close(char[] fn, FileMode fm = FileMode.In,
void delegate(Stream) closure = null)
{
  Stream f = new File(fn, fm);
  if (!closure) {
    return f;
  } else {
    try {
      closure(f);
      return null;
    } finally {
      f.close();
    }
  }
}

// Example usage
open("c:\\log.txt", FileMode.Out, (Stream f) {
  OutputStream os = f;
  os.writefln("Log this");
});

// Example traditional usage
Stream f = open_run_close("c:\\log2.txt", FileMode.Out);
try {
  f.writefln("Do log this");
} finally {
  f.close;
}

The open-closure-close idiom is definitely doable in D. It’s not as pretty and natural as in Ruby, but not far from it either. Whether this particular idiom will be embraced by the D community remains to be seen.

Cheers!

Categories: D Programming Language Tags:

Managing Object Lifetimes in D

February 7th, 2008 14 comments

The D Programming Language is a modern version of C. It adds productivity features to the raw performance of C – features like object-oriented programming and garbage collection.

It may seem strange that a language focused on performance utilizes automatic memory management. A GC equals overhead, right? Well, actually, that is a common misconception. These days, implicitly memory-managed code is generally faster than code where the programmer handles deallocations. One reason for this counter-intuitive fact is that a number of optimizations can be done by the GC, especially on short-lived objects.

But garbage collection isn’t a problem-free feature. For one thing, it is non-deterministic. Normally, an object that uses external resources acquires them in its constructor and releases them in its destructor.

import std.stream;

class LogFile {
  File f;

  this() {
    f = new File("c:\\log.txt", FileMode.Out);
  }

  ~this() {
    f.close;
  }

  // Logging methods go here
}

But with a GC there is no way to know when, or even if, the destructor is run. This may be a problem if you’re dealing with scarce resources like window handles or file locks. So how can we control object destruction in a GC-capable language? One possibility is to trigger a GC sweep explicitly in your code. In D this is done with the fullCollect function.

import std.gc;

fullCollect();

Explicit triggering of garbage collection is usually a bad idea, for several reasons. For one thing, it’s like killing an ant with a bazooka – and you still risk missing the target. Therefore, the normal approach to the problem of non-determinism is to use a dispose method.

class LogFile {
  File f;

  this() {
    f = new File("c:\\log.txt", FileMode.Out);
  }

  ~this() {
    dispose();
  }

  void dispose() {
    if (f !is null) {
      f.close;
      f = null;
    }
  }

  // Logging methods here
}

LogFile log = new LogFile();
try {
  // Do some logging
} finally {
  log.dispose;
}

We don’t have to use the dispose pattern, though, because in D we have the luxury of choosing between implicit and explicit memory management. We can either free the object ourselves with the delete operator, or leave it for the GC to dispose of later.

LogFile log = new LogFile();
try {
  // Do some logging
} finally {
  delete log; // Explicit memory management
}

D has another construct which is very convenient when you need to control the lifetime of an object. With the scope attribute, an object is automatically destroyed when the program leaves the scope in which the object was created.

void some_func() {
  scope LogFile log = new LogFile();
  // …
  // log will be freed on exit
}

The ability to mix explicit and implicit memory management is a simple yet great feature. It is one of many examples where D provides us with the best of two worlds. Convenience and control, as well as productivity and performance, are blended in a way that has no equivalent in any other language I know.

Cheers!

Categories: D Programming Language Tags:

Loop Abstractions in D revisited

January 31st, 2008 4 comments

In my previous post on Loop Abstractions in D I showed how we can make loop constructs abstract, similar to what is common in Ruby. The example I used as a model was the retryable method from Cheah Chu Yeow. His version is customizable in a way that lets you define the type of exception that triggers a retry.

retryable(:tries => 5, :on => OpenURI::HTTPError) do
  # retryable code goes here
end

To mimic that in D we had to use templates, which are invoked with a special syntax.

retryable!(HTTPError)({
  // Retryable code goes here
}, 5);

To be honest, I don’t like the template syntax. I don’t know why; it just doesn’t feel right. If possible, I’d much prefer more native-looking code. Maybe something like this:

retryable({
  // Retryable code goes here
}, 5, HTTPError);

Christopher Wright points out an implementation that would be the closest one could get to a signature like that. He uses the somewhat primitive RTTI in D.

void retryable(ClassInfo info, int times, 
  void delegate() totry)
{
  for (int i = 0; i < times; i++) {
    try {
      totry();
      break;
    } catch (Exception e) {
      if (e.classinfo is info) continue; else throw e;
    }
  }
}

Which could be invoked with the following code.

retryable(HTTPError.classinfo, 5, {
  // Retryable code goes here
});

The problem with this approach, as pointed out by Jarret Billingsley, is that this implementation wouldn’t catch and retry on exceptions from derived classes (descendants of HTTPError in the above example). Fortunately, Jarret provides us with a solution.

What you have to do then is perform a dynamic cast. There’s no syntax for this, but you can
hack around with some of the runtime functions to do the same thing. Place:

extern(C) Object _d_dynamic_cast(Object o, ClassInfo c);

somewhere in your module, then in the catch clause of retryable:

catch(Exception e)
{
  if(_d_dynamic_cast(e, info))
    continue;
  else
    throw e;
}

That _should_ work. It’s doing the same thing as the cast operator would
but without the pretty syntax.

Not pretty, but it works, at least if you use one of the standard libraries, Tango or Phobos. I’m not sure it’s better than the template version though. The .classinfo property brings nearly as much noise as the template version does. Also, the template version has the advantage that it is resolved at compile-time.
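
For completeness, here is Christopher’s version with Jarret’s fix folded in – just a sketch, assembled from the snippets above:

extern(C) Object _d_dynamic_cast(Object o, ClassInfo c);

void retryable(ClassInfo info, int times,
  void delegate() totry)
{
  for (int i = 0; i < times; i++) {
    try {
      totry();
      break;
    } catch (Exception e) {
      // _d_dynamic_cast returns null unless e is an
      // instance of info or one of its descendants.
      if (_d_dynamic_cast(e, info))
        continue;
      else
        throw e;
    }
  }
}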

I think I’ll go with templates after all. Who knows, I might even get used to them one day.

Cheers! 🙂

Categories: D Programming Language Tags:

The Most Essential Development Problem

January 25th, 2008 6 comments

Paul W. Homer has a post up discussing essential development problems. While this is a huge subject (somewhat reflected by the size of Paul’s article), I’d like to emphasize one of the things he touches on: the lack of understanding of the users’ problem domain.

The biggest problem in most systems today is the total failure of the programmers to have any empathy for their users.

I have a similar perception, and I have many times been guilty as charged, although I’ve come to realize that I’d better take the Customer View if I want to succeed. Still, I fail from time to time in the analysis of the users’ situation. I don’t know why this keeps happening, but I think impatience and a strong wish to “get going” are partly to blame.

One part of the problem could also be how many of us look upon system development. We usually decide on technology at an early stage, often on grounds that matter little to the end users. Things like personal preference and hype usually have a big impact on the choices we make. We then build the system pretty much bottom-up, adapting the business process to fit the technology. I’ve seen this unspoken philosophy many times in many places, and the results are usually poor.

The cure is a change of paradigm. We need to start with the users, identify and help develop their processes before we build systems that support them. We need to realize that the system is not the solution; it’s just a part of it.

Another part of the problem is a question of attitude. We need to accept that we’re actually in the support business. Our job is to make things easier for others, not to take the easy way out at the users’ expense, as Paul also points out.

“Use this because I think it works, and I can do it”, produces an arrogant array of badly behaving, but pragmatically convenient code. That of course is backwards, if the code needs to be ‘hard’ to make the user experience ‘simple’ then that is the actual work that needs to be done. Making the code ‘simple’, and dismissing the complaints from the users, is a cop-out to put it mildly.

Of course, there is a trade-off between what users want and what we can provide, but at least we need to start at the right end.

Cheers!

Categories: software development Tags:

New looks and stuff

January 23rd, 2008 4 comments

I haven’t been happy with the way code snippets look on my blog. Yesterday I decided to do something about it but, which is typical of me, the changes turned out far more extensive than I had planned, including a WordPress upgrade and a new theme.

After a couple of minor layout and style sheet changes, I was satisfied with the new look – in Firefox, that is. When I tested it in Internet Explorer 6.0, I noticed that the content of the sidebar got displaced whenever there were code snippets in a post.

Display Bug in IE 6

With a little investigation, I discovered that the trigger of this annoying bug was the following CSS rule, or more precisely, its padding property.

code{
  font:1.2em 'Courier New',Courier,Fixed;
  display:block;
  overflow:auto;
  text-align:left;
  margin: 10px 0 10px 0;
  background: #FBF5DF;
  border-top: solid 1px #EDE0B3;
  border-bottom: solid 1px #EDE0B3;
  padding: 5px 10px 5px 10px;
}

When I removed the right-hand side padding, the problem went away.

code{
  font:1.2em 'Courier New',Courier,Fixed;
  display:block;
  overflow:auto;
  text-align:left;
  margin: 10px 0 10px 0;
  background: #FBF5DF;
  border-top: solid 1px #EDE0B3;
  border-bottom: solid 1px #EDE0B3;
  padding-left: 5px;
  padding-top: 10px;
  padding-bottom: 10px;
}

One may wonder how an internal spacing property like padding can affect the positioning of other objects, but then again, I’m not surprised. I can see why Internet Explorer is not the favorite browser among web designers, or among you, for that matter.

Anyway, I hope you like the new look of my blog; if you don’t, please let me know.

Cheers!

Categories: blogging, web-design Tags:

Loop Abstractions in D

January 17th, 2008 8 comments

One of the great things about Ruby is the natural way in which you can hide looping constructs behind descriptive names, like the retryable example that Cheah Chu Yeow gives on his blog.

retryable(:tries => 5, :on => OpenURI::HTTPError) do
  open('http://example.com/flaky_api')
end

Notice how elegantly the loop logic is abstracted; there’s no need to look at the implementation of retryable to figure out what it does. The question is, can we do something similar in D? It turns out that with features like delegates and function literals we can actually get pretty close.

bool retryable(int tries, void delegate() dg)
{
  for(int i = tries; i > 0; i--)
  {
    try
    {
      dg();
      return true;
    }
    catch
    {
      // Retry
    }
  }
  return false;
}

Which can be used like this:

retryable(5, {
  open("http://example.com/flaky_api");
});

Not as nice as with Ruby, but almost.

The custom exception of the Ruby version is a tricky one to implement in D. Templates to our rescue.

bool retryable(E)(int tries, void delegate() dg)
{
  for(int i = tries; i > 0; i--)
  {
    try
    {
      dg();
      return true;
    }
    catch (E)
    {
      // Retry
    }
  }
  return false;
}

With the (slightly odd) template syntax, we can then make retryable retry only when, for example, a StdioException is thrown.

retryable!(StdioException)(5, {
  open("http://example.com/flaky_api");
});

To clean it up a bit, we can add some defaults (which requires us to swap the order of the parameters).

bool retryable(E = Exception)(void delegate() dg, int tries = 5)
{
  for(int i = tries; i > 0; i--)
  {
    try
    {
      dg();
      return true;
    }
    catch (E)
    {
      // Retry
    }
  }
  return false;
}

That gives us a little more freedom when utilizing retryable.

retryable({
  // Retry up to 5 times
});

retryable({
  // Retry up to 10 times
}, 10);

retryable!(StdioException)({
  // Retry up to three times
  // on StdioException failures
}, 3);

I totally agree with Cheah Chu that Ruby is nice, but I think D is pretty cool too.

Cheers!

Adaptive Database Optimizations?

January 9th, 2008 2 comments

I love Rails Migrations. Not only do they help make database development part of an agile development process, they also make life easier for a developer like me, with shallow knowledge in the field of database programming. But even with frameworks like these, I think we’re still dealing with databases on a very low level.

Conceptually, databases are very simple objects. You can store data and you can retrieve data. The complexity should go no further than to declare and organize the data into relational units. Making queries should be a simple matter of getting the right data.

Reality is different. We need to consider performance, create and drop indexes, write stored procedures, and configure the database for optimal performance on our specific data and specific usage; we need to write queries that not only get us the right data, but get it efficiently.
To get the most out of our databases we therefore need deep knowledge of the database engine and of how it’s being used by our system. For the latter part, all we can do is make a good guess based on tests and supervision. In all the projects I’ve been part of so far, tuning databases has been a hefty task, hard to get right.

If we allow ourselves to think outside the box, does the tuning task really have to be this difficult? We go to great lengths to collect and predict usage data, although one component has access to the most accurate data at all times: the database engine itself. It should be able to figure out the best tuning actions, and perform them at run-time.

Java’s HotSpot virtual machine is a great example of a self-tuning technique called adaptive optimization. HotSpot starts out as a simple interpreter, but performs dynamic recompilation on portions of the bytecode that are heavily used. An adaptive optimizer can theoretically perform better than a pre-compiled program, since it can make optimizations based on actual usage and local conditions.

Now, wouldn’t it be possible to create a self-tuning database engine as well?
As I said, I’m not a database expert, and I appreciate the complexity involved, but on a conceptual level there are no real obstacles that I can see. I see no reason why it couldn’t be done. Can you?
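
To make the idea a bit more concrete, here is a toy sketch in D of the kind of HotSpot-style feedback loop I imagine running inside an engine. Every name in it is hypothetical; it illustrates the principle, nothing more.

// Purely illustrative: the engine profiles its own queries and
// acts on the statistics, much like HotSpot recompiles hot code.
struct QueryStats {
  char[] table;
  char[] column;
  int fullScans; // times this column forced a full table scan
}

// Hypothetical engine-internal call.
void createIndex(char[] table, char[] column) { /* … */ }

void autoTune(QueryStats[] hotSpots) {
  foreach (QueryStats q; hotSpots) {
    // A column that keeps forcing full scans is a
    // candidate for an automatically created index.
    if (q.fullScans > 1000)
      createIndex(q.table, q.column);
  }
}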

Cheers!

Categories: databases, programming, tuning Tags:

Did I say don’t unit-test GUIs?

January 3rd, 2008 1 comment

Isn’t life funny? Two weeks ago I stated my opinion that unit-testing graphical user interfaces isn’t worth the trouble. Now I find myself doing it, writing unit-tests for GUI components.

What happened – did I come to my senses or go crazy (pick the expression that fits your point of view) during the Christmas holidays? No, I still think that unit-testing user interfaces is most likely a waste of time, but on some special occasions, like this one, it will pay off.

My task is to make changes to a tool box control in a legacy application. The control is full of application logic, and it has a strong and complicated relationship with an edit area control. There are two things I need to do:
First, I have to pull the application logic out of the tool box and break the tight relationship with the edit area. Then I need to push that application logic deeper into the application core, so that the tool box can be used via the plug-in framework.

I figured I had two options: I could either refactor the tool box and make the changes in small steps, or I could rebuild the complete tool box logic and then change the controls to use the new programming interface.
I found the rebuild alternative too risky. The code base is full of booby traps, and I have to be very careful not to introduce bugs. Therefore I decided to go with refactoring.

But refactoring requires a unit-testing harness, which of course this application doesn’t have. Trust me; you don’t want to refactor without extensive unit-testing, so here I am, setting up the tests around the involved GUI controls. It’s a lot of work, especially since I don’t have a decent framework for creating mock objects, but it’ll be worth the effort once I start messing around with the code.

As a conclusion, I’d like to refine the “Don’t unit-test GUIs” statement to read: “Don’t unit-test GUIs unless you’re refactoring them, and there’s no easier way to make sure your changes don’t break anything.”

Cheers!