Archive

Archive for the ‘C#’ Category

Generic Types are Litter

July 28th, 2010

The short version of this post:

Do not spread generic types around. They are ugly and primitive.

Allow me to elaborate.

Generics are great

Before we got Generics (2004 in Java, 2005 in C#, and 2009 in Delphi), the way to enforce type safety on a collection type was to encapsulate it and use delegation.

class OrderList
{
private ArrayList items = new ArrayList();

public void Add(OrderItem item)
{
items.Add(item);
}

// Plus lots of more delegation code
// to implement Remove, foreach, etc.
}

Nowadays, we can construct our own type-safe collection types with a single line of code.

List<OrderItem> orders = new List<OrderItem>();

That alone is good reason to love generics, and therein lies the problem: many developers have come to love their generics a little too much.

Generics are ugly

The expressiveness of generics comes at a price, the price of syntactic noise. That’s natural. We must specify the parameterized types somehow, and the current syntax for doing so is probably as good as anything else.

Still, my feeling when I look at code with generics is that the constructed types don’t harmonize with the rest of the language. They kind of stand out, and are slightly negative for code readability.

OrderList orders = new OrderList();
// as compared to
List<OrderItem> orders = new List<OrderItem>();

This might not be so bad, but when we start to pass the constructed types around they become more and more like litter.

List<OrderItem> FetchOrders()
{
//…
}

double CalculateTotalSum(List<OrderItem> orders)
{
//…
}

void PrintOrders(List<OrderItem> orders)
{
//…
}

Besides being noisy, the constructed types expose implementation decisions. In our case the list of orders is implemented as a straight list. What if we found out that using a hash table implementation for performance reasons would be better? Then we’d have to change the declaration in all those places.

Dictionary<OrderItem, OrderItem> FetchOrders()
{
//…
}

double CalculateTotalSum(Dictionary<OrderItem, OrderItem> orders)
{
//…
}

void PrintOrders(Dictionary<OrderItem, OrderItem> orders)
{
//…
}

Comments redundant.

Generics are primitive

A related problem is that the constructed generic types encourage design that is more procedural than object-oriented. As an example, consider the CalculateTotalSum method again.

double CalculateTotalSum(List<OrderItem> orders)
{
//…
}

Clearly, this method belongs in the List<OrderItem> type. Ideally, we should be able to invoke it using the dot operator.

orders.TotalSum();
// instead of
CalculateTotalSum(orders);

But we cannot make that refactoring. We cannot add a TotalSum method to the List<OrderItem> type. (Well, maybe we could with an extension method, but I wouldn’t go there.) In that sense, constructed generic types are primitive types, closed and out of our control.
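For the curious, the extension-method route would look roughly like this sketch. Both the class name and the Price property on OrderItem are made up for illustration.

using System.Collections.Generic;

// Hypothetical sketch of the extension-method alternative mentioned above.
// Assumes OrderItem exposes a Price property.
static class OrderListExtensions
{
  public static double TotalSum(this List<OrderItem> orders)
  {
    double sum = 0;
    foreach (OrderItem order in orders)
      sum += order.Price;
    return sum;
  }
}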

So we should hide them

Don’t get me wrong. I use generics a lot. But for the previously given reasons I do my best to hide them. I do that using the same encapsulation technique as before, or – at the very least – by inheriting the constructed types.

class OrderList : List<OrderItem>
{
  public double TotalSum()
  {
    //…
  }
}
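And the encapsulation variant, fleshed out slightly as a sketch. Again, the Price property on OrderItem is an assumption of mine, not part of the original example.

using System.Collections;
using System.Collections.Generic;

// Sketch: wrap the constructed type instead of inheriting from it.
class OrderCollection : IEnumerable<OrderItem>
{
  private readonly List<OrderItem> items = new List<OrderItem>();

  public void Add(OrderItem item)
  {
    items.Add(item);
  }

  public double TotalSum()
  {
    double sum = 0;
    foreach (OrderItem item in items)
      sum += item.Price; // assumes OrderItem has a Price property
    return sum;
  }

  public IEnumerator<OrderItem> GetEnumerator()
  {
    return items.GetEnumerator();
  }

  IEnumerator IEnumerable.GetEnumerator()
  {
    return GetEnumerator();
  }
}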

So, the rule I go by to mitigate the downsides of generics is this:

Constructed types should always stay encapsulated. Never let them leak through any abstraction layers, be it methods, interfaces or classes.

That is my view on Generics. What is yours?

Cheers!

P.S.

Just to be clear, the beef I have with Generics is with constructed types alone. I have nothing against using open generic types in public interfaces, although that is more likely to be a tool of library and framework developers.
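To illustrate the distinction, an open generic type in a public interface looks something like this sketch (the names are hypothetical):

using System.Collections.Generic;

// The type parameter T is left open; callers close it with their own types.
public interface IRepository<T>
{
  void Add(T entity);
  IEnumerable<T> FindAll();
}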

Tools of the Effective Developer: Regular Expressions

May 12th, 2010

Whenever I suggest using regular expressions to solve a string parsing problem, more often than not I’m met with skepticism and frowning faces. Regular expressions have a bad reputation among many of my fellow developers.

(Yes, they are mostly Windows developers; *nix users don’t seem to have this problem.)

But can you blame them? I mean, have a look at this regular expression for validating email addresses.

[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*@(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?

That’s enough to send any normally functioning individual out the door screaming. Fortunately, regular expressions aren’t always this messy. In fact, the simpler the problem, the more effective and beautiful they get.

Let’s look at an example that shows the power and density of regular expressions.
Suppose we have a chunk of text. We know that somewhere in it there’s a social security number. Our job is to extract it.

This is one of those tasks that involve a decent amount of code unless your language of choice has support for regular expressions. In Ruby, on the other hand, it’s a single line of code.


text = "Test data that 123-45-6789 contains a social security number."

if text =~ /\d\d\d-\d\d-\d\d\d\d/
  puts $~
else
  puts "No match"
end

If the match operator (=~) and the magical match result variable ($~) put you off, here’s how it’s done in C#, which doesn’t have special notation for regular expressions but supports them through the .NET Framework.


using System;
using System.Text.RegularExpressions;

String text = "Test data that 123-45-6789 contains a social security number.";

Regex ssnReg = new Regex(@"\d\d\d-\d\d-\d\d\d\d");
Match match = ssnReg.Match(text);

if (match.Success) {
  Console.WriteLine(match);
} else {
  Console.WriteLine("No match");
}

A beautiful thing about regular expressions is that it’s really simple to extract parts of a match. For instance, if we need to extract the area number, all we need to do is put parentheses around the part we’re interested in. Then we can easily extract that information, in C# by using the match.Groups property.


using System;
using System.Text.RegularExpressions;

String text = "Test data that 123-45-6789 contains a social security number.";

Regex ssnReg = new Regex(@"(\d\d\d)-\d\d-\d\d\d\d");
Match match = ssnReg.Match(text);

if (match.Success) {
  Console.WriteLine(match);
  Console.WriteLine("Area Number: " + match.Groups[1]);
} else {
  Console.WriteLine("No match");
}

As intimidating as they may seem, the payoff of using regular expressions is huge. And, since efficiency is what we strive for, they have a natural place in our bag of tools. So, learn the basics of regular expressions. You’ll be happy you did.

Finally, some advice sprung from my own experience with regular expressions.

• Get your regular expressions right first. Use Rubular, or an equivalent tool, for pain-free experimentation before you implement them in your code.
• Document your regular expressions. They are notoriously difficult to read, so a short description with example match data is almost always a good idea; one way to do that in C# is shown in the sketch below.
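Here is a sketch of what I mean, using .NET’s RegexOptions.IgnorePatternWhitespace so the description and example data can live right next to the pattern:

using System;
using System.Text.RegularExpressions;

// With IgnorePatternWhitespace the pattern may contain whitespace and
// # comments, which makes it largely self-documenting.
// Example match: "123-45-6789"
Regex ssn = new Regex(@"
  (\d{3})   # area number
  -
  (\d{2})   # group number
  -
  (\d{4})   # serial number
  ", RegexOptions.IgnorePatternWhitespace);

Match match = ssn.Match("Test data that 123-45-6789 contains a social security number.");
Console.WriteLine(match.Success ? match.Value : "No match");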

Cheers!

Previous posts in the Tools of The Effective Developer series:

  1. Tools of The Effective Developer: Personal Logs
  2. Tools of The Effective Developer: Personal Planning
  3. Tools of The Effective Developer: Programming By Intention
  4. Tools of The Effective Developer: Customer View
  5. Tools of The Effective Developer: Fail Fast!
  6. Tools of The Effective Developer: Make It Work – First!
  7. Tools of The Effective Developer: Whetstones
  8. Tools of The Effective Developer: Rule of Three
  9. Tools of The Effective Developer: Touch Typing
  10. Tools of The Effective Developer: Error Handling Infrastructure
Categories: C#, ruby, software development

Tools of the Effective Developer: Error Handling Infrastructure

September 3rd, 2009

I often come across code with little or no infrastructure for error handling. This is a big mistake, one that’ll make you pay increasingly as the code base grows. Why? Because your code will end up with loads of this:

try
{
  ParseText(SomeText);
}
catch (Exception e)
{
  MessageBox.Show("Error while parsing.", "My Application",
         MessageBoxButtons.OK, MessageBoxIcon.Error);

  // Skipping the rest due to the error
  return;
}

It may look OK but there are several problems with the error handling code above, some more severe than others.

  • Code duplication
    A message box for displaying error messages to the user should almost always have an error icon, the application name as the title, and an OK button. Repeating this all over the code base is not very DRY and should be avoided.
  • Inconsistency
    A related problem is that when you leave programmers with only the general error handling routines of the platform, chances are they’ll end up doing things differently, resulting in an inconsistent user experience, for example showing some error dialogs with an error icon and some without.
  • Information loss
    A more serious problem is the fact that the code in the example, by ignoring the exception object, drops information that could be important for tracking down a bug. We could of course mitigate that problem somewhat by showing the error message of the exception to the user (by concatenating e.Message), but we’d still miss an important piece of information: the stack trace.
    Since dumping a pile of function calls in the face of the users is a no-no, what we really need is a way to put the details in a non-intrusive place, for example a log file. If the Error Handling Infrastructure doesn’t make this easy, we’re likely to leave that kind of information out completely. For shame.
  • Not automation-friendly
    But the most severe problem with spread-out usage of UI-displaying methods like MessageBox is that it makes your code impossible to automate. If someone has to monitor a scheduled run of, for example, unit tests, and every now and then click an OK button, it rather defeats the purpose of automation.

So here’s my advice: Implement a strategy for handling errors at the earliest possible time. That is, right after setting up continuous integration for the Hello world application.

Properties of an Error Handling Infrastructure

So what should an error handling infrastructure be like? There’s only one rule: it has to be easy! If it isn’t simple to use, it won’t be used, and programmers will end up using the general-purpose message-showing methods again, or worse, not doing any error handling at all.

I usually build my error handling interfaces around two use cases.

  1. Displaying error messages
  2. Logging error messages

These are often used in combination, for example:

try
{
  ParseText(SomeText);
}
catch (Exception e)
{
  ApplicationEnvironment.ShowErrorMessage("Error while parsing: " + e.Message);
  ApplicationEnvironment.LogError(e);
  return;
}

Or, to avoid code duplication, combined into a single convenient method.

try
{
  ParseText(SomeText);
}
catch (Exception e)
{
  ApplicationEnvironment.HandleException("Error while parsing", e);
  return;
}
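To make this concrete, here is a minimal sketch of what such an ApplicationEnvironment facade could look like. The member names match the examples above; the implementation details (a message box and a plain text log file) are just one possible choice.

using System;
using System.IO;
using System.Windows.Forms;

// Minimal sketch of an error handling facade.
public static class ApplicationEnvironment
{
  public static string LogFilePath = "application.log";

  public static void ShowErrorMessage(string message)
  {
    MessageBox.Show(message, Application.ProductName,
        MessageBoxButtons.OK, MessageBoxIcon.Error);
  }

  public static void LogError(Exception e)
  {
    // The full exception, including the stack trace, goes to the log.
    File.AppendAllText(LogFilePath,
        DateTime.Now + ": " + e + Environment.NewLine);
  }

  // Convenience method combining the two use cases.
  public static void HandleException(string message, Exception e)
  {
    ShowErrorMessage(message + ": " + e.Message);
    LogError(e);
  }
}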

When you design your error handling interfaces, don’t be afraid to add lots of convenience methods. Remember, it has to be easy to use, and that’s what convenience methods are all about, as opposed to utility methods, which generally need to be given more arguments. In this case I prefer the Humane interface design style to a minimal interface approach.

Needs to be configurable

Furthermore, the Error Handling Infrastructure needs to be configurable or support some other kind of dependency breaking technique like dependency injection. For instance, you should be able to mute all error messages intended for the user, and instead write them to the error log. This way your code will be able to run in a scripted environment, like during unit testing.
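One way to get there, sketched with hypothetical names: let the facade delegate presentation to an interface, and have the test setup swap in a silent implementation. The ApplicationEnvironment sketch above could then expose a settable IErrorPresenter field.

using System;
using System.Windows.Forms;

// Hypothetical sketch: presentation behind an interface so that a scripted
// environment (like a unit test run) never pops a dialog.
public interface IErrorPresenter
{
  void ShowError(string message);
}

public class MessageBoxPresenter : IErrorPresenter
{
  public void ShowError(string message)
  {
    MessageBox.Show(message, Application.ProductName,
        MessageBoxButtons.OK, MessageBoxIcon.Error);
  }
}

public class SilentPresenter : IErrorPresenter
{
  public void ShowError(string message)
  {
    Console.Error.WriteLine(message); // log it instead of blocking on a dialog
  }
}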

There are many ways to design an Error Handling Infrastructure. You could create your own message dialogs that allow the user to send a report right away, you could use the built-in application log of the operating system or just plain text files, and so on.

Whatever you choose to do, don’t forget to do it early and make it easy.

Cheers!

Previous posts in the Tools of The Effective Developer series:

  1. Tools of The Effective Developer: Personal Logs
  2. Tools of The Effective Developer: Personal Planning
  3. Tools of The Effective Developer: Programming By Intention
  4. Tools of The Effective Developer: Customer View
  5. Tools of The Effective Developer: Fail Fast!
  6. Tools of The Effective Developer: Make It Work – First!
  7. Tools of The Effective Developer: Whetstones
  8. Tools of The Effective Developer: Rule of Three
  9. Tools of The Effective Developer: Touch Typing

Virtual or Non-Virtual by Default, Do We Really Have To Choose?

March 4th, 2008

When it comes to the question of whether methods should be virtual by default or not, there are two schools of thought. Anders Hejlsberg, lead architect of C#, describes them in an interview from 2003.

The academic school of thought says, “Everything should be virtual, because I might want to override it someday.” The pragmatic school of thought, which comes from building real applications that run in the real world, says, “We’ve got to be real careful about what we make virtual.”

As I told you about in my post from last week, I have left the “pragmatic school of thought” to join the “academic” camp. The main reason was unit testing, which – in my opinion – calls for a more flexible object model than that of C#. When unit testing, I often want to use components in unusual ways, all in the name of dependency breaking, and therefore I like their methods to be virtual.

But it wasn’t – and still isn’t – an easy pick; virtual by default does bring some serious problems to the table. Again, from the interview with Anders:

Every time you say virtual in an API, you are creating a call back hook. As an OS or API framework designer, you’ve got to be real careful about that. You don’t want users overriding and hooking at any arbitrary point in an API, because you cannot necessarily make those promises.

Whenever they ship a new version of the Java class libraries, breakage occurs. Whenever they introduce a new method in a base class, if someone in a derived class had a method of that same name, that method is now an override—except if it has a different return type, it no longer compiles. The problem is that Java, and also C++, does not capture the intent of the programmer with respect to virtual.
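To make the contrast concrete, this is roughly what capturing that intent looks like in C#:

class SomeBaseClass
{
  // The base class author explicitly opens the method up for overriding.
  public virtual void SomeMethod() { }
}

class SomeClass : SomeBaseClass
{
  // 'override' states the intent; a misspelled name or a changed signature
  // becomes a compile-time error instead of a silently unrelated method.
  public override void SomeMethod() { }
}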

C# captures the intent better and avoids versioning problems, and Java offers the flexibility needed for unit testing. Which is better? The answer seems to be: it depends. But do we really have to choose? Why can’t we have both? Well, I think we can, and Java has shown the way to achieve it.

Java annotations are a powerful language feature. (The concept was rightfully stolen from C#, where it is called custom attributes.) With annotations one can attach metadata to parts of a program, to give additional information to the compiler or other external agents. In other words, annotations can be used to extend the programming language without the need to change the language itself.

A good example is the @Override annotation.

class SomeClass extends SomeBaseClass {

  @Override
  void someMethod() { … }

}

From the Java documentation:

[@Override indicates] that a method declaration is intended to override a method declaration in a superclass. If a method is annotated with this annotation type but does not override a superclass method, compilers are required to generate an error message.

The use of the @Override annotation takes care of the problem where name changes to virtual methods silently break the behavior of descending classes, or – which is more common – where misspelled names of intended overrides aren’t caught at compile time. Now, with the introduction of the @Override annotation, you can help the compiler help you fail fast. It is now possible to show your intention better than was possible in the pre-annotation days.

Unfortunately, Java doesn’t take this concept all the way. They could have introduced a @virtual annotation as a complement to @Override, and thus reached the same level of expressiveness as C#, but without forgoing the flexibility of the Java object model. It would be the perfect middle way, and provide the best of both worlds.

class SomeBaseClass {

  @virtual
  void someMethod() { … }

}

The benefit of an annotation (or custom attribute) based solution is that it’s configurable. It would be possible to alter the compiler’s behavior based on context or environment. For instance, one could enforce the use of @virtual and @Override in production code. Additionally, one could relax the controls when necessary, like in test projects or legacy code, to mere warnings or complete silence.

Wouldn’t that be better than the all-or-nothing solutions of today?

Cheers!

Categories: C#, java, programming

D doesn’t have real closures

September 11th, 2007

Note: This article was written before D got full closure support.

Delegates are something that gives me mixed feelings. While I used to embrace them in Delphi and Object Pascal, I now find them difficult to get used to in C#. I guess that is because they introduce a little of the functional programming paradigm into an otherwise object-oriented environment.

Don’t get me wrong, I do find delegates powerful and I use them whenever I see fit. It’s just that they bug me, like a badly-pitched note in an otherwise perfect chord. In that respect I am much more fond of the inner classes of Java.

The D Programming Language, my favorite pet at the moment, has support for both delegates and inner classes, thus providing the best of both worlds. And since D is a multi-paradigm language with support for functional programming, its delegates are justified.

The semantics of D delegates are somewhat different from those of C#. For instance, C# delegates may bind to both static and non-static methods, while D delegates may bind to non-static methods only. But on the other hand, D delegates may also bind to inner functions, which makes them look a lot like closures.

bool retry(int times, bool delegate() dg)
{
  int tries = 0;
  while (tries++ < times) {
    if(dg())
      return true;
  }
  return false;
}

// Example usage (requires: import std.stdio;)
retry(3, {
  writefln("Password: ");
  char[] buf = readln();
  return (buf == "secret\n");
});

In order to be a real closure, a delegate is required to keep variables bound from the surrounding environment alive, even after the program has left their scope. For example, consider the following program.

int delegate() countdown(int from)
{
  int current = from;
  int countdown_helper() {
    return current--;
  }

  return &countdown_helper;
}

int delegate() dg = countdown(10);
dg(); // should return 10;
dg(); // should return 9;
dg(); // should return 8;

The countdown function returns a delegate that refers to the inner function countdown_helper, which binds the variable current from its parent scope by returning its value and decreasing it by one.

So, if D’s delegates were real closures, the first invocation of the delegate in the example above would return 10, the second 9, followed by 8, 7, and so on. But in reality they produce bogus values like 4202601 or 4202588. The reason is this: D doesn’t implement real closures. The variable current is invalid because the program is no longer executing the countdown function when countdown_helper is invoked via the delegate.
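For contrast, here is the same countdown sketched in C#, where the delegate keeps the captured variable alive after the enclosing method has returned, which is exactly what a real closure requires:

using System;

class ClosureDemo
{
  static Func<int> Countdown(int from)
  {
    int current = from;
    // The lambda captures 'current', and the capture survives the return.
    return () => current--;
  }

  static void Main()
  {
    Func<int> dg = Countdown(10);
    Console.WriteLine(dg()); // 10
    Console.WriteLine(dg()); // 9
    Console.WriteLine(dg()); // 8
  }
}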

So D doesn’t have real closures; is that a big deal? No! In my opinion, real closures are not important for languages that are not purely functional. The creators of D shouldn’t waste their resources on this feature, since it brings little value but lots of implementation complexity. The energy is better spent on features like multicast delegates. Those would bring value in the form of convenience, and I hope future versions of D will incorporate them.

Complete versions of the code samples in this post can be found here and here.

Great use of annotations

September 3rd, 2007

C# entered the field of battle some seven years ago as a better version of Java. The one feature that many people referred to when claiming C# superior was custom attributes. Now Java has them too, but they go under the name of annotations.

The ability to attach metadata to various parts of your program is a powerful way to integrate with frameworks or configure external utilities. Mike Kaufman, for example, used annotations beautifully to configure an obfuscator. He has posted the details on his blog.
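On the C# side of the fence, the equivalent is a custom attribute plus a bit of reflection in the external tool. A minimal sketch, with made-up names:

using System;

// Made-up attribute carrying metadata for a hypothetical external tool.
[AttributeUsage(AttributeTargets.Class | AttributeTargets.Method)]
public class ObfuscationHintAttribute : Attribute
{
  public bool Exclude { get; set; }
}

// The metadata is attached declaratively; a tool can read it via reflection.
[ObfuscationHint(Exclude = true)]
public class PublicApiFacade
{
}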

Categories: C#, java, programming