
Archive for the ‘D Programming Language’ Category

Functional D: Is Transitive Const Fundamental?

July 30th, 2008 14 comments

As I’ve mentioned before, a pure functional subset is forming in the D Programming Language. According to the creators of D, transitive const is a key feature to make this work.

The future of programming will be multicore, multithreaded. Languages that make it easy to program them will supplant languages that don’t. Transitive const is key to bringing D into this paradigm. […] C++ cannot be retrofitted to supporting multiprogramming in a manner that makes it accessible. D isn’t there yet, but it will be, and transitive const will be absolutely fundamental to making it work.

[Quote from the D website]

What is transitive const?

Just a quick explanation for those of us who don’t have the academic terms fresh in memory. Transitivity is a property of some binary relations, for example equality:

if A = B and B = C, then A = C

Applied to the concept of const it means, simply put, that anything reachable from a const reference is also const. So, for a declaration const int **p, p is const, as well as *p and **p.
The same is true in the case of composite types:

class A {
  int f;

  void set_f(int a_value) {
    f = a_value;
  }
}

const A a = new A();
a = new A(); // error
a.f = 2; // error
a.set_f(2); // error

All three statements above result in compiler errors because a is const, and anything reachable from it is also const.
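
For readers who want to see the transitivity checked by a compiler, here is a minimal sketch. It assumes a current D2 compiler, and the names Node and readThrough are made up for illustration. The `__traits(compiles, ...)` checks are only there so that the rejected statements can live in a program that still builds.

```d
import std.stdio;

// Hypothetical type for illustration: a value plus a pointer to outside data.
struct Node {
    int value;
    int* extra;
}

// Reads through a const view; writing through it would not compile.
int readThrough(const(Node)* view) {
    static assert(!__traits(compiles, view.value = 2));   // the field is const
    static assert(!__traits(compiles, *view.extra = 2));  // so is the data it points to
    return view.value + *view.extra;
}

void main() {
    int x = 1;
    auto n = Node(10, &x);
    writeln(readThrough(&n)); // 11

    // const restricts only the view; the mutable original can still change.
    *n.extra = 5;
    writeln(readThrough(&n)); // 15
}
```

The second call returns a different value, which is exactly the difference between const (I can’t modify it) and true immutability (nobody can).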

Why does it matter?

So, in what way is transitive const fundamental to concurrent programming? Well, it isn’t. What Walter Bright and his companions refer to is the fact that pure functional programming is thread-safe by design. That is, in a pure functional language the result of a function is solely dependent on its arguments. Thus, in code like this:

val = some_func( a(), b(), c() );

functions a(), b() and c() can be safely executed concurrently on a multi-core architecture; nothing a(), b() or c() does can affect the others’ results. This is not the case for imperative languages, which build on the notion of mutable and global state. With mutable state come hidden side effects (a(), b() or c() could change common data and cause race conditions), and those are what make multithreaded programming so complicated.
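
To make the idea concrete, here is a rough sketch of evaluating the arguments on separate threads. It assumes a current D2 compiler with std.parallelism available; partA, partB, partC and someFunc are hypothetical stand-ins for a(), b(), c() and some_func(). Note that the threading is spelled out by hand here, it is not something the compiler does for us.

```d
import std.parallelism : task;
import std.stdio;

// Hypothetical pure functions standing in for a(), b() and c().
int partA() pure { return 1; }
int partB() pure { return 2; }
int partC() pure { return 3; }

int someFunc(int x, int y, int z) pure { return x + y + z; }

int evaluateConcurrently() {
    auto ta = task!partA();
    auto tb = task!partB();
    ta.executeInNewThread();
    tb.executeInNewThread();
    int c = partC(); // evaluated on the calling thread in the meantime
    // yieldForce waits for a task to finish and hands back its result.
    return someFunc(ta.yieldForce, tb.yieldForce, c);
}

void main() {
    writeln(evaluateConcurrently()); // 6
}
```

Since none of the three functions touches shared state, the order in which they finish cannot change the result.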

What the people behind D are trying to do is to create a pure functional subset within the language. I like to refer to it as Functional D. Such a subset would allow us to write code that is thread-safe by default; all you have to do is to write Functional D code. The compiler would then be able to chisel out the functional code and fully utilize the advantages of functional programming.

Immutable data and pure functions are fundamental

To make this work we need a way to make data immutable and a way to shut down access to the global state. In D you use the invariant keyword to create immutable data. The pure keyword marks functions that may only take invariant arguments, may not access the global state, and may only invoke other pure functions. (As of this writing, the semantics of the pure keyword are not yet implemented.)

int g = 0;

pure int pure_func(invariant int a) {
  a = 0; // error, a is invariant
  g = 1; // error, can't write to global g
  writefln(a); // error, writefln is not pure
  return g + a; // error, can't read global g
}

How does transitive const fit into all of this? To use the intuitive definitions from Andrei Alexandrescu’s slides on the functional subset of D:

  • const(T) x: I can’t modify x or anything reachable from it
  • invariant(T) x: Nobody can modify x or anything reachable from it

Const is not strong enough to be used in the functional subset (which depends on truly immutable data), but it has one application that could be important. From Walter Bright at the D newsgroup:

Const allows you to reuse code that works on invariants to work with
mutables, too.

How usable is const to the functional subset?

Const can be used to write code that works with data from both the imperative (mutable) and the functional (immutable) subsets. For example, the print function below.

void print(const int a) {
  writefln(a);
}

const int a = 1;
print(a); // ok

invariant int b = 2;
print(b); // ok

print(3); // ok

The reason this works is that both invariant and mutable data are implicitly converted to const when necessary. One can question how useful this would be in practice, though: the print function above could not be invoked from the functional subset (which would require it to be pure).

My conclusion is that although it may very well be important, transitive const is not “absolutely fundamental to making it work.” Transitive invariant, on the other hand, is.

Cheers!

The Functional Subset of D

May 20th, 2008 6 comments

As I wrote in an earlier post, the future of D lies in the field of functional programming. More specifically, the creators of D are trying to construct a pure functional subset that can be utilized within the otherwise imperative language.

Let’s take a closer look at that functional subset that is taking form in the experimental 2.0 version of D.

Immutable data

The most fundamental difference between a purely functional language and an imperative one is how they treat data. Many of us are used to thinking of data in terms of state, where variables can be changed through assignments. But in a functional program there is no state. There are only constant values and functions that operate on them.

In D, immutability is achieved with the invariant keyword, either as a storage class:

invariant int i = 3;

or as a type constructor:

invariant(int) i = 3;

Transitive invariance

The side-effect free nature that comes with immutable data has some great advantages. For one thing it simplifies testing since the result of a function only depends on its input. There are also some optimizations that can be done by the compiler, but the biggest advantage is that programs written in this way are thread-safe by default.

To take advantage of these things the compiler needs to be able to trust the immutability of our data. This is where transitivity comes in. In D, invariance is transitive, which basically means that no data reachable from an invariant reference can be changed. Here’s an example.

int ga = 2; // mutable

struct A {
  int *p = &ga; // pointer to mutable
}

invariant(A) a; // a is immutable
A b; // b is mutable

// invariant is transitive
a = b; // ERROR, a is immutable
a.p = null; // ERROR, a.p is immutable
*a.p = 3; // ERROR, data pointed to by a.p is also immutable
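
Here is a minimal sketch of the same idea for anyone trying it on a current D2 compiler, where the invariant keyword has since been renamed immutable. I use nested arrays instead of a struct with a pointer to keep the example short; firstCell is a made-up name.

```d
import std.stdio;

int firstCell(immutable(int[][]) table) {
    static assert(!__traits(compiles, table[0][0] = 9)); // inner elements are immutable
    static assert(!__traits(compiles, table[0] = null)); // so are the rows themselves
    return table[0][0];
}

void main() {
    immutable(int[][]) table = [[1, 2], [3, 4]];
    writeln(firstCell(table)); // 1

    // Mutable data does not convert implicitly: somebody else could still change it.
    int[][] scratch = [[7, 8]];
    static assert(!__traits(compiles, firstCell(scratch)));
}
```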

Garbage collection

Since data must never change in a functional program, and consequently must not be destroyed while it is in use, it’s usually a good idea for a functional language to utilize automatic memory management. Like most functional languages, D features garbage collection (alongside explicit memory management).

Higher order functions

In order to do anything interesting in a purely functional language you need higher order functions, or – in other words – the ability to send functions as arguments to other functions. For this we can use function pointers (or delegates for methods and nested functions).

As an example, let’s say that we want to create a function that calculates the sum of two adjacent Fibonacci numbers. Here’s one way to do that.

int nth_fib(int n) {
  if(n == 0) return 0;
  if(n == 1) return 1;
  return nth_fib(n-1) + nth_fib(n-2);
}

int add_next_fib(int n) {
  return nth_fib(n) + nth_fib(n+1);
}

Now, let’s say that we want to do the same operation on a different sequence, for example natural numbers. Well, we could use the good old copy and paste but that isn’t very DRY. Let’s make add_next a higher order function instead so that it could be used with any sequence function.

int add_next(int n, int function(int) nth) {
  return nth(n) + nth(n+1);
}

int i = add_next(4, &nth_fib);
// i is 8 (3+5)

Now, we can write any sequence function we want and have add_next apply it.

// Sequence function for natural numbers
int nth_nat(int n) {
  return n;
}

int i = add_next(3, &nth_nat);
// i is 7 (3+4)

Note: For methods and nested functions, the keyword function is replaced with the keyword delegate, otherwise it’s the same syntax.
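
A small sketch of the delegate variant, assuming a current D2 compiler. The names addNext, shiftedSum and nthShifted are mine, chosen just for this illustration.

```d
import std.stdio;

// Same shape as add_next, but taking a delegate so that nested functions qualify.
int addNext(int n, int delegate(int) nth) {
    return nth(n) + nth(n + 1);
}

int shiftedSum(int offset) {
    // Nested function: it reads `offset` from the enclosing call,
    // which is why a plain function pointer would not do.
    int nthShifted(int n) { return n + offset; }
    return addNext(3, &nthShifted);
}

void main() {
    writeln(shiftedSum(10)); // 27 (13 + 14)
}
```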

Closures

Closures are another indispensable feature of functional languages. In short, it’s the ability to extract a function pointer for later use, and when invoked the function will still have access to the context in which it was created, even though that context has gone out of scope.

In D, closures are created with the delegate keyword.

int delegate() create_closure() {
  int x = 3;

  int f() {
    return x;
  }

  return &f;
}

int delegate() a_closure = create_closure();
int i = a_closure();
// i is 3

Note that the extracted function f (referenced by the a_closure variable) accesses the local variable x, although it has gone out of scope at the time of execution. D got this ability with version 2.007; before that it didn’t have real closures.
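
That the captured context really lives on is easier to see when the closure changes it. The sketch below assumes a current D2 compiler; makeCounter is a hypothetical name.

```d
import std.stdio;

int delegate() makeCounter() {
    int count = 0; // captured; lives on after makeCounter returns
    return () { return ++count; };
}

void main() {
    auto first = makeCounter();
    auto second = makeCounter();
    writeln(first());  // 1
    writeln(first());  // 2
    writeln(second()); // 1, each closure has its own copy of count
}
```

This one is of course not pure functional code, since the counter mutates its captured state. It only demonstrates the mechanism.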

Currying

Closures provide an easy way to do currying, which is common in functional languages. Simply put, currying is a technique where functions take a general function and return a new, more specialized one.

For instance, we could curry our add_next function in our previous example and create a specialized version of it, say add_next_fib (and thus get back to where we started).

int delegate(int) curry_add_next(int function(int) nth) {
  int curry_f(int n) {
    return add_next(n, nth);
  }
  return &curry_f;
}

int delegate(int) add_next_fib = curry_add_next(&nth_fib);
int i = add_next_fib(4);
// i is still 8

Pure functions

These features are all we need to write purely functional code, but in order to take full advantage of the functional programming paradigm some major things remain unsolved.

For one thing, the compiler needs to know whether or not our code is functional in order to apply possible optimizations. The easiest way to do this is to give the programmer a keyword to tell the compiler she wishes purity, and then have the compiler enforce it. In D this is the purpose of the pure storage class.

pure int a_pure_square_function(invariant(int) x) {
  return x * x;
}

A pure function must not access non-invariant data, and may not invoke non-pure functions. As of D 2.013, the semantics of the pure storage class have not been implemented, so purity is not yet enforced. I sense this is not a trivial matter, so it may take some time before we have it.
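
For readers on a current D2 compiler, where pure is enforced and invariant is spelled immutable, the restrictions can be checked with a sketch like the one below. The names counter, impureNext and square are made up, and the rules that eventually shipped may differ in detail from the design described in this post; the sketch only checks the restrictions it names.

```d
import std.stdio;

int counter = 0;                       // mutable module-level state
int impureNext() { return ++counter; } // not pure: it writes to counter

pure int square(immutable int x) {
    static assert(!__traits(compiles, x = 0)); // x is immutable
    return x * x;
}

void main() {
    // Pure code may neither touch mutable globals nor call impure functions.
    static assert(!__traits(compiles, () pure { counter = 1; }));
    static assert(!__traits(compiles, () pure { return impureNext(); }));
    writeln(square(7)); // 49
}
```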

Cheers!

Categories: D Programming Language, programming Tags:

The Future of D is Functional

April 16th, 2008 20 comments

The D Programming Language has an impressive list of cool features. That is not always a good thing. A feature should contribute to the philosophy and foundation on which the language was built. If it doesn’t, harmony breaks down and the result is a language that is harder to learn and to use.

Some people think D suffers from feature creep. To some degree I can agree. D has features that are unlikely to become widespread, but most of them are aligned towards a common goal. That goal is bringing productivity to the world of low-level programming.

Lately, a new goal has emerged from the D community, and it has triggered some real intense activity. The future of D seems to lie in the field of functional programming, making the imperative and the functional paradigms truly come together.

Why is this important? Let me quote Walter Bright from a discussion at the D forums:

The future of programming will be multicore, multithreaded. Languages that
make it easy to program them will supplant languages that don’t.
[…]
The surge in
use of Haskell and Erlang is evidence of this coming trend (the killer
feature of those languages is they make it easy to do multiprogramming).

As we all know, multithreaded programming in an imperative language is a real pain. It’s complicated and easy to get wrong, but that is not the case in a functional language. A pure functional program is thread-safe by design, since it’s free from mutable state and side effects.

You never have to worry about deadlocks and race conditions because you don’t need to use locks! No piece of data in a functional program is modified twice by the same thread, let alone by two different threads. That means you can easily add threads without ever giving conventional problems that plague concurrency applications a second thought!

Quote from Slava Akhmechet’s excellent article Functional Programming For the Rest of Us.

What the people behind D want to do is to create a true functional subset of the D Programming Language, and create a safe interfacing to parts of the program that are imperative. The functional subset would enforce pure functional programming, like disallowing access to mutable global state and calls to non-pure functions. In effect that would enable you to write parts of your program that need to be thread-safe in a functional style, while using the style of programming that most of us are used to for the rest.

But, it might not stop there. Andrei Alexandrescu, who’s one of the driving forces behind the functional subset, has suggested that the enforcements inside a pure function can be relaxed to allow mutability of what he calls “automatic state,” thus allowing imperative techniques to be used as long as the mutability doesn’t leak across the function barrier. Here’s an example from Andrei’s slides on the subject:

int fun(invariant(Node) n) pure {
  int i = 42;
  if (n.value) ++i;
  int accum = 0;
  for (i = 0; i != n.value; ++i) ++accum;
  return n.value + i;
}

This code doesn’t look a bit functional, but the result of fun() is solely dependent on its arguments, so seen from the outside it’s pure.
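
A simpler sketch of the same idea, assuming a current D2 compiler (where invariant is spelled immutable). The function sumTo is my own example, not taken from Andrei’s slides.

```d
import std.stdio;

// Imperative on the inside, pure from the outside: the loop mutates only
// local ("automatic") state, and nothing leaks across the function barrier.
pure int sumTo(immutable int n) {
    int accum = 0;
    for (int i = 1; i <= n; ++i) accum += i;
    return accum;
}

void main() {
    writeln(sumTo(4)); // 10
}
```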

Mixing pure functional and mainstream imperative programming is certainly bleeding-edge. As far as I know it has never been done in any other language. But even though I’m excited, I can’t help wondering whether this is the right way to go. There’s a risk that D spreads itself too thin, and that the pure functional subset of D will be too noisy; I anticipate a heavy use of the invariant keyword for instance.

Would it be a better option to create a completely new language, say Functional D, and make D and Functional D interoperable on a binary level? At least that would battle the noise by reducing the need for explicitly expressing things like immutability, which can be taken for granted in a functional language. I also suspect that the compilers would be less difficult to implement than a compiler that has to support both styles.

But then again, I don’t know much about making compilers. I just happen to be more of a minimalist when it comes to computer languages. (As opposed to my preferences for APIs and Frameworks.)

Expect more posts on this subject as I plan to delve into details next.

Cheers!

Steve Yegge: D is the Next Cool Language

April 1st, 2008 4 comments

I just watched an interview with Steve Yegge on the Google Code Blog. The interview was mostly about his current project Rhino on Rails. It was interesting, although not interesting enough to keep my attention for the entire 25 minutes. After some time I kind of tuned out, but I kept the video cast running in the background.

Good thing I did, because at the end of the interview Steve surprised me big time. If you’ve followed my blog lately you know that the D Programming Language is my favorite toy at the moment, so when he mentioned it in the interview I was all ears again. For your convenience, I made a transcript of the best parts:

You’re a language geek. What new cool language would you want to play with next?

To be honest I really wish I knew more about D. I’ve programmed in the D language and it’s D-lightful. It was quite expressive. I did a bunch of programs that I kind of ported from ruby scripts that I’ve written, and I swear it was only like 20% harder than in the ruby code. It was super tight, except that it was lightning fast. It was like C++ or faster.

Steve also gives his opinion on what’s needed for D to become really big.

If it [the D language] were link compatible with C++, which is hard, then it could be used with all the C++ libraries. Then it could be the replacement of C++.

In case you’d like to hear and see for yourself, the part when he mentions D starts at about 22 minutes and 35 seconds in.

Steve Yegge as part of the D community? Wouldn’t that be something.

Cheers!

Categories: D Programming Language Tags:

D Update Mitigates Comparison Gotcha

March 11th, 2008 No comments

The D programming language tries to make a clear distinction between comparison by equality and comparison by identity. It does so by offering two different operators, one for each purpose.

// Are a and b equal?
if (a == b) { ... }

// Are a and b the same object?
if (a is b) { ... }
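
A minimal sketch of the difference, assuming a current D2 compiler. Point is a hypothetical class that defines equality by comparing a single field.

```d
import std.stdio;

class Point {
    int x;
    this(int x) { this.x = x; }

    // Equality: two points are equal when their x values match.
    override bool opEquals(Object o) {
        auto rhs = cast(Point) o;
        return rhs !is null && rhs.x == x;
    }
}

void main() {
    auto a = new Point(1);
    auto b = new Point(1);
    writeln(a == b); // true: equal values
    writeln(a is b); // false: two different objects

    Point c = null;
    writeln(c is null); // true: identity is the right test for null
}
```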

As I’ve written about in a previous post this could be a source of confusion for D newbies, like myself. Someone who’s used to the comparison semantics of Java, for instance, is likely to learn to separate equality and identity the hard way. Until now, that is.

This code, where the equality operator is (wrongly) used to compare identities, used to produce a nasty Access violation error at runtime.

SomeObject a = null;
if (a == null) { ... } //<-- Access Violation

Now, with the newly released versions of the D compiler (the stable 1.028 and the experimental 2.012), this error is nicely caught by the compiler, which kindly instructs us to use the identity operator instead.

Error: use 'is' instead of '==' when comparing with null

Unfortunately, it won’t help us discover all cases of wrongly used equality operators. For instance, this piece of code still produces the dreadful runtime AV.

SomeObject a = null;
if (a == a) { ... } // <-- Access Violation

Still, the update is a big improvement and will make it easier for a lot of people while learning the language.

Cheers!

Categories: D Programming Language Tags:

The Open-Closure-Close Idiom in D

February 13th, 2008 5 comments

In a Reddit discussion following my last post on object lifetime management in D, bonzinip wondered why the D closures aren’t used with the open-closure-close idiom that is common, for instance, in Ruby.

bonzinip: languages with closures (Ruby, Smalltalk) use them to ensure that the file is closed when you get out of scope, and that does not require new keywords and does not require to bypass GC and possibly get dangling pointers.

NovaProspekt: D actually has full closures

bonzinip: so why don’t they use them with a “try { open; run closure; } finally { close; }” idiom instead?

Well, let’s take a look at how that could be accomplished in D.

To get an image of the open-closure-close idiom, we can look at Ruby’s File class, and particularly the open method.

IO.open(fd, mode_string="r") => io
IO.open(fd, mode_string="r") {|io| block } => obj

With no associated block, open is a synonym for IO::new. If the optional code block is given, it will be passed io as an argument, and the IO object will automatically be closed when the block terminates. In this instance, IO::open returns the value of the block.

What this means is that a File can be opened, either in the normal way with explicit closing (begin-ensure corresponds to try-finally in D)

f = File.open("c:\\log.txt", "w")
begin
  f.puts "Log this"
ensure
  f.close
end

or with a code block (closure)

File.open("c:\\log.txt", "w") do |f|
  f.puts "Log this"
end

In this last example the file is both opened and closed by the open method, in between the method invokes the associated code block with the file object reference as a parameter (f). In essence, that is the open-closure-close idiom.

How could we do this in D? Well, as I’ve written before, D’s delegates in combination with function literals can be used to mimic the code blocks of Ruby. So, a D version of the File::open method could look something like this:

import std.stdio;
import std.stream;

void open_run_close(char[] fn, FileMode fm,
  void delegate(Stream) closure)
{
  Stream f = new File(fn, fm);
  try {
    closure(f);
  } finally {
    f.close();
  }
}

and be invoked like this:

open_run_close("c:\\log.txt", FileMode.Out, (Stream f) {
  OutputStream os = f;
  os.writefln("Log this");
});
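
On a current D2 compiler, where std.stream is no longer part of Phobos, the same idiom could be sketched with std.stdio.File and a scope(exit) statement instead of try-finally. This is my own adaptation under those assumptions; the name openRunClose and the temporary file name are made up.

```d
import std.file : readText, remove, tempDir;
import std.path : buildPath;
import std.stdio : File, write;

void openRunClose(string path, string mode, void delegate(File) closure) {
    auto f = File(path, mode);
    scope (exit) f.close(); // runs whether closure returns or throws
    closure(f);
}

void main() {
    // Hypothetical file name in the system's temporary directory.
    auto path = buildPath(tempDir(), "open_run_close_demo.txt");
    openRunClose(path, "w", (File f) { f.writeln("Log this"); });
    write(readText(path)); // prints the line that was logged
    remove(path);
}
```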

To make it a little more like the original Ruby version, let’s make it support the non-closure variant as well.

Stream open(char[] fn, FileMode fm = FileMode.In,
void delegate(Stream) closure = null)
{
  Stream f = new File(fn, fm);
  if (!closure) {
    return f;
  } else {
    try {
      closure(f);
      return null;
    } finally {
      f.close();
    }
  }
}

// Example usage
open("c:\\log.txt", FileMode.Out, (Stream f) {
  OutputStream os = f;
  os.writefln("Log this");
});

// Example traditional usage
OutputStream f = open("c:\\log2.txt", FileMode.Out);
try {
  f.writefln("Do log this");
} finally {
  f.close;
}

The open-closure-close idiom is definitely doable in D. It’s not as pretty and natural as in Ruby, but not so far from it either. Whether this particular idiom will be embraced by the D community is yet to be seen.

Cheers!

Categories: D Programming Language Tags:

Managing Object Lifetimes in D

February 7th, 2008 14 comments

The D Programming Language is a modern version of C. It adds productivity features to the performance power of C, features like object oriented programming and garbage collection.

It may seem strange that a language with focus on performance utilizes automatic memory management. A GC equals overhead, right? Well, actually that is a common misconception. These days implicitly memory managed code is generally faster than code where the programmer handles deallocations. One reason for this counter-intuitive fact is that a number of optimizations can be done by the GC, especially on short-lived objects.

But garbage collection isn’t a problem-free feature. For one thing, it is nondeterministic. Normally, an object that uses external resources acquires them in a constructor and releases them in the destructor.

import std.stream;

class LogFile {
  File f;

  this() {
    f = new File("c:\\log.txt", FileMode.Out);
  }

  ~this() {
    f.close;
  }

  // Logging methods go here
}

But with a GC there is no way to know when or even if the destructor is run. This may be a problem if you’re dealing with scarce resources like window handles, or file locks. So how can we control object destructions in a GC capable language? One possibility is to trigger a GC sweep explicitly in your code. In D this is done with the fullCollect function.

import std.gc;

fullCollect();

Explicit triggering of garbage collection is usually a bad idea, for several reasons. For one thing, it’s like killing an ant with a bazooka, and still you risk missing the target. Therefore, the normal approach to deal with the problem of nondeterminism is to use a dispose method.

class LogFile {
  File f;

  this() {
    f = new File("c:\\log.txt", FileMode.Out);
  }

  ~this() {
    dispose();
  }

  void dispose() {
    if (f !is null) {
      f.close;
      f = null;
    }
  }

  // Logging methods here
}

LogFile log = new LogFile();
try {
  // Do some logging
} finally {
  log.dispose;
}
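
The try-finally block can also be written with D’s scope(exit) statement, which keeps the cleanup next to the acquisition. A minimal sketch, using a hypothetical Resource class as a stand-in for LogFile so that it runs without touching the file system:

```d
import std.stdio;

// Hypothetical stand-in for LogFile.
class Resource {
    bool open = true;
    void dispose() { open = false; }
}

bool stillOpenAfterUse(Resource r) {
    {
        scope (exit) r.dispose(); // registered here, run when the block exits
        // ... use r ...
    }
    return r.open;
}

void main() {
    writeln(stillOpenAfterUse(new Resource)); // false
}
```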

We don’t have to use the dispose pattern though because in D we have the luxury of choosing between implicit and explicit memory management. We can either free the object ourselves with the delete operator, or leave it for the GC to dispose later.

LogFile log = new LogFile();
try {
  // Do some logging
} finally {
  delete log; // Explicit memory management
}

D has another construct which is very convenient when you need to control the lifetime of an object. With the scope attribute, an object is automatically destroyed when the program leaves the scope in which the object was created.

void some_func() {
  scope LogFile log = new LogFile();
  // ...
  // log will be freed on exit
}

The ability to mix explicit and implicit memory management is a simple, yet great feature. It is one of many examples where D provides us with the best of two worlds. Convenience and control, as well as productivity and performance, are blended in a way that has no equivalent in any other language I know.

Cheers!

Categories: D Programming Language Tags:

Loop Abstractions in D revisited

January 31st, 2008 4 comments

In my previous post on Loop Abstractions in D I showed you how we could make loop constructs abstract, in a way similar to what is common in Ruby. The example I used as a model was the retryable method from Cheah Chu Yeow. His version is customizable in a way that lets you define the type of exception that triggers a retry.

retryable(:tries => 5, :on => OpenURI::HTTPError) do
  # retryable code goes here
end

To mimic that in D we had to use templates, which are invoked with a special syntax.

retryable!(HTTPError)({
  // Retryable code goes here
}, 5);

To be honest, I don’t like the template syntax. I don’t know why, it just doesn’t feel right. If possible, I’d much prefer more native-looking code. Maybe something like this:

retryable({
  // Retryable code goes here
}, 5, HTTPError);

Christopher Wright points out an implementation that would be the closest one could get to a signature like that. He uses the somewhat primitive RTTI in D.

void retryable(ClassInfo info, int times, 
  void delegate() totry)
{
  for (int i = 0; i < times; i++) {
    try {
      totry();
      break;
    } catch (Exception e) {
      if (e.classinfo is info) continue; else throw e;
    }
  }
}

Which could be invoked with the following code.

retryable(HTTPError.classinfo, 5, {
  // Retryable code goes here
});

The problem with this approach, which was pointed out by Jarret Billingsley, is that this implementation wouldn’t catch and retry on exceptions from derived classes (descendants to HTTPError in the above example). Fortunately, Jarret provides us with a solution.

What you have to do then is perform a dynamic cast. There’s no syntax for this, but you can
hack around with some of the runtime functions to do the same thing. Place:

extern(C) Object _d_dynamic_cast(Object o, ClassInfo c);

somewhere in your module, then in the catch clause of retryable:

catch(Exception e)
{
  if(_d_dynamic_cast(e, info))
    continue;
  else
    throw e;
}

That _should_ work. It’s doing the same thing as the cast operator would
but without the pretty syntax.

Not pretty, but it works, at least if you use one of the standard libraries: Tango or Phobos. I’m not sure it’s better than the template version though. The .classinfo property brings nearly as much noise as the template version does. Also, the template version has the advantage that it is resolved at compile-time.

I think I’ll go with templates after all. Who knows, I might even get used to them one day.

Cheers! 🙂

Categories: D Programming Language Tags:

Loop Abstractions in D

January 17th, 2008 8 comments

One of the great things with Ruby is the natural way in which you can hide looping constructs behind descriptive names. Like the retryable example that Cheah Chu Yeow gives on his blog.

retryable(:tries => 5, :on => OpenURI::HTTPError) do
  open('http://example.com/flaky_api')
end

Notice how elegantly the loop logic is abstracted; there’s no need to look at the implementation of retryable to figure out what it does. The question is, can we do something similar with D as well? It turns out that with features like delegates and function literals we can actually get pretty close.

bool retryable(int tries, void delegate() dg)
{
  for(int i = tries; i > 0; i--)
  {
    try
    {
      dg();
      return true;
    }
    catch
    {
      // Retry
    }
  }
  return false;
}

Which can be used like this:

retryable(5, {
  open("http://example.com/flaky_api");
}) ;

Not as nice as with Ruby, but almost.

The custom exception of the Ruby version is a tricky one to implement in D. Templates to our rescue.

bool retryable(E)(int tries, void delegate() dg)
{
  for(int i = tries; i > 0; i--)
  {
    try
    {
      dg();
      return true;
    }
    catch (E)
    {
      // Retry
    }
  }
  return false;
}

With the (little bit odd) template syntax, we can then make retryable retry only when, for example, StdioExceptions are thrown.

retryable!(StdioException)(5, {
  open("http://example.com/flaky_api");
}) ;

To clean it up a bit, we can add some defaults (which requires us to switch places between the parameters).

bool retryable(E = Exception)(void delegate() dg, int tries = 5)
{
  for(int i = tries; i > 0; i--)
  {
    try
    {
      dg();
      return true;
    }
    catch (E)
    {
      // Retry
    }
  }
  return false;
}

That gives us a little more freedom when utilizing retryable.

retryable({
  // Retry up to 5 times
});

retryable({
  // Retry up to 10 times
}, 10);

retryable!(StdioException)({
  // Retry up to three times
  // on StdioException failures
}, 3);
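
To watch the retries actually happen, here is a small self-contained sketch for a current D2 compiler. FlakyError and attemptsUntilSuccess are hypothetical names; the delegate fails a given number of times before it succeeds.

```d
import std.stdio;

// Hypothetical exception type for the demonstration.
class FlakyError : Exception {
    this(string msg) { super(msg); }
}

bool retryable(E = Exception)(void delegate() dg, int tries = 5) {
    foreach (attempt; 0 .. tries) {
        try {
            dg();
            return true;
        } catch (E) {
            // swallow the exception and retry
        }
    }
    return false;
}

// Returns how many calls it took to succeed, or -1 if we ran out of tries.
int attemptsUntilSuccess(int failures, int tries) {
    int calls = 0;
    bool ok = retryable!(FlakyError)({
        if (++calls <= failures) throw new FlakyError("try again");
    }, tries);
    return ok ? calls : -1;
}

void main() {
    writeln(attemptsUntilSuccess(2, 5)); // 3
    writeln(attemptsUntilSuccess(5, 3)); // -1
}
```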

I totally agree with Cheah Chu that Ruby is nice, but I think D is pretty cool too.

Cheers!

D Gotchas

November 6th, 2007 21 comments

I have been playing around with the D Programming Language lately, and I love it. D combines the low-level control of C and modern productivity features like garbage collection, a built in unit-testing framework and – the most recent feature – real closures.

But D is still a young language, and as such a little rough around the edges. It jumps out and bites me every now and then, forcing me to change some of my most common coding habits. Here are a couple of gotchas that made me trip more than once.

Premature Conversion

What’s the value of the variable after the assignment below?

real a = 5/2;

In D, the answer is 2. The reason for this unintuitive behavior is D’s arithmetic conversion rules, which only take operand types into consideration. A division between two integers results in an integer as well. The desired type, the type of the resulting variable, is disregarded.

To get the desired result we need to convert at least one of the operands to a floating-point number. This can be done in two ways. Either literally:

real a = 5.0/2; // a=>2.5

Or with an explicit cast:

real a = cast(real)5/2; // a=>2.5

Note that you must convert the operand, not the result. So this won’t work:

real a = cast(real)(5/2); // a=>2

The Premature Conversion gotcha is a particularly nasty one. It compiles and runs, which means only testing can reveal this bug.

Testing for identity

I’m a defensive programmer. I like to put in assertions whenever I make assumptions in my code. By far the most common assumption I make is that an object is assigned. Here’s how I normally do it:

assert(o != null, "o should be assigned!");

In D, this is a big gotcha. The code above works as long as o is not null. If o is unassigned, we’ll get a nasty Access Violation Error. Here’s another example:

SomeObject o = null;
if (o == null) // <= Access Violation
  o = new SomeObject;

The reason is that D supports overloaded operators, in this case the equality operators (== and !=). Unlike Java, D converts the equality operator into a method call without checking for null references. So, internally, the above code gets converted to the following:

SomeObject o = null;
if (o.opEquals(null))
  o = new SomeObject;

Since o is null, the call to opEquals results in an Access Violation. Instead you should use the is operator to check for identity.

if(o is null) ...

Or

assert(o !is null, ...)

Despite the tripping, I actually like the idea of a separate identity operator. After all, “is a equivalent to b?” is a different question than “are a and b the same object?”. But, as we say in Sweden, It’s difficult to teach old dogs to sit.

Cheers!

Categories: D Programming Language Tags: