The Functional Subset of D

May 20th, 2008 6 comments

As I wrote about in an earlier post, the future of D lies in the field of functional programming. More specifically, what the creators of D are trying to do is construct a pure functional subset that can be utilized within the otherwise imperative language.

Let’s take a closer look at that functional subset that is taking form in the experimental 2.0 version of D.

Immutable data

The most fundamental difference between a purely functional language and an imperative one is how they treat data. Many of us are used to thinking of data in terms of state, where variables can be changed through assignments. But in a functional program there is no state. There are only constant values and functions that operate on them.

In D, immutability is achieved with the invariant keyword, either as a storage class:

invariant int i = 3;

or as a type constructor:

invariant(int) i = 3;

Transitive invariance

The side-effect-free nature that comes with immutable data has some great advantages. For one thing, it simplifies testing, since the result of a function depends only on its input. The compiler can also perform certain optimizations, but the biggest advantage is that programs written in this way are thread-safe by default.

To take advantage of these things the compiler needs to be able to trust the immutability of our data. This is where transitivity comes in. In D, invariance is transitive, which basically means that no data reachable from an invariant reference can be changed. Here’s an example.

int ga = 2; // mutable

struct A {
  int *p = &ga; // pointer to mutable
}

invariant(A) a; // a is immutable
A b; // b is mutable

// invariant is transitive
a = b; // ERROR, a is immutable
a.p = null; // ERROR, a.p is immutable
*a.p = 3; // ERROR, data pointed to by a.p is also immutable

Garbage collection

Since data must never change in a functional program, and consequently must not be destroyed while it is in use, it's usually a good idea for a functional language to utilize automatic memory management. Like most functional languages, D features garbage collection (alongside explicit memory management).
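
As a minimal sketch of what that means in practice (the explicit delete was still part of the language at the time of writing):

// GC-managed allocation: no explicit deallocation needed
int[] numbers = new int[100];
// the collector reclaims the array once it becomes unreachable

// explicit management is still there when you want it
auto obj = new Object;
delete obj; // frees immediately, without waiting for the collector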

Higher order functions

In order to do anything interesting in a purely functional language you need higher order functions – in other words, the ability to send functions as arguments to other functions. For this we can use function pointers (or delegates for methods and nested functions).

As an example, let’s say that we want to create a function that calculates the sum of two adjacent Fibonacci numbers. Here’s one way to do that.

int nth_fib(int n) {
  if(n == 0) return 0;
  if(n == 1) return 1;
  return nth_fib(n-1) + nth_fib(n-2);
}

int add_next_fib(int n) {
  return nth_fib(n) + nth_fib(n+1);
}

Now, let’s say that we want to perform the same operation on a different sequence, for example the natural numbers. Well, we could use good old copy and paste, but that isn’t very DRY. Let’s make add_next a higher order function instead, so that it can be used with any sequence function.

int add_next(int n, int function(int) nth) {
  return nth(n) + nth(n+1);
}

int i = add_next(4, &nth_fib);
// i is 8 (3+5)

Now, we can write any sequence function we want and have add_next apply it.

// Sequence function for natural numbers
int nth_nat(int n) {
  return n;
}

int i = add_next(3, &nth_nat);
// i is 7 (3+4)

Note: For methods and nested functions, the keyword function is replaced with the keyword delegate; otherwise the syntax is the same.
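
For instance, here's a sketch of a delegate-taking variant of add_next (an assumed overload, not part of the original example) that lets us pass a nested function:

int add_next(int n, int delegate(int) nth) {
  return nth(n) + nth(n+1);
}

void example() {
  int offset = 10;

  // a nested function can access the enclosing scope,
  // so a pointer to it is a delegate
  int nth_offset(int n) {
    return n + offset;
  }

  int i = add_next(3, &nth_offset);
  // i is 27 (13+14)
}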

Closures

Closures are another indispensable feature of functional languages. In short, a closure is a function pointer that can be extracted for later use; when invoked, the function still has access to the context in which it was created, even though that context has gone out of scope.

In D, closures are created with the delegate keyword.

int delegate() create_closure() {
  int x = 3;

  int f() {
    return x;
  }

  return &f;
}

int delegate() a_closure = create_closure();
int i = a_closure();
// i is 3

Note that the extracted function f (referenced by the a_closure variable) accesses the local variable x, although it has gone out of scope at the time of execution. D gained this ability in version 2.007; before that it didn’t have real closures.

Currying

Closures provide an easy way to do currying, which is common in functional languages. Simply put, currying is a technique where a function takes a general function and returns a new, more specialized one.

For instance, we could curry the add_next function from the previous example and create a specialized version of it, say add_next_fib (and thus get back to where we started).

int delegate(int) curry_add_next(int function(int) nth) {
  int curry_f(int n) {
    return add_next(n, nth);
  }
  return &curry_f;
}

int delegate(int) add_next_fib = curry_add_next(&nth_fib);
int i = add_next_fib(4);
// i is still 8

Pure functions

These features are all we need to write purely functional code, but in order to take full advantage of the functional programming paradigm, some major issues remain unsolved.

For one thing, the compiler needs to know whether or not our code is functional in order to apply possible optimizations. The easiest way to do this is to give the programmer a keyword to tell the compiler she wishes purity, and then have the compiler enforce it. In D this is the purpose of the pure storage class.

pure int a_pure_square_function(invariant(int) x) {
  return x * x;
}

A pure function must not access non-invariant data, and may not invoke other non-pure functions. As of D 2.013, the semantics of the pure storage class have not been implemented, so purity is not yet enforced. I sense this is not a trivial matter, so it may take some time before we have it.
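
To make the rules concrete, here's a sketch of what an enforcing compiler should eventually reject – my reading of the rules, not compiler-verified behavior:

int g = 0; // mutable global state

pure int not_allowed(invariant(int) x) {
  // return x + g;  // violation: reads mutable global state
  // g = x;         // violation: mutates global state (a side-effect)
  // writefln(x);   // violation: calls a non-pure function
  return x * 2;     // OK: depends only on the invariant argument
}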

Cheers!

Categories: D Programming Language, programming Tags:

Alan Cooper: Open-Source is a Sign of Failure

April 28th, 2008 10 comments

In the keynote where Alan Cooper proclaimed that Agile processes are bad for developing quality software, he made another provoking statement: that Open-Source is ultimately a symptom of management failure. His point is that, with the right enthusiasm and commitment to your products, why would anyone go and work on an Open-Source project in their spare time?

Well, there are plenty of good reasons for doing Open-Source pro bono work; it’s a great way to gain experience and widen your perspectives, for instance. But still, Alan has a point. Many of us are not as content as we could be with our regular work. Instead we’re seeking satisfaction elsewhere.

So, what should our employers do to get our full attention? Here’s my list.

  • Creative freedom
Give me a chance to contribute, to be innovative and creative. Let me spend a part of my time doing research and following paths that interest and inspire me. Google is a great example of a company that understands the importance of this.
  • Personal Development
    As Ron Jeffries has said, “the river is moving, and if we don’t keep rowing, we are going to drift back downstream.” I think self improvement is a spice of life. If my company provides me with all the books I need (and want), and lets me attend courses and conferences of my choice, chances are I’ll stay with it for life.
  • Ownership
    People usually do a better job, are more careful and thorough, if they own the thing they’re working on. This is true for software as well. Make me a business partner and I’ll optimize my work according to your business goals.
  • Appreciation
The human race seems to be immensely better at criticizing than at giving appreciation. Yet this is what we all crave, and it has a great impact on how we see our employers. A rewarding salary is one form of showing appreciation, but I also need the outspoken forms.
  • Closures
No one can go on forever without reaching a finish line. I want to work in a team that is long-term efficient and gets things done often. Help me divide and I’ll conquer for you.

That was my list, what’s on yours?

Cheers!

Categories: opinion Tags:

The Future of D is Functional

April 16th, 2008 20 comments

The D Programming Language has an impressive list of cool features. That is not always a good thing. A feature should contribute to the philosophy and foundation on which the language was built. If it doesn’t, harmony breaks down and the result is a language that is harder to learn and to use.

Some people think D suffers from such feature creep. To some degree I can agree. D has features that are unlikely to become widespread, but most of them are aligned towards a common goal. That goal is bringing productivity to the world of low-level programming.

Lately, a new goal has emerged from the D community, and it has triggered some really intense activity. The future of D seems to lie in the field of functional programming, making the imperative and the functional paradigms truly come together.

Why is this important? Let me quote Walter Bright from a discussion at the D forums:

The future of programming will be multicore, multithreaded. Languages that make it easy to program them will supplant languages that don’t. […] The surge in use of Haskell and Erlang is evidence of this coming trend (the killer feature of those languages is they make it easy to do multiprogramming).

As we all know, multithreaded programming in an imperative language is a real pain. It’s complicated and easy to get wrong, but that is not the case in a functional language. A pure functional program is thread-safe by design, since it’s free from mutable state and side-effects.

You never have to worry about deadlocks and race conditions because you don’t need to use locks! No piece of data in a functional program is modified twice by the same thread, let alone by two different threads. That means you can easily add threads without ever giving conventional problems that plague concurrency applications a second thought!

Quote from Slava Akhmechet’s excellent article Functional Programming For the Rest of Us.

What the people behind D want to do is create a true functional subset of the D Programming Language, with safe interfacing to the parts of the program that are imperative. The functional subset would enforce pure functional programming, for example by disallowing access to mutable global state and calls to non-pure functions. In effect, that would enable you to write the parts of your program that need to be thread-safe in a functional style, while using the style of programming most of us are used to for the rest.
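
Here's a sketch of how I imagine that split could look, using the experimental syntax – an assumption on my part, not compiler-verified code. Note the recursion: strict purity leaves no room for mutable locals.

// the pure, thread-safe core
pure int sum_of_squares(invariant(int)[] v) {
  if (v.length == 0) return 0;
  return v[0] * v[0] + sum_of_squares(v[1 .. $]);
}

void main() {
  int[] data = [3, 4]; // the imperative part mutates state freely
  data[0] = 5;

  // ...but hands only invariant data to the functional subset
  invariant(int)[] snapshot = cast(invariant(int)[]) data.dup;
  int r = sum_of_squares(snapshot); // 41; safe to call from any thread
}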

But, it might not stop there. Andrei Alexandrescu, who’s one of the driving forces behind the functional subset, has suggested that the enforcements inside a pure function can be relaxed to allow mutability of what he calls “automatic state,” thus allowing imperative techniques to be used as long as the mutability doesn’t leak across the function barrier. Here’s an example from Andrei’s slides on the subject:

int fun(invariant(Node) n) pure {
  int i = 42;
  if (n.value) ++i;
  int accum = 0;
  for (i = 0; i != n.value; ++i) ++accum;
  return n.value + i;
}

This code doesn’t look a bit functional, but the result of fun() is solely dependent on its arguments, so seen from the outside it’s pure.

Mixing pure functional and mainstream imperative programming is certainly bleeding edge. As far as I know it has never been done in any other language. But even though I’m excited, I can’t help wondering whether this is the right way to go. There’s a risk that D spreads itself too thin, and that the pure functional subset of D will be too noisy; I anticipate heavy use of the invariant keyword, for instance.

Would it be a better option to create a completely new language, say Functional D, and make D and Functional D interoperable on a binary level? At least that would battle the noise by reducing the need to explicitly express things like immutability, which can be taken for granted in a functional language. I also suspect that each compiler would be less difficult to implement than one that has to support both styles.

But then again, I don’t know much about making compilers. I just happen to be more of a minimalist when it comes to computer languages. (As opposed to my preferences for APIs and frameworks.)

Expect more posts on this subject as I plan to delve into details next.

Cheers!

Steve Yegge: D is the Next Cool Language

April 1st, 2008 4 comments

I just watched an interview with Steve Yegge on the Google Code Blog. The interview was mostly about his current project Rhino on Rails. It was interesting, although not interesting enough to keep my attention for the entire 25 minutes. After some time I kind of tuned out, but I kept the video cast running in the background.

Good thing I did, because at the end of the interview Steve surprised me big time. If you’ve followed my blog lately you know that the D Programming Language is my favorite toy at the moment, so when he mentioned it in the interview I was all ears again. For your convenience, I made a transcript of the best parts:

You’re a language geek. What new cool language would you want to play with next?

To be honest I really wish I knew more about D. I’ve programmed in the D language and it’s D-lightful. It was quite expressive. I did a bunch of programs that I kind of ported from ruby scripts that I’ve written, and I swear it was only like 20% harder than in the ruby code. It was super tight, except that it was lightning fast. It was like C++ or faster.

Steve also gives his opinion on what’s needed for D to become really big.

If it [the D language] were link compatible with C++, which is hard, then it could be used with all the C++ libraries. Then it could be the replacement of C++.

In case you’d like to hear and see for yourself, the part where he mentions D starts at about 22 minutes and 35 seconds in.

Steve Yegge as part of the D community? Wouldn’t that be something.

Cheers!

Categories: D Programming Language Tags:

Is agile only for elites?

March 28th, 2008 18 comments

I’m back from the ESRI Developer Summit. While suffering from severe jet lag I’ve spent the last couple of days in slow reflection. The biggest impact the conference had on me was a keynote held by Alan Cooper.

Early in his talk he put me in defensive mode by stating that agile processes are bad for developing quality software. Alan argues that the idea of little or no upfront design is ridiculous and will result in either high development costs or crappy software. Instead he believes in removing all outstanding uncertainties with a thorough and detailed design, thus developing a “blueprint” for a production team to follow. Additionally, in opposition to the agile manifesto, he doesn’t seem to embrace change.

Most business executives believe that writing production code is a good thing. They assume that getting to production coding early is better than getting to it later. This is not correct. Writing production code is extremely expensive and extremely permanent. Once you’ve written it, it tends to stay written. Any changes you might make to production code are 1) harmful to the conceptual integrity of the code, and 2) distracting and annoying to the programmers. Annoying your programmers is more self-destructive to a company than is annoying the Board of Directors. The annoyed programmers hold your company’s operations, products, and morale in the palms of their hands.

So he wants us to go back to Waterfall. Doesn’t that give me the right to discard his thoughts without further reflection? No, I don’t think so.

It’s easy to forget that the “traditional” development processes were not created to make our lives as developers miserable. They emerged from common knowledge of that time, and they were formulated to address real problems. We would be foolish to disregard that experience from the past.

Let me be clear about one thing. I don’t agree with Alan. I do believe we can produce high-quality software with agile methods, where design evolves with the production code. But I did, after my initial defensive reflex, find his perspective refreshing.

Alan’s talk is not published anywhere, but the general ideas are documented on his company’s website.

Software construction is slow, costly, and unpredictable.
[…]
Unpredictable is by far the nastiest of these three horsemen of the software apocalypse. Unpredictable means 1) you don’t know what you are going to get; and 2) you won’t get what you want. In badness, slow and costly pale in comparison to unpredictable.
[…]
The key, it seems, is vanquishing unpredictability, which means determining in advance what the right product design is, determining the resources necessary to build it, and doing so. As the airline pilots say, “Plan your flight, and fly your plan.”

Alan’s solution to the development problems of today is to divide work into three separate fields of responsibility, something he calls “The Triad”.

Interaction design is design for humans, design engineering is design for computers, and production engineering is implementation. Recognizing these three separate divisions and organizing the work accordingly is something I call “The Triad.” While it cannot exist without interaction designers, it depends utterly on teasing apart the two kinds of engineering which today, in most organizations, are almost inextricably linked. It will take some heroic efforts to segregate them.

Collaboration with the customer (or users), as the agile methodologies suggest, is out of the question according to Alan. Why let the least qualified make the most important decisions, he reasons. Instead, Alan Cooper advocates the use of interaction designers (HCI experts). Thus, he identifies three key roles: design engineers, production engineers and interaction designers.

Production engineers are good programmers who are personally most highly motivated by seeing their work completed and used for practical purposes by other people. Design engineers are good programmers who are personally most highly motivated by assuring that their solutions are the most elegant and efficient possible.
Interaction designers’ motivations are very similar to those of design engineers, but interaction designers are not programmers. Although most programmers imagine that they are also excellent interaction designers, all you have to do to dissuade them of this mistaken belief is to explain that interaction designers spend much of their time interviewing users.

Alan doesn’t rule out agile methods completely. He thinks they have a place, but only as a part of the design process.

Currently there is a pitched battle raging in the programmer world between conventional engineering methods and Agile methods. What neither side sees is a path of reconciliation; where Agile and conventional methods can effectively coexist. Providentially, the Triad reconciles them very well. The lean, iterative, problem-solving work of the software design engineer is the archetype of Agile programming. The purposeful, methodical construction work of the production engineer is the quintessence of conventional software engineering, particularly the type espoused by disciples of Grady Booch’s Rational Unified Process, or RUP. Both methods are correct, but only when used at the correct time and with the correct medium.

Despite Alan’s thought-provoking keynote, I’m still a believer in agile methods for the whole development process. I think it’s possible to build solid software with little upfront design, a readiness for change, rapid feedback and customer collaboration. The problem I see is that it demands a lot more from us developers. Knowing the language and how to program the platform is no longer enough. We need system and interface design skills, as well as social skills. We also need to master important but difficult techniques like unit testing and code refactoring.

Maybe agile is only for teams of elites?

Categories: software development Tags:

D Update Mitigates Comparison Gotcha

March 11th, 2008 No comments

The D programming language tries to make a clear distinction between comparison by equality and comparison by identity. It does so by offering two different operators, one for each purpose.

// Are a and b equal?
if (a == b) { ... }

// Are a and b the same object?
if (a is b) { ... }

As I’ve written about in a previous post, this can be a source of confusion for D newbies like myself. Someone who’s used to the comparison semantics of Java, for instance, is likely to learn to separate equality and identity the hard way. Until now, that is.

This code, where the equality operator is (wrongly) used to compare identities, used to produce a nasty Access violation error at runtime.

SomeObject a = null;
if (a == null) { ... } //<-- Access Violation

Now, with the newly released versions of the D compiler (the stable 1.028 and the experimental 2.012), this error is nicely caught by the compiler, which kindly instructs us to use the identity operator instead.

Error: use 'is' instead of '==' when comparing with null

Unfortunately, it won’t help us discover all cases of wrongly used equality operators. For instance, this piece of code still produces the dreadful runtime AV.

SomeObject a = null;
if (a == a) { ... } // <-- Access Violation

Still, the update is a big improvement and will make things easier for a lot of people learning the language.

Cheers!

Categories: D Programming Language Tags:

Off to the land of opportunity

March 10th, 2008 No comments

Next week I’m going to attend the 2008 ESRI Developer Summit in Palm Springs. It will be my first time in the USA and I’m really looking forward to the trip. Who knows, I might even meet one of you guys over there.

Categories: off-topic Tags:

OT: Rest in peace Gary

March 6th, 2008 No comments

I’m sad to see that the original author of the role-playing game AD&D, Gary Gygax, has died. I’d like to send a warm, thankful thought to the one whose work meant so much to me and my role-playing friends in our youths.

I would like the world to remember me as the guy who really enjoyed playing games and sharing his knowledge and his fun pastimes with everybody else.

Rest in peace man.

Categories: off-topic, role-playing Tags:

Who wants a sloppy workplace?

March 6th, 2008 11 comments

I’m not the kind of person who is easily annoyed, but there is one thing that gets under my skin – all the time: inconsistently structured code. I hate arbitrary indentation, spacing and line breaks. It is close to impossible for me to assimilate a piece of sloppy code without first running it through a beautifier.

I ran into such code again today, and for some reason I started to reflect upon my negative reactions. What is it about messy-looking code that makes me dislike it so much? The first thing that came to mind was this: the source code is where I spend most of my time at work, and who wants a sloppy workplace?

On second thought, that didn’t seem to hold – at least not for me. Here are a couple of pictures of my desk at work.

[Photos: my workplace, 1 and 2]

My computer corner at home is even worse. Clearly, I’m not a person who cares about a tidy workplace. So what’s the reason then? Why can’t I stand to look at sloppy code while I’m perfectly OK with turning my desk into a dump? Well, it beats me.

If you have an idea, please let me know.

Cheers!

Categories: habits Tags:

Virtual or Non-Virtual by Default, Do We Really Have To Choose?

March 4th, 2008 3 comments

When it comes to the question of whether methods should be virtual by default or not, there are two schools of thought. Anders Hejlsberg, lead architect of C#, describes them in an interview from 2003.

The academic school of thought says, “Everything should be virtual, because I might want to override it someday.” The pragmatic school of thought, which comes from building real applications that run in the real world, says, “We’ve got to be real careful about what we make virtual.”

As I told you in my post from last week, I have left the “pragmatic school of thought” to join the “academic” camp. The main reason was unit testing, which – in my opinion – calls for a more flexible object model than that of C#. When unit testing, I often want to use components in unusual ways, all in the name of dependency breaking, and therefore I like their methods to be virtual.

But it wasn’t – and still isn’t – an easy pick; virtual by default does bring some serious problems to the table. Again from the interview with Anders:

Every time you say virtual in an API, you are creating a call back hook. As an OS or API framework designer, you’ve got to be real careful about that. You don’t want users overriding and hooking at any arbitrary point in an API, because you cannot necessarily make those promises.

Whenever they ship a new version of the Java class libraries, breakage occurs. Whenever they introduce a new method in a base class, if someone in a derived class had a method of that same name, that method is now an override—except if it has a different return type, it no longer compiles. The problem is that Java, and also C++, does not capture the intent of the programmer with respect to virtual.

C# captures the intent better and avoids versioning problems, and Java offers the flexibility needed for unit-testing. Which is better? The answer seems to be: it depends. But, do we really have to choose? Why can’t we have both? Well, I think we can, and Java has shown the way to achieve it.

Java annotations are a powerful language feature. (The concept was rightfully stolen from C#, where it is called custom attributes.) With annotations one can attach metadata to parts of a program, to give additional information to the compiler or other external agents. In other words, annotations can be used to extend the programming language without the need to change the language itself.

A good example is the @Override annotation.

class SomeClass extends SomeBaseClass {

  @Override
  void someMethod() { … }

}

From the Java documentation:

[@Override indicates] that a method declaration is intended to override a method declaration in a superclass. If a method is annotated with this annotation type but does not override a superclass method, compilers are required to generate an error message.

The use of the @Override annotation takes care of the problem where a name change to a virtual method silently breaks the behavior of descending classes, or – which is more common – where the misspelled name of an intended override isn’t caught at compile time. With the introduction of the @Override annotation, you can help the compiler help you to fail fast. It is now possible to show your intention better than what was possible in the pre-annotation days.
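
As a sketch of the misspelling case (hypothetical class names):

class Base {
  void update() { }
}

class Derived extends Base {
  @Override
  void updat() { } // compile-time error: updat() doesn't override anything
}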

Unfortunately, Java doesn’t take this concept all the way. They could have introduced a @virtual annotation as a complement to @Override, and thus reached the same level of expressiveness as C#, but without forgoing the flexibility of the Java object model. It would be the perfect middle way, and provide the best of both worlds.

class SomeBaseClass {

  @virtual
  void someMethod() { … }

}

The benefit of an annotation-based (or custom-attribute-based) solution is that it’s configurable. It would be possible to alter the compiler’s behavior based on context or environment. For instance, one could enforce the use of @virtual and @Override in production code. Additionally, one could relax the checks when necessary – in test projects or legacy code, say – to mere warnings or complete silence.

Wouldn’t that be better than the all-or-nothing solutions of today?

Cheers!

Categories: C#, java, programming Tags: