Categories

Posts in this category

Sun, 25 Apr 2010

You are good enough!


Permanent link

Have you ever tried writing a compiler?

Most programmers haven't. Most programmers think that writing a compiler is hard. Or even some deep magic for which you need some kind of advanced wizardry that you only obtain by hacking twenty years on a compiler already, or so.

It's not.

Writing a feature complete compiler for a full fledged programming language is quite some work. But writing a simple compiler isn't. And contributing to an existing compiler isn't either.

I'd like to point you to Jack Crenshaw's tutorial series Let's Build a Compiler. It's rather old, and outdated by many standards, and not all that well formatted and so on, but it really teaches you the basics of how to parse a program, and then interpret it, or compile it down to assembler.

But mostly it shows you that compiler writing is no black magic at all. It's just like writing any other kind of program: Once you've got the gist of how compilers can work, it's mostly a matter of actually implementing things. And if some features seem hard to implement, there's plenty of literature that you can read on that particular topic.

(Mr. Chrenshaw's tutorial inspired me to write a toy interpreter in Perl for a nearly usable, Turing complete programming language. Math::Expression::Evaluator is a side product of writing that interpreter).

Contributing to an existing compiler is even easier. The overall architecture already exists, and typically you need to only modify small parts to add a feature.

I want to illustrate this by a feature that Solomon Foster, Jonathan Worthington and I added to Rakudo in a nice piece of collaboration.

The Feature

Perl 6 has the reduction meta operator. It takes an infix operator, and applies it to a list. Here a few examples:

# normal form
# same as 1 + 2 + 3 + 4
my $sum = [+] 1, 2, 3, 4;

# triangle form:
# same as
# my @sub-sums = 1, 1 + 2, 1 + 2 +3 , 1 + 2 + 3 + 4
my @sub-sums = [\+] 1, 2, 3, 4;

# right-associative operators are reduced right to left:
# infix:<**> is exponentiation
# same as 2 ** (3 ** 4)
say [**] 2, 3, 4

# chained operators are AND'ed together:
# same as  1 <= 2 && 2 <= 3 && 3 <= 4
my $sorted = [<=] 1, 2, 3, 4;

Status Quo

When we started our work, only the first, simplest version was implemented, i.e. reduction of a left associative, non-chaining infix operator.

What we did

Solomon started with a basic implementation of the reduction logic. You'll notice that it's written in Perl 6, so no knowledge of scary low level languages required.

In a series of small patches Solomon and I generalized and improved the logic step by step, and Jonathan wired the parser to the reduction logic.

All of these patches were written in Perl 6 code, and only the last one required more than a trivial amount of guts knowledge.

The actual reduction method is no piece of magic. It ended up a bit lengthy because it needs to consider several different variations of the reduction feature. It's just an ordinary function that you would typically find in a perl module.

Conclusion

If you know a bit of Perl 6, you can contribute to Rakudo today. Many built-in features can be desugared to ordinary library functions under the hood. If implement the logic, somebody can tell you how to wire up it with the rest of the compiler, or even do it for you.

You are good enough. Ordinary programmers can do it, no wizardry required.

(The same actually holds true for most projects that look scary from the outside. In my experience it's just very important that the community is friendly and helpful.)

(With apologies to mst).

[/perl-6] Permanent link

Comments / Trackbacks:

Trackback URL: /blog-en/perl-6/you-are-good-enough.trackback

Eddward wrote

Somewhat related...
Since the reduce is written in perl6, does that mean it is not byte compiled? Or rather, the reduce function is implementing the 'reduce' op code?

If so, is there some rule of thumb as to what perl6 features can be used to implement op codes in order to prevent recursive op codes?

Edd

Thomas Lee wrote


Absolutely spot on! And yes, this is most definitely true of all projects that look scary from the outside. Really, all it takes to get involved is a little knowledge, a willingness to contribute, and knowing where to ask for help or guidance. This is true of open source projects in particular.

Ives van der Flaas wrote


Well, contributing to *your* compiler's pretty easy. Try adding a feature to http://awib.googlecode.com/svn/builds/awib-0.2.b ;-)

moritz wrote

Bootstrapping and circularity chainsaw (for Eddward)
The reduce subroutine is written in Perl 6, and when you build the compiler, it is also written in bytecode.

Of course not everything can be written in Perl 6: Adding two integers or floating point numbers is a primitive of the underlying virtual machine (parrot in this case), and &infix:<+> is a wrapper around such an opcode. The situation with substr() is similar.

So what features can you write in Perl 6?
Basically everything. Large parts of grammar engine and of the meta object protocol are written in Perl 6 (or subsets thereof).
And which features can use for that?
All the things that are already implemented. None of these already implemented built-ins use your new function, so there's no endless recursion.

Herve wrote


Just a suggest//Remarks about the syntax of [\+] (triangle form)
# my @sub-sums = [\+] 1, 2, 3, 4; (1, 1 + 2, 1 + 2 +3 , 1 + 2 + 3 + 4

[\+] don't say what it's doing.
May be [<+>] or [@+] says much that you get an array on the output...

May be, some (Hyper|Meta|etc.)operators is -OFun for writing, but not so '-OFun' for reading (without comments// explanations).

Look at the Zip operator 'Z' and the Cross operator 'X':
They can obfuscated the program reading :
my @Y=<U V> X~X <W Z>;
my @X=@Y Z <V W>;

With them, it's easy to write unreadable programs lines.

Could you avoid that 'Perl6' will produce 'unreadable' programs like Perl[1..5] did it ?
Could you avoid that Perl6 have later a dark side like its parents ?

The syntax of a language can play for him or against him.

Anyway, Parrot & Raduko Teams do very good job...
Thanks for all.

(For ever Perl lover)

jhuni wrote

Triangular Reductions
I generally like the idea of giving a character a consistent definition and re-using that, like with the ternary operator (?? !!). It seems like a far superior syntax since ?? is generally associated with true booleans and !! with false ones.

Similarly it might be nice to re-use the array sigil (@) to indicate that the reduction returns an array like [@+], however, [@+], [@-], [@*], [@/], [@**], [@,] generally seems to bulky, and it just doesn't look that good, so that solution doesn't work that well.

The current solution isn't without its own set of problems though, for one \ is almost always associated with escape characters like "\n" and the line characters (\, |, /) look too similar to one another, so triangular division looks a little weird:

my @a = [\/] 1..5;
// 1, 1/2, 1/6, 1/4!, 1/5!

The other thing is [\|] which looks pretty bad as well due to the similarities between line characters. That said, I think triangular reduction operators are a great thing, rarely will you find yourself using [\|] anyways...

Write a comment

The comments on this blog post have been disabled; the comment form below will not work.

 
Name:
URL: [http://www.example.com/] (optional)
Title: (optional)
Comments:
Save my Name and URL/Email for next time