
Writing a Compiler -- in PHP?

Coders don't always write compilers. But even when they don't, they do often like to learn about compilers, and maybe even play with bits of one.

One of the standard introductions is the 'dragon book' -- very comprehensive, but very theoretical, and not entirely up to date. (Yes, I own it, and I will read it... someday...) It's a serious textbook -- but textbooks can be tedious and inefficient, especially apart from a full course of study (though Stanford does offer extensive materials from a number of complete courses that use the dragon book).

In any case, because most developers won't be writing a compiler, the majority of developer interest in compilers is probably more playful than the dragon will permit. You don't normally worry about lexing and parsing and so forth. Normally you just write fairly high-level code, and fiddle until the compiler is cool with it. (You can tell that I don't write much assembly.)

But what if you do want to play with some of the parts of a compiler? Or maybe even create a domain-specific language, to simplify common programming tasks for your particular projects?

Then, first, you'll want to use a very familiar language -- otherwise you're playing two games at once, neither really helping the other very much. And, second, you'll want to work with some of the very basic elements of compiling before anything else -- the indivisible primitives, like things and groups of things. Or lexers and parsers, in formal programming terms.
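
To make 'things and groups of things' a bit more concrete, here is a minimal sketch of my own (not taken from Parr's book or Sameer's article). It assumes a toy list language like `[a, [b, c], d]`; the function names `tokenize` and `parseList` are hypothetical. The first function finds the things (tokens), the second groups them into nested PHP arrays.

```php
<?php
// Hypothetical sketch: "things" are tokens, "groups of things" are nested lists.

// Lexing: split the input into names, brackets, and commas.
// Characters outside that set (including whitespace) are simply ignored.
function tokenize(string $input): array
{
    preg_match_all('/[a-zA-Z]+|[\[\],]/', $input, $matches);
    return $matches[0];
}

// Parsing: group the flat token array into nested PHP arrays.
// $pos is passed by reference so recursive calls advance the same cursor.
function parseList(array $tokens, int &$pos = 0): array
{
    if (($tokens[$pos] ?? null) !== '[') {
        throw new InvalidArgumentException("expected '[' at token $pos");
    }
    $pos++; // consume '['
    $items = [];
    while (($tokens[$pos] ?? null) !== ']') {
        if (!isset($tokens[$pos])) {
            throw new InvalidArgumentException("missing ']'");
        }
        if ($tokens[$pos] === '[') {
            $items[] = parseList($tokens, $pos); // nested group
        } elseif ($tokens[$pos] === ',') {
            $pos++; // separators are skipped, not validated
        } else {
            $items[] = $tokens[$pos++]; // a plain name
        }
    }
    $pos++; // consume ']'
    return $items;
}
```

Calling `parseList(tokenize('[a, [b, c], d]'))` gives back a nested array with the inner list as its own element. The sketch is deliberately lenient about commas; a real parser would check that they appear exactly where the grammar allows them.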

Terence Parr takes this easy, stepwise approach in his book Language Implementation Patterns. As a result, Parr's book reads to me a bit like Kernighan and Ritchie's classic The C Programming Language: everything makes sense as you do it, and you never stop making incremental progress.

Sameer Borate recently worked through the first chapter and created a simple lexer and parser in PHP. This seems like an ideal first step into compiler design: plenty of people know PHP, and lexers and parsers are fundamental -- plus PHP syntax is pretty clear and intuitive, so the implementation language features don't get in the way. 

You should read the full article, which isn't long (and is mostly code snippets), but here's a teaser: the short script that drives Sameer's lexer for a programming language consisting of a simple list:

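<?php

require_once('ListLexer.php');
require_once('Token.php');

// The text to tokenize comes from the first command-line argument.
$lexer = new ListLexer($argv[1]);
$token = $lexer->nextToken();

// Print tokens one per line; type 1 is presumably the end-of-input token.
while($token->type != 1) {
    echo $token . "\n";
    $token = $lexer->nextToken();
}

?>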

Sweet, I understood that. Maybe I can write a compiler now too.

Well, maybe not now, but at least read through the rest of the code...
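
If you'd like a rough idea of what sits behind a `nextToken()` call before you click through, here is a guess at the general shape. This is my own hypothetical sketch, not Sameer's code: the class and method names mirror the driver script, but the token type numbers (other than using 1 for end of input), the `$tokenNames` table, and the printed token format are all assumptions.

```php
<?php
// Hypothetical stand-ins for Token.php and ListLexer.php; not Sameer's code.

class Token
{
    public $type;
    public $text;

    public function __construct(int $type, string $text)
    {
        $this->type = $type;
        $this->text = $text;
    }

    // Lets a caller write `echo $token`.
    public function __toString(): string
    {
        return "<'{$this->text}'," . ListLexer::$tokenNames[$this->type] . ">";
    }
}

class ListLexer
{
    const EOF_TYPE = 1; // assumed, so that a `type != 1` loop stops at end of input
    const NAME     = 2;
    const COMMA    = 3;
    const LBRACK   = 4;
    const RBRACK   = 5;

    // Keys run from 1 to 5, matching the constants above.
    public static $tokenNames = [1 => 'EOF', 'NAME', 'COMMA', 'LBRACK', 'RBRACK'];

    private $input;
    private $pos = 0;

    public function __construct(string $input)
    {
        $this->input = $input;
    }

    public function nextToken(): Token
    {
        $len = strlen($this->input);

        // Skip whitespace between tokens.
        while ($this->pos < $len && ctype_space($this->input[$this->pos])) {
            $this->pos++;
        }
        if ($this->pos >= $len) {
            return new Token(self::EOF_TYPE, '<EOF>');
        }

        $c = $this->input[$this->pos];
        switch ($c) {
            case ',': $this->pos++; return new Token(self::COMMA, $c);
            case '[': $this->pos++; return new Token(self::LBRACK, $c);
            case ']': $this->pos++; return new Token(self::RBRACK, $c);
        }

        // A NAME is one or more letters.
        if (ctype_alpha($c)) {
            $start = $this->pos;
            while ($this->pos < $len && ctype_alpha($this->input[$this->pos])) {
                $this->pos++;
            }
            return new Token(self::NAME, substr($this->input, $start, $this->pos - $start));
        }

        throw new InvalidArgumentException("invalid character: $c");
    }
}
```

The whole trick is a cursor (`$pos`) that walks the input one character at a time and decides, from the character it is looking at, which kind of token starts there. Once the input runs out, every further call returns the end-of-input token, which is what lets a driver loop terminate.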
