Stuff I like.

Sunday, January 03, 2010

Do I Have a Lithp?

Why! Yes, I do!

Here's the story.

The setting: the second semester of my senior year. I had two major design projects and several major papers, I was designing a control system for a 6-DOF robot arm, and I had no time whatsoever. It was during this semester that I wrote my first Lisp interpreter.

What is Lisp (wikipedia)?

The name LISP derives from "LISt Processing". Linked lists are one of Lisp languages' major data structures, and Lisp source code is itself made up of lists. As a result, Lisp programs can manipulate source code as a data structure, giving rise to the macro systems that allow programmers to create new syntax or even new domain-specific programming languages embedded in Lisp.

The interchangeability of code and data also gives Lisp its instantly recognizable syntax. All program code is written as s-expressions, or parenthesized lists. A function call or syntactic form is written as a list with the function or operator's name first, and the arguments following; for instance, a function f that takes three arguments might be called using (f x y z).

When I set out to create my own lisp the language was pretty new to me. I understood how to make new functions and do silly arithmatic and I could begin to grasp the recursion of more difficult algorithms--but I quickly discovered I didn't have a clue.

The simplicity language was what really captivated me. I could begin to see how one could build incredibly complex things out of very simple building blocks. For example: If you can create a global variable and create an unnamed function -- combine the pieces and you can create global functions. Using these two primitive pieces I was able to write the fundamental function definition keyword defun in lithp. I didn't have to make it part of the interpreter.

  • (set a 1) ; creates a global variable, sets it to one
  • (lambda (x) (+ x 1)) ; creates an unnamed function that adds 1 to whatever it's given
  • (set b (lambda (x) (+ x 1)) ; creates the same function as before and stores it in a
  • (b 5) ; applies the add-one function to 5 to produce 6

Like many coders, I had imagined that one day I would create my own language. A language as fundamentally simple as lisp seemed well within my abilities. I got to work.

The Parser

My first task was creating a parser that could understand all those parentheses. Writing the parser was difficult because I had no clue what I was doing--the reality is that the Lisp syntax is very easy to parse. I was fortunate to discover the Bison and Flex tools and was able to understand them well enough to create my lisp parser.

Here's how it works.

lithp.lex

The lex file is given to a tool called Flex to produce the c code for the lexer. A lexer converts a stream of text into a stream of tokens with tags designating what each of the tokens are. Each token is defined using regular expressions.

For example: The tokens in lithp are the open and close parentheses, single quotes, backquotes, dots, numbers, and symbols. Each of these has special significance in how the textual form of the program is converted into s-expressions.

lithp.y

The .y file is given to another utility called Bison to produce the c code for the parser. The parser uses the tokens that the lexer produces and produces s-expressions if the tokens are in a recognizable sequence.

Inside the .y file I use EBNF Notation to define the sequences of tokens that make up the various types of stanzas that can be found in our language. For example: A sequence is one or more expressions separated by spaces. A list is a sequence that has parenthesis around it. An expression is either a symbol, a number, or a list.

These two surprisingly short and readable files were the product of many hours of reading documentation and writing test files that I thought should parse correctly. I could definitely write these much quicker now but Bison and Flex were really tough to learn from scratch using only google and man pages.

The Printer

In the next installment I will discuss the function to convert s-expressions back into text form. It was while I was writing this function that I really started to grasp the beauty and simplicity of the traditional parenthetical lisp syntax. My next lithp post will explain how s-expressions are stored and what the syntax of textual representation actually means.

To be continued...

No comments:

Followers