Tweeter button

And so Forth

Lately I’ve been looking into the Forth programming language. It’s a fascinating language in many ways! It combines low-level language features such as direct access to memory, absence of garbage collection and, at the same time, high-level language features such as high degree of expressiveness, reflection, ability to modify the compiler at runtime, meta-programming.

Concatenative

Forth is concatenative. In Forth, functions are called words and words can be made of other words by composition. The philosophy in Forth, called factoring, is to simplify programs into words of smaller size, typically 2-3 lines maximum (good programmers should factor their code in any language but well…). Both built-in words and words added by the programmer are located in the dictionary.

Stack-based

Another interesting feature is that Forth is stack-based. Instead of passing parameters to words explicitly, parameters are passed implicitly through the stack. The same goes for return values. Forth uses the reverse polish notation and words are evaluated left to right. Here’s a common example:

: square ( n -- nsquared ) dup * ;
  • “: square” is the way to define a new word, called square.
  • words between ( and ) are comments.
  • ( n — nsquared ) denotes the stack effect. Here, the left side shows that the word square expects one parameter on the stack when it’s called and the right side shows that it will leave one value on the stack. The syntax between ( and ) is a convention, it’s not mandatory.
  • dup is a word that duplicates the TOS (top of stack). After calling dup, we thus have the same value on the TOS and in the NOS (next on stack). The stack effect of dup is ( n — n n ).
  • The stack effect of the word * is ( op1 op2 — product ), showing that it pops the two operands from the stack and leaves the resulting product on the stack. At this point the TOS contains the initial value squared.
  • ; is a word to denote the end of the word definition.

Now that our new word is added to the dictionary, we can use it.

2 square .

. is a word to print the TOS. Its stack effect is ( n — ), meaning that it takes one value on the stack but doesn’t leave any value.

Starting Forth

To learn more about Forth, I recommend Starting Forth, by Leo Brodie. gforth is a good Forth implementation that can be used as companion with Starting Forth.

Implementation

Another reason for looking into Forth besides the language itself is the relative ease to implement an interpreter or compiler for it. Parsing of the language is very simple because the language doesn’t have any explicit grammar: whitespace characters are the only word delimiter. When a whitespace character is reached, it means that a new word has been formed. The dictionary is looked up and if the word is found, it’s executed. Otherwise, Forth attempts to convert the word to a number. If the conversion fails, it means that the word was not a number and a “word unknown” error is sent. Otherwise, the word is a well-formed number and the number is pushed onto the stack.

Of great interest is also the fact that Forth can easily bootstrap itself. A small portion of Forth can be written in another language for core words and the rest, including conditional statements and loop structures, can be written in Forth itself, using the core words. This makes Forth quite fascinating. The Wikipedia article about Forth gives good insights about Forth implementations.

jonesforth

A very interesting Forth implementation is jonesforth. It is written as a literate program and is abundantly commented, which makes it a great tutorial to learn Forth and how to implement Forth. The program is organized to follow the flow of thoughts necessary to understand the implementation. And I must say the author does a great job explaining the internals. A very small core is written in x86 assembly and the rest in Forth itself.

Although it’s not absolutely necessary to understand the comments, knowledge of assembly and of the GNU assembler syntax can be useful to maximize the learning experience. For that, I recommend the book Professional Assembly Language. This book is very well structured and written. I also liked the emphasize on using the GNU tools such gas, gdb and gprof and as well as the insights on how C compilers work.

RetroForth

Another implementation of interest is RetroForth. The source code is minimalistic and very clear. The approach used by this implementation is interesting. A Forth interpreter called Toka is used to convert Forth to bytecodes and a virtual machine called Ngaro is used to read the bytecodes (called the image). The virtual machine emulates the behavior of a MISC processor (only 30 instructions) as well as input/output devices. By porting the virtual machine, you can get a working Forth with very little effort. For example, the author made a Javascript version which can work in a browser, in less than 400 lines of code.

Just for the sake of it, I made a port of the virtual machine to Ruby. It seems to work well but it’s very slow. You can found the code here.

$ ruby retroruby.rb
RETRO 10

ok : square dup * ;

ok 2 square .

ok
ok 4
ok words
square e new i ia n p x d s v (e) (v) (line) (block) block blk line-ending offset #-block-size #-blocks < > <> = ;then if FALSE TRUE .” `x `c fill copy — ++ constant variable variable: allot }} —reveal— {{ find-last-visible } {
 double which heap #mem fh fw fb update tib compiler last ty tx next for ['] pop push ( 0; again repeat then !if if =if ;; ; [ s" ok >number save notfound reset depth boot d->name d->xt d->class accept .inline .data .macro .word with-class (remap-keys) bye getLength keepString redraw tempString literal, compile devector is :devector :is -! +! !+ @+ ' wait compare " . execute neg mod / off on 2dup tuck -rot rot not 2drop over key words clear type emit cr macro: : create ] , here out in dup nip >> << /mod * - + ! @ xor or and drop swap 1- 1+
ok

Alternatively, if you want to include code before using the interactive mode, you can call it like this:
cat somefile.retro - | ruby retroruby.rb

In addition, it supports saving your words with the word “save”. This allows you to restore your words when you start the virtual machine.

One Response to “And so Forth”

  1. Mathieu Blondel » Blog Archive » Numbers game solver Says:

    [...] So it becomes clear that solving the numbers game is equivalent to writing little programs. Although, I used the tree representation, a possibly simpler representation that I now think of is like a string of stack operations, like in Forth: [...]

Leave a Reply