Forth is NOT intrinsically slow

An essay

From the Small C node:

This conclusion is amazing. I want to quote two fragments of Tom Novelli's paper here:

[..] In the course of my parser studies, I came across Small C. The compiler is the size of a simple Forth, inherently faster, and it produces tighter code. The concept is simple: the parser methodically translates infix formulas into postfix, stack-based, assembly/machine code -- basically it translates to Forth! [..] Since C translates directly to Forth, Forth can never win the race. [..]

Evidently a non sequitur. Leaving aside the accuracy of the above description, how can one conclude that C is faster than Forth? Where does this paper demonstrate that Forth compilers cannot produce code at least as fast as that of C compilers?
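For concreteness, the kind of infix-to-postfix translation the quote describes can be sketched in a few lines. This is only an illustration under stated assumptions: it uses the shunting-yard method, it is not taken from Small C, and the function name to_postfix and the precedence table are hypothetical.

```python
# Hypothetical sketch: translate an infix formula into postfix (Forth-like) tokens.
# Not Small C's parser; shunting-yard is used here only for brevity.
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}

def to_postfix(tokens):
    """Handle left-associative binary operators and parentheses."""
    output, operators = [], []
    for tok in tokens:
        if tok in PRECEDENCE:
            # Pop operators of equal or higher precedence before pushing this one.
            while (operators and operators[-1] != "("
                   and PRECEDENCE[operators[-1]] >= PRECEDENCE[tok]):
                output.append(operators.pop())
            operators.append(tok)
        elif tok == "(":
            operators.append(tok)
        elif tok == ")":
            while operators[-1] != "(":
                output.append(operators.pop())
            operators.pop()  # discard the matching "("
        else:
            output.append(tok)  # operand: goes straight to the output
    while operators:
        output.append(operators.pop())
    return output

print(" ".join(to_postfix("a + b * ( c - d )".split())))  # a b c d - * +
```

The output is a sequence a stack machine can execute from left to right. Nothing in the translation itself says which of the two notations compiles to faster code.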

What are these Forth features that are impossible to implement efficiently? The double-stack architecture? Nonsense. With a technique that is simple both to implement and to understand, it is possible to cache both the data stack and the return address stack in registers. This technique, at least for caching data stack items in registers, is documented in:

This is needed on register machines, of course. But what is so great about register machines? I suggest having a look at this book, now online:

Professor Koopman is the designer of TIGRE.

There are no technical obstacles to applying more aggressive optimizations when compiling Forth programs (the papers mentioned above show, for example, how to apply type inference to Forth programs). For reference, see the various papers by M. Anton Ertl and others on aggressive optimizations (but not on type systems; he is skeptical on this matter):

Besides all this, the paper by Tom Novelli misses a fundamental point about Forth: its background philosophy of simplicity and minimalism, without which Forth is really much less interesting. Given his dedication and effort with Retro, this comes as a great surprise to me.

The only thing I can agree with is that these days Forth is a minority "language" (Forth originated as a complete environment, see FORTH - A Language for Interactive Computing) and is quite different from mainstream languages (but not so different from Lisp, for example), so there is much less research into optimizing Forth compilers.

Nonetheless, at least the two main commercial Forth providers, Forth, Inc. and MPE, Ltd., invest substantial effort in native-code optimizing compilers.

Finally, Forth's inventor, Chuck Moore, is worth a mention: he is totally against complex automatic optimizations and, in a certain sense, against compilers (too long to explain here: his rationale needs a lot of context).

-- Mad70

In retrospect, it was somewhat thoughtless of me to say that one language was intrinsically faster than another, since this is rarely the case. I think Tom mixed up the idea of "faster" with "easier to optimize." It's pretty clear to me that a language like Small C where types are explicit would be easier to optimize than a language like Forth. The other, small, probably wrong argument I could make in favor of tcn is that Forth is a more dynamic language than Small C, which definitely does have an effect on the performance of output code. I apologize for the lack of thought I put into the Small C node.

-- seaslug

I bet the Small C compiler already does some optimizing, whereas most Forths don't.

-- futhin

Though Forth is a dynamic environment, the language is as static as C. Look at the code output by a Forth compiler and compare it against C's output: neither invokes routines to interpret anything (unless you explicitly call EVALUATE in Forth, or yyparse() in C).
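A toy model may make the point concrete. The sketch below is not a real Forth; make_system, compile_word and the lookup counter are hypothetical names introduced for illustration. It assumes that a definition is compiled to a list of already-resolved references, so running the word involves no dictionary search and no interpretation of source text.

```python
# Hypothetical toy model, not a real Forth: names are resolved when a word
# is compiled, never when it runs.
def make_system():
    stack = []
    dictionary = {
        "DUP": lambda: stack.append(stack[-1]),
        "*": lambda: stack.append(stack.pop() * stack.pop()),
        "+": lambda: stack.append(stack.pop() + stack.pop()),
    }
    lookups = {"count": 0}

    def find(name):
        lookups["count"] += 1          # count every dictionary search
        return dictionary[name]

    def literal(n):
        return lambda: stack.append(n)

    def compile_word(name, source):
        # Resolve each token once, at compile time.
        body = [find(t) if t in dictionary else literal(int(t))
                for t in source.split()]

        def run():
            for code in body:          # run time: just call resolved code
                code()
        dictionary[name] = run

    return stack, dictionary, lookups, compile_word

stack, dictionary, lookups, compile_word = make_system()
compile_word("SQUARE", "DUP *")
compile_word("SQUARE+1", "SQUARE 1 +")
searches_after_compiling = lookups["count"]

stack.append(7)
dictionary["SQUARE+1"]()
print(stack, lookups["count"] == searches_after_compiling)
```

In this model only an explicit EVALUATE-style call would bring the dictionary search back at run time.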

-- KC5TJA

Well, I want to make clear that your comment was a pretext to write this essay but my only intention was and is to address some common misconceptions about Forth, not to stigmatize you or Tom (this remains for me a strictly technical discussion).

About "one language is intrinsically faster than another" being a thoughtless statement in general: I am not from the school of "all languages are made equal" (the argument from Turing equivalence especially is nonsense, see the thread "so-called Turing-Equivalence" by Fare in the 1999-August Archives of the Tunes mailing-list). So, for me, inquiries into whether a reasonably efficient translator (interpreter and/or compiler) can be implemented for a given language are absolutely legitimate. In particular, I expect that languages with abstraction inversion, where you are forced to express primitive concepts in terms of more complex ones, are intrinsically difficult to optimize (but remember that "primitive" is not an absolute concept, see Quotienting).

About the dynamism of Forth: yes and no. Forth as a language (as opposed to Forth as an environment; see the essay mentioned above, "FORTH - A Language for Interactive Computing") retains the ability to load and interpret or compile source code at run time. This ability is not used extensively these days, from what I can see in sources published on the Internet (I don't know about commercial, closed-source code). So in this sense Forth is certainly more dynamic than Small C. But, as a counterbalance, it is also necessary to consider that these small fragments of code are intended to be developed with a philosophy of early binding, as stated by Chuck Moore.

I think this is enough for now.

-- Mad70

Another quote from Tom Novelli's paper is:

[..] As an intermediate language, such as the proposed TUNES LLL, Forth is a poor choice. It does little to abstract the differences between hardware platforms [..]

I disagree. Forth doesn't come out of the box with hardware abstractions, but that doesn't mean it can't abstract the differences between hardware platforms easily. It can. It is easy to add new words to Forth's vocabulary that abstract the hardware layer out.

Look at EMIT, a common Forth word for sending a character to the screen; is it not an abstraction? It can be implemented for all the different hardware platforms, and it will behave the same way on each. Words similar to EMIT can easily be made for any other hardware abstraction you desire, such as accessing files or memory, or using the FPU vs. floating point implemented in software.
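The idea can be sketched outside Forth as well. In the hypothetical Python model below, make_forth, emit_stream and emit_framebuffer are invented names and the two "platforms" are stand-ins. Only EMIT differs per platform; the words built on top of it, and the program that uses them, stay the same.

```python
import io

def make_forth(emit):
    """Build a tiny word set on top of one platform-specific primitive, EMIT."""
    def type_(text):            # TYPE: send each character through EMIT
        for ch in text:
            emit(ord(ch))

    def cr():                   # CR: emit a line feed
        emit(10)

    return {"EMIT": emit, "TYPE": type_, "CR": cr}

# Hypothetical platform 1: a character stream.
buffer = io.StringIO()
def emit_stream(code):
    buffer.write(chr(code))

# Hypothetical platform 2: raw character codes stored in a list.
framebuffer = []
def emit_framebuffer(code):
    framebuffer.append(code)

def greet(words):
    """A 'portable program': it only uses the shared word set."""
    words["TYPE"]("OK")
    words["CR"]()

greet(make_forth(emit_stream))
greet(make_forth(emit_framebuffer))
print(repr(buffer.getvalue()), framebuffer)
```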

The fact is, Forth encourages abstractions. Coding in Forth is all about how well you can abstract and factor out your code. It provides all the tools that a programmer needs to abstract his code quickly and easily.

Looking at it in a different way, a minimal Forth can be implemented for each architecture in less than 5KB (2KB-3KB is possible). Each Forth could contain all the primitives necessary to run an entire OS on top of it. A programmer with a decent grasp of assembly could implement a simple, minimal Forth for any architecture within a few days.

Because of the ease in implementing a Forth for a new architecture, I like to say that Forth is the most portable language out there. As long as all the Forth implementations have the same wordset and as long as Forth programs don't delve into assembly, there is no issue about the portability of the code.

The nice thing about this kind of portability is that the machine is still accessible. A hardware platform with 3D graphics can be fully utilized, whereas on other platforms the 3D graphics support is implemented in software.

-- thin

ANY language can access the raw hardware, if the OS lets it. Forth is handy for testing new OSes and embedded systems, I'll give you that - mainly because it reduces upload/reboot cycles. I'm actually working on Retro again.. but I don't see Forth playing a central role in the future.

For the record, I only said a Small C COMPILER is inherently faster than a Forth COMPILER. Forth does a dictionary search on every token, and the dictionary is bigger than in C. All other things being equal, of course. The Forth Philosophy (tm) is great. C is generally a better language, though.
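To illustrate the cost being described, here is a minimal sketch of a dictionary kept as a linked list and searched from the newest entry backwards, a layout assumed here for illustration. The Entry class, the word names and the comparison counter are all hypothetical, and the hashed set is shown only for contrast; nothing here measures a real Forth or a real Small C.

```python
# Hypothetical model of a linked-list dictionary searched newest-first.
class Entry:
    def __init__(self, name, link):
        self.name = name
        self.link = link        # previous (older) entry, or None

def build_linked(names):
    latest = None
    for name in names:
        latest = Entry(name, latest)
    return latest

def find_linked(latest, name):
    """Walk from newest to oldest; return (found, comparisons made)."""
    comparisons = 0
    entry = latest
    while entry is not None:
        comparisons += 1
        if entry.name == name:
            return True, comparisons
        entry = entry.link
    return False, comparisons

names = [f"WORD{i}" for i in range(1000)]
latest = build_linked(names)
hashed = set(names)             # contrast: a hashed lookup structure

print(find_linked(latest, "WORD999"))  # newest word: found at once
print(find_linked(latest, "WORD0"))    # oldest word: whole list is walked
print("WORD0" in hashed)
```

In this model the per-token cost depends on how the dictionary is organized, so the comparison with C's symbol handling holds only with all other things being equal.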

I stand by my conclusions. And by the way, I'm making good progress improving the user interface & manageability of Linux.

-- Tom Novelli

Forth's speed compared to C depends entirely on the CPU type. On many CPUs, an unoptimized Forth will compete with, if not outperform, some optimizing C compilers. The opposite may be true on others, but the former seems to be the more common case.

On another note, Forth can usually handle more subroutine calls than C, in terms of both speed and memory. A Forth with good register assignments can have the minimal amount of call overhead the CPU allows. For example, on x86 using STC (subroutine-threaded code) with the hardware stack as the return stack, just CALL and RET are needed for each call. No "stack frame" or the like needs to be set up at the beginning of each word and destroyed at the end. Additionally, no information needs to be saved before a call.

-- Arke

... although the execution of chains of RET/CALL instructions on Intel processors causes quite severe stalls. As Agner Fog explains:

[..] jump instructions, calls, and returns are not fully pipelined. You cannot execute a new jump in the first clock cycle after a preceding jump. So the maximum throughput for jumps, calls, and returns is one for every two clocks.

Due to Forth style, there tend to be a lot more small routines than in, say, C, and those 4-clock penalties do add up. As Arke explains, however, we are neither worrying about a stack frame nor wasting cycles duplicating values up and down the stack. Adding stack simulation and inlining to a compiler seems to remove most of these performance problems outright anyway (at the expense of a more complex compiler). I would argue that Forth is NOT intrinsically slow and, moreover, that it is significantly easier to write an extremely efficient optimizing native compiler for it; it's just that people don't. Forth has had a hard time because the system as usually described draws so heavily on the traditional (outdated) implementation details of the language and the happy hack-factor days of FIGForth. Like Lisp in the '80s, most Forth compilers lag behind the mainstream, which is a pity since, unlike Lisp, Forth promises both simplicity and efficiency, although perhaps at the expense of some readability.
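As a rough illustration of inlining plus stack simulation, the sketch below expands small words in place and then tracks literal values on a simulated stack at compile time, folding operations whose operands are known. It is a toy under stated assumptions: the DEFINITIONS table, the word names SQUARE and CUBE, and the functions inline and fold are hypothetical, and a real optimizing compiler would have to handle far more (control flow, memory access, the return stack).

```python
# Hypothetical toy optimizer: inline colon definitions, then fold constants
# by simulating the data stack at compile time.
DEFINITIONS = {"SQUARE": ["DUP", "*"], "CUBE": ["DUP", "SQUARE", "*"]}
BINARY = {"+": lambda a, b: a + b, "*": lambda a, b: a * b}

def inline(tokens, definitions=DEFINITIONS):
    """Replace each defined word by its body, recursively (no CALL/RET left)."""
    out = []
    for tok in tokens:
        if tok in definitions:
            out.extend(inline(definitions[tok], definitions))
        else:
            out.append(tok)
    return out

def fold(tokens):
    """Keep known literals on a simulated stack; emit code only when needed."""
    out, known = [], []         # known: literal values on top of the stack

    def flush():
        out.extend(str(v) for v in known)
        known.clear()

    for tok in tokens:
        if tok.lstrip("-").isdigit():
            known.append(int(tok))
        elif tok == "DUP" and known:
            known.append(known[-1])
        elif tok in BINARY and len(known) >= 2:
            b, a = known.pop(), known.pop()
            known.append(BINARY[tok](a, b))
        else:
            flush()             # operand unknown at compile time: emit as is
            out.append(tok)
    flush()
    return out

print(fold(inline(["3", "CUBE"])))                       # ['27']
print(fold(inline(["X", "SQUARE", "2", "3", "+", "+"])))  # ['X', 'DUP', '*', '5', '+']
```

In the first case the two nested calls disappear entirely and a single literal remains; in the second, only the part that depends on the run-time value X survives.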

-- Matt Seddon

This of course assumes that all Forth words are invoked as subroutines. No modern Forth is implemented this way.

-- chuck
