An attempt to articulate Forth's practical strengths and eternal usefulness

(im-just-lee.ing)

85 points by todsacerdoti61 days ago

20 comments

ofalkaed54 days ago
Considerably better than most such articles that I have read on this but I think if the Forth community wants to get people into Forth it really needs to stop talking about how it can fit in a boot sector and the REPL; the former is not of interest or use to most programmers and the latter is probably a major cause of the misconception of Forth code being impossible to read.What I see as the real strength of Forth is that if you write your program in source files, there is no abstraction. You stick your word definitions in those source files and let the application you are writing dictate the words you define instead of relying on the dictionary you built up on REPL and things quickly start becoming easy and the code remains readable. It might seem like a lot of work and endlessly reinventing the wheel but if you start with a featureful Forth like gforth it does not take that much time or effort once you have the basics down; you can build complex applications with gforth as easily as you can build up a full Forth with that minimal Forth that fits in your boot sector.The main thing is learning that application development with Forth is top down design and bottom up writing. You break the application up into its major functions, break those into smaller functions, break those into words and then break those words into those simple sort words that everyone in the Forth community says you should write. Then you start writing code. I am just starting to get the hang of Forth and it is surprisingly quick and powerful once you start getting the sense of it.
- kunley53 days ago
 Do you have any references to a quick and powerful Forth examples of responding to a web request or manipulating text; you know, typical stuff we deal with every day
 - 7thaccount53 days ago
 <a href="https://8th-dev.com/words.html" rel="nofollow">https://8th-dev.com/words.html</a><a href="https://8th-dev.com/manual.html" rel="nofollow">https://8th-dev.com/manual.html</a>Not sure if that is the best example, but go to the "network" section and you can see plenty of examples of connection stuff. Also cool things in the map, graph, console, hardware, DB (database), and nuklear (GUI) sections.It is commercial though (albeit with a free tier iirc), so that may or may not be attractive for you if you wanted to see all source. For me, I just wanted to spend time playing around with a well polished ~forth that had all these things builtin, so fine for my more limited use cases. Coming from Python as my daily driver, I found it really easy to pickup the tooling and have fun building some super simple toy apps. The most default data structure is basically JSON, which is a pretty unconventional forth approach, but just clicked with me as I'm used to Python dictionaries. You might also be able to do all that with gForth, but not sure (referring to the ease of use from high level data structures).There was also a forth-like language written in C# that was open source I think and pretty cool. It might have been retroforth which is available in various formats and has been talked about on here many times. I think the source comes with the zip file, but haven't looked in years. I assume it has some utility libraries for doing normal things.
 - NooneAtAll353 days ago
 "Ch. 1 What is 8th?" - and nothing in the chapter explains what it issome CLI program
 - 7thaccount53 days ago
 All that is shown here:<a href="https://8th-dev.com/" rel="nofollow">https://8th-dev.com/</a>As for the manual's omission, I'm guessing it's a typo. I want to say it used to be there, but haven't looked in a long time.
 - kunley53 days ago
 Thank you.PS. Didn't realize the existence of 8th, looks cool
xelxebar54 days ago
Oh, cool! SmithForth[0] is how I originally learned about x86-64 microarchitecture. It's a Forth that bootstraps off hand-coded x86-64 opcodes. I decided to go the other direction and decompile the binary by hand. It really is a beautiful piece of code. Highly recommended reading.Also, you're excited by Forth and Lisp, you might like Forsp[1]. It uses a call-by-push-value evaluation strategy in a way that really makes the language feel like a true hybrid of the two. The implementation is also brilliantly simple.Anyway, thank you for the article. I lurk on the DuskOS mailing list and wish I could find out where the Forthers gather IRL so I could osmote more of their world into my own.[0]:<a href="https://dacvs.neocities.org/SF/" rel="nofollow">https://dacvs.neocities.org/SF/</a>[1]:<a href="https://xorvoid.com/forsp.html" rel="nofollow">https://xorvoid.com/forsp.html</a>
- binary13253 days ago
 I’ve started thinking of concatenative languages as “backwards lisps” in my head and so far it seems like a reasonable take.
ebiederm54 days ago
There is an aspect of the history of Forth and C I have been trying to wrap my head around.The early B compiler was reported to generate threaded code (like Forth). The threaded code was abandoned fairly early in the port to the PDP11 from the PDP7 as it was deemed to slow to write an operating system in.At which point unix and C lost a very interesting size optimization. With the net result that Forth was more portable and portable between machines circa 1970 and Unix had to wait until circa 1975 with UnixV6.I have been trying to go back through the history to see if I could understand why threaded code was deemed too slow. Today the most part of executables is code that is not run often and would probably benefit from a smaller representation (less memory and cache space making the system faster overall). So this is a practical question even today.I found a copy of unix for the PDP7. Reproduced from old printouts typed in. If I have read the assembly correctly the B compiler was not using an efficient form of threaded code at all.The PDP7 is an interesting machine. It's cells were 18 bits wide. The adress bus was 12bits wide. Which meant there was room for an opcode and a full address in every cell.As I read the B compiler it was using a form of token threading with everything packed into a single 18 bit cell. The basic operations of B were tokens and an if token encoded with a full address in the cell. Every token had to be decoded via a jump table, the address of the target code was then plugged into a jump instruction which was immediately run.Given the width of the cells, I wonder what the conclusions about performance of B would have been if subroutine threading or a similar technique using jmp instructions would have been.Does anyone know if Forth suffers measurably in inner loops from have to call words that perform basic operations?Is this where a Forth programmer would be accustomed to write the inner loop in assembly to avoid the performance penalty?
- jacquesm54 days ago
 I can probably shed some light on that. I've used Forth on 8 bit platforms (6502, 6809), 16 bit platforms (80286) and 32 bit platforms (68K), as well as assembly, and on the 16 and 32 bit platforms C. Putting these side-by-side and assuming roughly equivalent programmer competence levels at the time assembler would win out, C would get you to maybe half the speed of assembly on a good day and Forth was about 10x slower than that.Which was still incredibly fast for the day, given that Forth was compiled to an intermediary format with the Forth interpreter acting as a very primitive virtual machine. This interpretation step had considerable overhead, especially in inner loops with few instructions the overhead would be massive. For every one instruction doing actual work you'd have a whole slew of them assigned to bookkeeping and stack management. What in C would compile to a few machine instructions (which a competent assembly programmer of the time would be able to significantly improve upon) would result in endless calls to lower and lower levels.There were later Forth implementations that improved on this by compiling to native code but I never had access to those when I was still doing this.For a lark I wrote a Forth in C rather than bootstrapping it through assembly and it performed quite well, Forth is ridiculously easy to bring up, it is essentially a few afternoons work to go from zero to highway speeds on a brand new board that you have a compiler for. Which is one of the reasons it is still a favorite for initial board bring-up.One area where Forth usually beat out C by a comfortable margin was code size, Forth code tends to be extremely compact (and devoid of any luxury). On even the smallest micro controllers (8051 for instance, and later, MicroChip and such) you could get real work done in Forth.
 - ebiederm46 days ago
 Thanks. That helps put things into perspective.I looked and today Swift Forth from Forth Inc, has some version of a compiler that generates not horrible code so I expect the 10x slowdown compared to C is a thing of the past.
- rerdavies54 days ago
 It is no so much the parts of the code that run infrequently that contribute to poor performance, but the very tiny <1% of all code that does run frequently, and should be running completely in cache. So code size doesn't have an enormous impact on speed of execution.The overhead of threading seems pretty obvious: call and return instructions are expensive compared to the cost of the one equivalent instruction that would have been executed in a compiled implementation. And placing arguments on a stack means that all operands have to to be read from and written to memory, incurring additional ferocious overhead, whereas a compiler would enregister values, particularly in performance-critical code. Not unreasonable to expect that Forth code is going to run at least an order of magnitude slower than compiled code.
 - Someone53 days ago
 Some of those could be partially fixed in hardware. Examples:- returns can run in as good as zero time- when calling a word, the CPU could prefetch the cache line containing the next word to be called, on the assumption that it would be called soon (an assumption that would be correct for many “almost leaf” calls)- the top of the stack could be kept in register-speed memory.For an example, see <a href="https://users.ece.cmu.edu/~koopman/stack_computers/sec4_4.html" rel="nofollow">https://users.ece.cmu.edu/~koopman/stack_computers/sec4_4.ht...</a>:“The internal structure of the NC4016 is designed for single clock cycle instruction execution. All primitive operations except memory fetch, memory store, and long literal fetch execute in a single clock cycle. This requires many more on-chip interconnection paths than are present on the Canonical Stack Machine, but provides much better performance.[…]The NC4016 subroutine return bit allows combining a subroutine return with other instructions in a similar manner. This results in most subroutine exit instructions executing "for free" in combination with other instructions. An optimization that is performed by NC4016 compilers is tail-end recursion elimination. Tail-end recursion elimination involves replacing a subroutine call/subroutine exit instruction pair by an unconditional branch to the subroutine that would have been called.”(¿Almost?) all modern hardware is designed for other language paradigms, though, so these won’t help with hardware you can buy.
- kragen53 days ago
 > Does anyone know if Forth suffers measurably in inner loops from have to call words that perform basic operations?Yes, the slowdown is of the order of 8× for DTC, ITC, and bytecode ("token threading"). Eliminating the jump table reduces the overhead a bit, but it's still order 8×.The B compiler bundled a copy of the bytecode interpreter into each executable; that might have made it less appealing as a size optimization. For a big enough program it would still have won.Subroutine threading is really just compact native code, but it still suffers from typically about 4× overhead for basic operations like dup, @, +, or exit (the traditional name for the runtime effect of ;). The primitive operations these execute are typically one or two cycles on a RISC such as a Cortex-M4, while a subroutine call and return are two more cycles, often plus two to four cycles of pipeline bubble (if the processor doesn't have good enough branch prediction). Presumably on the PDP-7 a subroutine call would have needed an additional memory cycle to store the return address into memory and another one to fetch it, plus two more memory cycles to fetch the call and return instructions. (I'm not familiar with the -7's instruction set, so correct me if I'm wrong.)With respect to dup, though, commonly dup, drop, swap, and over represent operations that don't appear in optimized native code—they just tell the following operations which data to operate on, a purpose which is normally achieved by operand fields in native code. So the runtime overhead of stack-bytecode interpretation is a worse than it appears at first: each bytecode instruction takes time of the order of 4× or 8× the time of as a native instruction doing the same thing, but you have to run about twice as many bytecode instructions because about half of them are stack manipulation. So your total slowdown is maybe 8× or 16×.You may also be interested in looking at the program dc, which IIRC was one of the programs Unix was originally written to run. It's a stack bytecode designed to be written by hand, like HP desk calculators of the time but with arbitrary precision.
chuckadams53 days ago
I'm familiar enough with Forth, picked it up in the 80's with Blazin' Forth on a C64 with Leo Brodie's book at my side. Played a little with the boot environment on some Sun boxes, made some simple contributions to another Forth project on x86 that I can't remember the name of. I never got the wide-eyed wonder some people seem to have about it though, let alone saw it as a silver bullet. The tax bracket code at the bottom of the article is a pretty good illustration of why: it's a really elegant sort of macro assembler, but I'm not really interested in writing whole apps in macro assembly.
aperrien54 days ago
What I like about Forth is that it can be expressed at the lowest level of computation, and that it can be used to bridge that to the highest level of computation. For example, Forth only requires about 12 opcodes to run, which can be implemented in a few dozen chips. But now that you have that, since it's Turing-complete, you can now pull across a lisp or C compiler, and build a working operating system from there. Granted, that would be a lot of work, but it's relatively straightforward work, and that's always impressed me.
- kragen53 days ago
 As anthk points out at <a href="https://news.ycombinator.com/item?id=46272791">https://news.ycombinator.com/item?id=46272791</a>, you really only need one opcode for any program...
vdupras54 days ago
I like the spirit of this article, but I find it strange that they open their article by quoting me, but then don't include Dusk OS's C compiler in the list.Fairly counting SLOC is a tricky problem, but my count is 1119 lines of code for the C compiler (written in Forth of course), that's less than 8x the count of chibicc, described as the smallest.
- fallat54 days ago
 (typing from phone) i simply had not known. it is a great example of a short path to a c dialect then! merci pour votre travaille au dusk et collapse! i will add a paragraph about it to the essay. this took me many days to write and many revisions and still i see it isnt perfect!note the point of that section was really that anyone using gcc or clang should ack. the real cost when using them.
- kragen54 days ago
 That's a good point, thanks. How complete is it?
 - vdupras54 days ago
 By design, it's not a fully compliant ANSI C compiler, so it's never going to be complete, but it's complete enough to, for example, successfully compile Plan 9's driver for the Raspberry Pi USB controller with a minimal porting effort.So, Dusk's compiler is not apple-to-apple comparable to the other, but comparable enough to give a ballpark idea that its code density compares very, very favorably.
 - kragen54 days ago
 It can be hard to tell how much extra complexity would be introduced by unimplemented features.
 - vdupras54 days ago
 Indeed, and it might be why the author didn't try, but I still find it odd to not at least mention how small this C compiler is, even if it is to say that it's not apples-to-apples comparable. I mean, 8x smaller than the smallest C compiler listed is still something notable...
 kragen54 days ago
 It certainly sounds like a perfectly usable C dialect.
 - entaloneralie54 days ago
 It's complete in sofar as being capable of compiling C programs, but it has a few quirks.<a href="https://git.sr.ht/~vdupras/duskos/tree/master/item/fs/doc/comp/c.txt#L53" rel="nofollow">https://git.sr.ht/~vdupras/duskos/tree/master/item/fs/doc/co...</a>
 - kragen54 days ago
 Possibly this is why it wasn't mentioned? There are enough differences in that list that I can't imagine any existing C library would compile unchanged.
 - entaloneralie54 days ago
 After using cc<< for non-trivial programs, it's about as quirky as the Plan 9 C compiler, the lack of multi-dimensional arrays is the one thing that trips me up the most with cc<<
 - NooneAtAll354 days ago
 > The "&&", "||" and "?:" operators do shortcutting.is shortcutting different from short circuiting?
 - vdupras53 days ago
 It's the same thing, I used the wrong term.
 - norir54 days ago
 From what I can tell chibicc, unlike tcc, is not a complete c compiler in and of itself. Looking at its source code, it relies upon external tools for both x86_64 code gen and linking: <a href="https://github.com/rui314/chibicc/blob/90d1f7f199cc55b13c7fdb5839d1409806633fdb/main.c#L570-L573" rel="nofollow">https://github.com/rui314/chibicc/blob/90d1f7f199cc55b13c7fd...</a>
 - fuhsnn54 days ago
 It does rely on binutils, but by this standard GCC is not a complete C compiler either.
 - kragen53 days ago
 Relying on binutils and an assembler is fine. I don't think it really affects the internal complexity of the compiler much, but having textual assembly to look at can be handy for debugging the compiler, so it might reduce the human effort to get it working.
- entaloneralie54 days ago
 Came here to say exactly that, cc<< blows these numbers out of the water. Strange choice from the author.
forgotpwd1653 days ago
>The project's README then proceeds into lengthy detail about how to implement >Forth in C.Seems people have/are more fun/interested implementing Forth interpreters/compilers than actually using Forth. Same with CHIP-8. It's all about making emulators.
- DonHopkins53 days ago
 Then you can have fun implementing Forth in Forth, with a metacompiler that compiles itself!Here's some of Mitch Bradley's beautiful code from OpenFirmware, his Forth kernel meta-compiler written in Forth, which supports 8, 16, 32, an 64 bit, big-endian and little-endian architectures, as well as direct, indirect, and token threaded code, with or without headers, etc:kernel.fth: <a href="https://github.com/MitchBradley/openfirmware/blob/master/forth/kernel/kernel.fth" rel="nofollow">https://github.com/MitchBradley/openfirmware/blob/master/for...</a>metacompile.fth: <a href="https://github.com/MitchBradley/openfirmware/blob/master/forth/kernel/metacompile.fth" rel="nofollow">https://github.com/MitchBradley/openfirmware/blob/master/for...</a>The OpenFirmware kernel is a Forth meta-compiler, which can compile itself on any architecture, and also cross-compile for different target architectures.It has cross-architecture extensions to FORTH (like \16 \32 comments and /n /n* generically typed words) that make it possible to write platform, word size, byte order, and threading independent code, and compile images (stripped or with headers) for embedded systems (like the OLPC boot ROMs) and new CPU architectures (like Sun's transition from 68K to SPARC), and share code with a more powerful development environments.
 - larsbrinkhoff53 days ago
 In my mind, taking your toy Forth from implemented in C, assembler, or what have you, to metacompiled is transformative. I struggled at first, making a few abortive attempts. But when I finally did it, it was a revelation.
throwaway8152354 days ago
Article is lame in multiple ways, and also eForth was written by Bill Muench. Dr Ting adopted Muench's version to use assembly language bootstrapping instead of metacompilation. Bootstrapping is possibly easier for beginners to understand, but metacompilation is part of Forth's fiendish cleverness and it's a shame for an aficionado to miss out on it.
- fallat53 days ago
 Oh, I'll have to correct this! I've only seen eForth mentioned with Dr Ting's name all over. Thank you.The metacompilation part is really nice. Did the self-modification section of the essay not convey that to you? Because that's what it was :s I'll have to revise it.I really want this essay to be definitive, so even after 4 revisions there is still some way to go. All the comments have been extremely helpful to further reach that goal :)
 - throwaway8152353 days ago
 You should also look at how cmForth's metacompilation worked. It's even more fiendish, but relied on cmForth's separate interpreter and compiler dictionaries, which apparently was annoying to use. <a href="https://dl.acm.org/doi/10.1145/382125.382916" rel="nofollow">https://dl.acm.org/doi/10.1145/382125.382916</a>I didn't notice anything in the article that made me think of metacompilation, but maybe I missed it and should re-read.Added: you should also look at Bill Muench's version of eForth including its metacompiler. I'm not that big a fan of eForth for practical use (it's TOO minimal) but wow it is simple.
 - kragen53 days ago
 Dr. Ting did a lot of wonderful expository writing about different Forths, including eForth, but yes, the eForth Model is by Bill Muench.
kragen54 days ago
This is a pretty good explanation. I think it maybe undersells the importance of the REPL a bit. (I'm not a Forth expert, but I did write StoneKnifeForth, a self-compiling compiler in a Forth subset, and I've frequently complained about the quality of Forth explainers.)
- ofalkaed54 days ago
 >I think it maybe undersells the importance of the REPL a bit.Howo? Or would you agree that value is perhaps a more suitable word than importance? For me I think these articles have such a tendency to fixate on the strengths of Forth to the extent that they have reduced Forth to those strengths in the eyes of many. TFA does a fair job of avoiding this and shows Forth more as a powerful and flexible general purpose language than a very niche language, but I think it still focuses a bit much on the strengths at cost of the general.You are a Forth expert compared to me and probably most of HN, so try and keep that in mind with your response if you could.Edit: I am probably asking for insight into your workflow with Forth, but maybe not?
 - kragen53 days ago
 You can see my workflow with Forth in <a href="https://asciinema.org/a/621404" rel="nofollow">https://asciinema.org/a/621404</a>, which should help reinforce my point that I'm not an expert.What I mean is that Forth as a programming language is kind of... not great? Like, it's kind of hard to read and hard to write.For years I thought this might be just a question of familiarity, but not I'm resigned to the fact that I will probably not learn to read Forth as easily as I can read conventional infix syntax within my natural lifetime. I wrote my first RPN programs on an HP-38E calculator in about 01985, I wrote an RPL program to search my address book on my HP-48GX in the 90s, I wrote a parametric CAD system for laser cutting in PostScript, I wrote a quasi-Forth compiler that compiles itself to machine code, and I think I just have to give up on being able to read<pre><code> o ->s dup * o ->c dup * + sqrt 400 / amp ! </code></pre> as easily as<pre><code> long amp = sqrt(o->s*o->s + o->c*o->c) / 400.0; </code></pre> It might still be a problem of familiarity, rather than some kind of objective truth, but it's one I'm going to have to live with. (I'm absolutely sure that the reason I have to sound out Greek words letter by letter instead of reading them instantly the way I do in English, Spanish, French, or Portuguese is 100% a question of familiarity; it's completely implausible that Greek is objectively harder to read. I just have the Latin alphabet wired into my brain by decades of constant practice.)There's a familiarity problem I flatter myself to think is separate, with Forth's vocabulary; for example, within takes its arguments in the order x min max, and because I've programmed much less in Forth than in other languages, I always have to look things like that up, whereas I know the order of arguments to read() or strcat() without having to do so.It's not a huge difference from C; Forth has more metaprogramming and reflection power than C, but the syntax is less readable, and it's more error-prone in a variety of ways (parameter passing, recursion, types). Presumably infallible programmers would prefer Forth to C, since those weaknesses would not affect them, and they'd have less code to write. I am far from an infallible programmer.But a lot of those weaknesses are because Forth is designed as a single language for the whole system: assembler, high-level programming, editor commands, debugger, "shell" commands, the whole works. So you have things like ? which is simply defined as : ? @ . ; and seems kind of goofy from a programming-language perspective—why would you dedicate a precious single-character word to printing out the value of a memory location? How often do you want to do that in the middle of your program? Why not just write @ . instead? Wouldn't ? be more valuable in a switch/case statement or something?However, in the context where Forth grew up, the sibling of DTSS BASIC and DEBUG.COM and DDT, it makes perfect sense; if you've just tested a word (subroutine) that is supposed to change the value of a variable x, you want to be able to say x ? rather than typing out the whole x @ . phrase. It sounds trivial, but it's actually really important, especially if you can't touch-type, as most programmers couldn't at the time. BASIC did the same thing for the same reason: instead of typing print x or even printx you could type ?x to see the value of x.Similarly, the lack of syntax is important for things like editor interaction, or interactively poking at hardware registers, or whatever. As Yosef Kreinin wrote in <a href="https://yosefk.com/blog/i-cant-believe-im-praising-tcl.html" rel="nofollow">https://yosefk.com/blog/i-cant-believe-im-praising-tcl.html</a>:> The small overhead [of extra punctuation] is tolerable, though sucky, when you program, because you write the piece of code once and while you're doing it, you're concentrating on the task and its specifics, like the language syntax. When you're interacting with a command shell though, it's a big deal. You're not writing a program – you're looking at files, or solving equations, or single-stepping a processor. I have a bug, I'm frigging anxious, I gotta GO GO GO as fast as I can to find out what it is already, and you think now is the time to type parens, commas and quotation marks?! Fuck you! By which I mean to say, short code is important, short commands are a must.So, Forth is designed so that you can use it as a command language and a high-level programming language and an assembly language. It's like Robert A. Heinlein's ideal unspecialized Renaissance-man language: it can change a diaper, plan an invasion, butcher a hog, program a computer, etc. This (necessarily in my view) involves some compromises—the best possible result will often be worse as a high-level programming language than a language that's designed for just that, and worse as a command language than a language that's designed for just that, and maybe worse as an assembly language too.You can make a convincing argument for the general case of this with a 2×2 matrix of candidate language design features:<pre><code> ╭────────────┬──────────────────┬──────────────────╮ │ │ good for │ bad for │ │ │ command language │ command language │ ├────────────┼──────────────────┼──────────────────┤ │ good for │ │ │ │ high-level │ 0 │ 1 │ │ language │ │ │ ├────────────┼──────────────────┼──────────────────┤ │ bad for │ │ │ │ high-level │ 2 │ 3 │ │ language │ │ │ ╰────────────┴──────────────────┴──────────────────╯ </code></pre> The argument is simply that the set of candidate language design features that go in boxes 1 and 2 is not exactly the empty set. It would be an astounding coincidence if it were, wouldn't it? And every time you add a feature from box 1 to your language, you make it better as a programming language and worse as a command language, and vice versa for box 2. Omitting a feature from box 1 makes your language better as a command language and worse as a programming language, and vice versa for box 2.The more difficult argument to make is that the compromises are substantial. A skeptic might wonder whether the only compromises are trivial things like the ? I mentioned above. I think it's an argument Yossi has made well in the post I linked above, which has nothing specifically to do with Forth. Also, though, I think that a lot of Forth's design decisions that are unorthodox for programming languages, such as its lack of typing, its lack of syntax, and its lack of stack frames with local variables, are easily understood as accommodations for interactive use, and I think that they do in fact make it substantially worse as a programming language. This is highly debatable, and debated, but it is my current point of view.In my view, the REPL somewhat makes up for Forth's weaknesses as a programming language in two ways: first, by allowing you to interactively test your code as you write it, and second, by freeing you from having to write user interface code that does things like parse command lines.There are a few different ways that Forth encourages writing your code as a ravioli-code soup of tiny one-line definitions. Single-line definitions are easier to test interactively, and statically allocating your local variables allows you to share them between multiple definitions, which reduces the required parameter passing (the abstraction penalty). Implicit parameter passing also reduces the syntactic abstraction penalty of subroutine calls. And, barring inlining compiler optimizations, Forth is faster at calling subroutines than any other language (arguably except for other Forth-like things like FOCAL), reducing the abstraction penalty at runtime as well.This is both good and bad. Ravioli code is more flexible, because you can call existing definitions in new contexts, but harder to understand, because the definition you're editing might be called from a context you aren't seeing. If you were infallible, this greater composability would enable you to bootstrap from nothing to whatever application you wanted to build with less total code. This makes Forth's drawbacks less serious for throwaway code (which doesn't need to be understood or maintained) and for infallible programmers.Independent of any of this, the REPL is a huge advantage if you're exploring an unknown hardware platform that might be buggy. You probably need one, whether Forth or something else.So, that's why I think the REPL is very important for understanding Forth—both its virtues and its vices. It is valuable, but it also makes UX demands on other parts of the language which makes them worse in other ways.
 - kragen53 days ago
 I do have one objective thing to say about readability. In a pop infix language like C, Python, Lua, or JS, the expression<pre><code> e(d(), c(b, a())) </code></pre> has fairly clear dataflow: data flows from a and b to c, and from c and d to e. This is knowable even without any previous knowledge of those five identifiers. The RPN version, in languages like Forth, PostScript, and Factor<pre><code> a b c d e </code></pre> can just as well correspond to any of these dataflow patterns:<pre><code> a(); b(); c(); d(); e(); //none b(a); d(c); e(); e(d(c, b()), a); {a, b, c, d, e} // all going somewhere else together </code></pre> And many others. You don't know if a or b is consuming something left on the stack from before, either.On this basis I think it's at least somewhat defensible to claim that stack languages are "less readable": information about the dataflow graph which is easily available in the infix syntax is not present, at least locally. You can reconstruct it by knowing, or guessing, the stack effect of each word. But that's different from just having it plainly written down.As a result, in Forth and PostScript, I regularly have bugs where I pass a parameter to, or receive a result from, the wrong place. This is not a major practical problem (it's usually pretty easy to figure out in the REPL) but it serves as evidence that stack languages really do require more effort to read and understand than pop infix languages.Of course, you can make almost exactly the same argument that explicit typing helps readability, and implicit variable capture by closures hurts it. I think there's some merit in that, actually.
 - ofalkaed52 days ago
 Forth words are not functions and if you try and use them like functions, things will get messy. You should never have things like "a b c d e" in Forth, you should have "abcde," a single word with a descriptive name built from a, b, c, d, and e, and designed so all you have to worry about is the stack effect of "abcde" and not the words it is built on or the data flow. I would say this is like saying C is terrible because people do ridiculous things with macros.I will give your posts a reread with fresh eyes tomorrow and probably have more to say, it is a bit too much to digest at this hour.
 kragen52 days ago
 Forth "words" are subroutines. C "functions" are also subroutines. Forth "words" are "not functions" merely because the use of "function" for "subroutine" is terminology specific to certain families of languages (C, Lisp) to which Forth does not belong. This is all irrelevant because I didn't say anything about "functions", just dataflow. My example expression is just as valid in Perl (which calls subroutines "subroutines") or Ruby (which calls them "methods").A colon definition containing five sequential calls to different words without any control flow or stack manipulation is perfectly unremarkable. Here's a sample word from the F83 block editor, which I am using because F83 is generally accepted as highly competent Forth code, if not exemplary:<pre><code> n NEW moves the terminal's cursor to the start of line n, and overwrites lines until a line is begun with null input ( a Carraige Return). : NEW (S n -- ) L/SCR SWAP DO [ FORTH ] I [ EDITOR ] T EDIT-AT >IN OFF QUERY SPAN @ IF P ELSE [ FORTH ] I REDISPLAY LEAVE THEN .SCREEN LOOP .SCREEN ; </code></pre> The phrase t edit-at >in off query span is just such a sequence, six words long rather than 5. The word off here is a standard word that sets a memory location to 0, and >in is the standard input-pointer variable, although you can't really be sure of that without more context—note, for example, that the editor vocabulary has redefined i as an insertion command rather than the usual loop-counter definition, thus the vocabulary switching commands.The dataflow in that six-word code sequence is, in C syntax,<pre><code> t(i); edit_at(); off(&in); query(); &span ... </code></pre> But to figure that out, I had to look up the stack comments of t, edit-at, query, and span (which is just a variable), and know the stack effects of >in and off. (In the Forth-83 standard <a href="https://www.complang.tuwien.ac.at/forth/fth83std/FORTH83.TXT" rel="nofollow">https://www.complang.tuwien.ac.at/forth/fth83std/FORTH83.TXT</a> query was the usual way to do what we do nowadays with accept—read a line of input, implicitly into tib.)With slightly different definitions, this could easily have been, for example,<pre><code> edit_at(t(i)); off(&in); span(query()) ... </code></pre> Reconstructing the dataflow thus is not some kind of insuperable difficulty. It took me a while in this example, but were I more expert with Forth, and in particular Forth-83, I probably could have figured it out relatively quickly. If it were taking too much time, or if the stack comments were wrong, I could have figured it out interactively at the REPL, or single-stepping through the code in F83's debugger.My point is just that reconstructing the dataflow is a problem you have to solve when you are reading Forth code. The author knew what the dataflow was, presuming it's working code you're looking at. An efficient compiler would have to know, too. But, as the reader, you don't know until you reconstruct it mentally with your global knowledge of the things it's calling. It's an extra error-prone decoding step between the code you're looking at and the semantic understanding you seek.(I said "without stack manipulation", but stack manipulation generally makes the problem more mentally challenging, not less.)By contrast, with pop infix syntax, the dataflow is represented locally with the parentheses and commas.The same thing is true of types in languages with implicit typing (like OCaml and JS), of when call/return pairs follow a conventional stack discipline in Scheme (where the possibility of call/cc exists), of variable capture and escape in languages with closures, of which variables are mutated in languages like Python where all variables are mutable (and to a slightly smaller extent in languages like C, where immutability is possible but not default). The author has this information; the maintainer needs it to successfully modify the program; the compiler would need it to avoid producing grievously inefficient code; and there's no way to express it in the language.My claim is that, while the optimum may be subjective on this spectrum between implicitness and explicitness, the implicitness itself is objective. You claim to disagree, but to me it sounds like you disagreed simply because you didn't understand what I was trying to express.
 ofalkaed51 days ago
 Within the context of the four languages you listed and your example, I don't think Forth word vs function is merely terminology, it tells us a great deal of useful and important information. You clearly were able to grasp that but you worked under the assumption that I did not understand. There is nothing in Forth that requires you to make ambiguous constructs like that, you will come across them but like most every language (if not all), we sometimes trade readability for some other gain, or out of habit, ignorance, laziness, etc.
optimalsolver54 days ago
I always recommend the book Starting Forth [0].It has the most charming illustrations I've ever seen in a text book.[0] <a href="https://www.forth.com/starting-forth/" rel="nofollow">https://www.forth.com/starting-forth/</a>
kscarlet51 days ago
Well written in general. However:> C++ has method override but it's not the same: you cannot change the behavior of how addition works on two 64-bit integers (such as treating them both as fixed-point numbers).Wouldn't you just create a 1-field struct/class and override all the arithmetic operators? Or if you're less fixated about using the same operator (like me as a Lisper), invent a method called ADD and use that.> Changing addition to work on "bignum"s (numbers that have arbitrarily large precision) is a good usecase of overriding the addition operation.I don't see this as something unique to Forth compared to other languages, even C++.
larsbrinkhoff53 days ago
I emailed this to Lee. I guess it can go here too.---I have been fortunate to have worked professionally with Forth recently. It was so fun! But I still struggle to point out exactly why I like Forth, and why and how it's different. Your essay is fresh take, which is good.To me, maybe the most important lessons are.1. Eschew complexity (sometimes to a fault), and 2. Improve the code by redefining the problem. Look at things from another angle. (I hate to say it, but think out of the box.)Much of Forth falls out from these principles. E.g. people are quick to point out Forth is a stack based programming language. Which is true enough, but to me it's kind of beside the point. The point is the language does away with local variables (redefine the problem) to lay the ground for a much simpler implementation (eschew complexity).Yes, there's REPL. But why? Because Forth is (or can be) a programming language, operating system, compiler, and command line rolled into one. Heaps of layers and components done away with.File system, virtual system, code structure, documentation? Blocks!The list goes on. Once you internalize this, the veil falls from your eyes, and you see how much needless complexity stands in your way in most other languages, operating system, tools, apps, ... it's everywhere.
bvrmn54 days ago
It's hard to see practical strengths, especially with provided code examples. Most of tax code is stack tossing hiding core logic.Code as structure could be more conveniently expressed as language data structures as structure nowdays.
anthk53 days ago
EForth running under Subleq:<a href="https://howerj.github.io/subleq.htm" rel="nofollow">https://howerj.github.io/subleq.htm</a><a href="https://howerj.github.io/subleq.htm" rel="nofollow">https://howerj.github.io/subleq.htm</a>The same, but multiplexing instructions (it runs much faster):<a href="https://howerj.github.io/muxleq.htm" rel="nofollow">https://howerj.github.io/muxleq.htm</a>
- MaxBarraclough53 days ago
 The last link is broken. GitHub repo: <a href="https://github.com/howerj/muxleq" rel="nofollow">https://github.com/howerj/muxleq</a>
 - anthk53 days ago
 Thanks.
Joker_vD54 days ago
What... what are those "development effort estimates"? They seem to assume an average rate of approximately 11 lines of code written per day which seems a bit too low if you ask me.Besides, once a C compiler is written for one platform, porting it to another one takes significantly less time than writing from scratch (especially if the compiler is written with portability in mind).
- danparsonson53 days ago
 > They seem to assume an average rate of approximately 11 lines of code written per day which seems a bit too low if you ask me.I didn't calculate it but if you're just dividing one number by the other then you're assuming the final code arrived fully-formed in one go - don't forget about refactoring, debugging, testing, etc. etc.
- ErroneousBosh54 days ago
 > Besides, once a C compiler is written for one platform, porting it to another one takes significantly less time than writing from scratch (especially if the compiler is written with portability in mind).Why would you think that's different for any other language, like oh for example Forth?
 - Joker_vD54 days ago
 Why do you think I think so? I don't. Those effort estimates don't make the slightest sense.
- kragen53 days ago
 Yes, 11 lines of debugged, delivered source lines of code written per day is within the normal range. The basic COCOMO model used by David A. Wheeler's "SLOCCount" estimates person-months as 2.4 * (KSLOC**1.05), which was calibrated with painfully-won experience on a variety of software projects in the 80s. If you approximate that as 2.4 months per kSLOC, it works out to 13.7 lines of code per calendar day, or 18.5 lines of code per workday. Other studies found lower speeds. Like 10: <a href="https://softwareengineering.stackexchange.com/questions/450695/where-did-the-quote-or-study-of-developers-write-10-lines-of-code-per-day-come-f" rel="nofollow">https://softwareengineering.stackexchange.com/questions/4506...</a>Every programmer thinks this is stupid the first time they see it, because we can all remember times we wrote 50 or 100 or 200 or 300 lines of code in a day, and that code worked. But other days we write zero lines of code, because we spend them in meetings, or debugging code, or arguing about the protocol design, or testing hardware we need to support, or testing our code. And sometimes we'll spend a week refactoring the codebase to reduce the number of lines of code (<a href="https://www.folklore.org/Negative_2000_Lines_Of_Code.html" rel="nofollow">https://www.folklore.org/Negative_2000_Lines_Of_Code.html</a>) so those 200 or 300 lines of code may not end up in the delivered product—if they were ever intended to, since test stubs and test scripts also don't count as delivered code.And that's how we get to Peter Naur's "Programming as Theory Building": <a href="https://ratfactor.com/papers/naur1" rel="nofollow">https://ratfactor.com/papers/naur1</a> because writing the code clearly isn't the bottleneck. It's figuring out how things have to work that bottlenecks us. It's more like proving a theorem than writing a novel. The code is a product of the process, but it's more like the exhaust of an engine than its power output.And that's why the exponent of kSLOC is 1.05 instead of 1: bigger systems are harder to add to.We had a discussion of this here 17 years ago <a href="https://news.ycombinator.com/item?id=333650">https://news.ycombinator.com/item?id=333650</a> and maybe see also Forth authority Phil Koopman's take at <a href="https://betterembsw.blogspot.com/2010/05/only-10-lines-of-code-per-day-really.html" rel="nofollow">https://betterembsw.blogspot.com/2010/05/only-10-lines-of-co...</a> and Brian's take at <a href="https://blog.ndepend.com/mythical-man-month-10-lines-per-developer-day/" rel="nofollow">https://blog.ndepend.com/mythical-man-month-10-lines-per-dev...</a>.
Surac54 days ago
I often use a heavily forth inspired script language in my bigger c# projects. I have a hidden repl and can input scripts. I like how easy it is to produce results with such low vocabulary. Also there is no expression parsing
- graboid54 days ago
  Sounds cool, as someone interested in concatenative languages and also a user of C#, might I ask if you have a link?
jburgy53 days ago
Lovely article, which I took the liberty to share on the Forthish discord [1][1] <a href="https://discord.gg/9DveEJ42" rel="nofollow">https://discord.gg/9DveEJ42</a>
user____name53 days ago
The bootstrapping part ignores all the other stuff Forth leaves out, like all the analysis and optimization a C compiler does.
pmarreck53 days ago
Shouldn't Forth (when directly translated to either ARM or x86_64 assembly) be faster than compiled C?
- kragen53 days ago
  Depends on your compiler.