16 comments

  • kazinator 215 days ago
    *Homoiconic* has a pretty clear definition. It was coined many decades ago in reference to the property of a specific system. That system stored program definitions in the same form that the programmer entered them in (either just the original character-level text, or some tokenized version of it), allowing the definitions to be recalled at runtime and redefined. He turned "same form" into "homoiconic" with the help of Greek/Latin. It's all in Wikipedia.

    Line-numbered BASIC is homoiconic: you can edit any line of code and continue the program.

    POSIX shell lets functions be redefined. They can be listed with the *set* command executed without arguments, and copy-pasted.

    In Common Lisp, there is a function called *ed*, support for which is implementation-defined. If support is available, it is supposed to bring up an editor of some kind to allow a function definition to be edited. That is squarely a homoiconic feature.

    Without *ed* support or anything like it, the implementation does not retain definitions in a way that can be edited; i.e., it is not homoiconic. Some Lisps compile everything entered into them; you cannot edit a *defun* because it has been turned into machine language.
    • gwd 215 days ago
      > Line-numbered BASIC is homoiconic: you can edit any line of code and continue the program.

      Oh man, anyone else remember those self-modifying BASIC programs, which would:

      1. Clear the screen
      2. Print a bunch of new BASIC lines on the screen, with a CONTINUE command at the end, thus:

             100 PRINT $NEWVAR
             110 <whatever>
             CONTINUE

      3. Position the cursor at the top of the screen
      4. Enable some weird mode where "Enter" was considered to be pressed over and over again
      5. Execute the BREAK command, so that the interpreter would then read the lines just printed?

      I forget the kinds of programs that used this technique, but thinking back now as a professional developer, it seems pretty wild...
      • gwd 215 days ago
        Decided to look it up. Ah, memories:

        https://www.atariarchives.org/creativeatari/SelfModifying_Programs.php
      • tolciho 214 days ago
        You can (sort of) do this with a shell script combined with another process that seeks the (shared) file descriptor somewhere else in the file, as the shell is very line oriented. Not very well; it requires that the shell script block or sleep while the other process fiddles with the seek position.
        • gwd 214 days ago
          In bash you have "eval", which I've used for some monstrosities in the past; but at least it doesn't need to be visible to the user as you're doing it!
      • eep_social 215 days ago
        Your description makes me think of quines? https://en.m.wikipedia.org/wiki/Quine_(computing)
    • thaumasiotes 215 days ago
      > He turned "same form" into "homoiconic" with the help of Greek/Latin.

      Well, sort of. Mostly that's just English.

      There's no Latin at all, but *hom-* [same] and *icon* [image] are arguably Greek roots. The Latin equivalents would be *eadem* [same, as in "idempotent"] and *imago* [image, and the feminine gender of this word explains why we need "eadem" and not "idem"]. I'm not sure how you'd connect those. (And you might have issues turning *imago* into an adjective, since the obvious choice would be *imaginary*.)

      However, since *icon* begins with a vowel, I don't think it's possible for *hom-* to take the epenthetic *-o-* that appears when you're connecting two Greek roots that don't have an obvious way to connect. If the word was constructed based on Greek principles, it would be *hom(e)iconic*. Treating *homo-* as a prefix that automatically includes a final O is a sign of English; in Greek they're separate things.

      I remember that when there was a scandal around cum-ex financial instruments, a lot of people wanted to say that cum-ex was Latin for "with-without", which it isn't; it's Latin for "with-from". ("Without" in Latin is *sine*; compare French *sans* or Spanish *sin*.) Cum-ex is *English* for "with-without", and the same kind of thing is going on with *homoiconic*.
      • redbar0n 211 days ago
        > He turned "same form" into "homoiconic" with the help of Greek/Latin.

        «Same form» in Greek would rather be «homomorphic». (Or in Latin «eademforma», which could maybe be turned into «idoform» in English.)

        «Homoiconic» could also have been named «monomorphic» (single form), similar to «polymorphic» (many forms).

        «Homoiconic» in Greek means «same-likeness» or «self-similarity» in English.
      • Y_Y 215 days ago
        I'd like to offer some additional amateur translation options for "homoiconic" into Latin. There's already a decent word "conformis", which has the close English counterpart "conformal", but if we're inventing new words, I'd propose "coninstar", as in "con-" meaning "together in/sharing" and "instar" being "representation/form".
        • thaumasiotes 215 days ago
          *Con-* before vowels is *co-*; compare *cohabit*, *coincide*.

          (Technically, you wouldn't expect an N before vowels anyway, because the root word ends in an M, so hypothetically you'd have "cominstar". But since the consonant just disappears before vowels, that's moot. [Though technically technically, disappearing before vowels is expected of M - this is a feature of Latin pronunciation generally - and not of N.])
          • Y_Y 215 days ago
            I'll plead ignorance here, and ask for clemency on the grounds that modern coinages like "conurbation" may be exempt, and also that there seem to be notable exceptions to this rule, like this example I've thrown together[0]:

            "con" + "iacio" (also "jacio") => "conicio" (also "coicio", also "conjicio")

            (Also, "Coinstar" is a trademark of those spare-change gobblers you find after the register at Walmart.)

            [0] https://www.perseus.tufts.edu/hopper/text?doc=Perseus:text:1999.04.0059:entry=conicio
            • thaumasiotes 215 days ago
              > also "jacio"

              It'd be a better example of an exception if it unambiguously started with a vowel. This is sort of the reverse of the case I pointed to above, where "habito" *does* start with a vowel, or rather it almost does, enough to trigger the same changes.

              https://www.etymonline.com/word/com-

              > Before vowels and aspirates, it is reduced to co-; before -g-, it is assimilated to cog- or con-; before -l-, assimilated to col-; before -r-, assimilated to cor-; before -c-, -d-, -j-, -n-, -q-, -s-, -t-, and -v-, it is assimilated to con-, which was so frequent that it often was used as the normal form.

              I and J aren't different letters in Latin, but they are different kinds of sound, if sometimes only hazily different. The same goes for U and V. By modern convention we have *convention* and *conjecture*; the hazy difference seems sufficient to explain why the Romans left us every variety of the compound, from *coniicio* through *conicio* to *coicio*. A naive analysis (the most I can really do) would say that *coniicio* comes from someone who sees *iacio* as starting with a consonant, *coicio* comes from someone who doesn't, and *conicio* is a reduced form of *coniicio*.
            • lupire 215 days ago
              And Google's etymology feature says that *con-* and *-ation* are English, while *-urb-* is Latin.

              https://www.google.com/search?q=conurbation
              • thaumasiotes 214 days ago
                That's not the most objective decision in the world. If we're describing "conurbation" in specific, why not call *-urb-* English too, taken from the common English word *urban*? *Urban* ultimately draws from Latin *urbs*, but so does *con-* draw from *com-* and *-ation* draw from *-io(n)*.

                (In Latin, there are plenty of words ending in *-atio(n)*; however, within the language this is not a single unit, it's a sequence of part of the verb stem plus two separate morphemes, *-a-t-io(n)*. The *-at-* marks the passive participial form of an a-stem verb; compare *faction* (zero-stem), *inhibition* (e-stem).)
    • ValentinA23 215 days ago
      > stored program definitions in the same form that the programmer entered them in

      > allowing the definitions to be recalled at runtime and redefined

      > Some Lisps compile everything entered into them; you cannot edit a defun because it has been turned into machine language.

      The ability to recall and redefine definitions at runtime, even when the language is compiled, is orthogonal to homoiconicity. Ruby can do this (interpreted). Clojure too (compiled). To do so, they don't store the program as text; they store source locations (file://...:line:col) and read the files from disk (or from a jar). In fact, any programming language that does source mapping and has eval() is inches away from being able to do this. This was the case for Ruby, where it was made possible by the pry REPL library [1]. And then there are tools like Javassist [2] that allow you to edit compiled code to some extent, using a limited form of the language.

      Note that in the case of Lisps, this is entirely orthogonal to macros (the source is passed as arguments to macros in the form of an AST/list rather than a pointer into a file), which is where homoiconicity shines. Storing code in the same format it is written in (strings) doesn't alleviate the headache of processing it when you want to do metaprogramming.

      Additionally, macros allow you to do structured metaprogramming: macros are guaranteed to only impact code they enclose. Compare this with redefinitions, which are visible to the whole code base. This is like global vs. local variables: macros don't redefine code, they transform it.

      [1] https://github.com/pry/pry#edit-methods

      [2] https://www.javassist.org/tutorial/tutorial2.html#before
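[Editor's note: the source-map-plus-eval mechanism described above can be sketched in a few lines of Python. This is a hypothetical toy - the `define` helper and `source_map` table are invented names, and real tools like pry track file:line locations on disk rather than raw strings - but it shows how a runtime that can recall a definition's text can redefine it with no Lisp-style homoiconicity at all.]

```python
# Toy "source map": the runtime remembers the text of each definition,
# standing in for the file://...:line:col locations a real REPL records.
source_map = {}

def define(src):
    """Evaluate a definition and remember where its text lives."""
    exec(src, globals())
    name = src.split("(")[0].split()[-1]   # crude: extract the function name
    source_map[name] = src

define("def greet():\n    return 'hello'")
print(greet())                             # hello

# Recall the definition as text, edit it, and re-evaluate: runtime
# redefinition, with the program stored only as a string the whole time.
edited = source_map["greet"].replace("'hello'", "'goodbye'")
define(edited)
print(greet())                             # goodbye
```

The point survives the toy: nothing here manipulates code as a structured value, yet recall-and-redefine works fine.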
      • marcosdumay 215 days ago
        > they store source locations (file://...:line:col) and read the files from disk

        That's also known as "storing the program as text".

        But yeah, macros are related to another kind of homoiconicity, where the interpreted bytecode is written using the same symbols as your program data.

        You can have both of those (source = bytecode, and bytecode = data structures), only one of them, or neither.
    • coldtea 215 days ago
      > In Common Lisp, there is a function called ed, support for which is implementation-defined. If support is available, it is supposed to bring up an editor of some kind to allow a function definition to be edited. That is squarely a homoiconic feature.

      It's enough that the language stores the current source code and can reload it. So hot code swapping/reloading suffices; no homoiconicity is needed - which makes it not so squarely a homoiconic feature.
    • ggm 215 days ago
      I think this comment reinforced my sense that the author wanted to drive to a destination and didn't want to divert down the road of "why Lisp homoiconicity is different from eval()" - which I think was... lazy.

      The idea has merit. Having the REPL deal with the parse structure of data, such that taking parsed data and presenting it as code has a lower barrier to affecting the current run state than eval() does, is pretty big.

      I'd say eval() isn't self-modifying. You can't come out the other side of eval() with your own future execution state changed. As I understand it, the homoiconic features of Lisp mean you can.
    • samth 214 days ago
      Notably, the definition given in the Wikipedia entry referencing TRAC means that "homoiconic" is a property of an _implementation_, not of a language. This would mean that Lisp, a programming language, could not properly be described as homoiconic, since it admits multiple implementations, including some that do not have this property (e.g., SBCL rather clearly doesn't).
  • galaxyLogic 215 days ago
    If I understand the gist of this article, it goes like...

    1. A scanner divides the source-code string into ordered chunks, each with some identifying information: the type and content of each chunk.

    2. The next stage had better NOT be a "parser" but a "reader", which assembles the chunks into a well-formed tree structure, thus recognizing which chunks belong together in the branches of such trees.

    3. The parser then assigns "meaning" to the nodes and branches of the tree produced by the reader, by visiting them. "Meaning" basically means (!) what kind of calculation will be performed on some nodes of the tree.

    4. It is beneficial if the programming language has primitives for accessing the output of the reader, so it can have macros that morph the reader-produced tree and then ask the parser to do its job on the re-morphed tree.

    Did I get it close?
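[Editor's note: the four stages in this summary can be sketched for a toy s-expression language in Python. The helper names are invented, and for brevity the "parser" stage collapses assigning meaning into direct evaluation.]

```python
def scan(text):
    """1. Scanner: divide the source string into chunks (tokens)."""
    return text.replace("(", " ( ").replace(")", " ) ").split()

def read(tokens):
    """2. Reader: assemble chunks into a well-formed tree."""
    tok = tokens.pop(0)
    if tok == "(":
        tree = []
        while tokens[0] != ")":
            tree.append(read(tokens))
        tokens.pop(0)                      # drop the closing ")"
        return tree
    return int(tok) if tok.isdigit() else tok

def parse(tree):
    """3. Parser: assign meaning to the tree's nodes (here, by evaluating)."""
    if isinstance(tree, list):
        op, *args = tree
        fns = {"+": sum, "*": lambda xs: xs[0] * xs[1]}
        return fns[op]([parse(a) for a in args])
    return tree

print(parse(read(scan("(+ 1 (* 2 3))"))))  # 7

# 4. A "macro" can morph the reader's tree before the parser sees it:
tree = read(scan("(+ 1 (* 2 3))"))
tree[0] = "*"                              # rewrite (+ ...) into (* ...)
print(parse(tree))                         # 6
```

Step 4 is the interesting one: the rewrite happens on the reader's tree, a plain data structure, not on source text.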
    • Joker_vD 215 days ago
      > 2. The next stage had better NOT be a "parser" but a "reader", which assembles the chunks into a well-formed tree structure, thus recognizing which chunks belong together in the branches of such trees.

      > 3. The parser then assigns "meaning" to the nodes and branches of the tree produced by the reader, by visiting them. "Meaning" basically means (!) what kind of calculation will be performed on some nodes of the tree.

      So, an "AST builder" followed by a "semantic pass". That's... how most compilers have been structured, at least conceptually, since their invention. In particularly memory-starved environments those passes were actually separate programs, launched sequentially; most famously, the ancient IBM FORTRAN compilers were structured like this (they couldn't fit both the program being compiled *and* the whole compiler into core, so they split the compiler into 60-something pieces).
      • indigo945 215 days ago
        It helps to read the article... The author was not introducing this as a novel concept, but elaborating on why this is a better mental model for how an interpreter or compiler works. It's not Tokenize -> Parse; it's Tokenize -> Read -> Parse.

        The article discusses this particularly with regard to the meme of Lisps being "homoiconic". The author elaborates that the difference between Lisps and other programming languages lies not in "homoiconicity" (a JavaScript string can contain a program, and you can run `eval` on it, hence JavaScript is "homoiconic") but in which step of the parsing pipeline they let you access: with JavaScript, it's before tokenization happens; with Lisps, it's after reading has happened, before the actual parse step.
        • Joker_vD 215 days ago
          I've actually read the article, thank you; the author also argues that this "bicameral" style is what allows one to have useful tooling, since it can now consume a tree-like AST instead of plain strings. Unfortunately, that is *not* a unique advantage of "languages with bicameral syntax", although the author appears (?) to believe it to be so. IDEs had been dealing with ASTs long before LSP was introduced, although indeed, this has only been seriously explored since the late nineties or so, I believe.

          So here is a problem with the article: the author believes that what he calls "bicamerality" is unique to Lisps, and that it also requires some S-expr/JSON/XML-like syntax. But that's not true, is it? Java, too, has a tree-like AST which can be (very) easily produced (especially when you don't care about semantic passes such as resolving imports and binding name mentions to their definitions, etc.), and it has a decidedly non-Lisp-like syntax.

          And no, I also don't believe the author actually cares all that much about the reader/parser/eval being available inside the language itself: in fact, the article is structured in a way that mildly argues against having this requirement for a language to be said to have "bicameral syntax".
          • indigo945 215 days ago
            > So here is a problem with the article: the author believes that what he calls "bicamerality" is unique to Lisps, and that it also requires some S-expr/JSON/XML-like syntax.

            I didn't find that assumption anywhere in the article. My reading is that all interpreters and compilers, for any language, are built to implement two non-intersecting sets of requirements: namely, to "read" the language (build an AST) and to "parse" the language (check whether the AST is semantically meaningful). Therefore, all language implementations require tokenization, reading, and parsing steps, but not all interpreters and compilers are structured in a way that cleanly separates the latter two of these three sets of concerns (or "chambers"), and (therefore) not all languages give the programmer access to the results of the intermediate steps. Java obviously has an AST, but a Java program, unlike a Lisp program, can't use macros to modify its own AST. The programmer has no access to what the compiler "read" and can't modify it.
            • Joker_vD 215 days ago
              Mmmm. This article is like one of those duck-rabbit pictures, isn't it? With a slight mental effort, you can read it one way, or another way.

              So, here are some excerpts:

              > These advantages ["It's a lot easier to support matching, indentation, coloring, and so on", and "tools hit the trifecta of: correct, useful, and relatively easy"] are offset by one drawback: some people just don't like them. It feels constraining to some to always write programs in terms of trees, rather than more free-form syntax.

              > Still, what people are willing to embrace for writing data seems to irk them when writing programs, leading to the long-standing hatred for Lispy syntaxes.

              > But, you argue, "Now I have a bicameral syntax! Nobody will want to program in it!" And that may be true. But I want you to consider the following perspective. [...] a bicameral syntax that is a very nice target for programs that need to generate programs in your language. This is no longer a new idea, so you don't have to feel radical: formats like SMT-LIB and WebAssembly text format are s-expressions for a reason.

              The last three paragraphs play upon each other: people hate Lispy syntax; people dislike bicameral syntaxes; s-expressions are a bicameral syntax.

              And notice that nothing in those excerpts, and nothing in the text surrounding them (sections 4 to 7), really refers to the ability to access the program's syntax from inside the program itself. In fact, sections 1 to 2 argue that such an ability is not really all that important and is not what makes Lisps Lisps. Then what does? The article goes on about "bicamerality" (the explicit distinction between the reader and the parser) but doesn't ever mention again the ability of the program to modify its own syntax, or eval.

              I can't help but make the tacit deduction that those never-again-mentioned things are not part of "bicamerality". You, perhaps, instead take those things as an implicit, never-going-out-of-sight context that is always implied to be important, so they are never mentioned again because enough has already been said about them, but they are still a crucial part of "bicamerality".

              It's a duck-rabbit article. We both perceive it very differently; perhaps in reality it's just an amalgam of ideas that, when mixed together in writing, lacks a coherent meaning?
              • indigo945 214 days ago
                Yes, I understand your meaning now (and no longer understand the article's, which indeed seems to quack like a rabbit).
      • skrishnamurthi 215 days ago
        No, this isn't what the article says. I have not bothered saying anything about the "semantic pass", which is downstream from getting an AST. What the article talks about is *not* what "ancient IBM FORTRAN compilers" did.
      • aidenn0 214 days ago
        The output of the Lisp reader is *not* an AST. It is completely unaware of many syntactic rules of the language, and is absent of any context. The equivalent in a C-like language would be a stage that quite willingly generates a tree for the following:

            void foo(int int) {
                else {
                    x = 3;
                }
            }

        Most compilers will never construct a tree for this, despite it following some unifying rules for the structure of code in a C-like language (braces and parentheses are balanced, each statement has a semicolon after it, etc.).
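[Editor's note: this reader/parser split can be illustrated with a toy pair in Python - invented helper names, not any real Lisp's reader. The reader accepts `(lambda 1)` because its brackets balance; only the parser layer rejects it.]

```python
def read(src):
    """Tiny s-expression reader: checks only bracket structure, no grammar."""
    toks = src.replace("(", " ( ").replace(")", " ) ").split()
    def rd():
        t = toks.pop(0)
        if t != "(":
            return t
        out = []
        while toks[0] != ")":
            out.append(rd())
        toks.pop(0)                        # drop the closing ")"
        return out
    return rd()

def parse_lambda(tree):
    """The parser layer enforces the language's actual rules."""
    if tree[0] == "lambda" and (len(tree) != 3 or not isinstance(tree[1], list)):
        raise SyntaxError("malformed lambda")
    return tree

tree = read("(lambda 1)")                  # the reader builds a tree happily
print(tree)                                # ['lambda', '1']
try:
    parse_lambda(tree)                     # ...but the parser refuses it
except SyntaxError as e:
    print("parser:", e)
```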
    • skrishnamurthi 215 days ago
      Author here. Yes, very close. #4 is a bit strong: there is value in doing this *even if* you don't have macros, for instance because of other benefits (e.g., decent support from editors). But of course it *also* makes macros relatively easy and very powerful.
      • galaxyLogic 214 days ago
        And what about homoiconicity in Lisp vs. other languages? In Lisp it means that programs are "lists" and so is "data". Programs in Lisp are more than strings, like in most other languages; they are "nested lists". Lisps let us write programs as lists and store data as lists. JavaScript only allows us to write programs as (structureless) strings.

        Of course that is well known, but I think it is a big deal that you have such homoiconicity in Lisp but not in most other languages. Prolog maybe?
  • codeflo 215 days ago
    It seems that the Rust macro system is inspired by a similar idea: in the first step (the "reader" in this article's terminology), the source is converted into something called a *token tree*.

    A token tree is not a full parse tree with resolved operator precedence and whatnot. It only has child nodes for bracket pairs ((), [] and {}) and their contents, in part to determine where the macro call ends. Otherwise, it's a flat list of tokens that the macro (what this article would call the "parser") can interpret in any way it wants.
    • wruza 214 days ago
      Sounds like Rust did to macros what I wanted long ago in C (and everyone frowned upon me for it). Lisps and sexprs aren't exclusive in this. You *can* "load" the code into a var, modify it through regular data processing, and then feed it to an executor. You just need language designers to implement that. This entire Lisp homoiconicity religion has bugged me since forever. It's just the read-eval part of a loop, which never had a requirement for everything to be represented as a cons.
    • samth 214 days ago
      Indeed, the Rust macro system was designed by people who had worked on the Racket macro system previously.
    • moomin 214 days ago
      I think you're right. What Lisp really brought to the party was a very simple token structure. This made it pretty easy to express manipulations of that structure and hence create whatever macros you like.

      This is instantly useful to the compiler writer, because most of "Lisp" is built upon more basic primitives. The disadvantage is the Jeff Goldblum "You scientists" meme.
  • kibwen 215 days ago
    I liked the first half of the article, but I'm not sure I got anything from the second half. As the author notes, in order to be useful a definition must exclude something, and the "bicameral" distinction doesn't seem to exclude anything; even Python eventually gets parsed into a tree. Conceptually splitting "parsing" into "tree validation" and "syntax validation" is slightly interesting (although isn't this now a *tricameral* system?), but in practice it just seems like a simple aid to constructing DSLs.

    > These advantages are offset by one drawback: some people just don't like them. It feels constraining to some to always write programs in terms of trees, rather than more free-form syntax.

    I think this is misdiagnosing why many people are averse to Lisp. It's not that I don't like writing trees; I love trees for representing data. But I don't think that thinking of code as data is as intuitive or useful as Lisp users want me to think it is, despite how obviously powerful the notion is.
    • Y_Y 215 days ago
      I also struggled with the "bicameral" definition. The best I could come up with is that because e.g. Scheme represents code and data in the same way (isn't there a word for this?), it's possible to represent and manipulate (semantically) invalid code. This is because the semantics are done in the other "chamber". The example given was `(lambda 1)`, which is a perfectly good sexp, but will error if you eval it.

      This could be contrasted with C, where code (maybe more precisely, program logic) is opaque (modulo the preprocessor) and can only be represented by function pointers (unless you're doing shellcode). Here the chamber that does the parsing from text (if we don't look inside GCC) also does semantic "checking", and so while valid functions can be represented within C (via the memory contents at the function pointer), an unchecked AST or some partial program is not representable.

      I've tried not to give too many parentheticals above, but I'm not sure the concept holds water if you play tricks. Any Turing machine can represent any program, presumably in a way that admits cutting it up into atoms and rearranging it into an arbitrary (potentially invalid) form. I'd be surprised if this hasn't been discussed in more detail somewhere in the literature.
    • chubot 215 days ago
      It excludes languages that build a single AST directly from tokens. I am pretty sure Clang is like this, and probably V8. (They don't have structured macros, so it's not observable by users.)

      As opposed to building first an untyped CST (concrete syntax tree), and then transforming that into a typed AST.

      CPython does exactly this, but it has no macro stage either, so it's not exposed to users. (Python/ast.c is the CST -> AST transformation. It transforms an untyped tree into a typed tree.)

      So the key reason it matters is that it's a place to insert the macro stage.

      ---

      I agree that the word "bicameral" is confusing people, but it basically means "reader -> parser" as opposed to just "parser".

      The analogies in the article are very clear to me - in this world, JSON and XML parsers are "readers", but they are NOT "parsers"! (And yes, that probably confuses many people; some new words could be necessary.)

      A JSON Schema or XML Schema would be closer to the parser - it determines whether you have a "for loop" or an "if statement", or an "employee" and a "job title", etc.

      Another clarifying comment: https://lobste.rs/s/ici6ek/bicameral_not_homoiconic#c_bmx0vf
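[Editor's note: the JSON analogy can be made concrete with Python's `json` module standing in for the "reader" and a hand-written check standing in for the schema-style "parser". The `parse_stmt` function and its statement shape are invented for illustration.]

```python
import json

def parse_stmt(node):
    """'Parser' stage: decide whether the untyped tree is a valid statement."""
    if node.get("type") == "if":
        if "cond" not in node or "then" not in node:
            raise SyntaxError("if-statement needs 'cond' and 'then'")
        return ("if", node["cond"], node["then"])
    raise SyntaxError("unknown statement")

# json.loads is the "reader": it yields an untyped tree.
tree = json.loads('{"type": "if", "cond": "x > 0", "then": "f()"}')
print(parse_stmt(tree))

bad = json.loads('{"type": "if"}')         # still *reads* fine: valid tree...
try:
    parse_stmt(bad)                        # ...but fails the "parse" stage
except SyntaxError as e:
    print("parser:", e)
```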
      • chubot 215 days ago
        I'll also argue that the ideas in this post absolutely matter in practice.

        For example, GitHub Actions uses YAML as its reader / S-expression / CST layer.

        And then it has a separate "parser" for, say, "if" nodes, and then another parser for the string value of those "if" nodes.

        https://docs.github.com/en/actions/writing-workflows/workflow-syntax-for-github-actions#jobsjob_idif

            if: ${{ ! startsWith(github.ref, 'refs/tags/') }}
            if: github.repository == 'octo-org/octo-repo-prod'

        This fact is poorly exposed to users:

        > You must always use the ${{ }} expression syntax or escape with '', "", or () when the expression starts with !, since ! is reserved notation in YAML format.

        So I feel that they could have done a better job with language design by taking some lessons from the past.

        Gitlab has the same kind of hacky language on top of YAML, as far as I remember.
  • Karellen 215 days ago
    I thought part of the beauty of homoiconicity, which doesn't seem to be mentioned here, is not just that it's natural to interpret tokens as code, but that it's possible to interpret *the code of the program that's currently running* as tokens, and manipulate them as you would any other data in the program?
    • tines 214 days ago
      Yeah, exactly. The whole point is macros and metaprogramming!
  • zzo38computer 215 days ago
    It is not only Lisp. PostScript is also homoiconic; tokens have values like any other values (and procedures are just executable arrays (executing an array involves executing each element of that array in sequence), which can be manipulated like any other arrays). The {} block in PostScript is a single token that contains other tokens; the value of the token is an executable array whose elements are the values of the tokens that it contains.

    Strings don't make it "homoiconic" in the usual way, I think; so JavaScript does not count.
    • ashton314 215 days ago
      You might be interested in what the author has to say about weak vs strong homoiconicity then…
      • lmm 215 days ago
        The author doesn't go far enough; eval operating on strings is still very weak (unless your language is something like Brainfuck that really doesn't have a more structured representation available). The point is exposing the structured form that the language implementation runs as data structures within the language - and not as some second-class reflection API, but directly as they are. You want to be able to capture something like an AST representation (not necessarily literally an AST), manipulate it, and then run it.

        I think "bicameral" isn't really a great way to capture this, because there are often multiple layers of parsing/lexing/compilation/interpretation, and you might want to hook in at multiple of them (e.g. in Lisps you may have both reader macros that operate at a low-level stage and higher-level macros that operate after parsing). And of course it's a spectrum, but essentially it's about how much the language exposes itself as a set of compositional libraries rather than just being a monolithic service.
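[Editor's note: in Python, the closest thing to the capture-manipulate-run workflow described here is the standard `ast` module: grab the tree, rewrite it as ordinary data, compile and run the result. A sketch - though as a separate reflection-style API it is exactly the "second-class" access being contrasted with Lisp's.]

```python
import ast

# Capture a structured form of the code.
tree = ast.parse("result = 2 + 3")

class SwapAddToMul(ast.NodeTransformer):
    def visit_Add(self, node):
        return ast.Mult()                  # rewrite every + into *

# Manipulate the tree as data, then run the transformed program.
new_tree = ast.fix_missing_locations(SwapAddToMul().visit(tree))
ns = {}
exec(compile(new_tree, "<ast>", "exec"), ns)
print(ns["result"])                        # 6, not 5
```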
        • astrobe_214 days ago
          On a side note, I was expecting &quot;bicameral&quot; as in [1].<p>[1] <a href="https:&#x2F;&#x2F;en.wikipedia.org&#x2F;wiki&#x2F;Bicameral_mentality" rel="nofollow">https:&#x2F;&#x2F;en.wikipedia.org&#x2F;wiki&#x2F;Bicameral_mentality</a>
  • clausecker215 days ago
Another language with this property is FORTH, which has many surprising similarities with LISP. I like to call it “LISP, but the other way round.” It uses RPN instead of PN, stacks&#x2F;arrays instead of lists, and is procedural instead of functional.
    • obijohn215 days ago
      I was thinking about this reading the article. In fact, I’ve recently seen Lisp implemented in Forth[0] and Forth implemented in Lisp[1]. In both cases, the implementations are decently complete and surprisingly efficient (i.e. not “toy” interpreters).<p>I think this is due to a significant property shared by both languages: the parser’s primary role is distinguishing between numbers and anything that’s not a number. No need to worry about operator precedence, keywords, or building complex syntax trees. Tokens are numbers and “not-numbers”, and that’s it.<p>In Forth, a “not-number” is a Word, and in Lisp a Symbol, both of which can be variables or functions. The only difference between the two is that Forth checks for Word definitions first, and Lisp checks for numbers first. If you wanted to redefine 4 to 5 for some reason, Forth’s got your back, but Lisp will save you ;).<p>A Forth Dictionary is very similar to a Lisp Environment; they both serve as a lookup table for definitions, and they both allow the programmer (or program!) to redefine words&#x2F;symbols.<p>They also both have REPLs to facilitate a much more dynamic development cycle than other REPLs in most languages.<p>I could go on, but on a fundamental level the similarities are striking (at least to me, anyway). It’s an interesting rabbit hole to explore, with lots of “drink me” bottles laying around. It’s fun here.<p>[0] <a href="https:&#x2F;&#x2F;git.sr.ht&#x2F;~vdupras&#x2F;duskos&#x2F;tree&#x2F;master&#x2F;item&#x2F;fs&#x2F;doc&#x2F;comp&#x2F;lisp.txt" rel="nofollow">https:&#x2F;&#x2F;git.sr.ht&#x2F;~vdupras&#x2F;duskos&#x2F;tree&#x2F;master&#x2F;item&#x2F;fs&#x2F;doc&#x2F;co...</a><p>[1] <a href="https:&#x2F;&#x2F;github.com&#x2F;gmpalter&#x2F;cl-forth">https:&#x2F;&#x2F;github.com&#x2F;gmpalter&#x2F;cl-forth</a>
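The numbers-versus-everything-else split is small enough to sketch in a few lines of Python (a toy reader, not either language's actual one):

```python
def read_token(tok):
    # A token is either a number...
    try:
        return int(tok)
    except ValueError:
        try:
            return float(tok)
        except ValueError:
            return tok  # ...or a "not-number": a Forth Word / Lisp Symbol

def tokenize(src):
    # Lisp-ish: parens become their own tokens; Forth would just
    # split on whitespace and look each word up in the Dictionary.
    return [read_token(t) for t in
            src.replace("(", " ( ").replace(")", " ) ").split()]

print(tokenize("(+ 1 2.5 foo)"))  # ['(', '+', 1, 2.5, 'foo', ')']
```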
  • ValentinA23215 days ago
As a long time lisper I don&#x27;t think homoiconicity is that relevant, at least when comparing lisps with other programming languages. What I miss when writing C++ is the incremental compilation model of lisps, and in particular the ability to have compile time data drive code generation.<p>Homoiconicity is more useful when comparing lisps IMO, and pondering on how they could be improved. To me, homoiconicity is a constant struggle and should be appreciated in degrees because homoiconicity is about <i>immediacy</i>.<p>A lisp that doesn&#x27;t allow you to embed data along with code, JSON&#x2F;Javascript style, is less homoiconic than a language that does, and it&#x27;s more about what the core library allows than the language itself. For instance I&#x27;d say Clojure is more homoiconic than Scheme because it allows you to embed hashmaps in your code natively, whereas in scheme you only have `(make-hash-table)` without the corresponding reader macro. Similarly, a lisp without syntax quote would be less homoiconic than one that has it.<p>This is why I say it&#x27;s about <i>immediacy</i>. When you don&#x27;t have to deal with hashmaps, or templated s-exprs in terms of the process that builds them, the mediation layer disappears.<p>Things I&#x27;d like to be more immediate in Clojure:<p>- keeping track of whitespaces within s-exprs. Useful when you want to print code as it is indented in the source file. There&#x27;s a library for that (rewrite-clj), but it isn&#x27;t integrated in the reader+compiler pipeline, so it&#x27;s a bit of a headache as you have to read code from files, which implies bridging the gap between the compilation pipeline and this library on your own.<p>- accessing semantic info within macros. Which functions use which variables. Which variables are global vs local (in particular when lexically shadowed), which variables are closed over by which lambdas, etc. 
To do this you have to use clojure.core.analyzer, which is very complex and poorly documented: not immediate enough.
  • taeric215 days ago
    <a href="https:&#x2F;&#x2F;taeric.github.io&#x2F;CodeAsData.html" rel="nofollow">https:&#x2F;&#x2F;taeric.github.io&#x2F;CodeAsData.html</a> was my take at exploring parts of this idea. Being able to manipulate code with the same constructs as you generally write the code is pretty cool.
  • djaouen215 days ago
    How one could have spent any time at all studying Lisp starting in the 80s (!) and not understand what the word &quot;homoiconic&quot; means is <i>baffling</i> to me!
    • kazinator215 days ago
      The term homoiconic does not come from the Lisp culture. I think it might have been in the 1990s that it came into use as a way of describing a property of languages in the Lisp family, using a different definition from the original homoiconic, and it might have been introduced by outsiders.<p>Using Google Books search, we can identify that a 1996 book called <i>Advanced Programming Language Design</i> by Raphael A. Finkel uses the word in this new way, claiming that TCL and Lisp are homoiconic.<p>The word returns to flatlining towards the end of the 1990s, and then surges after 2000.
      • mikelevins215 days ago
        I feel like use of the term &quot;homoiconic&quot; is misguided. It seems like an attempt to turn an incidental attribute of some Lisps into a sort of Platonic ideal. I don&#x27;t think that&#x27;s helpful.<p>I think the property being discussed is more understandable if you just describe it simply: in some Lisps (notably Common Lisp and its direct ancestors) source code is not made of text strings; it&#x27;s made of symbolic expressions consisting of cons cells and atoms.<p>The text that you see in &quot;foo.lisp&quot; isn&#x27;t Lisp source code; it&#x27;s a <i>serialization</i> of Lisp source code. You could serialize it differently to get a different text file, but the reader would turn it into the same source code. The actual source code is distinct from any specific text serialization of it.<p>We write programs in the form of text serialization because the reader will convert it for us, and because it&#x27;s easier and more rewarding to write good and comfortable text editors than to write good and comfortable s-expression editors.<p>There are of course text editors and addons that attempt to make text editing act more like s-expression editing, but I don&#x27;t know of many actual s-expression editors. The canonical one, I suppose, is Interlisp&#x27;s DEdit, which operates on actual s-expression data structures in memory.<p>From this point of view, what people mean by &quot;homoiconic&quot; is just that source code is all made of convenient arrangements of standard data structures defined by the language that can be conveniently operated on by standard functions defined by the language.<p>Or, to put it another way, &quot;homoiconic&quot; basically means &quot;convenient&quot;, and &quot;non-homoiconic&quot; means &quot;inconvenient&quot;.<p>That&#x27;s all there is to it, really, but it has far-reaching consequences. 
In a Lisp designed this way, basic manipulation of source code is trivially easy to do with operations that are all provided for you in advance by the language itself. That makes all sorts of code-processing tools exceptionally easy to write.<p>That&#x27;s not true in most languages. Take C, for example: sure, a C compiler parses text and turns it into an abstract syntax tree before processing it further in order to eventually yield executable machine code. Is all of that machinery part of the language definition? Can you count on those APIs and data structures to be exposed and documented by any arbitrary C compiler?<p>No.<p>In that sense, any programming language could be made &quot;homoiconic&quot; if enough people wanted it. They manifestly don&#x27;t, because most languages aren&#x27;t.<p>But some programmers prefer working with a language implementation that makes it so very easy to manipulate code. So that&#x27;s what we use.<p>It&#x27;s not some Platonic ideal of language design, but it doesn&#x27;t need to be. It&#x27;s a pragmatic design decision made by certain implementors in a certain lineage, and it has consequences that a certain fraction of programmers find congenial. Congenial enough that it makes some of us prefer to work with languages and implementations that work that way.
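The serialization point can be made concrete with a toy reader in Python (nested lists standing in for cons cells; no particular Lisp's actual reader): two differently formatted text files read to the same source code.

```python
def read(tokens):
    # Build nested lists (standing in for cons cells) from tokens.
    tok = tokens.pop(0)
    if tok == "(":
        lst = []
        while tokens[0] != ")":
            lst.append(read(tokens))
        tokens.pop(0)  # consume ")"
        return lst
    try:
        return int(tok)
    except ValueError:
        return tok  # a symbol

def parse(text):
    return read(text.replace("(", " ( ").replace(")", " ) ").split())

# Two different textual serializations, one and the same structure:
a = parse("(defun f (x) (* x x))")
b = parse("(defun f  (x)\n  (* x x))")
print(a == b)  # True: the source code is the structure, not the text
```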
        • 082349872349872214 days ago
          Nice description; it makes me wonder if there are any languages in which code and data have different serialisations, but these are <i>isomorphic</i> in the sense that code and data can be turned into each other losslessly? (we ought to be able to round trip between the two: code-&gt;data-&gt;code and data-&gt;code-&gt;data ought to produce equivalent structures to what they started from)
          • kazinator214 days ago
What would prevent them from being used the &quot;wrong&quot; way around: code being written with a data notation and vice versa? When the system prints data it could choose one or the other, based on some educated guess as to whether it is code or data.
            • 082349872349872213 days ago
              nothing — I was assuming the conversions would need to be explicit.<p>Do you have a candidate in mind?
    • samth214 days ago
      Perhaps in that case you should supply the definition that the author ought to have known.
  • wduquette214 days ago
    TCL is exactly &quot;strongly homoiconic&quot; in the OP&#x27;s sense; one does metaprogramming by creating and evaluating strings in some desired context. It&#x27;s an advanced technique, but works quite well in practice. Many years ago I wrote a complete object system for TCL, SNIT, that executes a type definition script (using TCL syntax); this produces a new TCL script that actually implements the type; and then executes this new script. It&#x27;s been used in commercial products.<p>TCL is not &quot;bicameral&quot; in the OP&#x27;s sense, but that doesn&#x27;t seem to stop anyone from doing metaprogramming.
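The script-generates-script pattern is easy to mimic in Python with exec (names invented for illustration; this is the shape of the approach, not SNIT's actual code): a definition is turned into a string of code, which is then evaluated to produce the real thing.

```python
# Toy string metaprogramming: build source text, then evaluate it.
def deftype(name, fields):
    src = [f"class {name}:",
           "    def __init__(self, " + ", ".join(fields) + "):"]
    for f in fields:
        src.append(f"        self.{f} = {f}")
    ns = {}
    exec("\n".join(src), ns)  # run the generated script
    return ns[name]

Point = deftype("Point", ["x", "y"])
p = Point(3, 4)
print(p.x, p.y)  # 3 4
```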
    • cmacleod4214 days ago
      I would argue that Tcl is almost &quot;bicameral&quot; in the OP&#x27;s sense. The application of the &quot;dodekalogue&quot; rules - <a href="https:&#x2F;&#x2F;wiki.tcl-lang.org&#x2F;page&#x2F;Dodekalogue" rel="nofollow">https:&#x2F;&#x2F;wiki.tcl-lang.org&#x2F;page&#x2F;Dodekalogue</a> - largely corresponds to the &quot;Reader&quot;. It goes further in that it also specifies substitution and evaluation rules, but it is similar in that it only applies a few basic structural rules, and knows nothing about the specifics of individual commands.<p>Tcl&#x27;s equivalent of the &quot;Parser&quot; is built-in to each command, which decides whether to interpret its arguments as data, code, option flags, etc..<p>I suspect this division of responsibilities is very helpful for metaprogramming techniques.
      • wduquette214 days ago
        This is true. In Lisp terms every TCL command is effectively a special form, and can do whatever it pleases with its arguments.<p>On the other hand, TCL provides much less support for building up the string to be evaluated if it&#x27;s more complex than a single command; and even for a single command it can be tricky.
  • peanut-walrus215 days ago
    &gt; <i>Data are data, but programs—entities that we can run—seem to be a separate thing.</i><p>Is this a view some people actually hold? Would be interesting to see some argumentation why someone would think this is the case.
    • zokier215 days ago
Harvard architecture is a thing. If you cannot access or manipulate the program in any way then it&#x27;s not really meaningful to call it data even if it is stored as bytes somewhere.
    • spiritplumber215 days ago
      The Story Of Mel
  • aidenn0214 days ago
    All I want for Christmas is the ability to redefine the CL scanner.<p>Seriously; if we could redefine the CL scanner, then e.g. package-local-nicknames could be a library instead of having to have been reimplemented in every single CL implementation.
  • acka215 days ago
    &quot;We started with Lisp, so let’s go back there. What is Lisp? Lisp is a feeling, an emotion, a sentiment; Lisp is a vibe; Lisp is the dew on morning grass, it’s the scent of pine wafting on a breeze, it’s the sound of a cricket ball on a bat, it’s the…oh, wait, where was I. Sorry.&quot;<p>Leaving this here, with the deepest respect.<p>Eternal Flame - Julia Ecklar <a href="https:&#x2F;&#x2F;www.youtube.com&#x2F;watch?v=u-7qFAuFGao" rel="nofollow">https:&#x2F;&#x2F;www.youtube.com&#x2F;watch?v=u-7qFAuFGao</a>
  • svilen_dobrev215 days ago
    i have been using python as syntax &quot;carrier&quot; for many Domain languages&#x2F;DSL. Re-purposing what constructs like class:.., with..: certain func-calls, etc. mean within that. Works wonders.. though one has to be careful as it may not look like python at all :&#x2F;
  • nycticorax213 days ago
    Thoroughly enjoyed this!<p>Minor typo: &quot;an syntax&quot; in the second-to-last paragraph should be &quot;a syntax&quot;.