17 comments

  • lasgawe 10 minutes ago
    Great article. There are many things every developer should do when starting to learn programming or when trying to improve their skills. This is one of them. I once built a shell-like programming language (not an interpreter). If anyone reading this wants to improve their skills, I strongly suggest building your own shell from scratch.
  • lvales 6 hours ago
    Building a shell is a great exercise, but honestly having to deal with string parsing is such a bother that it robs like 2/3 of the joy along the way. I once built a very simple one in Go [0] as a learning exercise and I stopped once I started getting frustrated with all the corner cases.<p>[0] <a href="https://github.com/lourencovales/codecrafters/blob/master/shell-go/shell-go.go" rel="nofollow">https://github.com/lourencovales/codecrafters/blob/master/sh...</a>
    • chubot 1 hour ago
      A common problem I noticed is that if you took certain courses in computer science, you may have a preconceived notion of how to parse programming languages, and the shell language doesn't quite fit that model<p>I have seen this misconception many times<p>In Oils, we have some pretty minor elaborations of the standard model, and it makes things a lot easier<p><i>How to Parse Shell Like a Programming Language</i> - <a href="https://www.oilshell.org/blog/2019/02/07.html" rel="nofollow">https://www.oilshell.org/blog/2019/02/07.html</a><p>Everything I wrote there still holds, although that post could use some minor updates (and OSH is the most bash-compatible shell, and more POSIX-compatible than /bin/sh on Debian - e.g. <a href="https://pages.oils.pub/spec-compat/2025-11-02/renamed-tmp/spec/compat/TOP.html" rel="nofollow">https://pages.oils.pub/spec-compat/2025-11-02/renamed-tmp/sp...</a> )<p>---<p>To summarize that, I'd say that doing as much work as possible in the lexer, with regular languages and "lexer modes", drastically reduces the complexity of writing a shell parser<p>And it's not just one parser -- shell actually has 5 to 15 different parsers, depending on how you count<p>I often show this file to make that point: <a href="https://oils.pub/release/0.37.0/pub/src-tree.wwz/_gen/_tmp/match.re2c-input.h.html" rel="nofollow">https://oils.pub/release/0.37.0/pub/src-tree.wwz/_gen/_tmp/m...</a><p>(linked from <a href="https://oils.pub/release/0.37.0/quality.html" rel="nofollow">https://oils.pub/release/0.37.0/quality.html</a>)<p>Fine-grained heterogeneous algebraic data types also help. Shells in C tend to use a homogeneous command* and word* kind of representation<p><a href="https://oils.pub/release/0.37.0/pub/src-tree.wwz/frontend/syntax.asdl.html" rel="nofollow">https://oils.pub/release/0.37.0/pub/src-tree.wwz/frontend/sy...</a> (~700 lines of type definitions)
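      To make the "lexer modes" idea concrete, here is a minimal sketch in Python. It is not Oils code: the mode names (`OUTER`, `DQ`), the token names, and the `tokenize` helper are all hypothetical, and the grammar covers only a tiny slice of shell. The point is that each mode is its own small regular language, and the lexer switches tables when it sees a double quote, so a `|` inside quotes never reaches the parser as a pipe.

```python
import re

# One regex table per lexer mode. All names here are hypothetical.
MODES = {
    # Outside quotes: whitespace separates words and '|' is an operator.
    "OUTER": re.compile(r"""
        (?P<SPACE>\s+)
      | (?P<PIPE>\|)
      | (?P<DQ_OPEN>")
      | (?P<SQ>'[^']*')
      | (?P<LIT>[^\s|"']+)
    """, re.VERBOSE),
    # Inside double quotes: '|' and spaces are plain text, '$name' is special.
    "DQ": re.compile(r"""
        (?P<DQ_CLOSE>")
      | (?P<VAR>\$[A-Za-z_][A-Za-z0-9_]*)
      | (?P<ESC>\\.)
      | (?P<LIT>[^"$\\]+)
      | (?P<DOLLAR>\$)
    """, re.VERBOSE),
}


def tokenize(src):
    """Return (kind, text) pairs, switching regex tables on double quotes."""
    mode, pos, tokens = "OUTER", 0, []
    while pos < len(src):
        m = MODES[mode].match(src, pos)
        if m is None:
            raise SyntaxError(f"unexpected input at offset {pos} in mode {mode}")
        kind, text = m.lastgroup, m.group()
        pos = m.end()
        if kind == "DQ_OPEN":
            mode = "DQ"
        elif kind == "DQ_CLOSE":
            mode = "OUTER"
        if kind != "SPACE":  # whitespace only separates tokens
            tokens.append((kind, text))
    if mode != "OUTER":
        raise SyntaxError("unterminated double quote")
    return tokens


if __name__ == "__main__":
    for token in tokenize('echo "a | $x" | wc'):
        print(token)
```

      In this sketch the first `|` comes out as part of a `LIT` token and only the second one is a `PIPE`, without the parser having to know anything about quoting.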
    • healeycodes 6 hours ago
      Author here, and yeah, I agree. I skipped writing a parser altogether and just split on whitespace and `|` so that I could get to the interesting bits.<p>For side-projects, I have to ask myself if I'm writing a parser, or if I'm building something else; e.g. for a toy programming language, it's way more fun to start with an AST and play around, and come back to the parser if you really fall in love with it.
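      For readers wondering what "split on whitespace and `|`" amounts to, a rough sketch in Python follows. It is an illustration of the approach, not the article's code (which is not Python), and `split_pipeline` is a hypothetical name. It deliberately ignores quoting, escaping, and expansion, which is exactly the corner-case territory discussed above.

```python
def split_pipeline(line: str) -> list[list[str]]:
    """Split a command line into pipeline stages, then each stage into argv words.

    Deliberately naive: quoting, escaping, and expansion are not handled.
    """
    stages = []
    for segment in line.split("|"):
        argv = segment.split()  # split on any run of whitespace
        if not argv:
            raise ValueError("empty pipeline stage")
        stages.append(argv)
    return stages


if __name__ == "__main__":
    print(split_pipeline("grep foo log.txt | sort | uniq -c"))
    # Known limitation: a quoted '|' is still treated as a pipe.
    print(split_pipeline("echo 'a|b'"))
```

      The second call shows the trade-off: the quoted `|` is split as if it were a pipe, which is acceptable for a toy shell but is the first thing a real parser has to fix.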
    • ferguess_k 3 hours ago
      Can say the same for control characters in terminals. I even think maybe it's just easier to ditch them all and use Qt to build a "terminal" with clickable URLs, something similar to what TempleOS does.
  • rrampage 39 minutes ago
    Fun read! I built a minimal Linux shell [0] in C and Zig last year which does not depend on libc. It was a great way to learn about execve, the new-ish clone3 syscall and how Linux starts a process. Parsing strings is the least fun part of building the shell.<p>[0] <a href="https://gist.github.com/rrampage/5046b60ca2d040bcffb49ee38e86041f" rel="nofollow">https://gist.github.com/rrampage/5046b60ca2d040bcffb49ee38e8...</a>
  • wei03288 3 hours ago
    The pipe section is the part that changes how you think about processes. Once you've manually done the dup2 dance (close write-end in parent, close read-end in child, wire them up), it stops being magic and starts being obvious why `grep | sort | uniq` works at all. The thing that surprised me building a similar toy was how late in the process job control has to come: you can get a working pipe chain surprisingly fast, and then job control (SIGTSTP, tcsetpgrp, the whole mess) costs 5x more than everything else combined.
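    The "dup2 dance" can be sketched in a few lines of Python, whose `os` module wraps the same `pipe`, `fork`, `dup2`, and `execvp` calls. This is a minimal sketch, not the article's implementation: `run_pipeline` is a hypothetical helper, and it differs from a real shell in one respect, labeled in the code: the last stage also writes into a pipe so the function can return the output instead of printing to the terminal.

```python
import os


def run_pipeline(stages: list[list[str]]) -> bytes:
    """Run argv lists as stage1 | stage2 | ... and return the last stage's stdout.

    Assumption for illustration: the final stage is captured through a pipe;
    a real shell would leave its stdout attached to the terminal.
    """
    pids = []
    prev_read = None  # read end of the pipe that feeds the next stage
    for argv in stages:
        read_fd, write_fd = os.pipe()
        pid = os.fork()
        if pid == 0:  # child: rewire stdin/stdout, then exec
            try:
                if prev_read is not None:
                    os.dup2(prev_read, 0)  # stdin now reads the previous stage
                    os.close(prev_read)
                os.dup2(write_fd, 1)       # stdout now feeds the next stage
                os.close(write_fd)
                os.close(read_fd)          # this stage never reads its own output
                os.execvp(argv[0], argv)
            finally:
                os._exit(127)              # reached only if exec failed
        # parent: drop its copies so readers can see EOF
        pids.append(pid)
        if prev_read is not None:
            os.close(prev_read)
        os.close(write_fd)
        prev_read = read_fd
    with os.fdopen(prev_read, "rb") as last_stage:
        output = last_stage.read()
    for pid in pids:
        os.waitpid(pid, 0)
    return output


if __name__ == "__main__":
    print(run_pipeline([["printf", "b\\na\\nb\\n"], ["sort"], ["uniq"]]))
```

    The closes matter as much as the dup2 calls: if the parent keeps a write end open, the downstream process never sees end-of-file and the pipeline hangs. (Python marks pipe descriptors close-on-exec by default, so the explicit closes in the child are mainly there to mirror what C code has to do.)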
    • chubot 1 hour ago
      Yup, job control is a huge mess. I think Bill Joy was able to modify the shell, the syscall interface, and the terminal driver at the same time to implement the hacky mechanism of job control. But a few years later that kind of crosscutting change would have been harder<p>One thing we learned from implementing job control in <a href="https://oils.pub" rel="nofollow">https://oils.pub</a> is that the differing pipeline semantics of bash and zsh make a difference<p>In bash, the last part of the pipeline is forked (unless shopt -s lastpipe)<p>In zsh, it isn't<p><pre><code>
      $ bash -c 'echo hi | read x; echo $x'  # no output
      $ zsh -c 'echo hi | read x; echo $x'
      hi
</code></pre> And then that affects this case:<p><pre><code>
      bash$ sleep 5 | read
      ^Z
      [1]+  Stopped    sleep 5 | read

      zsh$ sleep 5 | read  # job control doesn't apply to this case in zsh
      ^Zzsh: job can't be suspended
</code></pre> So yeah the semantics of shell are not very well specified (which is one reason for OSH and YSH). I recall a bug running an Alpine Linux shell script where this difference matters -- if the last part is NOT forked, then the script doesn't run<p>I think there was almost a "double bug" -- the script relied on the `read` output being "lost", even though that was likely not the intended behavior
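      The reason `x` comes back empty in bash can be modeled without a shell at all: an assignment made in a forked child lives in the child's memory and disappears when it exits. The sketch below is only a model of that mechanism in Python, not how either shell is implemented; `read_in_child` and `read_in_parent` are hypothetical names standing in for "last stage forked" (bash default) and "last stage runs in the shell process" (zsh, or bash with lastpipe).

```python
import os


def _pipe_with_hi() -> int:
    """Return the read end of a pipe that already contains b'hi\\n'."""
    r, w = os.pipe()
    os.write(w, b"hi\n")
    os.close(w)
    return r


def read_in_child() -> str:
    """Model of 'last stage forked': the read happens in a child process."""
    x = ""
    r = _pipe_with_hi()
    pid = os.fork()
    if pid == 0:
        x = os.read(r, 100).decode().strip()  # changes only the child's copy of x
        os._exit(0)
    os.close(r)
    os.waitpid(pid, 0)
    return x  # the parent's x was never touched


def read_in_parent() -> str:
    """Model of 'last stage not forked': the read happens in this process."""
    r = _pipe_with_hi()
    x = os.read(r, 100).decode().strip()
    os.close(r)
    return x


if __name__ == "__main__":
    print(repr(read_in_child()))
    print(repr(read_in_parent()))
```

      The same split explains the job control difference above: when the last stage is the shell process itself, there is no separate child for the shell to stop and resume.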
  • emersion 6 hours ago
    Some time ago I wrote an article about a particular aspect of shells, job control: <a href="https://emersion.fr/blog/2019/job-control/" rel="nofollow">https://emersion.fr/blog/2019/job-control/</a>
  • doe88 35 minutes ago
    Is there a (real) shell whose code is relatively short and self-contained and would be valuable to read? This was always something I wanted to do but never quite spent time to look for a good one to explore.
    • giancarlostoro 31 minutes ago
      Although not the same... Destroy All Software has videos on building your own shell using Ruby. I watched it to learn and it was a lot of fun to watch him basically building a shell. I'm not really a Ruby guy, but it was easy to grasp. It's not free, you would need a subscription, but it's worth the watch otherwise.<p><a href="https://www.destroyallsoftware.com/screencasts/catalog/shell-from-scratch" rel="nofollow">https://www.destroyallsoftware.com/screencasts/catalog/shell...</a>
    • epr 16 minutes ago
      I think there's a good one if you search around for "xv6 sh.c". Hard to tell immediately from a Google search just now since there are many implementations (people do it in school) and GitHub's currently blocking requests from my phone.<p>Also helpful may be running strace on your shell, then reviewing the output line by line to make sure you understand each. This is a VERY instructive exercise to do in general.
  • rigorclaw 6 hours ago
    The pipe implementation section is really clean. Working through fork/exec/dup2 by hand like this is one of those exercises that makes you appreciate how much composability Unix got right. Processes that know nothing about each other just work together because they read stdin and write stdout. I built something similar years ago and the moment pipes actually worked felt like unlocking a cheat code.
  • mzs 6 hours ago
    Had an assignment to build a shell in a week, how hard could it be?<p><pre><code>
    controlling terminal
    session leader
    job control
</code></pre> The parser was easy in comparison.
  • ratzkewatzke 1 hour ago
    There's a very good exercise on Codecrafters (<a href="https://app.codecrafters.io/courses/shell/overview">https://app.codecrafters.io/courses/shell/overview</a>) to walk you through writing your own shell. I found it enlightening, as well as a good way to learn a new language.
  • lioeters 2 hours ago
    Link was previously posted by author: <a href="https://news.ycombinator.com/item?id=47398749">https://news.ycombinator.com/item?id=47398749</a> There are other good quality articles on their site, and maybe they deserve the imaginary points.
  • hexer303 3 hours ago
    Unix shells are conceptually simple but hide a surprising amount of complexity under the hood that we take for granted. I recently had to build my own PTY controller. There were so many edge cases to deal with. It took weeks of stress testing and writing many tests to get it right.
  • dirk94018 3 hours ago
    Interesting. I wanted to do toast | bash to let the AI drive the computer but the bash shell really got in the way. Too much complexity. The things that annoy humans, $ expansion, special characters, etc., don't work for AI either. Ended up writing a custom shell for AI (and humans). When a tool gets in the way, sometimes it's just time to change the tool.
  • zokier 6 hours ago
    Bit of pedantry, but I don't think a traditional Unix shell (like this) follows the REPL model; the shell is not usually doing the <i>printing</i> of the result of <i>evaluation</i>. Instead, the printing happens more as a side effect of the commands.
    • jermaustin1 5 hours ago
      The first shell programming I ever did was batch in Windows, back in the 3.11/95 days.<p>The first line was always to turn off echo, and I've always wondered why that was a decision for batch script. Or I'm misremembering. 30 years of separation makes it hard to remember the details.
      • enoint 4 hours ago
        Echo in that case prints command lines before executing them. Its analog is `set -x` rather than `echo`.
    • skydhash 5 hours ago
      It’s a shell, not the whole thing. The whole thing is the shell+kernel+programs.
  • austy69 5 hours ago
    Fun read. Wonder if you are able to edit text in the shell, or if you need to implement a gap buffer to allow it?
    • healeycodes 5 hours ago
      Editing the current line works because I brought in <a href="https://man7.org/linux/man-pages/man3/readline.3.html" rel="nofollow">https://man7.org/linux/man-pages/man3/readline.3.html</a> towards the end so I could support editing, tab completion, and history.<p>IIRC readline uses a `char *` internally since the length of a user-edited line is fairly bounded.
      • austy69 4 hours ago
        Very cool. I'm currently working on the beginning of a small text editor, so this part seemed interesting and I was curious about any overlap. Thanks for the interesting post!
  • hristian 6 hours ago
    [flagged]
    • jmmv 6 hours ago
      Somebody blamed this comment on LLMs, and maybe/probably it is, but I think the first sentence is spot-on so I thought it was worth replying to.<p>Dealing with the corner cases ends up teaching you a lot about a language and for an ancient language like the shell, dealing with the corner cases also takes you through the thinking process of the original authors and the constraints they were subject to. I found myself in this situation while writing EndBASIC and wrote an article with the surprises I encountered, because I found the journey fascinating: <a href="https://www.endbasic.dev/2023/01/endbasic-parsing-difficulties.html" rel="nofollow">https://www.endbasic.dev/2023/01/endbasic-parsing-difficulti...</a>
    • gf000 6 hours ago
      Not sure it tells all that much about 'how the OS works'. This is a historical abstraction that happened to look how it looks today with all its numerous warts and shortcomings.<p>We can easily imagine it done a better way - for all the criticism of Windows, PowerShell gives a glimpse into this hypothetical future.
    • Retr0id 6 hours ago
      Fascinating that you resurrected an account from 2014 just for LLM spam, were the credentials compromised or something?
      • IncreasePosts 6 hours ago
        Maybe the author had it logged into something that their claw had access to