25 comments

  • ramon156145 days ago
    Cool stuff! I can see some GPT comments that can be removed<p>&#x2F;&#x2F; Increased for better learning<p>this doesn&#x27;t tell me anything<p>&#x2F;&#x2F; Use the constants from lib.rs<p>const MAX_SEQ_LEN: usize = 80;<p>const EMBEDDING_DIM: usize = 128;<p>const HIDDEN_DIM: usize = 256;<p>these are already defined in lib.rs, why not use them (as the comment suggests)
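A minimal sketch of what the comment suggests, assuming the constants are declared `pub` in `lib.rs`. The module `llm` below is a hypothetical stand-in for the library crate (the thread does not give the crate name), and `embedded_len` is an illustrative helper; in a real two-file layout the import in `main.rs` would name the crate instead of a local module.

```rust
// Stand-in for src/lib.rs. The module name `llm` is hypothetical.
mod llm {
    pub const MAX_SEQ_LEN: usize = 80;
    pub const EMBEDDING_DIM: usize = 128;
    pub const HIDDEN_DIM: usize = 256;
}

// Import the shared constants instead of redefining them locally.
use llm::{EMBEDDING_DIM, HIDDEN_DIM, MAX_SEQ_LEN};

/// Number of values in one fully embedded sequence (illustrative helper).
fn embedded_len() -> usize {
    MAX_SEQ_LEN * EMBEDDING_DIM
}

fn main() {
    println!("sequence buffer: {} values", embedded_len());
    println!("hidden layer width: {}", HIDDEN_DIM);
}
```

With a single definition, changing a value in the library changes it everywhere it is used.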
    • leoh144 days ago
      They should stay, because they are indicative of the fact that this wasn&#x27;t built with actual understanding.
    • mitchitized144 days ago
      You&#x27;re absolutely correct!
    • tialaramex145 days ago
      For the constants is it possible the author didn&#x27;t know how? I remember in my first week of Rust I didn&#x27;t understand how to name things properly, basically I was overthinking it.
      • vlovich123145 days ago
        Lots of signs this is an LLM-generated project. All the emojis in the README are a hint as well.
      • tayo42145 days ago
        From his reddit post<p><a href="https:&#x2F;&#x2F;old.reddit.com&#x2F;r&#x2F;rust&#x2F;comments&#x2F;1nguv1a&#x2F;i_built_an_llm_from_scratch_in_rust_just_ndarray&#x2F;ne8hu1m&#x2F;" rel="nofollow">https:&#x2F;&#x2F;old.reddit.com&#x2F;r&#x2F;rust&#x2F;comments&#x2F;1nguv1a&#x2F;i_built_an_ll...</a>
    • tmaly144 days ago
      did you add these as a PR ?
    • ericdotlee145 days ago
      Do you think vibe coded rust will rot the quality of language code generally?
      • 6r17144 days ago
        For AI you definitely need to clean up, and I think even targeted training on some practices would be beneficial. For users, it depends on the people, and I&#x27;d argue that vibe-coded rust can be better than just &quot;written-rust&quot; IF the user&#x27;s attention is actually focused on what is important. E.g. I could vibe-code a lock-free, well-architected s3 and focus on all the important details that would actually make it high perf, or write some stuff myself 10x slower, which means I will have 10x less time to work on the important stuff.<p>However, what you asked is whether vibe-coded rust will rot the quality of the language. That is more difficult to answer, but I don&#x27;t think that people who are uninterested in the technical details are going to go for rust anyway. From the signals I get back, people are actually not really liking it: they find it too difficult for some reason and prefer to stick with stuff like C# or python.<p>Can&#x27;t explain why.
        • miki123211144 days ago
          &gt; I&#x27;d argue that vibe-coded rust can be better than just &quot;written-rust&quot;<p>I never thought about it this way, but it actually makes sense. It&#x27;s just like how Rust &#x2F; Go &#x2F; Java &#x2F; C# can sometimes be orders of magnitude faster than C, only because they&#x27;re more expressive languages. If you have a limited amount of time, it may be possible to write an efficient, optimal and concurrent algorithm in Java, while in C, all you can do is the simplest possible solution. Linked lists versus slices (which are much more cache-friendly) are the perfect example here.
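To make the linked list versus slice contrast concrete, here is a small sketch using only the Rust standard library. It compares the two layouts and does not measure anything; any real speed difference depends on data size, allocator and hardware. The function names are illustrative.

```rust
use std::collections::LinkedList;

/// Sum over a slice: the elements sit next to each other in memory,
/// so a traversal walks one contiguous region.
fn sum_slice(values: &[u64]) -> u64 {
    values.iter().sum()
}

/// Sum over a linked list: every node is its own heap allocation,
/// so a traversal follows a pointer for each element.
fn sum_list(values: &LinkedList<u64>) -> u64 {
    values.iter().sum()
}

fn main() {
    let vec: Vec<u64> = (1..=1_000).collect();
    let list: LinkedList<u64> = vec.iter().copied().collect();

    // Same result, different memory layout.
    assert_eq!(sum_slice(&vec), sum_list(&list));
    println!("both sums: {}", sum_slice(&vec));
}
```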
      • adastra22144 days ago
        These things will be corrected over time.
      • ramon156144 days ago
        Vibe coded is fine, but keep the comments useful. GPTs are so quick to put a comment on everything that it kind of enriches your codebase with slop. I wouldn&#x27;t call it rotting, but it&#x27;s definitely redundant.
    • sloppytoppy145 days ago
      [flagged]
  • untrimmed145 days ago
    As someone who has spent days wrestling with Python dependency hell just to get a model running, a simple cargo run feels like a dream. But I&#x27;m wondering, what was the most painful part of NOT having a framework? I&#x27;m betting my coffee money it was debugging the backpropagation logic.
    • ricardobeat145 days ago
      Have you tried uv [1]? It has removed 90% of the pain of running python projects for me.<p>[1] <a href="https:&#x2F;&#x2F;github.com&#x2F;astral-sh&#x2F;uv" rel="nofollow">https:&#x2F;&#x2F;github.com&#x2F;astral-sh&#x2F;uv</a>
      • mtlmtlmtlmtl145 days ago
        I&#x27;m sure it&#x27;s true and all. But I&#x27;ve been hearing the same claim about all those tools uv is intended to replace, for years now. And every time I try to run any of those, as someone who&#x27;s not really a python coder, but can shit out scripts in it if needed and sometimes tries to run python software from github, it&#x27;s been a complete clusterfuck.<p>So I guess what I&#x27;m wondering is, are you a python guy, or are you more like me? because for basically any of these tools, python people tell me &quot;tool X solved all my problems&quot; and people from my own cohort tell me &quot;it doesn&#x27;t really solve anything, it&#x27;s still a mess&quot;.<p>If you <i>are</i> one of us, then I&#x27;m really listening.
        • hobofan145 days ago
          I&#x27;m one of you.<p>I&#x27;m about the highest tier of package manager nerd you&#x27;ll find out there, but despite all that, I&#x27;ve been struggling to create&#x2F;run&#x2F;manage venvs for ages. Always afraid of installing a pip package or some piece of python-based software (that might muck up Python versions).<p>I&#x27;ve been semi-friendly with Poetry already, but mostly because it was the best thing around at the time, and a step in the right direction.<p>uv has truly been a game changer. Try it out!
        • tinco145 days ago
          As a Ruby guy: uv makes Python feel like it finally passed the year 2010.
          • llIIllIIllIIl144 days ago
            Don’t forget to schedule your colonoscopy as a Ruby guy
        • jhardy54145 days ago
          I’m a “Python guy” in that I write Python professionally, but also am like you in that I’ve been extremely underwhelmed by Poetry&#x2F;Pipenv&#x2F;etc.<p>Python dependencies are still janky, but uv is a significant improvement over existing tools in both performance and ergonomics.
        • Yoric144 days ago
          As a developer: it basically solved all of my problems that could be solved by a package manager.<p>As an occasional trainer of scientists: it didn&#x27;t seem to help my students.
          • buildbot144 days ago
            It installs stuff super fast!<p>It sadly doesn’t solve stuff like transformer_engine being built with cxx11 ABI and pytorch isn’t by default, leading to missing symbols…
        • OrderlyTiamat144 days ago
          I&#x27;m (reluctantly) a python guy, and uv really is a much different experience for me than all the other tools. I&#x27;ve otherwise had much the same experience as you describe here. Maybe it&#x27;s because `uv` is built in rust? ¯\_ (ツ)_&#x2F;¯<p>But I&#x27;d also hesitate to say it &quot;solves all my problems&quot;. There&#x27;s plenty of python problems outside of the core focus of `uv`. For example, I think building a python package for distribution is still awkward and docs are not straightforward (for example, pointing to non-python files which I want to include was fairly annoying to figure out).
        • OoooooooO144 days ago
          As a mainly Python guy (Data Engineering so new project for every ETL pipeline = a lot of projects) uv solved every problem I had before with pip, conda, miniconda, pipx etc.
        • beacon294144 days ago
          It doesn&#x27;t handle python version management, it only handles pip. It doesn&#x27;t solve bundling Python.
          • re144 days ago
            It does handle python version management: <a href="https:&#x2F;&#x2F;docs.astral.sh&#x2F;uv&#x2F;concepts&#x2F;python-versions&#x2F;" rel="nofollow">https:&#x2F;&#x2F;docs.astral.sh&#x2F;uv&#x2F;concepts&#x2F;python-versions&#x2F;</a>
            • beacon294138 days ago
              That&#x27;s great news, I&#x27;ll have to try to replace pyenv (again).
        • J_Shelby_J145 days ago
          Isn’t UV essentially cargo for python?
          • adastra22144 days ago
            Somewhat literally so. It is written in Rust and makes use of the cargo-util crate for some overlapping functionality.
        • rossant144 days ago
          I know, but uv truly is different.
      • DiabloD3145 days ago
        uv is great, but I think the real fix is just abandoning Python.<p>The culture that language maintains is rather hostile to maintainable development, easier to just switch to Rust and just write better code by default.
        • trklausss145 days ago
          Every tool for the right job. If you are doing tons of scripting (for e.g. tests on platforms different than Rust), Python can be a solid valid alternative.<p>Also, tons of CAE platforms have Python bindings, so you are &quot;forced&quot; to work on Python. Sometimes the solution is not just &quot;abandoning a language&quot;.<p>If it fits your purpose, knock yourself out, for others that may be reading: uv is great for Python dependency management on development, I still have to test it for deployment :)
          • aeve890145 days ago
            &gt;Every tool for the right job. If you are doing tons of scripting (for e.g. tests on platforms different than Rust), Python can be a solid valid alternative.<p>I&#x27;d say Go is a better alternative if you want to replace python scripting. Less friction and much faster compilation times than Rust.
            • DiabloD3145 days ago
              I am not a huge fan of Go, but if all the world&#x27;s &quot;serious&quot; Python became Go, the average code quality would skyrocket, so I think I can agree to this proposal.
            • physicsguy145 days ago
              Go performance is terrible for numeric stuff though, no SIMD support.
              • 9rx145 days ago
                That&#x27;s not really true, but we&#x27;re talking about a Python replacement for scripting tasks, not core compute tasks, anyway. It is not like Python is the paragon of SIMD support. Any real Python workloads end up being written in C for good reason, using Python only as the glue. Go can also interface with C code, and despite all the flak it gets for its C call overhead it is still significantly faster at calling C code than Python is.
                • adastra22144 days ago
                  For the record, for people reading this: I wrote a multithreaded SIMD-heavy compute task in Go, and it suffered only a 5% slowdown vs the original hand-optimized C++ version.<p>The low-level SIMD stuff was called out to over the C FFI bridge; golang was used for the rest of the program.
              • DiabloD3145 days ago
                (given the context of LLMs) Unless you&#x27;re doing CPU-side inference for corner cases where GPU inference is worse, lack of SIMD isn&#x27;t a huge issue.<p>There are libraries to write SIMD in Go now, but I think the better fix is being able to autovectorize during the LLVM IR optimization stage, so it&#x27;s available to multiple languages.<p>I think LLVM has it now, it&#x27;s just not super great yet.
              • wild_egg145 days ago
                Lots of packages out there using SIMD for lots of things.<p>You can always drop into straight assembly if you need to as well. Go&#x27;s assembler DX is quite nice after you get used to it.
              • pjmlp144 days ago
                Go itself no, but luckily like in any compiler toolchain, there is an Assembler available.
              • pclmulqdq145 days ago
                There are Go SIMD libraries now, and there&#x27;s also easy use of C libraries via Cgo.
        • airza145 days ago
          There&#x27;s not really another game in town if you want to do fast ML development :&#x2F;
          • DiabloD3145 days ago
            Dunno, almost all of the people I know anywhere in the ML space are on the C and Rust end of the spectrum.<p>Lack of types, lack of static analysis, lack of ... well, lack of everything Python doesn&#x27;t provide (and fights users on) costs too much developer time. It is a net negative to continue pouring time and money into anything Python-based.<p>The sole exception I&#x27;ve seen in my social circle is those working at companies that don&#x27;t directly do ML, but provide drivers&#x2F;hardware&#x2F;supporting software to ML people in academia, and have to try to fix their cursed shit for them.<p>Also, fwiw, there is no reason why Triton is Python. I dislike Triton for a lot of reasons, but it&#x27;s just a matmul kernel DSL, there is nothing inherent in it that has to be, or benefits from, being Python.... it takes DSL in, outputs shader text out, then has the vendor&#x27;s API run it (i.e., CUDA, ROCm, etc). It, too, would benefit from becoming Rust.
            • mountainriver145 days ago
              I love Rust and C, I write quite a bit of both. I am an ML engineer by trade.<p>To say most ML people are using Rust and C couldn’t be further from the truth
              • Narishma145 days ago
                They said most people they knew, not most people.
            • wolvesechoes144 days ago
              &gt; It, too, would benefit from becoming Rust.<p>Yet it was created for Python. Someone took that effort and did it. No one took that effort in Rust. End of the story of crab&#x27;s superiority.<p>Python community is constantly creating new, great, highly usable packages that become de facto industry standards, and maintain old ones for years, creating tutorials, trainings and docs. Commercial vendors ship Python APIs to their proprietary solutions. Whereas Rust community is going through forums and social media telling them that they should use Rust instead, or that they &quot;cheated&quot; because those libraries are really C&#x2F;C++ libraries (and BTW those should be done in Rust as well, because safety).
            • nkozyra145 days ago
              &gt; Dunno, almost all of the people I know anywhere in the ML space are on the C and Rust end of the spectrum.<p>I wish this were broadly true.<p>But there&#x27;s too much legacy Python sunk cost for most people though. Just so much inertia behind Python for people to abandon it and try to rebuild an extensive history of ML tooling.<p>I think ML will fade away from Python eventually but right now it&#x27;s still everywhere.
              • DiabloD3144 days ago
                A lot of what I see in ML is all focused around Triton, which is why I mentioned it.<p>If someone wrote a Triton impl that is all Rust instead, that would do a _lot_ of the heavy lifting on switching... most of their hard code is in Triton DSL, not in Python, the Python is all boring code that calls Triton funcs. That changes the argument on cost for a lot of people, but sadly not all.
            • airza145 days ago
              Okay. Humor me. I want to write a transformer-based classifier for a project. I am accustomed to the pytorch and tensorflow libraries. What is the equivalent using C?
              • adastra22144 days ago
                You do know that tensorflow was written in C++ and the Python API bolted on top?
                • wolvesechoes144 days ago
                  It could be written in a mix of Cobol and APL. No one cares.<p>People saying &quot;oh those Python libraries are just C&#x2F;C++ libraries with Python API, every language can have them&quot; have one problem - no other language has them (with such extensive documentation, tutorials etc.)
                  • adastra22144 days ago
                    Tensorflow has extensive documentation of its C++ interface, as that is the primary interface for the library (the Python API is a wrapper on top).
                    • wolvesechoes144 days ago
                      I hoped it was quite obvious that by &quot;other languages&quot; I meant &quot;other than Python and C&#x2F;C++ in which they are written&quot;.<p>At least sibling actually mentioned Java.
                      • adastra22144 days ago
                        Scroll up this thread and the other poster was asking if you can use pytorch and tensorflow from C. Both are C++ libraries, so accessing them from C&#x2F;C++ is pretty trivial and has first-class support.
                        • wolvesechoes144 days ago
                          You should read more carefully before responding.<p>I said &quot;beside Python, and C&#x2F;C++ in which they are written&quot;<p>You: &quot;you can see people are using it from C&quot;.<p>What a surprise that library usable from Python through wrapped C API has C API!
                  • pjmlp144 days ago
                    PyTorch and Tensorflow also support C++ (naturally) and Java.
                • airza144 days ago
                  I am. Are you suggesting that as an alternative to the python bindings i should use C to invoke the C++ ABI for tensorflow?
                  • adastra22144 days ago
                    &gt; Okay. Humor me. I want to write a transformer-based classifier for a project. I am accustomed to the pytorch and tensorflow libraries. What is the equivalent using C?<p>Use C++ bindings in libtorch or tensorflow. If you actually mean C, and not C++, then you would need a shim wrapper. C++ -&gt; C is pretty easy to do.
          • pjmlp144 days ago
            PyTorch also supports C++ and Java, Tensorflow also does C++ and Java, Apple AI is exposing ML libraries via Swift, Microsoft is exposing their AI stuff via .NET and Java as well, then there is Julia and Mojo is coming along.<p>It is happening.
            • famouswaffles144 days ago
              TensorFlow is a C++ library with a python wrapping, yet nobody (obviously exaggeration) actually uses tensorflow (or torch) in C++ for ML R&amp;D.<p>It&#x27;s like people just don&#x27;t get it. The ML ecosystem in python didn&#x27;t just spring from the ether. People wanted to interface in python badly, that&#x27;s why you have all these libraries with substantial code in another language yet development didn&#x27;t just shift to that language.<p>If python was fast enough, most would be fine to ditch the C++ backends and have everything in python, but the reverse isn&#x27;t true. The C++ interface exists, and no-one is using it.
              • pjmlp144 days ago
                The existing C++ API is done according to those &quot;beautiful&quot; Google guidelines, thus it could be much better.<p>However people are definitely using it, as Android doesn&#x27;t do Python, neither does ChromeOS.
                • famouswaffles144 days ago
                  &gt;However people are definitely using it, as Android doesn&#x27;t do Python, neither does ChromeOS.<p>That&#x27;s not really a reason to think people are using it for that when things like onnxruntime and executorch exist. In fact, they are very likely not using it for that, if only because the torch runtime is too heavy for distribution on the edge anyway (plus android can run python).<p>Regardless, that&#x27;s just inference of existing models (which yes I&#x27;m sure happens in other languages), not research and&#x2F;or development of new models (what &#x2F;u&#x2F;airza was concerned about), which is probably 99% in python.
                  • pjmlp144 days ago
                    Well, onnxruntime is also having polyglot bindings, and yet another way to avoid Python.<p>Yes, you can package Python alongside your APK, if you feel like having fun making it compiled with NDK, and running stuff even more slowly in phone ARM chipsets over Dalvik JNI than it already is on desktops.
        • pjmlp145 days ago
          I have known Python since version 1.6.<p>It is great for learning how to program (BASIC replacement), OS scripting tasks as a Perl replacement, and embedded scripting in GUI applications.<p>Additionally, understand PYTHONPATH, and don&#x27;t mess with anything else.<p>All the other stuff that is supposed to fix Python issues, I never bothered with.<p>Thankfully, other languages are also starting to have bindings to the same C and C++ compute libraries.
        • wavemode144 days ago
          Rust is not a viable replacement for Python except in a few domains.
        • WhereIsTheTruth145 days ago
          abandoning Python for Rust in AI would cripple the field, not rescue it<p>the disease is the cargo cult addiction (which Rust is full of) to micro libraries, not the language that carries 90% of all peer reviewed papers, datasets, and models published in the last decade<p>every major breakthrough, from AlphaFold to Stable Diffusion, ships with a Python reference implementation because that is the language researchers can read, reproduce, and extend, remove Python and you erase the accumulated, executable knowledge of an entire discipline overnight, enforcing Rust would sabotage the field more than anything<p>on the topic of uv, it will do more harm than good by enabling and empowering cargo cults on a systemic level<p>the solution has always been education, teaching juniors to value simplicity, portability and maintainability
          • stonemetal12144 days ago
            Nah, it would be like going from chemistry to chemical engineering. Doing chemical reactions in the lab by hand is great for learning but you aren&#x27;t going to run a fleet of cars on hand made gas. Getting ML out of the lab and into production needs that same mental conversion from CS to SE.
        • Exuma145 days ago
          i hate python, but the idea of replacing python with rust is absurd
      • TheAceOfHearts145 days ago
        Switching to uv made my python experience drastically better.<p>If something doesn&#x27;t work or I&#x27;m still encountering any kind of error with uv, LLMs have gotten good enough that I can just copy &#x2F; paste the error and I&#x27;m very likely to zero-in on a working solution after a few iterations.<p>Sometimes it&#x27;s a bit confusing figuring out how to run open source AI-related python projects, but the combination of uv and iterating on any errors with an LLM has so far been able to resolve all the issues I&#x27;ve experienced.
      • shepardrtc144 days ago
        uv has been amazing for me. It just works, and it works fast.
    • farhanhubble144 days ago
      I have heard of similar experiences on HN a few times. Haven&#x27;t seen any such conflicts on real projects in the last five years or so, since I started using Poetry and then uv. I deal with data science code and the people writing it have a tendency to create dependency spaghetti, for example including the Scikit package in mainly Pytorch code, just because they need a tried-and-tested accuracy() function.<p>I do remember banging my head against failed dependency resolution in my early days of Python, circa 2014, with Pip and Conda, etc.<p>The dependency issues I have faced were mostly due to data science folks pinning exact package versions for the sake of replicability, in requirements.txt for example.
    • farhanhubble144 days ago
      My biggest gripes with Python are:<p>- exports being broken if code is executed from a different directory<p>- packaging being more complicated than it should be<p>and I don&#x27;t even have too much experience in the area of packaging, besides occasionally publishing to a private repo.
    • codetiger145 days ago
      I guess, resource utilization like GPU, etc
    • Galanwe145 days ago
      &gt; spent days wrestling with Python dependency hell<p>I mean I would understand that comment in 2010, but in 2025 it&#x27;s grossly ridiculous.
      • virtualritz144 days ago
        So in 2025, in Python, if I depend on two packages, A and B, and they both depend on different, API-incompatible or behavior-incompatible (or both) versions of C, that won&#x27;t be an issue?<p>That&#x27;s not my experience and e.g. uv hasn&#x27;t helped me with that. I believe this is an issue with Python itself?<p>If parent was saying something &quot;grossly ridiculous&quot; I must be doing something wrong too. And I&#x27;m happy to hear what, as that would lower the pain of using Python.<p>I.e. this was presumably true three years ago:<p><a href="https:&#x2F;&#x2F;stackoverflow.com&#x2F;questions&#x2F;70828570&#x2F;what-if-two-python-packages-have-different-versions-of-package-dependencies" rel="nofollow">https:&#x2F;&#x2F;stackoverflow.com&#x2F;questions&#x2F;70828570&#x2F;what-if-two-pyt...</a>
        • Galanwe144 days ago
          Well, first, this is a purposefully contrived example that pretty much does not happen in real life scenarios. So you&#x27;re pretty much acknowledging that there is no real problem by having to resort to such lengths.<p>Second, what exactly would you like to happen in that instance? You want to have, in a single project, the same library but at different and conflicting versions. The only way to solve that is to disambiguate, per call site, each use of said library. And guess what, that problem exists and was solved 30 years ago by simply providing different package names for different major versions. You want to use both gtk 1 and gtk 2? Well, you have the &quot;gtk&quot; and &quot;gtk2&quot; packages, done, disambiguated. I don&#x27;t think there is any package manager out there providing &quot;gtk&quot; and having version 1 and 2, it&#x27;s just &quot;gtk&quot; and &quot;gtk2&quot;.<p>Now we could design a solution around that I guess, nothing is impossible in this brave new world of programming, but that seems like a wasted effort for not-a-problem.
          • virtualritz144 days ago
            &gt; Well, first, this a purposefully contrived example [...]<p>So you are saying that (a) I made this up and (b) intentionally so.<p>How so? I am always flabbergasted when people make such statements.<p>You know nothing of my use of Python. I work in a specific field (computer graphics) and within that an even more specific sub field, visual effects.<p>I have to use Python maybe every three months. And there is some dependency related pain <i>every single time</i>. Python&#x27;s dependency management &quot;is straight up terrible&quot; (quoted from elsewhere in this thread), I concur.<p>And thusly, in my world, this example is not &quot;contrived&quot; and given the aforementioned circumstances -- that were unknown to you -- even less so &quot;purposefully&quot;.<p>&gt; Second, what exactly would you like to happen in that instance?<p>I would expect Python to namespace-wrap (on-the-fly) conflicting versions.<p>See Rust for some similar solution.<p>&gt; [...] a wasted effort for not-a-problem.<p>If this was &quot;not-a-problem&quot; why would Rust&#x2F;cargo go out of its way to solve it? And why would people regularly point out for this to be one of <i>the</i> reasons dependencies are indeed a &quot;not-a-problem&quot; in Rust and how great that is compared to whatever else they battled with before?<p>Indeed you and I do live in different worlds.
            • Galanwe143 days ago
              &gt; I am always flabbergasted when people make such statements<p>Sit down, have a coffee, re-read your whole comment, create bullet points for your case, and try to have an *objective* look at your arguments.<p>- You are frustrated with your use case, seemingly to the point where you don&#x27;t care about reasonable arguments but just want to lash out at something.<p>- By your own description, you have a specific use case, in a specific field, in a narrower sub field.<p>- You are not primarily a Python developer, and use it every 3 months when you have to.<p>Your experience, in your field, on your project, does not make you a poster child of what everyday Python is like. Sorry for the news.<p>Now I get that frustration of &quot;I just want things done and not care about that whole ecosystem&quot;, but the reality is, that&#x27;s not a Python thing, it&#x27;s a &quot;that&#x27;s not my preferred stack&quot; thing.<p>I have that same feeling whenever I need to get things done in a stack I don&#x27;t know, and get stuck by something &lt;insert preferred stack&gt; does.<p>I used Rust the other day and ended up in a case where I needed to implement a trait I do not own. Well, that ended up not being possible. That pissed me off for a time, as that *really* made the most sense for my use case. Yet... I&#x27;m not going to complain on the internet that Rust is unusable because of &quot;trait ownership hell&quot;.<p>If we set the frustration aside for a minute:<p>Your use case, as a fact, is very contrived.<p>One does not stumble every day into projects that need to work with different, incompatible, similarly named versions of the same library.<p>As I mentioned, when that need arises, library maintainers usually just create a new package, with a different name.<p>That is what has been done for 99.99% of package managers ever in existence, be it system package managers, or language package managers.<p>And the reason for it is really just common sense:<p>- It does not happen very often<p>- Whenever that happens, the solution of providing a new package is the simplest and most well established<p>- The pattern works, and has been used for 30 years<p>- It is unambiguous<p>Note that Rust does _not_ magically solve that problem either, as there is no one-size-fits-all solution to this problem. The best Rust can do is:<p>- In the subset use case of this problem where said dependency is solely accessed from the inside of another dependency<p>- And said library symbols need not be externally accessible<p>- And said library data structures need not be shared<p>- Then Rust can build the outermost dependency against a specific version of said inner dependency.
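For readers unfamiliar with the Rust limitation mentioned above: assuming the situation was the orphan rule (implementing a trait from another crate for a type from another crate), the usual workaround is the newtype pattern. This is a generic sketch, not the commenter's actual code, and the `Tags` wrapper is a hypothetical name.

```rust
use std::fmt;

// `Vec<String>` and `fmt::Display` are both defined outside this crate, so
// `impl fmt::Display for Vec<String>` is rejected by the orphan rule.
// Wrapping the foreign type in a local tuple struct makes the impl legal.
struct Tags(Vec<String>);

impl fmt::Display for Tags {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Delegate to the wrapped vector through `self.0`.
        write!(f, "[{}]", self.0.join(", "))
    }
}

fn main() {
    let tags = Tags(vec!["rust".to_string(), "python".to_string()]);
    println!("{}", tags);
}
```

The cost is that the wrapper is a distinct type, so methods of the inner type have to be reached through `.0` or re-exposed by hand.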
          • adastra22144 days ago
            Maybe this doesn’t happen in Python, but I find that hard to believe. This is a common thing in Rust, where cargo does support compiling with multiple versions of the same crate. If I have dependency X that depends on version 1.x of crate Z, and dependency Y which depends on version 2.x, cargo will compile BOTH versions of crate Z, and handle the magic of linking dependencies X and Y to their own, different copies of this common dependency.
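A single-file analogy of that layout, using modules as stand-ins for crates. It only illustrates the idea that X and Y each see their own copy of Z; it is not how cargo resolves or links anything. All names (`z_v1`, `z_v2`, `x`, `y`, `Point`, `norm1`) are hypothetical.

```rust
// Stand-in for crate Z at version 1.x.
mod z_v1 {
    pub struct Point { pub x: i32, pub y: i32 }
    pub fn norm1(p: &Point) -> i32 { p.x.abs() + p.y.abs() }
}

// Stand-in for crate Z at version 2.x, with an incompatible API.
mod z_v2 {
    pub struct Point { pub x: i64, pub y: i64, pub z: i64 }
    pub fn norm1(p: &Point) -> i64 { p.x.abs() + p.y.abs() + p.z.abs() }
}

// Dependency X only ever sees Z 1.x.
mod x {
    use super::z_v1;
    pub fn run() -> i32 { z_v1::norm1(&z_v1::Point { x: 3, y: -4 }) }
}

// Dependency Y only ever sees Z 2.x.
mod y {
    use super::z_v2;
    pub fn run() -> i64 { z_v2::norm1(&z_v2::Point { x: 1, y: -2, z: 3 }) }
}

fn main() {
    // Both copies coexist because neither leaks its `Point` to the other.
    println!("x: {}, y: {}", x::run(), y::run());
}
```

`z_v1::Point` and `z_v2::Point` are unrelated types, so a value from one cannot be passed where the other is expected. This arrangement works only as long as the two copies are never exchanged across the boundary.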
            • steveklabnik144 days ago
              Yes, Rust can do this. I know Ruby cannot, and I believe Python may not either, but I am less sure about it because I’m less good with Python’s semantics here, but I’d believe your parent.
      • adastra22144 days ago
        Yeah, because of a tool written in Rust, copying the Rust way of doing things for Python developers.
        • Galanwe144 days ago
          I am not even thinking of `uv`, but rather of pyproject.toml, and the various improvements as to how dependencies are declared and resolved. You don&#x27;t get much simpler than a toml file listing your dependencies and constraints, along with a lock file.<p>Also let&#x27;s keep middle school taunts at home.
    • zoobab145 days ago
      &quot;a simple cargo run feels like a dream&quot;<p>A cargo build that warms up your CPU during winter while recompiling the whole internet is better?
      • surajrmal145 days ago
        It has 3 direct dependencies and not too many more transitively. You&#x27;re certainly not recompiling the internet. If you&#x27;re going to run a local llm I doubt you&#x27;re building on a toaster so build speed won&#x27;t be a big ordeal either.
      • tracker1144 days ago
        I recently upped to a 9950X with a gen5 nvme.. TBH, even installing a few programs from cargo (which does compiles) is pretty quick now. Even coming from a 5950X with a gen4 drive.
    • taminka145 days ago
      lowkey ppl who praise cargo seem to have no idea of the tradeoffs involved in dependency management<p>the difficulty of including a dependency should be proportional to the risk you&#x27;re taking on, meaning it shouldn&#x27;t be as difficult as it is in, say, C where every other library is continually reinventing the same 5 utilities, but also not as easy as it is with npm or cargo, because you get insane dependency clutter, and all the related issues like security, build times, etc<p>how good a build system is isn&#x27;t equivalent to how easy it is to include a dependency. modern languages should have a consistent build system, but having a centralised package repository that anyone can freely push to and pull from, and having those dependencies freely take on any number of other dependencies, is a bad way to handle dependencies
      • dev_l1x_be145 days ago
        &gt; lowkey ppl who praise cargo seem to have no idea<p>Way to go on insulting people on HN. Cargo is literally the reason why people are coming to Rust from languages like C++, where the lack of standardized tooling is a giant glaring bomb crater that poses a burden on people every single time they need to do some basic things (like, for example, version upgrades).<p>Example:<p><a href="https:&#x2F;&#x2F;github.com&#x2F;facebook&#x2F;folly&#x2F;blob&#x2F;main&#x2F;build.sh" rel="nofollow">https:&#x2F;&#x2F;github.com&#x2F;facebook&#x2F;folly&#x2F;blob&#x2F;main&#x2F;build.sh</a>
        • taminka145 days ago
          i&#x27;m saying that ease of dependency inclusion <i>should not</i> be a main criterion for evaluating how good a build system is, not that it isn&#x27;t the main criterion for many people...<p>like the entire point of my comment is that people have misguided criteria for evaluating build systems, and your comment seems to just affirm this?
          • Sl1mb0145 days ago
            &gt; dependency inclusion _should not_ be a main criterion for evaluating how good a build system is<p>That&#x27;s just like, your opinion, man.
            • lutusp144 days ago
              &gt; That&#x27;s just like, your opinion, man.<p>I would love to know how many younger readers recognize this classic movie reference.
            • taminka145 days ago
              i mean, unless you have some absolute divine truths, that&#x27;s kind of the best i have :shrug
              • virtualritz144 days ago
                There are no truths but your opinion in this case runs counter to what 35 years of developing software have taught me.<p>Obviously, I may be an outlier. Some crank who&#x27;s just smitten by the proposal of spending his time writing code instead of trying to get a dependency (and its sub-dependencies and their sub-dependencies) to build at all (e.g. C&#x2F;C++) or to have the right version that works with ALL the code that depends on it (e.g. Python).<p>I.e. I use cargo foremost (by a large margin) for that reason.
                • taminka144 days ago
                  in my original comment i specifically mentioned that C (and C++) situation is also too extreme and not optimal...
          • adwn145 days ago
            &gt; <i>like the entire point of my comment is that people have misguided criteria for evaluating build systems, and your comment seems to just affirm this?</i><p>I think dev_l1x_be&#x27;s comment is meant to imply that your belief about people having misguided criteria [for evaluating build systems] is itself misguided, and that your favored approach [that the difficulty of including a dependency should be proportional to the risk you&#x27;re taking on] is also misguided.
            • taminka145 days ago
              my thesis is that negative externalities of build systems are important and i don&#x27;t know how to convince of importance of externalities someone whose value system is built specifically on ignoring externalities and only factoring in immediate convenience...
          • CodeMage144 days ago
            Dependency management should most definitely be one of the main criteria for evaluating how good a build system is. What&#x27;s misguided is intentionally opting for worse dependency management in an attempt to solve a people problem, i.e. being careless about adding dependencies to your project in circumstances when you should be careful.
        • huflungdung145 days ago
          [dead]
      • quantumspandex145 days ago
        Security is another problem, and should be tackled systematically. Artificially making dependency inclusion hard is not it and is detrimental to the more casual use cases.
      • hobofan145 days ago
        &gt; but having a centralised package repository that anyone freely pull to&#x2F;from, and having those dependencies freely take on any number of other dependencies is a bad way to handle dependencies<p>So put a slim layer of enforcement to enact those policies on top? Who&#x27;s stopping you from doing that?
      • MangoToupe144 days ago
        &gt; the difficulty of including a dependency should be proportional to the risk you&#x27;re taking on<p>Why? Dependency hell is an unsolvable problem. Might as well make it easier to evaluate the tradeoff between dependencies and productivity. You can always arbitrarily ban dependencies.
      • itsibitzi145 days ago
        What tool or ecosystem does this well, in your opinion?
        • taminka145 days ago
          any language that has a standardised build system (virtually every language nowadays?), but doesn&#x27;t have a centralised package repository, such that including a dependency is seamless, but takes a bit of time and intent<p>i like how zig does this, and the creator of odin has a whole talk where he basically uses the same arguments as my original comment to reason why odin doesn&#x27;t have a package manager
          • zoobab145 days ago
            &quot;a standardised build system (virtually every language nowadays?)&quot;<p>Python packages still manage poorly dependencies that are in another lang like C or C++.
            • taminka144 days ago
              that&#x27;s two different languages, they don&#x27;t have a standardised build system across them
      • IshKebab145 days ago
        This is the weirdest excuse for Python&#x27;s terrible tooling that I&#x27;ve ever heard.<p>&quot;It&#x27;s <i>deliberately</i> shit so that people won&#x27;t use it unless they really have to.&quot;
        • taminka145 days ago
          i just realised that my comment sounds like it&#x27;s praising python&#x27;s package management since it&#x27;s often so inconvenient to use, i want to mention that that wasn&#x27;t my intended point, python&#x27;s package management contains the worst aspects from both worlds: being centralised AND horrible to use lol<p>my mistake :)
      • jokethrowaway145 days ago
        Is your argument that python&#x27;s package management &amp; ecosystem is bad by design - to increase security?<p>In my experience it&#x27;s just bugs and poor decision making on the maintainers (eg. pytorch dropping support for intel mac, leftpad in node) or on the language and package manager developers side (py2-&gt;3, commonjs, esm, go not having a package manager, etc).<p>Cargo has less friction than pypi and npm. npm has less friction than pypi.<p>And yet, you just need to compromise one lone, unpaid maintainer to wreck the security of the ecosystem.
        • taminka145 days ago
          nah python&#x27;s package management is just straight up terrible by every metric, i just used it as a tangent to talk about how imo ppl incorrectly evaluate build systems
  • techsystems145 days ago
    &gt; ndarray = &quot;0.16.1&quot; rand = &quot;0.9.0&quot; rand_distr = &quot;0.5.0&quot;<p>Looking good!
    • kachapopopow145 days ago
      I was slightly curious: cargo tree llm v0.1.0 (RustGPT) ├── ndarray v0.16.1 │ ├── matrixmultiply v0.3.9 │ │ └── rawpointer v0.2.1 │ │ [build-dependencies] │ │ └── autocfg v1.4.0 │ ├── num-complex v0.4.6 │ │ └── num-traits v0.2.19 │ │ └── libm v0.2.15 │ │ [build-dependencies] │ │ └── autocfg v1.4.0 │ ├── num-integer v0.1.46 │ │ └── num-traits v0.2.19 (*) │ ├── num-traits v0.2.19 (*) │ └── rawpointer v0.2.1 ├── rand v0.9.0 │ ├── rand_chacha v0.9.0 │ │ ├── ppv-lite86 v0.2.20 │ │ │ └── zerocopy v0.7.35 │ │ │ ├── byteorder v1.5.0 │ │ │ └── zerocopy-derive v0.7.35 (proc-macro) │ │ │ ├── proc-macro2 v1.0.94 │ │ │ │ └── unicode-ident v1.0.18 │ │ │ ├── quote v1.0.39 │ │ │ │ └── proc-macro2 v1.0.94 (*) │ │ │ └── syn v2.0.99 │ │ │ ├── proc-macro2 v1.0.94 (*) │ │ │ ├── quote v1.0.39 (*) │ │ │ └── unicode-ident v1.0.18 │ │ └── rand_core v0.9.3 │ │ └── getrandom v0.3.1 │ │ ├── cfg-if v1.0.0 │ │ └── libc v0.2.170 │ ├── rand_core v0.9.3 (*) │ └── zerocopy v0.8.23 └── rand_distr v0.5.1 ├── num-traits v0.2.19 (*) └── rand v0.9.0 (*)<p>yep, still looks relatively good.
      • imtringued145 days ago
        <p><pre><code>
cargo tree
llm v0.1.0 (RustGPT)
├── ndarray v0.16.1
│   ├── matrixmultiply v0.3.9
│   │   └── rawpointer v0.2.1
│   │   [build-dependencies]
│   │   └── autocfg v1.4.0
│   ├── num-complex v0.4.6
│   │   └── num-traits v0.2.19
│   │       └── libm v0.2.15
│   │       [build-dependencies]
│   │       └── autocfg v1.4.0
│   ├── num-integer v0.1.46
│   │   └── num-traits v0.2.19 (*)
│   ├── num-traits v0.2.19 (*)
│   └── rawpointer v0.2.1
├── rand v0.9.0
│   ├── rand_chacha v0.9.0
│   │   ├── ppv-lite86 v0.2.20
│   │   │   └── zerocopy v0.7.35
│   │   │       ├── byteorder v1.5.0
│   │   │       └── zerocopy-derive v0.7.35 (proc-macro)
│   │   │           ├── proc-macro2 v1.0.94
│   │   │           │   └── unicode-ident v1.0.18
│   │   │           ├── quote v1.0.39
│   │   │           │   └── proc-macro2 v1.0.94 (*)
│   │   │           └── syn v2.0.99
│   │   │               ├── proc-macro2 v1.0.94 (*)
│   │   │               ├── quote v1.0.39 (*)
│   │   │               └── unicode-ident v1.0.18
│   │   └── rand_core v0.9.3
│   │       └── getrandom v0.3.1
│   │           ├── cfg-if v1.0.0
│   │           └── libc v0.2.170
│   ├── rand_core v0.9.3 (*)
│   └── zerocopy v0.8.23
└── rand_distr v0.5.1
    ├── num-traits v0.2.19 (*)
    └── rand v0.9.0 (*)
</code></pre>
      • cmrdporcupine145 days ago
        linking both rand-core 0.9.0 and rand-core 0.9.3 which the project could maybe avoid by just specifying 0.9 for its own dep on it
        • Diggsey144 days ago
          It doesn&#x27;t link two versions of `rand-core`. That&#x27;s not even possible with rust (you can only link two semver-incompatible versions of the same crate). And dependency specifications in Rust don&#x27;t work like that - unless you explicitly override it, all dependencies are semver constraints, so &quot;0.9.0&quot; will happily match &quot;0.9.3&quot;.
          • 0xffff2144 days ago
            So there&#x27;s no difference at all between &quot;0&quot;, &quot;0.9&quot; and &quot;0.9.3&quot; in cargo.toml (Since semver says only major version numbers are breaking)? As a decently experienced Rust developer, that&#x27;s deeply surprising to me.<p>What if devs don&#x27;t do a good job of versioning and there is a real incompatibility between 0.9.3 and 0.9.4? Surely there&#x27;s some way to actually require an exact version?
            • Diggsey144 days ago
              They are different:<p><pre><code>
              &quot;0&quot;:     &quot;&gt;=0.0.0, &lt;1.0.0&quot;
              &quot;0.9&quot;:   &quot;&gt;=0.9.0, &lt;0.10.0&quot;
              &quot;0.9.3&quot;: &quot;&gt;=0.9.3, &lt;0.10.0&quot;
              </code></pre> Notice how the minimum bound changes, while the upper bound for a 0.x version is the next minor version (for 1.x and later it is the next major version).<p>The reason for this is that unless otherwise specified, the ^ operator is used, so &quot;0.9&quot; is actually &quot;^0.9&quot;, which then gets translated into the kind of range specifier I showed above.<p>There are other operators you can use, these are the common ones:<p><pre><code>
              (default) ^  Semver compatible, as described above
              &gt;=           Inclusive lower bound only
              &lt;            Exclusive upper bound only
              =            Exact bound
              </code></pre> Note that while an exact bound will force that exact version to be used, it still doesn&#x27;t allow two semver compatible versions of a crate to exist together. If cargo can&#x27;t find a single version that satisfies all constraints, it will just error.<p>For this reason, if you are writing a library, you should <i>in almost all cases</i> stick to regular semver-compatible dependency specifications.<p>For binaries, it is more common to want exact control over versions and you don&#x27;t have downstream consumers for whom your exact constraints would be a nightmare.
            • steveklabnik144 days ago
              Note that in the output, there is <i>rand</i> 0.9.0, and two instances of <i>rand_core</i> 0.9.3. You may have thought it selected two versions because you missed the <i>_core</i> there.<p>&gt; So there&#x27;s no difference at all between &quot;0&quot;, &quot;0.9&quot; and &quot;0.9.3&quot; in cargo.toml<p>No, there is a difference, in particular, they all specify different minimum bounds.<p>The trick is that these are using the ^ operator to match, which means that the version &quot;0.9.3&quot; will satisfy all of those constraints, and so Cargo will select 0.9.3 (the latest version at the time I write this comment) as the one version to satisfy all of them.<p>Cargo will only select multiple versions when they&#x27;re not compatible, that is, if you had something like &quot;1.0.0&quot; and &quot;0.9.0&quot;.<p>&gt; Surely there&#x27;s some way to actually require an exact version?<p>Yes, you&#x27;d have to use `=`, like `=0.9.3`. This is heavily discouraged because it would lead to a proliferation of duplication in dependency versions, which isn&#x27;t necessary unless you are trying to avoid some sort of specific bugfix. This is sometimes done in applications, but basically should never be done in libraries.
              • 0xffff2144 days ago
                Sorry, I don&#x27;t understand the &quot;^ operator&quot; in this context. Do I understand correctly that cargo will basically select the latest release that matches within a major version, so if I have two crates that specify &quot;0.8&quot; and &quot;0.7.1&quot; as dependencies then the compiler will use &quot;0.8.n&quot; for both? And then if I add a new dependency that specifies &quot;0.9.5&quot;, all three crates would use &quot;0.9.5&quot;? Assuming I have that right, I&#x27;m quite surprised that it works in practice.
                • steveklabnik144 days ago
                  It’s all good. Let me break it down.<p>Semver specifies versions. These are the x.y.z (plus other optional stuff) triples you see. Nothing should be complicated there.<p>Tools that use semver to select versions also define syntax for defining which versions are acceptable. npm calls these “ranges”, cargo calls them “version requirements”, I forget what other tools call them. These are what you actually write in your Cargo.toml or equivalent. These are not defined by the semver specification, but instead, by the tools. They are <i>mostly</i> identical across tools, but not always. Anyway, they often use operators to define the ranges (that’s the name I’m going to use in this post because I think it makes the most sense.) So for example, ‘&gt;3.0.0’ means “any version where x &gt;= 3.” “=3.0.0” means “any version where x is 3, y is 0, and z is 0” which 99% of the time means only one version.<p>When you write “0.9.3” in a Cargo.toml, you’re writing a range, not a version. When you do not specify an operator, Cargo treats that as if you use the ^ operator. So “0.9.3” is equivalent to “^0.9.3” what does ^ do? It means two things, one if x is 0 and one if x is nonzero. Since “^0.9.3” has x of zero, this range means “any version where x is 0, y is 9, and z is &gt;= 3.” Likewise, “0.9” is equivalent to “^0.9.0” which is “any version where x is 0, y is 9, and z is &gt;=0.”<p>Putting these two together:<p><pre><code> 0.9.0 satisfies the latter, but not the former 0.9.1 satisfies the latter, but not the former 0.9.2 satisfies the latter, but not the former 0.9.3 satisfies both </code></pre> Given that 0.9.3 is a version that has been released, if one package depends on “0.9” and another depends on “0.9.3”, version 0.9.3 satisfies both constraints, and so is selected.<p>If we had “0.8” and “0.7.1”, no version could satisfy both simultaneously, as “y must be 8” and “y must be 7” would conflict. 
Cargo would give you both versions in this case, whichever y=8 and y=7 versions have the highest z at the time.
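The caret rule described above can be sketched in plain Rust with no dependencies. This is only an illustration of the matching logic, not Cargo's implementation (Cargo relies on the `semver` crate); the names `Version`, `parse`, `caret_upper` and `caret_matches` are hypothetical, and pre-release or build metadata is ignored.

```rust
// Illustrative sketch only: hypothetical helper names, no pre-release handling.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct Version {
    major: u64,
    minor: u64,
    patch: u64,
}

/// Parse "x", "x.y" or "x.y.z". Missing components default to 0.
/// Also returns how many components were actually written.
fn parse(text: &str) -> Option<(Version, usize)> {
    let parts: Vec<u64> = text
        .split('.')
        .map(|p| p.parse().ok())
        .collect::<Option<Vec<u64>>>()?;
    if parts.is_empty() || parts.len() > 3 {
        return None;
    }
    let get = |i: usize| parts.get(i).copied().unwrap_or(0);
    let version = Version { major: get(0), minor: get(1), patch: get(2) };
    Some((version, parts.len()))
}

/// Exclusive upper bound of a caret requirement: bump the left-most
/// non-zero component (or the last written one if all are zero).
fn caret_upper(v: Version, written: usize) -> Version {
    if v.major > 0 || written == 1 {
        Version { major: v.major + 1, minor: 0, patch: 0 }
    } else if v.minor > 0 || written == 2 {
        Version { major: 0, minor: v.minor + 1, patch: 0 }
    } else {
        Version { major: 0, minor: 0, patch: v.patch + 1 }
    }
}

/// Does `candidate` (a full x.y.z version) satisfy the requirement `^req`?
fn caret_matches(req: &str, candidate: &str) -> bool {
    let Some((lower, written)) = parse(req) else { return false };
    let Some((cand, _)) = parse(candidate) else { return false };
    cand >= lower && cand < caret_upper(lower, written)
}

fn main() {
    for v in ["0.9.0", "0.9.2", "0.9.3", "0.10.0"] {
        println!(
            "{v}: satisfies \"0.9\" = {}, satisfies \"0.9.3\" = {}",
            caret_matches("0.9", v),
            caret_matches("0.9.3", v)
        );
    }
}
```

Under these assumptions, 0.9.3 is the only listed version that satisfies both "0.9" and "0.9.3", which is why a resolver can pick a single copy for both dependents.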
                  • 0xffff2143 days ago
                    Awesome. Thanks for taking the time. Glad to understand all of this better. I feel a bit silly now meticulously going through and changing all of my &quot;0.9.3&quot;s to &quot;0.9&quot; in the past, but at least now I know better.
                    • steveklabnik143 days ago
                      You&#x27;re welcome!<p>It is true that, if the change works on z &lt; 3, you are expanding the possible set of versions a bit, so it&#x27;s not useless; one could argue that you should only depend on z != 1 if there&#x27;s a bug you want to make sure that you use the versions past when it works, otherwise, no reason to restrict yourself, but it&#x27;s not a big deal either way :)
          • eximius144 days ago
            This doesn&#x27;t sound right. If A depends on B and C - B and C can each bring their own versions of D, I thought?
            • Diggsey144 days ago
              Within a crate graph, for any given major version of a crate (eg. D v1) only a single minor version can exist. So if B depends on D v1.x, and C depends on D v2.x, then two versions of D will exist. If B depends on Dv1.2 and C depends on Dv1.3, then only Dv1.3 will exist.<p>I&#x27;m over-simplifying a few things here:<p>1. Semver has special treatment of 0.x versions. For these crates the minor version behaves like the major version and the patch version behaves like the minor version. So technically you could have v0.1 and v0.2 of a crate in the same crate graph.<p>2. I&#x27;m assuming all dependencies are specified &quot;the default way&quot;, ie. as just a number. When a dependency looks like &quot;1.3&quot;, cargo actually treats this as &quot;^1.3&quot;, ie. the version must be at least 1.3, but can be any semver compatible version (eg. 1.4). When you specify an exact dependency like &quot;=1.3&quot; instead, the rules above still apply (you still can&#x27;t have 1.3 and 1.4 in the same crate graph) but cargo will error if no version can be found that satisfies all constraints, instead of just picking a version that&#x27;s compatible with all dependents.
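The "one version per compatibility range" rule above can be sketched as a grouping step. This is a deliberately simplified model, not Cargo's resolver: `compat_key` and `unify` are hypothetical names, versions are bare `(major, minor, patch)` tuples, and each bucket simply keeps the highest lower bound requested instead of consulting a registry for the newest published release.

```rust
use std::collections::BTreeMap;

type Version = (u64, u64, u64);

/// Two caret requirements can share a single resolved version only if
/// they map to the same key: the major version for 1.x and later, the
/// minor version for 0.x, and the patch version for 0.0.x.
fn compat_key(v: Version) -> Version {
    match v {
        (0, 0, patch) => (0, 0, patch),
        (0, minor, _) => (0, minor, 0),
        (major, _, _) => (major, 0, 0),
    }
}

/// Group requested lower bounds by compatibility key and keep the
/// highest lower bound seen in each bucket (simplifying assumption).
fn unify(lower_bounds: &[Version]) -> BTreeMap<Version, Version> {
    let mut buckets: BTreeMap<Version, Version> = BTreeMap::new();
    for &v in lower_bounds {
        let entry = buckets.entry(compat_key(v)).or_insert(v);
        if v > *entry {
            *entry = v;
        }
    }
    buckets
}

fn main() {
    // B wants D 1.2, C wants D 1.3, E wants D 2.0, F wants D 0.1, G wants D 0.2.
    let requested = [(1, 2, 0), (1, 3, 0), (2, 0, 0), (0, 1, 0), (0, 2, 0)];
    for (key, chosen) in unify(&requested) {
        println!("bucket {:?} needs at least {:?}", key, chosen);
    }
}
```

In this model the 1.2 and 1.3 requests collapse into one bucket, while 2.0, 0.1 and 0.2 each get their own copy of the crate.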
            • steveklabnik144 days ago
              <i>can</i> does not mean <i>must</i>. Cargo attempts to unify (aka deduplicate) dependencies where possible, and in this case, it can find a singular version that satisfies the entire thing.
    • worldsavior144 days ago
      This doesn&#x27;t mean anything. A project can implement things from scratch inefficiently but there might be other libraries the project can use instead of reimplementing.
    • tonyhart7145 days ago
      is this satire or must I know the context behind this comment???
      • stevedonovan145 days ago
        These are a few well-chosen dependencies for a serious project.<p>Rust projects can really go bananas on dependencies, partly because it&#x27;s so easy to include them
      • obsoleszenz145 days ago
        The project only has 3 dependencies which i interpret as a sign of quality
      • leoh144 days ago
        I don&#x27;t know if OP intended satire, but either way it is an absurd comment. Think about how &quot;from scratch&quot; this really is.
  • enricozb145 days ago
    I did this [0] (gpt in rust) with picogpt, following the great blog by jaykmody [1].<p>[0]: <a href="https:&#x2F;&#x2F;github.com&#x2F;enricozb&#x2F;picogpt-rust" rel="nofollow">https:&#x2F;&#x2F;github.com&#x2F;enricozb&#x2F;picogpt-rust</a> [1]: <a href="https:&#x2F;&#x2F;jaykmody.com&#x2F;blog&#x2F;gpt-from-scratch&#x2F;" rel="nofollow">https:&#x2F;&#x2F;jaykmody.com&#x2F;blog&#x2F;gpt-from-scratch&#x2F;</a>
  • jlmcgraw145 days ago
    Some commentary from the author here: <a href="https:&#x2F;&#x2F;www.reddit.com&#x2F;r&#x2F;rust&#x2F;comments&#x2F;1nguv1a&#x2F;i_built_an_llm_from_scratch_in_rust_just_ndarray&#x2F;" rel="nofollow">https:&#x2F;&#x2F;www.reddit.com&#x2F;r&#x2F;rust&#x2F;comments&#x2F;1nguv1a&#x2F;i_built_an_ll...</a>
  • om8144 days ago
    Have a similar project. Also written in rust, runs in a browser using web assembly<p>In-browser demo: <a href="https:&#x2F;&#x2F;galqiwi.github.io&#x2F;aqlm-rs" rel="nofollow">https:&#x2F;&#x2F;galqiwi.github.io&#x2F;aqlm-rs</a><p>Source code: <a href="https:&#x2F;&#x2F;github.com&#x2F;galqiwi&#x2F;demo-aqlm-rs" rel="nofollow">https:&#x2F;&#x2F;github.com&#x2F;galqiwi&#x2F;demo-aqlm-rs</a>
  • Snuggly73145 days ago
    Congrats - there is a very small problem with the LLM - it&#x27;s reusing transformer blocks and you want to use different instances of them.<p>It&#x27;s a very cool exercise, I did the same with Zig and MLX a while back, so I can get a nice foundation, but since then as I got hooked and kept adding stuff to it, switched to Pytorch&#x2F;Transformers.
    • icemanx145 days ago
      correction: It&#x27;s a cool exercise if you write it yourself and not use GPT
      • Snuggly73145 days ago
        well, hopefully the author did learn something or at least enjoyed the process :)<p>(the code looks like a very junior or a non-dev wrote it tbh).
  • Charon77145 days ago
    Absolutely love how readable the entire project is
    • koakuma-chan145 days ago
      It&#x27;s AI generated
      • Revisional_Sin145 days ago
        How do you know? The over-commenting?
        • koakuma-chan145 days ago
          I know because this is how an AI generated project looks. Clearly AI generated README, &quot;clean&quot; code, the way files are named, etc.
          • magackame145 days ago
            Not sure myself. Commit messages look pretty human. But the emojis in readme and comments like &quot;&#x2F;&#x2F; Re-export key structs for easier access&quot;, &quot;# Add any test-specific dependencies here if needed&quot; are sus indeed.
          • cmrdporcupine145 days ago
            To me it looks like LLM generated README, but not necessarily the source (or at least not all of it).<p>Or there&#x27;s been a cleaning pass done over it.
            • koakuma-chan145 days ago
              I think pretty clearly the source is also at least partially generated. None the less, just a README like that already sends a strong signal to stop looking and not trust anything written there.
        • adastra22144 days ago
          Because the author said so on Reddit.
        • GardenLetter27145 days ago
          The repeated Impls are strange.
          • magackame145 days ago
            Where? Don&#x27;t see any on latest main (685467e).
            • yahoozoo145 days ago
              `llm.rs` has many `impl LLM` blocks
    • emporas145 days ago
      It is very procedural&#x2F;object oriented. This is not considered good Rust practice. Iterators make it more functional, which is better, more succinct that is, and enums more algebraic. But it&#x27;s totally fine for a thought experiment.
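As a small illustration of the stylistic point above, here is the same computation written with an index loop and with iterators, plus an enum used as a closed set of variants. The code is a generic sketch and is not taken from the project; `dot_loop`, `dot_iter` and `Activation` are hypothetical names.

```rust
/// Index-based, procedural style.
fn dot_loop(a: &[f32], b: &[f32]) -> f32 {
    let mut sum = 0.0;
    for i in 0..a.len().min(b.len()) {
        sum += a[i] * b[i];
    }
    sum
}

/// Iterator style: no manual indexing, stops at the shorter slice.
fn dot_iter(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// An enum models a closed set of alternatives; `match` must cover them all.
enum Activation {
    Relu,
    Identity,
}

impl Activation {
    fn apply(&self, x: f32) -> f32 {
        match self {
            Activation::Relu => x.max(0.0),
            Activation::Identity => x,
        }
    }
}

fn main() {
    let a = [1.0, 2.0, 3.0];
    let b = [4.0, 5.0, 6.0];
    println!("loop: {}, iter: {}", dot_loop(&a, &b), dot_iter(&a, &b));
    println!("relu(-1.5) = {}", Activation::Relu.apply(-1.5));
    println!("identity(-1.5) = {}", Activation::Identity.apply(-1.5));
}
```

Both functions compute the same value; which one reads better is partly a matter of taste, as the replies here suggest.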
    • yieldcrv145 days ago
      Never knew Rust could be that readable. Makes me think other Rust engineers are stuck in a masochistic ego driven contest, which would explain everything else I&#x27;ve encountered about the Rust community and recruiting on that side.
      • GardenLetter27145 days ago
        Most Rust code looks like this - only generic library code goes crazy with all the generics and lifetimes, due to the need to avoid unnecessary mallocs and also provide a flexible API to users.<p>But most people aren&#x27;t writing libraries.
        • cmrdporcupine144 days ago
          Don&#x27;t underestimate what some programmers trying to prove their cleverness (or just trying to have fun) can do if left unchecked. I think most Rust code does indeed look like this but I&#x27;ve seen plenty of projects that go crazy with lifetimes and generics juggling where they don&#x27;t have to.
      • jmaker145 days ago
        Not sure what you’re alluding to but that’s just ordinary Rust without performance or async IO concerns.
  • ndai145 days ago
    I’m curious where you got your training data? I will look myself, but saw this and thought I’d ask. I have a CPU-first, no-backprop architecture that works very well on classification datasets. It can do single‑example incremental updates which might be useful for continuous learning. I made a toy demo to train on tiny.txt and it can predict next characters, but I’ve never tried to make an LLM before. I think my architecture might work well as an on-device assistant or for on-premises needs, but I want to work with it more before I embarrass myself. Any open-source LLM training datasets you would recommend?
    • electroglyph145 days ago
      <a href="https:&#x2F;&#x2F;huggingface.co&#x2F;datasets&#x2F;NousResearch&#x2F;Hermes-3-Dataset" rel="nofollow">https:&#x2F;&#x2F;huggingface.co&#x2F;datasets&#x2F;NousResearch&#x2F;Hermes-3-Datase...</a>
      • Snuggly73145 days ago
        To my untrained eye, this looks more like an instruct dataset.<p>For just plain text, I really like this one - <a href="https:&#x2F;&#x2F;huggingface.co&#x2F;datasets&#x2F;roneneldan&#x2F;TinyStories" rel="nofollow">https:&#x2F;&#x2F;huggingface.co&#x2F;datasets&#x2F;roneneldan&#x2F;TinyStories</a>
    • kachapopopow145 days ago
      huggingface has plenty of OpenAI and Anthropic user to assistant chains, beware there are dragons (hallucinations), but good enough for instruction training. I actually recommend distilling kimi k2 instead for instruction following capabilities.
    • hadlock144 days ago
      The training data is contained inside main.rs; it looks like about 50 statements about general facts. Probably to keep training time down. If you go off-script things fall apart pretty quickly:<p>Enter prompt: hello Model output: What are eclipses ? Assistant : Eclipses occur when one celestial body moves into the shadow of another &lt;&#x2F;s&gt;<p>Enter prompt: what are facts Model output: eclipses Assistant : Eclipses Eclipses What What was when What through you ? through you meet through using of What was What tall ? of What thank help you explain using , and do you a patterns you you a patterns through air in tall you help metal metal &lt;&#x2F;s&gt;<p>Enter prompt: how are mountains formed? Model output: I ? &#x27; I &#x27; : Of : in happy Hi wind in yeast altering it it &lt;&#x2F;s&gt;
  • Goto80145 days ago
    Nice. Mind to put a license on that?
  • kachapopopow145 days ago
    This looks rather similar to when I asked an AI to implement a basic xor problem solver. I guess fundamentally there&#x27;s really only a very limited number of ways to implement this.
  • abricq145 days ago
    This is great ! Congratulations. I really like your project, especially I like how easy it is to peek at.<p>Do you plan on moving forward with this project ? I seem to understand that all the training is done on the CPU, and that you have next steps regarding optimizing that. Do you consider GPU accelerations ?<p>Also, do you have any benchmarks on known hardware ? Eg, how long would it take to train on a macbook latest gen or your own computer ?
    • thomask1995144 days ago
      HI! OG Author here.<p>Honestly, I don&#x27;t know.<p>This was purely a toy project&#x2F;thought experiment to challenge myself to learn exactly how these LLMs worked.<p>It was super cool to see the loss go down and it actually &quot;train&quot;.<p>This is SUPER far from the real deal. Maybe it could be cool to see how far a fully in memory LLM running on CPU can go.
  • lutusp144 days ago
    It would have been nice to see a Rust&#x2F;Python time comparison for both development and execution. You know, the &quot;bottom line&quot;?
  • selinkocalar144 days ago
    The memory safety guarantees in Rust are probably useful here given how easy it is to have buffer overflows in transformer implementations. CUDA kernels are still going to dominate performance though. Curious about the tokenization approach - are you implementing BPE from scratch too or using an existing library?
  • farhanhubble144 days ago
    I&#x27;ve not written a single line of Rust ever, but I have occasionally looked under the hood of Tensorflow, Pytorch etc. and have been a machine learning practitioner for several years. The succinctness of the interfaces surprised me!
  • yobbo144 days ago
    Very nice! Next thing to add would be numerical gradient testing.
    • tripplyons144 days ago
      Is that where you approximate a partial derivative as a difference in loss over a small difference in a single parameter&#x27;s value?<p>Seems like a great way to verify results, but it has the same downsides as forward mode automatic differentiation since it works in a pretty similar fashion.
      • yobbo144 days ago
        Yes, the purpose is to verify the gradient computations which are typically incorrect on the first try for things like self-attention and softmax. It is very slow.<p>It is not necessary for auto-differentiation, but this project does not use that.
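A minimal sketch of the numerical gradient test discussed above, assuming a softmax followed by cross-entropy loss on a single example. It is not based on the project's code; the function names are hypothetical, and `f64` is used to keep rounding error small. The hand-written gradient (softmax output minus the one-hot target) is compared against central differences.

```rust
/// Numerically stable softmax.
fn softmax(z: &[f64]) -> Vec<f64> {
    let max = z.iter().cloned().fold(f64::NEG_INFINITY, f64::max);
    let exps: Vec<f64> = z.iter().map(|v| (v - max).exp()).collect();
    let sum: f64 = exps.iter().sum();
    exps.iter().map(|e| e / sum).collect()
}

/// Cross-entropy loss for one example with the given target class.
fn loss(z: &[f64], target: usize) -> f64 {
    -softmax(z)[target].ln()
}

/// Hand-derived gradient of the loss w.r.t. the logits: softmax(z) - one_hot(target).
fn analytic_grad(z: &[f64], target: usize) -> Vec<f64> {
    let mut grad = softmax(z);
    grad[target] -= 1.0;
    grad
}

/// Central-difference estimate: perturb one logit at a time by +h and -h.
fn numeric_grad(z: &[f64], target: usize, h: f64) -> Vec<f64> {
    (0..z.len())
        .map(|i| {
            let mut plus = z.to_vec();
            let mut minus = z.to_vec();
            plus[i] += h;
            minus[i] -= h;
            (loss(&plus, target) - loss(&minus, target)) / (2.0 * h)
        })
        .collect()
}

/// Largest absolute element-wise difference between two gradients.
fn max_abs_diff(a: &[f64], b: &[f64]) -> f64 {
    a.iter().zip(b).map(|(x, y)| (x - y).abs()).fold(0.0, f64::max)
}

fn main() {
    let logits = [1.0, -0.5, 2.0];
    let target = 2;
    let analytic = analytic_grad(&logits, target);
    let numeric = numeric_grad(&logits, target, 1e-5);
    println!("analytic: {:?}", analytic);
    println!("numeric:  {:?}", numeric);
    println!("max abs diff: {:e}", max_abs_diff(&analytic, &numeric));
}
```

Each parameter costs two extra loss evaluations, which is why this kind of check is slow and is normally run on tiny inputs as a test rather than during training.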
  • chcardoz144 days ago
    super fun!! I am running it right now and going to use it to train on a corpus of my own writing to make a gpt of myself.
  • bionhoward145 days ago
    That time to first token is impressive, it seems like it responds immediately
  • amoskvin144 days ago
    great job! which model does it implement? gpt-2?
  • bigmuzzy145 days ago
    nice
  • capestart145 days ago
    [dead]
  • ericdotlee145 days ago
    This is incredibly cool, but I wonder when more of the AI ecosystem will move past python tooling into something more... performant?<p>Very interesting to already see rust based inference frameworks as well.
    • leoh144 days ago
      &quot;Python&quot; is perfectly performant for AI and this demonstrates a deep lack of understanding. Virtually every library in python used for AI delegates to lower-level code written in C++.
      • tcfhgj144 days ago
        well, not all the time, e.g. orchestration and handling between multiple libraries
  • trackflak145 days ago
    [dead]
  • Emma_Schmidt145 days ago
    [dead]
  • zenlot145 days ago
    Rust == stars in GitHub.