Hey, I made this, thanks for posting!<p>It’s purposefully high level and non-technical for a general audience - my theory was that most people who aren’t into tech/AI don’t care too much about training, or how the system got to be the way that it is.<p>But they do have some interest in how it actually operates once you’ve typed in a prompt.<p>Happy to answer any questions or take feedback on board.
Loved the writeup!<p>Found the manual latent space exploration part really interesting.<p>Too many LLM/diffusion explanations fall into the proverbial “how to draw an owl” meme without giving a taste of what’s going on.
I enjoyed this a lot.<p>The interpolations between butterfly and snail were pretty horrifying. But with something like Z-Image you could basically concatenate the text and end up with a normal image of both. Is the latent space for "butterfly and snail" just well off the path between the two individually?<p>It's hard to imagine what is nearby in latent space and how text contributes, so I really liked the section that adds words to the prompt one by one.
Oh I particularly loved that you made the prompts themselves interchangeable. Very well done!
Scrolling through pics on mobile is difficult. I wanted to see all 29 steps but couldn't scroll through them reliably.
Amazing explanations!! I absolutely love this. In 10 minutes it’s given me a huge boost in my intuition on diffusion, which I’ve been missing for years.