61 comments

  • jlhawn 19 hours ago
    Now I can't stop thinking about _The Experience Machine_ by Andy Clark. It theorizes that this is how humans navigate and experience the real world: our brains generate what we think the world around us is like, and our senses don't so much directly process visual information as act like a kind of loss function for our internal simulations. We then use that error to update our internal model of the world.

    In this view, we are essentially living inside a high-fidelity generative model. Our brains are constantly 'hallucinating' a predicted reality based on past experience and current goals. The data from our senses isn't the source of the image; it's the error signal used to calibrate that internal model. Much like Genie 3 uses latent actions and frames to predict the next state of a world, our brains use 'Active Inference' to minimize the gap between what we expect and what we experience.

    It suggests that our sense of 'reality' isn't a direct recording of the world, but a highly optimized, interactive simulation that is continuously 'regularized' by the photons hitting our retinas.
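    In code terms, the loop Clark describes is roughly this minimal predictive-coding sketch (toy numbers, not a claim about how the brain or Genie actually implements it):

        # The internal estimate is updated only by the error between
        # prediction and sensory input: the senses act as the loss signal.
        def perceive(estimate, sensory_input, learning_rate=0.1):
            prediction_error = sensory_input - estimate
            return estimate + learning_rate * prediction_error

        estimate = 0.0                              # prior belief
        for reading in [1.0, 1.2, 0.9, 1.1]:        # noisy observations
            estimate = perceive(estimate, reading)  # converges toward ~1.0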
    • namanyayg 11 hours ago
      This is one of my fundamental beliefs about the nature of consciousness.

      We are never able to interact with the physical world directly; we first perceive it and then interpret those perceptions. More often than not, our interpretation ignores and modifies those perceptions, so we really are just living in a world created by our own mental chatter.

      This is one of the core tenets of Buddhism, and it's also explored in Greg Egan's short story "Learning to Be Me". He's one of my favorite sci-fi authors, and this particular short led me down a deep rabbit hole of reading many of his works within a few months.

      I found a copy online; if you haven't read it, do yourself a favor and check it out. You won't be able to put it down, and the ending is sublime. https://gwern.net/doc/fiction/science-fiction/1995-egan.pdf
      • keeeba 9 hours ago
        "This is one of my fundamental beliefs about the nature of consciousness. We are never able to interact with the physical world directly, we first perceive it and then interpret those perceptions. More often than not, our interpretation ignores and modifies those perceptions, so we really are just living in a world created by our own mental chatter."

        This is an orthodox position in modern philosophy, dating back to at least Locke, strengthened by Kant and Schopenhauer. It has held up to scrutiny for the past ~400 years.

        But really it's there in Plato too, so 2300+ years. And maybe further back.
        • prox 8 hours ago
          It’s the Allegory of the Cave, isn’t it?
          • ethbr1 5 hours ago
            Afaik, there's a difference between classical philosophy (which opines on the divide between an objective world and the perceived world) and more modern philosophy (which generally does away with that distinction while expanding on the idea that human perception can be fallible).

            The idea that there's an objective but imperceivable world (except by philosophers) is... a slippery slope to philosophical excess.

            It's easy to spin whatever fancy you want when nobody can falsify it.
      • grumbelbart2 9 hours ago
        This is absolutely what happens. It's even more tricky since our sensory inputs have different latencies, which the brain must compile back into something consistent. While doing so, it interprets and filters out a lot of unsurprising, expected data.

        https://www.youtube.com/watch?v=wo_e0EvEZn8
      • ghtbircshotbe 1 hour ago
        I'm not sure this is unique to consciousness (whatever that is). What would it even mean to directly interact with the physical world? Even the most precise scientific experiments are a series of indirect measurements of something that perhaps in some sense is fundamentally unknowable.
      • byronvickers 10 hours ago
        Thank you for linking this! I'm a big fan of Egan but had never read this particular short story. I feel like Egan is perhaps the only contemporary author who actually _gets_ consciousness.
    • tracerbulletx 18 hours ago
      I think this is pretty well established as far as neuroscientists are concerned, and it explains a lot of things. Dreaming, for instance: just something like the model running without sensory input constraining it.
      • magospietato 16 hours ago
        Always wondered if dreaming is some kind of daily memory-consolidation function: logged short-term/episodic memory being filtered, and the important bits baked in by replaying them in a limited simulacrum.
        • direwolf20 15 hours ago
          There was once a neural network that used dreaming phases for regularisation. It would run in reverse on random data, and whatever activated was down-weighted.
          • ericdfoley 14 hours ago
            That's the wake-sleep algorithm for undirected graphical models.

            Hinton had a course on Coursera around 2015 that covered a lot of pre-NN deep learning. Sadly I don't think it's up anymore.
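            Roughly, the two phases look like this (a hand-wavy sketch with hypothetical method names, not Hinton's actual formulation):

                # Wake: infer hidden causes of real data, train the generative
                # (top-down) weights to reconstruct it from those causes.
                # Sleep: "dream" a fantasy from the generative model and train
                # the recognition (bottom-up) weights on it.
                def wake_sleep_epoch(recognition, generative, data):
                    for x in data:
                        h = recognition.infer(x)
                        generative.train_to_generate(h, target=x)
                    h, fantasy = generative.dream()
                    recognition.train_to_infer(fantasy, target=h)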
            • petethomas 13 hours ago
              There is this with a 2012 date:

              https://www.youtube.com/playlist?list=PLoRl3Ht4JOcdU872GhiYWf6jwrk_SNhz9
      • kingstoned 17 hours ago
        Could you please give some sources - books or articles or videos on that topic? It's really fascinating
        • tracerbulletx 17 hours ago
          https://onlinelibrary.wiley.com/doi/abs/10.1111/phib.12268?utm_source=chatgpt.com

          https://pubmed.ncbi.nlm.nih.gov/23663408/

          https://royalsocietypublishing.org/rstb/article/371/1708/20160007/42206/Active-interoceptive-inference-and-the-emotional?utm_source=chatgpt.com

          https://pubmed.ncbi.nlm.nih.gov/20068583/
        • furyofantares 11 hours ago
          I'll also recommend Being You by Anil Seth. It makes a lot of sense of consciousness to me. It certainly doesn't answer the question, but it's not just throwing your hands up with "we have no idea why qualia", and it's also not just "here's a list of neural correlates of consciousness and we won't even discuss qualia".

          It goes through how sensations fit into this highly constrained, highly functional hallucination that models the outside world as a sort of Bayesian prediction, as it relates to your concerns and capabilities as a human, and then it has a very interesting discussion of emotions as they relate to inner bodily sensations.
        • jlhawn 17 hours ago
          The book I mentioned (_The Experience Machine_ by Andy Clark) talks about this.
        • voxic11 13 hours ago
          https://slatestarcodex.com/2017/09/05/book-review-surfing-uncertainty/
    • shagie 18 hours ago
      A Kurzgesagt video on this: "Why Your Brain Blinds You For 2 Hours Every Day" https://youtu.be/wo_e0EvEZn8 and the sources for that video: https://sites.google.com/view/sources-reality-is-not-real/
    • psychoslave 17 hours ago
      Like "Your Brain Hallucinates Your Conscious Reality" as presented by Anil Seth[1]? Found that one while searching for something like "the illusion of the self" a few years ago.

      It's also easy to find this treated in various philosophies/religions across time and space. And anyway, since consciousness is eager to project whatever looks like a possible fit, elements suggesting prior art can be inferred as far back as traces can be found.

      [1] https://www.youtube.com/watch?v=lyu7v7nWzfo
    • alastair 18 hours ago
      Also check out _The Case Against Reality_ by Donald Hoffman.
      • AIorNot 16 hours ago
        Yes!

        Also see the Essentia Foundation videos:

        https://youtube.com/@essentiafoundation?si=aD-RmB8DF4M_Oc7w
    • Gehinnn 7 hours ago
      Doesn't this have some implications for P vs NP?

      How much compute do you need to convince a brain its environment is "real"?

      What happens if I build a self-replicating supercomputer in this environment that finds solutions to some really big SAT instances that I can verify?

      Dreams run into contradictions quite quickly.
    • cfiggers 18 hours ago
      Another analogy that kinda fits in with what you're saying is the post-processing on smartphone "photos."

      At what point does the processing become so strong that it's less a photograph and more a work of computational impressionism?
      • direwolf20 15 hours ago
        At the point where Samsung detects a photo of a white circle while the phone is pointing upwards and substitutes a high resolution picture of the moon.

        This actually happened.
    • krzat 8 hours ago
      It's kinda obvious if you think about this:

      - How come we have 2 eyes but see one 3D world?

      - We hear sounds and music coming from various directions, but all of this is created from 2 vibrating eardrums.
    • soulofmischief 10 hours ago
      This is easily corroborated by taking hallucinogens. Your subjective experience is a simulation, augmented by your senses.

      Personally, I often catch myself making reading mistakes and knowing for a fact that the mistake wasn't just conceptual, but an actual visual error where my brain renders the wrong word. Sometimes it's very obvious, because the effect will last for seconds before my vision "snaps" back into reality and the word/phrase changes.

      I first noticed this phenomenon in my subjective experience when I was 5 and started playing Pokémon. For many months, I thought Geodude was spelled and pronounced Gordude, until my neighbor said the name correctly one day and it "unlocked" my brain's ability to see the word spelled correctly.

      The effect is so strong sometimes that I can close my eyes and imagine a few different moments in my life, even as a child, where my brain suddenly "saw" the right word while reading and it changed before my eyes.
      • thoughtpeddler 9 hours ago
        Just want to say this is a really good description of our brain's simulation. I have experienced the same catching-the-misread-word phenomenon, and it's a subtle reminder of how this is all working. But does this mean our wires are crossed in a particular way that is uncommon? I haven't heard others share a similar experience.
        • soulofmischief 9 hours ago
          I'm not sure. At times I've wondered if I have something similar to dyslexia. There are a few common failure modes with me, such as flipping consonants or vowels between adjacent words, or writing down a word and it being the wrong one.

          My brain seems to store/recall words phonetically, possibly because I taught myself to read at age 3 with my own phonetic approach, but also possibly due to how I trained myself out of a long spell of aphasia during high school by consciously relearning how to speak in a way that engaged the opposite hemisphere of my brain: thinking in pitches, intonation, rhyme, rhythm, etc., and turning speaking into a musical expression. I'd read about this technique, and after months of work I managed to make it work for me. So in that aspect, there really might be some crossed wires out of necessity.

          I was homeless in high school and thus too poor to visit doctors and get scans done, so I'm really not sure if the assumed damage to my left hemisphere was temporary or permanent, or even detectable. The aphasia was coupled with years of intense depersonalization and derealization as well. The brain is a very strange thing, and many events in my life such as the ones described above have only reinforced how subjective my experience really is.
    • eli_gottlieb 14 hours ago
      Yeah, this kind of thing was part of the subject of my PhD, first postdoc, and ongoing scientific work. The question is how to produce generative models and inverse-inference algorithms that are powerful enough to work in tens to hundreds of milliseconds in high dimensionality :-/
    • AIorNot 16 hours ago
      Let me introduce you to Idealism, and more specifically Analytic Idealism:

      https://youtu.be/P-rXm7Uk9Ys?si=q7Kefl7PbYfGiChZ

      Google DeepMind's Project Genie is being framed as a "world model." Given a text prompt, it generates a coherent, navigable, photorealistic world in real time. An agent can move through it, act within it, and the world responds consistently. Past interactions are remembered. Physics holds. Cause and effect persist.

      From a technical standpoint, this is impressive engineering. From a philosophical standpoint, it's an unexpectedly clean metaphor.

      In analytic idealism, the claim is not that the physical world is fake or arbitrary. The claim is that what we call the "physical world" is how reality appears from a particular perspective. Experience is primary. The world is structured appearance.

      Genie makes this intuitive.

      There is no "world" inside Genie in the classical sense. There is no pre-existing ocean, mountain, fox, or library. There is a generative substrate that produces a coherent environment only when a perspective is instantiated. The world exists as something navigable because there is a point of view moving through it.

      Change the character, and the same environment becomes a different lived reality. Change the prompt, and an entirely different universe appears. The underlying system remains, but the experienced world is perspective-dependent.

      This mirrors a core idealist intuition: reality is not a collection of objects waiting to be perceived. It is a structured field of possible experiences, disclosed through perspectives.

      The interesting part is not that Genie "creates worlds." It's that the worlds only exist as worlds for an agent. Without a perspective, there is no up, down, motion, danger, beauty, or meaning. Just latent structure.

      Seen this way, Genie is not a model of consciousness. It's a model of how worlds arise from viewpoints.

      If you replace "agent" with "local mind," and "world model" with "cosmic mental process," the analogy becomes hard to ignore. A universal consciousness need not experience everything at once. It can explore itself through constrained perspectives, each generating a coherent, law-bound world from the inside.

      That doesn't prove idealism. But it makes the idea less mystical and more concrete. We are already building systems where worlds are not fundamental, but perspectival.

      And that alone is worth sitting with.
      • Rzor 12 hours ago
        It's pretty clear you used an LLM to write that, given your post history. I'm not sure that's allowed here, but at least put a disclaimer.
        • AIorNot 10 hours ago
          Yes, I used an LLM to post my thoughts from my phone, as typing that out and doing spell-check/grammar cleanup is hell on mobile.

          But does that mean I don't believe all the words written above are valid? No, absolutely not. I reviewed and copyedited what I posted, and the meaning is exactly what I intended to post, so I'm not sure what the issue is here.

          If we use LLMs to expound on our own thoughts, is it a crime? They are literal masters of wordplay and rote clarification on complex topics, so I think this is a very legitimate use case for them, since I was going for clarity as an objective, especially considering the topic.

          Also, none of my previous posts were LLM-written (including this one).

          People are a little over-sensitive on this topic these days.
      • soulofmischief 10 hours ago
        Consciousness and perspective are temporally stable fixed points in the universe. You come to understand yourself as "you" or "I" because it's the only thing in the world around you that does not immediately change under many transformations.

        For example, you can spin around, or change position, or close your eyes, and you're still you. As you navigate and interact with the evolving universe, the only continual, relatively unchanging part of the experience is what your brain uses to differentiate itself from the rest of your perceptions.
  • in-silico 22 hours ago
    Everyone here seems too caught up in the idea that Genie is the product, and that its purpose is to be a video game, movie, or VR environment.

    That is not the goal.

    The purpose of world models like Genie is to be the "imagination" of next-generation AI and robotics systems: a way for them to simulate the outcomes of potential actions in order to inform decisions.
    • benlivengood 21 hours ago
      Agreed; everyone complained that LLMs have no world model, so here we go. The next logical step is to backfill the weights with encoded video from the real world at some reasonable frame rate to ground the imagination, then branch the inference on possible interventions (actions) in the near future of the simulation, throw the results into a goal evaluator, and send the winning action-predictions to motors. Getting the timing right will probably require a bit more work than literally gluing them together, but probably not much more.
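      As a rough sketch, that glue might look like the toy loop below (all names hypothetical; nothing here is Genie's actual API):

          # Roll each candidate action through the learned world model,
          # score the imagined futures, and send the winner to the motors.
          def plan_step(world_model, goal_evaluator, state, candidate_actions,
                        horizon=8):
              best_action, best_score = None, float("-inf")
              for action in candidate_actions:
                  sim = state
                  for _ in range(horizon):         # imagine the near future
                      sim = world_model.predict(sim, action)
                  score = goal_evaluator(sim)      # evaluate the outcome
                  if score > best_score:
                      best_action, best_score = action, score
              return best_action                   # goes to the motors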
      • patapong 5 hours ago
        This is the most convincing take on what might actually get us to AGI I've heard so far :)
    • avaer 22 hours ago
      Soft disagree; if you wanted imagination, you wouldn't need to make a video model. You probably don't need to decode the latents at all. That seems pretty far from information-theoretic optimality, the kind that you want in a good+fast AI model making decisions.

      The whole *reason* for LLMs inferencing human-processable text, and "world models" inferencing human-interactive video, is precisely so that humans can connect in and debug the thing.

      I think the purpose of Genie *is* to be a video game, but it's a video game for AI researchers developing AIs.

      I do agree that the entertainment implications are kind of the research exhaust of the end goal.
      • in-silico 22 hours ago
        Sufficiently informative latents *can* be decoded into video.

        When you simulate a stream of those latents, you *can* decode them into video.

        If you were trying to make an impressive demo for the public, you probably *would* decode them into video, even if the real applications don't require it.

        Converting the latents to pixel space also makes them compatible with existing image/video models and multimodal LLMs, which (without specialized training) can't interpret the latents directly.
        • soulofmischief 9 hours ago
          At which point you're training another model on top of the first, and it becomes clear you might as well have made one model from the start!
      • NitpickLawyer 21 hours ago
        > I think the purpose of Genie is to be a video game, but it's a video game for AI researchers developing AIs.

        Yeah, I think this is what the person above was saying as well. People at Google have said this already (in a few podcasts on GDM's channel, hosted by Hannah Fry). They have their "agents" play in Genie-powered environments. One system "creates" the environment for the task. Say the task is "place the ball in the basket": Genie creates an environment with a ball and a basket, and the other agent learns to WASD its way around, pick up the ball, and WASD to the basket, and so on. Pretty powerful combo if you have enough compute to throw at it.
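        In pseudocode, that combo is something like this (a gym-style sketch; the environment API is invented for illustration, not Genie's real interface):

            # One model generates the training world; another agent
            # learns inside it by trial and error.
            def train_in_generated_world(genie, agent, prompt, episodes=100):
                for _ in range(episodes):
                    env = genie.create_environment(prompt)  # "ball and basket"
                    obs, done = env.reset(), False
                    while not done:
                        action = agent.act(obs)             # e.g. WASD + grab
                        next_obs, reward, done = env.step(action)
                        agent.learn(obs, action, reward, next_obs)
                        obs = next_obs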
      • SequoiaHope 22 hours ago
        Didn't the original world models paper do some training in latent space? (Edit: yes [1])

        I think robots imagining the next step (in latent space) will be useful. It's useful for people. A great way to validate that a robot is properly imagining the future is to make that latent space renderable in pixels.

        [1] "By using features extracted from the world model as inputs to an agent, we can train a very compact and simple policy that can solve the required task. We can even train our agent entirely inside of its own hallucinated dream generated by its world model, and transfer this policy back into the actual environment."

        https://arxiv.org/abs/1803.10122
      • sailingparrot 19 hours ago
        > you don't need to make a video model. You probably don't need to decode the latents at all.

        If you don't decode, how do you judge quality in a world where generative metrics are famously very hard and imprecise? How do you go about integrating RLHF/RLAIF in your pipeline if you don't decode, which is no longer something you can skip if you want SotA?

        Just look at the companies that are explicitly aiming for robotics/simulation: they *are* doing video models.
      • magospietato 16 hours ago
        I wonder what training insights could be gained by having proven general intelligences actively navigate a generative world model.
      • abraxas 19 hours ago
        > if you wanted imagination you don't need to make a video model. You probably don't need to decode the latents at all.

        Soft disagree. What is the purpose of that imagination if not to map it to actual real-world outcomes? To compare them to the real world, and possibly backpropagate through them, you'll need video frames.
      • ACCount37 21 hours ago
        If you train a video model, you by necessity train a world model for 3D worlds. Which can then be reused in robotics, potentially.

        I do wonder if I can frankenstein together a passable VLA using pretrained LTX-2 as a base.
      • koolala 21 hours ago
        What model do you need, then, if you want real-time 3D understanding of how realities work? Or are you focusing on "imagination" in a different, more abstract way?
      • thegabriele 21 hours ago
        Sure, but at some point you want humans in the loop, I guess?
      • empath75 21 hours ago
        I am not sure we are at the "efficiency" phase of this.

        Even if you just wire this output (or probably multiples running different counterfactuals) into a multimodal LLM that interprets the video and uses it to make decisions, you have something new.
    • wasmainiac 11 hours ago
      Have a source for that?

      I think you are anthropomorphising the AI too much. Imagination is inspired by reality, which AI does not have. Introducing a reality which the AI fully controls (looking beyond issues of vision and physics simulation) would only induce psychosis in the AI itself, since false assumptions would just be amplified.
      • ForceBru 5 hours ago
        > psychosis in the AI itself

        I think you're anthropomorphising the AI too much: what does it mean for an LLM to have psychosis? This implies that LLMs have a soul, or a consciousness, or a psyche. But... do they?

        Speaking of reality, one can easily become philosophical and say that we humans don't exactly "have" a reality either. All we have are sensor readings. LLMs' sensors are the texts and images they get as input. They don't have the "real" world, but they do have access to tons of _representations_ of this world.
        • ericmcer 35 minutes ago
          Psychosis is obviously being used in this context to reference the very well documented "hallucinations" that LLMs experience.
        • wasmainiac 2 hours ago
          > I think you're anthropomorphising the AI too much

          I don't get it. Is that supposed to be a gotcha? Have you tried maliciously messing with an LLM? You can get it into a state that resembles psychosis. I mean, you give it a context that is removed from reality, yet close enough to reality to act on, and it will give you crazy output.
    • oceanplexian 19 hours ago
      Yeah, and the goal of Instagram was to share quirky pictures you took with your friends. Now it's a platform for influencers and brainrot; arguably it has done more damage than drugs to younger generations.

      As soon as this thing is hooked up to VR and reaches a tipping point with the general public, we all know exactly what is going to happen: the creation of the most profitable, addictive, and ultimately dystopian technology Big Tech has ever come up with.
      • ceejayoz 18 hours ago
        The good news is we’ll finally have an answer for the Fermi Paradox.
        • jacquesm 16 hours ago
          What's interesting is that that has gone from an interesting paradox to something where we now have a multitude of very plausible answers in a very short time.
        • dryarzeg 17 hours ago
          Your positive mindset impresses me, honestly. In a good way.
        • Ozymandias-9 14 hours ago
          wait ... how?
          • cellular 13 hours ago
            Yeah, how? A solution means EVERYONE gets sucked into the VR world.

            Surely a small percentage, at least, would go on to colonize.
    • seydor 9 hours ago
      Creating robots for an imaginary universe? Who needs those?
      • subscribed 21 minutes ago
        Me! Me! I want to drive a tiny robot through the generated world.

        Read "Stars don't dream" by Chi Hui (vol. 1 of "Think weirder") :)
      • ForceBru 5 hours ago
        The military. The robots will roam the battlefield, imagine the consequences of shooting people, and perform the actions that maximize the probability of success according to the results of their "imagination"/simulation.
    • rzmmm 17 hours ago
      I feel that this is too costly for that kind of usage. Probably a quite different architecture is needed for robotics.
    • holografix 13 hours ago
      Correct, and the more you interact, the more training data you create.
    • reactordev 21 hours ago
      Still cool though…
    • pizzafeelsright 21 hours ago
      Environment mapping to AI-generated alternative outcomes is the holodeck.

      I prefer real danger, as living in the simulation is derivative.
    • whytaka 20 hours ago
      I think this is the key component of developing subjective experience.
      • realmadludite 16 hours ago
        I think a subjective experience is impossible to explain by any substrate-independent phenomenon, which includes software running on a computer.
    • echelon 21 hours ago
      Whoa, whoa, whoa. That's just one angle. Please don't bin that as the only use case for "world models"!

      First of all, there are a variety of different types of world models: simulation, video, static asset, etc. It's a loaded term, and the use cases are just as widespread.

      There are world models you can play in your browser, inferred entirely by your CPU:

      https://madebyoll.in/posts/game_emulation_via_dnn/ (my favorite, from 2022!)

      https://madebyoll.in/posts/world_emulation_via_dnn/ (updated, in 3D)

      There are static-asset-generating world models, like World Labs' Marble. These are useful for video games, previz, and filmmaking.

      https://marble.worldlabs.ai/

      I wrote open source software to leverage Marble for filmmaking (I'm a filmmaker, and this tech is extremely useful for scene consistency):

      https://www.youtube.com/watch?v=wJCJYdGdpHg

      https://github.com/storytold/artcraft

      There are playable video-oriented models, many of which are open source and will run on your 3080 and above:

      https://diamond-wm.github.io/

      https://github.com/Robbyant/lingbot-world

      There are things termed "world models" that really shouldn't be:

      https://github.com/Tencent-Hunyuan/HunyuanWorld-1.0

      There are robotics-training-oriented world models:

      https://github.com/leggedrobotics/robotic_world_model

      Genie is not strictly robotics-oriented.
      • in-silico 21 hours ago
        The entertainment industry, as big as it is, just doesn't have as much profit potential as robots and AI agents that can replace human labor. Just look at how Nvidia has pivoted from gaming and rendering to AI.

        The other examples you've given are neat, but for players like Google they are mostly an afterthought.
        • echelon 21 hours ago
          Robotics: $88B TAM

          Gaming: $350B TAM

          All media and entertainment: $3T TAM

          Manufacturing: $5T TAM

          Roughly the same story.

          This tech is going to revolutionize "films" and gaming. The entire entertainment industry is going to transform around it.

          When people aren't buying physical things, they're distracting themselves with media. Humans spend more time and money on that than anything else, machines or otherwise.

          AI's impact on manufacturing will be huge. AI's impact on media and entertainment will be huge. And these world models can be developed in a way that builds exposure and competency in both domains.

          Edit: You can argue that manufacturing will boom when we have robotics that generalize. But you can also argue that entertainment will boom when we have holodecks people can step into.
          • thecupisblue 3 hours ago
            Not so sure about gaming. While it opens up some interesting "generate a quest on demand" and "quick demo" cases, an infinite world generator wouldn't really vibe with people.

            They would try it once, think it's cool, and stop there. You would probably have a niche group of "world surfers" who would keep playing with it.

            Most people do not have an idea of what they would want to play and how it would look; they want a curated experience. As games adapted to the mass market, they became more and more curated experiences, with lots of hand-holding the player.

            Yeah, a holodeck would be popular, but that's a whole different technology ballpark, akin to talking about flying cars in this context.

            This will have a giant impact on robotics and general models though, as they can now simulate action/reaction inside a world in parallel and choose the best course, given just a picture of the world and probably a generated image of the end result, or "validators" to check whether the task is accomplished.

            And while robotics is an $88B TAM nowadays, expect it to hit $888B in the next 5-10 years, with world simulators like this being one of the reasons.

            From the team side, it's gotta be cool to build this; it feels like one of those things all devs dream about.
          • in-silico 21 hours ago
            The *current* robotics industry is $88B. You have to take into account the potential *future* industry of general-purpose robots that replace a big chunk of blue-collar work.

            Robots are also just one example. A hypothetically powerful AI agent (which might also use a world model) that controls a mouse and keyboard could replace a big chunk of white-collar work too.

            Those are worth tens of trillions of dollars. You can argue about whether they are actually possible, but the people backing this tech think they are.
    • dyauspitr 22 hours ago
      That's part of it, but if you could actually pull 3D models out of these worlds, it would massively speed up game development.
      • avaer 22 hours ago
        You already can; check out Marble/World Labs, Meshy, and others.

        It's not really as much of a boon as you'd think, though, since throwing together a 3D model is not the bottleneck to making a sellable video game. You've had model marketplaces for a long time now.
        • dyauspitr 14 hours ago
          It definitely is. Model marketplaces don't have ready-to-go custom models for a custom game. You have to pay a real person a significant amount of money for the hundreds of models a truly custom game requires.
        • echelon 21 hours ago
          > It's not really as much of a boon as you'd think though

          It is for filmmaking! They're perfect for constructing consistent sets and blocking out how your actors and props are positioned. You can freely position the camera, control the depth of field, and then storyboard your entire scene I2V.

          Example of doing this with Marble: https://www.youtube.com/watch?v=wJCJYdGdpHg
          • avaer 19 hours ago
            This I definitely agree with; before, you had to massage the I2I, and now you can just drag the camera.

            Marble definitely changes the game if the game is "move the camera", just most people would not consider that a game (but hey, there's probably a good game idea in there!)
    • cyanydeez 20 hours ago
      Like LLMs, though: do you really think a simulation will get them to all the corner cases robots/AI need to know about? Or will it be largely the same problem: they'll be just good enough to fool the engineers and make the business ops drool, they'll be put into production, and in a year or two we'll see stories about robots crushing people's hands, stepping in drains and falling over, or falling off roofs because of some bizarre miscommunication between training and reality.

      So it's very important to understand the lineage of the training, and not just take "this is it" at face value.
    • slashdave 21 hours ago
      This is a video model, not a world model. Start learning on this, and cascading errors will inevitably creep into all downstream products.

      You cannot invent data.
      • kingstnap 19 hours ago
        Related: https://arxiv.org/abs/2601.03220

        This is a paper that recently got popular-ish and discusses the counter to your viewpoint.

        > Paradox 1: Information cannot be increased by deterministic processes. For both Shannon entropy and Kolmogorov complexity, deterministic transformations cannot meaningfully increase the information content of an object. And yet, we use pseudorandom number generators to produce randomness, synthetic data improves model capabilities, mathematicians can derive new knowledge by reasoning from axioms without external information, dynamical systems produce emergent phenomena, and self-play loops like AlphaZero learn sophisticated strategies from games

        In theory, yes: something like the rules of chess should be enough for those mythical perfect reasoners who show up in math riddles to deduce everything that *can* be known about the game. And similarly, a math textbook is no more interesting than a book with the words "true" and "false" and a bunch of true => true statements in it.

        But I don't think this is the case in practice. There is something about rolling things out and leveraging the results you see that seems to carry useful information, even if the rollout is fully characterizable.
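        The PRNG case makes the paradox concrete (a toy illustration; the "information" framing here is informal):

            # A few bytes of seed deterministically expand into a stream that
            # looks random and is useful downstream (e.g. synthetic data),
            # even though in the Kolmogorov sense nothing was added: the whole
            # stream compresses back to "this program + seed 42".
            import random

            random.seed(42)
            stream = [random.random() for _ in range(10_000)]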
        • slashdave 18 hours ago
          Interesting paper, thanks! But the authors escape the three paradoxes they present by introducing training limits (compute, factorization, distribution). Kind of a different problem here.

          What I object to are the "scaling maximalists" who believe that, if enough training data were available, complicated concepts like a world model would just spontaneously emerge during training. To then pile on synthetic data from a general-purpose generative model as a solution to the lack of training data becomes even more untenable.
      • andy12_ 4 hours ago
        How is it not a world model? The latents of the model apparently encode enough information to represent a semi-consistent, interactable world. Seems world-model-y enough to me.

        Besides, we already know that agents can be trained successfully with these world models. See [1]:

        > By learning behaviors in imagination, Dreamer 4 is the first agent to obtain diamonds in Minecraft purely from offline data, without environment interaction. Our work provides a scalable recipe for imagination training, marking a step towards intelligent agents

        [1] https://arxiv.org/pdf/2509.24527
      • 2bitencryption 21 hours ago
        Given that the video is fully interactive and lets you move around (in a "world", if you will), I don't think it's a stretch to call it a world model. It must have at least some notion of physics, cause and effect, etc. in order to achieve what it does.
        • slashdave 20 hours ago
          No, it actually needs none of that.
          • in-silico 11 hours ago
            How would it do what it does without those things?
            • slashdave 10 hours ago
              Like all these models work: by simple interpolation.
              • in-silico 7 hours ago
                But *how* does it interpolate?
      • whytaka 20 hours ago
        They have a feature where you can take a photo and create a world from it.

        If instead of a photo you have a video feed, this is one step closer to implementing subjective experience.
        • realmadludite 16 hours ago
          It's not a subjective experience. It's the mimicry of a subjective experience.
  • ollin 23 hours ago
    Really great to see this released! Some interesting videos from early-access users:

    - https://youtu.be/15KtGNgpVnE?si=rgQ0PSRniRGcvN31&t=197 walking through various cities

    - https://x.com/fofrAI/status/2016936855607136506 helicopter / flight sim

    - https://x.com/venturetwins/status/2016919922727850333 space station, https://x.com/venturetwins/status/2016920340602278368 Dunkin' Donuts

    - https://youtu.be/lALGud1Ynhc?si=10ERYyMFHiwL8rQ7&t=207 simulating a laptop computer, moving the mouse

    - https://x.com/emollick/status/2016919989865840906 otter airline pilot with a duck on its head walking through a Rothko-inspired airport
    • llmthrow0827 17 hours ago
      These are extremely impressive from a technological-progression standpoint, and at the same time not at all compelling, in the same way AI images and LLM prose are and are not.

      It's neat, I guess, that I can use a few words to generate the equivalent of an Unreal 5 asset flip and play around in it. I will also never do that, much less pay some ongoing compute cost for each second I'm doing it.
      • bschwindHN 13 hours ago
        Yeah, the future I see from this is just shitty walking video games that maybe look nice but have ridiculous input lag, stuttery frame rates, and no compelling gameplay loop or story. Oh, and another tool to fill up Facebook with more fake videos to make people angry. Oh well, I guess this is what we've decided to direct all our energy towards.
      • Thorentis 6 hours ago
        Exactly. People are getting so excited that all this stuff is possible, and forgetting that we are burning through innumerable finite resources just to prove something is possible.

        They were too concerned with whether or not they could; they never stopped to think if they should.
    • msabalau 18 hours ago
      I was lucky enough to be an early tester. Here's a brief video walking through the process of creating worlds, showing examples: walking on the moon, with a NASA photo as part of the prompt; being in 221B Baker Street with Holmes and Watson; wandering through a night market in Taipei as a giant boba milk tea (note how the stalls are different, and sell different foods); and exploring the setting of my award-nominated tabletop RPG.

      https://www.youtube.com/watch?v=FyTHcmWPuJE

      It's an experimental research prototype, but it also feels like a hint of the future. Feel free to ask any questions.
    • RaftPeople 21 hours ago
      I liked that first one, and I hope someone creates one going back to the dinosaur age; I want to see that.
      • post-it 19 hours ago
        One step closer to the science-based dinosaur MMO we were promised.
      • echelon 21 hours ago
        Tim is awesome.

        Ironically, he covered PixVerse's world model last week, and it came close to your ask: https://youtu.be/SAjKSRRJstQ?si=dqybCnaPvMmhpOnV&t=371

        (Earlier in the video it shows him live prompting.)

        World models are popping up everywhere, from almost every frontier lab.
    • Valk3_ 20 hours ago
      Any thoughts about Project Genie?
      • ollin 18 hours ago
        On a technical level, this looks like the same diffusion-transformer world model design that was shown in the Genie 3 post (text/memory/d-pad input, video output, 60-second max context, 720p, sub-10FPS control latency due to 4-frame temporal compression). I expect the public release uses a cheaper step-distilled / quantized version. The limitations seen in Genie 3 (high control latency, gradual loss of detail and drift towards videogamey behavior, 60s max rollout length) are still present. The editing/sharing tools, latency, cost, etc. can probably improve over time with this same model checkpoint, but new features like audio input/output, higher resolution, precise controls, etc. likely won't happen until the next major version.

        From a product perspective, I still don't have a good sense of what the market for WMs will look like. There's a tension between serious commercial applications (robotics, VFX, gamedev, etc., where you want way, way higher fidelity and very precise controllability) and the current short-form-demos-for-consumer-entertainment application (where you want the inference to be cheap enough to be ad-supported and simple/intuitive to use). Framing Genie as a "prototype" inside their most expensive AI plan makes a lot of sense while GDM figures out how to target the product commercially.

        On a personal level, since I'm also working on world models (albeit very small local ones: https://news.ycombinator.com/item?id=43798757), my main thought is "oh boy, lots of work to do". If everyone starts expecting Genie 3 quality, local WMs need to become a lot better :)
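        To unpack the latency point: with 4-frame temporal compression, actions can only be injected once per latent step, so, assuming the advertised 24 fps output (an assumption, not stated above), the back-of-envelope arithmetic is roughly:

            fps = 24                   # assumed Genie 3 output rate
            frames_per_latent = 4      # temporal compression factor
            control_hz = fps / frames_per_latent  # 6 control updates per second
            min_gap_ms = 1000 / control_hz        # ~167 ms between accepted inputs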
  • WarmWash 22 hours ago
    The actual breakthrough with Genie is being able to turn around and look back, and seeing the same scene that was there before. A few other labs have similar world simulators, but they all struggle badly with keeping coherence of things not in view. Hence they always walk forward and never look around.
    • postexitus 5 hours ago
      They achieve that by generating not the scene you see, but a lens-warped version of a 360-degree view. So turning the other way doesn't delete what's happening / generated behind you. However, I expect it to break down if you put a blocker in between and remove it, i.e. go behind a wall and come back, or enter and exit a building. Would be nice to play with.
    • nozbufferHere 22 hours ago
      Still amazed it took ML people so long to realize they needed an explicit representation to cache stuff.
      • Legend2440 21 hours ago
        Genie does not use an explicit representation:

        > Genie 3's consistency is an emergent capability. Other methods such as NeRFs and Gaussian Splatting also allow consistent navigable 3D environments, but depend on the provision of an explicit 3D representation. By contrast, worlds generated by Genie 3 are far more dynamic and rich because they're created frame by frame based on the world description and actions by the user.
      • emmettm 19 hours ago
        The representation is learned. Also, see Sutton's "Bitter Lesson" essay.
    • abraxas 19 hours ago
      What about Fei-Fei Li's lab? I think they are generating true 3D worlds rather than frames of a video?

      Although that probably precludes her from having animations in those worlds...
    • sfn42 22 hours ago
      And what if I go somewhere, then go back there a week later?
      • jsheard 22 hours ago
        Best they can do is 60 seconds, for now at least.
        • autonomousErwin 21 hours ago
          Makes you wonder what the TTL caching for our universe is.
          • dabbz 20 hours ago
            Whatever the speed of light is, I would imagine.
            • jacquesm 16 hours ago
              No, that's just an optimization that saved on computing resources. It effectively allows the party that runs this simulation to have a limited world to simulate. Dark matter is the other half of that trick. Both were invented by one Bebele Zropaxhodb after a particularly interesting party in the universe just above this one...
            • autonomousErwin 2 hours ago
              That's the rendering speed.
    • forrestthewoods 13 hours ago
      Can they? Is there a video of someone standing in place and spinning the camera 1080 degrees?
  • unicorn_cowboy 1 hour ago
    Can someone explain the actual use cases for this that folks are excited about?

    Nobody wants to be responsible for writing the film they're watching, as they watch it. Nobody wants to engage with a virtual world that has no story, narrative, meaning, or purpose.

    What is the actual use case for this, once we move beyond the tech demo stage?

    The idea that it will wholesale replace existing 3D graphics/rendering pipelines and processes any time soon seems so far-fetched to me (even just logistically) that I can't wrap my head around what people are thinking here.

    This strikes me as a fancy parlor trick. Unless I'm mistaken, we need proper world models to do the things people are erroneously assuming this will be capable of one day.
    • caspar 1 hour ago
      Presumably others will write the prompts (or equivalent directing mechanism) that steer the generation, such that you can act out whatever fantasies interest you.
  • krunck 22 hours ago
    The more of this I see, the more I want to spend time away from screens, doing the things I love to do in the real world.
    • MillionOClock 22 hours ago
      I love AI, but I also hope it will paradoxically make people realize the value of real-life experiences and human relationships.
      • qingcharles 17 hours ago
        It won't. This is digital heroin of the purest kind. The reason we've never heard from any alien species is that they invented this technology and then the entire population disappeared into their goonspheres and never emerged.

        Quite how they stopped a line forming three decks long outside every holodeck on the Enterprise is a mystery to me.
        • z3t4 9 hours ago
          > Quite how they stopped a line forming three decks long outside every holodeck on the Enterprise is a mystery to me.

          You probably need the captain's approval for NSFW content. I wonder if there will ever be an AI service that is not "Enterprise"-filtered.
      • avaer 21 hours ago
        Or maybe it will just make people realize the value of fake life experiences and human relationships.
        • boogrpants 20 hours ago
          The American public is no different from an American corporation: trying to extract as much allegiance and loyalty as possible for as little compensation as possible.

          Your neighbors in the street protesting for comprehensive single-payer healthcare? Yeah, they're perfectly fine leaving your existence up to "market forces".

          Copy-paste office workers everywhere reciting memorized platitudes and compliance demands.

          You're telling me I could interact even *less* with such selfish (and often useless, given their limited real skillset) people? Deal.

          America needs to rethink the compensation package if it wants to survive as a socio-political meme. Happy to call myself Canadian or Chinese if their offer is better. No bullets needed.
          • jplusequalt 20 hours ago
            You have a dangerously low opinion of your fellow man, and while I sympathize with your frustration, I would humbly suggest you direct that anger at the owners of companies and politicians, rather than aim it at your everyday citizen.
            • boogrpants 19 hours ago
              Your suggestion is meaningless semantic differentiation.

              Those owners and politicians are the result of exposure to American communities, schools, and other institutions; they do not spontaneously exist.

              Americans prop up the system; as such, Americans will defer, or their faith was misplaced to begin with. And that ain't right; they're America! So the awfulness will continue until morale improves!

              Atheist semantics while living with theist-like devotion to civil-religion memes.
          • switchbak 20 hours ago
            Maybe some folks (ahem) disappearing into virtual worlds is a good thing for those left behind.
      • koolala 21 hours ago
        It is only detaching people from reality more. The internet used to be a window into the outside world, and now AI is making it a counterfeit, manipulated fantasy.
      • Sol- 19 hours ago
        Dunno, I want to agree, but at the same time it's spoken like someone to whom these experiences and human relationships come easily. There are many people out there who, for some reason (anxiety, etc.), cannot easily access this part of the human condition, unfortunately.

        Perhaps better to roam a virtual reality than be starved in the real world.
      • anon-3988 16 hours ago
        Like what happened with the Internet. Oh wait, no, it didn't happen. Yea...
    • alex_c 20 hours ago
      Ironically, this brings us one step closer to believing the simulation hypothesis might be true... In which case, maybe there is no real world anyway ;)
    • isodev 17 hours ago
      Same here. The moment we see the person with the keyboard in the video, it suddenly becomes kind of painful to watch. All these scenes generated from actual footage someone made… it's also kind of sad.
    • slashdave 21 hours ago
      Huh, yeah, the sky is blue outside and the sun is shining.

      Although, I am feeling a bit lazy, so let me see if I can simulate a walk.
      • xarope 14 hours ago
        Will join you in a bit; let me just pause the simulation to adjust the settings for "knee pain when walking" and "need coffee to jump-start day".
    • TacoCommander 21 hours ago
      After a lifetime career in tech, I want to turn it all off.
      • switchbak 20 hours ago
        Seriously, though. These cute distractions have turned into world- and culture-eating monsters.
        • shimman 18 hours ago
          Can't wait to see this being abused to show that "the prisoners were treated humanely" or that "there are no protests in the capital." At what point do we hold engineers responsible for creating these machines? How many must suffer because no one was willing to say no?
      • isodev 16 hours ago
        I used to think the only really essential component of a home is to have internet. It's 20 years later, and I feel anything can be a home as long as it gets regular paper mail every couple of days.
    • moomoo11 21 hours ago
      I think this would be good for people living in overcrowded, polluted, and dirty cities around the world (let's be honest, those cities exist, regardless of how or why that happened).

      Maybe they can unplug from 500+ AQI pollution and spend time with their loved ones and friends in a simulated clean world?

      Imagine working 10-12 hours a day and coming home to a pod (and a building could house thousands of pods, paid for by the company) where you plug in and relax for a few hours. Maybe a few more decades of breakthroughs can lead to simulated sleep as well, so they get a full rest.

      Wake up, head to the factory to make whatever the developed world needs.

      (Holy fuck, that is a horrible existence, but you know some people would LOVE for that to be real.)
      • switchbak 20 hours ago
        Hook it up to an always-on fle*light, and I'm sure you'd have millions of paying customers.

        Except you'll never have to leave your pod. Extract the $$ from their attention all day, then sell them manufactured virtual happiness all night. It's just a more streamlined version of how many people live right now.

        I'll be running away from that hellscape, thanks.
      • trenning 19 hours ago
        Sounds similar to the Black Mirror episode https://en.wikipedia.org/wiki/Fifteen_Million_Merits
      • jplusequalt 20 hours ago
        There are posters in this very thread who want this reality to come to pass.
        • moomoo11 18 hours ago
          I mean, if I'm totally honest, it could be beneficial to me if something like this comes to be. Even in the developed world, we have a bunch of annoying people who complain/cry constantly about dumb things. They do that instead of doing something, whereas I can excuse the people trapped in hellholes overseas, because it really isn't their fault they were oppressed and mistreated.

          They'd have their own economy and "life" and leave the rest of us alone. It would be completely transactional, so I'd have zero reason to feel bad if they do it voluntarily.

          If they can be happy in a simulated world, and others can be happy in the real world, then everyone wins!
    • adventured21 hours ago
      Most people don't have access to anything particularly nice in the real world.

      It's reality privilege. Most of humanity will yearn for the worlds that AI will cook up for them, customized to their whims.
      • jplusequalt20 hours ago
        > Most people don't have access to anything particularly nice in the real world.

        What data/metric are you drawing from to arrive at this conclusion? How could you even realistically make such a statement?
      • lins190918 hours ago
        What a nonsensical take, good lord.
    • echelon21 hours ago
      The more I see of this, the faster I want it to go!

      I'm developing filmmaking tools with World Labs' Marble world model:

      https://www.youtube.com/watch?v=wJCJYdGdpHg

      https://github.com/storytold/artcraft

      I think we'll eventually get to the point where these are real time and have consistent representations. I've been excited about world models since I saw the in-the-browser Pokemon demo:

      https://madebyoll.in/posts/game_emulation_via_dnn/demo/

      At some point, we'll have the creative Holodeck. If you've seen what single improv performers can do with AI, it's ridiculously cool. I can imagine watching entertainers in the future that summon and create entire worlds before us:

      https://www.youtube.com/watch?v=MYH3FIFH55s

      (If you haven't seen CodeMiko, she's an incredibly talented engineer and streamer. She develops mocap + AI streams.)
      • jplusequalt20 hours ago
        > I think we'll eventually get to the point where these are real time and have consistent representations

        Just like how people in the 50s thought we would have flying cars and nuclear fusion by 2000.
      • bschwindHN13 hours ago
        I dunno, this all just looks like...garbage? And tools to help generate that garbage faster.
  • sy2623 hours ago
    I have been confused for a long time about why FB is not motivated enough to invest in world models; it IS the key to unblocking their "metaverse" vision. And instead they let Yann LeCun go.
    • observationist23 hours ago
      LeCun wasn't producing results. He was obstinate and insistent on his own theories and ideas which weren't, and possibly aren't, going anywhere. He refused to engage with LLMs and compete in the market that exists, and spent all his effort and energy on unproven ideas and research, which split the company's mission and competitiveness. They lost their place as one of the top 4 AI companies, and are now a full generation behind, in part due to the split efforts and lack of enthusiastic participation by all the Meta AI team. If you look at the chaos and churn at the highest levels across the industry, there's not a lot of room for mission creep by leadership, and LeCun thoroughly demonstrated he wasn't suited for the mission desired by Meta.

      I think he's lucky he got out with his reputation relatively intact.
      • qwertyi0k22 hours ago
        To be fair, this was his job description: Fundamental AI Research (FAIR) lab. Not the AI products division. You can't expect marketable products from a fundamental AI research lab.
        • YetAnotherNick21 hours ago
          It's "Facebook Artificial Intelligence Research", not Fundamental. So it basically involves both fundamental and applied research.

          [1]: https://engineering.fb.com/category/ai-research/
          • qwertyi0k21 hours ago
            Ref: Yann LeCun's post on LinkedIn, 3 years ago: FAIR now stands for "Fundamental AI Research".
            • YetAnotherNick11 hours ago
              I literally linked the official site and it currently says Facebook. I have known FAIR for many years and I have always known it as Facebook. Can you link any official source for the name change?
              • stephenjayakar1 hour ago
                I gotcha. https://www.linkedin.com/posts/yann-lecun_big-changes-for-ai-rd-at-meta-fair-remains-activity-6938183698311766016-2Yh0, https://ai.meta.com/blog/fair-10-year-anniversary-open-science-meta/
      • halfmatthalfcat22 hours ago
        Were you there or just an attentive outsider?
        • observationist22 hours ago
            Attentive outsider and acquaintance of a couple people who are or were employed there. Nothing I'm saying is particularly inside baseball, though; it's pretty well covered by all the blogs and podcasts.
          • richard___22 hours ago
            What podcast?
            • observationist22 hours ago
              Machine Learning Street Talk and Dwarkesh are excellent. Various Discord communities, forums, and blogs downstream of the big podcasts, plus following researchers on X, keep you in the loop on a lot of these things, and then you can watch for random interviews and presentations on YouTube once you know who the interesting people and subjects are.
        • qwertyi0k22 hours ago
          Most serious researchers want to work on interesting problems like reinforcement learning or robotics or RNNs or a dozen other avant-garde subjects. None want to work on "boring" LLM technology, requiring significant engineering effort and huge dataset wrangling effort.
          • observationist22 hours ago
            This is true - Ilya got an exit and is engaged in serious research, but research is by its nature unpredictable. Meta wanted a product and to compete in the AI market, and JEPA was incompatible with that. Now LeCun has a lab and resources to pursue his research, and Meta has refocused efforts on LLMs and the marketplace - it remains to be seen if they'll be able to regain their position. I hope they do - open models and relatively open research are important, and the more serious AI labs that do this, the more it incentivizes others to do the same, and keeps the ones that have committed to it honest.
      • mapmeld21 hours ago
        In an industry of big bets, especially considering the company has poured resources and renamed itself to secure a place in the VR world... staking your reputation on everyone's LLMs having peaked and shifting focus to finding a new path to AI is a pretty interesting bet, no?
      • ezst22 hours ago
        Since a hot take is as good as the next one: LLMs are by the day more and more clearly understood as a "local maximum" with flawed capabilities and limited efficiency - a trillion dollars plus a large chunk of the USA's GDP spent, with nobody turning a profit from it nor able to build something that can't be reproduced for free within 6 months.

        When the right move (strategically, economically) is to not compete, the head of the AI division acknowledging the above and deciding to focus on the next breakthrough seems absolutely reasonable.
        • throw31082218 hours ago
          You really need to be obstinate in your convictions if you can dismiss LLMs at a time when everyone's job is being turned around by them. Everywhere I look, everyone I talk to is using LLMs more and more to do their job and dramatically increase their productivity. It's one of the most successful technologies I've ever witnessed arriving on the market, and it's only just started - it's just three years old.
          • acedTrex17 hours ago
            What are you seeing people do with it? To my eyes everyone is in the same amount of meetings lol.
            • throw31082217 hours ago
              For one, since last month, AI is writing about 95% of my code and that of my colleagues. I just describe what I want and how it should be implemented, and the AI takes care of all the details, solves bugs, configuration issues, etc. Also I use it to research libraries, dig into documentation (and then write implementations based on that), discuss architectural alternatives, etc.

              Non-developers I know use them to organise meetings, write emails, research companies, write down and summarise counselling sessions (not the clients, the counselor), write press reports, help with advertising campaign management, review complex commercial insurance policies, fix translations... The list of uses is endless, really. And I'm only talking of work-related usage; personal usage goes of course well beyond this.
              • ezst9 hours ago
                > You really need to be obstinate in your convictions if you can dismiss LLMs at the time when everyone's job is being turned around by them.

                I'm factual. You are the one with the extraordinary claim that LLMs will find new substantial markets/go through transformative breakthroughs.

                > Everywhere I look, everyone I talk to, is using LLMs

                And everywhere I look, I don't. It might be the case that you stand right in the middle of an LLM niche. Never did I say that one doesn't exist, or that LLMs are inadequate at parroting existing code.

                > Non-developers I know use them […]

                Among those are:

                - things that have nothing to do with LLMs/AI

                - things that you should NOT use LLMs for, because they will give you confidently wrong and/or random answers (it's not in their training data/cut-off window, it's non-public information, or they don't have the computing abilities to produce meaningful results)

                - things that are low-value/low-stakes, which people won't be willing to pay for when asked to

                > The list of uses is endless

                No, it is not.

                > And I'm only talking of work-related usage

                And we will get to see, rather sooner than later, how much businesses actually value LLMs when the real costs are finally passed on to them.
                • throw3108227 hours ago
                  > things that have nothing to do with LLMs/AI

                  These are things that have to do with *intelligence*. Human or LLM doesn't matter.

                  > things that you should NOT use LLMs for / parroting existing code / not in their training data/cut-off window, it's non-public information, they don't have the computing abilities to produce meaningful results

                  Sorry, but I just get the picture that you have no clue what you're talking about - though most probably you're just in denial. This is one of the most surprising things about the emergence of AI: the existence of a niche of people that is hell-bent on denying its existence.
                  • ezst4 hours ago
                    > intelligence. Human or LLM doesn't matter.

                    Being enthusiastic about a technology isn't incompatible with objective scrutiny. Throwing an ill-defined "intelligence" up in the air certainly doesn't help with that.

                    Where I stand is where measured and fact-driven (aka scientist) people do: operating with the knowledge (derived from practical evidence¹) that LLMs have no inherent ability to reason, while making a convincing illusion of it as long as the training data contains the answer.

                    > Sorry, but I just get the picture that you have no clue of what you're talking about - though most probably you're just in denial.

                    This isn't a rebuttal. So, what is it? An insult? Surely that won't help make your case stronger.

                    You call me clueless, but at least I don't have to live with the same cognitive dissonances as you, just to cite a few:

                    - "LLMs are intelligent, but when given a trivially impossible task, they happily make stuff up instead of using their `intelligence` to tell you it's impossible"

                    - "LLMs are intelligent because they can solve complex highly-specific tasks from their training data alone, but when provided with the algorithm extending their reach to generic answers, they are incapable of using their `intelligence` and the supplemented knowledge to generate new answers"

                    ¹: https://arstechnica.com/ai/2025/06/new-apple-study-challenges-whether-ai-models-truly-reason-through-problems/
                    • throw3108223 hours ago
                      > This isn't a rebuttal.

                      I don't really think it's possible to convince you. Basically everyone I talk to is using LLMs for work, and in some cases - like mine - I know for a fact that they do produce enormous amounts of value, to the point that I would pay quite some money to keep using them if my company stopped paying for them.

                      Yes, LLMs have well known limitations, but they're still a brand new technology in its very early stages. ChatGPT appeared little more than three years ago, and in the meantime it went from barely useful autocomplete to writing whole features autonomously. There's already plenty of software that has been 100% coded by LLMs.

                      "Intelligence", "understanding", "reasoning"... nobody has clear definitions for these terms, but it's a fact that LLMs in many situations act as if they understood questions, problems and context, and provide excellent answers (better than the average human). The most obvious case is when you ask an LLM to analyse some original artwork or poem (or some very recent online comic, why not?) - something that can't be in its training data - and they come up with perfectly relevant and insightful analyses and remarks. We don't have an algorithm for that, we don't even begin to understand how those questions can be answered in any "mechanical" sense, and yet it works. This is intelligence.
                      • AnimalMuppet3 hours ago
                        You know what this reminds me of? Language X comes out (e.g., Lisp or Haskell), and people try it, and it's this wonderful, magical experience, and something just "clicks", and they tell everyone how wonderful it is.

                        And other people try it - really sincerely try it - and they don't "get it". It doesn't work for them. And those who "get it" tell those who don't that they just need to *really* try it, and keep trying it until they get it. And some people never get it, and are told that they didn't try enough (and also it gets implied that they are stupid if they really can't get it).

                        But I think that at least part of it is in how peoples' brains work. People think in different ways. Some languages just work for some people, and really don't work very well for other people. If a language doesn't work for you, it doesn't mean either that it's a bad language or that you're stupid (or just haven't tried). It can just be a bad fit. And that's fine. Find a language that fits you better.

                        Well, I wonder if that applies to LLMs, and especially to LLMs doing coding. It's a tool. It has capabilities, and it has limitations. If it works for you, it can *really* work for you. And if it doesn't, then it doesn't, and that doesn't mean that it's a bad tool, or that you are stupid, or that you haven't tried. It can just be a bad fit for how you think or for what you're trying to do.
                        • throw31082228 minutes ago
                          > You know what this reminds me of? Language X comes out (e.g., Lisp or Haskell), and people try it, and it's this wonderful, magical experience, and something just "clicks", and they tell everyone how wonderful it is.

                          I can relate to this. And I can understand that, depending on how and what you code, LLMs might have different value, or even none. Totally understand.

                          At the same time... well, let's put it this way. I've been fascinated with programming and computers for decades, and "intelligence", whatever it is, for me has always been the holy grail of what computers can do. I've spent a stupid amount of time thinking about how intelligence works, how a computer program could unpack language, solve its ambiguities, understand the context and nuance, notice patterns that nobody told it were there, etc. Until ten years ago these problems were all essentially unsolved, despite more than half a century of attempts: large human-curated efforts, funny chatbots that produced word salads with vague hints of meaning, and infuriating ones that could pass for stupid teenagers for a couple of minutes provided they selected sufficiently vague answers from a small database... I've seen them all. In 1968's 2001: A Space Odyssey there's a computer that talks (even if "experts prefer to say that it mimics human intelligence") and in 2013's Her there's another one. In between, in terms of actual results, there's *nothing*. "Her" is as much science fiction as "2001", with the aggravating factor that in Her the AI is presented as a novel consumer product: absurd. As if anything like that were possible without a complete societal disruption.

                          All this to say: I can't for the life of me understand people who act blasé when they can just talk to a machine and the machine appears to understand what they mean, doesn't fall for trivial language ambiguities but will actually even make some meta-fun about it if you test it with some well known example; a machine that can read a never-seen-before comic strip, see what happens in it, read the shaky lettering and finally explain correctly *where the humour lies*. You can repeat to yourself a billion times "transformers something-something", but that doesn't change the fact that what you are seeing is intelligence; that's exactly what we always called intelligence - the ability to make sense of messy inputs, see patterns, see the meanings behind the surface, and communicate back in clear language. Ah, and this technology is only a few years old - little more than three if we count from ChatGPT. These are the first baby steps.

                          So it's not working for you right now? Fine. You don't see the step change, the value in general and in perspective? Then we have a problem.
      • Der_Einzige13 hours ago
        It's insane that you can argue this in a world where Facebook continues to be state of the art (and it's not even close) on semantic segmentation. Those SAM models they produce deliver more value than a hypothetical competitive Llama 5 model coming out tomorrow.

        I'm banning my wife from ever buying any Alexander Wang clothing, because his leadership is so poor in comparison that he's going to also devalue the name-collision fashion brand that he shares a name with. That's how bad his leadership is going to be in comparison to Yann's. Scale AI was only successful for the same reason Langchain was: easy to be a big fish in a pond with no other fishes.
      • anonnon20 hours ago
        This sounds similar to the arc of Karpathy, who also managed to preserve his reputation despite sending Tesla down an FSD dead end and missing the initial LLM boat.
    • qwertox22 hours ago
      Isn't it more like this: JEPA looks at the video - "a dog walks out of the door, the mailman comes, dog is happy" - and the next frame would need to look like "mailman must move to mailbox, dog will run happily towards him", which an image/video generator would then need to render.

      Genie looks at the video: "when this group of pixels looks like this and the user presses 'jump', I will render the group differently in this way in the next frame."

      Genie is an artist drawing a flipbook. To tell you what happens next, it must draw the page. If it doesn't draw it, the story doesn't exist.

      JEPA is a novelist writing a summary. To tell you what happens next, it just writes "The car crashes." It doesn't need to describe what the twisted metal looks like to know the crash happened.
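      To make that contrast concrete, here is a minimal toy sketch in Python/PyTorch - my own framing with placeholder shapes and plain linear layers, not either lab's actual architecture. The pixel head has to produce the whole next frame; the latent head only predicts the next embedding.

          import torch
          import torch.nn as nn

          frame = torch.randn(1, 3, 64, 64)   # current observation
          action = torch.randn(1, 8)          # latent action (hypothetical size)

          # Genie-style flipbook artist: predict the next frame in pixel space.
          pixel_head = nn.Linear(3 * 64 * 64 + 8, 3 * 64 * 64)
          next_pixels = pixel_head(torch.cat([frame.flatten(1), action], dim=1))
          next_frame = next_pixels.view(1, 3, 64, 64)  # the page must be drawn

          # JEPA-style novelist: predict only the next embedding of the scene.
          encoder = nn.Linear(3 * 64 * 64, 256)
          latent_head = nn.Linear(256 + 8, 256)
          next_z = latent_head(torch.cat([encoder(frame.flatten(1)), action], dim=1))
          # next_z can assert "the car crashes" without drawing the twisted metal.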
    • energy12313 hours ago
      Is Project Genie a "world model" as defined by Yann LeCun? Doesn't "world model" mean that the model generates things from a theory of the world, rather than the colloquial meaning of generating 3D navigable scenes (using a temporal ViT or whatever)?
    • general_reveal22 hours ago
      You are beyond correct. World models are what save their Reality Labs investment. I would say that if Reality Labs cannot productize world models, then that entire project needs to be scrapped.
    • slashdave21 hours ago
      Failures are not publicly reported, in general. Do we know what they have invested in?
    • phailhaus23 hours ago
      Most people don't like putting on VR headsets, no matter what the content is. It just never broke out of the tech enthusiast niche.
  • phailhaus23 hours ago
    I have no idea why Google is wasting their time with this. Trying to hallucinate an entire world is a dead end. There will never be enough predictability in the output for it to be cohesive in any meaningful way, by design. Why are they not training models to help *write* games instead? You wouldn't have to worry about permanence and consistency at all, since they would be enforced by the code, like all games today.

    Look at how much prompting it takes to vibe code a prototype. And they want us to think we'll be able to prompt a whole world?
    • whalee19 hours ago
      This was a common argument against LLMs: that the space of possible next tokens is so vast that eventually a long enough sequence will necessarily decay into nonsense, or at least that compounding error will have the same effect.

      Problem is, that's not what we've observed to happen as these models get better. In reality there is some metaphysical coarse-grained substrate of physics/semantics/whatever[1] which these models can apparently construct for themselves in pursuit of ~whatever~ goal they're after.

      The initially stated position, and your position - "trying to hallucinate an entire world is a dead end" - is a sort of maximally-pessimistic 'the universe is maximally irreducible' claim.

      The truth is much, much more complicated.

      [1] https://www.arxiv.org/abs/2512.03750
      • post-it19 hours ago
        And going back a little further, it was thought that backpropagation would be impractical, and trying to train neural networks was a dead end. Then people tried it and it worked just fine.
      • phailhaus19 hours ago
        > Problem is, that's not what we've observed to happen as these models get better

        Eh? Context rot is extremely well known. The longer you let the context grow, the worse LLMs perform. Many coding agents will pre-emptively compact the context or force you to start a new session altogether because of this. For Genie to create a consistent world, it needs to maintain context of *everything*, *forever*. No matter how good it gets, there will always be a limit. This is not a problem if you use a game engine and code it up instead.
        • CamperBob217 hours ago
          The models, not the context. When it comes to weights, "quantity has a quality all its own" doesn't even begin to describe what happens.

          Once you hit a billion or so parameters, rocks suddenly start to think.
          • phailhaus3 hours ago
            We're talking about context here, though. The first couple seconds of Genie are great, but over time it degrades. It will always degrade, because it's hallucinating a world and needs to keep track of too many things.
            • CamperBob237 minutes ago
              That has traditionally been the problem with these types of models, but Genie is supposed to maintain coherence up to 60 seconds.

              I've tried using it a couple of times, but can't get in. It is either down or hopelessly underprovisioned by Google. Do you have any links to videos showing that the quality degrades after only a few seconds?

              Edit: no, it just doesn't work in Firefox. It works incredibly well, at least in Chrome, and it does not lose coherence to any great extent. The controls are terrible, though.
    • seedie22 hours ago
      Imo they explain pretty well what they are trying to achieve with SIMA and Genie in the Google DeepMind Podcast[1]. They see it as *the* way to get to AGI, by letting AI agents learn for themselves in simulated worlds - kind of like how they let AlphaGo train for Go in an enormous number of simulated games.

      [1] https://youtu.be/n5x6yXDj0uo
      • phailhaus20 hours ago
        That makes even less sense, because an AI agent cannot learn effectively from a hallucinated world without internal consistency guarantees. It's an even stronger case for leveraging standard game engines instead.
      • hmry19 hours ago
        "I need to go to the kitchen, but the door is closed. Easy. I'll turn around and wait for 60 seconds." - AI agent trained in this kind of world
      • arionmiles21 hours ago
        If that's the goal, the technology for how these agents "learn" would be the most interesting part, even more than the demos in the link.

        LLMs can barely remember the coding style I keep asking them to stick to despite numerous prompts stuffing that guideline into my (whatever is the newest flavour of product-specific markdown file). They keep expanding the context window to work around that problem.

        If they have something for long-term learning and growth that can help AI agents, they should be leveraging it for competitive advantage.
    • asim22 hours ago
      Take the positive spin. What if you could put in all the inputs and it could simulate real-world scenarios you can walk through to benefit mankind, e.g. disaster scenarios, events, plane crashes, traffic patterns? There are a lot of useful applications for it. I don't like the framing at this time, but I also get where it's going. The engineer in me is drawn to it, but the Muslim in me is very scared to hear anyone talk about creating worlds... But again, I have to separate my view from the reality that this could have very positive real-world benefits when you can simulate scenarios.

      So I could put in a 2-pager or 10-page scenario that gets played out or simulated and allows me to walk through it - not just predictive stuff, but, say, things that have happened, so I can map crime scenes or anything. In the end this performance art is because they are a product company being benchmarked by Wall Street, and they'll need customers for the technology, but at the same time they probably already have uses for it internally.
      • jsheard22 hours ago
        > What if you could put in all the inputs and it can simulate real world scenarios you can walk through to benefit mankind e.g disaster scenarios, events, plane crashes, traffic patterns.

        This is only a useful premise if it can do any of those things accurately, as opposed to dreaming up something kinda plausible based on an amalgamation of every vaguely related YouTube video.
      • q3k22 hours ago
        > What if you could put in all the inputs and it can simulate real world scenarios you can walk through to benefit mankind e.g disaster scenarios, events, plane crashes, traffic patterns.

        What's the use? Current scientific models clearly showing natural disasters and how to prevent them are being ignored. Hell, ignoring scientific consensus is a fantastic political platform.
    • MillionOClock22 hours ago
      A hybrid approach could maybe work: have a more or less standard game engine for coherence, and use this kind of generative AI as a short-term rendering and physics sim engine.
      • elfly21 hours ago
        I've thought about this same idea, but it probably gets very complicated.

        Let's say you simulate a long museum hallway with some vases in it. Who holds what? The basic game engine has the geometry, but once the player pushes a vase and moves it, it needs to inform the engine it did, and then to draw the next frame: read from the engine first, update the position in the video feed, then feed it back to the engine again.

        What happens if the state diverges? Who wins? If the AI wins then... why have the engine at all?

        It is possible, but then who controls physics? The engine, or the AI? The AI could have a different understanding of the details of the vase. What happens if the vase has water inside? Who simulates that? What happens if the AI decides to break the vase? Who simulates the AI?

        I don't doubt that some sort of scratchpad to keep track of stuff in-game would be useful, but I suspect the researchers are expecting the AI to keep track of everything in its own "head" because that's the most flexible solution.
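        One hedged sketch of an answer to "who wins" - my own toy resolution, not anything the researchers have described - is to make the engine permanently authoritative for state and demote the generative model to a renderer conditioned on that state. All names here are hypothetical:

            from dataclasses import dataclass

            @dataclass
            class Vase:
                x: float = 0.0
                broken: bool = False

            def engine_step(vase: Vase, push: float) -> Vase:
                # Classical code owns the truth: position, collision, breakage.
                vase.x += push
                if abs(push) > 1.0:
                    vase.broken = True
                return vase

            def model_render(prev_frame: str, vase: Vase) -> str:
                # Stand-in for the generative model: it only paints the state
                # it is handed, so a divergent "imagined" vase can never win.
                return f"frame(x={vase.x:.1f}, broken={vase.broken})"

            vase, frame = Vase(), "frame(initial)"
            for push in (0.2, 0.3, 1.5):
                vase = engine_step(vase, push)
                frame = model_render(frame, vase)
            print(frame)  # frame(x=2.0, broken=True) - the vase persisted

        The cost is the problem above: anything the engine doesn't track (the water, the shards) falls back to the model's "head", so the scratchpad only helps for state someone thought to put in it.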
        • MillionOClock19 hours ago
          Then maybe the engine should be less about really simulating the 3D world and more about trying its best to preserve consistency - more about providing memory and saving context for consistency than truly simulating much besides higher-level concerns (at which point we might wonder if it couldn't be directly part of the model somehow). But writing those lines, I realize there would probably still be many edge cases exactly like what you are describing...
    • jimmar21 hours ago
      As a kid in the early 1980s, I spent a lot of time experimenting with computers by playing basic games and drawing with crude applications. And it was fun. I would have loved to have something like Google's Genie to play with. Even if it never evolved, the product in the demos looks good enough for people to get value from.
    • anigbrowl20 hours ago
      It's been very profitable for drug dealers for centuries; who wouldn't want a piece of that market?
      • phailhaus20 hours ago
        Because games already exist, and it would be easier for LLMs to write games rather than hallucinate videos.
    • godelski22 hours ago
      > Why are they not training models to help write games instead?

      Genie isn't about making games... Granted, for some reason they don't put this at the top. Classic Google, not communicating well...

      > It simulates physics and interactions for dynamic worlds, while its breakthrough consistency enables the simulation of any real-world scenario — from robotics and modelling animation and fiction, to exploring locations and historical settings.

      The key part is simulation. That's what they are building this for. Ignore everything else.

      Same with Nvidia's Earth 2 and Cosmos (and a bit like Isaac). Games or VR environments are not the primary drive; the primary drive is training robots (including non-humanoids, such as Waymo) and just getting the data. It's exactly because of this that perfect physics (or let's be honest, realistic physics[0,1]) isn't strictly necessary. Getting 50% of the way there in simulation really does cut down the costs of development, even if we recognize that the cost steepens as we approach "there". I really wish they didn't call them "world models" or, more specifically, didn't shove the word "physics" in there, but hey, is it really marketing if they don't claim a golden goose can not only lay actual gold eggs but also diamonds and that its honks cure cancer?

      [0] Looking right does not mean it is right. Maybe it'll match your intuition or undergrad general physics classes with calculus, but talk to a real physicist if you doubt me here. Even one with just an undergrad degree will tell you this physics is unrealistic, and anyone worth their salt will tell you how unintuitive physics ends up being as you get realistic, even well before approaching quantum. Go talk to the HPC folks and ask them why they need supercomputers... Sorry, physics can't be done from observation alone.

      [1] Seriously, look at their demo page. It really is impressive, don't get me wrong, but I can't find a single video that doesn't have major physics problems. That "high-altitude open world featuring deformable snow terrain" looks like it is simulating Legolas[2], not a real person. The work is impressive, but it isn't anywhere near realistic: https://deepmind.google/models/genie/

      [2] https://www.youtube.com/watch?v=O4ZYzbKaVyQ
      • phailhaus20 hours ago
        But it's not simulating, is it? It's hallucinating videos with an input channel to guide what the video looks like. Why do that instead of just picking Unreal, Unity, etc. and having it *actually* simulated for a fraction of the effort?
        • godelski17 hours ago
          Depends on your definition of simulation, but yeah, I think you understand.

          I think it really comes down to dev time and adaptability. But honestly I'm fairly with you. I don't think this is a great route. I have a lot of experience in synthetic data generation, and nothing beats high quality data. I do think we should develop world models, but I wouldn't call something a world model unless it actually models a physics. And I mean "a physics", not "what people think of as 'physics'" (i.e. the real world) - I mean having a counterfactual representation of an environment. Our physics equations are an extremely compressed representation of our reality. You can't generate these representations through observation alone, and that is the naive part of the usual way to develop world models. But we'd need to go into metaphysics, and that's a long conversation not well suited for HN.

          These simulations are helping, but they have a clear limit to their utility. I think too many people believe that if you just feed the models enough data they'll learn. Hyperscaling is a misunderstanding of the Bitter Lesson that slows development despite showing some progress.
    • dyauspitr22 hours ago
      Why is it a dead end? You don't meaningfully explain that. These models look like you can interact with them, and they seem to replicate physics models.
      • phailhaus20 hours ago
        They don't though; they're hallucinated videos. They're feeding models tons and tons of 2D videos and hoping they figure out physics from them, instead of just using a game engine and having the LLM write something up that works 100% of the time.
        • dyauspitr10 hours ago
          On the flip side, the emergent properties that come from some of these wouldn’t be replicable by an engine. A moss covered rock realistically shedding moss as it rolls down a hill. Condensation aggregating into beads and rivulets on glass. An ant walking on a pitcher plant and being able to walk inside it and see bugs drowned from its previous meal. You’re missing the forest for the trees.
          • phailhaus2 hours ago
            And then the rivulets disappear or change completely because you looked away. The reason this is a dead end is that, computationally, there is absolutely no way for the model to keep track of everything it decided. Everything is kept "in its head" rather than persisted. So what you get is a dream world, useless for training world models. It's great for prototyping, terrible for anything more durable.
            • dyauspitr2 hours ago
              A dreamworld where the overwhelming number of things are consistent is better than something low detail and always consistent.
        • uyribackgy16 hours ago
          [dead]
  • montebicyclelo23 hours ago
    Reminds me of this [1] HN post from 9 months ago, where the author trained a neural network to do world emulation from video recordings of their local park — you can walk around in their interactive demo [2].

    I don't have access to the DeepMind demo, but from the video it looks like it takes the idea up a notch.

    (I don't know the exact lineage of these ideas, but a general observation is that it's a shame that it's the norm for blog posts / indie demos to not get cited.)

    [1] https://news.ycombinator.com/item?id=43798757

    [2] https://madebyoll.in/posts/world_emulation_via_dnn/demo/
    • ollin21 hours ago
      Yup, similar concepts! Just at two opposite extremes of the compute/scaling spectrum.

      - That forest trail world is ~5 million parameters, trained on 15 minutes of video, scoped to run on a five-year-old iPhone through a twenty-year-old API (WebGL GPGPU, i.e. OpenGL fragment shaders). It's the smallest '3D' world model I'm aware of.

      - Genie 3 is (most likely) ~100 billion parameters, trained on millions of hours of video and running across multiple TPUs. I would be shocked if it's not the largest-scale world model available to the public.

      There are lots of neat intermediate-scale world models being developed as well (e.g. LingBot-World https://github.com/robbyant/lingbot-world, Waypoint 1 https://huggingface.co/blog/waypoint-1), so I expect we'll be able to play something of Genie quality locally on gaming GPUs within a year or two.
    • danielwmayer8 hours ago
      I was immediately struck, when I looked down at just the boardwalk, by how similar it felt to being on LSD. I am continually astounded by how similar these systems end up seeming to how our brains work. It may just be a happy coincidence, but I am pretty sold on there being true parallels that will only become more and more apparent.
      • ollin1 hour ago
        A lot of people mentioned this! The "dreamlike" comparison is common as well. In both cases, you have a network of neurons rendering an image approximating the real world :) so it sort of makes sense.

        Regarding the specific boiling-textures effect: there's a tradeoff in recurrent world models between jittering (constantly regenerating fine details to avoid accumulating error) and drifting (propagating fine details as-is, even when that leads to accumulating error and a simplified/oversaturated/implausible result). The forest trail world is tuned way towards jittering (you can pause with `p` and step frame-by-frame with `.` to see this). So if the effect resembles LSD, it's possible that LSD applies some similar random jitter/perturbation to the neurons within your visual cortex.
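        A toy illustration of that tradeoff - a scalar caricature of a "frame", nothing like the real model - is a biased one-step predictor rolled out autoregressively. With no jitter the bias compounds forever; re-rolling a fraction of each frame from the prior bounds the error at the cost of "boiling" the details:

            import numpy as np

            rng = np.random.default_rng(0)

            def predict_next(frame):
                # Imperfect learned dynamics: a small systematic bias plus noise.
                return frame + 0.05 + rng.normal(0.0, 0.02)

            def rollout(alpha, steps=200):
                frame, errors = 0.0, []
                for _ in range(steps):
                    carried = predict_next(frame)       # drift: propagate details
                    regenerated = rng.normal(0.0, 0.1)  # jitter: re-sample from prior
                    frame = (1 - alpha) * carried + alpha * regenerated
                    errors.append(abs(frame))           # the true state is 0 here
                return np.mean(errors)

            for alpha in (0.0, 0.1, 0.5):
                print(f"jitter={alpha}: mean error {rollout(alpha):.3f}")
            # jitter=0 drifts without bound; more jitter bounds the error but
            # keeps re-rolling ("boiling") the frame's fine details.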
  • cipherself3 hours ago
    It's quite exciting how far we've come from the modern exposition of world models by David Ha and Jürgen Schmidhuber in 2018: https://worldmodels.github.io/
  • emsign8 hours ago
    The holodeck in Star Trek seemed so cool, but when I see this I get some very bad feelings somehow. Probably because society in Star Trek seemed so much more mature and beneficial than BigTech America.
  • 0xcb023 hours ago
    I keep repeating myself, but it feels like I'm living in the future. Can't wait to hook this up to my old Oculus goggles and let Genie create a fully realistic sailing simulator for me, where I can train sailing in realistic conditions, on boats I'd love to sail.

    If making games out of these simulations works, it'd be the end for a lot of big studios, and might be the renaissance for small to one-person game studios.
    • jsheard23 hours ago
      Isn't this still essentially "vibe simulation" inferred from videos? Surface-level visual realism is one thing, but expecting it to figure out the exact physical mechanics of sailing just by watching boats, and usefully abstract that into a gamified form, is another thing entirely.
      • JeremyNT22 hours ago
        Yeah, I have a whole lot of trouble imagining this replacing traditional video games any time soon; we have actually very good and performant representations of how physics works, and games are tuned for the player to have an enjoyable experience.

        There's obviously something insanely impressive about these Google experiments, and it certainly feels like there's some kind of use case for them somewhere, but I'm not sure exactly where they fit in.
      • falcor8422 hours ago
        Why wouldn't it just hook into something like PhysX?
        • jsheard21 hours ago
          Google has made it clear that Genie doesn't maintain an explicit 3D scene representation, so I don't think hooking in "assists" like that is on the table. Even if it were, the AI layer would still have to infer things like object weight, density, friction and linkages correctly. Garbage in, garbage out.
          • Morromist17 hours ago
            Google could try to build an actual 3D scene with AI, using meshes or metaballs or something. That would allow for more persistence, but I expect it makes the AI more brittle and limited, and, because it doesn't really understand the rules for the 3D meshes it created, it doesn't know how to interact with them. It can only be fluffy-mushy dream images.
    • nsilvestri23 hours ago
      The bottleneck for games of any size is always whether they are good. There are plenty of small indies which do not put out good games. I don't see world models improving game design or fun factors.

      If I am wrong, then the huge supply of fun games will completely saturate demand, and it will be no easier for indie game devs to stand out.
    • bdbdbdb23 hours ago
      It's very impressive tech but subject to the same limitations as other generative AI: inconsistency, inaccurate physics, limited time, lag, massively expensive computation.

      You COULD create a sailing sim, but after ten minutes you might be walking on water, or in the bath, and it would use more power than a small ferry.

      There's no way this tech can run on a PS5 or anything close to it.
      • WarmWash22 hours ago
        Five years is nothing to wait for tech like this. I'm sure we will see the first crop of, however small, "terminally plugged in" humans on the back of this in the relatively near future.
      • ziofill23 hours ago
        You raise good points, but I think the “it’s not good enough” stance won’t last for long.
    • hagbarth21 hours ago
      > and might be the renaissance for small to one-person game studios.

      Indie games are already bigger than ever, as far as I know.
    • avaer22 hours ago
      > If making games out of these simulations works, it'd be the end for a lot of big studios, and might be the renaissance for small to one-person game studios.

      I mean, if making a game eventually boils down to cooking a sufficient prompt (which, to be clear, I'm not talking about text; these prompts are probably going to be more like video databases), then I'm not sure it will be a renaissance for "one-person game studios" any more than AI image generation has been a renaissance for "one-person artists".

      I want to be optimistic, but it's hard to deny the massive distribution stranglehold the media publishing landscape has, and that has nothing to do with technology.
    • Avicebron22 hours ago
      Honestly, getting a Sunfish is probably cheaper than a VR headset if you want to "train sailing".
    • neom23 hours ago
      ...and then, the pneumatics in your living room.
  • ofrzeta23 hours ago
    I don't know... it's impressive and all, but the result always looks kind of dead.
    • saberience22 hours ago
      This reminds me of the comments by programmers roughly two years ago:

      "Sure it can write a single function, but the code is terrible when it tries to write a whole class..."
      • sfn4220 hours ago
        You say that as if those programmers aren't still right.
    • api22 hours ago
      It's super cool, but I see it as a much more flexible, open-ended take on the idea of procedurally generated worlds, where hard-coded deterministic math and rendering parameters are replaced by promptable models.

      The deadness you're talking about is there in procedural worlds too, and it stems from the fact that there's not actually much "there". Think of it as a kind of illusion or a magic trick with math. It replicates some of the macro structure of the world, but the true information content is low.

      Search YouTube for procedural landscape examples. Some of them are actually a lot more visually impressive than this, but without the interactivity. It's a popular topic in the demo scene too, where people have made tiny demos (e.g. under 1k in size) that generate impressive scenes.

      I expect to see generative AI techniques like this show up in games, though it might take a bit due to their high computational cost compared to traditional procedural generation.
  • consumer45117 hours ago
    Related:

    > Diego Rivas, Shlomi Fruchter, and Jack Parker-Holder from the Project Genie team join host Logan Kilpatrick for an in-depth discussion on Google DeepMind's latest breakthrough in world models. Project Genie is an experimental research prototype that allows users to generate, explore, and interact with infinitely diverse, photorealistic worlds in real-time. Learn more about the shift from passive video generation to interactive media, the technical challenges of maintaining world consistency and memory, and how these models serve as an essential training ground for AI agents.

    https://www.youtube.com/watch?v=Ow0W3WlJxRY
  • BatteryMountain9 hours ago
    Can't wait to have a game that can be an all-in-one game: RPG, roleplaying, RTS, space, orcs, magic, cyberworlds with infinite story lines/worlds, items, dialogs, etc. Ready Player One vibes.

    Like I want to take my Skyrim character, drop it into Diablo 2, then drop Diablo (the demon) into Need for Speed, then have my Need for Speed car show up on another planet and upgrade it into a spaceship, while the spaceship takes me to fight some mega aliens - all while offering a coherent & unique experience. As you play, the game saves major forks in your story & game genre, so you can invite/share your game recipe with other humans to enjoy.

    Also, when are we getting a new Spore game? That game is a sleeping giant waiting to be awakened.
  • meetpateltech1 day ago
    Google DeepMind page: https://deepmind.google/models/genie/

    Try it in Google Labs: https://labs.google/projectgenie

    (Project Genie is available to Google AI Ultra subscribers in the US, 18+.)
  • jacquesm17 hours ago
    Isn't that more or less the theme of the movie 'The Thirteenth Floor'?

    https://www.youtube.com/watch?v=Cbjhr-H2nxQ
  • nickandbro23 hours ago
    This could be the future of film. Instead of prompting where you don't know what the model will produce, you could use fine-grained motion controls to get the shot you are looking for. If you want to adjust the shot after, you could just checkpoint the model there, by taking a screenshot, and rerun. Crazy.
    • JKCalhoun23 hours ago
      I feel like people are already doing this - essentially storyboarding first.

      This guy a month ago, for example: https://youtu.be/SGJC4Hnz3m0
  • pedalpete20 hours ago
    This is what we were building in 2018 with Ayvri, starting from 3D tiles, with the aim of building a real-world view by using AI to essentially re-paint and add detail to what was essentially a high-resolution and faster-loading Google Earth (outside cities - we didn't have building data).

    We saw a very diverse group of users; the common ones were paragliders, glider pilots, and pilots who wanted to view their own or other people's flights. Ultramarathons, mountain bike and some road races, where it provided an interactive way to visualize the course from any angle and distance. Transportation infrastructure, to display train routes to be built. The list goes on.
  • artisin20 hours ago
    Best case, Google DeepMind cracks AGI by letting agents learn for themselves inside simulated worlds. Worst case, they've invented the greatest, most expensive screensaver generator in human history.
    • rook_line_sinkr13 hours ago
      Could you explain how AGI is a best case? Last time I checked, "if anyone builds it, everyone dies".

      Oh, is this the joke?
  • speak_on21 hours ago
    Compared to DeepMind's Genie 3 demo, this appears to have more morphing issues and less user interactivity with environmental consistency. Is this a stripped-down version?
    • qingcharles17 hours ago
      Yes. Some features were removed from the version demoed last year.
  • mosquitobiten23 hours ago
    Every character goes forward only; permanence is still out of reach, apparently.
    • mikelevins23 hours ago
      I've been experimenting with that from a slightly different angle: teaching Claude how to play and referee a pencil-and-paper RPG that I developed over about 20 years, starting in the mid 1970s. Claude can't quite do it yet for reasons related to permanence and learning over time, but it can do surprisingly well up until it runs into those problems, and it's possible to help it past some obstacles.

      The game is called "Explorers' Guild", or "xg" for short. It's easier for Claude to act as a player than a director (xg's version of a dungeon master or game master), again mainly because of permanence and learning issues, but to the extent that I can help it past those issues it's also fairly good at acting as a director. It does require some pretty specific stuff in the system prompt to, for example, avoid confabulating stuff that doesn't fit the world or the scenario.

      But to really build a version of xg on Claude, it needs better ways to remember and improve what it has learned about playing the game, and what it has learned about a specific group of players in a specific scenario as it develops over time.
    • matt_LLVW19 hours ago
      Looks good to me: https://youtu.be/15KtGNgpVnE?t=648&si=urWJGEFWuN5veh43
      • mosquitobiten11 hours ago
        The castles multiply and change positions in the background.
  • ge9622 hours ago
    Damn, that was crazy - the picture of the tabletop setup/cardboard robot, and it becomes 3D interactive.
  • reneberlin13 hours ago
    The subscribers to simulations from the pr0n industry and the billions of lonely humanoids will suffocate in their VR headsets if we don't think about sensors to watch their oxygen levels.
  • z3t49 hours ago
    When I try to run it I get: "We don't know what you're looking for, but we hope you find it"
  • bpiche20 hours ago
    This is the plot of The Peripheral, right? Love the way the second half of that book turned out. Never finished Agency...
    • qingcharles16 hours ago
      Agency was great. I read it when it came out and never realized until a couple of weeks ago that it was a sequel to The Peripheral.
  • Havoc18 hours ago
    Are world models from the perspective of an observer in the world, or zoomed out?

    Or, in gaming terms, do these models think FPS or RTS?

    Text models and pixel-grid vision models are easy, but I'm struggling to wrap my head around what a world model "sees", so to speak.
  • 0x1ceb00da17 hours ago
    Someone please create a world with this: https://giphy.com/gifs/6pUjuQQX9kEfSe604w
  • binsquare21 hours ago
    Its ability to keep simulated physics intact is actually a huge breakthrough.

    I can't even fathom what it will mean for the future of simulation and the physical world when it gets far more accurate and realistic.
    • Bjorkbat20 hours ago
      Ironically, the physics are kind of my biggest criticism. They call these "world models", but I think it's more accurate to call them "video game models", because they employ "video game physics" rather than real-world physics, among other things.

      This is most evident in the way things collide.
      • binsquare20 hours ago
        It's getting better staggeringly fast; just a year ago I wouldn't have expected it to reach even video-game-physics level so quickly.

        If it continues to improve at a rate similar to LLMs, a way to simulate fluid dynamics or structural dynamics with reasonable accuracy and speed could unlock a much faster pace of innovation in the physical world (validated with rigorous scientific methods).
        • MITSardine15 hours ago
          Numerical simulation is a well-explored field; we know how to do all sorts of things, and the issues lie in the tooling and the robustness of it all put together (from geometry to numerical results) rather than in conceptual barriers. Finite differences have existed since the 1700s! What hadn't, for the longest time, is the computational power to crunch billions of operations per simulation.

          A nice thing about numerical simulation from first principles is that it innately supports arbitrary speed/precision; that's in fact the backbone of the mathematical analysis of why it works.

          In some cases, as for CFD, we're actually mathematically screwed, because you just have to resolve the small scales to get the macro dynamics. So the standard remains a kind of hack, which is to introduce additional equations (turbulence models) that steer the dynamics in place of the small (unresolved) scales. We know how to do better (DNS), but it costs an arm and a leg (like years to millennia on a supercomputer).
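          For a flavor of how little conceptual machinery is involved, here is a minimal sketch of explicit finite differences for the 1D heat equation u_t = k * u_xx (the standard textbook scheme, not any particular solver):

              import numpy as np

              nx = 50
              k, dx, dt = 1.0, 1.0 / nx, 5e-5  # dt within the bound dx**2 / (2*k)
              u = np.zeros(nx)
              u[nx // 2] = 1.0                 # heat spike mid-rod, ends held at 0

              for _ in range(2000):
                  # Central-difference stencil for u_xx. Halving dx forces dt
                  # down 4x - a taste of why resolving fine scales is costly.
                  u[1:-1] += k * dt / dx**2 * (u[2:] - 2.0 * u[1:-1] + u[:-2])

              print(round(float(u.max()), 4))  # the spike has diffused into a bump

          The CFD trouble above is what happens when that resolution/stability coupling meets 3D turbulence: the honest dx is tiny, hence DNS on a supercomputer, or turbulence models as the workaround.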
  • RivieraKid23 hours ago
    This would be really cool if polished and integrated with VR.
    • cpeth20 hours ago
      Exactly this - it would essentially be an ST:TNG holodeck.
      • qingcharles16 hours ago
        The engineers said in an interview today that full reality simulation is their goal, aka holodeck.
  • lurker61613 hours ago
    Are there any open source projects like this?
  • srameshc23 hours ago
    What’s the endgame here? For a small gaming studio, what are the actual implications?
    • xyzsparetimexyz23 hours ago
      It means you should go the other way. Open world winning against smaller, handcrafted environments and stories was generally a mistake, and so is this.
      • mediaman22 hours ago
        What does it mean, that open world winning was a mistake? That the market is wrong, that people's preferences were incorrect, and that they should prefer small handcrafted environments instead of what they actually seem to buy?
    • in-silico22 hours ago
      The endgame has nothing to do with gaming.

      The goal of world models like Genie is to be a way for AI and robots to "imagine" things. Then they could practice tasks inside the simulated world, or reason about actions by simulating their outcome.
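      A hedged sketch of what "practicing inside the simulated world" could look like (a random-shooting planner over a learned model; `sample_action_sequence`, `predict_next`, and `score` are hypothetical stand-ins, not any published API):

      ```python
      # Sketch: plan by imagination. Roll candidate action sequences forward
      # inside a learned world model and keep the one predicted to work best.
      # All names here are hypothetical placeholders.
      def plan_in_imagination(world_model, state, goal, horizon=16, n_candidates=64):
          best_actions, best_score = None, float("-inf")
          for _ in range(n_candidates):
              actions = world_model.sample_action_sequence(horizon)  # random proposals
              sim_state = state
              for action in actions:
                  sim_state = world_model.predict_next(sim_state, action)  # imagined step
              score = world_model.score(sim_state, goal)  # predicted task success
              if score > best_score:
                  best_actions, best_score = actions, score
          return best_actions  # execute in the real world, then replan
      ```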
    • hiccuphippo23 hours ago
      It seems to be generating images in real time, not 3d scenes. It might still be useful for prototyping.
      • saberience22 hours ago
        There are collisions, though, and seemingly physics, so it doesn't seem like a huge stretch that this could be used for games.
    • aurumque23 hours ago
      I would think that building an environment which can then be managed by a game engine is the first pass. In a few years, when we are able to render more than 60 seconds, it could very well replace the game engine entirely, just rendering everything in real time based on user interactions. The final phase is prompts that turn directly into interactive games, maybe even multiplayer.

      When I see the progress we've made on things like DOOM, where the model can infer the proper rendering of actions like firing weapons and even update scores on hits, it doesn't feel like we're very far off; a few years at most. For a game studio that could mean cutting out almost everything between keyboard and display, but for now, just replacing the asset pipeline is huge.
      • mikewittie22 hours ago
        We seem to think that Genie is good at the creative part, but bad at the consistency and performance part. How hard would it be to take 60 seconds of Genie output and pipe it into a model that generates a consistent and performant 3D environment?
    • educasean23 hours ago
      I understand the ultimate end goal to be simulation of life. A near perfect replica of the real world we can use to simulate and test medicine, economy, and social impact.
    • rvz21 hours ago
      Screensavers for robots?
  • brador5 hours ago
    It's a controllable camera traversing a 3D world.

    Notice we didn't see the cat go behind the couch. Maladaptive cut.

    They also don't mention that the longer it runs past around 15 seconds, the more it hallucinates, until it's a garbled mess.
  • bigblind20 hours ago
    Anyone else trying it and just getting a 404 page?
    • CamperBob215 hours ago
      It came up for me and accepted a photo, but it has just been stuck in the "Don't go anywhere, it's almost ready" state for 10+ minutes.

      No idea how long it is supposed to take. They can pull a 3D world out of thin air, but they apparently can't vibe-code a progress bar...

      Edit: Now it's saying "We'll notify you when it's ready, and you'll have 30 seconds to enter your world. You are 37th in the queue." Go to the restroom, come back 1 minute later: "The time to enter your world ran out." Lame-o.
      • CamperBob211 minutes ago
        Edit 2: It works, but not in Firefox. You have to use Chrome, and no, it doesn't tell you this. I don't know what I expected...
  • dominick-cc20 hours ago
    Finally all my anime figurines will come to life
  • artur_makly19 hours ago
    Let's reboot Leisure Suit Larry ;-)
  • dangoodmanUT17 hours ago
    This will go crazy for kids - being able to run around as a doll or action figure in their room.
  • sebasv_19 hours ago
    I am stumped. Am I misreading, or are the folks at Google deliberately conflating two interpretations of "world model"? Don't get me wrong, this is really cool, and it will undoubtedly have its uses. But what I am seeing is an LLM that can generate textures to be fed into a human-coded 3D engine (the "world model" that is demonstrated), and I fail to see how that brings us closer to AGI. For AGI we need "world models" in the sense of "belief systems": the AI must be able to reason about (learned) dynamics, which I don't see reflected in the text or video.
    • littlekey18 hours ago
      > an LLM that can generate textures to be fed into a human-coded 3D engine

      I'm not certain, but I think the LLM is also generating the physics itself. It's generating rules based on its training data; e.g., watch a cat walk enough and you can simulate how the cat moves in the generated "world".
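      If that's right, the "physics" is just autoregressive next-frame prediction conditioned on an action, roughly like this conceptual sketch (all names are made up, not Genie's actual interface):

      ```python
      # Sketch: learned dynamics as next-frame prediction. The model never sees
      # equations of motion; it learns P(next frame | frames so far, action)
      # from video, so plausible motion falls out of the training data.
      # `dynamics_model` and its method are hypothetical.
      def rollout(dynamics_model, first_frame, actions):
          frames = [first_frame]
          for action in actions:
              frames.append(dynamics_model.predict_next_frame(frames, action))
          return frames  # small errors compound, which is why long rollouts drift
      ```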
  • spullara18 hours ago
    Has the person who designed the movement control ever played a video game?
  • numerogeek5 hours ago
    Am I the only one who thinks it's as beautiful as it is useless?
  • adventured21 hours ago
    This is as good a place to mark it as any.

    Humanity goes into the box and it never comes back out. It's better *in there* than it is *out there* for 99% of the population.
  • cloudflare72822 hours ago
    We will probably see Ready Player One in a few decades. Hoping to stay alive till then.
    • lexandstuff22 hours ago
      The mass-poverty and climate-change-ravaged world parts, I could definitely see.
    • HardCodedBias22 hours ago
      Decades?

      I mean, yes, the probability of having that level of tech within decades is quite high.

      But the technology is moving very fast right now. It sounds crazy, but I think there is a 50% chance of having Ready Player One-level technology before 2030.

      It's absolutely possible it will take more time to become economical.
      • qingcharles16 hours ago
        Right. This thing has come light years in two years and money is pouring into the space now. Never mind the sheer amount of competition that is driving innovation in world simulation. Just within a couple of years this thing will be wild.
  • forrestthewoods13 hours ago
    SPIN THE CAMERA 1080 DEGREES YOU COWARDS

    The only test I ever want to see with these frame-gen models is a full 1080-degree camera spin. Miss me with that 30-degree back-and-forth crap. I want multiple full turns. Some jitter and a little back-and-forth wobble is fine. But I want multiple full spins.

    I'm pretty sure I know why they don't do this :)
    • pcthrowaway2 hours ago
      > SPIN THE CAMERA 1080 DEGREES YOU COWARDS

      > a full 1080-degree camera spin

      Do you mean 3 full turns, or do you mean 180 (one half-turn)?
  • anxtyinmgmt23 hours ago
    Demis stays cooking
  • lysace15 hours ago
    Prediction: in true Google fashion, they will never spend enough time on this really cool tech demo to make it genuinely useful in any way.

    In 6-12 months they will announce another really cool tech demo. And so on.

    They have been doing this for decades. To us this seems like the starting point of something really cool. To them it's a delivery; finally time to move on to something else.
  • kittikitti16 hours ago
    This is really great, and to me it feels like another ChatGPT moment. Thank you, Google! This product can easily leap them over the competition. I had originally dismissed Yann LeCun's take on world models, and I now feel foolish.
  • hn_user_987617 hours ago
    This is a very interesting development. The implications for interactive world-building are quite significant.
  • user_hn_82718 hours ago
    This is a fascinating project. The idea of infinite interactive worlds is a huge leap for gaming and simulation.
  • throwaway31415518 hours ago
    The "How we're building responsibly" section has nothing to do with acting responsibly. It should be called "Limitations" instead. Honestly, the section reads as LLM-generated.
  • moohaad22 hours ago
    Everyone will make their own game now.
  • cyrusradfar17 hours ago
    Now let's cross this with the Game of Life, with a lot more processing, and see what happens.
  • JaiRathore22 hours ago
    I now believe we live in a simulation
  • mupuff123420 hours ago
    If only Google had the technology for game streaming... Oh wait.

    RIP Stadia.
    • WhereIsTheTruth19 hours ago
      Stadia was light years ahead, but pro-Microsoft media assassinated it with FUD.

      While "journalists" were busy bootlicking a laggy, 720p, Android-only xCloud beta, Stadia was already delivering flawless 4K@60FPS in a web browser.

      They killed the only platform that actually worked, just to protect Microsoft.

      This will be a textbook case study in how a legacy monopoly kills innovation to protect its own mediocrity.

      Microsoft won't survive the century; they are a dinosaur on borrowed time that has already lost the war in mobile, AI, and robotics.

      They don't create, they just buy market share to suffocate the competition, and they ruin every product they touch.

      Even their cloud dominance is about to end, as they are already losing their grip on the European market to antitrust and sovereign alternatives.
      • reneberlin13 hours ago
        I remember a video where they flew through skyscrapers, zooming into a room where life was going on, and there was a screen inside the room with something running on it. I never understood how this got tanked. It was revolutionary.
  • almosthere18 hours ago
    So what is it doing in the real world: microwaving an elephant on high with 80 kW every second and pouring out all the water in a sub-Saharan African well every 4 minutes?
  • gambiting20 hours ago
    > How we're building responsibly

    How are you justifying the enormous energy cost this toy is using, exactly?

    I don't find anything "responsible" about this. And it doesn't even seem like something that has any actual use - it's literally just a toy.
    • Morromist17 hours ago
      I think everyone has forgotten who Google is here. Google makes cool things and then destroys them. Like Stadia, as someone else mentioned. This thing is already hovering over the trash bin.

      Of course, maybe it's a bridge to something else, but all I see is an image generator that works really fast, so nothing novel.
  • analog837422 hours ago
    If creating an infinite world is so trivially easy (relatively speaking), then Occam suggests that this world is generated.
    • avaer21 hours ago
      I don't know if that's the simplest explanation, considering how insanely complex the generation is (these world models might literally be the most complex things ever created).

      But I do think it's a partial existence proof.
      • analog837421 hours ago
        One big simplifier is to only render what you're looking at. I wonder how one might demonstrate that.