I made some assumptions based on H100s and models around the 4o size. Running them locally changes the equation, of course - any compute that can be distributed enjoys economies of scale and benefits from well-worn optimizations that don't apply to locally run, single-user hardware.<p>Also, for AI specifically, the actual number varies with MoE and other sparsity tactics, caching, hardware hacks, regenerative capture at the datacenter, and a bajillion other little things. Model routing like OpenAI does further obfuscates the cost per token - a high-capability 8B model is going to run more efficiently than a 600B model across the board, but even the enormous 2T models can generate many tokens for the energy equivalent of burning a µL of gasoline.<p>If you pick a specific model and GPU, or Google's TPUs, or whatever software/hardware combo you like, you can get to the specifics. I chose µL of gasoline to drive the point home: tokens are incredibly cheap, energy is enormously abundant, and we use many orders of magnitude more energy on things we hardly ever think about - it just shows up in the monthly power bill.<p>AC and heating, computers, household appliances, lights - all that stuff uses way more energy than AI. Even if you talked with AI every waking moment, you couldn't outpace the other, far more casual expenditures of energy in your life.<p>A wonderful metric would be average intelligence per token generated: adjust tokens/Joule by an intelligence rank normalized against the human average, then contrast that with the cost per token. That'd tell you the average value per token compared to the equivalent value of a human-generated token.
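To make the gasoline comparison concrete, here's a back-of-envelope sketch in Python. Every number in it (board power, aggregate throughput, energy density) is an assumed ballpark for illustration, not a measurement:

```python
# Back-of-envelope: inference energy per token vs. gasoline.
# All figures below are rough assumptions, not benchmarks.
GASOLINE_J_PER_UL = 34.2   # gasoline is ~34.2 MJ/L, so ~34.2 J per microliter
GPU_POWER_W = 700          # assumed H100 SXM board power under load
TOKENS_PER_SEC = 2000      # assumed aggregate batched throughput for a mid-size model

joules_per_token = GPU_POWER_W / TOKENS_PER_SEC            # ~0.35 J/token
tokens_per_ul_gas = GASOLINE_J_PER_UL / joules_per_token   # ~100 tokens per µL

print(f"~{joules_per_token:.2f} J/token, ~{tokens_per_ul_gas:.0f} tokens per µL of gasoline")
```

Swap in your own throughput and power numbers and the conclusion barely moves: a single microliter of gasoline's worth of energy buys on the order of a hundred tokens.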
You'd probably want to estimate a ballpark for human cognitive efficiency - the tokens/Joule of metabolism - for contrast.<p>Doing something similar for image or music generation would give you a way to value the relative capabilities of different models, and a baseline for ranking human content against generations: a well-constructed meme clip by a skilled creator, an AI song vs a professional musician, an essay or article vs a human journalist, and so on. You could track value over context length, length of output, length of video/audio media, size of image, and so on.<p>Suno and nano banana and Veo and Sora all far exceed the average person's ability to produce images and video, and their value even exceeds that of skilled humans in certain cases - the viral cat-playing-an-instrument-on-the-porch clips, ghiblification, bigfoot vlogs, the AI country song that hit the charts. Value contrasted with cost shows why people want it, and some scale of quality gives an overall ranking, with slop at the bottom and major Hollywood productions, art at the Louvre, Beethoven, and Shakespeare up top.<p>Anyway, even without trying to nail down the relative value of any given token or generation, the costs are trivial. Don't get me wrong: you don't want to usurp all of a small town's potable water and available power infrastructure for a massive datacenter and then tell the residents to pound sand. There are real issues with making sure massive corporations don't trample individuals and small communities. Local problems exist, but at the global scale, AI is providing a tremendous ROI.<p>AI doombait generally trots out the local issues and projects them onto the global scale without rigorously checking the math or the claims, and you end up with lots of outrage and no context or nuance.
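A crude version of that metabolic contrast, with every figure an assumed ballpark (resting brain power, speaking pace, tokens-per-word ratio, and an assumed ~0.35 J/token for a served mid-size model):

```python
# Rough contrast: human "tokens per Joule" vs. an LLM.
# Every number here is an assumption chosen for illustration.
BRAIN_POWER_W = 20       # commonly cited ballpark for the resting human brain
WORDS_PER_MIN = 150      # comfortable speaking pace
TOKENS_PER_WORD = 1.3    # assumed average for a BPE-style tokenizer
LLM_J_PER_TOKEN = 0.35   # assumed datacenter inference cost per token

human_tokens_per_sec = WORDS_PER_MIN * TOKENS_PER_WORD / 60   # ~3.25 tokens/s
human_j_per_token = BRAIN_POWER_W / human_tokens_per_sec      # ~6 J/token
ratio = human_j_per_token / LLM_J_PER_TOKEN                   # human costs ~17x more per token

print(f"human: ~{human_j_per_token:.1f} J/token, LLM: ~{LLM_J_PER_TOKEN} J/token, ratio ~{ratio:.0f}x")
```

Under these assumptions a spoken human token costs an order of magnitude more energy than a generated one - though the point of the metric above is that raw tokens/Joule only matters once you weight it by the intelligence behind each token.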
The reality is that issues at scale do exist, but they're not the issues that get clicks, and the issues with individual use are many orders of magnitude less important than almost anything else an individual could put their time and energy toward fixing.