4 comments

  • jdw64 1 hour ago
    Interpreting these metrics is quite interesting.

    One thing is for sure: while Claude currently takes the #1 spot in mentions, it carries a lot of negative sentiment due to API pricing policies and frequent server downtime. The runner-up, GPT-5.5, actually seems to get more positive feedback.

    Personally, my experience with Codex wasn't as good as with Claude Code (Codex freezes on Windows more often than you'd expect), so this is a bit surprising. That said, the more defensive GPT is definitely better in terms of sheer code-writing capability. However, GPT has quite a few issues with text corruption when generating Korean or Chinese, something English-speaking users probably don't notice. In terms of model capabilities, when given the same agent.md (CLAUDE.md) file, I think GPT is better at writing code, while Claude is better at writing prose during code reviews.

    Looking at the bottom right, Qwen and DeepSeek are open-source, so they are largely mentioned in the context of guarding against vendor lock-in, which drives positive sentiment. Considering that Hacker News occasionally shows negative sentiment toward China, the fact that these models are viewed this positively, unlike the US models, shows that being open-source is a massive advantage in itself.

    Anyway, one thing is for sure: Gemini is pretty much unusable.
  • Jabbles 50 minutes ago
    Please fix your graph so the names of the models are readable.
    • marcuskaz 32 minutes ago
      Also, the stacked graph only lets you quickly see total mentions; it's really hard to compare negative or positive sentiment across models at a glance.
  • yakkomajuri 49 minutes ago
    "Prompts an LLM" -> which LLM?

    I saw you're using Gemini for the sentiment rating (which I guess you picked because it's not often mentioned and thus "neutral"? lol)

    But it would be interesting to get more details overall.
  • ranger_danger 30 minutes ago
    Just FYI, this article seems to define "state of the art" as "popular", as measured by "total mentions and user sentiment", without any bearing on the technical abilities or actual usage of the models.
    • mellosouls 18 minutes ago
      That's pretty much exactly what the title says.

      The technical abilities and usage are derived from the commenters' own usage reflections.