7 comments

  • ghostbrainalpha 8 minutes ago
    Is "CRIME" an acronym?

    Or is this actually a law enforcement related example?
    • apwheele 2 minutes ago
      Crime De-coder is my consulting firm (not an acronym), but the book is not specific to crime analysis -- it is more general.
  • schnau_software 1 hour ago
    I want to buy this book! But the price is too high. Can you offer some kind of HN discount?
    • apwheele 1 hour ago
      You can use `LLMDEVS` for 50% off of epub (that was the coupon I sent to folks on my newsletter).
  • clemailacct1 1 hour ago
    I’m always curious why local models aren’t being pushed more for certain types of data the person is handling. Data leakage to a 3rd party LLM is top on my list of concerns.
    • pkress2 1 hour ago
      Worth noting that AWS Bedrock makes it easy to have zero retention with premier Claude models. Not quite local, but it feels local-adjacent for security while getting affordable access to top-performing models... Setting this up on GCP appears to be a bit harder.
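      For reference, a minimal sketch of what calling a Claude model through the Bedrock Converse API looks like with boto3 -- the model ID and region are placeholders, and retention/logging behavior is governed by your account configuration rather than anything in the call itself:

      ```python
      # Minimal Bedrock Converse sketch (placeholders, not a production setup).
      import boto3

      client = boto3.client("bedrock-runtime", region_name="us-east-1")

      response = client.converse(
          modelId="anthropic.claude-3-5-haiku-20241022-v1:0",  # example ID, check your region
          messages=[{"role": "user", "content": [{"text": "Summarize this incident report: ..."}]}],
          inferenceConfig={"maxTokens": 512, "temperature": 0.0},
      )

      # The Converse API returns a list of content blocks; grab the text of the first one.
      print(response["output"]["message"]["content"][0]["text"])
      ```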
      • apwheele 48 minutes ago
        IMO Google Vertex is not any harder than AWS. AWS's biggest pain is figuring out IAM roles for some of the services (batching and S3 Vectors -- I actually cut Knowledge Bases from the book because it was too complicated and expensive). I have not personally had as big an issue figuring out Vertex.

        I do have a follow up post planned on some reliability issues with the APIs that I uncovered from compiling the book so many times -- I would not use Google Maps grounding in production!
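        For comparison, a minimal Vertex call through the google-genai SDK looks something like the sketch below (project, location, and model name are placeholders, and this is a generic illustration rather than an example from the book):

        ```python
        # Minimal sketch of a Vertex AI call via the google-genai SDK.
        # Auth comes from application-default credentials; all values are placeholders.
        from google import genai

        client = genai.Client(vertexai=True, project="my-gcp-project", location="us-central1")

        response = client.models.generate_content(
            model="gemini-2.5-flash",  # example model name
            contents="Classify this incident narrative: ...",
        )
        print(response.text)
        ```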
    • apwheele 1 hour ago
      I am not as concerned about that with API usage as I am with the GUI tools.

      Most of the day gig is structured extraction and agents, and the foundation LLMs are much better at those than any of the small models. (And I would not be able to provision the necessary compute for large models given our throughput.)

      I do have on the ToDo list though evaluating Textract vs the smaller OCR models (in the book I show using docling; there are others though, like the newer GLM-OCR). Our spend for that on AWS is large enough, and those models are small enough, that I could spin up resources sufficient to meet our demand.

      Part of the reason the book goes through examples with AWS/Google (in addition to OpenAI/Anthropic) is that I suspect many individuals will be stuck with the cloud provider their org uses out of the box. So I wanted to have as wide of coverage as possible for those folks.
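      For the curious, a minimal docling sketch (not the book's exact code; the file path is a placeholder):

      ```python
      # Minimal docling sketch: convert a scanned PDF and export the text.
      # Real pipelines would batch files and add error handling.
      from docling.document_converter import DocumentConverter

      converter = DocumentConverter()
      result = converter.convert("incident_report.pdf")

      # Markdown output can then feed downstream structured-extraction prompts.
      print(result.document.export_to_markdown())
      ```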
    • iririririr 1 hour ago
      but they claim your data is private and they will totally not share any of it with their advertising partners!
  • cranberryturkey 2 hours ago
    Biggest gap I see in most "LLM for practitioners" guides is they skip the evaluation piece. Getting a prompt working on 5 examples is easy -- knowing if it actually generalizes across your domain is the hard part. Especially for analysts who are used to statistical rigor, the vibes-based evaluation most LLM tutorials teach feels deeply unsatisfying.

    Does this guide cover systematic eval at all?
    • apwheele 2 hours ago
      Totally agree it is critical. Chapters 4/5/6 each have specific sections demonstrating testing. For structured outputs, it goes through an example ground truth and calculating accuracy, demoing a comparison of Haiku 3 vs 4.5.

      For Chapter 5 on RAG, it goes through precision/recall (with emphasis typically on recall for RAG systems).

      For Chapter 6, I show a demo of LLM as a judge (using structured outputs with specific errors for it to look for) to evaluate a fuzzier objective (writing a report based on table output).
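      Not the book's code, but to make the idea concrete, a toy field-level accuracy check might look like this -- the field names and records are made up for illustration:

      ```python
      # Toy field-level accuracy check for structured extraction against a
      # hand-labeled ground truth. Fields and records are invented examples.
      ground_truth = [
          {"date": "2024-03-01", "offense": "burglary", "address": "123 Main St"},
          {"date": "2024-03-02", "offense": "assault", "address": "456 Oak Ave"},
      ]
      predictions = [
          {"date": "2024-03-01", "offense": "burglary", "address": "123 Main Street"},
          {"date": "2024-03-02", "offense": "assault", "address": "456 Oak Ave"},
      ]

      fields = ["date", "offense", "address"]
      correct = sum(
          gt[f] == pred[f]
          for gt, pred in zip(ground_truth, predictions)
          for f in fields
      )
      total = len(ground_truth) * len(fields)
      print(f"Field-level accuracy: {correct / total:.2%}")  # 5 of 6 fields match
      ```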
  • Schlagbohrer 3 hours ago
    thought it said Large Lagrange Models