AI Evals: The Unit Tests for the Non-Deterministic Parts of Your App
Building an app on top of a language model means part of your code can now return a different answer each time you run it. Here's how to keep that part honest, with a tiny, complete, runnable Ruby app and a real eval harness that tests it end to end.