The Eval Stack Top AI Teams Are Building Right Now
Thu, Jul 16, 04:00 PM - 04:30 PM UTC
Speakers
Martin Fisch, Head of Machine Learning, Encord
Wei-Yin Ko, Member of Technical Staff, Adaption
Jesse Willman, Engineering Program Management, Machine Learning, Cohere
Register now
Fill out your details below and we'll send you the dial-in link. If you can't make it, fill out the form anyway and we'll send you the webinar recording.
Live Panel in collaboration with AI Circle & Cohere
As models get more capable, automated evals stop telling you much. The remaining signal on whether a model is actually improving comes from structured human judgment at scale.
Most teams don't have the infrastructure to produce it. This session covers how the teams that do have built theirs.
What we'll cover:
- Where automated evals fall short and what that tells us about what human feedback still needs to do
- What separates a rigorous human eval pipeline from ad-hoc annotation
- The failure modes teams keep hitting when they try to scale human feedback, and how to avoid them
- Where this is all heading as models get more capable
Register to attend live or receive the recording.
