Introduction to interpretability
Mechanistic interpretability is the art/science of understanding what's going on inside AI models. This video is a good introduction to interpretability, including for non-technical people.
I listen to some 20 podcast episodes each week and share the most valuable ones here on AI Podcast Picks. This is one of those episodes. A full list of AI-related podcasts that I follow can be found here. See the Falk AI Substack for my writings on AI.
Episode title: Interpretability: Understanding how AI models think
Podcast: Anthropic
Release date: 2025-08-16
Ideal for: Anyone who wants to start understanding what mechanistic interpretability is and why it is useful.
Links: YouTube (no audio podcast available)

