
The Bitter Lesson: The history of reinforcement learning

June 13, 2026

CoRecursive: Coding Stories

Description

I've been trying to understand how machine learning actually works. Not use it, understand it, down to the ifs and loops. How does a program built out of plain conditionals get better on its own?

So late one night I sent Don a paper. Three words in the title: reward is enough. The claim is that all of intelligence, the whole thing, comes down to a system maximizing a reward. Don thought that was far too reductive. I wanted to pull it apart and see if it held up. We backed up through the history to find out how far "reward is enough" really goes: B.F. Skinner training pigeons, a backgammon program that taught itself, the Go move no human would have played.

It's a story about machine learning, and what that leaves for the rest of us who still do it by hand.