One program that learned to play Atari games from the pixels

orig. “Playing Atari with Deep Reinforcement Learning” · Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Alex Graves, Ioannis Antonoglou, Daan Wierstra, Martin Riedmiller

Reinforcement Learning · Intermediate · 4 min read · Written and reviewed by Marginalia Editorial

In the margin

Area

Reinforcement Learning: learning by trial and error, in which an agent takes actions, receives rewards, and works out a strategy that earns more of them.

This system learned to play seven Atari 2600 games on its own, using nothing but the screen and the score.

What's going on

Most game-playing programs are told the rules or given hand-built descriptions of the screen. This one was not. It saw only the raw pixels and the score, then learned by trial and error which actions led to more points. It paired that trial-and-error approach, called reinforcement learning, with a deep neural network that read the screen. It beat earlier learning methods on six of the seven games tested and surpassed an expert human player on three, using the same network architecture and learning algorithm for every game.
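To make "learning by trial and error which actions led to more points" concrete, here is a minimal sketch. The paper trains its deep network with a variant of Q-learning on screen pixels. This sketch swaps the network for a small lookup table and swaps Atari for a hypothetical six-cell corridor where reaching the right end scores one point. The environment, the names (`step`, `train`, `greedy_policy`), and the parameter values are illustrative assumptions, not taken from the paper.

```python
import random

N_STATES = 6        # hypothetical corridor: cells 0..5, the point is scored at cell 5
ACTIONS = (-1, +1)  # step left or step right


def step(state, action):
    """Toy environment (an assumption, not Atari): move along the corridor."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def train(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action_index]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore with probability epsilon; otherwise take the best-known
            # action, breaking ties at random.
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))
            else:
                best = max(q[state])
                a = rng.choice([i for i, v in enumerate(q[state]) if v == best])
            next_state, reward, done = step(state, ACTIONS[a])
            # Nudge the estimate toward: reward now + discounted best value later.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q


def greedy_policy(q):
    """Best-known action for each non-terminal cell."""
    return [
        ACTIONS[max(range(len(ACTIONS)), key=lambda i: q[s][i])]
        for s in range(N_STATES - 1)
    ]
```

Calling `greedy_policy(train())` returns `[1, 1, 1, 1, 1]`: the agent has learned to step right from every cell, even though it was never told where the point is. A table works here because there are only six situations to remember. An Atari screen has far too many, which is why the paper uses a network to estimate the values instead.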

Why it matters

This was a striking demonstration that one general method could learn several different tasks from scratch. It helped launch a decade of work on agents that learn by doing, including systems that play Go, control robots, and tune other AI systems. It remains one of the cleanest early examples of learning a skill purely from feedback.

Source

Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Alex Graves, Ioannis Antonoglou, Daan Wierstra, Martin Riedmiller, DeepMind

