The moment computers learned to see

orig. “ImageNet Classification with Deep Convolutional Neural Networks” · Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton

Deep Learning · Beginner · 4 min read · AI-assisted, reviewed by Marginalia Editorial

In the margin

Area

Deep Learning: a family of methods that stack many simple layers so a model can learn rich patterns. It is the engine behind most modern AI.

This research changed how computers see the world.

What's going on

For a long time, it was hard to train computers to tell apart pictures of cats and dogs.

A group of researchers trained a deep convolutional neural network on a huge set of labeled pictures. A neural network is a computer program that learns from examples. A convolutional one looks at an image in small pieces, then combines those pieces to find shapes and objects. The training set held about 1.2 million pictures from the ImageNet collection, and each picture had a label saying which of 1,000 kinds of object it showed.
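To make "looks at an image in small pieces" concrete, here is a minimal sketch in plain Python. It is an illustration only, not the researchers' code: the function name `slide_filter`, the toy 4x4 image, and the hand-picked filter are all assumptions made for this example. In a real network the filter values are learned from the training pictures, and there are many filters stacked in layers.

```python
def slide_filter(image, kernel):
    """Slide a small kernel over a 2D image and record its response at each position."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for top in range(out_h):
        row = []
        for left in range(out_w):
            # Multiply the small patch by the kernel, value by value, and add it up.
            total = 0
            for i in range(kh):
                for j in range(kw):
                    total += image[top + i][left + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output


# Toy 4x4 grayscale image (hypothetical): dark (0) on the left, bright (1) on the right.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Hand-picked 2x2 filter (hypothetical) that responds where brightness rises left to right.
edge_kernel = [
    [-1, 1],
    [-1, 1],
]

response = slide_filter(image, edge_kernel)
for row in response:
    print(row)
# Each row prints [0, 2, 0]: the filter only "lights up" over the dark-to-bright edge.
```

The filter sees only four pixels at a time, yet the grid of responses shows where the edge is. Later layers of a network combine many such responses into shapes and, eventually, objects.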

They trained the network on graphics cards (GPUs), chips that can run many calculations at the same time. This made training much faster. They entered the network in the 2012 ImageNet contest, a well-known competition for recognizing objects in pictures. It won by a wide margin: for 84.7% of the test pictures, the correct label was among the network's five best guesses. The runner-up managed 73.8%.
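The contest score behind the 84.7% figure is a "top-5" score: a picture counts as right when the true label appears among the model's five best guesses. The sketch below shows how such a score can be computed. The function name `top5_accuracy` and the four example pictures are made up for illustration and are not contest data.

```python
def top5_accuracy(ranked_guesses, true_labels):
    """Fraction of pictures whose true label is among the model's first five guesses."""
    hits = sum(
        1
        for guesses, truth in zip(ranked_guesses, true_labels)
        if truth in guesses[:5]
    )
    return hits / len(true_labels)


# Made-up guesses for four pictures, best guess first.
ranked_guesses = [
    ["cat", "lynx", "dog", "fox", "rabbit"],
    ["dog", "wolf", "fox", "cat", "lynx"],
    ["truck", "car", "bus", "van", "train"],
    ["cup", "bowl", "vase", "pot", "jar"],
]
true_labels = ["cat", "fox", "bicycle", "vase"]

accuracy = top5_accuracy(ranked_guesses, true_labels)
print(accuracy)  # 0.75: three of the four true labels appear in the top five
```

Contest results are often quoted the other way round, as an error rate, which is simply one minus this number.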

This work changed how people thought about training computers to see. Before this, most vision systems depended on rules that people designed by hand to tell the computer what to look for. Afterward, researchers increasingly let neural networks learn those patterns from examples. Today neural networks are used in medical imaging, self-driving cars, and many other places.

Why it matters

Doctors use neural networks to help spot diseases in X-rays (pictures of the inside of your body).
Your phone uses a neural network to recognize your face when you unlock it.
Self-driving cars use neural networks to see the road and avoid obstacles.
Neural networks help translate languages by recognizing patterns in text.
They can help blind people by describing what's in a picture.
They can spot fraud by looking at patterns in money transactions.

Source

Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton, University of Toronto

Member notes

Marginalia Editorial (team)

We read the full paper and rewrote it in plain language.

Jordan K. (member)

The real unlock here was the GPUs. The idea had been around for years; the hardware just got fast enough for it to actually work.

Aisha N. (member)

This is the before-and-after moment for AI vision. A lot of self-driving research traces back to roughly here.

Leo M. (member)

Reminder that telling a cat from a dog used to be a genuinely hard research problem. We forget how big this jump was.

Alex Dong (member)

Shout out to me myself.
