Today's Margin
Thursday, June 18
The idea that taught machines to read: meet the Transformer
orig. “Attention Is All You Need” · Ashish Vaswani, Noam Shazeer, Niki Parmar, et al.
This could make your translation apps much better.
Imagine you're trying to translate a sentence from English to French. You might look up each word in a dictionary. But that's not enough. You need to understand how the words fit together. That's what this paper is about.
The researchers wanted to make computers better at understanding sentences. They built a new kind of neural network (a computer program that learns from examples) called a transformer. The transformer looks at a whole sentence at once and figures out which words matter most to each other. This is called attention (a way for the AI to focus on the most relevant parts of the input).
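For readers who like to see the idea in code, here is a minimal sketch of attention using the paper's scaled dot-product formula. It is not the researchers' implementation. The three words and their two-number vectors are made up for illustration, and the same vectors stand in for the queries, keys, and values, which a real transformer would learn separately.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtracting the max keeps exp() from overflowing
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    d_k = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        # Score this word against every word in the sentence.
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights = softmax(scores)
        # Blend the value vectors using those weights.
        mixed = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(mixed)
        all_weights.append(weights)
    return outputs, all_weights


# Hypothetical vectors for a three-word sentence (assumed, not from the paper).
words = ["the", "cat", "sat"]
vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# Simplification: reuse the same vectors as queries, keys, and values.
outputs, weights = attention(vectors, vectors, vectors)

for word, row in zip(words, weights):
    print(word, [round(w, 2) for w in row])
```

Each printed row shows how much one word "looks at" every word in the sentence. The weights in a row always add up to 1, and every word is compared with every other word in the same pass.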
Earlier AI models read words one at a time and carry along a memory of what they saw before. That is slow, because each step has to wait for the one before it. The transformer is different. It looks at all the words at once, so much of the work can be done in parallel and training is much faster.
The researchers tested the transformer on two translation tasks: English to German and English to French. It beat earlier models on both. On the German task it scored 28.4 BLEU (a standard score for translation quality; higher is better), more than 2 points above the previous best. On the French task it scored 41.8 BLEU, the best single-model score at the time.
The transformer also works on other tasks. The researchers tried it on a task called English constituency parsing, which is like breaking a sentence down into its grammatical parts. The transformer did well on this task too, both with lots of training data and with only a little.
- Your phone's translation app could get much better. It could translate sentences faster and more accurately.
- Doctors could use it to translate medical notes quickly (medical notes are full of specialized terms).
- Students could use it to translate school books into their own language.
- Travelers could use it to talk to people in other countries. It could make trips smoother and more enjoyable.
- News websites could use it to translate articles into many languages quickly. This could help spread information faster.
- It could help computers understand us better, which could improve voice assistants and chatbots.