AIExplainer

What is an Attention?

A mechanism that lets a model focus on the most relevant parts of its input when producing an output, weighting what matters most in context.

Attention is a mechanism that lets a model focus on the most relevant parts of its input when producing an output. Instead of treating every word or pixel equally, it weighs what matters most in context.

It solved a major bottleneck in processing long sequences and enabled the Transformer architecture.

Attention is like reading a sentence and instinctively emphasising the words that carry the meaning, while barely registering filler words. Your focus shifts depending on what the sentence is actually about.

When translating "The bank by the river" versus "The bank approved the loan," an attention-based model focuses on different words to resolve ambiguity.

Attention is the foundation of Transformers — the architecture behind GPT, Claude, and most modern language models. It also helps translation systems align words across languages.

Attention is not human-like focus or consciousness — it is a mathematical weighting scheme over inputs.

self-attention