Terminology #

Overview #

Background #

Why Transformers #

Attention #

Types of Attention #

Using images as the running example in this section.

Self-Attention #

Understanding Q, K, V #
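As a quick reference, here is a minimal NumPy sketch of scaled dot-product attention: each token's query is compared against every key, and the resulting weights mix the values. The single head, the toy shapes, and the random projection matrices are illustrative assumptions, not code from any of the sources above.

```python
# Minimal sketch of scaled dot-product attention (single head, no batching).
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    """Q: (n_q, d_k), K: (n_k, d_k), V: (n_k, d_v)."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)      # similarity of each query to each key
    weights = softmax(scores, axis=-1)   # each row sums to 1
    return weights @ V                   # weighted sum of the values

# Toy usage: 4 tokens, d_model = 8, Q/K/V produced by learned-style projections
rng = np.random.default_rng(0)
X = rng.standard_normal((4, 8))
W_q, W_k, W_v = (rng.standard_normal((8, 8)) for _ in range(3))
out = scaled_dot_product_attention(X @ W_q, X @ W_k, X @ W_v)
print(out.shape)  # (4, 8)
```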

Encoder-Decoder Architecture #

Karpathy Intro to Transformers #

Karpathy’s Understanding of Attention #

More Karpathy Transformer Notes #

Transformer Blocks (Encoder/Decoder Blocks) #
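A minimal sketch of one transformer block, under a pre-LayerNorm, single-head, encoder-style assumption: self-attention and a feed-forward MLP, each wrapped in a residual connection. All weight names, sizes, and the random toy parameters are made up for illustration.

```python
# Minimal single-head, pre-LayerNorm transformer block (encoder-style).
import numpy as np

def softmax(x):
    x = x - x.max(axis=-1, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=-1, keepdims=True)

def layer_norm(x, eps=1e-5):
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def attention(x, W_q, W_k, W_v, W_o):
    Q, K, V = x @ W_q, x @ W_k, x @ W_v
    scores = Q @ K.T / np.sqrt(Q.shape[-1])
    return softmax(scores) @ V @ W_o

def mlp(x, W1, b1, W2, b2):
    return np.maximum(0.0, x @ W1 + b1) @ W2 + b2   # ReLU feed-forward network

def transformer_block(x, p):
    # Residual connections around attention and around the feed-forward network.
    x = x + attention(layer_norm(x), p["W_q"], p["W_k"], p["W_v"], p["W_o"])
    x = x + mlp(layer_norm(x), p["W1"], p["b1"], p["W2"], p["b2"])
    return x

# Toy usage: sequence of 4 tokens, d_model = 8, feed-forward hidden size 32
rng = np.random.default_rng(0)
d, h = 8, 32
p = {k: rng.standard_normal((d, d)) * 0.1 for k in ("W_q", "W_k", "W_v", "W_o")}
p.update(W1=rng.standard_normal((d, h)) * 0.1, b1=np.zeros(h),
         W2=rng.standard_normal((h, d)) * 0.1, b2=np.zeros(d))
x = rng.standard_normal((4, d))
print(transformer_block(x, p).shape)  # (4, 8)
```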

Turning Transformer Decoder Block Output Into a Word #
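A minimal sketch of this last step, assuming a tiny made-up vocabulary and a random projection: the final hidden state is projected to vocabulary logits, softmaxed, and the highest-probability token is taken as the next word (greedy decoding).

```python
# Turn the decoder's final hidden state into a word: logits -> softmax -> argmax.
import numpy as np

rng = np.random.default_rng(0)
vocab = ["the", "cat", "sat", "on", "mat", "<eos>"]   # toy vocabulary (assumption)
d_model = 8

h_last = rng.standard_normal(d_model)                   # hidden state at the last position
W_unembed = rng.standard_normal((d_model, len(vocab)))  # "un-embedding" / LM head

logits = h_last @ W_unembed
probs = np.exp(logits - logits.max())
probs /= probs.sum()                                    # softmax over the vocabulary
print(vocab[int(np.argmax(probs))])                     # greedy choice of the next word
```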

Vision Transformers (ViT) #
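A minimal sketch of the ViT front end, assuming a toy 32x32 RGB image and 8x8 patches: the image is cut into fixed-size patches, each patch is flattened and linearly projected, and the resulting sequence of patch embeddings is what a standard transformer encoder consumes.

```python
# Cut an image into patches and project them into a token sequence (ViT-style).
import numpy as np

rng = np.random.default_rng(0)
H = W = 32          # image height / width (assumption)
C = 3               # channels
P = 8               # patch size -> (32 // 8) * (32 // 8) = 16 patches
d_model = 64

img = rng.standard_normal((H, W, C))

# (H, W, C) -> (num_patches, P*P*C): each row is one flattened patch
patches = (img.reshape(H // P, P, W // P, P, C)
              .transpose(0, 2, 1, 3, 4)
              .reshape(-1, P * P * C))

W_embed = rng.standard_normal((P * P * C, d_model)) * 0.02
tokens = patches @ W_embed        # (16, 64) patch embeddings, ready for a transformer
print(tokens.shape)
```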

Other Implementations #

Future Needs #