Intermediate
Transformer Architecture & Attention
Understand the transformer architecture that powers modern LLMs. Learn about self-attention, multi-head attention, and the encoder-decoder structure fundamental to NLP.
4 Lessons
30h Est. Time
4 Objectives
1 Assessment
By completing this module you will be able to:
✓ Explain the self-attention mechanism and its computational advantages
✓ Describe the complete transformer architecture, including the encoder-decoder structure
✓ Explain positional encoding and tokenization
✓ Work with transformer implementations in PyTorch
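To preview the core idea behind the first objective, here is a minimal NumPy sketch of scaled dot-product attention, the building block the lessons expand into multi-head attention. It is an illustrative sketch only; in practice PyTorch ships this as `torch.nn.functional.scaled_dot_product_attention`.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row max for numerical stability before exponentiating.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    # Q, K, V: (seq_len, d_k) arrays of query, key, and value vectors.
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)     # pairwise query-key similarities
    weights = softmax(scores, axis=-1)  # each row is a distribution over positions
    return weights @ V                  # attention-weighted mix of value vectors

# Toy example: 4 positions, 8-dimensional vectors.
rng = np.random.default_rng(0)
Q = rng.standard_normal((4, 8))
K = rng.standard_normal((4, 8))
V = rng.standard_normal((4, 8))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # same shape as V: (4, 8)
```

Every output position is a weighted average of all value vectors, which is why attention can relate any pair of tokens in one step rather than propagating information position by position.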
Lessons
Work through each lesson in order. Each one builds on the concepts from the previous lesson.
1. The Transformer Architecture
2. BERT and Encoder Models
3. GPT and Decoder Models
4. Hugging Face Transformers Library
Recommended Reading
Supplement your learning with these selected chapters from the course library.
Transformers for NLP and Computer Vision, 3rd Edition
Chapters 1-6
Mastering NLP from Foundations to LLMs
Chapters 5-8
Module Assessment
Transformer Architecture & Attention