Self-attention models lack an inherent sense of word order. This article explains positional encoding in Transformers from first principles, showing how sine–cosine functions efficiently encode both absolute and relative positions and give the model access to token order.
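As a first illustration, here is a minimal sketch of the sine–cosine positional encoding from the original Transformer paper, written in Python with NumPy; the function name, argument names, and example dimensions are my own choices, not taken from the article.

```python
import numpy as np

def sinusoidal_positional_encoding(seq_len: int, d_model: int) -> np.ndarray:
    """Return a (seq_len, d_model) matrix of sine-cosine positional encodings."""
    positions = np.arange(seq_len)[:, np.newaxis]            # (seq_len, 1)
    dims = np.arange(0, d_model, 2)[np.newaxis, :]           # (1, d_model/2)
    angle_rates = 1.0 / np.power(10000.0, dims / d_model)    # one frequency per sin/cos pair
    angles = positions * angle_rates                          # (seq_len, d_model/2)

    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)   # even dimensions use sine
    pe[:, 1::2] = np.cos(angles)   # odd dimensions use cosine
    return pe

# Example: encodings for a 50-token sequence with model dimension 128
pe = sinusoidal_positional_encoding(50, 128)
print(pe.shape)  # (50, 128)
```

In a typical setup this matrix is simply added to the token embeddings before the first attention layer, which is how the order information reaches the otherwise order-agnostic self-attention mechanism.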