Transformers have revolutionized deep learning, but have you ever wondered how the decoder in a transformer actually works?
A 'transformer' model was introduced, which consists of an encoder part and a decoder part. The paper also introduces a number of modern foundational concepts such as positional input encoding.