I implemented this Transformer from scratch as a learning exercise. It is based on The Annotated Transformer by Harvard NLP, which uses PyTorch.

I tested it on a toy problem, so no data loading or tokenization code is needed.

🧸 The toy problem is to reverse a given sequence while replacing every even-numbered occurrence of a digit (its 2nd, 4th, and so on) with a special token ( X ). For example,

input = 0 1 5 9 0 3 5 2 5

input after replacing even repetitions = 0 1 5 9 X 3 X 2 5

reversed = 5 2 X 3 X 9 5 1 0
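The transformation in this example can be sketched in a few lines of plain Python. This is only an illustration of how a target sequence could be produced from an input, not the data-generation code used in this project. The function name `make_target` and the `special` parameter are assumptions made for the sketch.

```python
from collections import Counter


def make_target(seq, special="X"):
    """Replace every even-numbered occurrence of a token, then reverse.

    `make_target` and `special` are hypothetical names for this sketch.
    """
    counts = Counter()
    replaced = []
    for tok in seq:
        counts[tok] += 1
        # The 2nd, 4th, ... occurrence of the same digit becomes the special token.
        replaced.append(special if counts[tok] % 2 == 0 else tok)
    # Reverse the sequence after the replacements have been made.
    return replaced[::-1]


source = [0, 1, 5, 9, 0, 3, 5, 2, 5]
print(make_target(source))  # [5, 2, 'X', 3, 'X', 9, 5, 1, 0]
```

Note that the third 5 is kept, because only even-numbered occurrences are replaced.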

🎫 If you have any questions or comments, please find me on Twitter, @vpj.