The architecture of the neural weather model, MetNet. The input satellite and radar images first pass through a spatial downsampler to reduce memory consumption. They are then processed by a convolutional LSTM at 15 minute intervals over the 90 minutes of input data. Then axial attention layers are used to make the network see the entirety of the input images.