We propose a new model architecture, xTFT (Extended Temporal Fusion Transformer), which is based on the vanilla TFT and uses the recently published xLSTM as a drop-in replacement for the LSTM.

The implementation released by NX-AI (https://github.com/NX-AI/xlstm) has an optimized kernel for computing the mLSTM in matrix form, and the default implementation of its forward method does not support outputting the hidden states. This makes it unsuitable as-is for the TFT, where the hidden state is passed from the encoder LSTM to the decoder LSTM. Thus, the following needs to be done:

  • Investigate xLSTM hidden state computation to return the last hidden state [priority:: medium] [completion:: 2024-10-14]
  • Implement xTFT with xLSTM as backend [priority:: medium] [completion:: 2024-10-14]
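
For reference, the state handoff that any xLSTM backend would have to reproduce can be sketched with the standard PyTorch `nn.LSTM`. This is only a minimal illustration, not the TFT or xLSTM code: the class name `EncoderDecoderLSTM`, the tensor sizes, and the single-layer setup are assumptions made for the example.

```python
import torch
from torch import nn


class EncoderDecoderLSTM(nn.Module):
    """Toy stand-in for the TFT sequence layer (hypothetical name)."""

    def __init__(self, input_size: int, hidden_size: int) -> None:
        super().__init__()
        self.encoder = nn.LSTM(input_size, hidden_size, batch_first=True)
        self.decoder = nn.LSTM(input_size, hidden_size, batch_first=True)

    def forward(self, past: torch.Tensor, future: torch.Tensor):
        # nn.LSTM returns the per-step outputs plus the final (h_n, c_n) state.
        enc_out, (h_n, c_n) = self.encoder(past)
        # The decoder is initialised with the encoder's final state. This is
        # the handoff a replacement recurrent layer would also need to expose.
        dec_out, _ = self.decoder(future, (h_n, c_n))
        return torch.cat([enc_out, dec_out], dim=1), (h_n, c_n)


torch.manual_seed(0)
model = EncoderDecoderLSTM(input_size=8, hidden_size=16)
past = torch.randn(4, 24, 8)    # (batch, encoder steps, features), assumed sizes
future = torch.randn(4, 6, 8)   # (batch, decoder steps, features), assumed sizes
out, (h_n, c_n) = model(past, future)
print(out.shape, h_n.shape)
```

The point of the sketch is the interface: the encoder has to return its final state in addition to the per-step outputs, and the decoder has to accept that state as its initial state. A recurrent layer whose forward method only returns the output sequence cannot be slotted into this pattern without modification.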