Notes

  • Restart TransformerLSTM hyperparameter tuning with causal_attention=False. Remove the MSE regularization parameters from the search, extend num_transformer_blocks to include 6 and 8, and increase the context length to 2000 or 3000, then check whether these changes improve performance.
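
The changes listed above could be applied to an existing search space roughly as follows. This is only a sketch: the dictionary layout, the key names `mse_reg_weight` and `context_length`, the helper `revise_search_space`, and all values in `old_space` are hypothetical placeholders, since the note does not give the previous configuration. Only `causal_attention`, `num_transformer_blocks`, and the values 6, 8, 2000, and 3000 come from the note.

```python
from copy import deepcopy


def revise_search_space(space):
    """Return a copy of `space` with the changes from the note applied."""
    revised = deepcopy(space)

    # Fix attention to non-causal for the restarted tuning run.
    revised["causal_attention"] = [False]

    # Drop MSE regularization parameters.
    # Assumption: such parameters have "mse" somewhere in their key name.
    for key in [k for k in revised if "mse" in k.lower()]:
        del revised[key]

    # Extend (rather than replace) the candidate block counts with 6 and 8.
    blocks = set(revised.get("num_transformer_blocks", [])) | {6, 8}
    revised["num_transformer_blocks"] = sorted(blocks)

    # Candidate context lengths named in the note.
    revised["context_length"] = [2000, 3000]

    return revised


# Placeholder values only; the real previous search space is not given.
old_space = {
    "causal_attention": [True, False],
    "mse_reg_weight": [0.0, 0.1],
    "num_transformer_blocks": [2, 4],
    "context_length": [1000],
}

new_space = revise_search_space(old_space)
print(new_space)
```

Keeping the revision in a function that copies the old space leaves the earlier configuration untouched, which makes it easier to compare the restarted run against the previous one.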