Notes
- Restart TransformerLSTM hyperparameter tuning with causal_attention=False, remove the MSE regularization parameters, extend num_transformer_blocks to include 6 and 8, and increase the context length to 2000 or 3000 to see whether performance improves.
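
A rough sketch of how the search-space changes in the note above might be applied, assuming the tuning setup keeps its search space in a plain dict of candidate values. The key names mse_reg_weight and context_length, the "mse_" prefix match, and every value in old_space are placeholders for illustration, not the actual configuration.

```python
from copy import deepcopy


def update_search_space(space):
    """Return a copy of `space` with the changes from the note applied."""
    new = deepcopy(space)

    # Fix causal attention off for the restarted run.
    new["causal_attention"] = [False]

    # Drop MSE regularization parameters. Matching on an "mse_" prefix
    # is an assumption about how those keys are named.
    for key in [k for k in new if k.startswith("mse_")]:
        del new[key]

    # Extend the existing block counts so that 6 and 8 are included.
    blocks = set(new.get("num_transformer_blocks", [])) | {6, 8}
    new["num_transformer_blocks"] = sorted(blocks)

    # Try the longer context lengths (key name is hypothetical).
    new["context_length"] = [2000, 3000]
    return new


# Placeholder search space; these values are illustrative only.
old_space = {
    "causal_attention": [True, False],
    "mse_reg_weight": [0.0, 0.1],
    "num_transformer_blocks": [2, 4],
    "context_length": [1000],
}

new_space = update_search_space(old_space)
print(new_space)
```

Working on a copy leaves the previous search space intact, so results from the earlier tuning run can still be compared against the same definition.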