TFLSTM-51

Model summary

TransformerLSTM trained with a longer context (2040 samples instead of 1020), with the batch size reduced to 32 to fit in A100 memory. Causal attention was enabled (causal_attention=True).
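
A minimal sketch of how this run's configuration might be expressed; the TrainingConfig dataclass and its field names are illustrative assumptions (only causal_attention and the values 2040, 1020, and 32 come from the summary above):

```python
from dataclasses import dataclass

@dataclass
class TrainingConfig:
    # Hypothetical config object for run TFLSTM-51; field names assumed.
    context_length: int = 2040   # doubled from the 1020 samples used previously
    batch_size: int = 32         # reduced so the longer context fits in A100 memory
    causal_attention: bool = True

config = TrainingConfig()
```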