TFLSTM-51
Model summary
TransformerLSTM trained with a longer context (2040 samples instead of 1020), but with a reduced batch size (32) so the run fits in A100 memory. causal_attention=True.
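For reference, the changed hyperparameters can be captured in a config like the minimal Python sketch below. Only the context length (2040 vs. 1020), the batch size (32), and causal_attention=True come from the run notes; the field names, the TrainConfig container, and the TransformerLSTM constructor signature are assumptions for illustration.

```python
from dataclasses import dataclass

@dataclass
class TrainConfig:
    # Values from the TFLSTM-51 run notes; field names are assumed.
    context_length: int = 2040   # doubled from the 1020 samples used in earlier runs
    batch_size: int = 32         # reduced so the longer context fits in A100 memory
    causal_attention: bool = True

config = TrainConfig()
# Hypothetical constructor call; the actual TransformerLSTM signature is not
# shown in the notes:
# model = TransformerLSTM(context_length=config.context_length,
#                         causal_attention=config.causal_attention)
```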