TFLSTM-38

┏━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━┓
┃   ┃ Name                     ┃ Type                 ┃ Params ┃
┡━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━┩
│ 0 │ criterion                │ WeightedMSELoss      │      0 │
│ 1 │ model                    │ TransformerLSTMModel │  4.3 M │
│ 2 │ model.encoder_grn        │ GatedResidualNetwork │  199 K │
│ 3 │ model.decoder_grn        │ GatedResidualNetwork │  199 K │
│ 4 │ model.encoder            │ LSTM                 │  1.1 M │
│ 5 │ model.decoder            │ LSTM                 │  1.1 M │
│ 6 │ model.transformer_blocks │ ModuleList           │  1.8 M │
│ 7 │ model.output_head        │ Linear               │    257 │
└───┴──────────────────────────┴──────────────────────┴────────┘
Trainable params: 257
Non-trainable params: 0
Total params: 4.3 M
Total estimated model params size (MB): 17

First-stage transfer of TFLSTMPRE-3 to Dipole datasets v9, with only output_head unlocked (all other layers frozen).
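
For reference, a minimal sketch of what "only output_head unlocked" could look like in PyTorch. It is not the actual training code for this run. `TinyStandIn`, `unlock_only` and `count_trainable` are hypothetical names. The 256-wide hidden size is an assumption inferred from the 257 parameters of the Linear head (256 weights + 1 bias, which implies a single output).

```python
import torch.nn as nn


class TinyStandIn(nn.Module):
    """Hypothetical stand-in; NOT the real TransformerLSTMModel."""

    def __init__(self, hidden: int = 256):  # hidden size is an assumption
        super().__init__()
        self.encoder = nn.LSTM(hidden, hidden, batch_first=True)
        self.output_head = nn.Linear(hidden, 1)  # 256 weights + 1 bias = 257


def unlock_only(model: nn.Module, prefix: str) -> None:
    """Freeze every parameter except those whose name starts with `prefix`."""
    for name, param in model.named_parameters():
        param.requires_grad = name.startswith(prefix)


def count_trainable(model: nn.Module) -> int:
    """Number of parameters that will receive gradients."""
    return sum(p.numel() for p in model.parameters() if p.requires_grad)


model = TinyStandIn()
unlock_only(model, "output_head.")
trainable = count_trainable(model)
print(trainable)  # 257
```

When building the optimizer for such a stage, passing only the parameters with `requires_grad=True` keeps frozen weights out of the optimizer state.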

Losses compared to TFLSTM-18 (green)