Using a model pre-trained on field simulations, we fine-tune on measured data from the SPS Main Dipoles using a Version 2.
The fine-tuning is broken down into 2 steps:
- Tune the head of the network. We freeze the entire network except for `attn_grn`, `attn_gate2`, `attn_norm2` and `output_layer`, and train to convergence. This step tunes the output layers to the new data distribution, and requires a very small learning rate (1e-5).
- Unfreeze the entire model and train with a higher learning rate (1e-4).
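The two steps above can be sketched as follows. This is a minimal illustration, not the training code used here: it relies only on the `named_parameters()` / `requires_grad` interface that `torch.nn.Module` exposes, and uses a small stand-in model so that it runs without PyTorch. The helper `set_trainable`, the stand-in classes and the dotted parameter names are hypothetical; only the four submodule names come from the text.

```python
from dataclasses import dataclass

# Submodule names taken from the text; everything else is frozen in step 1.
HEAD_MODULES = ("attn_grn", "attn_gate2", "attn_norm2", "output_layer")


def set_trainable(model, trainable_modules=None):
    """Freeze all parameters except those under `trainable_modules`.

    With trainable_modules=None every parameter is unfrozen (step 2).
    Returns the names of the parameters left trainable.
    """
    trainable = []
    for name, param in model.named_parameters():
        parts = name.split(".")  # e.g. "attn_grn.fc1.weight"
        keep = trainable_modules is None or any(m in parts for m in trainable_modules)
        param.requires_grad = keep
        if keep:
            trainable.append(name)
    return trainable


# Hypothetical stand-ins mimicking the torch.nn.Module parameter interface.
@dataclass
class FakeParam:
    requires_grad: bool = True


class FakeModel:
    def __init__(self, names):
        self._params = {n: FakeParam() for n in names}

    def named_parameters(self):
        return iter(self._params.items())


# Parameter names below are assumptions made for illustration only.
model = FakeModel([
    "encoder.lstm.weight",
    "attn_grn.fc1.weight",
    "attn_gate2.fc.weight",
    "attn_norm2.weight",
    "output_layer.weight",
])

step1 = set_trainable(model, HEAD_MODULES)  # step 1: head only, lr = 1e-5
step2 = set_trainable(model)                # step 2: whole model, lr = 1e-4
```

With a real PyTorch model, the optimizer for each step would then be built only from the parameters whose `requires_grad` is true, using the learning rate quoted for that step.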
In our experiments we additionally train models with different sets of quantiles for the quantile loss, and a baseline model with a single TFT on the same dataset, albeit without the pretraining step.
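For reference, the quantile (pinball) loss can be sketched in plain Python as below. The quantile values in the example are placeholders chosen for illustration, not the ones used in these experiments, and summing the per-quantile losses is one common aggregation that we assume here; the training library may scale or reduce differently.

```python
def pinball_loss(y_true, y_pred, q):
    """Mean pinball loss for a single quantile q in (0, 1)."""
    if not 0.0 < q < 1.0:
        raise ValueError("q must lie strictly between 0 and 1")
    total = 0.0
    for y, y_hat in zip(y_true, y_pred):
        err = y - y_hat
        # Under-prediction (err > 0) costs q * err,
        # over-prediction (err < 0) costs (1 - q) * |err|.
        total += max(q * err, (q - 1.0) * err)
    return total / len(y_true)


def quantile_loss(y_true, preds_by_quantile):
    """Sum of mean pinball losses over all predicted quantiles.

    `preds_by_quantile` maps each quantile to its list of predictions.
    """
    return sum(pinball_loss(y_true, preds, q) for q, preds in preds_by_quantile.items())


# Placeholder quantiles and values, for illustration only.
y = [1.0, 2.0]
example = quantile_loss(y, {0.1: [0.8, 1.7], 0.5: [1.0, 2.0], 0.9: [1.2, 2.3]})
```

For a high quantile such as 0.9, predicting too low is penalised nine times more than predicting too high by the same amount, which is what pushes that output towards the upper end of the target distribution.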
We evaluate on the validation dataset, as well as an unseen test dataset (including dynamic economy), in both normal and autoregressive mode.
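The metrics reported in the tables can be computed as in the sketch below. The definitions are the usual textbook ones and are an assumption on our part: in particular, MAPE and SMAPE are returned as fractions rather than percentages, and the evaluation code used here may apply a different scaling or denominator. The function name `regression_metrics` is hypothetical.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MSE, MAE, RMSE, MAPE and SMAPE for two equal-length sequences.

    MAPE divides by |y_true| and SMAPE by (|y_true| + |y_pred|) / 2,
    so targets must be non-zero.
    """
    n = len(y_true)
    errs = [p - t for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errs) / n
    mae = sum(abs(e) for e in errs) / n
    mape = sum(abs(e) / abs(t) for e, t in zip(errs, y_true)) / n
    smape = sum(
        2.0 * abs(e) / (abs(t) + abs(p))
        for e, t, p in zip(errs, y_true, y_pred)
    ) / n
    return {"MSE": mse, "MAE": mae, "RMSE": math.sqrt(mse), "MAPE": mape, "SMAPE": smape}


# Toy values for illustration only.
metrics = regression_metrics([1.0, 2.0], [1.1, 1.8])
```

When the errors are small relative to the target, `|y_pred|` is close to `|y_true|` and MAPE and SMAPE nearly coincide, which is consistent with the near-identical values in those two columns.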
Models
All models are Temporal Fusion Transformers (TFT), trained with different data and quantiles.
The models are the following:
- TFTMBI-5: Baseline, no pretraining
- TRATFTMBI-8: Step 1, pre-trained on noise-free data
- TRATFTMBI-11: Step 2, pre-trained on noise-free data
- TRATFTMBI-12: Step 1, pre-trained on noisy data
- TRATFTMBI-13: Step 2, pre-trained on noisy data
- TRATFTMBI-15: Step 1, pre-trained on noisy data
- TRATFTMBI-16: Step 2, pre-trained on noisy data
- TRATFTMBI-17: Step 1, pre-trained on noisy data
- TRATFTMBI-18: Step 2, pre-trained on noisy data
Evaluations
Validation (normal)
| Model | MSE | MAE | RMSE | MAPE | SMAPE |
|---|---|---|---|---|---|
| TFTMBI-5 | 1.93e-09 | 2.22e-05 | 4.40e-05 | 8.83e-05 | 8.83e-05 |
| TRATFTMBI-8 | 1.12e-08 | 7.22e-05 | 1.06e-04 | 2.58e-04 | 2.58e-04 |
| TRATFTMBI-11 | 2.37e-10 | 1.02e-05 | 1.54e-05 | 5.32e-05 | 5.32e-05 |
| TRATFTMBI-12 | 8.51e-09 | 5.89e-05 | 9.22e-05 | 1.80e-04 | 1.80e-04 |
| TRATFTMBI-13 | 2.18e-10 | 1.09e-05 | 1.48e-05 | 6.03e-05 | 6.03e-05 |
| TRATFTMBI-15 | 6.68e-09 | 5.67e-05 | 8.17e-05 | 2.30e-04 | 2.31e-04 |
| TRATFTMBI-16 | 2.13e-10 | 1.06e-05 | 1.46e-05 | 6.61e-05 | 6.61e-05 |
| TRATFTMBI-17 | 9.03e-09 | 6.47e-05 | 9.50e-05 | 2.20e-04 | 2.20e-04 |
| TRATFTMBI-18 | 2.25e-10 | 1.09e-05 | 1.50e-05 | 5.80e-05 | 5.80e-05 |
Test (normal)
| Model | MSE | MAE | RMSE | MAPE | SMAPE |
|---|---|---|---|---|---|
| TFTMBI-5 | 2.29e-09 | 2.91e-05 | 4.79e-05 | 1.08e-04 | 1.08e-04 |
| TRATFTMBI-8 | 5.13e-09 | 5.02e-05 | 7.16e-05 | 2.14e-04 | 2.14e-04 |
| TRATFTMBI-11 | 2.57e-09 | 2.92e-05 | 5.07e-05 | 9.18e-05 | 9.18e-05 |
| TRATFTMBI-12 | 4.30e-09 | 4.59e-05 | 6.56e-05 | 2.08e-04 | 2.08e-04 |
| TRATFTMBI-13 | 2.30e-09 | 2.89e-05 | 4.80e-05 | 9.00e-05 | 9.00e-05 |
| TRATFTMBI-15 | 5.82e-09 | 5.43e-05 | 7.63e-05 | 2.66e-04 | 2.66e-04 |
| TRATFTMBI-16 | 2.79e-09 | 3.16e-05 | 5.28e-05 | 1.05e-04 | 1.05e-04 |
| TRATFTMBI-17 | 6.81e-09 | 5.56e-05 | 8.25e-05 | 2.17e-04 | 2.17e-04 |
| TRATFTMBI-18 | 2.37e-09 | 2.88e-05 | 4.86e-05 | 9.15e-05 | 9.15e-05 |
Test (normal, funky)
| Model | MSE | MAE | RMSE | MAPE | SMAPE |
|---|---|---|---|---|---|
| TFTMBI-5 | 1.02e-08 | 5.33e-05 | 1.01e-04 | 1.54e-04 | 1.54e-04 |
| TRATFTMBI-8 | 2.37e-08 | 1.07e-04 | 1.54e-04 | 2.93e-04 | 2.93e-04 |
| TRATFTMBI-11 | 9.49e-09 | 4.41e-05 | 9.74e-05 | 1.01e-04 | 1.01e-04 |
| TRATFTMBI-12 | 2.11e-08 | 9.59e-05 | 1.45e-04 | 2.62e-04 | 2.62e-04 |
| TRATFTMBI-13 | 9.43e-09 | 4.41e-05 | 9.71e-05 | 1.05e-04 | 1.05e-04 |
| TRATFTMBI-15 | 2.17e-08 | 1.01e-04 | 1.47e-04 | 2.93e-04 | 2.93e-04 |
| TRATFTMBI-16 | 1.09e-08 | 4.92e-05 | 1.04e-04 | 1.10e-04 | 1.10e-04 |
| TRATFTMBI-17 | 2.16e-08 | 1.02e-04 | 1.47e-04 | 2.91e-04 | 2.91e-04 |
| TRATFTMBI-18 | 1.22e-08 | 5.58e-05 | 1.10e-04 | 1.31e-04 | 1.31e-04 |