Hyperparameters

The following hyperparameters are fixed for modeling the MBIs; other settings, such as the architecture, can be varied.

  • downsample: 20
  • freq_cutoff: 25
    • Cutoff frequency set to half of the sampling frequency after downsampling (the Nyquist frequency), since higher harmonics will not be visible anyway
  • ctxt_seq_len: 1020
    • Effective context length 20400 ms, equivalent to 1 LHC1
  • tgt_seq_len: 540
    • Effective prediction length 10800 ms, equivalent to 1 SFTPRO1
  • min_ctxt_seq_len: 180
    • Effective context length 3600 ms, equivalent to 1 MD1
  • min_tgt_seq_len: 180
    • Effective prediction length 3600 ms, equivalent to 1 MD1
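
The effective lengths above follow from multiplying each sequence length by the downsampling factor. A minimal sketch of that arithmetic, assuming the raw signal is sampled at 1 kHz (inferred from 1020 × 20 = 20400 ms, not stated explicitly) and that freq_cutoff is given in Hz; `BASE_SAMPLE_PERIOD_MS`, `effective_length_ms`, and `nyquist_hz` are illustrative names, not part of the actual code base:

```python
# Assumption: raw data has one sample per millisecond (1 kHz).
BASE_SAMPLE_PERIOD_MS = 1.0

# Values copied from the hyperparameter list above.
HYPERPARAMS = {
    "downsample": 20,
    "freq_cutoff": 25,
    "ctxt_seq_len": 1020,
    "tgt_seq_len": 540,
    "min_ctxt_seq_len": 180,
    "min_tgt_seq_len": 180,
}


def effective_length_ms(seq_len: int, downsample: int) -> float:
    """Duration covered by `seq_len` downsampled samples, in milliseconds."""
    return seq_len * downsample * BASE_SAMPLE_PERIOD_MS


def nyquist_hz(downsample: int) -> float:
    """Half of the sampling frequency after downsampling, in Hz."""
    downsampled_rate_hz = 1000.0 / (downsample * BASE_SAMPLE_PERIOD_MS)
    return downsampled_rate_hz / 2.0


# Effective duration of every *_seq_len hyperparameter.
effective_ms = {
    name: effective_length_ms(value, HYPERPARAMS["downsample"])
    for name, value in HYPERPARAMS.items()
    if name.endswith("seq_len")
}

for name, duration in effective_ms.items():
    print(f"{name}: {duration:.0f} ms")
print(f"Nyquist frequency: {nyquist_hz(HYPERPARAMS['downsample']):.0f} Hz")
```

Under these assumptions the Nyquist frequency after downsampling matches the freq_cutoff value of 25.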

Results

Training the model on Version 3 and evaluating on its validation and test sets gives the following metrics:

Validation set

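| Model    | MSE      | MAE      | RMSE     | MAPE     | SMAPE    |
|----------|----------|----------|----------|----------|----------|
| TFTMBI-5 | 1.91e-09 | 2.23e-05 | 4.37e-05 | 9.32e-05 | 9.32e-05 |
| TFTMBI-7 | 3.95e-10 | 6.97e-06 | 1.99e-05 | 3.28e-05 | 3.28e-05 |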

Test set 1

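| Model    | MSE      | MAE      | RMSE     | MAPE     | SMAPE    |
|----------|----------|----------|----------|----------|----------|
| TFTMBI-5 | 2.31e-09 | 8.59e-06 | 4.80e-05 | 4.58e-05 | 4.58e-05 |
| TFTMBI-7 | 4.25e-09 | 1.47e-05 | 6.52e-05 | 7.38e-05 | 7.38e-05 |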

Test set 2

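| Model    | MSE      | MAE      | RMSE     | MAPE     | SMAPE    |
|----------|----------|----------|----------|----------|----------|
| TFTMBI-5 | 1.10e-08 | 6.70e-05 | 1.05e-04 | 2.52e-04 | 2.53e-04 |
| TFTMBI-7 | 2.12e-09 | 1.50e-05 | 4.60e-05 | 7.20e-05 | 7.20e-05 |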
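
For reference, the five reported metrics can be computed as sketched below. This is a minimal illustration using the standard textbook definitions, with MAPE and SMAPE expressed as fractions rather than percentages; whether the evaluation code behind these tables uses the same conventions is an assumption. The function name `regression_metrics` and the `eps` guard are hypothetical.

```python
import math


def regression_metrics(y_true, y_pred, eps=1e-12):
    """Return MSE, MAE, RMSE, MAPE and SMAPE for two equal-length sequences.

    MAPE and SMAPE are returned as fractions (not multiplied by 100).
    `eps` guards against division by zero when a target value is 0.
    """
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    errors = [p - t for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    mape = sum(abs(e) / max(abs(t), eps) for e, t in zip(errors, y_true)) / n
    smape = sum(
        2.0 * abs(p - t) / max(abs(t) + abs(p), eps)
        for t, p in zip(y_true, y_pred)
    ) / n

    return {
        "MSE": mse,
        "MAE": mae,
        "RMSE": math.sqrt(mse),
        "MAPE": mape,
        "SMAPE": smape,
    }


# Toy data, for illustration only (not taken from the MBI datasets).
metrics = regression_metrics([1.0, 2.0, 4.0], [1.0, 2.0, 2.0])
for name, value in metrics.items():
    print(f"{name}: {value:.3e}")
```

Because RMSE is the square root of MSE, the two columns can be cross-checked against each other: for example, the square root of 1.91e-09 is approximately 4.37e-05, matching the validation row for TFTMBI-5.
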
  • Evaluate TFT autoregressively, and cycle-by-cycle [priority:: high] [due:: 2025-02-20] [completion:: 2025-02-26]
  • Train TFT on MBI data on different quantiles [priority:: low] [completion:: 2025-02-27]
  • Train TFT from pre-trained model [due:: 2025-02-21] [priority:: medium] [start:: 2025-02-21] [completion:: 2025-05-19]

Fixed attention and patched dataset v3