Notes
Big trouble finetuning PRETFTMBI-124: Torch errors when compiling with dynamic shapes, and even after patching, torch compilation still seems to misbehave with randomized sequence lengths.
I switched off randomize_seq_len to train TRATFTMBI-53 on ml4, and kept it on for TRATFTMBI-50 on ml3, since we urgently need a model for the Dedicated MD on 2025-07-23.
We further finetune TRATFTMBI-53 into TFTMBI-174 with ctxt_seq_len fixed to 540 (the SFTPRO1 length).
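With ctxt_seq_len fixed, every batch must be padded or truncated to exactly 540 tokens so the compiler only ever sees one static shape. A hypothetical helper (name, pad_id, and token representation are all assumptions, not from the actual training code):

```python
CTXT_SEQ_LEN = 540  # SFTPRO1 length, fixed for TFTMBI-174

def pad_or_truncate(tokens: list[int], seq_len: int = CTXT_SEQ_LEN,
                    pad_id: int = 0) -> list[int]:
    """Force a token sequence to a fixed length: truncate long
    sequences, right-pad short ones with pad_id."""
    if len(tokens) >= seq_len:
        return tokens[:seq_len]
    return tokens + [pad_id] * (seq_len - len(tokens))

# Every sequence comes out at exactly 540 tokens.
assert len(pad_or_truncate(list(range(1000)))) == 540
assert len(pad_or_truncate([1, 2, 3])) == 540
```

This trades some wasted compute on padding for a single static shape, sidestepping the dynamic-shape compilation issues entirely.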
Feedback on Maurus’ presentation:
- Missing slide numbers
- Feedback control does not go through LSA