Time series compression reduces data volume while preserving the patterns that matter for neural network training. It is critical for processing high-frequency B-Train data (1 kHz) before feeding it to models such as the Temporal Fusion Transformer.
Problem Statement
Raw B-Train data is sampled at 1 kHz, so one second of data is 1,000 points and a 10 s prediction window is 10,000 points. That is a long sequence for large neural networks, especially LSTMs, which process samples one step at a time, so computation time grows linearly with the number of samples.
Compression Methods
Adaptive Algorithms
- Swinging Door Algorithm - Keeps a sample only when the signal leaves a tolerance corridor (the "doors") opened from the last kept sample
- Ramer-Douglas-Peucker Algorithm - Recursive line simplification
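Both adaptive methods can be sketched in a few lines of plain Python. This is a minimal illustration, not the implementation used on B-Train data: the function names `swinging_door` and `rdp`, the tolerance parameters `tol` and `eps`, and the toy step signal are all assumptions made for the example. Note that the RDP variant below uses perpendicular distance in the (time, value) plane, so it only behaves sensibly when both axes are scaled to comparable ranges.

```python
import math


def swinging_door(t, y, tol):
    """Return indices of the samples kept by one swinging-door pass.

    t must be strictly increasing; tol (> 0) is the allowed deviation
    in the units of y. (Names and signature are illustrative.)
    """
    n = len(t)
    if n <= 2:
        return list(range(n))
    keep = [0]
    anchor = 0                       # index of the last kept sample
    up, low = -math.inf, math.inf    # slopes of the upper and lower door
    for i in range(1, n):
        dt = t[i] - t[anchor]
        # Doors pivot at (t[anchor], y[anchor] +/- tol) and only swing open.
        up = max(up, (y[i] - (y[anchor] + tol)) / dt)
        low = min(low, (y[i] - (y[anchor] - tol)) / dt)
        if up > low:
            # Doors opened past parallel: keep the previous sample and
            # restart both doors from it.
            anchor = i - 1
            keep.append(anchor)
            dt = t[i] - t[anchor]
            up = (y[i] - (y[anchor] + tol)) / dt
            low = (y[i] - (y[anchor] - tol)) / dt
    keep.append(n - 1)               # always keep the final sample
    return keep


def rdp(t, y, eps):
    """Return sorted indices kept by Ramer-Douglas-Peucker simplification.

    Written with an explicit stack instead of recursion so that long
    sequences do not hit Python's recursion limit.
    """
    n = len(t)
    if n <= 2:
        return list(range(n))
    keep = {0, n - 1}
    stack = [(0, n - 1)]
    while stack:
        s, e = stack.pop()
        dx, dy = t[e] - t[s], y[e] - y[s]
        norm = math.hypot(dx, dy)
        best, best_d = -1, eps
        for k in range(s + 1, e):
            # Perpendicular distance from sample k to the chord s -> e.
            d = abs(dx * (y[k] - y[s]) - dy * (t[k] - t[s])) / norm
            if d > best_d:
                best, best_d = k, d
        if best >= 0:                # farthest sample exceeds eps: split there
            keep.add(best)
            stack.append((s, best))
            stack.append((best, e))
    return sorted(keep)


# Toy example: a unit step between samples 4 and 5.
t = list(range(10))
step = [0.0] * 5 + [1.0] * 5
kept_sdt = swinging_door(t, step, tol=0.1)
kept_rdp = rdp(t, step, eps=0.1)
```

On this toy step, both sketches keep the two endpoints plus the samples on either side of the jump and drop the flat stretches, which is the behavior that motivates adaptive over fixed-rate downsampling.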
Fixed-Rate Methods
- Regular-interval downsampling - Introduces artifacts in regions with a high rate of change
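To illustrate where fixed-rate downsampling loses information, the sketch below keeps every `factor`-th sample of a synthetic signal with one sharp transition and measures how far a linear reconstruction strays from the original. The helper name `reconstruction_error`, the `tanh` test signal, and the factor of 100 are assumptions chosen for the example, not values taken from the B-Train pipeline.

```python
import math


def reconstruction_error(y, factor):
    """Per-sample absolute error after keeping every `factor`-th sample.

    Gaps between kept samples are filled by linear interpolation, which
    assumes the original samples are uniformly spaced. Samples after the
    last kept index are not evaluated.
    """
    last = ((len(y) - 1) // factor) * factor    # last kept index
    errors = []
    for i in range(last + 1):
        lo = (i // factor) * factor             # kept sample to the left
        hi = min(lo + factor, last)             # kept sample to the right
        w = (i - lo) / factor                   # position inside the gap
        y_hat = (1.0 - w) * y[lo] + w * y[hi]
        errors.append(abs(y[i] - y_hat))
    return errors


# One second at 1 kHz with a sharp transition at t = 0.5 s.
fs = 1000
t = [k / fs for k in range(fs)]
y = [math.tanh(200.0 * (ti - 0.5)) for ti in t]

err = reconstruction_error(y, factor=100)       # 1 kHz -> 10 Hz
err_flat = max(err[:300])                       # flat part of the signal
err_edge = max(err[400:600])                    # around the transition
```

In this setup the error in the flat region is negligible, while the error around the transition is a large fraction of the signal amplitude, because the kept samples straddle the edge and the interpolation smears it across the whole gap.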
Consequences
Non-uniform Time Axis
Adaptive downsampling produces irregularly spaced samples, so the model can no longer infer timing from a sample's position in the sequence. Temporal information must be added as explicit features:
- Relative time indices
- Absolute time, normalized
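One possible way to build the two temporal features listed above is sketched here. The interpretation is an assumption: the relative feature is taken as the time elapsed since the previous kept sample, and the absolute feature as time min-max normalized to [0, 1] over the window. The source does not define either precisely, and some frameworks use "relative time index" to mean the offset from the prediction point instead. The function name `time_features` is hypothetical.

```python
def time_features(t_kept):
    """Return (relative, absolute) time features for irregular samples.

    relative[i] : time since the previous kept sample (0.0 for the first)
    absolute[i] : time since the window start, scaled to [0, 1]
    Assumes at least two samples with strictly increasing timestamps.
    """
    t0 = t_kept[0]
    span = t_kept[-1] - t0
    relative = [0.0] + [b - a for a, b in zip(t_kept, t_kept[1:])]
    absolute = [(ti - t0) / span for ti in t_kept]
    return relative, absolute


# Timestamps (in seconds) of samples that survived adaptive downsampling.
t_kept = [0.0, 0.4, 0.5, 0.9]
relative, absolute = time_features(t_kept)

# Each model input row could then be (value, relative, absolute).
```

These two columns would be concatenated with the signal values so that the network sees how much time each step represents.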
Applications
- Adaptive downsampling - Primary preprocessing step
- Neural network preprocessing for MBI data
- Real-time prediction systems requiring fast inference
Related Concepts
- Irregular Time Indices in Neural Networks - Handling compressed data
- Data Pre-Processing - Broader preprocessing pipeline
- Drift-reduction on MBI data - Related data quality improvements