The Swinging Door algorithm is a data compression technique for time-series data that preserves essential trends while reducing volume. It creates “doors” that encompass data points falling within a certain deviation from the initial point.
Algorithm
When a data point exceeds the threshold, the door closes, its endpoints are recorded, and a new door begins.
import numpy as np
def swinging_door(data: np.ndarray, epsilon: float, delta_t: float = None) -> np.ndarray:
"""Applies the Swinging Door algorithm to compress time-series data.
Args:
data: A numpy array representing the time-series data with shape (n, 2), where each row is (time, value).
epsilon: The deviation threshold (maximum allowed deviation from the door's starting point).
delta_t: Optional. Maximum time interval for each door (in the same units as the data's time dimension).
Returns:
A numpy array of compressed data points, where each point is a tuple (time, value).
"""
compressed_data = [] # List to store compressed data
door_start = data[0, 1]
t_start = data[0, 0]
for t, value in data:
if np.abs(value - door_start) > epsilon or (delta_t is not None and t - t_start >= delta_t):
compressed_data.append((t_start, door_start)) # Tuple representing a data point
door_start = value
t_start = t
compressed_data.append((t_start, door_start)) # Add the last door
return np.array(compressed_data)Parameters
- Deviation threshold (ε): Maximum allowed deviation from door’s starting point
- Time interval (Δt): Optional maximum time interval for each door
Applications
Used in Adaptive downsampling for B-Train data compression before neural network training.
Related Concepts
- Ramer-Douglas-Peucker Algorithm - Alternative compression method
- Time Series Compression - Broader category of techniques
- Irregular Time Indices in Neural Networks - Handling non-uniform sampling