David Schäfer authored
Break Detection
Index
breaks_flagSpektrumBased
breaks_flagSpektrumBased(thresh_rel=0.1, thresh_abs=0.01,
first_der_factor=10, first_der_window="12h",
scnd_der_ratio_range=0.05, scnd_der_ratio_thresh=10,
smooth=True, smooth_window=None, smooth_poly_deg=2)
| parameter | data type | default value | description |
|---|---|---|---|
| thresh_rel | float | 0.1 | Minimum relative difference between two values to consider the latter as a break candidate. See condition (1). |
| thresh_abs | float | 0.01 | Minimum absolute difference between two values to consider the latter as a break candidate. See condition (2). |
| first_der_factor | float | 10 | Multiplication factor for the arithmetic mean of the first derivatives surrounding a break candidate. See condition (3). |
| first_der_window | offset string | "12h" | Window around a break candidate for which the arithmetic mean is calculated. See condition (3). |
| scnd_der_ratio_range | float | 0.05 | Maximum deviation from 1 of the ratio of the second derivatives of a break candidate and its preceding value. See condition (4). |
| scnd_der_ratio_thresh | float | 10.0 | Threshold for the ratio of the second derivatives of a break candidate and its succeeding value. See condition (5). |
| smooth | bool | True | Smooth the time series before differentiation using the Savitzky-Golay filter. |
| smooth_window | offset string | None | Size of the smoothing window of the Savitzky-Golay filter. The default value None results in a window of twice the sampling interval (i.e. three values). |
| smooth_poly_deg | integer | 2 | Degree of the polynomial used for smoothing with the Savitzky-Golay filter. |
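
To illustrate why a default `smooth_window` of twice the sampling interval corresponds to three values, here is a minimal sketch. It assumes a centered window that is closed on both sides; the helper `values_in_centered_window` and the 15-minute grid are hypothetical and not part of the library.

```python
from datetime import datetime, timedelta


def values_in_centered_window(timestamps, center, window):
    """Return the timestamps inside a closed window of width `window` around `center`."""
    half = window / 2
    return [t for t in timestamps if abs(t - center) <= half]


# Hypothetical equidistant grid with a 15-minute sampling interval.
step = timedelta(minutes=15)
grid = [datetime(2020, 1, 1) + i * step for i in range(10)]

# A window of twice the sampling interval around an inner grid point
# covers the point itself and one neighbour on each side.
covered = values_in_centered_window(grid, grid[5], 2 * step)
print(len(covered))
```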
The function flags breaks (jumps/drops) by evaluating the derivatives of a time series.
A value $x_k$ of a time series $x_t$ with timestamps $t_i$ is considered to be a break if:

1. $x_k$ represents a sufficiently large relative jump: $|\frac{x_k - x_{k-1}}{x_k}| >$ `thresh_rel`
2. $x_k$ represents a sufficiently large absolute jump: $|x_k - x_{k-1}| >$ `thresh_abs`
3. The dataset $X = x_i, ..., x_{k-1}, x_{k+1}, ..., x_j$, with $|t_{k-1} - t_i| = |t_j - t_{k+1}| =$ `first_der_window`, fulfills the following condition: $|x'_k| >$ `first_der_factor` $\cdot \bar{X}$, where $\bar{X}$ denotes the arithmetic mean of $X$.
4. The ratio (last/this) of the second derivatives is close to 1: $1 -$ `scnd_der_ratio_range` $< |\frac{x''_{k-1}}{x''_{k}}| < 1 +$ `scnd_der_ratio_range`
5. The ratio (this/next) of the second derivatives is sufficiently high: $|\frac{x''_{k}}{x''_{k+1}}| >$ `scnd_der_ratio_thresh`
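
The five conditions can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the library's implementation: derivatives are approximated by simple finite differences on an equidistant grid (no Savitzky-Golay smoothing), the window is given as a number of samples per side (`window_steps`, a hypothetical stand-in for `first_der_window`), and condition (3) follows the parameter description by averaging the absolute first derivatives around the candidate. The function name `flag_breaks` is likewise hypothetical.

```python
def flag_breaks(x, thresh_rel=0.1, thresh_abs=0.01, first_der_factor=10,
                window_steps=12, scnd_der_ratio_range=0.05,
                scnd_der_ratio_thresh=10):
    """Return the indices k for which all five break conditions hold."""
    n = len(x)
    # Assumed difference scheme: backward first and central second difference.
    d1 = [0.0] + [x[k] - x[k - 1] for k in range(1, n)]
    d2 = [0.0] * n
    for k in range(1, n - 1):
        d2[k] = x[k + 1] - 2 * x[k] + x[k - 1]

    def ratio(num, den):
        # A zero denominator yields an infinite ratio; 0/0 is treated as 0.
        if den == 0:
            return float("inf") if num != 0 else 0.0
        return abs(num / den)

    breaks = []
    for k in range(2, n - 2):
        jump = abs(x[k] - x[k - 1])
        # (1) relative jump; undefined for x[k] == 0, so such points are skipped
        if x[k] == 0 or jump / abs(x[k]) <= thresh_rel:
            continue
        # (2) absolute jump
        if jump <= thresh_abs:
            continue
        # (3) first derivative vs. mean of the surrounding first derivatives
        lo, hi = max(1, k - window_steps), min(n, k + window_steps + 1)
        neighbours = [abs(d1[i]) for i in range(lo, hi) if i != k]
        if abs(d1[k]) <= first_der_factor * sum(neighbours) / len(neighbours):
            continue
        # (4) ratio (last/this) of the second derivatives close to 1
        r_prev = ratio(d2[k - 1], d2[k])
        if not (1 - scnd_der_ratio_range < r_prev < 1 + scnd_der_ratio_range):
            continue
        # (5) ratio (this/next) of the second derivatives sufficiently high
        if ratio(d2[k], d2[k + 1]) <= scnd_der_ratio_thresh:
            continue
        breaks.append(k)
    return breaks


step_series = [1.0] * 10 + [2.0] * 10          # persistent jump at index 10
spike_series = [1.0] * 10 + [2.0] + [1.0] * 9  # single outlier at index 10
print(flag_breaks(step_series))   # [10]
print(flag_breaks(spike_series))  # []
```

In this sketch the persistent jump is flagged, while the single outlier is rejected by condition (4): its neighbouring second derivatives differ by a factor of two rather than being of similar magnitude.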
NOTE:
- Only works for time series
- The time series is expected to be harmonized to an equidistant frequency grid
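
Whether a series already lies on an equidistant grid can be checked before calling the function. The following is a small sketch using only the standard library; the helper name `is_equidistant` and the sample timestamps are hypothetical.

```python
from datetime import datetime, timedelta


def is_equidistant(timestamps):
    """Return True if all consecutive timestamps are separated by the same step."""
    steps = {later - earlier for earlier, later in zip(timestamps, timestamps[1:])}
    return len(steps) <= 1


start = datetime(2020, 1, 1)
regular = [start + i * timedelta(minutes=15) for i in range(5)]
# Shift one timestamp by two minutes to break the regular spacing.
irregular = regular[:3] + [regular[3] + timedelta(minutes=2)] + regular[4:]

print(is_equidistant(regular))    # True
print(is_equidistant(irregular))  # False
```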
This function is a generalization of the spectrum-based spike flagging mechanism presented in [1].
References
[1] Dorigo, W. et al.: Global Automated Quality Control of In Situ Soil Moisture Data from the International Soil Moisture Network. 2013. Vadose Zone J. doi:10.2136/vzj2012.0097.