Implementation of two pattern recognition algorithms:
- Dynamic Time Warping (dtw) [1]
- Pattern recognition via wavelets [2]
The steps are:
1. Determine the partition frequency, i.e. the interval into which the time series is divided
(for example: a pattern occurs daily, or every hour)
2. Compare each partition with the given pattern
3. Check if the compared partition contains the pattern or not
4. Flag partition if it contains the pattern
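The four steps above can be sketched in plain Python as follows. This is only an illustration under simplifying assumptions, not the implementation behind this function: partitions are fixed-length chunks of a list rather than time-based partitions, the comparison is a textbook dynamic-programming DTW with the absolute difference as local cost, and the names `dtw_distance` and `flag_partitions` as well as the threshold used are hypothetical.

```python
def dtw_distance(probe, pattern):
    """Textbook DTW distance with |a - b| as local cost (illustrative only)."""
    n, m = len(probe), len(pattern)
    inf = float("inf")
    # cost[i][j]: minimal accumulated cost of aligning probe[:i] with pattern[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(probe[i - 1] - pattern[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[n][m]


def flag_partitions(series, pattern, partition_len, max_distance):
    """Return one boolean per partition: True if the partition matches the pattern."""
    flags = []
    # Step 1: divide the series into consecutive partitions
    for start in range(0, len(series), partition_len):
        partition = series[start:start + partition_len]
        # Step 2: compare the partition with the pattern
        distance = dtw_distance(partition, pattern)
        # Steps 3 and 4: decide whether the pattern is present and flag accordingly
        flags.append(distance <= max_distance)
    return flags


pattern = [0.0, 1.0, 0.0]
series = [0.0, 0.0, 1.0, 0.0, 5.0, 5.0, 5.0, 5.0, 0.0, 1.0, 1.0, 0.0]
print(flag_partitions(series, pattern, partition_len=4, max_distance=0.5))
```

The first and third partitions are time-warped versions of the pattern (a value is repeated), so their DTW distance is zero and they are flagged; the constant partition in the middle is not.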
:param data: pandas dataframe. holding the data
:param field: fieldname in `data`, which holds the series to be checked for patterns
:param flagger: flagger.
:param reference_field: fieldname in `data`, which holds the pattern
:param method: str. Pattern Recognition method to be used: 'dtw' or 'wavelets'. Default: 'dtw'
:param partition_freq: str. Frequency, in which the pattern occurs. If only "days" or "months" is given, then precise length of partition is calculated from pattern length. Default: "days"
:param partition_offset: str. If partition frequency is given, and pattern starts at a time offset within the partition (e.g., partition frequency is "1 h", pattern starts at 10:15, then offset is "15 min"). Default: 0
:param max_distance: float. For dtw. Maximum dtw-distance between partition and pattern, so that partition is recognized as pattern. Default: 0.03
:param normalized_distance: boolean. For dtw. Normalizing dtw-distance (see [1]). Default: True
:param open_end: boolean. For dtw. End of pattern is matched with a value in the partition (not necessarily end of partition). Recommendation of [1]. Default: True
:param widths: tuple of int. For wavelets. Widths for wavelet decomposition. [2] recommends a dyadic scale. Default: (1,2,4,8)
:param waveform: str. For wavelets. Wavelet to be used for decomposition. Default: 'mexh'
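To illustrate what the `open_end` and `normalized_distance` options described above change, here is a minimal pure-Python sketch. It rests on assumptions that are not taken from this module: the local cost is the absolute difference, "open end" means the complete pattern may end at any index of the partition, and normalization divides the accumulated cost by the summed lengths of the two matched sequences. The function name `dtw_open_end` is hypothetical, and the normalization used in [1] may differ in detail.

```python
def dtw_open_end(probe, pattern, open_end=True, normalized=True):
    """DTW distance with optional open end and length normalization (sketch)."""
    n, m = len(probe), len(pattern)
    inf = float("inf")
    # cost[i][j]: minimal accumulated cost of aligning probe[:i] with pattern[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(probe[i - 1] - pattern[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )

    # Closed end: the pattern must end at the last probe value.
    # Open end (assumption): the pattern may end at any probe index.
    end_indices = range(1, n + 1) if open_end else [n]

    best = inf
    for i in end_indices:
        value = cost[i][m]
        if normalized:
            # Assumed normalization: divide by the summed matched lengths
            value = value / (i + m)
        best = min(best, value)
    return best


probe = [0.0, 1.0, 0.0, 5.0, 5.0]
pattern = [0.0, 1.0, 0.0]
print(dtw_open_end(probe, pattern, open_end=False, normalized=False))
print(dtw_open_end(probe, pattern, open_end=False, normalized=True))
print(dtw_open_end(probe, pattern, open_end=True, normalized=True))
```

With a closed end, the trailing values of the partition must be matched against the pattern and inflate the distance; with an open end, the pattern is matched against the leading part of the partition only.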
Literature:
Parameters
----------
data : dios.DictOfSeries
A dictionary of pandas.Series, holding all the data.
field : str
The fieldname of the column, holding the data-to-be-flagged.
flagger : saqc.flagger
A flagger object, holding flags and additional information related to `data`.
reference_field : str
Fieldname in `data`, that holds the pattern
method : {'dtw', 'wavelets'}, default 'dtw'
Pattern recognition method to be used.
partition_freq : str, default 'days'
Frequency at which the pattern occurs.
Has to be an offset string or one of {"days", "months"}. If 'days' or 'months' is passed,
the precise length of the partition is calculated from the pattern length.
partition_offset : str, default '0'
If the partition frequency is given by an offset string and the pattern starts at a time offset within
the partition, that offset is given by `partition_offset`
(e.g., partition frequency is "1h", pattern starts at 10:15, then offset is "15min").
max_distance : float, default 0.03
Only effective if method = 'dtw'.
Maximum dtw-distance between partition and pattern, so that partition is recognized as pattern.
(And thus gets flagged.)
normalized_distance : bool, default True
Only effective if method = 'dtw'.
Whether to normalize the dtw-distance. (This does not refer to statistical normalization, but to a normalization
that makes dtw-distances comparable for probes of different length, see [1] for more details.)
open_end : bool, default True
Only effective if method = 'dtw'.
If True, the end of the pattern may be matched with any value of the probe, not necessarily its last one,
in the search for the optimal dtw-mapping. If False, the endings of the probe and of the pattern have to be
projected onto each other. Recommendation of [1].
widths : tuple[int], default (1,2,4,8)
Only effective if method = 'wavelets'.
Widths for wavelet decomposition. [2] recommends a dyadic scale.
waveform : str, default 'mexh'
Only effective if method = 'wavelets'.
Wavelet to be used for decomposition.
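To make the roles of `widths` and `waveform` more concrete, the following pure-Python sketch computes a continuous wavelet decomposition with the Mexican hat ('mexh') wavelet at the dyadic widths (1, 2, 4, 8). It is an illustration under assumptions, not the decomposition routine used by this function: the helper names `mexican_hat` and `cwt_mexh` are hypothetical, the transform is a direct (slow) summation, and how the coefficients of partition and pattern are subsequently compared (see [2]) is not covered here.

```python
import math


def mexican_hat(t):
    """Mexican hat (Ricker) mother wavelet evaluated at t."""
    norm = 2.0 / (math.sqrt(3.0) * math.pi ** 0.25)
    return norm * (1.0 - t * t) * math.exp(-0.5 * t * t)


def cwt_mexh(signal, widths=(1, 2, 4, 8)):
    """Return one row of wavelet coefficients per width (direct summation)."""
    rows = []
    for width in widths:
        row = []
        for shift in range(len(signal)):
            # Correlate the signal with the wavelet, scaled by `width`
            # and centred at position `shift`
            acc = sum(
                value * mexican_hat((k - shift) / width)
                for k, value in enumerate(signal)
            )
            row.append(acc / math.sqrt(width))
        rows.append(row)
    return rows


signal = [0.0, 0.0, 1.0, 0.0, 0.0]
coefficients = cwt_mexh(signal)
print(len(coefficients), len(coefficients[0]))
```

Each row describes the signal at one scale: small widths respond to narrow features, larger widths to broader ones, which is why a dyadic set of widths covers several scales with few rows.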
Returns
-------
data : dios.DictOfSeries
A dictionary of pandas.Series, holding all the data.
flagger : saqc.flagger
The flagger object, holding flags and additional information related to `data`.
Flag values may have changed relative to the flagger input.
[2] Maharaj, E.A. (2002): Pattern Recognition of Time Series using Wavelets. In: Härdle W., Rönz B. (eds) Compstat. Physica, Heidelberg, 978-3-7908-1517-7.