Commit 5d665fad authored by Bert Palm

Added a calibration plot to flag_pattern, because it's hard to find a suitable threshold without any feedback.
parent b67d86fb
Pipeline #37722 failed with stage
in 1 minute and 59 seconds
@@ -182,7 +182,15 @@ def calculateDistanceByDTW(
 @flagging(masking="field", module="pattern")
 def flagPatternByDTW(
-    data, field, flags, ref_field, max_distance=0.0, normalize=True, flag=BAD, **kwargs
+    data,
+    field,
+    flags,
+    ref_field,
+    max_distance=0.0,
+    normalize=True,
+    plot=False,
+    flag=BAD,
+    **kwargs
 ):
     """Pattern Recognition via Dynamic Time Warping.
@@ -219,6 +227,15 @@ def flagPatternByDTW(
         processing. The distances then refer to the mean distance per datapoint,
         expressed in the data's units.
+    plot: bool, default False
+        Show a calibration plot, which can help to find a suitable threshold
+        for `max_distance`. It works best with `normalize=True`. Do not use in
+        automatic setups / pipelines. The plot shows three lines:
+        - data: the data the function was called on
+        - distances: the distances calculated by the algorithm
+        - indicator: has two distinct levels: `0` and the value of `max_distance`.
+          If `max_distance` is `0.0`, it defaults to `1`. The data is flagged
+          wherever the indicator is not `0`.
     Returns
     -------
@@ -260,5 +277,12 @@ def flagPatternByDTW(
     rolling = customRoller(minima, window=winsz)
     mask = rolling.sum() > 0
+    if plot:
+        df = pd.DataFrame()
+        df["data"] = dat
+        df["distances"] = distances
+        df["indicator"] = mask.astype(float) * (max_distance or 1)
+        df.plot()
     flags[mask, field] = flag
     return data, flags
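
To illustrate how the indicator line in the calibration plot is scaled, here is a minimal standalone sketch of the frame built in the `if plot:` branch. The helper name `calibration_frame` and the synthetic series are assumptions made for illustration only; they are not part of this commit, and in the real function `dat`, `distances` and `mask` are computed earlier in `flagPatternByDTW`.

```python
import pandas as pd


def calibration_frame(dat, distances, mask, max_distance=0.0):
    """Hypothetical helper: assemble the three series shown in the plot."""
    df = pd.DataFrame()
    df["data"] = dat
    df["distances"] = distances
    # `max_distance or 1` falls back to 1 when max_distance is 0.0 (falsy),
    # so the indicator does not collapse to a flat zero line.
    df["indicator"] = mask.astype(float) * (max_distance or 1)
    return df


# Synthetic stand-ins for the values computed inside the flagging function.
index = pd.date_range("2021-01-01", periods=6, freq="D")
dat = pd.Series([1.0, 1.2, 5.0, 5.1, 1.1, 0.9], index=index)
distances = pd.Series([0.9, 0.8, 0.1, 0.2, 0.7, 0.9], index=index)
mask = pd.Series([False, False, True, True, False, False], index=index)

frame = calibration_frame(dat, distances, mask, max_distance=0.3)
fallback = calibration_frame(dat, distances, mask, max_distance=0.0)
```

Calling `frame.plot()` (which requires matplotlib) would draw the three lines. The indicator sits at the level of `max_distance` wherever the mask is set, so it can be compared directly with the distances curve when tuning the threshold.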