Miscellaneous

A collection of unrelated quality check functions.

flagRange

flagRange(min, max)
| parameter | data type | default value | description |
| --------- | --------- | ------------- | ----------- |
| min | float | | The lower bound for valid values |
| max | float | | The upper bound for valid values |

The function flags all values outside the closed interval [min, max].
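
The check can be illustrated with a minimal, library-independent sketch. The helper name `flag_range` and the use of a plain list of booleans as the "flags" are assumptions made for illustration only; they are not the library's implementation.

```python
def flag_range(values, lower, upper):
    """Return True for every value outside the closed interval [lower, upper].

    Hypothetical helper: `lower` and `upper` play the role of `min` and `max`.
    """
    return [not (lower <= v <= upper) for v in values]


# Both boundaries belong to the valid interval, so 0.0 and 40.0 stay unflagged.
flags = flag_range([-5.0, 0.0, 12.5, 40.0, 40.1], lower=0.0, upper=40.0)
print(flags)  # [True, False, False, False, True]
```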

flagSeasonalRange

flagSeasonalRange(min, max, startmonth=1, endmonth=12, startday=1, endday=31)
| parameter | data type | default value | description |
| --------- | --------- | ------------- | ----------- |
| min | float | | The lower bound for valid values |
| max | float | | The upper bound for valid values |
| startmonth | integer | 1 | The interval start month |
| endmonth | integer | 12 | The interval end month |
| startday | integer | 1 | The interval start day |
| endday | integer | 31 | The interval end day |

The function does the same as flagRange, but only if the timestamp of the data point lies within a defined interval, which is built from days and months only. In particular, the year is not considered in the interval.

The left boundary is defined by startmonth and startday, the right boundary by endmonth and endday. Both boundaries are inclusive. If the left boundary occurs later in the year than the right boundary, the interval extends over the change of year (e.g. an interval of [01/12, 01/03], given as day/month, covers December, January, February and the first of March).

NOTE: Only works for time-series-like datasets.
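
The interval logic, including the wrap over the change of year, can be sketched as follows. The helper names `in_season` and `flag_seasonal_range` are hypothetical, and comparing (month, day) tuples is one possible way to ignore the year; the library may do this differently.

```python
from datetime import datetime


def in_season(ts, startmonth=1, endmonth=12, startday=1, endday=31):
    """True if the (month, day) of `ts` lies in the inclusive seasonal interval."""
    start, end = (startmonth, startday), (endmonth, endday)
    key = (ts.month, ts.day)
    if start <= end:
        return start <= key <= end
    # The left boundary is later in the year: the interval wraps over New Year.
    return key >= start or key <= end


def flag_seasonal_range(timestamps, values, lower, upper, **season):
    """Range check that is only applied to timestamps inside the season."""
    return [
        in_season(t, **season) and not (lower <= v <= upper)
        for t, v in zip(timestamps, values)
    ]


timestamps = [
    datetime(2020, 11, 30), datetime(2020, 12, 1), datetime(2021, 1, 15),
    datetime(2021, 3, 1), datetime(2021, 3, 2),
]
values = [99.0, 99.0, 5.0, 99.0, 99.0]
seasonal_flags = flag_seasonal_range(
    timestamps, values, 0.0, 40.0,
    startmonth=12, startday=1, endmonth=3, endday=1,
)
print(seasonal_flags)  # [False, True, False, True, False]
```

Out-of-range values on 30 November and 2 March stay unflagged because they fall outside the interval, while the value on 1 March is flagged because the right boundary is inclusive.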

flagIsolated

flagIsolated(window, gap_window, group_window) 
| parameter | data type | default value | description |
| --------- | --------- | ------------- | ----------- |
| gap_window | offset string | | The minimum size of the gap before and after a group of valid values, which makes this group considered as isolated. See conditions (2) and (3). |
| group_window | offset string | | The maximum size of an isolated group of valid data. See condition (1). |

The function flags arbitrarily large groups of values if they are surrounded by sufficiently large data gaps. A gap is defined as a group of missing and/or flagged values.

A continuous group of values $x_{k}, x_{k+1}, ..., x_{k+n}$ with timestamps $t_{k}, t_{k+1}, ..., t_{k+n}$ is considered to be isolated, if:

  1. $t_{k+n} - t_{k} \le$ group_window
  2. None of the values $x_{i}, ..., x_{k-1}$, with $t_{k-1} - t_{i} \ge$ gap_window is valid or unflagged
  3. None of the values $x_{k+n+1}, ..., x_{j}$, with $t_{j} - t_{k+n+1} \ge$ gap_window is valid or unflagged
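
The three conditions can be sketched in plain Python. This is an illustration under stated assumptions, not the library's implementation: the helper `flag_isolated` is hypothetical, windows are passed as `timedelta` objects instead of offset strings, `valid[i]` is `False` for missing or already flagged values, and a group at the very start or end of the series is never treated as isolated because one of its gaps cannot be measured.

```python
from datetime import datetime, timedelta


def flag_isolated(timestamps, valid, gap_window, group_window):
    """Flag groups of valid values that are enclosed by sufficiently long gaps."""
    n = len(valid)
    # Split the series into alternating runs: (is_valid, first_index, last_index).
    runs, start = [], 0
    for i in range(1, n + 1):
        if i == n or valid[i] != valid[start]:
            runs.append((valid[start], start, i - 1))
            start = i

    flags = [False] * n
    for r, (is_valid, first, last) in enumerate(runs):
        if not is_valid or r == 0 or r == len(runs) - 1:
            continue  # gaps themselves and boundary groups are skipped
        # Condition (1): the group must not span more than group_window.
        if timestamps[last] - timestamps[first] > group_window:
            continue
        # Conditions (2) and (3): the gaps on both sides span at least gap_window.
        _, b_first, b_last = runs[r - 1]
        _, a_first, a_last = runs[r + 1]
        before = timestamps[b_last] - timestamps[b_first]
        after = timestamps[a_last] - timestamps[a_first]
        if before >= gap_window and after >= gap_window:
            for i in range(first, last + 1):
                flags[i] = True
    return flags


hours = [datetime(2021, 1, 1) + timedelta(hours=h) for h in range(12)]
valid = [True, True, False, False, False, True, True,
         False, False, False, True, True]
isolated = flag_isolated(hours, valid,
                         gap_window=timedelta(hours=2),
                         group_window=timedelta(hours=1))
print(isolated)  # only the middle group (indices 5 and 6) is flagged
```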

flagMissing

flagMissing(nodata=NaN)
| parameter | data type | default value | description |
| --------- | --------- | ------------- | ----------- |
| nodata | any | NaN | A value that defines missing data |

The function flags all values indicating missing data.
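
A small sketch of the comparison, assuming a hypothetical helper `flag_missing`. NaN needs special handling because it never compares equal to itself; any other `nodata` marker can be matched by equality.

```python
import math


def flag_missing(values, nodata=math.nan):
    """Return True for every value that equals the `nodata` marker."""
    nodata_is_nan = isinstance(nodata, float) and math.isnan(nodata)

    def is_nodata(v):
        if nodata_is_nan:
            # NaN != NaN, so an equality test would never match.
            return isinstance(v, float) and math.isnan(v)
        return v == nodata

    return [is_nodata(v) for v in values]


nan_flags = flag_missing([1.0, math.nan, 3.0])
sentinel_flags = flag_missing([1.0, -9999, 3.0], nodata=-9999)
print(nan_flags)       # [False, True, False]
print(sentinel_flags)  # [False, True, False]
```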

flagPattern

flagPattern(ref_datafield, sample_freq = '15 Min', method = 'dtw', min_distance = None)
| parameter | data type | default value | description |
| --------- | --------- | ------------- | ----------- |
| ref_datafield | string | | Name of the reference data field that contains the pattern |
| sample_freq | string | "15 Min" | Sample frequency used to harmonize the data |
| method | string | "dtw" | "dtw" for Dynamic Time Warping (DTW), "wavelet" for the Wavelet Pattern Recognition Algorithm |
| min_distance | float | None | For the DTW algorithm: the minimum distance between two graphs for them to be classified as "different" |

Implementation of the pattern recognition algorithms introduced in Pattern Recognition.
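
For orientation, the textbook Dynamic Time Warping distance can be sketched in a few lines. This is the classic dynamic-programming formulation with an absolute-difference cost, not necessarily the variant the library uses; the helper `dtw_distance` and the way `min_distance` is applied as a threshold at the end are assumptions based on the parameter description above.

```python
def dtw_distance(a, b):
    """Classic DTW distance between two sequences with |x - y| as local cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] is the cheapest alignment of a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # repeat b[j-1]
                                 cost[i][j - 1],      # repeat a[i-1]
                                 cost[i - 1][j - 1])  # advance both
    return cost[n][m]


pattern = [0, 1, 2, 1, 0]
stretched = [0, 0, 1, 2, 1, 0]   # same shape, slightly stretched in time
flat = [0, 0, 0, 0, 0]

d_stretched = dtw_distance(pattern, stretched)
d_flat = dtw_distance(pattern, flat)

# Assumed reading of min_distance: at or above it, two graphs count as "different".
min_distance = 1.0
is_different = d_flat >= min_distance
```

The stretched copy of the pattern has a distance of zero because DTW may repeat samples to align the two sequences, whereas the flat series cannot be warped onto the peak.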

clearFlags

clearFlags()

The function removes all previously set flags.

forceFlags

forceFlags(flag)
| parameter | data type | default value | description |
| --------- | --------- | ------------- | ----------- |
| flag | float/flagging constant | GOOD | The flag that is set unconditionally |

The function overwrites all previously set flags with the given flag.

flagDummy

flagDummy()

Identity function, i.e. the function does nothing.