Miscellaneous
A collection of unrelated quality check functions.
flagRange
flagRange(min, max)
parameter | data type | default value | description |
---|---|---|---|
min | float | | The lower bound for valid values |
max | float | | The upper bound for valid values |
The function flags all values outside the closed interval [`min`, `max`].
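The range check can be illustrated with a minimal sketch in plain Python. This is not the library's implementation: the helper name `flag_range` and the parameter names `lower` and `upper` (standing in for `min` and `max`) are assumptions made for illustration, and flags are represented as simple booleans.

```python
def flag_range(values, lower, upper):
    """Return one boolean per value; True means the value is flagged."""
    # A value is flagged when it lies outside the closed interval
    # [lower, upper]. The bounds themselves count as valid.
    return [not (lower <= v <= upper) for v in values]


values = [-5.0, 0.0, 12.5, 40.0, 40.1]
flags = flag_range(values, lower=0.0, upper=40.0)
print(flags)  # [True, False, False, False, True]
```

Note that the two boundary values `0.0` and `40.0` stay unflagged because the interval is closed.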
flagSeasonalRange
flagSeasonalRange(min, max, startmonth=1, endmonth=12, startday=1, endday=31)
parameter | data type | default value | description |
---|---|---|---|
min | float | | The lower bound for valid values |
max | float | | The upper bound for valid values |
startmonth | integer | 1 | The interval start month |
endmonth | integer | 12 | The interval end month |
startday | integer | 1 | The interval start day |
endday | integer | 31 | The interval end day |
The function does the same as `flagRange`, but only if the timestamp of the data point lies in a defined interval, which is built from days and months only. In particular, the year is not considered in the interval.
The left boundary is defined by `startmonth` and `startday`, the right boundary by `endmonth` and `endday`. Both boundaries are inclusive. If the left side occurs later in the year than the right side, the interval is extended over the change of year (e.g. an interval of [01/12, 01/03] covers December, January, February and, because the boundaries are inclusive, 1 March).
NOTE: Only works for time-series-like datasets.
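The seasonal interval, including the wrap over the change of year, can be sketched as follows. This is an illustrative reading of the description, not the library's code: the helper names `in_season` and `flag_seasonal_range` and the parameters `lower` and `upper` (standing in for `min` and `max`) are assumptions, and flags are plain booleans.

```python
from datetime import date


def in_season(day, startmonth=1, endmonth=12, startday=1, endday=31):
    """True if `day` lies in the day/month interval; the year is ignored."""
    start = (startmonth, startday)
    end = (endmonth, endday)
    key = (day.month, day.day)
    if start <= end:
        # Ordinary interval within one calendar year, both ends inclusive.
        return start <= key <= end
    # The start lies later in the year than the end: the interval
    # wraps over the change of year.
    return key >= start or key <= end


def flag_seasonal_range(stamps, values, lower, upper, **season):
    """Range check that only applies to timestamps inside the season."""
    return [
        in_season(t, **season) and not (lower <= v <= upper)
        for t, v in zip(stamps, values)
    ]


stamps = [date(2020, 11, 30), date(2020, 12, 1), date(2021, 1, 15),
          date(2021, 2, 1), date(2021, 3, 1), date(2021, 3, 2)]
values = [100.0, 100.0, 100.0, 5.0, 100.0, 100.0]

# Interval [01/12, 01/03]: from 1 December to 1 March, both inclusive.
flags = flag_seasonal_range(stamps, values, lower=0.0, upper=10.0,
                            startmonth=12, startday=1, endmonth=3, endday=1)
print(flags)  # [False, True, True, False, True, False]
```

Comparing `(month, day)` tuples keeps the year out of the decision, which matches the statement that only days and months define the interval.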
flagIsolated
flagIsolated(window, gap_window, group_window)
parameter | data type | default value | description |
---|---|---|---|
gap_window | offset string | | The minimum size of the gap before and after a group of valid values that makes this group count as isolated. See conditions (2) and (3). |
group_window | offset string | | The maximum size of an isolated group of valid data. See condition (1). |
The function flags arbitrarily large groups of values if they are surrounded by sufficiently large data gaps. A gap is defined as a group of missing and/or flagged values.
A continuous group of values $x_{k}, x_{k+1}, ..., x_{k+n}$ with timestamps $t_{k}, t_{k+1}, ..., t_{k+n}$ is considered to be isolated if:
1. $t_{k+n} - t_{k} \le$ `group_window`
2. None of the values $x_{i}, ..., x_{k-1}$, with $t_{k-1} - t_{i} \ge$ `gap_window`, is valid or unflagged
3. None of the values $x_{k+n+1}, ..., x_{j}$, with $t_{j} - t_{k+n+1} \ge$ `gap_window`, is valid or unflagged
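The three conditions can be sketched in plain Python. The sketch below is a simplified reading, not the library's implementation: the helper name `flag_isolated` is hypothetical, windows are passed as `timedelta` objects instead of offset strings, a gap is measured as the distance to the nearest valid neighbour, and the edges of the series are assumed not to count as gaps.

```python
from datetime import datetime, timedelta


def flag_isolated(stamps, valid, gap_window, group_window):
    """stamps: sorted datetimes; valid: parallel booleans
    (False means the value is missing or already flagged)."""
    flags = [False] * len(stamps)

    # Collect maximal runs of consecutive valid values as (first, last).
    groups = []
    start = None
    for i, ok in enumerate(valid):
        if ok and start is None:
            start = i
        if not ok and start is not None:
            groups.append((start, i - 1))
            start = None
    if start is not None:
        groups.append((start, len(valid) - 1))

    for g, (first, last) in enumerate(groups):
        # Condition (1): the group itself is short enough.
        short = stamps[last] - stamps[first] <= group_window
        # Conditions (2) and (3): no valid value within gap_window before
        # and after the group. Series edges do not count as gaps here
        # (an assumption of this sketch).
        gap_before = g > 0 and (
            stamps[first] - stamps[groups[g - 1][1]] >= gap_window)
        gap_after = g < len(groups) - 1 and (
            stamps[groups[g + 1][0]] - stamps[last] >= gap_window)
        if short and gap_before and gap_after:
            for i in range(first, last + 1):
                flags[i] = True
    return flags


t0 = datetime(2021, 1, 1)
stamps = [t0 + timedelta(hours=i) for i in range(14)]
# Valid runs: indices 0-4, index 8, indices 12-13.
valid = [True] * 5 + [False] * 3 + [True] + [False] * 3 + [True] * 2

flags = flag_isolated(stamps, valid,
                      gap_window=timedelta(hours=3),
                      group_window=timedelta(hours=1))
print([i for i, f in enumerate(flags) if f])  # [8]
```

Only the single value at index 8 is flagged: it is short enough and separated from valid data by four hours on both sides, while the first group is too long and the last group touches the end of the series.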
flagMissing
flagMissing(nodata=NaN)
parameter | data type | default value | description |
---|---|---|---|
nodata | any | NaN | A value that defines missing data |
The function flags all values indicating missing data.
flagDTW
flagDTW(refdatafield='SM1', window = 25, min_distance = 0.25, method_dtw = "fast")
parameter | data type | default value | description |
---|---|---|---|
window | int | 25 | The number of data points to be included in each comparison window. |
min_distance | float | 0.5 | The minimum distance of two graphs to be classified as "different". |
method_dtw | string | "fast" | Implementation of the DTW algorithm: "exact" for the normal implementation of DTW, "fast" for the fast implementation. |
ref_datafield | string | | Name of the reference datafield ("correct" values) with which the actual datafield is compared. |
This function compares the data with a reference datafield (given in `ref_datafield`) whose values we assume to be correct. The comparison is window-based, i.e. the two data fields are compared window by window, with overlapping windows. The function flags those values that lie in the middle of a window whose distance to the reference exceeds the minimum distance (given in `min_distance`).
As comparison algorithm, we use the Dynamic Time Warping (DTW) algorithm, which accounts for temporal and spatial offsets when calculating the distance. For a demonstration of DTW, see the Wiki entry "Results for rain data set" in Pattern Recognition with Wavelets.
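The window-based DTW comparison can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the function's actual code: `dtw_distance` is the textbook dynamic-programming formulation (comparable to an "exact" variant), `flag_dtw` is a hypothetical helper, windows advance by one data point, and the raw accumulated distance is compared against `min_distance` without any normalisation.

```python
def dtw_distance(a, b):
    """Textbook DTW: minimal accumulated |a_i - b_j| over all warping paths."""
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor paths.
            cost[i][j] = d + min(cost[i - 1][j],      # repeat b[j-1]
                                 cost[i][j - 1],      # repeat a[i-1]
                                 cost[i - 1][j - 1])  # advance both
    return cost[n][m]


def flag_dtw(data, reference, window=25, min_distance=0.25):
    """Flag the middle value of every window that differs from the reference."""
    flags = [False] * len(data)
    half = window // 2
    # Overlapping windows: each window starts one data point after the last.
    for start in range(len(data) - window + 1):
        stop = start + window
        if dtw_distance(data[start:stop], reference[start:stop]) > min_distance:
            flags[start + half] = True
    return flags


# A pure time shift yields distance 0, because DTW may warp the time axis.
print(dtw_distance([0, 0, 1, 2, 1, 0], [0, 1, 2, 1, 0, 0]))  # 0.0

reference = [float(i) for i in range(10)]
data = list(reference)
data[5] = 20.0  # a single outlier
flags = flag_dtw(data, reference, window=3, min_distance=0.5)
print([i for i, f in enumerate(flags) if f])  # [4, 5, 6]
```

Because every window that contains the outlier exceeds the distance threshold, the middle values of all three affected windows are flagged, not only the outlier itself.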
clearFlags
clearFlags()
The function removes all previously set flags.
forceFlags
forceFlags(flag)
parameter | data type | default value | description |
---|---|---|---|
flag | float/flagging constant | GOOD | The flag that is set unconditionally |
The function overwrites all previously set flags with the given flag.
flagDummy
flagDummy()
Identity function, i.e. the function does nothing.