Newer
Older
Generic flagging functions provide for cross-variable quality
constraints and to implement simple quality checks directly within the configuration.
### Why?
can be explained by the dataset itself. Think of a an active, fan-cooled
measurement device: no matter how precise the instrument may work, problems
are to be expected when the fan stops working or the power supply
drops below a certain threshold. While these dependencies are easy to
[formalize](#a-real-world-example) on a per dataset basis, it is quite
### Specification
Generic flagging functions are used in the same manner as their
[non-generic counterparts](docs/FunctionIndex.md). The basic
flagGeneric(func=<expression>, flag=<flagging_constant>)
```
where `<expression>` is composed of the [supported constructs](#supported-constructs)
and `<flag_constant>` is one of the predefined
[flagging constants](ParameterDescriptions.md#flagging-constants) (default: `BAD`).
Generic flagging functions are expected to return a boolean value, i.e. `True` or `False`. All other expressions will
Flag all values of `x` where `y` falls below 0.
##### Configuration file
```
varname ; test
#-------;------------------------
x ; flagGeneric(func=y < 0)
```
Flag all values of `x` that exceed 3 standard deviations of `y`.
##### Configuration file
```
varname ; test
#-------;---------------------------------
x ; flagGeneric(func=x > std(y) * 3)
```
#### Special functions
Flag all values of `x` where: `y` is flagged and `z` has missing values.
##### Configuration file
```
varname ; test
#-------;----------------------------------------------
x ; flagGeneric(func=isflagged(y) & ismissing(z))
```
| date | meas | fan | volt |
|------------------|------|-----|------|
| 2018-06-01 12:00 | 3.56 | 1 | 12.1 |
| 2018-06-01 12:10 | 4.7 | 0 | 12.0 |
| 2018-06-01 12:20 | 0.1 | 1 | 11.5 |
| 2018-06-01 12:30 | 3.62 | 1 | 12.1 |
| ... | | | |
There are various options. We can directly implement the condition as follows:
```
varname ; test
#-------;-----------------------------------------------
meas ; flagGeneric(func=(fan == 0) \| (volt < 12.0))
```
But we could also quality check our independent variables first
and than leverage this information later on:
```
varname ; test
#-------;----------------------------------------------------
'.*' ; flagMissing()
fan ; flagGeneric(func=fan == 0)
volt ; flagGeneric(func=volt < 12.0)
meas ; flagGeneric(func=isflagged(fan) \| isflagged(volt))
```
## Generic Processing
Generic processing functions provide a way to evaluate mathmetical operations
and functions on the variables of a given dataset.
### Why
In many real-world use cases, quality control is embedded into a larger data
processing pipeline and it is not unusual to even have certain processing
requirements as a part of the quality control itself. Generic processing
functions make it easy to enrich a dataset through the evaluation of a
given expression.
### Specification
The basic signature looks like that:
```sh
procGeneric(func=<expression>)
```
where `<expression>` is composed of the [supported constructs](#supported-constructs).
All variables of the processed dataset are available within generic functions,
so arbitrary cross references are possible. The variable of interest
is furthermore available with the special reference `this`, so the second
[example](#calculations) could be rewritten as:
```
varname ; test
#-------;------------------------------------
x ; flagGeneric(func=this > std(y) * 3)
```
When referencing other variables, their flags will be respected during evaluation
of the generic expression. So, in the example above only values of `x` and `y`, that
are not already flagged with `BAD` will be used the avaluation of `x > std(y)*3`.
## Supported constructs
### Operators
The following comparison operators are available:
| Operator | Description |
|----------|----------------------------------------------------------------------------------------------------|
| `==` | `True` if the values of the operands are equal |
| `!=` | `True` if the values of the operands are not equal |
| `>` | `True` if the values of the left operand are greater than the values of the right operand |
| `<` | `True` if the values of the left operand are smaller than the values of the right operand |
| `>=` | `True` if the values of the left operand are greater or equal than the values of the right operand |
| `<=` | `True` if the values of the left operand are smaller or equal than the values of the right operand |
The following arithmetic operators are supported:
| Operator | Description |
|----------|----------------|
| `+` | addition |
| `*` | multiplication |
| `/` | division |
The bitwise operators also act as logical operators in comparison chains
| Operator | Description |
|----------|-------------------|
| `&` | binary and |
| `^` | binary xor |
| `~` | binary complement |
### Functions
All functions expect a [variable reference](#variable-references)
as the only non-keyword argument (see [here](#special-functions))
| Name | Description |
|-------------|-----------------------------------|
| `abs` | absolute values of a variable |
| `max` | maximum value of a variable |
| `min` | minimum value of a variable |
| `mean` | mean value of a variable |
| `sum` | sum of a variable |
| `std` | standard deviation of a variable |
| `len` | the number of values for variable |
#### Special Functions
| Name | Description |
|-------------|-----------------------------------|
| `ismissing` | check for missing values |
| `isflagged` | check for flags |
### Constants
Generic functions support the same constants as normal functions, a detailed
list is available [here](ParameterDescriptions.md#constants).