Skip to content
Snippets Groups Projects

Generic Functions

Generic Functions provide a way to leverage cross-variable conditions and to implement simple quality checks directly within the configuration.

Why?

The underlying idea is, that in most real world datasets many errors can be explained by the dataset itself. Think of a an active, fan-cooled measurement device: no matter how precise the instrument may work, problems are to expected when the fan stop working or the battery voltage drops below a certain threshold. While these dependencies are easy to formalize on a per dataset basis, it is quite challenging to translate them into general purpose source code.

Specification

Generic functions are used in the same manner as their non-generic counterparts. The basic signature looks like that:

flagGeneric(func=<expression>, flag=<flagging_constant>)

where <expression> is composed of the supported constructs and <flag_constant> is one of the predefined flagging constants (default: BAD)

Examples

Simple comparisons

Task

Flag all values of variable x when variable y falls below a certain threshold

Configuration file

varname test
x flagGeneric(func=y < 0)

Calculations

Task

Flag all values of variable x that exceed 3 standard deviations of variable y

Configuration file

varname test
x flagGeneric(func=this > std(y) * 3)

Special functions

Task

Flag variable x where variable y is flagged and variable x has missing values

Configuration file

varname test
x flagGeneric(func=isflagged(y) & ismissing(z))

A real world example

Let's consider a dataset like the following:

date meas fan volt
2018-06-01 12:00 3.56 1 12.1
2018-06-01 12:10 4.7 0 12.0
2018-06-01 12:20 0.1 1 11.5
2018-06-01 12:30 3.62 1 12.1
...

Task

Flag variable meas where variable fan equals 0 and variable volt is lower than 12.0.

Configuration file

We can directly implement the condition as follows:

varname test
meas flagGeneric(func=(fan == 0) | (volt < 12.0))

But we could also quality check our independent variables first and than leverage this information later on:

varname test
* missing()
fan flagGeneric(func=this == 0)
volt flagGeneric(func=this < 12.0)
meas flagGeneric(func=isflagged(fan) | isflagged(volt))

Variable References

All variables of the processed dataset are available within generic functions, so arbitrary cross references are possible. The variable of interest is furthermore available with the special reference this, so the second example could be rewritten as:

varname test
x flagGeneric(func=x > std(y) * 3)

When referencing other variables, their flags will be respected during evaluation of the generic expression. So, in the example above only previously unflagged values of x and y are used within the expression x > std(y)*3.

Supported constructs

Operators

Comparison

The following comparison operators are available:

Operator Description
== True if the values of the operands are equal
!= True if the values of the operands are not equal
> True if the values of the left operand are greater than the values of the right operand
< True if the values of the left operand are smaller than the values of the right operand
>= True if the values of the left operand are greater or equal than the values of the right operand
<= True if the values of the left operand are smaller or equal than the values of the right operand

Arithmetic

The following arithmetic operators are supported:

Operator Description
+ addition
- subtraction
* multiplication
/ division
** exponentiation
% modulus

Bitwise

The bitwise operators also act as logical operators in comparison chains

Operator Description
& binary and
| binary or
^ binary xor
~ binary complement

Functions

All functions expect a variable reference as the only non-keyword argument (see here)

Name Description
abs absolute values of a variable
max maximum value of a variable
min minimum value of a variable
mean mean value of a variable
sum sum of a variable
std standard deviation of a variable
len the number of values for variable
ismissing check for missing values
isflagged check for flags

Constants

Generic functions support the same constants as normal functions, a detailed list is available here.