Global Keywords

Introduction to the usage of the global keywords. (Keywords that can be passed to any :py:class:`saqc.SaQC` method.)

  1. Set Up
  2. label keyword
  3. dfilter and flag keywords

Set Up

Flagging Scheme Constraint

This tutorial currently works only if the :py:class:`~saqc.SaQC` object is instantiated with the default :ref:`flagging scheme <FlagsHistoryTranslations>`, which is the :py:class:`~saqc.core.FloatScheme`.

Example Data

Let's generate some example data and plot it:

>>> import pandas as pd
>>> import numpy as np
>>> import saqc
>>> noise = np.random.normal(0, 1, 200) # some normally distributed noise
>>> data = pd.Series(noise, index=pd.date_range('2020','2021',periods=200), name='data') # index the noise with some dates
>>> data.iloc[20] = 16 # add some artificial anomalies:
>>> data.iloc[100] = -17
>>> data.iloc[160:180] = -3
>>> qc = saqc.SaQC(data)
>>> qc.plot('data') #doctest:+SKIP

Label Keyword

The label keyword can be passed to any function call. It sets the label that a subsequent call to :py:meth:`saqc.SaQC.plot` displays for the flags set by that function.

It is especially useful for enriching figures with custom context information, and for making the results of different function calls distinguishable by purpose and parameterisation. Consider the following example.

First, we apply some flagging functions to mark anomalies without using the label keyword:

>>> qc = qc.flagRange('data', max=15)
>>> qc = qc.flagRange('data', min=-16)
>>> qc = qc.flagConstants('data', window='2D', thresh=0)
>>> qc = qc.flagManual('data', mdata=pd.Series('2020-05', index=pd.DatetimeIndex(['2020-03'])))
>>> qc.plot('data') # doctest:+SKIP

In the above plot, one might want to distinguish the results of the two calls to :py:meth:`saqc.SaQC.flagRange` by the parameters they were called with. One might also want to give some hint about the context of the flags set "manually" by the call to :py:meth:`saqc.SaQC.flagManual`. Let's repeat the procedure and add this information to the calls by using the label keyword:

Label Example Usage

>>> qc = saqc.SaQC(data)
>>> qc = qc.flagRange('data', max=15, label='values > 15')
>>> qc = qc.flagRange('data', min=-16, label='values < -16')
>>> qc = qc.flagConstants('data', window='2D', thresh=0, label='values constant longer than 2 days')
>>> qc = qc.flagManual('data', mdata=pd.Series('2020-05', index=pd.DatetimeIndex(['2020-03'])), label='values collected during sensor maintenance')
>>> qc.plot('data') # doctest:+SKIP

dfilter and flag keywords

The flag keyword controls the flag level f(v) that a test assigns to any value v it flags. In short, it controls the output flag level of a flagging function.

The dfilter keyword controls the threshold at and above which an already flagged value is masked when the data is passed to a function. In short, it controls the input threshold that decides which flagged values remain visible to a function operating on the values.

In more detail: any value v with a flag f(v) is masked if f(v) >= dfilter. A masked value appears as NaN (not a number, or missing) to the flagging function and is treated numerically as such. (This means it is excluded from most arithmetic calculations, but may implicitly take part in operations such as count(NaN) or isnan.)

Let's first visualize this interplay with the :py:meth:`saqc.SaQC.plot` method, reusing the data and code from the Example Data section. First, we set some flags on the data. As pointed out in Flagging Scheme Constraint, we are referring to :py:class:`saqc.SaQC` objects instantiated with the defaults, which use the :py:class:`~saqc.core.FloatScheme` (a real-valued scale of flag levels, ranging from -inf to 255.0):

>>> qc = saqc.SaQC(data)
>>> qc = qc.flagRange('data', max=15, label='flaglevel=200', flag=200)
>>> qc = qc.flagRange('data', min=-16, label='flaglevel=100', flag=100)
>>> qc = qc.flagManual('data', mdata=pd.Series('2020-05', index=pd.DatetimeIndex(['2020-03'])), label='flaglevel=0', flag=0)
>>> qc.plot('data') # doctest:+SKIP

With the dfilter keyword, we can now control which of the flags are passed on to the plot function. For example, if we set dfilter=50, the flags set by the two :py:meth:`saqc.SaQC.flagRange` calls (levels 200 and 100) are not passed on, so they no longer appear in the resulting plot:

>>> qc.plot('data', dfilter=50) # doctest:+SKIP
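
The masking rule stated in this section (a value is hidden if its flag is at or above dfilter) can be sketched in plain Python, independent of saqc. The helper `visible_values` is hypothetical and not part of the saqc API; it only mimics the rule as described here, using the flag levels 200, 100 and 0 from the example above.

```python
import math

UNFLAGGED = -math.inf  # lowest level of the float scale, as stated in the text


def visible_values(values, flags, dfilter):
    """Hypothetical helper: replace every value whose flag is >= dfilter by NaN."""
    return [math.nan if f >= dfilter else v for v, f in zip(values, flags)]


values = [16.0, -17.0, 0.3, 1.2]
flags = [200.0, 100.0, 0.0, UNFLAGGED]

masked = visible_values(values, flags, dfilter=50)
print(masked)  # [nan, nan, 0.3, 1.2]
```

Only the values flagged with 200 and 100 are hidden; the value flagged with level 0 and the unflagged value stay visible, because their flags are below the threshold of 50.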

Flags of Different Significance

We can also use the interplay between the dfilter and flag keywords to prioritize flags. By default, the dfilter keyword is set to the highest flag value of the instantiated :ref:`flagging scheme <FlagsHistoryTranslations>`, referred to as :py:attr:`~saqc.constants.BAD`. Since the flag set by a test also defaults to :py:attr:`~saqc.constants.BAD`, the second call to :py:meth:`saqc.SaQC.flagRange` in the example below is not passed the values already flagged by the first call. It therefore cannot check those values and assigns no additional flag of its own.

>>> qc = saqc.SaQC(data)
>>> qc = qc.flagRange('data', max=15, label='value > 15')
>>> qc = qc.flagRange('data', max=0, label='value > 0')
>>> qc.plot('data') # doctest:+SKIP

We can have the value flagged by both flagging functions by raising the dfilter threshold of the second call above the default flag level of :py:attr:`~saqc.constants.BAD`. This can be achieved by passing the constant :py:attr:`~saqc.constants.FILTER_NONE`:

>>> from saqc.constants import FILTER_NONE
>>> qc = saqc.SaQC(data)
>>> qc = qc.flagRange('data', max=15, label='value > 15')
>>> qc = qc.flagRange('data', max=0, label='value > 0', dfilter=FILTER_NONE)
>>> qc.plot('data') # doctest:+SKIP
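
The difference between the two runs above can be illustrated with a toy model in plain Python. This is a sketch under stated assumptions, not saqc's implementation: `range_test` and `run` are hypothetical helpers, `BAD = 255.0` follows the scale given in this tutorial, and `FILTER_NONE` is assumed to behave like a threshold that no flag can reach (modelled here as +inf).

```python
import math

BAD = 255.0             # highest flag level of the float scale (from the text)
UNFLAGGED = -math.inf   # lowest flag level (from the text)
FILTER_NONE = math.inf  # assumption: a threshold no flag level can reach


def range_test(values, flags, labels, limit, label, flag=BAD, dfilter=BAD):
    """Toy stand-in for a range check: flags values above `limit` in place."""
    for i, value in enumerate(values):
        if flags[i] >= dfilter:
            continue  # masked: the test never sees this value
        if value > limit:
            flags[i] = flag
            labels[i].append(label)


def run(second_dfilter):
    """Apply two range tests in sequence and return the labels per value."""
    values = [16.0, 0.4, -1.0]
    flags = [UNFLAGGED] * len(values)
    labels = [[] for _ in values]
    range_test(values, flags, labels, limit=15, label="value > 15")
    range_test(values, flags, labels, limit=0, label="value > 0",
               dfilter=second_dfilter)
    return labels


default_labels = run(BAD)
unfiltered_labels = run(FILTER_NONE)
print(default_labels)     # [['value > 15'], ['value > 0'], []]
print(unfiltered_labels)  # [['value > 15', 'value > 0'], ['value > 0'], []]
```

With the default threshold, the first value is hidden from the second test and carries one label only; with the filter deactivated, both tests see it and both labels are recorded.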

Unflagging Values

With the flag keyword, it is also possible to revoke a flag from a value (to unflag it). This makes it possible to tie flags to conditions determined by other functions. For example, if we want to flag all values below a level of 0.5, except those that belong to a course of constant values, we can do so by combining the flag and dfilter keywords. Let's first flag all the data below a level of 0.5:

>>> qc = saqc.SaQC(data)
>>> qc = qc.flagRange('data', min=0.5)
>>> qc.plot('data') #doctest:+SKIP

Now we can override the flags of the constant value course with the lowest (unflagged) flag level, which for the :py:class:`~saqc.core.FloatScheme` is the value -np.inf. Instead of the explicit value, we can use the :py:attr:`~saqc.constants.UNFLAGGED` constant. For the override to work, we also have to raise (or deactivate) the input filter, so that the :py:meth:`saqc.SaQC.flagConstants` method is passed the already flagged values and can test them.

>>> from saqc.constants import UNFLAGGED, FILTER_NONE
>>> qc = qc.flagConstants('data', window='2D', thresh=0, dfilter=FILTER_NONE, flag=UNFLAGGED)
>>> qc.plot('data') #doctest:+SKIP
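
The same two-step logic can be traced on a small list in plain Python. The sketch below is illustrative only: `flag_below` and `flag_constant_runs` are hypothetical helpers rather than saqc functions, the constant check is reduced to "runs of at least `min_len` equal values", and `FILTER_NONE` is again assumed to act like +inf.

```python
import math

BAD = 255.0             # highest flag level of the float scale (from the text)
UNFLAGGED = -math.inf   # lowest flag level (from the text)
FILTER_NONE = math.inf  # assumption: a threshold no flag level can reach


def flag_below(values, flags, limit, flag=BAD, dfilter=BAD):
    """Toy range check: assign `flag` to visible values below `limit`."""
    return [flag if f < dfilter and v < limit else f
            for v, f in zip(values, flags)]


def flag_constant_runs(values, flags, min_len, flag=BAD, dfilter=BAD):
    """Toy constants check: assign `flag` to visible runs of equal values."""
    out = list(flags)
    start = 0
    for i in range(1, len(values) + 1):
        # close the current run when the value changes or the data ends
        if i == len(values) or values[i] != values[start]:
            run = range(start, i)
            visible = all(flags[j] < dfilter for j in run)
            if len(run) >= min_len and visible:
                out[start:i] = [flag] * len(run)
            start = i
    return out


values = [1.0, -3.0, -3.0, -3.0, 0.2, 2.0]
flags = flag_below(values, [UNFLAGGED] * len(values), limit=0.5)

# Default filter: the constant run is already flagged BAD, hence masked.
hidden = flag_constant_runs(values, flags, min_len=3, flag=UNFLAGGED)
# Deactivated filter: the run is visible and its flags are revoked.
revoked = flag_constant_runs(values, flags, min_len=3, flag=UNFLAGGED,
                             dfilter=FILTER_NONE)
print(revoked)  # [-inf, -inf, -inf, -inf, 255.0, -inf]
```

With the default filter, `hidden` is identical to `flags`, which is why the input filter has to be raised before the override can take effect.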