Global Keywords
Introduction to the usage of the global keywords. (Keywords that can be passed to any :py:class:`saqc.SaQC` method.)
Set Up
Flagging Scheme Constraint
The Tutorial currently only works when instantiating an :py:class:`~saqc.SaQC` object with the default :ref:`flagging scheme <FlagsHistoryTranslations>`, which is the :py:class:`~saqc.core.FloatScheme`.
Example Data
Let's generate some example data and plot it:
>>> import pandas as pd
>>> import numpy as np
>>> import saqc
>>> noise = np.random.normal(0, 1, 200) # some normally distributed noise
>>> data = pd.Series(noise, index=pd.date_range('2020','2021',periods=200), name='data') # index the noise with some dates
>>> data.iloc[20] = 16 # add some artificial anomalies:
>>> data.iloc[100] = -17
>>> data.iloc[160:180] = -3
>>> qc = saqc.SaQC(data)
>>> qc.plot('data') #doctest:+SKIP
Label Keyword
The ``label`` keyword can be passed with any function call and serves as the label to be plotted by a subsequent
call to :py:meth:`saqc.SaQC.plot`.
It is especially useful for enriching figures with custom context information, and for making results from different function calls distinguishable with respect to their purpose and parameterisation. Check out the following example:
First, we apply some flagging functions to mark anomalies without using the ``label`` keyword:
>>> qc = qc.flagRange('data', max=15)
>>> qc = qc.flagRange('data', min=-16)
>>> qc = qc.flagConstants('data', window='2D', thresh=0)
>>> qc = qc.flagManual('data', mdata=pd.Series('2020-05', index=pd.DatetimeIndex(['2020-03'])))
>>> qc.plot('data') # doctest:+SKIP
In the above plot, one might want to tell apart the two results from the calls to :py:meth:`saqc.SaQC.flagRange` by the parameters they were called with. One might also want to give some hint about the context of the flags determined "manually" by the call to :py:meth:`saqc.SaQC.flagManual`. Let's repeat the procedure and enrich the calls with this information by making use of the ``label`` keyword:
Label Example Usage
>>> qc = saqc.SaQC(data)
>>> qc = qc.flagRange('data', max=15, label='values > 15')
>>> qc = qc.flagRange('data', min=-16, label='values < -16')
>>> qc = qc.flagConstants('data', window='2D', thresh=0, label='values constant longer than 2 days')
>>> qc = qc.flagManual('data', mdata=pd.Series('2020-05', index=pd.DatetimeIndex(['2020-03'])), label='values collected during sensor maintenance')
>>> qc.plot('data') # doctest:+SKIP
dfilter and flag keywords
The ``flag`` keyword controls a test's level of flagging ``f(v)`` for any value ``v``. In short, it sets the output flag level of any flagging function.
The ``dfilter`` keyword controls the threshold from which a flagged value is masked when it is passed on to any flagging function. In short, it sets the input threshold: only values with flags below it are visible to a function that operates on the values.
In more detail: any value ``v`` with a flag ``f(v)`` is masked if ``f(v) >= dfilter``. A masked value appears as ``NaN`` (not a number, or missing) to the flagging function and is treated numerically as such.
(This means it is excluded from most arithmetic calculations, but may implicitly take part in operations such as ``count(NaN)`` or ``isnan``.)
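The masking rule can be sketched in a few lines of plain Python. This is only an illustration of the rule ``f(v) >= dfilter`` as stated above, not saqc's implementation; the helper ``mask_flagged`` is hypothetical, and the constants are written out by hand using the float scheme's range described in this tutorial.

```python
import math

UNFLAGGED = -math.inf  # lowest flag level of the float scheme (per this tutorial)
BAD = 255.0            # highest flag level of the float scheme (per this tutorial)


def mask_flagged(values, flags, dfilter):
    """Hypothetical helper: replace every value whose flag is >= dfilter by NaN."""
    return [math.nan if f >= dfilter else v for v, f in zip(values, flags)]


values = [0.3, 16.0, -0.8, -17.0]
flags = [UNFLAGGED, BAD, UNFLAGGED, 100.0]

# With the threshold at BAD, only the value flagged BAD is hidden.
masked_at_bad = mask_flagged(values, flags, dfilter=BAD)
visible_at_bad = [v for v in masked_at_bad if not math.isnan(v)]

# Lowering the threshold to 50 also hides the value flagged with level 100.
masked_at_50 = mask_flagged(values, flags, dfilter=50.0)
visible_at_50 = [v for v in masked_at_50 if not math.isnan(v)]
```

Lowering ``dfilter`` hides more values from a function, raising it hides fewer.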
Let's first visualize this interplay with the :py:meth:`saqc.SaQC.plot` method. (We are reusing data and code
from the Example Data section.) First, we set some flags on the data. As pointed out in
Flagging Scheme Constraint, we are referring to :py:class:`saqc.SaQC` objects instantiated with the defaults, which use the
:py:class:`~saqc.core.FloatScheme` (a real-valued scale of flag levels ranging from ``-inf`` to ``255.0``):
>>> qc = saqc.SaQC(data)
>>> qc = qc.flagRange('data', max=15, label='flaglevel=200', flag=200)
>>> qc = qc.flagRange('data', min=-16, label='flaglevel=100', flag=100)
>>> qc = qc.flagManual('data', mdata=pd.Series('2020-05', index=pd.DatetimeIndex(['2020-03'])), label='flaglevel=0', flag=0)
>>> qc.plot('data') # doctest:+SKIP
With the ``dfilter`` keyword, we can now control which of the flags are passed on to the plot function.
For example, if we set ``dfilter=50``, the flags set by the :py:meth:`saqc.SaQC.flagRange` method won't get passed on,
and thus the resulting plot will be cleared of those flags:
>>> qc.plot('data', dfilter=50) # doctest:+SKIP
Flags of Different Significance
We can also use the interplay between the ``dfilter`` keyword and the ``flag`` keyword to prioritize flags.
By default, the ``dfilter`` keyword is set to the highest flag value of the instantiated
:ref:`flagging scheme <FlagsHistoryTranslations>`, referred to as :py:attr:`~saqc.constants.BAD`.
Since the flag set by a test also defaults to :py:attr:`~saqc.constants.BAD`, the second call
to :py:meth:`saqc.SaQC.flagRange` in the example below won't be passed the values already flagged by the first call to
:py:meth:`saqc.SaQC.flagRange`, so it cannot check their level and assigns no additional flag of its own.
>>> qc = saqc.SaQC(data)
>>> qc = qc.flagRange('data', max=15, label='value > 15')
>>> qc = qc.flagRange('data', max=0, label='value > 0')
>>> qc.plot('data') # doctest:+SKIP
We can make the value get flagged by both flagging functions by raising the ``dfilter``
threshold of the second function call above the default flag level of
:py:attr:`~saqc.constants.BAD`. This can be achieved by passing the flagging constant
:py:attr:`~saqc.constants.FILTER_NONE`:
>>> from saqc.constants import FILTER_NONE
>>> qc = saqc.SaQC(data)
>>> qc = qc.flagRange('data', max=15, label='value > 15')
>>> qc = qc.flagRange('data', max=0, label='value > 0', dfilter=FILTER_NONE)
>>> qc.plot('data') # doctest:+SKIP
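To see why the second range check behaves differently in the two runs above, here is a toy model in plain Python. It is a sketch under assumptions, not saqc code: ``flag_above`` is a hypothetical stand-in for a range test, and ``FILTER_NONE`` is modelled as positive infinity, a threshold that no flag level can reach.

```python
import math

UNFLAGGED = -math.inf
BAD = 255.0
FILTER_NONE = math.inf  # assumption: a threshold that no flag level can reach


def flag_above(values, flags, upper, dfilter=BAD, flag=BAD):
    """Hypothetical range test: flag every *visible* value greater than `upper`.

    Returns one entry per value: the flag assigned by this call, or None.
    """
    assigned = []
    for v, f in zip(values, flags):
        seen = math.nan if f >= dfilter else v  # masking step on the input
        # Any comparison with NaN is False, so masked values are never flagged.
        assigned.append(flag if seen > upper else None)
    return assigned


values = [0.4, 16.0, 2.5, -1.0]
flags = [UNFLAGGED] * len(values)

# First test: flags only the spike at 16.0.
first = flag_above(values, flags, upper=15)
flags = [a if a is not None else f for a, f in zip(first, flags)]

# Second test with the default filter: the spike is masked and gets skipped.
second_default = flag_above(values, flags, upper=0)

# Second test with the filter deactivated: the spike is visible and flagged again.
second_unfiltered = flag_above(values, flags, upper=0, dfilter=FILTER_NONE)
```

In this model, ``second_default`` holds ``None`` for the spike while ``second_unfiltered`` holds ``BAD`` for it, which mirrors the difference between the two plots.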
Unflagging Values
With the ``flag`` keyword it is also possible to revoke a flag from a value, that is, to unflag it.
This way, it is possible to associate flags with conditions determined by other functions.
For example, if we want to flag all values below a level of 0.5, but not those that belong to a constant value
course, we can achieve that by combining the ``flag`` and the ``dfilter`` keywords.
Let's first flag all the data below a level of 0.5:
>>> qc = saqc.SaQC(data)
>>> qc = qc.flagRange('data', min=0.5)
>>> qc.plot('data') #doctest:+SKIP
Now we can override the flags for the constant value course with the lowest (unflagged) flag level, which, for the
:py:class:`~saqc.core.FloatScheme`, is the value ``-np.inf``. As an alternative to the explicit value, we can use the
:py:attr:`~saqc.constants.UNFLAGGED` constant.
Also, for the override to work, we have to raise (or deactivate) the input filter, so that the :py:meth:`saqc.SaQC.flagConstants` method
gets passed the already flagged values and can test them.
>>> from saqc.constants import UNFLAGGED, FILTER_NONE
>>> qc = qc.flagConstants('data', window='2D', thresh=0, dfilter=FILTER_NONE, flag=UNFLAGGED)
>>> qc.plot('data') #doctest:+SKIP
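The two-step procedure above can be mimicked on a small list to make the resulting flags explicit. This is a minimal sketch, assuming a simplified constant detection that compares direct neighbours; ``in_constant_run`` is a hypothetical helper and does not reproduce the windowed logic of :py:meth:`saqc.SaQC.flagConstants`.

```python
import math

UNFLAGGED = -math.inf
BAD = 255.0

values = [0.9, -0.2, -3.0, -3.0, -3.0, 0.1]

# Step 1: flag everything below 0.5 with the default flag level.
flags = [BAD if v < 0.5 else UNFLAGGED for v in values]


def in_constant_run(values, i):
    """Hypothetical check: does values[i] equal one of its direct neighbours?"""
    left = i > 0 and values[i - 1] == values[i]
    right = i + 1 < len(values) and values[i + 1] == values[i]
    return left or right


# Step 2: look at ALL values, including the flagged ones (the equivalent of
# deactivating the input filter), and reset the flags of the constant run.
flags = [
    UNFLAGGED if in_constant_run(values, i) else f
    for i, f in enumerate(flags)
]
```

Step 2 only works because it inspects the unmasked values. With masking at ``BAD``, the run of ``-3.0`` values would appear as ``NaN`` and could not be recognised as constant.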