Compare revisions

Commits on Source (169)
Showing with 329 additions and 190 deletions
......@@ -29,7 +29,7 @@ jobs:
fail-fast: false
matrix:
os: ["windows-latest", "ubuntu-latest", "macos-latest"]
python-version: ["3.9", "3.10", "3.11"]
python-version: ["3.9", "3.10", "3.11", "3.12"]
defaults:
run:
# somehow this also works for windows O.o ??
......
......@@ -30,11 +30,13 @@ stages:
- deploy
default:
image: python:3.10
image: python:3.11
before_script:
- pip install --upgrade pip
- pip install -r requirements.txt
- pip install -r tests/requirements.txt
- apt update
- apt install -y xvfb
# ===========================================================
# Compliance stage
......@@ -75,8 +77,10 @@ coverage:
stage: test
allow_failure: true
script:
- export DISPLAY=:99
- Xvfb :99 &
- pip install pytest-cov coverage
- pytest --cov=saqc tests --ignore=tests/fuzzy -Werror
- pytest --cov=saqc tests --ignore=tests/fuzzy tests/extras -Werror
after_script:
- coverage xml
# regex to find the coverage percentage in the job output
......@@ -93,7 +97,9 @@ python39:
stage: test
image: python:3.9
script:
- pytest tests -Werror --junitxml=report.xml
- export DISPLAY=:99
- Xvfb :99 &
- pytest tests -Werror --junitxml=report.xml --ignore=tests/extras
- python -m saqc --config docs/resources/data/config.csv --data docs/resources/data/data.csv --outfile /tmp/test.csv
artifacts:
when: always
......@@ -105,7 +111,9 @@ python310:
stage: test
image: python:3.10
script:
- pytest tests -Werror --junitxml=report.xml
- export DISPLAY=:99
- Xvfb :99 &
- pytest tests -Werror --junitxml=report.xml --ignore=tests/extras
- python -m saqc --config docs/resources/data/config.csv --data docs/resources/data/data.csv --outfile /tmp/test.csv
artifacts:
when: always
......@@ -116,7 +124,22 @@ python311:
stage: test
image: python:3.11
script:
- pytest tests -Werror --junitxml=report.xml
- export DISPLAY=:99
- Xvfb :99 &
- pytest tests -Werror --junitxml=report.xml --ignore=tests/extras
- python -m saqc --config docs/resources/data/config.csv --data docs/resources/data/data.csv --outfile /tmp/test.csv
artifacts:
when: always
reports:
junit: report.xml
python312:
stage: test
image: python:3.12
script:
- export DISPLAY=:99
- Xvfb :99 &
- pytest tests -Werror --junitxml=report.xml --ignore=tests/extras
- python -m saqc --config docs/resources/data/config.csv --data docs/resources/data/data.csv --outfile /tmp/test.csv
artifacts:
when: always
......@@ -125,6 +148,8 @@ python311:
doctest:
stage: test
variables:
COLUMNS: 200
script:
- cd docs
- pip install -r requirements.txt
......@@ -170,6 +195,16 @@ wheel311:
- pip install .
- python -c 'import saqc; print(f"{saqc.__version__=}")'
wheel312:
stage: build
image: python:3.12
variables:
PYPI_PKG_NAME: "saqc-dev"
script:
- pip install wheel
- pip wheel .
- pip install .
- python -c 'import saqc; print(f"{saqc.__version__=}")'
# ===========================================================
# Extra Pipeline (run with a successful run of all other jobs on develop)
......
......@@ -6,15 +6,61 @@ SPDX-License-Identifier: GPL-3.0-or-later
# Changelog
## Unreleased
[List of commits](https://git.ufz.de/rdm-software/saqc/-/compare/v2.5.0...develop)
[List of commits](https://git.ufz.de/rdm-software/saqc/-/compare/v2.6.0...develop)
### Added
- `flagPlateaus`: added function to search and flag outlierish value plateaus of certain temporal extension
- `flagUniLOF`: added dispatch to Local Outlier Probability (*LoOP*) variant
- `flagUniLOF`: made `thresh` Optional
- `flagPlateaus`: added function to search and flag anomalous value plateaus of certain temporal extension
### Changed
### Removed
### Fixed
- `flagConstants`: fixed bug where last `min_periods` will never get flagged
### Deprecated
## [2.6.0](https://git.ufz.de/rdm-software/saqc/-/tags/v2.6.0) - 2024-04-15
[List of commits](https://git.ufz.de/rdm-software/saqc/-/compare/v2.5.0...v2.6.0)
### Added
- `reindex`: base reindexer function
- `flagGeneric`, `processGeneric`: target broadcasting and numpy array support
- `SaQC`: automatic translation of incoming flags
- Option to change the flagging scheme after initialization
- `flagByClick`: manually assign flags using a graphical user interface
- `SaQC`: support for selection, slicing and setting of items by subscription on `SaQC` objects
- `transferFlags` is a multivariate function
- `plot`: added `yscope` keyword
- `setFlags`: function to replace `flagManual`
- `flagUniLOF`: added parameter `slope_correct` to correct for overflagging at relatively steep data value slopes
- `History`: added option to change aggregation behavior
- "horizontal" axis / multivariate mode for `rolling`
- Translation scheme `AnnotatedFloatScheme`
### Changed
- `SaQC.flags` always returns a `DictOfSeries`
### Removed
- `SaQC` methods deprecated in version 2.4: `interpolate`, `interpolateIndex`, `interpolateInvalid`, `roll`, `linear`, `shift`, `flagCrossStatistics`
- Method `Flags.toDios` deprecated in version 2.4
- Method `DictOfSeries.index_of` deprecated in version 2.4
- Option `"complete"` for parameter `history` of method `plot`
- Option `"cycleskip"` for parameter `ax_kwargs` of method `plot`
- Parameter `phaseplot` from method `plot`
### Fixed
- `flagConstants`: fixed flagging of rolling ramps
- `Flags`: add meta entry to imported flags
- group operations were overwriting existing flags
- `SaQC._construct` : was not working for inherited classes
- `processGeneric`: improved numpy function compatibility
### Deprecated
- `flagManual` in favor of `setFlags`
- `inverse_**` options for `concatFlags` parameter `method` in favor of `invert=True`
- `flagRaise` with delegation to better replacements `flagZScore`, `flagUniLOF`, `flagJumps` or `flagOffset`
- `flagByGrubbs` with delegation to better replacements `flagZScore`, `flagUniLOF`
- `flagMVScore` with delegation to manual application of the steps
## [2.5.0](https://git.ufz.de/rdm-software/saqc/-/tags/v2.4.1) - 2023-06-22
## [2.5.0](https://git.ufz.de/rdm-software/saqc/-/tags/v2.5.0) - 2023-09-05
[List of commits](https://git.ufz.de/rdm-software/saqc/-/compare/v2.4.1...v2.5.0)
### Added
- WMO standard mean aggregations
- Function selection via strings for most function-expecting parameters
- `SaQC.plot`:
- enable multivariate plots
- keyword `plot_kwargs` to pass matplotlib related arguments
......
......@@ -3,7 +3,7 @@ title: SaQC - System for automated Quality Control
message: "Please cite this software using these metadata."
type: software
version: 2.0.0
doi: https://doi.org/10.5281/zenodo.5888547
doi: 10.5281/zenodo.5888547
date-released: "2021-11-25"
license: "GPL-3.0"
repository-code: "https://git.ufz.de/rdm-software/saqc"
......
......@@ -62,7 +62,7 @@ could look like [this](https://git.ufz.de/rdm-software/saqc/raw/develop/docs/res
```
varname ; test
#----------; ---------------------------------------------------------------------
SM2 ; shift(freq="15Min")
SM2 ; align(freq="15Min")
'SM(1|2)+' ; flagMissing()
SM1 ; flagRange(min=10, max=60)
SM2 ; flagRange(min=10, max=40)
......@@ -103,7 +103,7 @@ data = pd.read_csv(
qc = SaQC(data=data)
qc = (qc
.shift("SM2", freq="15Min")
.align("SM2", freq="15Min")
.flagMissing("SM(1|2)+", regex=True)
.flagRange("SM1", min=10, max=60)
.flagRange("SM2", min=10, max=40)
......
......@@ -30,7 +30,7 @@ clean:
# make documentation
doc:
# generate environment table from dictionary
@$(SPHINXBUILD) -M html "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
@ $(SPHINXBUILD) -M html "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
# run tests
test:
......
......@@ -315,10 +315,10 @@ Aggregation
If we want to combine several values by aggregation and assign the result to the new regular timestamp, instead of
selecting a single one, we can do this with the :py:meth:`~saqc.SaQC.resample` method.
Let's resample the *SoilMoisture* data to a *20* minutes sample rate by aggregating every *20* minutes interval's
content with the arithmetic mean (which is provided by the ``numpy.mean`` function for example).
content with the arithmetic mean.
>>> import numpy as np
>>> qc = qc.resample('SoilMoisture', target='SoilMoisture_mean', freq='20min', method='bagg', func=np.mean)
>>> qc = qc.resample('SoilMoisture', target='SoilMoisture_mean', freq='20min', method='bagg', func="mean")
>>> qc.data # doctest: +SKIP
SoilMoisture | SoilMoisture_mean |
================================ | ===================================== |
......
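The 20-minute mean aggregation described above can be sketched with plain pandas (a toy series with hypothetical values; saqc's ``'bagg'`` method additionally handles flag projection and interval anchoring, which this sketch leaves at pandas defaults):

```python
import pandas as pd

# Toy series at a 10-minute rate; values are hypothetical.
idx = pd.date_range("2021-01-01 00:00", periods=6, freq="10min")
s = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], index=idx)

# Aggregate every 20-minute interval with the arithmetic mean --
# the core of resample(..., freq='20min', func="mean").
agg = s.resample("20min").mean()
print(agg)
```

Each pair of 10-minute values collapses into one 20-minute mean, so six input values yield three aggregates.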
......@@ -140,7 +140,7 @@ Looking at the example data set more closely, we see that 2 of the 5 variables s
qc.plot(variables, xscope=slice('2017-05', '2017-11'))
Let's try to detect those drifts via saqc. The changes we observe in the data seem to develop significantly only in temporal spans of over a month,
so we go for ``"1M"`` as value for the
so we go for ``"1ME"`` as value for the
``window`` parameter. We identified the majority group as one containing three variables, while the other two
seem to scatter away from it, so we can leave the ``frac`` value at its default ``.5`` level.
The majority group seems, on average, not to be spread out by more than 3 or 4 degrees. So, for the ``spread`` value
......@@ -152,7 +152,7 @@ average in a month from any member of the majority group.
.. doctest:: flagDriftFromNorm
>>> variables = ['temp1 [degC]', 'temp2 [degC]', 'temp3 [degC]', 'temp4 [degC]', 'temp5 [degC]']
>>> qc = qc.flagDriftFromNorm(variables, window='1M', spread=3)
>>> qc = qc.flagDriftFromNorm(variables, window='1ME', spread=3)
.. plot::
:context: close-figs
......@@ -160,7 +160,7 @@ average in a month from any member of the majority group.
:class: center
>>> variables = ['temp1 [degC]', 'temp2 [degC]', 'temp3 [degC]', 'temp4 [degC]', 'temp5 [degC]']
>>> qc = qc.flagDriftFromNorm(variables, window='1M', spread=3)
>>> qc = qc.flagDriftFromNorm(variables, window='1ME', spread=3)
Let's check the results:
......@@ -173,5 +173,5 @@ Lets check the results:
:include-source: False
:class: center
qc.plot(variables, marker_kwargs={'alpha':.3, 's': 1, 'color': 'red', 'edgecolor': 'face'})
qc.plot(variables, marker_kwargs={'alpha':.3, 's': 1, 'color': 'red', 'edgecolors': 'face'})
......@@ -191,7 +191,6 @@ The resulting timeseries now has a regular timestamp.
.. doctest:: exampleMV
>>> qc.data['sac254_raw'] #doctest:+NORMALIZE_WHITESPACE
Timestamp
2016-01-01 00:00:00 NaN
2016-01-01 00:15:00 18.617873
2016-01-01 00:30:00 18.942700
......
......@@ -147,19 +147,19 @@ Rolling Mean
^^^^^^^^^^^^
The easiest thing to do would be to apply a rolling mean
model via the method :py:meth:`saqc.SaQC.roll`.
model via the method :py:meth:`saqc.SaQC.rolling`.
.. doctest:: exampleOD
>>> import numpy as np
>>> qc = qc.roll(field='incidents', target='incidents_mean', func=np.mean, window='13D')
>>> qc = qc.rolling(field='incidents', target='incidents_mean', func=np.mean, window='13D')
.. plot::
:context:
:include-source: False
import numpy as np
qc = qc.roll(field='incidents', target='incidents_mean', func=np.mean, window='13D')
qc = qc.rolling(field='incidents', target='incidents_mean', func=np.mean, window='13D')
The ``field`` parameter is passed the name of the variable we want to calculate the rolling mean of.
The ``target`` parameter holds the name we want to store the results of the calculation under.
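Under the hood, such a datetime-windowed rolling mean is the operation pandas provides; a minimal sketch on a hypothetical daily series (saqc adds flag handling and optional window centering on top):

```python
import pandas as pd

# Hypothetical daily series, standing in for the 'incidents' data.
idx = pd.date_range("2020-03-01", periods=5, freq="D")
s = pd.Series([2.0, 4.0, 6.0, 8.0, 10.0], index=idx)

# 3-day rolling mean, analogous to rolling(field=..., func=np.mean, window='3D')
m = s.rolling("3D").mean()
print(m)
```

With a time-based window, each result averages all values within the trailing 3-day span, so the window grows until three days of data are available.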
......@@ -174,13 +174,13 @@ under the name ``np.median``. We just calculate another model curve for the ``"i
.. doctest:: exampleOD
>>> qc = qc.roll(field='incidents', target='incidents_median', func=np.median, window='13D')
>>> qc = qc.rolling(field='incidents', target='incidents_median', func=np.median, window='13D')
.. plot::
:context:
:include-source: False
qc = qc.roll(field='incidents', target='incidents_median', func=np.median, window='13D')
qc = qc.rolling(field='incidents', target='incidents_median', func=np.median, window='13D')
We chose another :py:attr:`target` value for the rolling *median* calculation, in order not to overwrite our results from
the previous rolling *mean* calculation.
......@@ -318,18 +318,18 @@ for the point lying in the center of every window, we would define our function
z_score = lambda D: abs((D[14] - np.mean(D)) / np.std(D))
And subsequently, use the :py:meth:`~saqc.SaQC.roll` method to make a rolling window application with the scoring
And subsequently, use the :py:meth:`~saqc.SaQC.rolling` method to make a rolling window application with the scoring
function:
.. doctest:: exampleOD
>>> qc = qc.roll(field='incidents_residuals', target='incidents_scores', func=z_score, window='27D')
>>> qc = qc.rolling(field='incidents_residuals', target='incidents_scores', func=z_score, window='27D', min_periods=27)
.. plot::
:context: close-figs
:include-source: False
qc = qc.roll(field='incidents_residuals', target='incidents_scores', func=z_score, window='27D')
qc = qc.rolling(field='incidents_residuals', target='incidents_scores', func=z_score, window='27D', min_periods=27)
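To see what the scoring function computes, it can be exercised on a single toy window (a hypothetical 27-value array standing in for one rolling window):

```python
import numpy as np

# The scoring function from the text: deviation of the window's value at
# position 14 from the window mean, in units of the window's std.
z_score = lambda D: abs((D[14] - np.mean(D)) / np.std(D))

window = np.arange(27, dtype=float)  # hypothetical window content
score = z_score(window)

# The same computation written out explicitly
expected = abs((window[14] - window.mean()) / window.std())
print(score)
```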
Optimization by Decomposition
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
......@@ -347,13 +347,13 @@ So the attempt works fine, only because our data set is small and strictly regul
Meaning that it has constant temporal distances between subsequent measurements.
In order to tweak our calculations and make them much more stable, it might be useful to decompose the scoring
into separate calls to the :py:meth:`~saqc.SaQC.roll` function, by calculating the series of the
into separate calls to the :py:meth:`~saqc.SaQC.rolling` function, by calculating the series of the
residuals *mean* and *standard deviation* separately:
.. doctest:: exampleOD
>>> qc = qc.roll(field='incidents_residuals', target='residuals_mean', window='27D', func=np.mean)
>>> qc = qc.roll(field='incidents_residuals', target='residuals_std', window='27D', func=np.std)
>>> qc = qc.rolling(field='incidents_residuals', target='residuals_mean', window='27D', func=np.mean)
>>> qc = qc.rolling(field='incidents_residuals', target='residuals_std', window='27D', func=np.std)
>>> qc = qc.processGeneric(field=['incidents_scores', "residuals_mean", "residuals_std"], target="residuals_norm",
... func=lambda this, mean, std: (this - mean) / std)
......@@ -362,15 +362,15 @@ residuals *mean* and *standard deviation* seperately:
:context: close-figs
:include-source: False
qc = qc.roll(field='incidents_residuals', target='residuals_mean', window='27D', func=np.mean)
qc = qc.roll(field='incidents_residuals', target='residuals_std', window='27D', func=np.std)
qc = qc.rolling(field='incidents_residuals', target='residuals_mean', window='27D', func=np.mean)
qc = qc.rolling(field='incidents_residuals', target='residuals_std', window='27D', func=np.std)
qc = qc.processGeneric(field=['incidents_scores', "residuals_mean", "residuals_std"], target="residuals_norm", func=lambda this, mean, std: (this - mean) / std)
With huge datasets, this will be noticeably faster, compared to the method presented :ref:`initially <cookbooks/ResidualOutlierDetection:Scores>`\ ,
because ``saqc`` dispatches the rolling with the basic numpy statistic methods to an optimized pandas built-in.
Also, as a result of the :py:meth:`~saqc.SaQC.roll` assigning its results to the center of every window,
Also, as a result of the :py:meth:`~saqc.SaQC.rolling` assigning its results to the center of every window,
all the values are centered and we don't have to care about window center indices when we are generating
the *Z*\ -Scores from the two series.
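The equivalence behind this decomposition can be checked with plain pandas: computing the rolling *mean* and *std* separately and combining them vectorized reproduces the Z-scores of a direct per-window computation (toy series with hypothetical values; window centering is left at pandas defaults here):

```python
import numpy as np
import pandas as pd

idx = pd.date_range("2016-01-01", periods=8, freq="D")
s = pd.Series([1.0, 3.0, 2.0, 5.0, 4.0, 6.0, 3.0, 7.0], index=idx)

# Decomposed: two cheap built-in rolling aggregations, then a vectorized combine
m = s.rolling("3D").mean()
sd = s.rolling("3D").std(ddof=0)
z_dec = (s - m) / sd

# Direct: one rolling apply evaluating the whole formula per window
def z(w):
    return (w[-1] - w.mean()) / w.std()

z_dir = s.rolling("3D").apply(z, raw=True)
```

The first window holds a single value (zero std), so both variants yield NaN there; all later positions agree exactly, while the decomposed form lets pandas use its optimized built-ins.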
......
......@@ -5,88 +5,136 @@
Customizations
==============
SaQC comes with a continuously growing number of pre-implemented
quality checking and processing routines as well as flagging schemes.
For any sufficiently large use case however, it is very likely that the
functions provided won't fulfill all your needs and requirements.
Acknowledging the impossibility to address all imaginable use cases, we
designed the system to allow for extensions and customizations. The main extension options, namely
SaQC comes with a continuously growing number of pre-implemented quality-checking and processing
routines as well as flagging schemes. For a sufficiently large use case, however, it might still be
necessary to extend the system. The main extension options, namely
:ref:`quality check routines <documentation/Customizations:custom quality check routines>`
and the :ref:`flagging scheme <documentation/Customizations:custom flagging schemes>`
are described within this document.
and the :ref:`flagging scheme <documentation/Customizations:custom flagging schemes>`.
Both of these mechanisms are described within this document.
Custom quality check routines
Custom Quality Check Routines
-----------------------------
In case you are missing quality check routines, you are of course very
welcome to file a feature request issue on the project's
`gitlab repository <https://git.ufz.de/rdm-software/saqc>`_. However, if
you are more the "I-get-this-done-by-myself" type of person,
SaQC provides two ways to integrate custom routines into the system:
In case you are missing quality check routines, you are, of course, very welcome to file a feature request issue on the project's `GitLab repository <https://git.ufz.de/rdm-software/saqc>`_. However, if you are more the "I-get-this-done-by-myself" type of person, SaQC offers the possibility to directly extend its functionality using its interface to the evaluation machinery.
#. The :ref:`extension language <documentation/GenericFunctions:Generic Functions>`
#. An :ref:`interface <documentation/Customizations:interface>` to the evaluation machinery
In order to make a function usable within the evaluation framework of SaQC, it needs to implement the following function interface:
Interface
^^^^^^^^^
In order to make a function usable within the evaluation framework of SaQC, it needs to
implement the following function interface
.. code-block:: python
import pandas
import saqc
def yourTestFunction(
saqc: SaQC,
field: str,
*args,
**kwargs
) -> SaQC:
def yourTestFunction(qc: SaQC, field: str | list[str], *args, **kwargs) -> SaQC:
# your code
return qc
Argument Descriptions
~~~~~~~~~~~~~~~~~~~~~
with the following parameters
.. list-table::
:header-rows: 1
* - Name
- Description
* - ``data``
- The actual dataset, an instance of ``saqc.DictOfSeries``.
* - ``qc``
- An instance of ``SaQC``
* - ``field``
- The field/column within ``data``, that function is processing.
* - ``flags``
- An instance of saqc.Flags, responsible for the translation of test results into quality attributes.
- The field(s)/column(s) of ``data`` the function is processing/flagging.
* - ``args``
- Any other arguments needed to parameterize the function.
- Any number of positional arguments needed to parameterize the function.
* - ``kwargs``
- Any other keyword arguments needed to parameterize the function.
- Any number of named keyword arguments needed to parameterize the function. ``kwargs``
need to be present, even if the function needs no keyword arguments at all.
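The call shape of this interface can be illustrated without saqc installed; the following stand-in (all names hypothetical) only demonstrates that a conforming routine takes the ``SaQC`` object first, the field(s) second, accepts arbitrary keyword arguments, and hands the object back:

```python
import inspect

class FakeSaQC:
    """Stand-in for saqc.SaQC, used only to illustrate the call shape."""
    pass

def yourTestFunction(qc, field, *args, **kwargs):
    # a conforming routine works on `qc` (data and flags) and returns it
    return qc

qc = FakeSaQC()
result = yourTestFunction(qc, "SM2", thresh=3.0)

# the signature exposes exactly the required slots
params = list(inspect.signature(yourTestFunction).parameters)
print(params)
```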
Integrate into SaQC
^^^^^^^^^^^^^^^^^^^
In order to make your function available to the system, it needs to be registered. We provide a decorator
`\ ``flagging`` <saqc/functions/register.py>`_ with saqc, to integrate your
test functions into SaQC. Here is a complete dummy example:
SaQC provides two decorators, :py:func:`@flagging` and :py:func:`@register`, to integrate custom functions
into its workflow. The choice between them depends on the nature of your algorithm. :py:func:`@register`
is a more versatile decorator, allowing you to handle masking, demasking, and squeezing of data and flags, while
:py:func:`@flagging` is simpler and suitable for univariate flagging functions without the need for complex
data manipulations.
Use :py:func:`@flagging` for simple univariate flagging tasks without the need for complex data manipulations;
it is especially suitable when your algorithm operates on a single column:
.. code-block:: python
from saqc import register
from saqc import SaQC
from saqc.core.register import flagging
@flagging()
def yourTestFunction(saqc: SaQC, field: str, *args, **kwargs):
def simpleFlagging(saqc: SaQC, field: str | list[str], param1: ..., param2: ..., **kwargs) -> SaQC:
"""
Your simple univariate flagging logic goes here.
Parameters
----------
saqc : SaQC
The SaQC instance.
field : str or list of str
The field or fields on which to apply anomaly detection.
param1 : ...
Additional parameters needed for your algorithm.
param2 : ...
Additional parameters needed for your algorithm.
Returns
-------
SaQC
The modified SaQC instance.
"""
# Your flagging logic here
# Modify saqc._flags as needed
return saqc
Use :py:func:`@register` when your algorithm needs to handle multiple columns simultaneously (``multivariate=True``)
and/or you need explicit control over masking, demasking, and squeezing of data and flags.
:py:func:`@register` is especially suited for complex algorithms that involve interactions between different columns.
.. code-block:: python
from saqc import SaQC
from saqc.core.register import register
@register(
mask=["field"],  # parameter(s) of the decorated function naming the columns in SaQC._data to mask before the call
demask=["field"],  # parameter(s) of the decorated function naming the columns in SaQC._data to unmask after the call
squeeze=["field"],  # parameter(s) of the decorated function naming the columns in SaQC._flags to squeeze into a single flags column after the call
multivariate=True, # Set to True to handle multiple columns
handles_target=False,
)
def complexAlgorithm(
saqc: SaQC, field: str | list[str], param1: ..., param2: ..., **kwargs
) -> SaQC:
"""
Your custom anomaly detection logic goes here.
Parameters
----------
saqc : SaQC
The SaQC instance.
field : str or list of str
The field or fields on which to apply anomaly detection.
param1 : ...
Additional parameters needed for your algorithm.
param2 : ...
Additional parameters needed for your algorithm.
Returns
-------
SaQC
The modified SaQC instance.
"""
# Your anomaly detection logic here
# Modify saqc._flags and saqc._data as needed
return saqc
Example
^^^^^^^
The function `\ ``flagRange`` <saqc/funcs/outliers.py>`_ provides a simple, yet complete implementation of
a quality check routine. You might want to look into its implementation as an example.
Custom flagging schemes
-----------------------
......
......@@ -18,7 +18,7 @@ Documentation
SourceTarget
FlaggingTranslation
.. grid:: 3
.. grid:: 2
:gutter: 2
.. grid-item-card:: Configuration files (csv)
......@@ -41,5 +41,12 @@ Documentation
+++
*Keywords shared by all the flagging functions*
.. grid-item-card:: Customizations
:link: Customizations
:link-type: doc
* add custom functions to SaQC
+++
*Add custom functions to SaQC*
......@@ -3,7 +3,7 @@
.. SPDX-License-Identifier: GPL-3.0-or-later
Basic Anomalies
===============
---------------
.. currentmodule:: saqc
......@@ -16,5 +16,6 @@ Basic Anomalies
~SaQC.flagRaise
~SaQC.flagConstants
~SaQC.flagByVariance
~SaQC.flagPlateau
......@@ -3,8 +3,7 @@
.. SPDX-License-Identifier: GPL-3.0-or-later
Data Products
=============
-------------
.. currentmodule:: saqc
......
......@@ -3,7 +3,7 @@
.. SPDX-License-Identifier: GPL-3.0-or-later
Change Points and Noise
=======================
-----------------------
.. currentmodule:: saqc
......
......@@ -3,7 +3,7 @@
.. SPDX-License-Identifier: GPL-3.0-or-later
Drift detection and correction
==============================
------------------------------
.. currentmodule:: saqc
......
......@@ -3,9 +3,7 @@
.. SPDX-License-Identifier: GPL-3.0-or-later
Gap filling
===========
-----------
.. currentmodule:: saqc
......@@ -13,4 +11,3 @@ Gap filling
:nosignatures:
~SaQC.interpolateByRolling
~SaQC.interpolate
......@@ -2,8 +2,8 @@
..
.. SPDX-License-Identifier: GPL-3.0-or-later
flagtools
=========
Flagtools
---------
.. currentmodule:: saqc
......@@ -15,3 +15,5 @@ flagtools
~SaQC.flagManual
~SaQC.flagDummy
~SaQC.transferFlags
~SaQC.andGroup
~SaQC.orGroup
......@@ -5,126 +5,131 @@
.. _funcs:
Anomaly Detection
------------------
Functionality Overview
----------------------
..
Anomaly Detection
------------------
.. grid:: 2
.. grid:: 1
:gutter: 2
.. grid-item-card:: Basic Anomaly Detection
:link: basicAnomalies
:link-type: doc
.. grid-item-card:: Anomaly Detection
* data *gaps*,
* data *jumps*,
* *isolated* points,
* *constant* and low variance regimes.
+++
.. grid:: 2
:gutter: 2
.. grid-item-card:: Outlier Detection
:link: outlierDetection
:link-type: doc
.. grid-item-card:: Basic Anomaly Detection
:link: basicAnomalies
:link-type: doc
* rolling *Z-score* cutoff
* modified local outlier factor (univariate-*LOF*)
* deterministic *offset pattern* search
+++
* data *gaps*,
* data *jumps*,
* *isolated* points,
* *constant* and low variance regimes.
+++
.. grid-item-card:: Multivariate Analysis
:link: multivariateAnalysis
:link-type: doc
.. grid-item-card:: Outlier Detection
:link: outlierDetection
:link-type: doc
* k-nearest neighbor scores (*kNN*)
* local outlier factor (*LOF*)
+++
* rolling *Z-score* cutoff
* modified local outlier factor (univariate-*LOF*)
* deterministic *offset pattern* search
+++
.. grid-item-card:: Distributional Analysis
:link: distributionalAnomalies
:link-type: doc
.. grid-item-card:: Multivariate Analysis
:link: multivariateAnalysis
:link-type: doc
* detect *change points*
* detect continuous *noisy* data sections
+++
* k-nearest neighbor scores (*kNN*)
* local outlier factor (*LOF*)
+++
Data and Flags Tools
--------------------
.. grid-item-card:: Distributional Analysis
:link: distributionalAnomalies
:link-type: doc
.. grid:: 2
:gutter: 2
* detect *change points*
* detect continuous *noisy* data sections
+++
.. grid-item-card:: Data Independent Flags Manipulation
:link: flagTools
:link-type: doc
* *copy* flags
* *transfer* flags
* *propagate* flags
* *force*-set unitary or precalculated flags values
+++
.. grid-item-card:: Data and Flag Tools
.. grid-item-card:: Basic tools
:link: tools
:link-type: doc
.. grid:: 2
:gutter: 2
* plot variables
* copy and delete variables
+++
.. grid-item-card:: Data Independent Flags Manipulation
:link: flagTools
:link-type: doc
.. grid-item-card:: Generic and Custom Functions
:link: genericWrapper
:link-type: doc
* *copy* flags
* *transfer* flags
* *propagate* flags
* *force*-set unitary or precalculated flags values
+++
* basic *logical* aggregation of variables
* basic *arithmetical* aggregation of variables
* *custom functions*
* *rolling*, *resampling*, *transformation*
+++
.. grid-item-card:: Basic tools
:link: tools
:link-type: doc
* plot variables
* copy and delete variables
+++
Data Manipulation
-----------------
.. grid-item-card:: Generic and Custom Functions
:link: genericWrapper
:link-type: doc
.. grid:: 2
:gutter: 2
* basic *logical* aggregation of variables
* basic *arithmetical* aggregation of variables
* *custom functions*
* *rolling*, *resampling*, *transformation*
+++
.. grid-item-card:: Data Products
:link: dataProducts
:link-type: doc
.. grid-item-card:: Data Manipulation
* smooth with *frequency filter*
* smooth with *polynomials*
* obtain *residuals* from smoothing
* obtain *kNN* or *LOF* scores
+++
.. grid:: 2
:gutter: 2
.. grid-item-card:: Resampling
:link: samplingAlignment
:link-type: doc
.. grid-item-card:: Data Products
:link: dataProducts
:link-type: doc
* *resample* data using custom aggregation
* *align* data to frequency grid with minimal data distortion
* *back project* flags from aligned data onto original series
+++
* smooth with *frequency filter*
* smooth with *polynomials*
* obtain *residuals* from smoothing
* obtain *kNN* or *LOF* scores
+++
Data Correction
---------------
.. grid-item-card:: Resampling
:link: samplingAlignment
:link-type: doc
.. grid:: 2
:gutter: 2
* *resample* data using custom aggregation
* *align* data to frequency grid with minimal data distortion
* *back project* flags from aligned data onto original series
+++
.. grid-item-card:: Data Correction
.. grid:: 2
:gutter: 2
.. grid-item-card:: Gap filling
:link: filling
:link-type: doc
.. grid-item-card:: Gap filling
:link: filling
:link-type: doc
* fill gaps with *interpolations*
* fill gaps using a *rolling* window
+++
* fill gaps with *interpolations*
* fill gaps using a *rolling* window
+++
.. grid-item-card:: Drift Detection and Correction
:link: driftBehavior
:link-type: doc
.. grid-item-card:: Drift Detection and Correction
:link: driftBehavior
:link-type: doc
* deviation predicted by a *model*
* deviation from the *majority* of parallel curves
* deviation from a defined *norm* curve
+++
* deviation predicted by a *model*
* deviation from the *majority* of parallel curves
* deviation from a defined *norm* curve
+++
......@@ -2,8 +2,8 @@
..
.. SPDX-License-Identifier: GPL-3.0-or-later
generic wrapper
===============
Generic Functions
-----------------
.. currentmodule:: saqc
......@@ -13,6 +13,6 @@ generic wrapper
~SaQC.processGeneric
~SaQC.flagGeneric
~SaQC.roll
~SaQC.transform
~SaQC.resample
~SaQC.andGroup
~SaQC.orGroup