Compare revisions

Changes are shown as if the source revision was being merged into the target revision.

Commits on Source (25)
Showing with 347 additions and 198 deletions
......@@ -29,7 +29,7 @@ jobs:
fail-fast: false
matrix:
os: ["windows-latest", "ubuntu-latest", "macos-latest"]
python-version: ["3.7", "3.8", "3.9", "3.10"]
python-version: ["3.8", "3.9", "3.10"]
defaults:
run:
# somehow this also works for windows O.o ??
......@@ -61,11 +61,11 @@ jobs:
pytest tests dios/test -Werror
python -m saqc --config docs/resources/data/config.csv --data docs/resources/data/data.csv --outfile /tmp/test.csv
- name: run doc tests
run: |
cd docs
pip install -r requirements.txt
make doc
make test
# - name: run doc tests
# run: |
# cd docs
# pip install -r requirements.txt
# make doc
# make test
......@@ -75,20 +75,6 @@ coverage:
path: coverage.xml
# test saqc with python 3.7
python37:
stage: test
image: python:3.7
script:
- pytest tests dios/test -Werror --junitxml=report.xml
- python -m saqc --config docs/resources/data/config.csv --data docs/resources/data/data.csv --outfile /tmp/test.csv
artifacts:
when: always
reports:
junit: report.xml
# test saqc with python 3.8
python38:
stage: test
script:
......@@ -100,7 +86,6 @@ python38:
junit: report.xml
# test saqc with python 3.9
python39:
stage: test
image: python:3.9
......@@ -113,7 +98,6 @@ python39:
junit: report.xml
# test saqc with python 3.10
python310:
stage: test
image: python:3.10
......@@ -125,7 +109,6 @@ python310:
reports:
junit: report.xml
doctest:
stage: test
script:
......
......@@ -7,12 +7,30 @@ SPDX-License-Identifier: GPL-3.0-or-later
# Changelog
## Unreleased
[List of commits](https://git.ufz.de/rdm-software/saqc/-/compare/v2.2.1...develop)
[List of commits](https://git.ufz.de/rdm-software/saqc/-/compare/v2.3.0...develop)
### Added
### Changed
### Removed
### Fixed
## [2.3.0](https://git.ufz.de/rdm-software/saqc/-/tags/v2.3.0) - 2023-01-17
[List of commits](https://git.ufz.de/rdm-software/saqc/-/compare/v2.2.1...v2.3.0)
### Added
- add option to not overwrite existing flags to `concatFlags`
- add option to pass existing axis object to `plot`
- python 3.11 support
### Changed
- Remove all flag value restrictions from the default flagging scheme `FloatTranslator`
- Renamed `TranslationScheme.forward` to `TranslationScheme.toInternal`
- Renamed `TranslationScheme.backward` to `TranslationScheme.toExternal`
- Changed default value of the parameter `limit` for `SaQC.interpolateIndex` and `SaQC.interpolateInvalid` to ``None``
- Changed default value of the parameter ``overwrite`` for ``concatFlags`` to ``False``
- Deprecate ``transferFlags`` in favor of ``concatFlags``
### Removed
- python 3.7 support
### Fixed
- Error for interpolations with limits set to be greater than 2 (`interpolateNANs`)
## [2.2.1](https://git.ufz.de/rdm-software/saqc/-/tags/v2.2.1) - 2022-10-29
[List of commits](https://git.ufz.de/rdm-software/saqc/-/compare/v2.2.0...v2.2.1)
### Added
......@@ -29,7 +47,7 @@ SPDX-License-Identifier: GPL-3.0-or-later
- translation of `dfilter`
- new generic function `clip`
- parameter `min_periods` to `SaQC.flagConstants`
- function `fitButterworth`
- function `fitLowpassFilter`
- tracking interpolation routines in `History`
### Changed
- test function interface changed to `func(saqc: SaQC, field: str | Sequence[str], *args, **kwargs)`
......
......@@ -18,9 +18,6 @@ help:
.PHONY: help Makefile clean
test:
for k in $(MDLIST); do echo docs/"$$k"; done
# clean sphinx generated stuff
clean:
rm -rf _build _static _api
......
......@@ -347,7 +347,7 @@ correlated with relatively high *kNNscores*, we could try to calculate a thresho
`STRAY <https://arxiv.org/pdf/1908.04000.pdf>`_ algorithm, which is available as the method:
:py:meth:`~saqc.SaQC.flagByStray`. This method will mark some samples of the `kNNscore` variable as anomaly.
Subsequently we project these marks (or *flags*) onto the *sac* variable with a call to
:py:meth:`~saqc.SaQC.transferFlags`. For the sake of demonstration, we also project the flags
:py:meth:`~saqc.SaQC.concatFlags`. For the sake of demonstration, we also project the flags
on the normalized *sac* and plot the flagged values in the *sac254_norm* - *level_norm* feature space.
......@@ -355,8 +355,8 @@ on the normalized *sac* and plot the flagged values in the *sac254_norm* - *leve
.. doctest:: exampleMV
>>> qc = qc.flagByStray(field='kNNscores', freq='30D', alpha=.3)
>>> qc = qc.transferFlags(field='kNNscores', target='sac254_corrected', label='STRAY')
>>> qc = qc.transferFlags(field='kNNscores', target='sac254_norm', label='STRAY')
>>> qc = qc.concatFlags(field='kNNscores', target='sac254_corrected', label='STRAY')
>>> qc = qc.concatFlags(field='kNNscores', target='sac254_norm', label='STRAY')
>>> qc.plot('sac254_corrected', xscope='2016-11') # doctest:+SKIP
>>> qc.plot('sac254_norm', phaseplot='level_norm', xscope='2016-11') # doctest:+SKIP
......@@ -365,8 +365,8 @@ on the normalized *sac* and plot the flagged values in the *sac254_norm* - *leve
:include-source: False
qc = qc.flagByStray(field='kNNscores', freq='30D', alpha=.3)
qc = qc.transferFlags(field='kNNscores', target='sac254_corrected', label='STRAY')
qc = qc.transferFlags(field='kNNscores', target='sac254_norm', label='STRAY')
qc = qc.concatFlags(field='kNNscores', target='sac254_corrected', label='STRAY')
qc = qc.concatFlags(field='kNNscores', target='sac254_norm', label='STRAY')
.. plot::
:context: close-figs
......
......@@ -273,7 +273,7 @@ To see all the results obtained so far, plotted in one figure window, we make us
.. doctest:: exampleOD
>>> data.to_df().plot()
<AxesSubplot:>
<AxesSubplot: >
.. plot::
:context:
......
......@@ -3,11 +3,10 @@
# SPDX-License-Identifier: GPL-3.0-or-later
recommonmark==0.7.1
sphinx<6
sphinx<7
sphinx-automodapi==0.14.1
sphinxcontrib-fulltoc==1.2.0
sphinx-markdown-tables==0.0.17
m2r==0.2.1
jupyter-sphinx==0.3.2
sphinx_autodoc_typehints==1.18.2
sphinx-tabs==3.4.1
......@@ -16,6 +16,6 @@ water_z ; transform(field=['water_temp_raw'], func=zScore(x), fr
sac_z ; transform(field=['sac254_raw'], func=zScore(x), freq='20D')
kNN_scores ; assignKNNScore(field=['level_z', 'water_z', 'sac_z'], freq='20D')
kNN_scores ; flagByStray(freq='20D')
level_raw ; transferFlags(field=['kNN_scores'], label='STRAY')
sac254_corr ; transferFlags(field=['kNN_scores'], label='STRAY')
water_temp_raw ; transferFlags(field=['kNN_scores'], label='STRAY')
\ No newline at end of file
level_raw ; concatFlags(field=['kNN_scores'], label='STRAY')
sac254_corr ; concatFlags(field=['kNN_scores'], label='STRAY')
water_temp_raw ; concatFlags(field=['kNN_scores'], label='STRAY')
Binary file changed: docs/resources/temp/SM1processingResults.png (58.8 KiB)
Binary file changed: docs/resources/temp/SM2processingResults.png (147 KiB)
......@@ -4,13 +4,12 @@
Click==8.1.3
dtw==1.4.0
hypothesis==6.55.0
matplotlib==3.5.3
numba==0.56.3
numpy==1.21.6
matplotlib==3.6.2
numba==0.56.4
numpy==1.23.5
outlier-utils==0.0.3
pyarrow==9.0.0
pyarrow==10.0.1
pandas==1.3.5
scikit-learn==1.0.2
scipy==1.7.3
typing_extensions==4.3.0
scikit-learn==1.2.0
scipy==1.10.0
typing_extensions==4.4.0
......@@ -110,7 +110,7 @@ class SaQC(FunctionsMixin):
@property
def flags(self) -> MutableMapping:
flags = self._scheme.backward(self._flags, attrs=self._attrs, raw=True)
flags = self._scheme.toExternal(self._flags, attrs=self._attrs)
flags.attrs = self._attrs.copy()
return flags
......
......@@ -6,7 +6,7 @@
from __future__ import annotations
from typing import DefaultDict, Dict, Iterable, Mapping, Optional, Tuple, Type, Union
from typing import DefaultDict, Dict, Iterable, Mapping, Tuple, Type, Union
import numpy as np
import pandas as pd
......@@ -147,7 +147,7 @@ class Flags:
0 True
1 False
2 True
Name: 2, dtype: bool
dtype: bool
.. doctest:: exampleFlags
......@@ -191,9 +191,7 @@ class Flags:
2 -inf 25.0 25.0 0.0 99.0
"""
def __init__(
self, raw_data: Optional[Union[DictLike, Flags]] = None, copy: bool = False
):
def __init__(self, raw_data: DictLike | Flags | None = None, copy: bool = False):
self._data: dict[str, History]
......
......@@ -8,10 +8,11 @@ from __future__ import annotations
from copy import copy as shallowcopy
from copy import deepcopy
from typing import Any, Callable, Dict, List, Tuple, Union
from typing import Any, Callable, Dict, List, Tuple
import numpy as np
import pandas as pd
from pandas.api.types import is_categorical_dtype, is_float_dtype
from saqc.constants import UNFLAGGED
......@@ -45,8 +46,32 @@ class History:
def __init__(self, index: pd.Index | None):
self.hist = pd.DataFrame(index=index)
self.meta = []
self._hist = pd.DataFrame(index=index)
self._meta = []
@property
def hist(self):
return self._hist.astype(float, copy=True)
@hist.setter
def hist(self, value: pd.DataFrame) -> None:
self._validateHist(value)
if len(value.columns) != len(self._meta):
raise ValueError(
"passed history does not match existing meta. "
"To use a new `hist` with new `meta` use "
"'History.createFromData(new_hist, new_meta)'"
)
self._hist = value.astype("category", copy=True)
@property
def meta(self) -> list[dict[str, Any]]:
return list(self._meta)
@meta.setter
def meta(self, value: list[dict[str, Any]]) -> None:
self._validateMetaList(value, self._hist)
self._meta = deepcopy(value)
@property
def index(self) -> pd.Index:
......@@ -66,7 +91,7 @@ class History:
-------
index : pd.Index
"""
return self.hist.index
return self._hist.index
@property
def columns(self) -> pd.Index:
......@@ -80,7 +105,7 @@ class History:
-------
columns : pd.Index
"""
return self.hist.columns
return self._hist.columns
@property
def empty(self) -> bool:
......@@ -118,15 +143,11 @@ class History:
# all following code must handle a passed empty series
# ensure continuous increasing columns
assert 0 <= pos <= len(self)
self.hist[pos] = s.astype("category")
assert 0 <= pos <= len(self.columns)
self._hist[pos] = s.astype("category")
return self
def append(
self, value: Union[pd.Series, History], meta: dict | None = None
) -> History:
def append(self, value: pd.Series | History, meta: dict | None = None) -> History:
"""
Create a new FH column and insert given pd.Series to it.
......@@ -157,8 +178,7 @@ class History:
if meta is None:
meta = {}
if not isinstance(meta, dict):
elif not isinstance(meta, dict):
raise TypeError("'meta' must be of type None or dict")
val = self._validateValue(value)
......@@ -166,10 +186,10 @@ class History:
raise ValueError("Index does not match")
self._insert(val, pos=len(self))
self.meta.append(meta.copy())
self._meta.append(meta.copy())
return self
def _appendHistory(self, value: History):
def _appendHistory(self, value: History) -> History:
"""
Append multiple columns of a history to self.
......@@ -190,44 +210,107 @@ class History:
-----
This ignores the column names of the passed History.
"""
self._validate(value.hist, value.meta)
self._validate(value._hist, value._meta)
if not value.index.equals(self.index):
raise ValueError("Index does not match")
# we copy shallow because we only want to set new columns
# the actual data copy happens in calls to astype
value_hist = value.hist.copy(deep=False)
value_meta = value.meta.copy()
value_hist = value._hist.copy(deep=False)
value_meta = value._meta.copy()
# rename columns, to avoid ``pd.DataFrame.loc`` become confused
n = len(self.columns)
columns = pd.Index(range(n, n + len(value_hist.columns)))
value_hist.columns = columns
hist = self.hist.astype(float)
hist = self._hist.astype(float)
hist.loc[:, columns] = value_hist.astype(float)
self.hist = hist.astype("category")
self.meta += value_meta
self._hist = hist.astype("category")
self._meta += value_meta
return self
def squeeze(self, raw=False) -> pd.Series:
def squeeze(
self, raw: bool = False, start: int | None = None, end: int | None = None
) -> pd.Series:
"""
Get the last flag value per row of the FH.
Reduce history to a series, by taking the last set value per row.
By passing `start` and/or `end` only a slice of the history is used.
This can be used to get the values of an earlier test. See the
Examples.
Parameters
----------
raw : bool, default False
If True, 'unset' values are represented by `nan`,
otherwise, 'unset' values are represented by the
`UNFLAGGED` (`-inf`) constant
start : int, default None
The first history column to use (inclusive).
end : int, default None
The last history column to use (exclusive).
Returns
-------
pd.Series
pandas.Series
Examples
--------
>>> from saqc.core.history import History
>>> s0 = pd.Series([np.nan, np.nan, 99.])
>>> s1 = pd.Series([1., 1., np.nan])
>>> s2 = pd.Series([2., np.nan, 2.])
>>> h = History(pd.Index([0,1,2])).append(s0).append(s1).append(s2)
>>> h
0 1 2
0 nan 1.0 2.0
1 nan 1.0 nan
2 99.0 nan 2.0
Get current flags.
>>> h.squeeze()
0 2.0
1 1.0
2 2.0
dtype: float64
Get only the flags that the last function had set:
>>> h.squeeze(start=-1)
0 2.0
1 -inf
2 2.0
dtype: float64
Get the flags before the last function run:
>>> h.squeeze(end=-1)
0 1.0
1 1.0
2 99.0
dtype: float64
Get only the flags that the 2nd function had set:
>>> h.squeeze(start=1, end=2)
0 1.0
1 1.0
2 -inf
dtype: float64
"""
result = self.hist.astype(float)
if result.empty:
result = pd.DataFrame(data=np.nan, index=self.hist.index, columns=[0])
result = result.ffill(axis=1).iloc[:, -1]
if raw:
return result
hist = self._hist.iloc[:, slice(start, end)].astype(float)
if hist.empty:
result = pd.Series(data=np.nan, index=self._hist.index, dtype=float)
else:
return result.fillna(UNFLAGGED)
result = hist.ffill(axis=1).iloc[:, -1]
if not raw:
result = result.fillna(UNFLAGGED)
result.name = None
return result
def reindex(
self, index: pd.Index, fill_value_last: float = UNFLAGGED, copy: bool = True
......@@ -251,17 +334,11 @@ class History:
-------
History
"""
# Note: code must handle empty frames
out = self.copy() if copy else self
hist = out.hist.astype(float).reindex(
index=index, copy=False, fill_value=np.nan
)
# Note: all following code must handle empty frames
hist = out._hist.astype(float).reindex(index=index, copy=False)
hist.iloc[:, -1:] = hist.iloc[:, -1:].fillna(fill_value_last)
out.hist = hist.astype("category")
out._hist = hist.astype("category")
return out
def apply(
......@@ -271,7 +348,7 @@ class History:
func_kws: dict,
func_handle_df: bool = False,
copy: bool = True,
):
) -> History:
"""
Apply a function on each column in history.
......@@ -309,27 +386,31 @@ class History:
Returns
-------
history with altered columns
History with altered columns
"""
hist = pd.DataFrame(index=index)
# implicit copy by astype
# convert data to floats as functions may fail with categoricals
# convert data to floats as functions may fail with categorical dtype
if func_handle_df:
hist = func(self.hist.astype(float), **func_kws)
hist = func(self._hist.astype(float, copy=True), **func_kws)
else:
for pos in self.columns:
hist[pos] = func(self.hist[pos].astype(float), **func_kws)
hist[pos] = func(self._hist[pos].astype(float, copy=True), **func_kws)
History._validate(hist, self.meta)
try:
self._validate(hist, self._meta)
except Exception as e:
raise ValueError(
f"result from applied function is not a valid History, because {e}"
) from e
if copy:
history = History(index=None) # noqa
history.meta = self.meta.copy()
history._meta = self._meta.copy()
else:
history = self
history.hist = hist.astype("category")
history._hist = hist.astype("category")
return history
......@@ -350,8 +431,8 @@ class History:
"""
copyfunc = deepcopy if deep else shallowcopy
new = History(self.index)
new.hist = self.hist.copy(deep)
new.meta = copyfunc(self.meta)
new._hist = self._hist.copy(deep)
new._meta = copyfunc(self._meta)
return new
def __copy__(self):
......@@ -367,14 +448,14 @@ class History:
return self.copy(deep=True)
def __len__(self) -> int:
return len(self.hist.columns)
return len(self._hist.columns)
def __repr__(self):
if self.empty:
return str(self.hist).replace("DataFrame", "History")
return str(self._hist).replace("DataFrame", "History")
r = self.hist.astype(str)
r = self._hist.astype(str)
return str(r)[1:]
......@@ -382,51 +463,62 @@ class History:
# validation
#
@staticmethod
def _validate(hist: pd.DataFrame, meta: List[Any]) -> Tuple[pd.DataFrame, List]:
@classmethod
def _validate(
cls, hist: pd.DataFrame, meta: List[Any]
) -> Tuple[pd.DataFrame, List]:
"""
check type, columns, index, dtype of hist and if the meta fits also
"""
cls._validateHist(hist)
cls._validateMetaList(meta, hist)
return hist, meta
# check hist
if not isinstance(hist, pd.DataFrame):
@classmethod
def _validateHist(cls, obj):
if not isinstance(obj, pd.DataFrame):
raise TypeError(
f"'hist' must be of type pd.DataFrame, but is of type {type(hist).__name__}"
f"'hist' must be of type pd.DataFrame, "
f"but is of type {type(obj).__name__}"
)
# isin([float, ..]) does not work !
if not (
(hist.dtypes == float)
| (hist.dtypes == np.float32)
| (hist.dtypes == np.float64)
| (hist.dtypes == "category")
).all():
if not obj.columns.equals(pd.RangeIndex(len(obj.columns))):
raise ValueError(
"dtype of all columns in hist must be float or categorical"
)
if not hist.empty and (
not hist.columns.equals(pd.Index(range(len(hist.columns))))
or not np.issubdtype(hist.columns.dtype, np.integer)
):
raise ValueError(
"column names must be continuous increasing int's, starting with 0."
"Columns of 'hist' must consist of "
"continuous increasing integers, "
"starting with 0."
)
for c in obj.columns:
try:
cls._validateValue(obj[c])
except Exception as e:
raise ValueError(f"Bad column in hist. column '{c}': {e}") from None
return obj
# check meta
if not isinstance(meta, list):
@classmethod
def _validateMetaList(cls, obj, hist=None):
if not isinstance(obj, list):
raise TypeError(
f"'meta' must be of type list, but is of type {type(meta).__name__}"
)
if not all([isinstance(e, dict) for e in meta]):
raise TypeError("All elements in meta must be of type 'dict'")
# check combinations of hist and meta
if not len(hist.columns) == len(meta):
raise ValueError(
"'meta' must have as many entries as columns exist in hist"
f"'meta' must be of type list, got type {type(obj).__name__}"
)
if hist is not None:
if not len(obj) == len(hist.columns):
raise ValueError(
"'meta' must have as many entries as columns in 'hist'"
)
for i, item in enumerate(obj):
try:
cls._validateMetaDict(item)
except Exception as e:
raise ValueError(f"Bad meta. item {i}: {e}") from None
return obj
return hist, meta
@staticmethod
def _validateMetaDict(obj):
if not isinstance(obj, dict):
raise TypeError("obj must be dict")
if not all(isinstance(k, str) for k in obj.keys()):
raise ValueError("all keys in dict must be strings")
return obj
@staticmethod
def _validateValue(obj: pd.Series) -> pd.Series:
......@@ -435,14 +527,52 @@ class History:
"""
if not isinstance(obj, pd.Series):
raise TypeError(
f"value must be of type pd.Series, but {type(obj).__name__} was given"
f"value must be of type pd.Series, got type {type(obj).__name__}"
)
if not ((obj.dtype == float) or isinstance(obj.dtype, pd.CategoricalDtype)):
if not is_float_dtype(obj.dtype) and not is_categorical_dtype(obj.dtype):
raise ValueError("dtype must be float or categorical")
return obj
@classmethod
def createFromData(cls, hist: pd.DataFrame, meta: List[Dict], copy: bool = False):
"""
Create a History from existing data.
Parameters
----------
hist : pd.Dataframe
Data that define the flags of the history.
meta : List of dict
A list holding meta information for each column, therefore it must
have the same number of entries as columns exist in `hist`.
copy : bool, default False
If `True`, the input data is copied, otherwise not.
Notes
-----
To create a very simple History from a flags dataframe ``f`` use
``mask = pd.DataFrame(True, index=f.index, columns=f.columns``
and
``meta = [{}] * len(f.columns)``.
Returns
-------
History
"""
cls._validate(hist, meta)
if copy:
hist = hist.copy()
meta = deepcopy(meta)
history = cls(index=None) # noqa
history._hist = hist.astype("category", copy=False)
history._meta = meta
return history
def createHistoryFromData(
hist: pd.DataFrame,
......@@ -476,13 +606,10 @@ def createHistoryFromData(
-------
History
"""
History._validate(hist, meta)
if copy:
hist = hist.copy()
meta = deepcopy(meta)
history = History(index=None) # noqa
history.hist = hist.astype("category", copy=False)
history.meta = meta
return history
# todo: expose History, enable this warning
# warnings.warn(
# "saqc.createHistoryFromData() will be deprecated soon. "
# "Please use saqc.History.createFromData() instead.",
# category=FutureWarning,
# )
return History.createFromData(hist, meta, copy)
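The new `start`/`end` parameters of `History.squeeze` reduce a column window of the history to "the last set value per row, defaulting to `UNFLAGGED` (`-inf`)". A dependency-free sketch of that reduction, using plain lists in place of the categorical `pd.DataFrame` (the function name and data layout here are illustrative, not saqc API):

```python
import math

UNFLAGGED = -math.inf  # saqc's "never flagged" constant

def squeeze(hist_rows, start=None, end=None, raw=False):
    """Reduce a history (one list per row, one value-or-None per
    column) to the last set value per row, over a column slice."""
    out = []
    for row in hist_rows:
        window = row[start:end]
        last = None
        for value in window:       # later columns win, like ffill(axis=1)
            if value is not None:
                last = value
        if last is None and not raw:
            last = UNFLAGGED       # 'unset' maps to UNFLAGGED unless raw
        out.append(last)
    return out

# mirrors the docstring example above: three appended test results
hist = [
    [None, 1.0, 2.0],
    [None, 1.0, None],
    [99.0, None, 2.0],
]
print(squeeze(hist))                  # current flags
print(squeeze(hist, start=-1))        # only the last function's flags
print(squeeze(hist, end=-1))          # flags before the last function
print(squeeze(hist, start=1, end=2))  # only the 2nd function's flags
```

The slice semantics match Python's usual negative indexing, which is what makes `squeeze(raw=True, start=start)` in `_squeezeFlags` a drop-in replacement for the removed `_sliceHistory` helper.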
......@@ -147,20 +147,12 @@ def _squeezeFlags(old_flags, new_flags: Flags, columns: pd.Index, meta) -> Flags
# function call. If no such columns exist, we end up with an empty
# new_history.
start = len(old_history.columns)
new_history = _sliceHistory(new_history, slice(start, None))
squeezed = new_history.squeeze(raw=True)
squeezed = new_history.squeeze(raw=True, start=start)
out.history[col] = out.history[col].append(squeezed, meta=meta)
return out
def _sliceHistory(history: History, sl: slice) -> History:
history.hist = history.hist.iloc[:, sl]
history.meta = history.meta[sl]
return history
def _maskData(
data: dios.DictOfSeries, flags: Flags, columns: Sequence[str], thresh: float
) -> Tuple[dios.DictOfSeries, dios.DictOfSeries]:
......
......@@ -7,6 +7,7 @@
# -*- coding: utf-8 -*-
from saqc.core.translation.basescheme import (
FloatScheme,
MappingScheme,
SimpleScheme,
TranslationScheme,
)
......
......@@ -8,6 +8,7 @@
from __future__ import annotations
from abc import abstractmethod, abstractproperty
from typing import Any, Dict
import numpy as np
......@@ -22,7 +23,26 @@ ForwardMap = Dict[ExternalFlag, float]
BackwardMap = Dict[float, ExternalFlag]
class TranslationScheme:
class TranslationScheme: # pragma: no cover
@property
@abstractmethod
def DFILTER_DEFAULT(self):
pass
@abstractmethod
def __call__(self, flag: ExternalFlag) -> float:
pass
@abstractmethod
def toInternal(self, flags: pd.DataFrame | DictOfSeries) -> Flags:
pass
@abstractmethod
def toExternal(self, flags: Flags, attrs: dict | None = None) -> DictOfSeries:
pass
class MappingScheme(TranslationScheme):
"""
This class provides the basic translation mechanism and should serve as
a base class for every other translation scheme.
......@@ -81,7 +101,7 @@ class TranslationScheme:
@staticmethod
def _translate(
flags: Flags | pd.DataFrame | pd.Series,
flags: Flags | pd.DataFrame | pd.Series | DictOfSeries,
trans_map: ForwardMap | BackwardMap,
) -> DictOfSeries:
"""
......@@ -95,7 +115,7 @@ class TranslationScheme:
Returns
-------
pd.DataFrame, Flags
DictOfSeries
"""
if isinstance(flags, pd.Series):
flags = flags.to_frame()
......@@ -128,9 +148,9 @@ class TranslationScheme:
if flag not in self._backward:
raise ValueError(f"invalid flag: {flag}")
return float(flag)
return self._forward[flag]
return float(self._forward[flag])
def forward(self, flags: pd.DataFrame) -> Flags:
def toInternal(self, flags: pd.DataFrame | DictOfSeries | pd.Series) -> Flags:
"""
Translate from 'external flags' to 'internal flags'
......@@ -145,13 +165,11 @@ class TranslationScheme:
"""
return Flags(self._translate(flags, self._forward))
def backward(
def toExternal(
self,
flags: Flags,
raw: bool = False,
attrs: dict | None = None,
**kwargs,
) -> pd.DataFrame | DictOfSeries:
) -> DictOfSeries:
"""
Translate from 'internal flags' to 'external flags'
......@@ -160,9 +178,6 @@ class TranslationScheme:
flags : pd.DataFrame
The external flags to translate
raw: bool, default False
if True return data as DictOfSeries, otherwise as pandas DataFrame.
attrs : dict or None, default None
global meta information of saqc-object
......@@ -172,8 +187,6 @@ class TranslationScheme:
"""
out = self._translate(flags, self._backward)
out.attrs = attrs or {}
if not raw:
out = out.to_df()
return out
......@@ -184,16 +197,30 @@ class FloatScheme(TranslationScheme):
internal float flags
"""
_MAP = {
-np.inf: -np.inf,
**{k: k for k in np.arange(0, 256, dtype=float)},
}
DFILTER_DEFAULT: float = FILTER_ALL
def __init__(self):
super().__init__(self._MAP, self._MAP)
def __call__(self, flag: float | int) -> float:
try:
return float(flag)
except (TypeError, ValueError, OverflowError):
raise ValueError(f"invalid flag, expected a numerical value, got: {flag}")
def toInternal(self, flags: pd.DataFrame | DictOfSeries) -> Flags:
try:
return Flags(flags.astype(float))
except (TypeError, ValueError, OverflowError):
raise ValueError(
f"invalid flag(s), expected a collection of numerical values, got: {flags}"
)
def toExternal(self, flags: Flags, attrs: dict | None = None) -> DictOfSeries:
out = flags.toDios()
out.attrs = attrs or {}
return out
class SimpleScheme(TranslationScheme):
class SimpleScheme(MappingScheme):
"""
Acts as the default Translator, provides a changeable subset of the
......
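The refactor above splits the old concrete `TranslationScheme` into an abstract interface (`toInternal`/`toExternal`/`__call__`) and a dict-driven `MappingScheme` base that `SimpleScheme`, `DmpScheme`, and `PositionalScheme` now inherit from. A minimal self-contained sketch of that shape (plain lists stand in for `Flags`/`DictOfSeries`, and `DFILTER_DEFAULT` is omitted; class and value names are illustrative):

```python
from abc import ABC, abstractmethod

class TranslationScheme(ABC):
    """Abstract interface: map external flag values to internal
    floats (toInternal) and back (toExternal)."""

    @abstractmethod
    def __call__(self, flag):
        """Translate a single external flag to an internal float."""

    @abstractmethod
    def toInternal(self, flags):
        """Translate a collection of external flags to internal ones."""

    @abstractmethod
    def toExternal(self, flags):
        """Translate internal flags back to the external scheme."""

class MiniMappingScheme(TranslationScheme):
    """Dict-driven scheme, analogous to the new MappingScheme base."""

    def __init__(self, forward, backward):
        self._forward = forward
        self._backward = backward

    def __call__(self, flag):
        if flag not in self._forward:
            raise ValueError(f"invalid flag: {flag}")
        return float(self._forward[flag])

    def toInternal(self, flags):
        return [self._forward[f] for f in flags]

    def toExternal(self, flags):
        return [self._backward[f] for f in flags]

# a toy two-level scheme, in the spirit of SimpleScheme
scheme = MiniMappingScheme(
    forward={"OK": 0.0, "BAD": 255.0},
    backward={0.0: "OK", 255.0: "BAD"},
)
```

`FloatScheme`, by contrast, now implements `TranslationScheme` directly without a mapping, which is what lifts the old flag-value restrictions mentioned in the changelog.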
......@@ -17,7 +17,7 @@ import pandas as pd
from saqc.constants import BAD, DOUBTFUL, GOOD, UNFLAGGED
from saqc.core.flags import Flags
from saqc.core.history import History
from saqc.core.translation.basescheme import BackwardMap, ForwardMap, TranslationScheme
from saqc.core.translation.basescheme import BackwardMap, ForwardMap, MappingScheme
_QUALITY_CAUSES = [
"",
......@@ -40,7 +40,7 @@ _QUALITY_LABELS = [
]
class DmpScheme(TranslationScheme):
class DmpScheme(MappingScheme):
"""
Implements the translation from and to the flagging scheme implemented in
......@@ -91,7 +91,7 @@ class DmpScheme(TranslationScheme):
field_history.append(histcol, meta=meta)
return field_history
def forward(self, df: pd.DataFrame) -> Flags:
def toInternal(self, df: pd.DataFrame) -> Flags:
"""
Translate from 'external flags' to 'internal flags'
......@@ -114,7 +114,7 @@ class DmpScheme(TranslationScheme):
return Flags(data)
def backward(
def toExternal(
self, flags: Flags, attrs: dict | None = None, **kwargs
) -> pd.DataFrame:
"""
......@@ -131,7 +131,7 @@ class DmpScheme(TranslationScheme):
-------
translated flags
"""
tflags = super().backward(flags, raw=True, attrs=attrs)
tflags = super().toExternal(flags, attrs=attrs)
out = pd.DataFrame(
index=reduce(lambda x, y: x.union(y), tflags.indexes).sort_values(),
......
......@@ -12,10 +12,10 @@ import pandas as pd
from saqc.constants import BAD, DOUBTFUL, GOOD, UNFLAGGED
from saqc.core.flags import Flags, History
from saqc.core.translation.basescheme import BackwardMap, ForwardMap, TranslationScheme
from saqc.core.translation.basescheme import BackwardMap, ForwardMap, MappingScheme
class PositionalScheme(TranslationScheme):
class PositionalScheme(MappingScheme):
"""
Implements the translation from and to the flagging scheme implemented by CHS
......@@ -43,7 +43,7 @@ class PositionalScheme(TranslationScheme):
def __init__(self):
super().__init__(forward=self._FORWARD, backward=self._BACKWARD)
def forward(self, flags: pd.DataFrame) -> Flags:
def toInternal(self, flags: pd.DataFrame) -> Flags:
"""
Translate from 'external flags' to 'internal flags'
......@@ -75,7 +75,7 @@ class PositionalScheme(TranslationScheme):
return Flags(data)
def backward(self, flags: Flags, **kwargs) -> pd.DataFrame:
def toExternal(self, flags: Flags, **kwargs) -> pd.DataFrame:
"""
Translate from 'internal flags' to 'external flags'
......
......@@ -396,7 +396,15 @@ class FlagtoolsMixin:
0 -inf -inf -inf
1 255.0 255.0 255.0
"""
import warnings
warnings.warn(
f"""The method 'transferFlags' is deprecated and
will be removed in version 2.5 of SaQC. Please use
'SaQC.concatFlags(field={field}, target={target}, method="match", squeeze=False)'
instead""",
DeprecationWarning,
)
return self.concatFlags(field, target=target, method="match", squeeze=False)
@flagging()
......
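The `transferFlags` hunk above follows a common deprecation pattern: the old entry point emits a `DeprecationWarning` and delegates to the new one with fixed arguments. A self-contained sketch of that pattern (the shim function and its arguments are illustrative, not the saqc method itself):

```python
import warnings

def transfer_flags_shim(concat_flags, field, target=None):
    """Deprecated entry point: warn, then delegate to the new
    concat_flags callable with the equivalent fixed arguments."""
    warnings.warn(
        "'transferFlags' is deprecated, use "
        "concatFlags(..., method='match', squeeze=False) instead",
        DeprecationWarning,
        stacklevel=2,
    )
    return concat_flags(field, target=target, method="match", squeeze=False)
```

Keeping the old call delegating to the new one (rather than duplicating its body) guarantees both paths stay behaviorally identical until the shim is removed, which the warning text schedules for SaQC 2.5.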