Suggestion for __getitem__/__setitem__ support
The main idea is to add slicing functionality to the SaQC object, so that we can additionally express the following:
```python
qc = SaQC(data)
qc = qc.flagRange(field="a")
qc = qc.flagRange(field="b", target="c")
```

as

```python
qc = SaQC(data)
qc["a"] = qc["a"].flagRange()
qc["c"] = qc["b"].flagRange()
```
To get this rolling, (I think) the following changes are needed:

- `__getitem__` with signature `(str | Sequence[str]) -> SaQC` to 'slice' the SaQC object into a subset of variables
- `__setitem__` with signature `(str | Sequence[str], SaQC) -> None` to 'merge' two SaQC objects into one
- make `field` an optional parameter and default to 'process all columns'. While this might not be strictly necessary, something like the following is IMO a requirement to make the entire feature usable: `qc["c", "d"] = qc["a", "b"].flagRange()`
The code below is only there to express the ideas given above in code... What do you think @palmb?
```python
import numpy as np
import pandas as pd


class Prototype:
    def __init__(self, data) -> None:
        self._data = data

    def __getitem__(self, key):
        # 'slice' a single variable into a new Prototype
        return Prototype({key: self._data[key]})

    def __setitem__(self, key: str, value: "Prototype"):
        # 'merge' the (single) variable held by `value` back in under `key`
        self._data[key] = value._data[tuple(value._data.keys())[0]]

    def __repr__(self):
        return f"Prototype({self._data})"

    def flagRange(self, field=None, target=None, min=-np.inf, max=np.inf):
        # `field` defaults to 'process all columns'
        if field is None:
            field = list(self._data.keys())
        if np.isscalar(field):
            field = [field]
        # results are written back to `field` unless `target` is given
        if target is None:
            target = field
        if np.isscalar(target):
            target = [target]
        for f, t in zip(field, target):
            self._data[t] = self._data[f].clip(lower=min, upper=max)
        return self


if __name__ == "__main__":
    p = Prototype(data={"SM1": pd.Series([1, 5, 10, 15]), "SM2": pd.Series([2, 18, 34, 64])})
    p["SM3"] = p["SM1"].flagRange(min=2, max=19)
    p = p.flagRange("SM1", target="SM3", min=2, max=19)
    print(p)
```
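The prototype above only handles a single `str` key, while the proposed signatures also allow `Sequence[str]`. A minimal sketch of how the multi-key case (`qc["c", "d"] = qc["a", "b"].flagRange()`) could look — class and method names here are hypothetical, not part of SaQC:

```python
import numpy as np
import pandas as pd


class MultiPrototype:
    """Sketch: __getitem__/__setitem__ accepting a str or a sequence of str."""

    def __init__(self, data):
        self._data = data

    def __getitem__(self, key):
        # qc["a", "b"] arrives as the tuple ("a", "b")
        if isinstance(key, str):
            key = (key,)
        return MultiPrototype({k: self._data[k] for k in key})

    def __setitem__(self, key, value):
        if isinstance(key, str):
            key = (key,)
        if len(key) != len(value._data):
            raise ValueError("number of targets must match number of variables")
        # merge the sliced object back, variable by variable
        for target, source in zip(key, value._data.values()):
            self._data[target] = source

    def __repr__(self):
        return f"MultiPrototype({self._data})"

    def flagRange(self, min=-np.inf, max=np.inf):
        # process all columns of the (possibly sliced) object
        for k in self._data:
            self._data[k] = self._data[k].clip(lower=min, upper=max)
        return self


qc = MultiPrototype({"a": pd.Series([1, 30]), "b": pd.Series([2, 40])})
qc["c", "d"] = qc["a", "b"].flagRange(max=25)
```

Since slicing copies the variable mapping, the clipped results land only in the new columns `c` and `d`, while `a` and `b` stay untouched.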
Edited by David Schäfer