Draft: SaQC dunder operators
dunder operator mixin for the SaQC class.
The idea is to dispatch dunder calls to flagGeneric or processGeneric, and to strictly constrain the operators: arithmetic operators act only on the data, logic operators act only on the flags.
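The routing idea can be sketched without SaQC at all. The following is a minimal illustration, not the proposed implementation: DunderMixin, Recorder and the one-argument signatures of processGeneric and flagGeneric are hypothetical simplifications (the real generics also take field and target).

```python
import operator

# Operator tables: arithmetic dunders target the data, logic dunders the flags.
ARITHMETIC = {
    "__add__": operator.add, "__sub__": operator.sub, "__mul__": operator.mul,
    "__truediv__": operator.truediv, "__mod__": operator.mod, "__pow__": operator.pow,
}
LOGIC = {"__and__": operator.and_, "__or__": operator.or_}


def _make_dunder(generic_name, op):
    def dunder(self, other):
        # Hand the element-wise function to the named generic of the host class.
        return getattr(self, generic_name)(lambda x: op(x, other))
    return dunder


class DunderMixin:
    """Hypothetical mixin; the host class must provide both generics."""


for _name, _op in ARITHMETIC.items():
    setattr(DunderMixin, _name, _make_dunder("processGeneric", _op))
for _name, _op in LOGIC.items():
    setattr(DunderMixin, _name, _make_dunder("flagGeneric", _op))


class Recorder(DunderMixin):
    """Toy host that only records which generic a dunder call reached."""

    def __init__(self):
        self.calls = []

    def processGeneric(self, func):
        self.calls.append(("processGeneric", func(2)))  # probe func with the value 2
        return self

    def flagGeneric(self, func):
        self.calls.append(("flagGeneric", func(True)))  # probe func with True
        return self


rec = Recorder()
rec + 3      # arithmetic, routed to processGeneric
rec ** 2     # arithmetic, routed to processGeneric
rec & False  # logic, routed to flagGeneric
print(rec.calls)
```

Generating the dunders from two tables keeps the data/flags split in one place, so an operator cannot end up on the wrong side by accident.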
Arithmetic operators (+, *, -, /, %, **)
Arithmetic dunder operators are confined to operate on the data and thus wrap processGeneric.
Consistency for the arithmetic operators is fairly easily achieved by defining:
qc[field] + t
with t being some scalar (float, int):
qc.processGeneric(field, target=target, func=lambda x: x + t)[target]
(target is always selected so that it is a new variable)
qc[field] + A
with A being some array-like:
qc.processGeneric(field, target=target, func=lambda x: x + A)[target]
qc1[field1] + qc2[field2]
In case both operands are SaQC objects, the solution remains fairly straightforward:
qc.processGeneric([field1, field2], target=target, func=lambda x, y: x + y)[target]
(with qc being a join of qc1 and qc2)
-qc[field]
Unary operations also just get passed to processGeneric:
qc.processGeneric(field, target=target, func=lambda x: -x)[target]
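The four cases can be played through with a toy stand-in that handles data only. This is a sketch under assumptions: ToyField, its processGeneric signature and the target naming rule (parenthesise compound operand names) are hypothetical and merely chosen to reproduce names such as (d1*13)+5 from the Examples.

```python
def _wrap(name):
    # Parenthesise compound names so chained targets stay readable.
    return name if name.isidentifier() else f"({name})"


class ToyField:
    """Hypothetical single-variable stand-in for qc[field]; data only."""

    def __init__(self, name, values):
        self.name = name
        self.values = list(values)

    def processGeneric(self, target, func, *others):
        # Stand-in: func sees whole columns, the result lands in a new target.
        columns = [self.values] + [o.values for o in others]
        return ToyField(target, func(*columns))

    def _binary(self, other, symbol, op):
        if isinstance(other, ToyField):  # qc1[field1] + qc2[field2]
            target = f"{_wrap(self.name)}{symbol}{_wrap(other.name)}"
            func = lambda x, y: [op(a, b) for a, b in zip(x, y)]
            return self.processGeneric(target, func, other)
        if isinstance(other, (int, float)):  # qc[field] + t
            target = f"{_wrap(self.name)}{symbol}{other}"
            return self.processGeneric(target, lambda x: [op(a, other) for a in x])
        # qc[field] + A, with A being array-like
        target = f"{_wrap(self.name)}{symbol}A"
        return self.processGeneric(target, lambda x: [op(a, b) for a, b in zip(x, other)])

    def __add__(self, other):
        return self._binary(other, "+", lambda a, b: a + b)

    def __mul__(self, other):
        return self._binary(other, "*", lambda a, b: a * b)

    def __truediv__(self, other):
        return self._binary(other, "/", lambda a, b: a / b)

    def __neg__(self):  # -qc[field]
        return self.processGeneric(f"-{_wrap(self.name)}", lambda x: [-a for a in x])


d1 = ToyField("d1", [1, 2, 3])
chained = (d1 * 13) + 5   # scalar operands, chained
ratio = chained / d1      # both operands are fields
shifted = d1 + [10, 20, 30]  # array-like operand
print(chained.name, chained.values)
print(ratio.name, ratio.values)
```

Because every operation writes to a fresh target, the operands themselves are never modified.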
Binary operators (&, |, ~)
Binary dunder operators are confined to wrap flagGeneric and only operate on the flags.
The idea is to dispatch the actual logic down to an isflagged call in flagGeneric.
qc[field] & A
with A being some boolean array-like:
qc.flagGeneric(field, target=field, func=lambda x: isflagged(x) & A)[field]
qc1[field1] & qc2[field2]
To avoid the ambiguity of what happens with the data if both operands are SaQC objects, all data is returned as NaN in this case. Flags are calculated with respect to the operator (& in this case).
The call is dispatched to:
qc.flagGeneric([field1, field2], target=target, func=lambda x, y: isflagged(x) & isflagged(y))[target]
(with qc being a join of qc1 and qc2, and target being chosen so that it isn't present in qc)
~qc[field]
Negation is dispatched via:
qc.flagGeneric(field, target=target, func=lambda x: ~isflagged(x))[target]
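A toy model may help to see what the three logic operators do to flags and data. It is only a sketch: ToyField is hypothetical, the flag levels (-inf for UNFLAGGED, 255 for BAD) are taken from the Examples, and the choice to set BAD without touching other existing flags in the array-like case is an assumption of this toy.

```python
import math

UNFLAGGED = -math.inf  # assumed levels, matching the flag values in the Examples
BAD = 255.0


class ToyField:
    """Hypothetical stand-in for qc[field]: data plus one flag level per value."""

    def __init__(self, data, flags):
        self.data = list(data)
        self.flags = list(flags)

    def isflagged(self):
        return [f > UNFLAGGED for f in self.flags]

    @staticmethod
    def _new_target(mask):
        # New target variable: all-NaN data, BAD wherever the mask is true.
        nan = [math.nan] * len(mask)
        return ToyField(nan, [BAD if m else UNFLAGGED for m in mask])

    def _combine(self, other, op):
        if isinstance(other, ToyField):  # qc1[field1] & qc2[field2]
            mask = [op(a, b) for a, b in zip(self.isflagged(), other.isflagged())]
            return self._new_target(mask)
        # qc[field] & A: target is the field itself, so the data is kept.
        mask = [op(a, bool(b)) for a, b in zip(self.isflagged(), other)]
        flags = [BAD if m else f for m, f in zip(mask, self.flags)]
        return ToyField(self.data, flags)

    def __and__(self, other):
        return self._combine(other, lambda a, b: a and b)

    def __or__(self, other):
        return self._combine(other, lambda a, b: a or b)

    def __invert__(self):  # ~qc[field]
        return self._new_target([not a for a in self.isflagged()])


d1 = ToyField([1, 2, 3], [100.0, 100.0, UNFLAGGED])
d2 = ToyField([1, 2, 3], [UNFLAGGED, UNFLAGGED, BAD])

both = d1 & d2                     # never flagged at the same position
either = d1 | d2                   # flagged at every position
negated = ~d2                      # flagged where d2 is not
masked = d1 & [True, False, True]  # boolean array-like operand
print(either.flags, negated.flags, masked.flags)
```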
Assignment operators (where(), |=)
Unfortunately, to assign flags resulting from binary dunder application to a variable, additional operators are needed: .where() and |=.
qc1[field1].where(qc2[field2])
Flags field1 in qc1 according to field2 in qc2. Internally, this is dispatched to:
qc.flagGeneric(field2, target=field1, func=lambda x: isflagged(x))[field1]
(with field1 keeping the data)
qc1[field1].where(qc2[field2], flag=F)
where() can be provided with a flag value to control the flag that is set. The parameter gets forwarded to the underlying flagGeneric call:
qc.flagGeneric(field2, target=field1, func=lambda x: isflagged(x), flag=F)[field1]
While where() returns the flagged SaQC object, |= operates in place:
qc1[field1] |= qc2[field2]
equals qc1[field1] = qc1[field1].where(qc2[field2]), and
qc1[field1] |= qc2[field2], F
equals qc1[field1] = qc1[field1].where(qc2[field2], flag=F)
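One detail of the |= form is worth spelling out: Python parses `a |= b, F` as `a |= (b, F)`, so __ior__ receives a tuple, and for a subscripted target the result is written back through __setitem__. The sketch below illustrates this with hypothetical View and Container classes; it makes no claim about how SaQC would implement it.

```python
import math

UNFLAGGED = -math.inf  # assumed levels, matching the flag values in the Examples
BAD = 255.0


class View:
    """Hypothetical single-field view: data plus one flag level per value."""

    def __init__(self, data, flags):
        self.data = list(data)
        self.flags = list(flags)

    def where(self, cond, flag=BAD):
        # Keep the own data; set `flag` wherever `cond` carries a flag.
        flags = [flag if c > UNFLAGGED else f for f, c in zip(self.flags, cond.flags)]
        return View(self.data, flags)

    def __ior__(self, other):
        # `view |= cond, F` arrives here as the tuple (cond, F).
        cond, flag = other if isinstance(other, tuple) else (other, BAD)
        return self.where(cond, flag=flag)


class Container:
    """Hypothetical stand-in for the SaQC object holding several fields."""

    def __init__(self):
        self.fields = {}

    def __getitem__(self, name):
        return self.fields[name]

    def __setitem__(self, name, view):
        self.fields[name] = view


qc = Container()
qc["d1"] = View([1, 2, 3], [UNFLAGGED] * 3)
qc["d2"] = View([1, 2, 3], [UNFLAGGED, UNFLAGGED, BAD])

# Runs __getitem__, then __ior__ with (qc["d2"], 100.0), then __setitem__.
qc["d1"] |= qc["d2"], 100.0
print(qc["d1"].flags)
```

Since __ior__ returns a new view and the container stores it, the statement behaves like the spelled-out assignment with where().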
Comparative operators (<, <=, >, >=)
Comparative operators also only operate on the flags part of the SaQC input.
The idea is to have a shorthand for filtering input data by their flag level.
qc[field] > 0
This will return a new SaQC object, where values are flagged BAD if their flag level in qc was above 0, and UNFLAGGED otherwise.
(qc[field] > "DOUBTFUL") + 5
This will yield a SaQC object, where the data is qc[field] + 5 if the field value was flagged better than DOUBTFUL, and np.nan otherwise.
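The interplay of a comparison with a subsequent arithmetic operation can be sketched as follows. ToyField is hypothetical, and the numeric values behind the level names (GOOD = 0, DOUBTFUL = 25, BAD = 255) are an assumption of this sketch; only -inf and 255 appear in the Examples.

```python
import math

# Assumed flag levels; the name-to-number mapping is an assumption of this sketch.
LEVELS = {"UNFLAGGED": -math.inf, "GOOD": 0.0, "DOUBTFUL": 25.0, "BAD": 255.0}
UNFLAGGED, BAD = LEVELS["UNFLAGGED"], LEVELS["BAD"]


class ToyField:
    """Hypothetical stand-in for qc[field]: data plus one flag level per value."""

    def __init__(self, data, flags):
        self.data = list(data)
        self.flags = list(flags)

    def _compare(self, level, test):
        threshold = LEVELS[level] if isinstance(level, str) else level
        flags = [BAD if test(f, threshold) else UNFLAGGED for f in self.flags]
        return ToyField(self.data, flags)  # data is kept, flags are rebuilt

    def __gt__(self, level):
        return self._compare(level, lambda f, t: f > t)

    def __ge__(self, level):
        return self._compare(level, lambda f, t: f >= t)

    def __lt__(self, level):
        return self._compare(level, lambda f, t: f < t)

    def __le__(self, level):
        return self._compare(level, lambda f, t: f <= t)

    def __add__(self, t):
        # Arithmetic only sees unflagged values; flagged ones become NaN.
        data = [x + t if f == UNFLAGGED else math.nan
                for x, f in zip(self.data, self.flags)]
        return ToyField(data, [UNFLAGGED] * len(data))


d1 = ToyField([1, 2, 3], [100.0, 100.0, UNFLAGGED])

strict = (d1 > "DOUBTFUL") + 5  # level 100 is above DOUBTFUL, so it is masked
lenient = (d1 > 200) + 5        # level 100 is not above 200, so it takes part
print(strict.data, lenient.data)
```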
Examples
Some data:
>> import saqc
>> import pandas as pd
>> dat= pd.Series([1,2,3], index=pd.date_range('2000', periods=3, freq='30min'), name='d1')
>> qc=saqc.SaQC(dat)
>> qc.data
d1 |
====================== |
2000-01-01 00:00:00 1 |
2000-01-01 00:30:00 2 |
2000-01-01 01:00:00 3 |
Multiply data
>> new_qc = qc * 13
>> new_qc.data
d1*13 |
======================= |
2000-01-01 00:00:00 13 |
2000-01-01 00:30:00 26 |
2000-01-01 01:00:00 39 |
We can chain and mix operations:
>> new_qc = (qc * 13) + 5
>> new_qc.data
(d1*13)+5 |
======================= |
2000-01-01 00:00:00 18 |
2000-01-01 00:30:00 31 |
2000-01-01 01:00:00 44 |
>> qc['arithmetic_juggling'] = new_qc / qc
>> qc['arithmetic_juggling'].data
d1 | arithmetic_juggling |
====================== | ============================== |
2000-01-01 00:00:00 1 | 2000-01-01 00:00:00 18.000000 |
2000-01-01 00:30:00 2 | 2000-01-01 00:30:00 15.500000 |
2000-01-01 01:00:00 3 | 2000-01-01 01:00:00 14.666667 |
Make some flags:
>> qc['d2'] = qc.flagRange('d1', max=2)['d1']
>> qc.flags
d1 | d2 |
======================== | ========================== |
2000-01-01 00:00:00 -inf | 2000-01-01 00:00:00 -inf |
2000-01-01 00:30:00 -inf | 2000-01-01 00:30:00 -inf |
2000-01-01 01:00:00 -inf | 2000-01-01 01:00:00 255.0 |
Flag d1 with value 100, where d2 is not flagged:
>> qc['d1'] = qc['d1'].where(~qc['d2'], flag=100)
>> qc.flags['d1']
d1 |
========================== |
2000-01-01 00:00:00 100.0 |
2000-01-01 00:30:00 100.0 |
2000-01-01 01:00:00 -inf |
Adding d1 and d2 yields all-NaN data
>> (qc['d1'] + qc['d2']).data
d1+d2 |
======================= |
2000-01-01 00:00:00 NaN |
2000-01-01 00:30:00 NaN |
2000-01-01 01:00:00 NaN |
because either d1 or d2 has a flag at every timestamp.
>> qc['d2'].flags
d2 |
========================== |
2000-01-01 00:00:00 -inf |
2000-01-01 00:30:00 -inf |
2000-01-01 01:00:00 255.0 |
>> qc['d1'].flags
d1 |
========================== |
2000-01-01 00:00:00 100.0 |
2000-01-01 00:30:00 100.0 |
2000-01-01 01:00:00 -inf |
By including all the values of 'd1' that are flagged "better" than 200, we can change the arithmetic operator's input:
>> ((qc['d1'] > 200) + qc['d2']).data
(d1>200)+d2 |
======================== |
2000-01-01 00:00:00 2.0 |
2000-01-01 00:30:00 4.0 |
2000-01-01 01:00:00 NaN |
To include the last value, which is flagged at level 255 in 'd2', also change the right operand's input:
>> qc_ = (qc['d1'] > 200) + (qc['d2'] > 300)
>> qc_.data
(d1>200)+(d2>300) |
====================== |
2000-01-01 00:00:00 2 |
2000-01-01 00:30:00 4 |
2000-01-01 01:00:00 6 |
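The three additions shown so far can be replayed by hand. The sketch below assumes that a value takes part in an arithmetic operation only if it is unflagged in both operands; relax and masked_add are hypothetical helper names, and d2 carries the same data as d1 because it was created from d1 via flagRange.

```python
import math

UNFLAGGED = -math.inf
BAD = 255.0


def relax(flags, threshold):
    # Mimics `qc[field] > threshold`: only levels above the threshold stay flagged.
    return [BAD if f > threshold else UNFLAGGED for f in flags]


def masked_add(x, x_flags, y, y_flags):
    # A position contributes only if neither operand is flagged there.
    return [a + b if fa == UNFLAGGED and fb == UNFLAGGED else math.nan
            for a, fa, b, fb in zip(x, x_flags, y, y_flags)]


d1 = d2 = [1, 2, 3]
d1_flags = [100.0, 100.0, UNFLAGGED]
d2_flags = [UNFLAGGED, UNFLAGGED, 255.0]

# qc['d1'] + qc['d2']: every position is flagged in one of the operands.
plain = masked_add(d1, d1_flags, d2, d2_flags)
# (qc['d1'] > 200) + qc['d2']: the level-100 flags of d1 no longer mask.
left = masked_add(d1, relax(d1_flags, 200), d2, d2_flags)
# (qc['d1'] > 200) + (qc['d2'] > 300): the level-255 flag of d2 is lifted as well.
both = masked_add(d1, relax(d1_flags, 200), d2, relax(d2_flags, 300))
print(plain, left, both)
```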
Flags, where either 'd1' is flagged above 150 or 'd2' is flagged below 150:
>> qc_ = (qc['d1'] > 150) | (qc['d2'] < 150)
>> qc_.flags
(d1>150)|(d2<150) |
========================== |
2000-01-01 00:00:00 255.0 |
2000-01-01 00:30:00 255.0 |
2000-01-01 01:00:00 -inf |
Note that the qc_ field holds only NaN in the data:
>> qc_.data
(d1>150)|(d2<150) |
======================= |
2000-01-01 00:00:00 NaN |
2000-01-01 00:30:00 NaN |
2000-01-01 01:00:00 NaN |
To combine flags with existing data, use where():
>> qc_=qc['d1'].where((qc['d1'] > 150) | (qc['d2'] > 250))
>> qc_.data
d1 |
======================== |
2000-01-01 00:00:00 1.0 |
2000-01-01 00:30:00 2.0 |
2000-01-01 01:00:00 3.0 |
>> qc_.flags
d1 |
========================== |
2000-01-01 00:00:00 100.0 |
2000-01-01 00:30:00 100.0 |
2000-01-01 01:00:00 255.0 |