concatFlags always overwrites existing flags
Summary
concatFlags always overwrites flags that already exist on the target variable; there is currently no way to make it respect them.
Reproducible Example
import numpy as np
import pandas as pd
from saqc import SaQC
df = pd.DataFrame(
    data={"a": [1, 2, 5, 4, 3]},
    index=pd.to_datetime(
        ["2020-01-01 00:00", "2020-01-01 00:10", "2020-01-01 00:30", "2020-01-01 00:40", "2020-01-01 01:00"]
    ),
)
qc = SaQC(df)
qc = qc.flagRange(field="a", max=4)
# branch out to another variable
qc = qc.flagRange(field="a", target="b", max=3)
# bring the flags back again
qc = qc.concatFlags("b", target="a", squeeze=True)
What is the current bug behavior?
>>> qc._flags.history["a"]
0 1
2020-01-01 00:00:00 nan nan
2020-01-01 00:10:00 nan nan
2020-01-01 00:30:00 255.0 255.0
2020-01-01 00:40:00 nan 255.0
2020-01-01 01:00:00 nan nan
So the original flag in column 0 (at 2020-01-01 00:30) got overwritten by the flag in column 1.
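To make the overwrite easy to spot, here is a minimal pandas-only sketch. It does not call saqc; it rebuilds the printed history as a plain DataFrame (values copied from the output above) and lists the timestamps that carry a flag in more than one history column. The variable names are illustrative only.

```python
import numpy as np
import pandas as pd

index = pd.to_datetime(
    ["2020-01-01 00:00", "2020-01-01 00:10", "2020-01-01 00:30", "2020-01-01 00:40", "2020-01-01 01:00"]
)

# Plain-DataFrame stand-in for the printed history (not a saqc object).
buggy_history = pd.DataFrame(
    {
        0: [np.nan, np.nan, 255.0, np.nan, np.nan],
        1: [np.nan, np.nan, 255.0, 255.0, np.nan],
    },
    index=index,
)

# A timestamp flagged in more than one column means a later column
# wrote on top of an earlier flag.
flags_per_row = buggy_history.notna().sum(axis=1)
overwritten_at = buggy_history.index[flags_per_row > 1]
print(overwritten_at)
```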
What is the expected correct behavior?
We should at least have the option to reproduce the behavior of the following example, which is equivalent in theory:
import numpy as np
import pandas as pd
from saqc import SaQC
df = pd.DataFrame(
    data={"a": [1, 2, 5, 4, 3]},
    index=pd.to_datetime(
        ["2020-01-01 00:00", "2020-01-01 00:10", "2020-01-01 00:30", "2020-01-01 00:40", "2020-01-01 01:00"]
    ),
)
qc = SaQC(df)
qc = qc.flagRange(field="a", max=4)
qc = qc.flagRange(field="a", max=3)
This respects existing flags and yields:
>>> qc._flags.history["a"]
0 1
2020-01-01 00:00:00 nan nan
2020-01-01 00:10:00 nan nan
2020-01-01 00:30:00 255.0 nan
2020-01-01 00:40:00 nan 255.0
2020-01-01 01:00:00 nan nan
Possible fixes
I suggest adding an option to concatFlags that controls whether it overwrites existing flags or not.
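As a rough illustration of the requested semantics, the sketch below works on a plain pandas DataFrame standing in for a history. The helper `append_flags` and its `overwrite` parameter are hypothetical names chosen for this example; they are not part of saqc, and the actual implementation inside concatFlags may need to look quite different.

```python
import numpy as np
import pandas as pd


def append_flags(history: pd.DataFrame, new_flags: pd.Series, overwrite: bool = True) -> pd.DataFrame:
    """Append `new_flags` as a new column to a history-like frame (hypothetical helper).

    With overwrite=False, positions that already carry a flag in any
    earlier column are left as NaN in the new column.
    """
    new_flags = new_flags.reindex(history.index)
    if not overwrite:
        already_flagged = history.notna().any(axis=1)
        new_flags = new_flags.mask(already_flagged)
    out = history.copy()
    out[len(history.columns)] = new_flags
    return out


index = pd.to_datetime(
    ["2020-01-01 00:00", "2020-01-01 00:10", "2020-01-01 00:30", "2020-01-01 00:40", "2020-01-01 01:00"]
)
# Column 0 as produced by flagRange(field="a", max=4) in the report.
existing = pd.DataFrame({0: [np.nan, np.nan, 255.0, np.nan, np.nan]}, index=index)
# Flags that max=3 would set on its own: values 5 and 4 exceed the limit.
incoming = pd.Series([np.nan, np.nan, 255.0, 255.0, np.nan], index=index)

respected = append_flags(existing, incoming, overwrite=False)
overwritten = append_flags(existing, incoming, overwrite=True)
print(respected)
print(overwritten)
```

With `overwrite=False` the new column stays empty at 2020-01-01 00:30, matching the expected history above; with `overwrite=True` it reproduces the current behavior.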