Non-scalar `field`s
I'd like to lift the restriction that `field` arguments to test functions need to be scalar and instead allow values of type `Union[str, Sequence[str]]`.
In the case of multivariate tests (e.g. `outliers.flagMVScores`) this restriction does not make sense and leads to the situation that we flag not only `field` but also the columns given in another argument (here: `fields`). As we already allow regular expressions as `field` values, the scalar restriction is even less reasonable.
So I'd like to make the following changes:
- Remove the value guards on the `field` parameter of all `SaQC` method calls.
- Adjust the config reader to support comma-separated `varname` fields.
- Add another keyword to `@register`, namely `arity: Literal["multi", "single"]` (or something like that). We need a way to decide whether to call a function repeatedly on all provided variables (univariate use cases) or to pass them as a collection (multivariate functions). I.e. we need a way to detect whether we should expand `flagFoo(field=["var1", "var2"])` to `flagFoo("var1").flagFoo("var2")` or not, and marking the function itself seems to be an easy and appropriate solution to me.

  We could probably get away without the `@register` parameter and infer the arity information from the type hints of `field` (i.e. `field: str` vs `field: Sequence[str]`), but I guess this is brittle and needs quite some care from function authors.
- Change the `SaQC` machinery implementing the field expansion.
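
To make the proposed dispatch concrete, here is a minimal, standalone sketch. It does not use the real `@register` decorator or the real `SaQC` class: the registry, the `call` and `parse_varname` helpers, `flagBar`, and the dict standing in for the data/flags object are all assumptions made up for illustration. Only the `arity` keyword and the `flagFoo` expansion are taken from the proposal above.

```python
from typing import Callable, Dict, List, Literal, Sequence, Tuple, Union

Arity = Literal["single", "multi"]
FieldArg = Union[str, Sequence[str]]

# Hypothetical registry: function name -> (function, arity).
FUNC_REGISTRY: Dict[str, Tuple[Callable, Arity]] = {}


def register(arity: Arity = "single") -> Callable:
    """Stand-in for ``@register`` that additionally records the arity."""

    def decorator(func: Callable) -> Callable:
        FUNC_REGISTRY[func.__name__] = (func, arity)
        return func

    return decorator


def to_list(field: FieldArg) -> List[str]:
    """Normalize a scalar name or a sequence of names to a list."""
    return [field] if isinstance(field, str) else list(field)


def parse_varname(varname: str) -> List[str]:
    """Split a comma-separated config ``varname`` entry into names."""
    return [name.strip() for name in varname.split(",") if name.strip()]


def call(name: str, data: dict, field: FieldArg, **kwargs) -> dict:
    """Dispatch a registered function according to its arity."""
    func, arity = FUNC_REGISTRY[name]
    fields = to_list(field)
    if arity == "multi":
        # Multivariate: pass all fields as one collection.
        return func(data, fields, **kwargs)
    # Univariate: expand into one call per field.
    for f in fields:
        data = func(data, f, **kwargs)
    return data


@register(arity="single")
def flagFoo(data: dict, field: str) -> dict:
    # Toy body: only record which field the function was called with.
    return {**data, "calls": data.get("calls", []) + [("flagFoo", field)]}


@register(arity="multi")
def flagBar(data: dict, field: Sequence[str]) -> dict:
    # Toy body: record the whole collection of fields at once.
    return {**data, "calls": data.get("calls", []) + [("flagBar", tuple(field))]}
```

With this sketch, `call("flagFoo", {}, ["var1", "var2"])` behaves like two chained univariate calls, while `call("flagBar", {}, ["var1", "var2"])` hands both names to the function in a single call. A scalar `field` keeps working unchanged in both cases.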
All these changes should be straightforward to implement, and I would take this over if I get an okay from @palmb and/or @luenensc.