How to source-target multivariate functions?

While working on !320 (closed) I realized that our source-target concept for multivariate functions is not yet sound enough.


Consider a univariate function call like:

saqc.flagUnivariate(field=["a", "b"], target=["x", "y"])

Here the semantics are reasonable and (as a reminder) work as follows:

  • Copy a -> x and b -> y. This includes data, Flags and History
  • Call the function as saqc.flagUnivariate(field="x") ; saqc.flagUnivariate(field="y")
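To make the two steps above concrete, here is a minimal sketch in plain Python. It is not saqc code: `Variable`, `call_univariate` and the toy `flag_univariate` are hypothetical stand-ins, and data, Flags and History are modelled as simple lists purely for illustration.

```python
import copy
from dataclasses import dataclass, field as dc_field


@dataclass
class Variable:
    """Hypothetical stand-in for one variable: data, flags and history."""
    data: list
    flags: list
    history: list = dc_field(default_factory=list)


def call_univariate(store, func, field, target):
    """Copy every field to its target, then call func once per target."""
    # Step 1: copy a -> x and b -> y, including data, flags and history.
    for f, t in zip(field, target):
        store[t] = copy.deepcopy(store[f])
    # Step 2: call the function separately on each target.
    for t in target:
        func(store, t)


def flag_univariate(store, name):
    """Toy test (an assumption, not a saqc function): flag negative values."""
    var = store[name]
    column = [value < 0 for value in var.data]
    var.history.append(column)
    var.flags = [old or new for old, new in zip(var.flags, column)]


store = {
    "a": Variable(data=[1, -2, 3], flags=[False, False, False]),
    "b": Variable(data=[-1, 5, 6], flags=[False, False, False]),
}
call_univariate(store, flag_univariate, field=["a", "b"], target=["x", "y"])
```

In this sketch the sources a and b stay untouched, while x and y each carry only the legacy of their own source plus the new test result.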

When we go into multivariate terrain, a similar function call behaves differently. An example:

saqc.flagMultivariate(field=["a", "b"], target=["x", "y"])

Here we could still follow similar semantics:

  • Copy a -> x and b -> y. This includes data, Flags and History
  • Call the function as saqc.flagMultivariate(field=["a", "b"], target=["x", "y"])

But since flagMultivariate receives both field and target and is free to do whatever it wants with them, things already start to get a bit messy. The following questions come to mind:

  • Does it generally make sense to map a to x and b to y or should they be 'fresh' and empty variables?
  • Is it sensible to have two separate histories for x and y, where each variable carries only the legacy of its source variable? Or should x and y reflect the 'merge' of the histories of a and b?

If we go into the more likely (?) multivariate use case, where a single target is generated from multiple fields, we are on completely undefined terrain:

saqc.flagMultivariate(field=["a", "b"], target="x")

Now, how should we generate x? From a -> x or from b -> x? As an 'empty' variable? While the latter would make sense to me, we would also lose the histories of a and b (which we shouldn't, IMO).


Currently (but this might rapidly change), I lean towards the following for multivariate functions:

  • Initialize all targets as empty variables, i.e. data[t].isna() for t in target (restriction: all field variables need the same index)
  • All targets get the same History and Flags, which are the product of the Histories and Flags of all field variables. What such a History merge could look like, however, is still not clear to me.
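As a starting point for discussion, the two points above could be sketched like this. Everything here is an assumption: variables are plain dicts, a History is a dict of labelled flag columns, the flag values 0.0 and 255.0 are placeholders, and the merge is simply "put all columns of all fields side by side and take the row-wise maximum as the resulting flags". This is only one candidate for the merge, not a settled design and not the actual History class.

```python
import math


def init_multivariate_targets(store, field, target):
    """Create empty targets that share one merged history (hypothetical helper)."""
    index = store[field[0]]["index"]
    # Restriction from the proposal: all field variables need the same index.
    if any(store[f]["index"] != index for f in field):
        raise ValueError("all field variables need the same index")

    # Candidate merge: keep every history column, prefixed with its source name.
    merged = {}
    for f in field:
        for label, column in store[f]["history"].items():
            merged[f"{f}:{label}"] = list(column)

    for t in target:
        store[t] = {
            "index": list(index),
            "values": [math.nan] * len(index),  # empty data
            "history": {label: list(col) for label, col in merged.items()},
        }


def current_flags(variable):
    """Row-wise maximum over all history columns (empty list if no columns)."""
    return [max(row) for row in zip(*variable["history"].values())]


index = ["2024-01-01", "2024-01-02", "2024-01-03"]
store = {
    "a": {"index": index, "values": [1.0, 2.0, 3.0],
          "history": {"test0": [0.0, 255.0, 0.0]}},
    "b": {"index": index, "values": [4.0, 5.0, 6.0],
          "history": {"test0": [0.0, 0.0, 255.0]}},
}
init_multivariate_targets(store, field=["a", "b"], target=["x"])
```

With this merge, x starts with no data but still knows every test that ever ran on a or b, so nothing from the source histories is lost. Whether a row-wise maximum is the right way to derive the Flags is exactly the open question.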

I need your thoughts here, @palmb and @luenensc!