flatten the hierarchy

I'm currently working on the implementation of the new unified flagger (see !189 (closed)), which also provides a new syntax like this:

some random examples

# set some flags
flagger[:, 'var1'] = BAD
# get a series 
series_v1 = flagger['var1']
# set a new series
flagger['new'] = pd.Series([1,2,3,4], dtype=float)
# slice to only bads
bad_flagger = flagger[flagger == BAD] 

This will work just fine, and we will add more methods to handle all our (and the users') needs. So far so good.

Briefly, the things we need are:

  • set flags only if worse
  • get flags
  • compare flags (like old isFlagged)
  • set flags condition-less (former force)
  • insert a new column
  • delete (drop) a column
  • replace/overwrite or rename a column
  • etc.

All in all we need some user-friendly stuff, like setting flags only if they are worse, but we also need some advanced stuff, like replacing whole columns. The latter is mainly for all the data-processing features we want to provide in saqc.
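To make "only if worse" concrete, here is a minimal sketch on a single pandas column. The flag values and the helper name `set_if_worse` are assumptions for illustration (higher float means worse flag); they are not taken from saqc.

```python
import pandas as pd

# Assumed flag values, for illustration only: a higher float is a worse flag.
UNFLAGGED, DOUBTFUL, BAD = 0.0, 25.0, 255.0


def set_if_worse(old: pd.Series, new_flag: float) -> pd.Series:
    """Return a copy of `old` in which every flag better than `new_flag` is raised to it."""
    # Series.where keeps values where the condition holds and replaces the rest.
    return old.where(old >= new_flag, new_flag)


col = pd.Series([UNFLAGGED, BAD, DOUBTFUL])
updated = set_if_worse(col, DOUBTFUL)
```

The existing BAD entry survives, while the UNFLAGGED entry is raised to DOUBTFUL. An unconditional set (the former force) would simply overwrite all three.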

Currently we implement a core that abstracts the flagger, which itself abstracts a DictOfSeries. So I asked myself: why not rename the flagger to flags? Then the above example looks like this:

flagger becomes flags

# set some flags
flags[:, 'var1'] = BAD
# get a series 
series_v1 = flags['var1']
# set a new series
flags['new'] = pd.Series([1,2,3,4], dtype=float)
# slice to only bads
bad_flags = flags[flags == BAD] 

Wow! That looks nice :D (in my opinion). I know, I know, I already suggested this at some point and the general answer was something like "why not, looks nice". So far so good, again. But now the actual thing...

This flagger/flags thing is nothing more than a DictOfSeries, except for some checks, like whether everything is a valid unified flag, aka a float. The other thing that is missing is a simple user-friendly method to set flags only if they are worse, and maybe one or two other helper methods.

If a user currently wants to destroy our flags (with the old BaseFlagger interface), that's totally possible. We just ask the user not to do so: please only use the functions that are safe (unless you know what you're doing).

So I suggest we clean up DictOfSeries, provide one or two safe user functions, and we're done, instead of providing a whole new level of abstraction, which makes it even harder to get the whole picture. It is also quite a bunch of code, which in the end just wraps DictOfSeries.

If the user messed around with our frame in a self-registered function, we can still check after the actual function call in the core. So we don't lose much. And we won't mess it up ourselves if we write good tests ;)
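A rough sketch of such a check after the function call, using a plain dict of pandas Series as a stand-in for DictOfSeries. The names `check_flags` and `call_user_function` are hypothetical, and the `func(data, flags)` signature is an assumption, not the actual core API.

```python
import pandas as pd


def check_flags(flags: dict) -> None:
    """Raise if any column is not a float-typed pandas Series."""
    for name, col in flags.items():
        if not isinstance(col, pd.Series):
            raise TypeError(f"column {name!r} is not a pd.Series")
        if not pd.api.types.is_float_dtype(col.dtype):
            raise TypeError(f"column {name!r} has dtype {col.dtype}, expected float")


def call_user_function(func, data, flags):
    """Hypothetical core wrapper: run `func`, then validate the flags it returns."""
    data, flags = func(data, flags)
    check_flags(flags)
    return data, flags
```

With something like this the core catches a broken frame right after the offending function, without an extra abstraction layer guarding every single access.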

I imagine something like this:

# a user helper
# set some flags if they are worse
flags.update[:, 'var1'] = BAD

# set some flags unconditionally (!)
flags[:, 'var1'] = BAD
# get a series 
series_v1 = flags['var1']
# set a new series
flags['new'] = pd.Series([1,2,3,4], dtype=float)
# slice to only bads
bad_flags = flags[flags == BAD] 

Or, if you don't like accessors like update[], we could simply provide a method update().

What do you think?
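Both spellings can even live side by side. Below is a toy sketch, not the real DictOfSeries: `Flags` and `_Updater` are made-up names, the value of BAD is assumed, and a plain dict of float Series stands in for the underlying container.

```python
import pandas as pd

BAD = 255.0  # assumed value, for illustration only


class _Updater:
    """Supports flags.update[rows, col] = value and flags.update(col, value, rows)."""

    def __init__(self, flags):
        self._flags = flags

    def __call__(self, column, value, rows=slice(None)):
        col = self._flags._cols[column]
        current = col.loc[rows]
        # Keep flags that are already as bad or worse, raise the rest.
        col.loc[rows] = current.where(current >= value, value)

    def __setitem__(self, key, value):
        rows, column = key
        self(column, value, rows)


class Flags:
    """Toy container: column name -> float pd.Series."""

    def __init__(self, cols):
        self._cols = {name: s.astype(float) for name, s in cols.items()}

    @property
    def update(self):
        return _Updater(self)

    def __getitem__(self, column):
        return self._cols[column]

    def __setitem__(self, key, value):
        if isinstance(key, tuple):
            # flags[rows, col] = value sets unconditionally.
            rows, column = key
            self._cols[column].loc[rows] = value
        else:
            # flags[col] = series inserts or replaces a whole column.
            self._cols[key] = pd.Series(value, dtype=float)


flags = Flags({"var1": pd.Series([0.0, 255.0, 25.0])})
flags.update[:, "var1"] = 25.0   # accessor style, only-if-worse
flags.update("var1", 100.0)      # method style, same semantics
```

Since the accessor object is also callable, picking one spelling over the other is purely a question of taste.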

Edited by Bert Palm