flatten the hierarchy
I just worked on the implementation of the new unified flagger (see !189), which also provides a new syntax like:
some random examples
# set some flags
flagger[:, 'var1'] = BAD
# get a series
series_v1 = flagger['var1']
# set a new series
flagger['new'] = pd.Series([1,2,3,4], dtype=float)
# slice to only bads
bad_flagger = flagger[flagger == BAD]
This will work just fine, and we will add more methods to handle all our (and the users') needs. So far so good.
Briefly, the things we need are:
- set flags only if worse
- get flags
- compare flags (like old isFlagged)
- set flags condition-less (former force)
- insert a new column
- delete (drop) a column
- replace/overwrite or rename a column
- etc.
All in all, we need some user-friendly stuff, like setting flags only if they are worse, but we also need some advanced stuff, like replacing whole columns. The latter is mainly for all the data-processing features we want to provide in saqc.
Currently we implement a core that abstracts the flagger, which itself abstracts a DictOfSeries. So I came to ask myself: why not rename the flagger to flags? Then the above example looks like this:
flagger becomes flags
# set some flags
flags[:, 'var1'] = BAD
# get a series
series_v1 = flags['var1']
# set a new series
flags['new'] = pd.Series([1,2,3,4], dtype=float)
# slice to only bads
bad_flags = flags[flags == BAD]
Wow, that looks nice :D (in my opinion). I know, I know, I already suggested this at some point and the general answer was something like "why not, looks nice". So far so good, again. But now the actual thing...
This flagger/flags thing is nothing other than a DictOfSeries, except for some checks, like whether everything is a valid unified flag, aka a float. The other thing that is missing is a simple, user-friendly method to set flags only if they are worse, and maybe one or two other helper methods.
If a user currently wants to destroy our flags (with the old BaseFlagger interface), it's totally possible; we just ask the user not to do so: please only use the functions that are safe (unless you know what you're doing...).
So I suggest we clean up DictOfSeries, provide one or two safe user functions, and we're done, instead of providing a whole new level of abstraction, which makes it even harder to get the whole picture. It is also quite a bunch of code, which in the end just wraps DictOfSeries.
If the user messed around with our frame in a self-registered function, we can also check after the actual function call in the core. So we don't lose much. And we won't mess it up ourselves if we write good tests ;)
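To make the "only set flags if they are worse" helper concrete, here is a minimal sketch on a plain pandas Series. The function name `set_if_worse`, the flag values, and the convention that a higher float means a worse flag are assumptions for illustration, not the actual saqc or DictOfSeries API.

```python
import pandas as pd

# Assumed flag values for illustration only; the real constants may differ.
UNFLAGGED = 0.0
BAD = 255.0


def set_if_worse(old: pd.Series, new) -> pd.Series:
    """Return a copy of `old` where a flag is replaced only by a worse (higher) one."""
    if not isinstance(new, pd.Series):
        # broadcast a scalar flag over the whole column
        new = pd.Series(new, index=old.index, dtype=float)
    new = new.reindex(old.index)
    # Series.where keeps `old` where the condition is True and takes `new` elsewhere.
    # A NaN in `new` never compares as worse, so the old flag stays in place there.
    return old.where(~(new > old), new)


flags_var1 = pd.Series([UNFLAGGED, BAD, 25.0])
flags_var1 = set_if_worse(flags_var1, 25.0)
```

Something this small could live as a helper next to DictOfSeries, which is the point of the suggestion: the "user-friendly" part is a few lines, not a whole layer.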
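A rough sketch of such a check in the core, using a plain dict of pandas Series as a stand-in for the flags. The names `check_flags` and `call_checked`, and the assumed signature `func(data, flags) -> (data, flags)` for user functions, are hypothetical.

```python
import pandas as pd


def check_flags(flags):
    """Raise if any flags column is not a float-typed pandas Series."""
    for name, column in flags.items():
        if not isinstance(column, pd.Series):
            raise TypeError(f"flags[{name!r}] is not a pandas Series")
        if not pd.api.types.is_float_dtype(column.dtype):
            raise TypeError(
                f"flags[{name!r}] has dtype {column.dtype}, expected float"
            )


def call_checked(func, data, flags):
    """Call a user function, then validate the flags it returned."""
    data, flags = func(data, flags)
    check_flags(flags)
    return data, flags
```

The check runs once per function call, so a self-registered function that breaks the flags is caught right after it returns, without wrapping every single access.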
i imagine something like this:
# a user helper
# set some flags if they are worse
flags.update[:, 'var1'] = BAD
# set some flags unconditionally (!)
flags[:, 'var1'] = BAD
# get a series
series_v1 = flags['var1']
# set a new series
flags['new'] = pd.Series([1,2,3,4], dtype=float)
# slice to only bads
bad_flags = flags[flags == BAD]
or, if you don't like accessors like update[], we could simply provide a method update()
..
What do you think?
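Both spellings can be backed by the same few lines. Below is a minimal sketch, assuming a hypothetical `Flags` class that wraps a plain dict of float Series (not the real DictOfSeries), an assumed `BAD` value, and the convention that higher means worse. `rows` is expected to be a slice or a boolean mask, and `value` a scalar.

```python
import pandas as pd

BAD = 255.0  # assumed value; higher means worse


class _Updater:
    """Sets flags only where the new value is worse; works as update[...] and update(...)."""

    def __init__(self, data):
        self._data = data

    def __call__(self, rows, column, value):
        new = self._data[column].copy()
        current = new.loc[rows]
        # keep the current flag where it is already at least as bad as `value`
        new.loc[rows] = current.where(current >= value, value)
        self._data[column] = new

    def __setitem__(self, key, value):
        rows, column = key
        self(rows, column, value)


class Flags:
    """Hypothetical thin wrapper around a dict of float Series."""

    def __init__(self, columns):
        self._data = {name: col.astype(float) for name, col in columns.items()}
        self.update = _Updater(self._data)

    def __getitem__(self, column):
        return self._data[column]

    def __setitem__(self, key, value):
        if isinstance(key, tuple):
            # unconditional set, mirroring flags[:, 'var1'] = BAD
            rows, column = key
            new = self._data[column].copy()
            new.loc[rows] = value
        else:
            # insert or overwrite a whole column
            column, new = key, value.astype(float)
        self._data[column] = new


flags = Flags({"var1": pd.Series([0.0, 255.0, 25.0])})
flags.update[:, "var1"] = 25.0           # accessor style, only if worse
flags.update(slice(None), "var1", 25.0)  # method style, same effect
flags[:, "var1"] = BAD                   # unconditional
```

The accessor reads closer to the plain `flags[...] = ...` syntax, while the method form is easier to find in documentation and autocompletion.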