Implementation of a programming interface
Ready for review @palmb and @luenensc .
A few things to get you started, as the diffs are probably not that helpful anymore. The changes concentrate on the following module/programs:
-
core/core.py
: in most parts a complete rewrite, theSaQC
class is implemented here. -
core/reader.py
: a complete rewrite but 'only' a refactoring. The functionality should be mostly identical, even though the option to have table like config files is gone. TBH this change should not be in there, but I never liked the actual reader implementation (or any of the preceding ones) and could not resist to make it 'better' (still don't know, if I really like this one, though) -
core/register.py
: the formerfuncs/register.py
. Quite some logic moved into the partial like objetcs defined here, e.g. the nan-masking (still the old, somewhat limited version without the changed sitting in !45 (closed) ) and some function argument checking. -
core/visitor.py
: the only thing left from the formercore.evaluator
module. The AST rewriting is gone, we now only walk the tree, check its validity and extract all relevant information into 'real' python objects (generic functions are 'compiled' to lambda functions, keyword arguments into dictionary items). -
funcs/functions.py
: The evaluation of generic functions moved into the method_execGeneric
-
funcs/harm_functions.py
: As the new config evaluation machinery now spits out real function objects, I changed quite a few places (manly function signatures and calls) to satisfy this new 'situation'. All tests passed but I guess @luenensc should have a thorough look into these changes.
A few shortcomings I'd like to mention:
- The plotting is not yet integrated into the API. It is available through the config in pretty much the same manner as before, put the
SaQC
object does not provide a respective method. This should change soon, but I wasn't quite sure how to integrate the existing plotting routines. - The
SaQC
class still has sort of static behavior:data
andflags
are passed as constructor arguments and only settable through direct attribute access. I think there should be the possibility to passdata
andflags
togetResult
(at least optionally). That way we could define anSaQC
object be declaring a bunch of tests and throw different data on it. I have a pretty good idea on how to implement that, but it requires some changes. What is your opinion on that? Should we treatdata
andflags
as:- constructor only arguments (ready to merge)
- optional arguments to constructor and
getResult
(ready to merge even though not complete yet) - arguments to only
getResult
(not ready to merge yet)
- I don't like the method name
SaQC.getResult
. Any suggestions are welcome... - The generic functions are not quite as useful when called through the API. They expect a function object as an argument to
func
(what I consider reasonable), but the parameters of this function are tightly coupled to the dataset, which is less ideal. To give an example:saqc.flagGeneric(field="precipitation", func=lambda temperature: temperature < 0)
"temperature"
, so the name of a dataset column morphed into the name of a function parameter. IDK, I have no idea how to make this more generic, but if you do, let me know...
General remarks:
After the main idea was settled, most of the changes really fell into place. I don't want to say, that there were no complications along the line, but it was mostly straight forward to get the general concepts into code. That is why I am pretty confident, that, once we get the initial set of bugs popping up during the first weeks of usage out, the proposed changes will improve saqc
in the long run. We didn't loose any features besides the table-like config files (and that fact has nothing to do with the structural changes) and got quite a few possibilities to increase performance and usability of the system. But maybe I am just a bit biased here...