Implementation of a programming interface (!38) · Merge requests · rdm-software / SaQC

David Schäfer requested to merge api into develop Apr 22, 2020

Ready for review @palmb and @luenensc .

A few things to get you started, as the diffs are probably not that helpful anymore. The changes concentrate on the following module/programs:

core/core.py: in most parts a complete rewrite, the SaQC class is implemented here.
core/reader.py: a complete rewrite but 'only' a refactoring. The functionality should be mostly identical, even though the option to have table like config files is gone. TBH this change should not be in there, but I never liked the actual reader implementation (or any of the preceding ones) and could not resist to make it 'better' (still don't know, if I really like this one, though)
core/register.py: the former funcs/register.py. Quite some logic moved into the partial like objetcs defined here, e.g. the nan-masking (still the old, somewhat limited version without the changed sitting in !45 (closed) ) and some function argument checking.
core/visitor.py: the only thing left from the former core.evaluator module. The AST rewriting is gone, we now only walk the tree, check its validity and extract all relevant information into 'real' python objects (generic functions are 'compiled' to lambda functions, keyword arguments into dictionary items).
funcs/functions.py: The evaluation of generic functions moved into the method _execGeneric
funcs/harm_functions.py: As the new config evaluation machinery now spits out real function objects, I changed quite a few places (manly function signatures and calls) to satisfy this new 'situation'. All tests passed but I guess @luenensc should have a thorough look into these changes.

A few shortcomings I'd like to mention:

The plotting is not yet integrated into the API. It is available through the config in pretty much the same manner as before, put the SaQC object does not provide a respective method. This should change soon, but I wasn't quite sure how to integrate the existing plotting routines.
The SaQC class still has sort of static behavior: data and flags are passed as constructor arguments and only settable through direct attribute access. I think there should be the possibility to pass data and flags to getResult (at least optionally). That way we could define an SaQC object be declaring a bunch of tests and throw different data on it. I have a pretty good idea on how to implement that, but it requires some changes. What is your opinion on that? Should we treat data and flags as:
- constructor only arguments (ready to merge)
- optional arguments to constructor and getResult (ready to merge even though not complete yet)
- arguments to only getResult (not ready to merge yet)
I don't like the method name SaQC.getResult. Any suggestions are welcome...
The generic functions are not quite as useful when called through the API. They expect a function object as an argument to func (what I consider reasonable), but the parameters of this function are tightly coupled to the dataset, which is less ideal. To give an example:
```
saqc.flagGeneric(field="precipitation", func=lambda temperature: temperature < 0) 
```
To make this work, our dataset needs to have a column called "temperature", so the name of a dataset column morphed into the name of a function parameter. IDK, I have no idea how to make this more generic, but if you do, let me know...

General remarks: After the main idea was settled, most of the changes really fell into place. I don't want to say, that there were no complications along the line, but it was mostly straight forward to get the general concepts into code. That is why I am pretty confident, that, once we get the initial set of bugs popping up during the first weeks of usage out, the proposed changes will improve saqc in the long run. We didn't loose any features besides the table-like config files (and that fact has nothing to do with the structural changes) and got quite a few possibilities to increase performance and usability of the system. But maybe I am just a bit biased here...

Edited May 18, 2020 by David Schäfer

Implementation of a programming interface

Merge request reports