Commit f91ac835, authored 5 years ago by David Schäfer

updated README

Parent: 988f7fe2
README.md: 44 additions, 0 deletions
@@ -81,3 +81,47 @@ For example:
Let `var1` and `var2` be two variables of a given dataset and `func: var1 > mean(var1)` the condition whether to flag `var2`. The result of the check can be described as `isflagged(var1) & istrue(func())`.
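
As a rough illustration of the example above, the following pandas sketch evaluates the condition and combines it with a flag mask. The toy data and the boolean `flagged_var1` mask (a stand-in for `isflagged(var1)`) are assumptions made for illustration only; this is not how SaQC itself implements `isflagged` or `istrue`.

```python
import pandas as pd

# Hypothetical toy dataset with two variables.
data = pd.DataFrame(
    {"var1": [1.0, 5.0, 2.0, 8.0], "var2": [10.0, 20.0, 30.0, 40.0]}
)

# Assumed stand-in for isflagged(var1): True where var1 already carries a flag.
flagged_var1 = pd.Series([True, True, False, False], index=data.index)

# func: var1 > mean(var1)
condition = data["var1"] > data["var1"].mean()

# isflagged(var1) & istrue(func()): rows of var2 that would be flagged.
result = flagged_var1 & condition
```

Only rows where both masks are true end up selected, so an unflagged `var1` value never triggers a flag on `var2`, even if it exceeds the mean.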
## Contributing
### Testing
Please run the tests before you commit:
```sh
python -m pytest test
```
This can save us a lot of time.
### New QC-Algorithms
Currently all test algorithms are collected within the module `funcs.functions`.
In order to make your test available to the system you need to:
- Place your code into the file `funcs/functions.py`
- Register your function by adding it to the dictionary `func_map` within the function body of `funcs.functions.flagDispatch`. Your function will be available to the system by its key.
- Implement the common interface:
    + Function input:
      Your function needs to accept the following arguments:
        + `data: pd.DataFrame`: A dataframe holding the entire dataset (i.e. not only the variable the current test is performed on)
        + `flags: pd.DataFrame`: A dataframe holding the flags for the entire dataset
        + `field: String`: The name of the variable the current test is performed on. The data and flags for this variable are available via `data[field]` and `flags[field]` respectively
        + `flagger: flagger.BaseFlagger`: An instance of the `BaseFlagger` class (more likely one of its subclasses). To initialize, create or check against existing flags you should use the respective `flagger` methods (`flagger.emptyFlags`, `flagger.isFlagged` and `flagger.setFlag`)
        + `**kwargs: Any`: All the parameters given in the configuration file are passed to your function; you are of course free to make some of them required by the signature. `kwargs` should be passed on to the `flagger.setFlag` method, in order to allow configuration-based fine-tuning of the flagging
    + Function output:
      Your function needs to return two objects (DataFrame or ndarray), data and flags. As the names suggest, the first holds the data, the second the possibly modified flags
    + Note: The chosen interface allows you to manipulate not only the flags, but also the data of the entire dataset within your function body. This freedom might come in handy, but also requires a certain amount of care to not mess things up!
    + Example: The function `flagRange` in `funcs/functions.py` may serve as a simple example of the general scheme
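
The interface and registration steps above can be sketched as follows. This is a minimal, self-contained illustration and not code from the repository: `SketchFlagger`, its method signatures, the test function `flagAboveThreshold`, the key `"above_threshold"` and the `threshold` parameter are all hypothetical, and the real `BaseFlagger` methods and `flagDispatch` may well look different.

```python
import pandas as pd


class SketchFlagger:
    """Hypothetical stand-in for a BaseFlagger subclass (not SaQC's real class)."""

    def __init__(self, good=0, bad=1):
        self.good = good
        self.bad = bad

    def isFlagged(self, flags):
        # True wherever a flag other than `good` is set.
        return flags != self.good

    def setFlag(self, flags, mask, **kwargs):
        # Assumed signature: mark the masked rows with `flag` (default: bad).
        out = flags.copy()
        out[mask] = kwargs.get("flag", self.bad)
        return out


def flagAboveThreshold(data, flags, field, flagger, threshold, **kwargs):
    """Hypothetical test: flag all values of `field` greater than `threshold`."""
    mask = data[field] > threshold
    flags = flags.copy()
    # Pass kwargs on so configuration parameters can fine-tune the flagging.
    flags[field] = flagger.setFlag(flags[field], mask, **kwargs)
    # Return both data and (possibly modified) flags.
    return data, flags


# Registration by key, mirroring the `func_map` idea described above.
func_map = {"above_threshold": flagAboveThreshold}

data = pd.DataFrame({"var1": [1.0, 5.0, 2.0, 8.0]})
flagger = SketchFlagger()
flags = pd.DataFrame(flagger.good, index=data.index, columns=data.columns)

data, flags = func_map["above_threshold"](
    data, flags, "var1", flagger, threshold=4.0
)
```

Here `threshold` is made a required parameter by the signature, while any remaining configuration parameters travel through `**kwargs` to `setFlag`, as the interface description suggests.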