Commit e4f17056 authored 5 years ago by Peter Lünenschloß
Update FunctionDescriptions.md (spikes_spectrumBased)
parent c111c372
docs/FunctionDescriptions.md: 53 additions, 2 deletions
...
...
@@ -15,8 +15,8 @@ missing(nodata=NaN)
```
### Description
The function flags those values in the passed data series that are
-associated with "missing" data. The missing value indicator (np.nan by default), can be altered to any other value by passing this new value to the
+associated with "missing" data. The missing data indicator (`np.nan` by default), can be altered to any other value by passing this new value to the
parameter `nodata`.
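
As a rough illustration of the `nodata` behaviour described above, the following is a minimal standard-library sketch and not the SaQC implementation. The name `flag_missing` and the list-of-booleans return type are assumptions made for this example only.

```python
import math


def flag_missing(values, nodata=math.nan):
    """Return one boolean per value: True where the value is the nodata indicator.

    Hypothetical helper for illustration; not part of SaQC.
    """
    def is_nodata(v):
        # NaN never compares equal to itself, so it needs a dedicated check.
        if isinstance(nodata, float) and math.isnan(nodata):
            return isinstance(v, float) and math.isnan(v)
        return v == nodata

    return [is_nodata(v) for v in values]


# Default indicator (NaN) versus a custom indicator passed via `nodata`.
flags_default = flag_missing([1.0, math.nan, 3.0])
flags_custom = flag_missing([1.0, -9999, 3.0], nodata=-9999)
```

With the default, only NaN entries are flagged; passing `nodata=-9999` switches the test to plain equality against that value.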
## sesonalRange
...
...
@@ -64,6 +64,29 @@ mad(length, z=3.5, freq=None)
Spikes_Basic(thresh=7, tol=0, length="15min")
```
### Description
A basic outlier test that is designed to work for harmonized as well as raw
(not-harmonized) data.

The values x(n), x(n+1), ..., x(n+k) of a passed timeseries x are considered
spikes if:

1. |x(n-1) - x(n+s)| > `thresh`, for all integers s in {0,1,2,...,k}
2. |x(n-1) - x(n+k+1)| < `tol`
3. |x(n-1).index - x(n+k+1).index| < `length`

By this definition, spikes are values that, after a jump of margin `thresh` (1),
keep the new value level they jumped to for a timespan shorter than `length` (3),
and then return to the initial value level, within a tolerance margin of `tol` (2).

Note that this characterization of a "spike" includes not only one-value
outliers but also plateau-like value courses.

The implementation is a time-window based version of an outlier test from the
UFZ Python library, which can be found here:
https://git.ufz.de/chs/python/blob/master/ufz/level1/spike.py
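
The three conditions can be sketched in plain Python as follows. This is an illustrative reading of the definition above, not the SaQC or UFZ code: the function name `flag_spikes_basic`, the use of `datetime`/`timedelta` for the index, and the choice to skip past a flagged run (so a flagged value is never used as the reference x(n-1)) are all assumptions. The inequalities are implemented strictly, as written, and the sketch assumes `tol <= thresh`.

```python
from datetime import datetime, timedelta


def flag_spikes_basic(times, values, thresh, tol, length):
    """Flag runs x(n..n+k) that jump by more than `thresh` and return within `tol`.

    Hypothetical helper for illustration; assumes tol <= thresh.
    """
    flags = [False] * len(values)
    n = 1
    while n < len(values):
        ref = values[n - 1]
        # Condition (1): extend the run while values stay more than `thresh` from x(n-1).
        end = n
        while end < len(values) and abs(ref - values[end]) > thresh:
            end += 1
        # `end` now indexes x(n+k+1), the first value after the run, if it exists.
        has_run_and_return = n < end < len(values)
        if (has_run_and_return
                and abs(ref - values[end]) < tol            # condition (2)
                and times[end] - times[n - 1] < length):    # condition (3)
            for i in range(n, end):
                flags[i] = True
            n = end + 1  # continue after the value the series returned to
        else:
            n += 1
    return flags


start = datetime(2020, 1, 1)
times = [start + timedelta(minutes=5 * i) for i in range(6)]
values = [1.0, 1.1, 9.0, 9.2, 1.0, 1.1]

# The two-value plateau spans 15 minutes from x(n-1) to x(n+k+1).
flags_wide = flag_spikes_basic(times, values, thresh=7, tol=0.5,
                               length=timedelta(minutes=20))
flags_narrow = flag_spikes_basic(times, values, thresh=7, tol=0.5,
                                 length=timedelta(minutes=10))
```

With a `length` of 20 minutes the plateau at positions 2 and 3 is flagged; with 10 minutes it is too long to count as a spike and nothing is flagged.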
## Spikes_SpektrumBased
...
...
@@ -75,6 +98,34 @@ Spikes_SpektrumBased(filter_window_size="3h", raise_factor=0.15, dev_cont_factor
```
### Description
The function detects and flags spikes in input data series by evaluating the
timeseries' derivatives and applying some conditions to them.

NOTE that the data series to be flagged is expected to be harmonized to an
equidistant frequency grid.

A datapoint x(k) of a data series x is considered a spike if:

1. The quotient to its preceding datapoint exceeds a certain bound:
    * x(k)/x(k-1) > 1 + `raise_factor`, or:
    * x(k)/x(k-1) < 1 - `raise_factor`
2. The quotient of the data's second derivative x'' at the preceding
   and subsequent timestamps is close enough to 1:
    * (1 - `dev_cont_factor`) < |x''(k-1)/x''(k+1)|, and
    * (1 + `dev_cont_factor`) > |x''(k-1)/x''(k+1)|
3. The dataset surrounding x(k) within `noise_window_size` range, but excluding
   x(k), is not too noisy, where the noisiness is measured by `noise_statistic`:
    * `noise_statistic`(x.index(k - `noise_window_size`), ..., x.index(k + `noise_window_size`)) < `noise_barrier`

This function is a generalization of the spectrum-based spike flagging
mechanism as presented in:

Dorigo,W,.... Global Automated Quality Control of In Situ Soil Moisture
Data from the International Soil Moisture Network. 2013. Vadose Zone J.
doi:10.2136/vzj2012.0097.
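
The three conditions above can be sketched with the standard library alone. This is a simplified illustration, not the SaQC implementation: it skips any smoothing step, approximates x'' by a central finite difference, takes the noise window as a number of samples (`noise_window`) instead of a time offset, and uses the coefficient of variation as a stand-in noise statistic. All function names and these choices are assumptions made for the example.

```python
import statistics


def coefficient_of_variation(window):
    """Hypothetical noise statistic: population std divided by |mean|."""
    mean = statistics.fmean(window)
    return statistics.pstdev(window) / abs(mean) if mean != 0 else float("inf")


def second_derivative(x, k):
    """Central finite difference; on an equidistant grid the spacing cancels in ratios."""
    return x[k + 1] - 2 * x[k] + x[k - 1]


def flag_spikes_spectrum(x, raise_factor, dev_cont_factor, noise_window,
                         noise_barrier, noise_statistic=coefficient_of_variation):
    """Flag x[k] where all three conditions hold; assumes noise_window >= 1."""
    flags = [False] * len(x)
    for k in range(2, len(x) - 2):
        if x[k - 1] == 0:
            continue  # quotient undefined
        # Condition 1: relative jump against the preceding datapoint.
        quotient = x[k] / x[k - 1]
        if not (quotient > 1 + raise_factor or quotient < 1 - raise_factor):
            continue
        # Condition 2: second derivatives before and after are of similar size.
        d2_next = second_derivative(x, k + 1)
        if d2_next == 0:
            continue  # ratio undefined
        d2_ratio = abs(second_derivative(x, k - 1) / d2_next)
        if not (1 - dev_cont_factor < d2_ratio < 1 + dev_cont_factor):
            continue
        # Condition 3: the surrounding values, excluding x[k], are not too noisy.
        lo, hi = max(0, k - noise_window), min(len(x), k + noise_window + 1)
        surrounding = x[lo:k] + x[k + 1:hi]
        if noise_statistic(surrounding) < noise_barrier:
            flags[k] = True
    return flags


series = [10.0, 10.0, 10.0, 10.0, 15.0, 10.0, 10.0, 10.0, 10.0]
spike_flags = flag_spikes_spectrum(series, raise_factor=0.15, dev_cont_factor=0.2,
                                   noise_window=2, noise_barrier=1)
```

In this example only the value 15.0 is flagged. The point after it also satisfies condition 1 (it drops by a third), but its second-derivative ratio is undefined, so it is left unflagged.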
## constant
### Signature
...
...