rdm-software / SaQC / Commits / ec941ef5
Commit ec941ef5, authored 5 years ago by David Schäfer
Update FunctionDescriptions.md
parent 87cae1d8
docs/FunctionDescriptions.md: 16 additions, 16 deletions
...
...
@@ -50,7 +50,7 @@ range(min, max)
 | parameter | data type | default value | description |
 | --------- | --------- | ------------- | ----------- |
 | min | float | | Upper bound for valid values. ($`<`$) |
-| max | float | | lower bound for valid values. ($`\geq`$)|
+| max | float | | lower bound for valid values. ($`>`$)|
 The function flags all the values, that exceed the closed interval $`[`$`min`, `max`$`]`$.
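Note on this hunk: after the change, the table and the prose agree that values below `min` or above `max` are flagged, while the bounds themselves stay valid. A minimal sketch of that rule with a pandas boolean mask follows. It is an illustration only, not SaQC's implementation; the function name `flag_range` and the parameter names `lower` and `upper` (standing in for `min` and `max`) are assumptions.

```python
import pandas as pd


def flag_range(values: pd.Series, lower: float, upper: float) -> pd.Series:
    """Return a boolean mask that is True where a value lies outside [lower, upper]."""
    # The interval is closed: values equal to a bound are not flagged.
    # NaN compares False on both sides and is therefore left unflagged here.
    return (values < lower) | (values > upper)


data = pd.Series([-5.0, 0.0, 3.2, 10.0, 10.1])
flags = flag_range(data, lower=0.0, upper=10.0)
```

Only the first and the last value of `data` fall outside the interval in this example.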
...
...
@@ -64,14 +64,14 @@ sesonalRange(min, max, startmonth=1, endmonth=12, startday=1, endday=31)
 | parameter | data type | default value | description |
 | --------- | ----------- | ---- | ----------- |
 | min | float | | Upper bound for valid values. ($`<`$) |
-| max | float | | lower bound for valid values. ($`\geq`$) |
+| max | float | | lower bound for valid values. ($`>`$) |
 | startmonth | integer | `1` | interval start month |
 | endmonth | integer | `12` | interval end month |
 | startday | integer | `1` | interval start day |
 | endday | integer | `31` | interval end day |
-The function do the same as `range` do (flags all data, that exceed the interval $`[`$`min`, `max`$`)`$),
+The function does the same as `range` (flags all data, that exceed the interval $`[`$`min`, `max`$`]`$),
 but only, if the timestamp of the data-point lies in a time interval defined by day and month only.
 The year is **not** used by the interval calculation.
 The left interval boundary is defined by `startmonth` and `startday`, the right by `endmonth` and `endday`.
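Note on this hunk: the seasonal variant applies the same closed-interval check, but only to timestamps whose month and day fall inside the configured season, regardless of the year. The sketch below illustrates that idea with pandas. It is not SaQC's code: the name `flag_seasonal_range`, the parameter names `lower` and `upper`, the inclusive season boundaries, and the restriction to seasons that do not wrap around the turn of the year are all assumptions.

```python
import pandas as pd


def flag_seasonal_range(values, lower, upper,
                        startmonth=1, endmonth=12, startday=1, endday=31):
    """Flag values outside [lower, upper], but only inside the given season."""
    # Encode every timestamp as month * 100 + day, so the year plays no role.
    month_day = values.index.month.to_numpy() * 100 + values.index.day.to_numpy()
    start = startmonth * 100 + startday
    end = endmonth * 100 + endday
    # Assumption: the season does not wrap around the turn of the year.
    in_season = (month_day >= start) & (month_day <= end)
    out_of_range = (values < lower) | (values > upper)
    return out_of_range & in_season


index = pd.to_datetime(["2019-05-31", "2019-06-15", "2020-07-01", "2020-09-01"])
data = pd.Series([50.0, 50.0, 5.0, 50.0], index=index)
flags = flag_seasonal_range(data, lower=0.0, upper=10.0, startmonth=6, endmonth=8)
```

In this example only the value on 2019-06-15 is flagged: it lies outside the range and inside the June to August season.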
...
...
@@ -107,7 +107,7 @@ of isolated values.
 A continuous group of values
 $`x_{k}, x_{k+1},...,x_{k+n}`$ of the timeseries of meassurements $`x`$,
-is considered "isolated", if:
+is considered to be "isolated", if:
 1. There are no values, preceeding $`x_{k}`$ within `isolation_range` or all the
 preceeding values within this range are flagged with a flag listed in
...
...
@@ -128,12 +128,12 @@ missing(nodata=NaN)
 | parameter | data type | default value | description |
 | --------- | ---------- | -------------- | ----------- |
-| nodata | any | `NaN` | Value indicating missing values in the passed data. |
+| nodata | any | `NAN` | Value indicating missing values in the passed data. |
 The function flags those values in the the passed data series, that are
-associated with "missing" data. The missing data indicator (default: `NaN`), can
-be altered to any other value by passing this new value to the parameter `nodata`.
+associated with "missing" data. The missing data indicator (default: `NAN`), can
+be altered to any other value by passing this value to the parameter `nodata`.
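Note on this hunk: a missing-value check has to treat a NaN indicator differently from any other indicator, because NaN never compares equal to itself. The sketch below shows one way to do this with pandas. It is an illustration under assumptions, not SaQC's implementation; the function name `flag_missing` is hypothetical.

```python
import numpy as np
import pandas as pd


def flag_missing(values: pd.Series, nodata=np.nan) -> pd.Series:
    """Return a boolean mask that is True where a value equals the missing-data indicator."""
    # NaN != NaN, so an equality test would never match the default indicator.
    if pd.isna(nodata):
        return values.isna()
    return values == nodata


data = pd.Series([1.0, np.nan, -9999.0])
default_flags = flag_missing(data)                # NaN marks missing data
custom_flags = flag_missing(data, nodata=-9999.0) # -9999.0 marks missing data
```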
### clear
...
...
@@ -204,14 +204,14 @@ spikes_simpleMad(winsz="1h", z=3.5)
 The *modified Z-score* [1] is used to detect outlier.
-All values are flagged as outlier, if in any slice of thw sliding window, a value fulfill:
+All values are flagged as outlier, if in any slice of the sliding window, a value fulfills:
 ```math
 0.6745 * |x - M| > mad * z > 0
 ```
-with $`x, M, mad, z`$: window data, window median, window median absolute deviation, `z`.
-The window is continued by one frequency step.
+with $`x, M, mad, z`$: window data, window median, window median absolute deviation, `z`.
+The window is moved by one frequency step.
-Note: This function should only applied on normalised data.
+Note: This function should only be applied on normalized data.
 See also:
 [1] https://www.itl.nist.gov/div898/handbook/eda/section3/eda35h.htm
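Note on this hunk: the criterion `0.6745 * |x - M| > mad * z > 0` can be checked window by window. The sketch below is a plain loop over a datetime-indexed pandas series and is not SaQC's implementation. Several details are assumptions: the name `flag_simple_mad`, starting one window at every timestamp, window bounds that are inclusive on both sides, and shorter windows at the end of the series.

```python
import pandas as pd


def flag_simple_mad(values: pd.Series, winsz="1h", z=3.5) -> pd.Series:
    """Flag values that satisfy the modified Z-score criterion in any window."""
    flags = pd.Series(False, index=values.index)
    width = pd.Timedelta(winsz)
    # Assumption: one window starts at every timestamp of the series.
    for start in values.index:
        window = values.loc[start:start + width]
        median = window.median()
        deviation = (window - median).abs()
        mad = deviation.median()
        # 0.6745 * |x - M| > mad * z > 0
        hits = (0.6745 * deviation > mad * z) & (mad * z > 0)
        flags.loc[hits[hits].index] = True
    return flags


index = pd.date_range("2020-01-01", periods=12, freq="10min")
data = pd.Series([1.0, 1.1, 0.9, 1.0, 10.0, 1.1, 0.9, 1.0, 1.1, 0.9, 1.0, 1.0],
                 index=index)
flags = flag_simple_mad(data, winsz="1h", z=3.5)
```

In this example only the spike of 10.0 is flagged. Windows whose median absolute deviation is zero never flag anything, which is what the `> 0` part of the criterion expresses.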
...
...
@@ -235,15 +235,15 @@ spikes_slidingZscore(winsz="1h", dx="1h", count=1, deg=1, z=3.5, method="modZ")
 Parameter notes:
 - `winsz` and `dx` must be of same type, mixing of offset and integer is not supported and will fail.
-- if offset-strings only work with datetime indexed data
+- offset-strings only work with datetime indexed data
 The algorithm works as follows:
 1. a window of size `winsz` is cut from the data
 2. normalisation - (the data is fit by a polynomial of the given degree `deg`, which is subtracted from the data)
 3. the outlier detection `method` is applied on the residual, and possible outlier are marked
-4. the window (on the data) is continued by `dx` to the next data-slot
+4. the window (on the data) is moved by `dx`
 5. start over from 1. until the end of data is reached
-6. all potential outlier, that are detected `count`-many times, are flagged as outlier
+6. all potential outliers, that are detected `count`-many times, are flagged as outlier
 The possible outlier detection methods are *zscore* and *modZ*.
 In the following description, the residual (calculated from a slice by the sliding window) is referred as *data*.
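Note on this hunk: the six steps listed above can be sketched with NumPy as follows. This is an illustration under several assumptions, not SaQC's implementation: `winsz` and `dx` are integers counting samples, only the Z-score criterion `|r - m| > s * z` is used (not the default `modZ`), "detected `count`-many times" is read as "at least `count` times", and an incomplete window at the end of the data is skipped. The function name `flag_sliding_zscore` is hypothetical.

```python
import numpy as np


def flag_sliding_zscore(values, winsz, dx, count=1, deg=1, z=3.5):
    """Flag values marked as possible outliers in at least `count` windows."""
    values = np.asarray(values, dtype=float)
    hits = np.zeros(len(values), dtype=int)
    x = np.arange(winsz)
    # Steps 1, 4 and 5: cut a window, move it by dx, repeat until the data ends.
    for start in range(0, len(values) - winsz + 1, dx):
        window = values[start:start + winsz]
        # Step 2: subtract a fitted polynomial of degree `deg` from the window.
        trend = np.polyval(np.polyfit(x, window, deg), x)
        residual = window - trend
        # Step 3: Z-score criterion |r - m| > s * z on the residual.
        candidates = np.abs(residual - residual.mean()) > residual.std() * z
        hits[start:start + winsz] += candidates
    # Step 6: keep only values that were marked often enough.
    return hits >= count


t = np.arange(40)
data = 0.5 * t + 0.1 * np.sin(t)  # linear trend with a small oscillation
data[20] += 10.0                  # one artificial spike
flags = flag_sliding_zscore(data, winsz=20, dx=5, count=2, deg=1, z=3.5)
```

The spike at position 20 lies in four of the five windows and is marked in each of them, so it is the only value flagged with `count=2`.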
...
...
@@ -252,7 +252,7 @@ The **zscore** (Z-score) [1] mark every value as possible outlier, which fulfill
 ```math
 |r - m| > s * z
 ```
-with $`r, m, s, z`$: data, data mean, data standard deviation, `z`.
+with $`r, m, s, z`$: data, data mean, data standard deviation, `z`.
 The **modZ** (modified Z-score) [1] mark every value as possible outlier, which fulfill:
 ```math
...
...