From 3cf8dda6ead653ae9de66fcb119727d027e81844 Mon Sep 17 00:00:00 2001
From: Peter Luenenschloss <peter.luenenschloss@ufz.de>
Date: Wed, 27 Nov 2019 08:52:36 +0100
Subject: [PATCH] Update FunctionDescriptions.md (formula typesetting,
 parameter tabs)

---
 docs/FunctionDescriptions.md | 66 +++++++++++++++++++++++-------------
 1 file changed, 43 insertions(+), 23 deletions(-)
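
Note for reviewers: the three Spikes_Basic conditions rewritten in this patch can be read as the following minimal sketch. It is not the library implementation. The function name `basic_spikes`, the list-based inputs, and the use of a `timedelta` in place of the offset string are assumptions made only for illustration.

```python
from datetime import datetime, timedelta


def basic_spikes(times, values, thresh, tolerance, window_size):
    """Return the set of indices flagged as spikes (illustrative sketch only).

    times       -- sorted list of datetime objects (the series y)
    values      -- list of floats of the same length (the series x)
    window_size -- timedelta; stands in for the offset string of the docs
    """
    flagged = set()
    last = len(values) - 1
    for start in range(1, last):  # start = n, so index n-1 holds the reference
        ref = values[start - 1]
        end = start  # end = n+k
        # Condition (1): every candidate value jumped away from the reference.
        while end < last and abs(ref - values[end]) > thresh:
            back = end + 1  # index n+k+1, the potential return value
            # Condition (3): the whole excursion is shorter than the window.
            if times[back] - times[start - 1] >= window_size:
                break
            # Condition (2): the series returned to the initial value level.
            if abs(ref - values[back]) < tolerance:
                flagged.update(range(start, back))
                break
            end += 1
    return flagged


# Hypothetical data: a two-value plateau in a 10-minute series.
start_time = datetime(2019, 11, 27, 8, 0)
times = [start_time + timedelta(minutes=10 * i) for i in range(6)]
values = [1.0, 1.1, 9.0, 9.2, 1.0, 1.1]
print(sorted(basic_spikes(times, values, 5.0, 0.5, timedelta(hours=1))))  # [2, 3]
```

With a window shorter than the plateau (for example 25 minutes), condition (3) fails and nothing is flagged, which matches the "timespan smaller than `window_size`" wording in the description.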

diff --git a/docs/FunctionDescriptions.md b/docs/FunctionDescriptions.md
index 0bff41158..9aa360503 100644
--- a/docs/FunctionDescriptions.md
+++ b/docs/FunctionDescriptions.md
@@ -19,6 +19,11 @@ associated with "missing" data. The missing data indicator (`np.nan` by default)
 , can be altered to any other value by passing this new value to the 
 parameter `nodata`.
 
+| parameter | description |
+| ------ | ------ |
+| nodata | Value. (Default = np.nan). Any value that indicates missing data in the passed dataseries. |
+           
+
 ## sesonalRange
 ### Signature
 ```
@@ -61,33 +66,38 @@ mad(length, z=3.5, freq=None)
 ## Spikes_Basic
 ### Signature
 ```
-Spikes_Basic(thresh=7, tol=0, length="15min")
+Spikes_Basic(thresh, tolerance, window_size)
 ```
 ### Description
 A basic outlier test, that is designed to work for harmonized, as well as raw 
 (not-harmonized) data.
 
-The values x(n), x(n+1), .... , x(n+k) of a passed timeseries x, are considered
-spikes, if:
+The values $`x_{n}, x_{n+1}, \ldots, x_{n+k}`$ of a passed timeseries $`x`$ 
+are considered spikes if:
 
-1. |x(n-1) - x(n + s)| > `thresh`, for all integers s in {0,1,2,...,k}
+1. $`|x_{n-1} - x_{n+s}| > `$ `thresh`, for all $` s \in \{0,1,2,\ldots,k\} `$
 
-2. |x(n-1) - x(n+k+1)| < `tol`
+2. $`|x_{n-1} - x_{n+k+1}| < `$ `tolerance`
 
-3. |x(n-1).index - x(n+k+1).index| < `length`
+3. $` |y_{n-1} - y_{n+k+1}| < `$ `window_size`, with $`y`$ denoting the series 
+   of timestamps associated with $`x`$.
 
 By this definition, spikes are values, that, after a jump of margin `thresh`(1), 
 are keeping that new value level they jumped to, for a timespan smaller than 
-`length` (3), and do then return to the initial value level - 
-within a tolerance margin of `tol` (2).  
+`window_size` (3), and do then return to the initial value level - 
+within a tolerance margin of `tolerance` (2).  
+
 Note, that this characterization of a "spike", not only includes one-value 
 outliers, but also plateau-ish value courses.
 
 The implementation is a time-window based version of an outlier test from the 
-UFZ Python library, that can be found here:
-
-https://git.ufz.de/chs/python/blob/master/ufz/level1/spike.py
+UFZ Python library, that can be found [here](https://git.ufz.de/chs/python/blob/master/ufz/level1/spike.py).
 
+| parameter | description |
+| ------ | ------ |
+| thresh | Float. <br/> Minimum jump margin for spikes. See condition (1). |
+| tolerance | Float. <br/> Range of the area containing all "valid return values". See condition (2). |
+| window_size | Offset String. <br/> An offset string, denoting the maximal length of "spikish" value courses. See condition (3). |
 
 ## Spikes_SpektrumBased
 ### Signature
@@ -99,25 +109,25 @@ Spikes_SpektrumBased(filter_window_size="3h", raise_factor=0.15, dev_cont_factor
 ### Description
 
 The function detects and flags spikes in input data series by evaluating the 
-the timeseries' derivatives and applying some conditions to it. 
+timeseries' derivatives and applying some conditions to them. 
 
 NOTE, that the dataseries-to-be flagged is supposed to be harmonized to an 
 equadistant frequencie grid.
 
-A datapoint x(k) of a dataseries x, is considered a spike, if:
+A datapoint $`x_k `$ of a dataseries $`x`$, 
+is considered a spike, if:
 
 1. The quotient to its preceeding datapoint exceeds a certain bound:
-    * x(k)/x(k-1) > 1 + `raise_factor`, or:
-    * x(k)/x(k-1) < 1 - `raise_factor`
-2. The quotient of the datas second derivate x'', at the preceeding 
+    * $`|\frac{x_k}{x_{k-1}}| > 1 +`$ `raise_factor`, or:
+    * $`|\frac{x_k}{x_{k-1}}| < 1 -`$ `raise_factor`
+2. The quotient of the data's second derivative $`x''`$ at the preceding 
    and subsequent timestamps is close enough to 1:
-    * (1 - `dev_cont_factor`) < | x''(k-1)/x''(k+1) |, and
-    * (1 + `dev_cont_factor`) > | x''(k-1)/x''(k+1) |   
-3. The dataset, surrounding x(k), within `noise_window_size` range, but excluding 
-   x(k), is not too noisy. Wheras the noisyness gets measured by 
-   `noise_statistic`: 
-    * 'noise_statistic'(x.index(k-'noise_window_size'),...,
-      x.index(k+'noise_window') < `noise_barrier`
+    * $`|\frac{x''_{k-1}}{x''_{k+1}} | > 1 -`$ `dev_cont_factor`, and
+    * $`|\frac{x''_{k-1}}{x''_{k+1}} | < 1 +`$ `dev_cont_factor`   
+3. The dataset $`X_k`$, surrounding $`x_{k}`$ within `noise_window_size` range 
+   but excluding $`x_{k}`$, is not too noisy, where the noisiness is measured 
+   by `noise_statistic`: 
+    * `noise_statistic`$`(X_k) <`$ `noise_barrier`
 
 
 This Function is a generalization of the Spectrum based Spike flagging 
@@ -127,6 +137,16 @@ Dorigo,W,.... Global Automated Quality Control of In Situ Soil Moisture
 Data from the international Soil Moisture Network. 2013. Vadoze Zone J. 
 doi:10.2136/vzj2012.0097.
 
+All parameters default to the values given there.
+
+| parameter | description |
+| ------ | ------ |
+| raise_factor | Float. (Default=0.15). <br/> Minimum change margin for a datapoint to become a candidate for a spike. See condition (1). |
+| dev_cont_factor | Float. (Default=0.2). <br/> Maximum deviation from 1 allowed for the quotient of the second derivatives. See condition (2). |
+| noise_barrier| Float. (Default=1). <br/> Upper bound for noisiness of data surrounding potential spikes. See condition (3).|
+| noise_window_size| Offset String. (Default='12h'). <br/> Size of the time window of the "surrounding" data of a potential spike. See condition (3). |
+| noise_statistic| String. (Default="CoVar"). <br/> Operator to calculate noisiness of data surrounding a potential spike. Either "CoVar" (=Coefficient of Variation) or "rvar" (=relative Variance).|
+
 ## constant
 ### Signature
 ```
-- 
GitLab