Commit d207d490 authored by Peter Lünenschloß

data regularisation snippets fixed

parent 177b40b1
7 merge requests: !685 Release 2.4, !684 Release 2.4, !567 Release 2.2.1, !566 Release 2.2, !501 Release 2.1, !372 fix doctest snippets, !369 Current documentation
......@@ -175,31 +175,25 @@ comprehensible from one look.
Shifted data
^^^^^^^^^^^^
>>> qc.dataRaw
SoilMoisture | SoilMoisture_bshift |
================================ | ========================================= |
Date Time | Date Time |
2021-01-01 00:00:00 23.429701 | 2021-01-01 00:09:07 23.429701 |
2021-01-01 00:10:00 23.431900 | 2021-01-01 00:18:55 23.431900 |
2021-01-01 00:20:00 23.343100 | 2021-01-01 00:28:42 23.343100 |
2021-01-01 00:30:00 23.476400 | 2021-01-01 00:38:30 23.476400 |
2021-01-01 00:40:00 23.343100 | 2021-01-01 00:48:18 23.343100 |
2021-01-01 00:50:00 23.298800 | 2021-01-01 00:58:06 23.298800 |
2021-01-01 01:00:00 23.387400 | 2021-01-01 01:07:54 23.387400 |
2021-01-01 01:10:00 23.343100 | 2021-01-01 01:17:41 23.343100 |
2021-01-01 01:20:00 23.298800 | 2021-01-01 01:27:29 23.298800 |
2021-01-01 01:30:00 23.343100 | 2021-01-01 01:37:17 23.343100 |
... | ... ... |
2021-03-20 07:20:00 156.587906 | 2021-03-20 05:07:02 137.271500 |
2021-03-20 07:30:00 NaN | 2021-03-20 05:21:35 138.194107 |
2021-03-20 07:40:00 166.146194 | 2021-03-20 05:41:59 154.116806 |
2021-03-20 07:50:00 164.690598 | 2021-03-20 06:03:09 150.567505 |
2021-03-20 08:00:00 NaN | 2021-03-20 06:58:10 145.027496 |
2021-03-20 08:10:00 NaN | 2021-03-20 07:13:49 152.883102 |
2021-03-20 08:20:00 NaN | 2021-03-20 07:26:16 156.587906 |
2021-03-20 08:30:00 NaN | 2021-03-20 07:40:37 166.146194 |
2021-03-20 08:40:00 155.318893 | 2021-03-20 07:54:59 164.690598 |
[11286] [10607]
>>> qc.dataRaw
SoilMoisture | SoilMoisture_bshift |
================================ | ======================================= |
2021-01-01 00:09:07 23.429701 | 2021-01-01 00:00:00 23.429701 |
2021-01-01 00:18:55 23.431900 | 2021-01-01 00:10:00 23.431900 |
2021-01-01 00:28:42 23.343100 | 2021-01-01 00:20:00 23.343100 |
2021-01-01 00:38:30 23.476400 | 2021-01-01 00:30:00 23.476400 |
2021-01-01 00:48:18 23.343100 | 2021-01-01 00:40:00 23.343100 |
... | ... ... |
2021-03-20 07:13:49 152.883102 | 2021-03-20 08:10:00 NaN |
2021-03-20 07:26:16 156.587906 | 2021-03-20 08:20:00 NaN |
2021-03-20 07:40:37 166.146194 | 2021-03-20 08:30:00 NaN |
2021-03-20 07:54:59 164.690598 | 2021-03-20 08:40:00 155.318893 |
2021-03-20 08:40:41 155.318893 | 2021-03-20 08:50:00 NaN |
[10607] [11286]
<BLANKLINE>
max: [11286 rows x 2 columns]
<BLANKLINE>
We see the first and last *10* data points of both the original time series and the shifted one.
......@@ -258,10 +252,7 @@ within the *10* minutes interval ranging from ``2021-01-01 07:30:00`` to ``2021-
in the original data - and only the first of the two reappears in the shifted data set, as representation
for that interval.
.. code-block:: python
>>> data_result['2021-01-01T07:00:00':'2021-01-01T08:00:00']
>>> qc.dataRaw["2021-01-01 07:00:00":"2021-01-01 08:00:00"]
SoilMoisture_bshift | SoilMoisture |
================================ | ========================================= |
Date Time | Date Time |
......@@ -282,13 +273,8 @@ appears to be closer to the original one.
To assign to each frequency-aligned timestamp the value that is closest to it, we
can perform a *nearest shift* instead of a simple *back shift*\ , by using the shift method ``"nshift"``\ :
.. code-block:: python
>>> saqc = saqc.shift('SoilMoisture', target='SoilMoisture_nshift', freq='10min', method='nshift')
>>> saqc = saqc.evaluate()
>>> data_result = saqc.getResult(raw=True)[0]
>>> data_result['2021-01-01T07:00:00':'2021-01-01T08:00:00']
>>> qc = qc.shift('SoilMoisture', target='SoilMoisture_nshift', freq='10min', method='nshift')
>>> qc.dataRaw['2021-01-01T07:00:00':'2021-01-01T08:00:00']
SoilMoisture_nshift | SoilMoisture |
================================ | ========================================= |
Date Time | Date Time |
......@@ -305,11 +291,7 @@ timestamp would get assigned the nearest value of all the values, that preceed o
To get an idea, check what happens to the final 2 hours of our shifted *Soil Moisture* dataset.
.. code-block:: python
>>> data_result['2021-03-20 07:00:00']
>>> qc.dataRaw['2021-03-20 07:00:00']
SoilMoisture_nshift | SoilMoisture |
================================ | ========================================= |
Date Time | Date Time |
......@@ -326,6 +308,7 @@ Maybe check out, what happens with the chunk of the final 2 hours of our shifted
2021-03-20 08:40:00 155.318893 | |
2021-03-20 08:50:00 NaN | |
Since, for example, no valid data is available in the interval from ``2021-03-20 07:55:00`` to ``2021-03-20 08:05:00``, the new value
for the regular timestamp ``2021-03-20 08:00:00``\ , which lies at the center of this interval, is ``NaN``.
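
The interval logic described above can be illustrated with a small sketch in plain Python. It is not the saqc implementation: the helper name ``nearest_shift`` is hypothetical, and the tolerance of half the target frequency (with inclusive bounds) is an assumption inferred from the interval quoted above. The sample values are the last three raw *SoilMoisture* readings shown in the output.

```python
from datetime import datetime, timedelta
import math


def nearest_shift(samples, grid, freq):
    """Assign each regular timestamp the nearest sample within freq / 2, else NaN.

    Hypothetical helper for illustration only; not part of saqc.
    """
    tolerance = freq / 2  # assumption: half the target frequency, bounds inclusive
    shifted = {}
    for stamp in grid:
        # Collect (distance, value) pairs for samples close enough to this stamp.
        candidates = [
            (abs(t - stamp), v) for t, v in samples if abs(t - stamp) <= tolerance
        ]
        # The smallest distance wins; without candidates the stamp stays empty.
        shifted[stamp] = min(candidates)[1] if candidates else math.nan
    return shifted


# Last three raw readings from the output above.
samples = [
    (datetime(2021, 3, 20, 7, 40, 37), 166.146194),
    (datetime(2021, 3, 20, 7, 54, 59), 164.690598),
    (datetime(2021, 3, 20, 8, 40, 41), 155.318893),
]

freq = timedelta(minutes=10)
# Regular 10 minute grid from 07:40:00 to 08:40:00.
grid = [datetime(2021, 3, 20, 7, 40) + i * freq for i in range(7)]

result = nearest_shift(samples, grid, freq)
for stamp, value in result.items():
    print(stamp, value)
```

The reading at ``07:54:59`` lies one second outside the interval around ``08:00:00``, so that timestamp stays ``NaN``, while ``07:50:00`` receives ``164.690598``.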
......@@ -337,13 +320,9 @@ selecting a single one, we can do this, with the :py:func:`saqc.resample <Functi
Let's resample the *SoilMoisture* data to a *20* minute sample rate by aggregating the content of every *20* minute
interval with the arithmetic mean (implemented, for example, by NumPy's ``numpy.mean`` function).
.. code-block:: python
>>> import numpy as np
>>> saqc = saqc.resample('SoilMoisture', target='SoilMoisture_mean', freq='20min', method='bagg', agg_func=np.mean)
>>> saqc = saqc.evaluate()
saqc.getResult(raw=True)[0]
>>> qc = qc.resample('SoilMoisture', target='SoilMoisture_mean', freq='20min', method='bagg', agg_func=np.mean)
>>> qc.dataRaw
SoilMoisture | SoilMoisture_mean |
================================ | ===================================== |
Date Time | Date Time |
......@@ -393,7 +372,7 @@ Interpolation
Another common way of obtaining regular timestamps is to interpolate the data at regular timestamps.
In the pool of :doc:`regularisation <function_cats/regularisation>` methods, is available the
Among the :py:mod:`regularisation <Functions.saqc.resampling>` methods is the
:py:func:`saqc.interpolate <Functions.saqc.interpolate>` method.
Let's apply a linear interpolation to the dataset. To access
......@@ -403,12 +382,8 @@ applies an interpolation, that is sensitive to the difference in temporal gaps
of the possible interpolation methods in the :py:func:`saqc.interpolate <Functions.saqc.interpolate>`
documentation. Let's check the results:
.. code-block:: python
>>> saqc = saqc.interpolate('SoilMoisture', target='SoilMoisture_linear', freq='10min', method='time')
>>> saqc = saqc.evaluate()
>>> saqc.getResult(raw=True)[0]
>>> qc = qc.interpolate('SoilMoisture', target='SoilMoisture_linear', freq='10min', method='time')
>>> qc.dataRaw
SoilMoisture | SoilMoisture_linear |
================================ | ========================================= |
Date Time | Date Time |
......@@ -466,13 +441,8 @@ Since data, that is flagged by a level higher or equal to the passed ``to_mask``
it can be advantageous to flag data before regularisation in order to effectively exclude it
from the resulting regularly sampled data set. Let's see an example for the *SoilMoisture* data set.
.. code-block:: python
>>> saqc = saqc.linear('SoilMoisture', target='SoilMoisture_linear', freq='10min')
>>> saqc = saqc.evaluate()
>>> d = saqc.getResult(raw=True)[0]
>>> d['2021-01-01 15:00:00':'2021-01-01 16:00:00']
>>> qc = qc.linear('SoilMoisture', target='SoilMoisture_linear', freq='10min')
>>> qc.dataRaw['2021-01-01 15:00:00':'2021-01-01 16:00:00']
SoilMoisture_linear | SoilMoisture |
================================ | ========================================= |
Date Time | Date Time |
......@@ -501,12 +471,9 @@ for the interpolation at ``2021-01-01 15:40:00``. So lets flag all the values sm
with the :py:func:`saqc.flagRange <Functions.saqc.flagRange>` method and after this,
do the interpolation.
.. code-block:: python
>>> saqc = saqc.outliers.flagRange('SoilMoisture', min=0)
>>> saqc = saqc.resampling.interpolate('SoilMoisture', freq='10min', method='time')
>>> saqc.getResult(raw=True)[0]['2021-01-01T07:00:00':'2021-01-01T08:00:00']
>>> qc = qc.flagRange('SoilMoisture', min=0)
>>> qc = qc.interpolate('SoilMoisture', freq='10min', method='time')
>>> qc.dataRaw['2021-01-01T07:00:00':'2021-01-01T08:00:00']
SoilMoisture | SoilMoisture_original |
================================ | ========================================= |
Date Time | Date Time |
......
......@@ -83,3 +83,5 @@ Indices and tables
* :ref:`genindex`
* :ref:`modindex`
* :ref:`search`
:py:meth:`saqc.SaQC.flagMissing`
\ No newline at end of file