rdm-software / SaQC
Commit d207d490, authored 3 years ago by Peter Lünenschloß
data regularisation snippets fixed
Parent: 177b40b1
7 merge requests: !685 Release 2.4, !684 Release 2.4, !567 Release 2.2.1, !566 Release 2.2, !501 Release 2.1, !372 fix doctest snippets, !369 Current documentation
Showing 2 changed files with 36 additions and 67 deletions:
sphinx-doc/cook_books/DataRegularisation.rst (34 additions, 67 deletions)
sphinx-doc/index.rst (2 additions, 0 deletions)
sphinx-doc/cook_books/DataRegularisation.rst (+34, -67)
@@ -175,31 +175,25 @@ comprehensible from one look.
Shifted data
^^^^^^^^^^^^
>>> qc.dataRaw
SoilMoisture | SoilMoisture_bshift |
================================ | ========================================= |
Date Time | Date Time |
2021-01-01 00:00:00 23.429701 | 2021-01-01 00:09:07 23.429701 |
2021-01-01 00:10:00 23.431900 | 2021-01-01 00:18:55 23.431900 |
2021-01-01 00:20:00 23.343100 | 2021-01-01 00:28:42 23.343100 |
2021-01-01 00:30:00 23.476400 | 2021-01-01 00:38:30 23.476400 |
2021-01-01 00:40:00 23.343100 | 2021-01-01 00:48:18 23.343100 |
2021-01-01 00:50:00 23.298800 | 2021-01-01 00:58:06 23.298800 |
2021-01-01 01:00:00 23.387400 | 2021-01-01 01:07:54 23.387400 |
2021-01-01 01:10:00 23.343100 | 2021-01-01 01:17:41 23.343100 |
2021-01-01 01:20:00 23.298800 | 2021-01-01 01:27:29 23.298800 |
2021-01-01 01:30:00 23.343100 | 2021-01-01 01:37:17 23.343100 |
... | ... ... |
2021-03-20 07:20:00 156.587906 | 2021-03-20 05:07:02 137.271500 |
2021-03-20 07:30:00 NaN | 2021-03-20 05:21:35 138.194107 |
2021-03-20 07:40:00 166.146194 | 2021-03-20 05:41:59 154.116806 |
2021-03-20 07:50:00 164.690598 | 2021-03-20 06:03:09 150.567505 |
2021-03-20 08:00:00 NaN | 2021-03-20 06:58:10 145.027496 |
2021-03-20 08:10:00 NaN | 2021-03-20 07:13:49 152.883102 |
2021-03-20 08:20:00 NaN | 2021-03-20 07:26:16 156.587906 |
2021-03-20 08:30:00 NaN | 2021-03-20 07:40:37 166.146194 |
2021-03-20 08:40:00 155.318893 | 2021-03-20 07:54:59 164.690598 |
[11286] [10607]
>>> qc.dataRaw
SoilMoisture | SoilMoisture_bshift |
================================ | ======================================= |
2021-01-01 00:09:07 23.429701 | 2021-01-01 00:00:00 23.429701 |
2021-01-01 00:18:55 23.431900 | 2021-01-01 00:10:00 23.431900 |
2021-01-01 00:28:42 23.343100 | 2021-01-01 00:20:00 23.343100 |
2021-01-01 00:38:30 23.476400 | 2021-01-01 00:30:00 23.476400 |
2021-01-01 00:48:18 23.343100 | 2021-01-01 00:40:00 23.343100 |
... | ... ... |
2021-03-20 07:13:49 152.883102 | 2021-03-20 08:10:00 NaN |
2021-03-20 07:26:16 156.587906 | 2021-03-20 08:20:00 NaN |
2021-03-20 07:40:37 166.146194 | 2021-03-20 08:30:00 NaN |
2021-03-20 07:54:59 164.690598 | 2021-03-20 08:40:00 155.318893 |
2021-03-20 08:40:41 155.318893 | 2021-03-20 08:50:00 NaN |
[10607] [11286]
<BLANKLINE>
max: [11286 rows x 2 columns]
<BLANKLINE>
We see the first and last *10* data points of both the original time series and the shifted one.
...
...
@@ -258,10 +252,7 @@ within the *10* minutes interval ranging from ``2021-01-01 07:30:00`` to ``2021-
in the original data, and only the first of the two reappears in the shifted data set as the representative
for that interval.
.. code-block:: python
>>> data_result['2021-01-01T07:00:00':'2021-01-01T08:00:00']
>>> qc.dataRaw["2021-01-01 07:00:00":"2021-01-01 08:00:00"]
SoilMoisture_bshift | SoilMoisture |
================================ | ========================================= |
Date Time | Date Time |
...
...
@@ -282,13 +273,8 @@ appears to be closer to the original one.
To assign to each frequency-aligned timestamp the value that is closest to it, we
can perform a *nearest shift* instead of a simple *back shift* by using the shift method ``"nshift"``\ :
.. code-block:: python
>>> saqc = saqc.shift('SoilMoisture', target='SoilMoisture_nshift', freq='10min', method='nshift')
>>> saqc = saqc.evaluate()
>>> data_result = saqc.getResult(raw=True)[0]
>>> data_result['2021-01-01T07:00:00':'2021-01-01T08:00:00']
>>> qc = qc.shift('SoilMoisture', target='SoilMoisture_nshift', freq='10min', method='nshift')
>>> qc.dataRaw['2021-01-01T07:00:00':'2021-01-01T08:00:00']
SoilMoisture_nshift | SoilMoisture |
================================ | ========================================= |
Date Time | Date Time |
...
...
@@ -305,11 +291,7 @@ timestamp would get assigned the nearest value of all the values, that preceed o
To get an idea, check what happens to the final 2 hours of our shifted *Soil Moisture* dataset.
.. code-block:: python
>>> data_result['2021-03-20 07:00:00']
>>> qc.dataRaw['2021-03-20 07:00:00']
SoilMoisture_nshift | SoilMoisture |
================================ | ========================================= |
Date Time | Date Time |
...
...
@@ -326,6 +308,7 @@ Maybe check out, what happens with the chunk of the final 2 hours of our shifted
2021-03-20 08:40:00 155.318893 | |
2021-03-20 08:50:00 NaN | |
Since there is no valid data available in, for example, the interval from ``2021-03-20 07:55:00`` to ``2021-03-20 08:05:00``\ , the new value
for the regular timestamp ``2021-03-20 08:00:00``\ , which lies in the center of this interval, is ``NaN``.
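
The idea behind a *nearest shift* can be sketched without any library support. The snippet below is a minimal illustration, not the ``saqc`` implementation: it assumes that a value is only assigned to a regular timestamp if it lies within half the target frequency of that timestamp, and the helper name ``nearest_shift`` is hypothetical. The sample values are taken from the tail of the data shown above.

```python
from bisect import bisect_left
from datetime import datetime, timedelta
import math


def nearest_shift(times, values, grid, freq):
    """Hypothetical helper: for each regular timestamp in `grid`, pick the
    value whose (sorted) timestamp is nearest, provided it lies within half
    of `freq`; otherwise return NaN for that timestamp."""
    half = freq / 2
    shifted = []
    for g in grid:
        i = bisect_left(times, g)  # first index with times[i] >= g
        best = None
        # Only the direct neighbours of g can be the nearest timestamp.
        for j in (i - 1, i):
            if 0 <= j < len(times):
                dist = abs(times[j] - g)
                if dist <= half and (best is None or dist < best[0]):
                    best = (dist, values[j])
        shifted.append(best[1] if best is not None else math.nan)
    return shifted


# Irregularly sampled timestamps and values (from the data shown above).
times = [
    datetime(2021, 3, 20, 7, 26, 16),
    datetime(2021, 3, 20, 7, 40, 37),
    datetime(2021, 3, 20, 7, 54, 59),
    datetime(2021, 3, 20, 8, 40, 41),
]
values = [156.587906, 166.146194, 164.690598, 155.318893]

# Regular 10 minute timestamps to shift onto.
grid = [
    datetime(2021, 3, 20, 7, 50),
    datetime(2021, 3, 20, 8, 0),
    datetime(2021, 3, 20, 8, 40),
]

result = nearest_shift(times, values, grid, timedelta(minutes=10))
```

Under these assumptions, ``08:00:00`` receives ``NaN`` because the closest sample (``07:54:59``) is more than *5* minutes away, while ``07:50:00`` and ``08:40:00`` each receive their nearest neighbour.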
...
...
@@ -337,13 +320,9 @@ selecting a single one, we can do this, with the :py:func:`saqc.resample <Functi
Let's resample the *SoilMoisture* data to a *20* minute sampling rate by aggregating the content of every *20* minute interval
with the arithmetic mean (implemented, for example, by NumPy's ``numpy.mean`` function).
.. code-block:: python
>>> import numpy as np
>>> saqc = saqc.resample('SoilMoisture', target='SoilMoisture_mean', freq='20min', method='bagg', agg_func=np.mean)
>>> saqc = saqc.evaluate()
saqc.getResult(raw=True)[0]
>>> qc = qc.resample('SoilMoisture', target='SoilMoisture_mean', freq='20min', method='bagg', agg_func=np.mean)
>>> qc.dataRaw
SoilMoisture | SoilMoisture_mean |
================================ | ===================================== |
Date Time | Date Time |
...
...
@@ -393,7 +372,7 @@ Interpolation
Another common way of obtaining regular timestamps is to interpolate the data at regular timestamps.
In the pool of :doc:`regularisation <function_cats/regularisation>` methods, is available the
In the pool of py:mod:`regularisation <Functions.saqc.resampling>` methods, is available the
:py:func:`saqc.interpolate <Functions.saqc.interpolate>` method.
Let's apply a linear interpolation to the dataset. To access
...
...
@@ -403,12 +382,8 @@ applies an interpolation, that is sensitive to the difference in temporal gaps
of the possible interpolation methods in the :py:func:`saqc.interpolate <Functions.saqc.interpolate>`
documentation. Let's check the results:
.. code-block:: python
>>> saqc = saqc.interpolate('SoilMoisture', target='SoilMoisture_linear', freq='10min', method='time')
>>> saqc = saqc.evaluate()
>>> saqc.getResult(raw=True)[0]
>>> qc = qc.interpolate('SoilMoisture', target='SoilMoisture_linear', freq='10min', method='time')
>>> qc.dataRaw
SoilMoisture | SoilMoisture_linear |
================================ | ========================================= |
Date Time | Date Time |
...
...
@@ -466,13 +441,8 @@ Since data, that is flagged by a level higher or equal to the passed ``to_mask``
it can be advantageous to flag data before regularisation in order to effectively exclude it
from the resulting regularly sampled data set. Let's see an example for the *SoilMoisture* data set.
.. code-block:: python
>>> saqc = saqc.linear('SoilMoisture', target='SoilMoisture_linear', freq='10min')
>>> saqc = saqc.evaluate()
>>> d = saqc.getResult(raw=True)[0]
>>> d['2021-01-01 15:00:00':'2021-01-01 16:00:00']
>>> qc = qc.linear('SoilMoisture', target='SoilMoisture_linear', freq='10min')
>>> qc.dataRaw['2021-01-01 15:00:00':'2021-01-01 16:00:00']
SoilMoisture_linear | SoilMoisture |
================================ | ========================================= |
Date Time | Date Time |
...
...
@@ -501,12 +471,9 @@ for the interpolation at ``2021-01-01 15:40:00``. So lets flag all the values sm
with the :py:func:`saqc.flagRange <Functions.saqc.flagRange>` method and after this,
do the interpolation.
.. code-block:: python
>>> saqc = saqc.outliers.flagRange('SoilMoisture', min=0)
>>> saqc = saqc.resampling.interpolate('SoilMoisture', freq='10min', method='time')
>>> saqc.getResult(raw=True)[0]['2021-01-01T07:00:00':'2021-01-01T08:00:00']
>>> qc = qc.flagRange('SoilMoisture', min=0)
>>> qc = qc.interpolate('SoilMoisture', freq='10min', method='time')
>>> qc.dataRaw['2021-01-01T07:00:00':'2021-01-01T08:00:00']
SoilMoisture | SoilMoisture_original |
================================ | ========================================= |
Date Time | Date Time |
...
...
sphinx-doc/index.rst (+2, -0)
@@ -83,3 +83,5 @@ Indices and tables
* :ref:`genindex`
* :ref:`modindex`
* :ref:`search`
:py:meth:`saqc.SaQC.flagMissing`
\ No newline at end of file
David Schäfer (@schaefed) mentioned in commit 63b55c6d7dadb0e612b23a897f292d5ffc14cb52 · 2 years ago
David Schäfer (@schaefed) mentioned in commit 684dc8a0515470d644fc85fee95d07661c8dd572 · 2 years ago
David Schäfer (@schaefed) mentioned in commit 8f7a90e4aed61c79a9dc8d67541a46beba0907e8 · 1 year ago