MultivariateFlagging.rst

>>> qc.data.columns
Index(['sac254_raw', 'level_raw', 'water_temp_raw', 'maint'], dtype='object', name='columns')
>>> qc.data['maint'] # doctest:+SKIP
Timestamp
2016-01-10 11:15:00    2016-01-10 12:15:00
2016-01-12 14:40:00    2016-01-12 15:30:00
2016-02-10 13:40:00    2016-02-10 14:40:00
2016-02-24 16:40:00    2016-02-24 17:30:00
....                                  ....
2017-10-17 08:55:00    2017-10-17 10:20:00
2017-11-14 15:30:00    2017-11-14 16:20:00
2017-11-27 09:10:00    2017-11-27 10:10:00
2017-12-12 14:10:00    2017-12-12 14:50:00
Name: maint, dtype: object
>>> qc = qc.flagManual('sac254_raw', mdata='maint', method='closed', label='Maintenance')
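
The ``maint`` series above stores each maintenance start as its index and the matching end as its value. As a rough illustration of what a ``closed`` interval selection means, the sketch below marks every timestamp that falls inside any [start, end] interval, both bounds included. The helper ``closed_interval_mask`` and the toy data are hypothetical and written with plain pandas; this is not the library's implementation.

```python
import pandas as pd


def closed_interval_mask(index: pd.DatetimeIndex, intervals: pd.Series) -> pd.Series:
    """Mark timestamps lying inside any closed interval [start, end].

    `intervals` is expected to look like the `maint` column: the index holds
    the interval starts, the values hold the interval ends (hypothetical helper).
    """
    mask = pd.Series(False, index=index)
    for start, end in intervals.items():
        inside = (index >= pd.Timestamp(start)) & (index <= pd.Timestamp(end))
        mask.loc[inside] = True
    return mask


# Toy data: one maintenance event from 11:15 to 12:15 on a 15 minute grid.
index = pd.date_range("2016-01-10 11:00", "2016-01-10 12:30", freq="15min")
maint = pd.Series(
    ["2016-01-10 12:15:00"],
    index=pd.to_datetime(["2016-01-10 11:15:00"]),
)
mask = closed_interval_mask(index, maint)
```

With these toy values, only the first and last timestamps (11:00 and 12:30) stay unmarked, because both interval bounds are included.
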
>>> qc = qc.flagRange('level_raw', min=0)
>>> qc = qc.flagRange('water_temp_raw', min=-1, max=40)
>>> qc = qc.flagRange('sac254_raw', min=0, max=60)
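
The three range checks above can be pictured as simple threshold comparisons. The following sketch, assuming plain pandas and a hypothetical helper ``range_mask``, marks values outside a closed interval; a missing bound is treated as unbounded. How the library treats missing values and boundary values is not shown in this session, so the behaviour here is only an assumption.

```python
import numpy as np
import pandas as pd


def range_mask(series: pd.Series, lower: float = -np.inf, upper: float = np.inf) -> pd.Series:
    """Return True where a value lies outside [lower, upper].

    Comparisons with NaN evaluate to False, so missing values are not marked
    (hypothetical helper, not the library's implementation).
    """
    return (series < lower) | (series > upper)


# Toy values checked against the bounds used for 'sac254_raw' above.
values = pd.Series([18.45, -0.2, 61.3, np.nan, 43.2])
mask = range_mask(values, lower=0, upper=60)
```

Only the second and third toy values fall outside the interval and get marked.
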
>>> qc.data[['sac254_raw', 'level_raw', 'water_temp_raw']] # doctest:+NORMALIZE_WHITESPACE
                    sac254_raw |                     level_raw |                     water_temp_raw |
============================== | ============================= | ================================== |
Timestamp                      | Timestamp                     | Timestamp                          |
2016-01-01 00:02:00    18.4500 | 2016-01-01 00:02:00   103.290 | 2016-01-01 00:02:00           4.84 |
2016-01-01 00:17:00    18.6437 | 2016-01-01 00:17:00   103.285 | 2016-01-01 00:17:00           4.82 |
2016-01-01 00:32:00    18.9887 | 2016-01-01 00:32:00   103.253 | 2016-01-01 00:32:00           4.81 |
2016-01-01 00:47:00    18.8388 | 2016-01-01 00:47:00   103.210 | 2016-01-01 00:47:00           4.80 |
2016-01-01 01:02:00    18.7438 | 2016-01-01 01:02:00   103.167 | 2016-01-01 01:02:00           4.78 |
...                        ... | ...                       ... | ...                            ... |
2017-12-31 22:47:00    43.2275 | 2017-12-31 22:47:00   186.060 | 2017-12-31 22:47:00           5.49 |
2017-12-31 23:02:00    43.6937 | 2017-12-31 23:02:00   186.115 | 2017-12-31 23:02:00           5.49 |
2017-12-31 23:17:00    43.6012 | 2017-12-31 23:17:00   186.137 | 2017-12-31 23:17:00           5.50 |
2017-12-31 23:32:00    43.2237 | 2017-12-31 23:32:00   186.128 | 2017-12-31 23:32:00           5.51 |
[70163]                          [70163]                         [70163]
<BLANKLINE>
max: [70163 rows x 3 columns]
>>> qc.data[['sac254_raw', 'level_raw', 'water_temp_raw']]['2017-10-29 07:00:00':'2017-10-29 09:00:00'] # doctest:+NORMALIZE_WHITESPACE
                    sac254_raw |                     level_raw |                     water_temp_raw |
============================== | ============================= | ================================== |
Timestamp                      | Timestamp                     | Timestamp                          |
2017-10-29 07:02:00    40.3050 | 2017-10-29 07:02:00   112.570 | 2017-10-29 07:02:00          10.91 |
2017-10-29 07:17:00    39.6287 | 2017-10-29 07:17:00   112.497 | 2017-10-29 07:17:00          10.90 |
2017-10-29 07:32:00    39.5800 | 2017-10-29 07:32:00   112.460 | 2017-10-29 07:32:00          10.88 |
2017-10-29 07:32:01    39.9750 | 2017-10-29 07:32:01   111.837 | 2017-10-29 07:32:01          10.70 |
2017-10-29 07:47:00    39.1350 | 2017-10-29 07:47:00   112.330 | 2017-10-29 07:47:00          10.84 |
2017-10-29 07:47:01    40.6937 | 2017-10-29 07:47:01   111.615 | 2017-10-29 07:47:01          10.68 |
2017-10-29 08:02:00    40.4938 | 2017-10-29 08:02:00   112.040 | 2017-10-29 08:02:00          10.77 |
2017-10-29 08:02:01    39.3337 | 2017-10-29 08:02:01   111.552 | 2017-10-29 08:02:01          10.68 |
2017-10-29 08:17:00    41.5238 | 2017-10-29 08:17:00   111.835 | 2017-10-29 08:17:00          10.72 |
2017-10-29 08:17:01    38.6963 | 2017-10-29 08:17:01   111.750 | 2017-10-29 08:17:01          10.69 |
2017-10-29 08:32:01    39.4337 | 2017-10-29 08:32:01   112.027 | 2017-10-29 08:32:01          10.66 |
>>> qc = qc.linear(['sac254_raw', 'level_raw', 'water_temp_raw'], freq='15min')
>>> qc.data['sac254_raw'] # doctest:+NORMALIZE_WHITESPACE
Timestamp
2016-01-01 00:00:00          NaN
2016-01-01 00:15:00    18.617873
2016-01-01 00:30:00    18.942700
2016-01-01 00:45:00    18.858787
2016-01-01 01:00:00    18.756467
                         ...
2017-12-31 23:00:00    43.631540
2017-12-31 23:15:00    43.613533
2017-12-31 23:30:00    43.274033
2017-12-31 23:45:00    43.674453
2018-01-01 00:00:00          NaN
Name: sac254_raw, Length: 70177, dtype: float64
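
The regular 15 minute series above can be reproduced for the first few samples with plain pandas. The sketch below is a minimal illustration, assuming time-weighted linear interpolation between the two neighbouring raw samples and no extrapolation beyond the first and last sample; it ignores any gap limits the library may apply, and the variable names are hypothetical.

```python
import pandas as pd

# First three raw 'sac254_raw' samples from the table above.
raw = pd.Series(
    [18.4500, 18.6437, 18.9887],
    index=pd.to_datetime(
        ["2016-01-01 00:02", "2016-01-01 00:17", "2016-01-01 00:32"]
    ),
)

# Regular target grid covering the raw samples.
grid = pd.date_range(
    raw.index[0].floor("15min"), raw.index[-1].ceil("15min"), freq="15min"
)

on_grid = (
    raw.reindex(raw.index.union(grid))                 # add the grid timestamps as NaN
    .interpolate(method="time", limit_area="inside")  # fill only between valid samples
    .reindex(grid)                                     # keep the grid timestamps only
)
```

The grid points 00:15 and 00:30 evaluate to roughly 18.617873 and 18.942700, in line with the output above, while the grid points outside the raw samples stay ``NaN``.
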
>>> qc.plot('sac254_raw') # doctest:+SKIP
>>> qc = qc.correctDrift('sac254_raw', target='sac254_corrected', maintenance_field='maint', model='exponential')
>>> import matplotlib.pyplot as plt
>>> plt.plot(qc.data['sac254_raw']['2016'], alpha=.5, color='black', label='original') # doctest:+SKIP
>>> plt.plot(qc.data['sac254_corrected']['2016'], color='black', label='corrected') # doctest:+SKIP
>>> from scipy.stats import zscore
>>> zscore_func = lambda x: zscore(x, nan_policy='omit')
>>> qc = qc.transform(['sac254_corrected', 'level_raw', 'water_temp_raw'], target=['sac254_norm', 'level_norm', 'water_temp_norm'], func=zscore_func, freq='30D')
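
To make the effect of ``freq='30D'`` more concrete, the sketch below standardizes a series separately within consecutive 30 day chunks, so each chunk ends up with zero mean and unit (population) standard deviation. It assumes that the windows are non-overlapping partitions and uses plain pandas instead of ``scipy.stats.zscore``; the helper ``zscore_by_window`` and the toy data are hypothetical.

```python
import numpy as np
import pandas as pd


def zscore_by_window(series: pd.Series, freq: str = "30D") -> pd.Series:
    """Standardize each consecutive `freq` chunk on its own.

    pandas skips NaN in mean() and std(), which mirrors nan_policy='omit';
    ddof=0 matches the default of scipy.stats.zscore (hypothetical helper).
    """
    pieces = []
    for _, chunk in series.groupby(pd.Grouper(freq=freq)):
        if chunk.empty:
            continue
        pieces.append((chunk - chunk.mean()) / chunk.std(ddof=0))
    return pd.concat(pieces)


# Toy data: two 30 day periods with very different level and spread.
index = pd.date_range("2016-01-01", periods=60, freq="D")
values = np.concatenate([np.arange(30.0), 100.0 + 2.0 * np.arange(30.0)])
values[5] = np.nan  # a missing value stays missing after the transformation
normed = zscore_by_window(pd.Series(values, index=index))
```

Normalizing per chunk rather than over the whole series keeps slow seasonal changes in level and variance from dominating the scores.
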
>>> qc = qc.assignKNNScore(field=['sac254_norm', 'level_norm', 'water_temp_norm'], target='kNNscores', freq='30D', n=5)
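
A k-nearest-neighbour score assigns each multivariate sample a measure of how isolated it is in the space spanned by the normalized variables. The brute-force sketch below assumes Euclidean distances that are summed over the ``n`` nearest neighbours; the helper ``knn_scores`` and the toy points are hypothetical, and the library may use a different metric, aggregation, or search algorithm.

```python
import numpy as np


def knn_scores(points, n: int = 5) -> np.ndarray:
    """Sum of Euclidean distances to the n nearest neighbours of each point.

    Brute-force O(N^2) sketch; requires n < number of points
    (hypothetical helper, not the library's implementation).
    """
    points = np.asarray(points, dtype=float)
    diffs = points[:, None, :] - points[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    np.fill_diagonal(dists, np.inf)  # a point is not its own neighbour
    nearest = np.sort(dists, axis=1)[:, :n]
    return nearest.sum(axis=1)


# Toy data: four clustered points and one isolated point.
points = [[0, 0], [0, 1], [1, 0], [1, 1], [10, 10]]
scores = knn_scores(points, n=2)
```

The isolated point receives by far the largest score, which is the property the subsequent outlier detection on the score series relies on.
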
>>> qc.plot('kNNscores') # doctest:+SKIP
>>> qc.plot('sac254_norm', phaseplot='level_norm', xscope='2016-11') # doctest:+SKIP
>>> qc = qc.flagByStray(field='kNNscores', freq='30D', alpha=.3)
>>> qc = qc.transferFlags(field='kNNscores', target='sac254_corrected', label='STRAY')
>>> qc = qc.transferFlags(field='kNNscores', target='sac254_norm', label='STRAY')
>>> qc.plot('sac254_corrected', xscope='2016-11') # doctest:+SKIP
>>> qc.plot('sac254_norm', phaseplot='level_norm', xscope='2016-11') # doctest:+SKIP