Regular dimensionless time not identified as regular
Summary
I want to detect outliers in a synthetic time series with a dimensionless time axis, t = np.arange(0, T, 1). However, I run into problems that seem to stem from the function getFreqDelta, which assumes that the time index is always in a format that pandas can interpret as a date. Maybe this is intended and SaQC is only supposed to work on actual dates? In that case, would this generalisation be something you are interested in? If not, a quick solution for me would be to simply introduce dummy dates.
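For reference, the dummy-date workaround could look roughly like the sketch below. It uses only pandas and NumPy. The origin date, the one-second step, and the names origin, step, df_dated and t_back are arbitrary choices made for illustration, not anything SaQC prescribes.

```python
import numpy as np
import pandas as pd

# Dimensionless time axis and data, as in the report
t = np.arange(0, 10)
df = pd.DataFrame({"data": np.linspace(-5, 5, 10)}, index=t)

# Assumption: map one dimensionless step to one second after an arbitrary origin
origin = pd.Timestamp("2000-01-01")
step = pd.Timedelta(seconds=1)

# Replace the integer index with regularly spaced dummy dates
df_dated = df.copy()
df_dated.index = origin + pd.to_timedelta(t, unit="s")

# After processing, recover the dimensionless time from the dummy dates
t_back = np.rint((df_dated.index - origin) / step).astype(int)

print(df_dated.head(3))
```

With this mapping, window sizes would have to be given in the dummy unit as well (for example a multiple of one second), and the original t can be restored from t_back afterwards.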
Reproducible Example
import numpy as np
import pandas as pd
import saqc
t = np.arange(0, 10)
df = pd.DataFrame({'data': np.linspace(-5, 5, 10)}, index=t)
print(df)
winsize = 4
score_func = lambda x: (x[int(winsize / 2)] - x.mean()) / x.std()
qc = saqc.SaQC(df)
qc = qc.processGeneric('data', target='data_diff', func=lambda x: x.diff())
qc = qc.roll(
    field='data_diff',
    target='data_scores',
    window=winsize,
    func=score_func,
)
The i in getFreqDelta looks like this in the example above:
DatetimeIndex([ '1970-01-01 00:00:00',
'1970-01-01 00:00:00.000000001',
'1970-01-01 00:00:00.000000002',
'1970-01-01 00:00:00.000000003',
'1970-01-01 00:00:00.000000004',
'1970-01-01 00:00:00.000000005',
'1970-01-01 00:00:00.000000006',
'1970-01-01 00:00:00.000000007',
'1970-01-01 00:00:00.000000008',
'1970-01-01 00:00:00.000000009'],
dtype='datetime64[ns]', freq=None)
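These values are consistent with pandas interpreting plain integers as nanoseconds since the Unix epoch. The sketch below reproduces the same kind of index without SaQC. Whether getFreqDelta performs this exact conversion internally is an assumption on my part; the snippet only shows the pandas behaviour.

```python
import numpy as np
import pandas as pd

# Same dimensionless time axis as in the example
t = np.arange(0, 10)

# pandas treats bare integers as nanoseconds since 1970-01-01 by default
as_dates = pd.to_datetime(t)

# Spacing between consecutive "dates" derived from the integer axis
spacing = as_dates[1] - as_dates[0]

print(as_dates[:3])
print(spacing)
```

So the integer axis is regular, but only with a step of one nanosecond after conversion, which is probably not what any date-based frequency check expects.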