Update readManflagFile in dacces for more data
When accessing the Manual_Flags file with this script:
```python
import pandas as pd
from lib import faccess


def main(start_date, end_date, tag, station, logger):
    devices = faccess.getDevices(station=station, logger=logger, tag=tag, start_date=start_date, end_date=end_date)
    for device in devices:
        manflags = device.getManualFlags()
        print(manflags)


if __name__ == '__main__':
    start_date = pd.to_datetime('2019-01-01 00:00:00')
    end_date = pd.to_datetime('2019-02-01 00:00:00')
    tag = 'meteo'
    for station in ['HH']:
        for logger in ['BC1']:
            main(start_date, end_date, tag, station, logger)
```
You get this output:
var_id | start | end | flag |
---|---|---|---|
dendro_022 [um] | 2014-03-26 17:30:00 | 2014-03-26 17:30:00 | 2 |
dendro_022 [um] | 2014-04-23 00:00:00 | 2014-04-23 00:00:00 | 2 |
dendro_022 [um] | 2014-06-24 00:00:00 | 2014-06-24 00:00:00 | 0 |
... | ... | ... | ... |
But the CSV file contains other columns as well: dbh, d_ini and comment. At least the values of dbh and d_ini are needed for other calculations.
The function readManflagFile in dacces is currently used for this:
```python
def readManflagFile(fname):
    formats = {"%d.%m.%Y %H:%M:%S", "%d.%m.%Y %H:%M"}
    df = pd.read_csv(fname, comment="#", encoding="latin")
    df["start"] = _toDatetime(df["start"], formats)
    df["end"] = _toDatetime(df["end"], formats)
    df = (df
          .set_index("var_id")
          .loc[:, ["start", "end", "flag"]]
          .fillna(pd.Timestamp.now()))
    return df
```
My suggestion is to edit the function so that all columns are read and missing values are filled with NaN instead of Timestamp.now(). Like this:
```python
import numpy as np  # needed for np.nan below


def readManflagFile(fname):
    formats = {"%d.%m.%Y %H:%M:%S", "%d.%m.%Y %H:%M"}
    df = pd.read_csv(fname, comment="#", encoding="latin")
    df["start"] = _toDatetime(df["start"], formats)
    df["end"] = _toDatetime(df["end"], formats)
    # keep every column (dbh, d_ini, comment, ...) and leave gaps as NaN
    df = (df
          .set_index("var_id")
          .fillna(np.nan))
    return df
```