
Univariate Outlier Probabilities and thresh automation for flagUniLOF

Peter Lünenschloß requested to merge prob_lof into develop

Adds a thresh-dependent dispatch to flagUniLOF that switches it to a univariate probabilistic local outlier factor, the Local Outlier Probability (LOP):

This mainly serves to make thresh more interpretable and to turn it into a redundant (optional) parameter.

The standard call now needs only the parameter n:

flagUniLOF(field, n=N)

Context

LOP derives outlier probabilities between 0 and 1 from the distribution of the outlier factors and thus allows more intuitive thresholding. (LOP also seems to result in a globally more valid ranking, as presented in [1].)
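To illustrate how scores are mapped to probabilities, here is a minimal univariate sketch that follows the LoOP construction from [1] (probabilistic set distance, PLOF, normalisation, error function). It is not the code from this merge request: the function name `local_outlier_probabilities` and the parameters `n` and `lam` are assumptions made for illustration only.

```python
import math


def local_outlier_probabilities(values, n=3, lam=3.0):
    """Simplified univariate LoOP after [1] (hypothetical helper, not the saqc code)."""
    m = len(values)
    neighbours = []
    pdist = []
    for i, v in enumerate(values):
        # n nearest neighbours of point i by absolute difference
        order = sorted((j for j in range(m) if j != i),
                       key=lambda j: abs(values[j] - v))
        nn = order[:n]
        neighbours.append(nn)
        # probabilistic set distance: lam times the standard distance to the neighbourhood
        pdist.append(lam * math.sqrt(sum((values[j] - v) ** 2 for j in nn) / len(nn)))

    # probabilistic local outlier factor: own distance relative to the neighbours' mean
    plof = []
    for i in range(m):
        mean_nb = sum(pdist[j] for j in neighbours[i]) / len(neighbours[i])
        plof.append(pdist[i] / mean_nb - 1.0 if mean_nb > 0 else 0.0)

    # normalisation factor derived from the spread of all PLOF values
    nplof = lam * math.sqrt(sum(p * p for p in plof) / m)
    if nplof == 0:
        return [0.0] * m

    # the error function maps the normalised factor into [0, 1]
    return [max(0.0, math.erf(p / (nplof * math.sqrt(2)))) for p in plof]


values = [1.0, 1.1, 0.9, 1.05, 0.95, 1.0, 10.0, 1.02, 0.98]
probs = local_outlier_probabilities(values, n=3)
```

In this toy series the isolated value 10.0 receives by far the highest probability, while the clustered values stay close to 0, which is what makes a cut-off on the probability scale easier to choose than a cut-off on raw scores.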

-> with thresh=None (default), flagUniLOF switches to local outlier probabilities mode and estimates the cut-off probability with a k-means cluster-separation approach

-> with thresh in (0,1), flagUniLOF switches to local outlier probabilities mode and cuts off at the probability thresh instead of at a score

-> with thresh in (-1,0), flagUniLOF switches to local outlier probabilities mode with a corruption cap: it assumes that no more than a fraction |thresh| of the data is anomalous and estimates the cut-off probability from that

-> with thresh in (-inf,-1), flagUniLOF switches to local outlier probabilities mode with a corruption cap: it assumes that no more than |thresh| samples of the data are anomalous and estimates the cut-off probability from that
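The dispatch described above could look roughly like the following sketch. It is a hedged illustration, not the implementation in this merge request: the helper names (`two_means_cutoff`, `capped_cutoff`, `resolve_cutoff`) are hypothetical, the two-cluster k-means is a plain 1D variant, the corruption cap is approximated by ranking the probabilities, and treating thresh >= 1 as the classic score threshold is an assumption.

```python
def two_means_cutoff(probs, iterations=50):
    """1D k-means with two clusters; the cut-off is the midpoint of the centroids."""
    lo, hi = min(probs), max(probs)
    for _ in range(iterations):
        low = [p for p in probs if abs(p - lo) <= abs(p - hi)]
        high = [p for p in probs if abs(p - lo) > abs(p - hi)]
        if not high:  # all values identical, nothing to separate
            break
        new_lo, new_hi = sum(low) / len(low), sum(high) / len(high)
        if (new_lo, new_hi) == (lo, hi):  # converged
            break
        lo, hi = new_lo, new_hi
    return (lo + hi) / 2


def capped_cutoff(probs, max_outliers):
    """Cut-off such that at most max_outliers probabilities lie strictly above it."""
    if max_outliers <= 0:
        return 1.0
    ranked = sorted(probs, reverse=True)
    if max_outliers >= len(ranked):
        return 0.0
    return ranked[max_outliers]


def resolve_cutoff(probs, thresh=None):
    """Return (mode, cutoff) for a given thresh (hypothetical helper)."""
    if thresh is None:
        return "probability", two_means_cutoff(probs)
    if 0 < thresh < 1:
        return "probability", float(thresh)
    if -1 < thresh < 0:
        # cap given as a fraction of the data
        return "probability", capped_cutoff(probs, int(-thresh * len(probs)))
    if thresh < -1:
        # cap given as an absolute number of samples
        return "probability", capped_cutoff(probs, int(-thresh))
    if thresh >= 1:
        # assumption: values >= 1 keep the classic score threshold
        return "score", float(thresh)
    raise ValueError("thresh must not be 0 or -1")


probs = [0.01, 0.02, 0.03, 0.05, 0.9, 0.95, 0.02, 0.7]
mode, cutoff = resolve_cutoff(probs)           # automatic cut-off
flagged = [p for p in probs if p > cutoff]     # the three high probabilities
```

With the example probabilities, `resolve_cutoff(probs, thresh=-0.25)` caps the result at two flagged values and `resolve_cutoff(probs, thresh=-3)` at three.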

[1] https://www.semanticscholar.org/paper/LoOP%3A-local-outlier-probabilities-Kriegel-Kr%C3%B6ger/4909d1923941b4981c6e9047a52ebb5a81d50277
