Commit 686065dc authored by David Schäfer

[FIX] reduce the memory consumption of SaQC by >50% through Histories of type pd.Categorical

parent 0ffd8c04
1 merge request: !269 Several fixes
Pipeline #25309 passed with stage in 1 minute and 51 seconds
This commit is part of merge request !269.
  • David Schäfer @schaefed

    mentioned in issue #209 (closed)

  • From some brief benchmarking, I found that casting via df.astype(pd.SparseDtype('float', np.nan)) instead of df.astype('category') is around 30 percent faster in the cast itself and uses less memory (by a factor of 1 to 10; without the initial unflagged column, by a factor of 2 to 20). It is also faster in column and row access and in row-wise max calculation.

    So, since integrating it would just mean replacing the category cast with a sparse cast, maybe we should give it a try?

  • Author Owner

    Sure! But please let's do it after !260 (merged) has been merged.

  • Author Owner

    !260 (merged) is in now, so feel free to sparsify.
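
For anyone wanting to reproduce the comparison discussed above, here is a minimal sketch of how the two casts could be measured against each other. It is not SaQC's actual History code: the helper names (`build_flags`, `total_bytes`), the frame shape, the fill ratio, and the flag-like values are all assumptions chosen for illustration. Only `df.astype('category')` and `df.astype(pd.SparseDtype('float', np.nan))` come from the thread, and the numbers you get will depend on how sparse the real flag columns are.

```python
import numpy as np
import pandas as pd


def build_flags(n_rows=10_000, n_cols=5, filled=0.01, seed=0):
    """Build a hypothetical, mostly-NaN float frame (a stand-in, not a real History)."""
    rng = np.random.default_rng(seed)
    data = np.full((n_rows, n_cols), np.nan)
    mask = rng.random((n_rows, n_cols)) < filled
    # Fill the selected cells with a few distinct, made-up flag values.
    data[mask] = rng.choice([0.0, 25.5, 255.0], size=int(mask.sum()))
    return pd.DataFrame(data, columns=[f"test_{i}" for i in range(n_cols)])


def total_bytes(df):
    """Total memory footprint of a frame in bytes, including the index."""
    return int(df.memory_usage(deep=True).sum())


dense = build_flags()

# The cast currently under discussion: every column becomes a pd.Categorical.
as_category = dense.astype("category")

# The proposed alternative: store only the non-NaN entries of each column.
as_sparse = dense.astype(pd.SparseDtype("float", np.nan))

sizes = {
    "dense": total_bytes(dense),
    "category": total_bytes(as_category),
    "sparse": total_bytes(as_sparse),
}
print(sizes)

# Converting the sparse frame back should recover the original values.
restored = as_sparse.sparse.to_dense()
```

Timing the casts and the column, row, and row-wise max operations (for example with `timeit`) on real flag data would be needed to check the speed claims; this sketch only covers memory.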
