Perf improvements

Merged David Schäfer requested to merge perf_improvements into develop

Some performance improvements, mostly as discussed in #99 (closed).

This reduced the CLI runtime on a synthetic dataset with 1,000,000 rows and 20 columns, range-tested on every column, from ~103 to ~30 seconds, as measured with the Linux time utility.

As the masking is heavily undertested, please review these changes thoroughly, @palmb and @luenensc!

Not sure why the pipeline fails, as it runs on my machine. I have to dig into that...

Edited by David Schäfer

Activity

  • David Schäfer resolved all threads

  • David Schäfer added 1 commit

    • ffe3633c - Apply 1 suggestion(s) to 1 file(s)

  • David Schäfer added 5 commits

  • Bert Palm added 1 commit

    • 062db908 - WIP - rework the register and saqcFunc calling machinery

  • Bert Palm marked as a Work In Progress from 062db908

  • Bert Palm added 1 commit

    • af4051ec - WIP - rework the register and saqcFunc calling machinery

  • Bert Palm added 1 commit

  • Bert Palm added 2 commits

    • 8c235bda - fixed everything to make it run
    • 013a83ef - clean up register and imports

  • TODO: implement fix for masking/unmasking

    • take data from new_data
    • unmask only the subset defined in columns (those were masked, the others were not)
      for c in columns:
          if index_changed:
              continue
          else:
              take old_data values at NaN positions, i.e. where wasmasked & ismasked & isna
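
The TODO above could be sketched in Python roughly as follows. This is only an illustration under assumptions, not the code from this merge request: the helper name `unmask`, its signature, and the use of plain dicts of pandas Series (so that each column can carry its own index) are all hypothetical.

```python
import pandas as pd


def unmask(old_data, new_data, was_masked, is_masked, columns):
    """Hypothetical helper: restore values that were hidden by masking.

    All arguments except `columns` are assumed to be dicts mapping a
    column name to a pandas Series.
    """
    out = dict(new_data)  # start from new_data, as the TODO suggests
    for c in columns:
        old, new = old_data[c], new_data[c]
        # If the function changed the index, positions can no longer be
        # matched, so the new data is kept untouched.
        if not old.index.equals(new.index):
            continue
        # Restore only where the value was masked before the call, is
        # still masked afterwards, and is still NaN in the new data.
        restore = was_masked[c] & is_masked[c] & new.isna()
        out[c] = new.where(~restore, old)
    return out


# Toy example with made-up values.
old = {"a": pd.Series([1.0, 2.0, 3.0]), "b": pd.Series([10.0, 20.0])}
new = {"a": pd.Series([1.0, float("nan"), 5.0]), "b": pd.Series([15.0])}
was = {"a": pd.Series([False, True, True]), "b": pd.Series([True, False])}
now = {"a": pd.Series([False, True, False]), "b": pd.Series([False])}

result = unmask(old, new, was, now, columns=["a", "b"])
```

In this toy run, column "a" gets its original value back only at the position that is still masked and still NaN, while column "b" is skipped because its index changed.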