To avoid temporary arrays in memory, it could be a good idea to provide a callback function to Outputs that is called when the get_data method is called, so that data is fetched directly from the models.
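A minimal sketch of the idea, not the framework's actual API: `CallbackOutput` and `ToyModel` are hypothetical names, and only `get_data` is taken from the proposal above. The output stores no array of its own and asks the model for the data when a target requests it.

```python
class ToyModel:
    """Hypothetical model that owns its state array."""

    def __init__(self, size):
        self.state = [0.0] * size

    def step(self):
        # Update the state in place; no copy is handed to any output.
        for i in range(len(self.state)):
            self.state[i] += 1.0


class CallbackOutput:
    """Hypothetical output that holds a callback instead of data."""

    def __init__(self, name, callback):
        self.name = name
        self._callback = callback

    def get_data(self, time):
        # Fetch the data from the model only when a target asks for it.
        return self._callback(time)


model = ToyModel(size=4)
output = CallbackOutput("state", lambda time: model.state)

model.step()
data = output.get_data(time=0)
```

Note that in this sketch the target receives the model's own list, so modifying it in place would change the model state.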
@muellese We should carefully check how this plays with the current update scheme (which of course can be subject to discussion [and deserves some more detailed documentation]). In particular, "callback outputs" do more than allow extracting model outputs on demand. As a side effect, purely pull-based components would become possible, which do their calculations only when requested by a target (cool thing!). Currently, "timeless" components (IComponent) only work in a push-based manner, so we should check for consistency with the current concept.
We should also keep in mind that multiple targets can be connected to each output. So when using on-demand outputs blindly, it might save some memory but induce unnecessary re-calculations. Multiple connections are actually the primary reason why data resides in the outputs, and why it should not be modified in-place by downstream components or adapters (which should also be documented).
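To illustrate the re-calculation concern, here is a small sketch with a hypothetical `CountingCallbackOutput` (not an existing class) that counts how often its callback runs. Two targets pulling the same time step trigger the extraction twice.

```python
class CountingCallbackOutput:
    """Hypothetical on-demand output that counts callback invocations."""

    def __init__(self, callback):
        self._callback = callback
        self.calls = 0

    def get_data(self, time):
        self.calls += 1
        return self._callback(time)


def extract(time):
    # Stand-in for a potentially expensive extraction from a model.
    return [float(time)] * 5


output = CountingCallbackOutput(extract)

# Two targets connected to the same output, both pulling the same time step.
first = output.get_data(time=1)
second = output.get_data(time=1)
```

After both pulls, `output.calls` is 2 and each target holds its own freshly built list. Caching the result in the output would avoid the second call, but brings back the stored array that the callback was meant to avoid.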
@thober Actually, I think we should check how severe these memory concerns really are, and not "over-engineer" without knowing some numbers. As an example, Germany in 1 km by 1 km resolution is around 550k raster cells, which makes about 4.4 MB for a double-precision grid. So we could have a few thousand such outputs (or output z-layers) on a single node.
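The estimate can be checked with a few lines of arithmetic. The cell count is the figure quoted above; the number of grids is an assumed example value.

```python
cells = 550_000          # raster cells, figure quoted above
bytes_per_value = 8      # double precision (float64)

grid_bytes = cells * bytes_per_value
grid_mb = grid_bytes / 1e6           # 4.4 MB per grid

n_grids = 1000                       # assumed example count
total_gb = n_grids * grid_bytes / 1e9  # 4.4 GB for 1000 grids
```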
That said, pull-based components (or rather, outputs) would be great if they work (or if we can make them do so)! 😎
Thinking about it a bit further, "real" pull-based outputs will probably not work. I.e. a component will still need to call notify_targets on such an output, but could skip the actual push.
This is because pulls do not propagate back through time-aware adapters (i.e. those overwriting source_changed). Their input "time steps" (push/notification) need to be independent of output "time steps" (pull).
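A rough sketch of why the notification is still needed, under simplified assumptions: `LazyOutput` and `TimeSumAdapter` are hypothetical, and only the names `notify_targets`, `source_changed` and `get_data` come from the discussion. The output stores nothing, but the time-aware adapter must be notified at every source step, because it accumulates on its input time steps independently of when it is pulled.

```python
class LazyOutput:
    """Hypothetical output: notifies targets but stores no data."""

    def __init__(self, callback):
        self._callback = callback
        self.targets = []

    def notify_targets(self, time):
        # No push of data, only the notification.
        for target in self.targets:
            target.source_changed(time)

    def get_data(self, time):
        return self._callback(time)


class TimeSumAdapter:
    """Hypothetical time-aware adapter that sums values between pulls."""

    def __init__(self, source):
        self.source = source
        source.targets.append(self)
        self.total = 0.0

    def source_changed(self, time):
        # Input "time step": fetch the value at each notification.
        self.total += self.source.get_data(time)

    def get_data(self, time):
        # Output "time step": return the sum and reset.
        total, self.total = self.total, 0.0
        return total


state = {"value": 0.0}
out = LazyOutput(lambda time: state["value"])
adapter = TimeSumAdapter(out)

for t in range(3):
    state["value"] = float(t + 1)
    out.notify_targets(t)

result = adapter.get_data(2)  # 1.0 + 2.0 + 3.0 = 6.0
```

Without the three notifications, a pull on the adapter at `t = 2` could only see the latest value, not the sum over the source's steps.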
So I think we can realize this in a way that serves the original purpose (save memory), while the IMO more useful feature of pull-based components would be more complicated to implement. Their added benefits would be the possibility of components that behave like pull-based adapters with multiple inputs. E.g.:
- Statistical models that have no time step
- "Adapters" that require multiple inputs, e.g.:
  - clip one input raster by the extent of another one
  - Formind's soil moisture -> growth reduction adapter, with dynamic PWP and FC
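As an illustration of the raster-clipping example above, a pull-based component with two inputs could look roughly like this. `PullClip` is a hypothetical name, rasters are plain nested lists, and "extent" is simplified to "cells where the second raster has data" (`None` marks no-data); a real implementation would work on grid definitions.

```python
class PullClip:
    """Hypothetical pull-based component with two inputs."""

    def __init__(self, get_raster, get_extent):
        # Both inputs are callables that are pulled on demand.
        self._get_raster = get_raster
        self._get_extent = get_extent

    def get_data(self, time):
        raster = self._get_raster(time)
        extent = self._get_extent(time)
        # Keep a cell only where the extent raster has data.
        return [
            [value if mask is not None else None
             for value, mask in zip(row, mask_row)]
            for row, mask_row in zip(raster, extent)
        ]


raster = [[0.0, 1.0, 2.0], [3.0, 4.0, 5.0]]
extent = [[1.0, None, 1.0], [None, 1.0, 1.0]]

clip = PullClip(lambda time: raster, lambda time: extent)
clipped = clip.get_data(time=0)
```

The calculation only runs when a target calls `get_data`, and both inputs are pulled for the same requested time.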
An option could be to not allow for source_changed-adapters after pull-based outputs (directly, and also down the adapter chain - would need a separate adapter interface). However, this would limit compatibility between components and adapters. 🤔