Outputs should cache two data pushes

changed milestone to %release-0.4


I still think that if the source already has data for the time the module is updating to, the module should take the data for that time.

Also, this discussion is only about models that have exactly the same time-stepping. So, following the IO chain, models that don't have any inputs (like file readers) should update first and provide data for the target time of the models connected to them.

It's like the internal calculations in mHM: processes that are needed by other processes are calculated first (soilMoisture -> runoff, for example: the runoff routine takes the soil moisture of the current time to calculate the runoff for the current time).
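
For models with identical time-stepping, the update order described above amounts to a topological sort of the coupling graph. Here is a minimal sketch using Python's standard `graphlib` module. The component names and the `pulls_from` mapping are made up for illustration and are not part of any existing API.

```python
from graphlib import TopologicalSorter

# Hypothetical coupling graph: each component maps to the set of
# components it pulls data from.
pulls_from = {
    "reader": set(),                  # no inputs, e.g. a file reader
    "soil_moisture": {"reader"},
    "runoff": {"soil_moisture"},
}

# static_order() yields every component after all components it depends on.
update_order = list(TopologicalSorter(pulls_from).static_order())
print(update_order)  # ['reader', 'soil_moisture', 'runoff']
```

With a circular coupling, `static_order()` raises `graphlib.CycleError`, so such a graph cannot be ordered this way without breaking the cycle somewhere.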

I don't see why this should increase the time. That would also mean, that after the first time-step, the calculated initial value is the same as the result of the first time-step, since we use the values from the previous time-step. That doesn't make sense to me.

Example with the source updated first, as I would imagine it:

Initial values (as done at the moment)

source   o
         ↓
target   o

Source update

source   o-----o

target   o

Target update

source   o-----o
               ↓
target   o-----o

In the last example, the arrow needs to go from top-right to bottom-left, or not? Otherwise, it is just what I proposed with the first time step not shown.

Also, I think we should not have any special handling for synchronized time steps. We need a solution that covers this case just like any other.

For circular or bi-directional coupling, at least one component needs to work with current/past data. Should that be decided by the user? By adding a PrevTime adapter?

Further, I am not sure how, or if at all, this can be compatible with time interpolation. Do we then always pull with the future time? I.e. 1. Increment time, 2. Pull, 3. Push? Think of Formind with a 1-year step. How could mHM provide data for 1 year in the future? Update order does not help here.

So I think to implement this in a consistent way, we have to completely rethink the scheduling algorithm. This would be fine for me. Inconsistent application of some rules in one case, but not in others, would not.

Let's look at an example. Past and current time: ===O; future steps: ---o.

It is B's turn. What should it do/use? Data from a bit ahead? Getting data for B's next time is not possible here.

A ===O---o---o---o---o

B =O---------------o

It is A's turn. What should it do/use? Data from far ahead? Data for A's next time could be interpolated.

A ===O---o---o---o---o

B =O===============O--------
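
The two situations above can be written down as a small check. This is only a sketch: `can_serve` is a hypothetical helper, and the time values are approximate readings from the character columns of the diagrams (A at t=3 with a step of 4, B at t=1 with a step of 16).

```python
def can_serve(pushed_times, wanted_time):
    """Return True if wanted_time lies within the range of cached pushes,
    i.e. the pull can be answered without extrapolation."""
    return min(pushed_times) <= wanted_time <= max(pushed_times)


# First situation: it is B's turn and B wants data for its next time, t=17.
# A has only pushed up to its current time, t=3.
a_pushed = [3]
b_turn_ok = can_serve(a_pushed, wanted_time=17)

# Second situation: B has advanced to t=17 and it is A's turn.
# A wants data for its next time, t=7, which lies between B's two pushes.
b_pushed = [1, 17]
a_turn_ok = can_serve(b_pushed, wanted_time=7)

print(b_turn_ok, a_turn_ok)  # False True
```

The second pull only works because B's output still holds both the old and the new push, which is the point of caching two data pushes.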

> For circular or bi-directional coupling, at least one component needs to work with current/past data. Should that be decided by the user? By adding a PrevTime adapter?

Yes I think a PrevTime adapter should be added by the user.
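
To make the idea concrete, here is a minimal sketch of what such an adapter could do in a bi-directional coupling. `Output` and `PrevTime` are hypothetical stand-ins written for this comment, not existing classes, and the arithmetic is arbitrary.

```python
class Output:
    """Hypothetical output that keeps every pushed value by time."""

    def __init__(self):
        self.data = {}

    def push(self, time, value):
        self.data[time] = value

    def pull(self, time):
        return self.data[time]


class PrevTime:
    """Hypothetical adapter: answers a pull for `time` with the value
    that was pushed one step earlier."""

    def __init__(self, source, step):
        self.source = source
        self.step = step

    def pull(self, time):
        return self.source.pull(time - self.step)


# Bi-directional coupling A <-> B with a common step of 1.
out_a, out_b = Output(), Output()
out_a.push(0, 1.0)  # initial values
out_b.push(0, 2.0)
b_for_a = PrevTime(out_b, step=1)  # added by the user to break the cycle

for t in (1, 2):
    out_a.push(t, b_for_a.pull(t) + 1.0)  # A works with B's past value
    out_b.push(t, out_a.pull(t) * 2.0)    # B works with A's current value

print(out_a.data[2], out_b.data[2])  # 7.0 14.0
```

Only the link with the adapter works with past data; the other direction still gets the vertical arrow.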

For the other stuff:

I think this really implies that we need at least two time-steps saved (or to be able to state how many should be saved, 2 by default).
We need a default behavior for how to get the data from the available time-steps. Options are:
- nearest
- linear interpolation between available time-steps, with nearest when outside the available time-frame
- linear interpolation with extrapolation
- other interpolation

We already had no-branch adapters that store time-steps until data is pulled. That could help with big time-step differences like the ones you have shown.
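
As a basis for discussion, here is a minimal sketch of an output that caches the last two pushes and supports the first three options. The class name, the mode names, and the `push`/`pull` signatures are assumptions made for this sketch, not an existing API.

```python
from collections import deque

MODES = ("nearest", "linear_clamped", "linear_extrapolate")


class CachedOutput:
    """Hypothetical output that keeps the last `size` pushes (2 by default)."""

    def __init__(self, size=2, mode="linear_clamped"):
        if size < 1:
            raise ValueError("size must be at least 1")
        if mode not in MODES:
            raise ValueError(f"unknown mode: {mode!r}")
        self.buffer = deque(maxlen=size)  # (time, value) pairs, oldest first
        self.mode = mode

    def push(self, time, value):
        if self.buffer and time <= self.buffer[-1][0]:
            raise ValueError("pushes must be strictly increasing in time")
        self.buffer.append((time, value))  # drops the oldest entry when full

    def pull(self, time):
        if not self.buffer:
            raise LookupError("no data pushed yet")
        entries = list(self.buffer)
        if self.mode == "nearest" or len(entries) == 1:
            return min(entries, key=lambda tv: abs(tv[0] - time))[1]
        if self.mode == "linear_clamped":
            # Outside the cached time frame, fall back to the nearest value.
            if time <= entries[0][0]:
                return entries[0][1]
            if time >= entries[-1][0]:
                return entries[-1][1]
        # Use the pair bracketing `time`; outside the cached range
        # (extrapolation mode only) use the closest pair.
        pairs = list(zip(entries, entries[1:]))
        (t0, v0), (t1, v1) = pairs[-1]
        for pair in pairs:
            if time <= pair[1][0]:
                (t0, v0), (t1, v1) = pair
                break
        return v0 + (v1 - v0) * (time - t0) / (t1 - t0)


out = CachedOutput()  # two pushes, clamped linear interpolation
out.push(0, 0.0)
out.push(10, 100.0)
print(out.pull(5), out.pull(15))  # 50.0 100.0
```

With `mode="linear_extrapolate"` the same two pushes would give 150.0 for a pull at time 15, which shows why extrapolation should probably not be the default.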

We should probably avoid extrapolation as much as possible. If not possible, it should be nearest. Also, I think extrapolation should require special user action, like adding an adapter.

> That would also mean, that after the first time-step, the calculated initial value is the same as the result of the first time-step, since we use the values from the previous time-step. That doesn't make sense to me.

This is definitely a good point that we should keep in mind for our own mental consistency check.

Handling of the big time-step differences is still unclear to me. And we should also keep in mind that we can't plan with 2 components only (especially regarding the run-ahead mechanism you proposed earlier).

Somehow I see our complete concept collapse if we can't use the scheme of 1) pull for current state, 2) advance state, 3) push for new state. But yeah, this was questioned multiple times already.

So to be honest, I am quite a bit concerned that we based all that stuff on unacceptable assumptions, and that I convinced the team of a concept that starts to fall apart now. 😞

Keep calm. Nothing is falling apart. This is just the next step we need to tackle.

And in addition: What about focusing on the cases we have at the moment? I think adding a time_step attribute could be a good idea. It could then be None by default to state that the model keeps its time step secret (as it does now). This could also be used to inherit time-stepping from other modules (like netcdf readers).
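
A minimal sketch of how such an attribute could behave. Only the name `time_step` and the `None` default come from the proposal above. The `Component` class and the `inherit_time_step_from` method are hypothetical, and reading "inherit" as "a model takes over the step of a reader" is just one interpretation.

```python
from datetime import timedelta


class Component:
    """Hypothetical component with an optional, possibly inherited time step."""

    def __init__(self, time_step=None):
        self._time_step = time_step  # None: the component keeps its step secret
        self._step_source = None

    def inherit_time_step_from(self, other):
        """Use the time step of another component, e.g. a NetCDF reader."""
        self._step_source = other

    @property
    def time_step(self):
        if self._time_step is not None:
            return self._time_step
        if self._step_source is not None:
            return self._step_source.time_step
        return None


reader = Component(time_step=timedelta(days=1))
model = Component()
print(model.time_step)  # None
model.inherit_time_step_from(reader)
print(model.time_step)  # 1 day, 0:00:00
```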

And we could provide output buffers that are always no-branch adapters to provide buffering at each desired place.

I don't see anything falling apart.

For the updating routine, I now go with the following default scheme:

1. update time (to have a vertical arrow)
2. pull inputs
3. calculate data
4. push outputs

In addition, downstream inputs (like readers) should be updated first. With this as default, we can now generate a list of possible constellations of timing and required available data.
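
The default scheme can be sketched as follows, assuming two components with identical time steps. The `Component` class, its `outputs` dictionary, and the `calculate` rule are hypothetical and only serve to show the order of the four steps.

```python
class Component:
    """Hypothetical component following the four-step default scheme."""

    def __init__(self, name, step, inputs=()):
        self.name = name
        self.step = step
        self.time = 0
        self.inputs = list(inputs)  # components this one pulls from
        self.outputs = {0: 0.0}     # pushed values by time, with an initial value

    def calculate(self, pulled):
        # Arbitrary stand-in for the real model computation.
        return self.time + sum(pulled)

    def update(self):
        self.time += self.step                                    # 1. update time
        pulled = [src.outputs[self.time] for src in self.inputs]  # 2. pull inputs
        result = self.calculate(pulled)                           # 3. calculate data
        self.outputs[self.time] = result                          # 4. push outputs


reader = Component("reader", step=1)
model = Component("model", step=1, inputs=[reader])

for _ in range(2):
    reader.update()  # components without inputs go first
    model.update()

print(model.outputs)  # {0: 0.0, 1: 2, 2: 4}
```

If `model.update()` were called before `reader.update()`, the pull in step 2 would fail with a `KeyError` in this sketch, because the reader has not yet pushed data for the new time.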

This makes the most sense to me.

Yeah, this could work. See also #78 (closed).

However, I am not sure about updating downstream first.

Looks like we can handle all this well as proposed in #78 (closed).

Implementation in !157 (merged) (data storage/caching) and !159 (merged) (new scheduling algorithm).

mentioned in issue #78 (closed)

mentioned in merge request !157 (merged)

mentioned in commit ed8eb4e9

closed with merge request !157 (merged)
