Reconcile All The Things

Figuring this out requires you to do a proper minimal diff. Even then, there is ambiguity, because the following would also work:

  • Remove B(false) and its sub-calls
  • Replace D(0) with D(2)
  • Remove D(3)

From the code you can tell that it should be the former, not the latter. But while this is easy to describe after the fact, it’s difficult to imagine how such a thing might actually work at run-time. The only way to know what calls will be made is to run the code. Once you do, it’s too late to try and use previous results to save time. Plus, this code only has a few trivial ifs. This becomes even harder if you allow e.g. for loops because now there’s a variable number of elements.

In case it’s not clear, this is the exact same problem as the live DFG with lambdas in disguise.

We need to introduce the operation of reconciliation: rather than doing work, functions like B must return some kind of data structure that describes the work to be done. Sort of like an effect. Then we can reconcile it with what it returned previously, and map 1-to-1 the calls that are identical. Then we can run it, while reusing cached results that are still valid.
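
To make this concrete, here's a minimal sketch of what such wrapped calls and a reconcile pass could look like. The names call, sameArgs and reconcile are made up for illustration, not part of any real run-time:

let call = (f) => (...args) => ({f, args});

let sameArgs = (a, b) =>
  a.length === b.length && a.every((v, i) => v === b[i]);

// Match the new list of calls against the previous one, position by position.
// Identical calls keep their cached result; anything else will be re-run.
let reconcile = (before = [], after = []) =>
  after.map((next, i) => {
    let prev = before[i];
    let same = prev && prev.f === next.f && sameArgs(prev.args, next.args);
    return same ? {...next, result: prev.result} : next;
  });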

It would also be useful to match calls where only arguments changed, because e.g. B(true) and B(false) share some of the same calls, which can be reused. At least in theory.

Granted, there is a huge constraint here, which the mock scenario obfuscates. Memoizing the calls only makes sense if they return values. But if we passed any returned value into another call, this would introduce a data dependency.

That is, in order to reconcile the following:

let F = (x) => {
  let foo = C(x);
  D(foo);
  if (foo) E();
}

We would need to somehow yield in the middle:

let F = function* (x) {
  let foo = C(x);
  yield;
  D(foo);
  if (foo) E();
}

That way we can first reconcile foo = C(x), so we know whether to reconcile just D(false), or D(true) followed by E().

Unfortunately our language does not actually have a trace reconciler built into it, so this can’t work. We have two options.

First, we can transpile the code to a deferred form:

let F = function* (x) {
  let foo = yield call(C)(x);
  yield [
    call(D)(foo),
    foo ? call(E)() : null,
  ];
}

Here, call(C)(x) is a value that says we want to call C with the argument x. A deferred function like F returns one or more wrapped calls via a yield. This allows C(x) to be reconciled, obtaining either a cached or fresh value for foo. Then we can reconcile the calls to D(foo) and E().
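
As a rough sketch of how a run-time could drive this deferred form — assuming the call and sameArgs helpers sketched above, plus a previous trace kept from the last run (the evaluate name and shapes are invented, not a real API):

let evaluate = (F, x, previous = []) => {
  let it = F(x);
  let trace = [];
  let step = it.next();

  while (!step.done) {
    // Each yield hands back one call, or a list of calls, to reconcile.
    let calls = [].concat(step.value).filter(Boolean);
    let values = calls.map((next) => {
      let prev = previous[trace.length];
      let hit = prev && prev.f === next.f && sameArgs(prev.args, next.args);
      let result = hit ? prev.result : next.f(...next.args);
      trace.push({f: next.f, args: next.args, result});
      return result;
    });
    // Feed the (possibly cached) value back in so F can decide what comes next.
    step = it.next(values.length === 1 ? values[0] : values);
  }
  return trace;
};

Run it once, keep the trace, pass it back in on the next run, and any call whose function and arguments didn't change gets its old result instead of being re-run.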

To make this work would require functions C, D and E to receive the exact same treatment, which, we have to be honest, is not a compelling prospect.

Alternatively, we could recognize that C(x) is called first and unconditionally. It doesn’t actually need to be reconciled: its presence is always guaranteed if F is called. Let’s call such a function a hook.

let F = (x) => {
  let foo = C(x);
  return defer([
    call(D)(foo),
    foo ? call(E)() : null
  ]);
}

If hooks like C(x) aren’t reconciled, they’re regular function calls, and so is all the code inside. Like a scriptable node in a DFG, it’s an escape hatch inside the run-time.

But we’re also still missing something: actual memoization. While we have the necessary information to reconcile calls across two different executions, we still don’t have anywhere to store the memoized state.

So we’ll need to reserve some state when we first call F. We’ll put all the state inside something called a fiber. We can pass it in as a bound argument to F:

let F = (fiber) => (x) => {
  let foo = C(fiber)(x);
  return defer([
    call(D)(foo),
    foo ? call(E)() : null
  ]);
}

We also pass the fiber to hooks like C: this provides the perfect place for C to store a memoized value and its dependencies. If we run the program a second time and call this exact same F again, in the same place, it will receive the same fiber as before.

As long as the execution flow remains the same between two runs, the fiber and the memoized values inside will remain. Because functions like C are likely to receive exactly the same argument next time, memoization works very well here.
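
For example, a hook like C could use its fiber as nothing more than a place to stash the last argument and result. This is a sketch, not any library's implementation; the fiber shape and expensiveComputation are hypothetical:

let C = (fiber) => (x) => {
  let {memo} = fiber;
  // Same argument as last time: return the memoized value without recomputing.
  if (memo && memo.arg === x) return memo.value;

  let value = expensiveComputation(x);
  fiber.memo = {arg: x, value};
  return value;
};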

Apropos of nothing, here’s how you build a UI component on the web these days:

const Component: React.FC<Props> = (props) => {
  const {foo, bar} = props;

  // These are hooks, which can only be called unconditionally
  const [state, setState] = useState('hello world');
  const memoized = useMemo(() => slow(foo), [foo]);

  // And there's also something called useEffect
  useEffect(() => {
    doThing(foo)
    return () => undoThing(foo);
  }, [foo]);

  // Regular JS code goes here
  // ...

  // This schedules a call to D({foo}) and E()
  // They are mounted in the tree inside <Component> recursively
  return <>
    <D foo={foo} />
    {foo ? <E /> : null}
  </>
}

It’s all there. Though useEffect is a side-show: the real Effects are actually the <JSX> tags, which seem to have all the relevant qualities:

  • JSX doesn’t run any actual code when we evaluate it
  • We can compose multiple JSX elements together in series (nesting) or in parallel (<> aka “fragments”)
  • The run-time takes care of all the scheduling, mounting and clean up, in the right order

It will reconcile the before and after calls, and preserve the matching fibers, so functions like useMemo can work.

You can also reconcile variable-size sets, by returning an array where every call has a key property. This allows minimal diffing in O(N) time.
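
A sketch of what keyed reconciliation could look like — indexing the previous children by key once makes each lookup O(1), so the whole pass stays O(N). The makeFiber helper and the exact shapes are invented for illustration:

let reconcileChildren = (before, after) => {
  let byKey = new Map(before.map((child) => [child.key, child]));

  return after.map((next) => {
    let prev = byKey.get(next.key);
    // Same key, same function: keep the existing fiber and everything memoized in it.
    if (prev && prev.f === next.f) return {...next, fiber: prev.fiber};
    // New or changed: mount a fresh fiber.
    return {...next, fiber: makeFiber(next)};
  });
};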

You may eventually realize that the JSX at the bottom is really just an obscure dialect of JavaScript which lacks a real return statement: what this is returning is not a value at all. It is also not passed back to the parent component but to the run-time. The syntax is optimized for named rather than positional arguments, but that’s about it.

What’s more, if you do accept these constraints and manage to shoehorn your code into this form, it’s strangely not terrible in practice. Often the opposite. Suddenly a lot of complex things that should be hard seem to just fall out naturally. You can actually begin to become 10x. Compared to O(N²) anyway.

The difference with our hypothetical trace reconciler is that there is no way to yield back to a parent during a deferred render. A common work-around in React land is a so-called render prop, whose value is a lambda. The lambda is called by the child during rendering, so it must be entirely side-effect free.

The code:

x = A();
y = B();
z = C(x, y);

must be turned into:

<A>{
  (x) => <B>{
    (y) => <C x={x} y={y}>{
      (z) => {}
    }</C>
  }</B>
}</A>

This is hideous. Because there is no ability for sibling calls to pass data, B must go inside A, or the other way around. This introduces a false data dependency. But, the data flow graph does match the normal execution trace: in code we also have to decide whether to call A or B first, even if it doesn’t matter, unless you explicitly parallelize.

It’s interesting that a render prop is an injectable lambda which returns new nodes to mount in the tree. Unlike a “scriptable node”, this allows the tree to extend itself on the fly in a Turing-complete yet data-driven way.

So don’t think of a reconciler as a tree differ. Think of it as a generalized factory which maintains a perfectly shaped tree of caches called fibers for you. You never need to manually init() or dispose() them… and if you re-run the code in a slightly different way, the fibers that can be reused will be reused. The value proposition should be pretty clear.

When we made the fiber visible as an argument, much of the magic went away: a React Component is merely a function (fiber) => (props) => DeferredCall. The first argument is implicit, binding it to a unique, persistent fiber for its entire lifetime. The fiber is keyed off the actual call site and the depth in the stack. The hooks can work because they just reserve the next spot in the fiber as each one is called. This is why hooks must be called unconditionally in React: it’s the only way to keep the previous and next states in sync.
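
The slot mechanism can be sketched in a few lines — illustrative only, not React's actual code, and assuming a fiber shaped like {slots: [], pointer: 0}. Each render resets a pointer, and each hook call claims the next slot; skip a hook conditionally and every later hook ends up reading some other hook's state:

let render = (fiber, component, props) => {
  fiber.pointer = 0;                  // start over at slot 0 on every render
  return component(props);
};

let useSlot = (fiber) => {
  let i = fiber.pointer++;
  if (fiber.slots[i] === undefined) fiber.slots[i] = {};
  return fiber.slots[i];              // same call order, same slot, same state
};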

Where we go next is hopefully clear: what if you could use these patterns for things that aren’t UI widgets in a tree? Could you retain not just the Effect-like nature, but also the memoization properties? Also, without hideous code? That would be pretty nice.
