As part II of the past semester’s PL Seminar, I have implemented probablistically balanced tries in adapton.rust, a general-purpose incremental computation (I.C.) library for Rust. I.C. presents a different programming paradigm to get used to, but can yield big performance improvements without needing to implement dynamic programming algorithms that become harder to reason about. You can see MIXY for my post/project on part I of the seminar.

Details on the theory behind Adapton.Rust’s current implementation can be found in the paper Incremental Computation with Names. Adapton.Rust implements what the authors call nominal matching, giving names to computations for reuse. I’ll go over some of the details before presenting probablistic tries, and then explaining how to use and benchmark my implementation.


Named Computation

Adapton and previous works on incremental computation aim to reuse as much computation as possible after changes to input, improving upon naive memoization. In Adapton, we attempt to reuse sub-computations, such as results from similar tails of different input lists. So, mapping f over [2;3;4;5] first and then over [1;2;3;4;5], Adapton will reuse the entire result on the common tail [2;3;4;5], rather than failing a memo lookup on the original input and performing a full recomputation.

Nominal Adapton take this a step further. Though original Adapton was able to reuse the common tail above, incremental computations on lists are limited to only reusing computations over similar tails. This is because structurally, all prefix cons cells “depend” on the entire tail (Figure 1.a in the paper has a great depiction of how an insertion into the middle of a list “dirties” the entire prefix, triggering the need for recomputation of those values shown in Figure 1.b). This behavior is a product of structural matching to determine whether values can be reused, or must be recomputed. We introduce nominal matching to avoid situations such as these, enabling reuse of more computations.

In nominal Adapton, we associate external “names” with computations. If a future call dirties a node in the demanded computation graph, we can match on the name rather than structure to determine reusability. This results in a constant-time equality check, and greater opportunity for reuse!

In code, this means we wrap input in named articulation points before extending data structures.

    fn push_input(i: usize, t: Trie<usize>) -> Trie<usize> {
        let t = Trie::art(cell(name_of_usize(i), t));
        let t = Trie::name(name_of_usize(i), t);
        Trie::extend(name_unit(), t, i)
    }

    fn push_list(i: usize, l: List<usize>) -> List<usize> {
        let l = List::art(cell(name_of_usize(i), l));
        let l = List::name(name_of_usize(i), l);
        List::cons(i, l)
    }

Adapton can still perform structural matching (try wrapping dcg interactions with the structural(...) function, or running the executable with ADAPTON_STRUCTURAL=1), but programming by naming sub-structures is what enables these speedups.


Probablistic Tries

Probablistic Tries are incremental data structures inspired by probablistically balanced trees. In Adapton, we can use tries to represent sets and finite maps.

Intuitively, tries are binary trees whose nodes are named and whose leaves hold data. Tries use bitstrings (see bitstring.rs for implementation) to represent both the data element (via its hash), and the path to traverse a trie in order to retrieve it (if the element is present). Because hashes describe the path to extract elements, similar inputs in tries yield similarly structured paths. Inserting elements affects little change to the structure; paths only change after insertions if the depth of a trie must grow, and only paths on the grown sub-trie are dirtied. If we memoize named computations in a fold over the structure of a trie, future folds can reuse all unchanged subcomputations.

pub fn trie_fold
    <X, T:TrieElim<X>, Res:Hash+Debug+Eq+Clone+'static, F:'static>
    (t: T, res:Res, f: Rc<F>) -> Res
    where F: Fn(X, Res) -> Res {
    T::elim_arg(t,
                (res, f),
                |_, (arg, _)| arg,
                |_, x, (arg, f)| f(x, arg),
                |_, left, right, (arg, f)| trie_fold(right, trie_fold(left, arg, f.clone()), f),
                |_, t, (arg, f)| trie_fold(t, arg, f),
                |nm, t, (arg, f)| memo!(nm =>> trie_fold, t:t, res:arg ;; f:f))
}

The memo!(nm =>> trie_fold, t:t, res:arg ;; f:f) syntax memoizes recursive trie_fold computations on named sub-tries. If we use nominal matching, these points result in O(1)-time memo lookups, and under structural matching these points are where we compare for structural equality.


Building and Testing

My implementation of tries has been merged into upstream adapton.rust, and can be found on the dev branch. To check out the behavior on your own machine, install the nightly release of rustc and cargo, build and run the tests/benchmarks following the instructions below.

$ curl https://sh.rustup.rs -sSf | sh -s -- --default-toolchain nightly
# Make sure you are on the dev branch
$ git clone https://github.com/baxtersa/adapton.rust
$ cd adapton.rust/
# Build the source
$ cargo build
# Build and run tests
$ cargo test
# Remove a benchmark file that breaks the benchmark build (I should fix this)
$ rm benches/benches.rs
# Build and run benchmarks
$ cargo bench tries_bench


Benchmarks

Some common list-computations perform better over tree-like structures in Adapton. fold is an example of this, as the accumulator carries dependencies on all previous computations, i.e. the accumulator at step i depends on the entire list prefix 0..i-1, demanding recomputation of each step on changes to the input. Tree-like structures improve upon this by expressing independence of sub-problems, so changing a leaf only dirties the path up to the root, rather than all leaves traversed prior.

My performance benchmark compares fold over trees and tries. In this benchmark, we build an input out of the sequence 1..100. At each iteration of building the input, we fold sum over the current iteration of the structure. benchmark_dcg_* uses nominal matching and the demanded computation graph to yeild large performance speedups over benchmark_naive_* tests, that perform recomputation.

Running cargo bench should show some comparisons of this behavior:

running 4 tests
test tree_benchmarks::benchmark_dcg_tree   ... bench:       2,160 ns/iter (+/- 322)
test tree_benchmarks::benchmark_naive_tree ... bench:      66,686 ns/iter (+/- 201)
test trie_input::benchmark_dcg_trie        ... bench:       1,968 ns/iter (+/- 97)
test trie_input::benchmark_naive_trie      ... bench:      16,378 ns/iter (+/- 3,020)

For comparison, run ADAPTON_STRUCTURAL=1 cargo bench to show the performance of structural matching:

running 4 tests
test tree_benchmarks::benchmark_dcg_tree   ... bench:       1,604 ns/iter (+/- 91)
test tree_benchmarks::benchmark_naive_tree ... bench:      66,219 ns/iter (+/- 1,465)
test trie_input::benchmark_dcg_trie        ... bench:       9,717 ns/iter (+/- 95)
test trie_input::benchmark_naive_trie      ... bench:      16,479 ns/iter (+/- 2,906)

Using nominal matching, trees and tries perform roughly equivalent under the dcg engine. This makes sense, because we are folding over similar tree structures, and reusing named points at binary nodes. Interestingly, naive computation is much more performant for tries in this benchmark. My guess is that the tries are much smaller than the trees tested, but haven’t looked into it too much.

You can see that structural matching performs better than naive recomputation for both trees and tries, but tries perform significantly better under nominal matching. This shows the benefits of memoizing names rather than using structural comparison.

I should note that for folding the sum of an input is significantly faster using Rust native structures like vectors, than the benchmark above. This may be a combination of optimizations and the overhead of Adapton’s dcg engine outweighing the benefits of reuse for cheap integer arithmetic. I’d expect the benchmarks to look better compared to naive Rust vectors on more expensive computations, and future incrementalset and graph operations should be coming soon. For now, this is still a hunch I’d like to verify before incorporating I.C. with adapton.rust into any personal projects (of which I have none).


Development

See here for the latest changes upstream.

  • dev contains latest development (including tries!)
  • master contains “stable” features


See here for the discussion on my PR merging tries upstream.