[WIP] Remove Distribute effect. #516
Do not merge this until we’re confident it won’t cause performance regressions in clients.
This removes the `Distribute` effect, which was problematic. Read on to find out why. You can also read @lexi-lambda's blog post, which explains this in the context of the `MonadBaseControl` ecosystem, which I won't mention in the following.

Consider this age-old situation: given a function `fn` and a container `list`, we want to apply this function to every element in the container in such a way that it uses all available computational resources. Put another way, given ``fn `parallelMapM` list``, we expect its evaluation time to be bounded by whatever element `e` in `list` entailed the longest computational time when applied to `fn`. Formally stated, we anticipate the existence of a sequential-map function of complexity O(n), and a parallel-map function of complexity Ω(fn).

This is a pressing need for any real-world programming language, and indeed it is satisfied in Haskell's `IO`, with `traverse` and `mapConcurrently`. Evaluating this code will print each character in `"abcdef"` in arbitrary order:
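For instance, a minimal sketch using `mapConcurrently` from the `async` package:

```haskell
import Control.Concurrent.Async (mapConcurrently)

-- A minimal sketch: each print runs on its own thread, so the characters
-- appear in a different interleaving from run to run.
main :: IO ()
main = () <$ mapConcurrently print "abcdef"
```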
The concurrency here is observable thanks to the `print` statement, and powered by the special support GHC has for its `IO` type. Once we get outside of `IO`, things start getting a little more interesting.

It's worth thinking about the cases when parallelism isn't observable. Because Haskell is a lazy language, the statement `map (+1) [1,2,3]`, in isolation, doesn't produce any computation, much less any concurrency: to get concurrency, we need to invoke functions on these values in a context (like `IO` or `STM`) where we can observe, thanks to side effects, that a given action has been invoked.

However, if our program behavior is unaffected by whether a given `map` operation is parallel or serial, the `parallel` library can come in handy. It provides combinators that use behind-the-scenes magic to tell GHC "hey, I've mapped `fn` over `list`: if you access any such application inside `list`, you might want to start evaluating the others in parallel, because I expect you to compute them ASAP."
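For example, a sketch using an evaluation strategy from `parallel` (here `expensive` is just a stand-in for real work):

```haskell
import Control.Parallel.Strategies (parTraversable, rseq, using)

-- A stand-in for some genuinely costly pure function.
expensive :: Int -> Integer
expensive n = sum [1 .. fromIntegral n * 100000]

-- Annotating the mapped list with a strategy lets GHC evaluate the
-- elements in parallel without changing the meaning of the program.
results :: [Integer]
results = map expensive [1, 2, 3] `using` parTraversable rseq

main :: IO ()
main = print results
```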
It is atop these combinators (specifically `parTraversable`) that we built a `Distribute` effect, providing `distributeFor` and `distributeFoldMap` functions:
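Roughly, those helpers had the following shape (the definitions below are illustrative stand-ins written in terms of `traverse`; the real versions dispatch through the `Distribute` effect and its carrier):

```haskell
import Data.Foldable (fold)

-- | Distribute an action over every element of a container, collecting the
-- results. (Illustrative stand-in, not the real carrier-based definition.)
distributeFor :: (Applicative m, Traversable t) => t a -> (a -> m b) -> m (t b)
distributeFor container fn = traverse fn container

-- | Map-reduce: distribute an action over a container and combine the
-- results monoidally. (Illustrative stand-in.)
distributeFoldMap :: (Applicative m, Traversable t, Monoid b) => (a -> m b) -> t a -> m b
distributeFoldMap fn container = fold <$> traverse fn container
```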
(Side note: `distributeFoldMap` is worth mentioning, because it's a long name for a storied concept: map-reduce. Indeed, `foldMap` is so universal—it's one of the fundamental methods on `Foldable`, from which all other `Foldable` methods descend—because it describes the tremendously common pattern of map-reduce with a monoidal, associative function.)

So, ideally, we should be able to just use this effect and everything should be hunky-dory, right? Wrong. I said earlier that the combinators provided by `parallel` only take effect in cases where the parallelism isn't observable. Thus, the following code using `Distribute`'s combinators doesn't do what you think it might:
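Something along these lines, reusing the illustrative `distributeFor` from the sketch above rather than the real effect:

```haskell
-- Reusing the illustrative distributeFor above: the container is traversed
-- and sparks may fire, but the IO actions themselves still run one after
-- another, so the output order never varies between runs.
main :: IO ()
main = () <$ distributeFor "abcdef" print
```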
This function constructs the computation over `"abcdef"` in parallel, but evaluates it sequentially; as a result, we get the characters printed in the same order every time, in contrast to the `mapConcurrently` example above.

Clearly, this is a bug: if `distributeFor` can't actually distribute its computations, then it's not useful for anything. We get a small bit of parallelism out of the cases where we save time by constructing computations in parallel, despite evaluating them sequentially, but it's not real concurrency: the same project, evaluated in Semantic, will produce its observable results in the same way, every time.

This raises the question: why not just use a `fused-effects`-style wrapper for `mapConcurrently`? The `Lift` effect provides, with the `liftWith` function, an API that on its surface is powerful enough to do the lifting/unlifting required to promote `mapConcurrently` to this:
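That is, a lifted signature along these lines (the constraint vocabulary in the comment is an illustrative assumption, not an API the library actually exposes):

```haskell
import qualified Control.Concurrent.Async as Async

-- The mapConcurrently from the async package is pinned to IO:
mapConcurrentlyIO :: Traversable t => (a -> IO b) -> t a -> IO (t b)
mapConcurrentlyIO = Async.mapConcurrently

-- What we would like is a variant usable from any carrier that can reach
-- IO, morally something like (constraints are an illustrative assumption):
--
--   mapConcurrently :: (Traversable t, Member (Lift IO) sig, Carrier sig m)
--                   => (a -> m b) -> t a -> m (t b)
```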
In practice, this is not the case, because `liftWith` makes us handle our monadic state explicitly. Each invocation of `liftWith` requires that we tell `fused-effects` how to propagate updates to monadic state, even if the lifting function does something weird like call `catch` or `throw`. Other languages don't have to worry about this, because they're always stuck in IO; in Haskell, any given effect might be in a pure or impure context, and since we might not be in `IO`, we can't rely on global mutable state to save and update monadic state as necessary. As such, we simulate stateful computations by using carrier monads of the form `s -> m (s, a)`: functions that take a state and return a monadic pair of newly-updated state and return value. The accumulated set of these stateful variables is called a "context", and to use `liftWith` we have to indicate, before and after a lifting process, what we are doing with the context. This poses a problem for each call to `mapConcurrently`: if we map a monadic function over each element of a given list, how do we ensure that the state changes produced by that function are visible to future invocations?
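For reference, a simplified rendition of that carrier shape (the real definitions in `fused-effects` carry more machinery):

```haskell
-- A state carrier: a computation that takes the current state and returns
-- the updated state alongside the result, inside the underlying monad m.
newtype StateC s m a = StateC { runStateC :: s -> m (s, a) }

-- For example, a computation that reads and bumps an Int counter.
tick :: Applicative m => StateC Int m Int
tick = StateC (\n -> pure (n + 1, n))
```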
Because `fused-effects` does not allow us to constrain the type of `mapConcurrently` so that we could provide vocabulary for this use case, we can't express it with just `liftWith`. This has been a point of investigation for some time. We are, however, able to access a more type-restricted version:

UNDER CONSTRUCTION