Around 2016-17, my colleagues at Academia Sinica invited me to join their project on reasoning about Spark, a platform for distributed computation. The aim was to build a Haskell model of Spark and answer questions like “under what conditions is this Spark aggregation deterministic?” Being a model of distributed computation, Spark is intrinsically non-deterministic. To model non-determinism properly, I thought, I had to use monads.

That was how I started to take an interest in reasoning about and deriving monadic programs. In the years since, I have collaborated with many nice people, managed to get some results published, failed to publish some work I personally like, and am still working on some interesting small problems. This post summarises what has been done, and what remains to be done.

## Non-determinism

Prior to that, all the program reasoning I had done was restricted to pure programs. They are beautiful mathematical expressions suitable for equational reasoning, while effectful programs are the awkward squad not worthy of rigorous treatment — so I thought, and I could not have been more wrong! It turns out that there is plenty of fun reasoning one can do with monadic programs. The rule of the game is that you do not know how the monad you are working with is implemented, and thus rely only on the monad laws:

```
return x >>= f = f x
m >>= return = m
(m >>= f) >>= g = m >>= (\x -> f x >>= g)
```

and the laws of the effect operators. For the non-determinism monad we usually assume two operators: `0` for failure, and `(|)` for non-deterministic choice (usually denoted by `mzero` and `mplus` of the type class `MonadPlus`). It is usually assumed that `(|)` is associative with `0` as its identity element, and that they interact with `(>>=)` by the following laws:

```
0 >>= f = 0                                           (left-zero)
(m1 | m2) >>= f = (m1 >>= f) | (m2 >>= f)             (left-distr.)
m >> 0 = 0                                            (right-zero)
m >>= (\x -> f1 x | f2 x) = (m >>= f1) | (m >>= f2)   (right-distr.)
```
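As a quick sanity check (my addition, not part of the original text), the list monad is a non-determinism monad in which these laws can be observed directly; here is left-distributivity on a small example:

```haskell
import Control.Monad (mzero, mplus)

-- The list monad as a non-determinism monad:
-- [] plays the role of 0, mplus (i.e. (++)) the role of (|).
f :: Int -> [Int]
f x = [x, x * 10]

lhs, rhs :: [Int]
lhs = ([1, 2] `mplus` [3]) >>= f           -- (m1 | m2) >>= f
rhs = ([1, 2] >>= f) `mplus` ([3] >>= f)   -- (m1 >>= f) | (m2 >>= f)
-- both evaluate to [1,10,2,20,3,30], confirming left-distributivity;
-- similarly mzero >>= f is [], confirming left-zero
```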

The four laws are respectively named *left-zero*, *left-distributivity*, *right-zero*, and *right-distributivity*, about which we will say more later. These laws are sufficient for proving quite a lot of interesting properties of the non-determinism monad, as well as properties of Spark programs. I find it very fascinating.

Unfortunately, it turned out that monads were too heavy a machinery for the target readers of the Spark paper. The version we eventually published in NETYS 2017 [CHLM17] consists of pure-looking functional programs that occasionally use “non-deterministic functions” in an informal, but probably more accessible, way. Ondřej Lengál should be credited for most, if not all, of the proofs. My proofs using the non-determinism monad were instead collected in a tech. report [Mu19a]. (Why a tech. report? We will come to this later.)

## State and Non-determinism

Certainly, it would be more fun if, besides non-determinism, more effects were involved. I have also been asking myself: rather than proving properties of given programs, can I *derive* monadic programs? For example, is it possible to start from a non-deterministic specification, and derive a program solving the problem using states?

The most obvious class of problems that involve both non-determinism and state is backtracking programs. Thus I tried to tackle a problem previously dealt with by Jeremy Gibbons and Ralf Hinze [GH11], the `n`-Queens problem: placing `n` queens on an `n` by `n` chess board in such a way that no queen can attack another. The specification non-deterministically generates all chess arrangements, before filtering out the safe ones. We wish to derive a backtracking program that remembers the currently occupied diagonals in a state monad.

Jeremy Gibbons suggested generalising the problem a bit: given a problem specification in terms of a non-deterministic `scanl`, is it possible to transform it into a non-deterministic *and* stateful `foldr`?
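For concreteness, the `n`-Queens specification can be rendered in the list monad roughly as follows. This is my own sketch, not the code from the papers; `safe` is a hypothetical helper that checks the diagonals (one queen per row and per column is guaranteed by using a permutation).

```haskell
import Control.Monad (mzero)
import Data.List (permutations, tails)

-- A board is a permutation of columns, one queen per row, so rows and
-- columns never clash; only the diagonals need checking.
safe :: [Int] -> Bool
safe qs = and [ abs (r1 - r2) /= abs (c1 - c2)
              | ((r1, c1) : rest) <- tails (zip [0 ..] qs)
              , (r2, c2) <- rest ]

-- The specification: generate all arrangements, keep the safe ones.
queens :: Int -> [[Int]]
queens n = do
  qs <- permutations [0 .. n - 1]        -- all chess arrangements
  if safe qs then return qs else mzero   -- filter out the unsafe ones
```

For instance, `length (queens 4)` is 2, the two classic solutions of the 4-Queens puzzle.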

Assuming all the previous laws and, in addition, the laws about `get` and `put` of the state monad (the same as those assumed by Gibbons and Hinze [GH11], omitted here), I managed to come up with some general theorems for such transformations.

The interaction between non-determinism and state turned out to be intricate. Recall the *right-zero* and *right-distributivity* laws:

```
m >> 0 = 0                                            (right-zero)
m >>= (\x -> f1 x | f2 x) = (m >>= f1) | (m >>= f2)   (right-distr.)
```

While they do not explicitly mention state at all, in the presence of state these two laws imply that *each non-deterministic branch has its own copy of the state*. In the *right-zero* law, if a computation fails, it just fails: all state modifications in `m` are forgotten. In *right-distributivity*, the two `m`s on the RHS each operate on their local copy of the state, so locally it appears that the side effects in `m` happen only once.

We call a non-deterministic state monad satisfying these laws a *local state* monad. A typical example is `M a = S -> List (a, S)`, where `S` is the type of the state — modulo order and repetition in the list monad, that is. The same monad can be constructed by `StateT s (ListT Identity)` in the Monad Transformer Library. With effect handling [KI15], we get the desired monad if we run the handler for state before that for list.
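A small experiment with `StateT Int []` from the `transformers` package (my addition) makes the branch-local state visible: a `put` in one branch is invisible to the next.

```haskell
import Control.Monad (mplus)
import Control.Monad.Trans.State (StateT (..), get, put)

-- StateT Int [] is essentially S -> List (a, S): each branch threads
-- its own copy of the state, so the put in the first branch does not
-- affect what the second branch sees.
branches :: StateT Int [] Int
branches = (put 1 >> get) `mplus` get

-- runStateT branches 0 yields [(1, 1), (0, 0)]: the second branch
-- still observes the initial state 0.
```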

The local state monad is the ideal combination of non-determinism and state we would like to have. It has nice properties, and is much more manageable. However, there are practical reasons why one may want a state to be shared globally, for example when the state is a large array that is costly to copy. Typically one then uses operations that explicitly “roll back” the global state to its previous configuration at the end of each non-deterministic branch.

Can we reason about programs that use a global state?

## Global State

The non-determinism monad with a global state turns out to be a weird beast to tame.

While we are concerned with what laws a monad satisfies, rather than how it is implemented, we digress a little and consider how to implement a global state monad, just to see the issues involved. By intuition one might guess `M a = S -> (List a, S)`, but that is not even a monad — the direct but naive implementation of its `(>>=)` does not satisfy the monad laws! The type `ListT (State s)` generated using the Monad Transformer Library expands to essentially the same implementation, and is flawed in the same way (but the authors of MTL do not seem to have bothered fixing it). For correct implementations, see the discussions on the Haskell wiki. With effect handling [KI15], we do get a monad by running the handler for list before that for state.

Assume that we do have a correct implementation of a global state monad. What can we say about it? We no longer have the *right-zero* and *right-distributivity* laws, but *left-zero* and *left-distributivity* still hold. For now we settle for an informal, intuitive understanding of the semantics: a global state is shared among non-deterministic branches, which are executed left-to-right. We will need more laws to, for example, formally specify what we mean by “the state is shared”. This will turn out to be tricky, so for now we postpone it and proceed by illustration.

In backtracking algorithms that keep a global state, it is a common pattern to

- update the current state to its next step,
- recursively search for solutions, and
- roll back the state to the previous step.

To implement such a pattern as a monadic program, one might come up with something like the code below:

```
modify next >> search >>= modReturn prev
```

where `next` advances the state, `prev` undoes the modification of `next`, and `modify` and `modReturn` are defined by:

```
modify f = get >>= (put . f)
modReturn f v = modify f >> return v
```

Let the initial state be `st` and assume that `search` finds three choices `m1 | m2 | m3`. The intention is that `m1`, `m2`, and `m3` all start running with state `next st`, and that the state is restored to `prev (next st) = st` afterwards. By *left-distributivity*, however,

```
  modify next >> (m1 | m2 | m3) >>= modReturn prev
= modify next >> ( (m1 >>= modReturn prev) |
                   (m2 >>= modReturn prev) |
                   (m3 >>= modReturn prev) )
```

which, with a global state, means that `m2` starts with state `st`, after which the state is rolled back too early to `prev st`. The computation `m3` starts with `prev st`, after which the state is rolled back too far, to `prev (prev st)`.
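To see this failure concretely, here is a small deep embedding of my own (not from the text) in which left-distributivity holds by construction, while `runP` implements the informal global-state semantics: branches run left-to-right sharing one state. Running the pattern above with `next = (+ 1)` and `prev = subtract 1` from initial state `0`, and letting each branch report the state it starts with, exhibits exactly the wrong states described.

```haskell
import Control.Monad (ap, liftM)

-- A deep embedding of non-determinism + state. Bind pushes the
-- continuation into both branches of Or, so left-distributivity
-- holds by construction.
data P s a = Ret a | Get (s -> P s a) | Put s (P s a)
           | Fail | Or (P s a) (P s a)

instance Functor (P s) where
  fmap = liftM
instance Applicative (P s) where
  pure  = Ret
  (<*>) = ap
instance Monad (P s) where
  Ret x   >>= k = k x
  Get g   >>= k = Get (\s -> g s >>= k)
  Put s p >>= k = Put s (p >>= k)
  Fail    >>= _ = Fail
  Or p q  >>= k = Or (p >>= k) (q >>= k)   -- left-distributivity

getP :: P s s
getP = Get Ret
putP :: s -> P s ()
putP s = Put s (Ret ())
modifyP :: (s -> s) -> P s ()
modifyP f = getP >>= putP . f
modReturnP :: (s -> s) -> a -> P s a
modReturnP f v = modifyP f >> return v

-- Global-state semantics: branches run left-to-right, sharing one state.
runP :: P s a -> s -> ([a], s)
runP (Ret x)    s = ([x], s)
runP (Get g)    s = runP (g s) s
runP (Put s' p) _ = runP p s'
runP Fail       s = ([], s)
runP (Or p q)   s = let (xs, s1) = runP p s
                        (ys, s2) = runP q s1
                    in (xs ++ ys, s2)

-- Each branch reports the state it starts with.
demo :: ([Int], Int)
demo = runP (modifyP (+ 1) >>
             (getP `Or` getP `Or` getP) >>= modReturnP (subtract 1)) 0
-- demo = ([1, 0, -1], -2): only the first branch sees next st = 1,
-- and the final state is prev (prev st) = -2 rather than st = 0.
```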

## Nondeterministic Choice as Sequencing

We need a way to say that “`modify next` and `modReturn prev` are run exactly once, respectively before and after all non-deterministic branches in `solve`.” Fortunately, we have discovered a curious technique. Since non-deterministic branches are executed sequentially, the program

```
(modify next >> 0) | m1 | m2 | m3 | (modify prev >> 0)
```

executes `modify next` and `modify prev` once, respectively before and after all the non-deterministic branches, even if they fail. Note that `modify next >> 0` does not generate a result; its presence is merely for the side effect of `modify next`.

The reader might wonder: now that we are using `(|)` as a sequencing operator, does it simply coincide with `(>>)`? Recall that we still have left-distributivity and, therefore, `(m1 | m2) >> n` equals `(m1 >> n) | (m2 >> n)`. That is, `(|)` marks “insertion points” into which future code, composed with `(>>)`, can be inserted! This is certainly a dangerous feature, whose undisciplined use can lead to chaos.

To be slightly more disciplined, we can go a bit further by defining the following variation of `put`, which restores the original state when it is backtracked over:

```
putR s = get >>= (\s0 -> put s | (put s0 >> 0))
```

To see how it works, assume that some computation `comp` follows `putR s`. By left-distributivity we get:

```
  putR s >> comp
= (get >>= (\s0 -> put s | (put s0 >> 0))) >> comp
=   { monad laws, left-distr., left-zero }
  get >>= (\s0 -> (put s >> comp) |
                  (put s0 >> 0))
```

Therefore, `comp` runs with the new state `s`. After it finishes, the original state `s0` is restored.

The hope is that, by replacing all `put`s with `putR`, we can program as if we were working with local states, while there is actually a shared global state.

(I later learned that Tom Schrijvers had developed similar and more complete techniques, in the context of simulating Prolog boxes in Haskell.)

## Handling Local State with Global State

Such was the idea. I had to find out what laws are sufficient to formally specify the behaviour of a global state monad (note that the discussion above has been informal), and make sure that there exists a model/implementation satisfying these laws.

I prepared a draft paper containing proofs about Spark functions using the non-determinism monad, a derivation of backtracking algorithms solving problems including `n`-Queens using a local state monad, and, after proposing laws a global state monad should satisfy, a derivation of another backtracking algorithm using a shared global state. I submitted the draft, and also sent it to some friends for comments. Very soon, Tom Schrijvers wrote back and warned me: the laws I proposed for the global state monad could not be true!

I quickly withdrew the draft, and invited Tom Schrijvers to collaborate and fix the issues. Together with Koen Pauwels, they carefully figured out what the laws should be, showed that the laws are sufficient to guarantee that one can simulate local state using a global state (in the context of effect handling), showed that there exists a model/implementation of the monad, and verified the key theorems in Coq. That resulted in the paper *Handling local state with global state* [PSM19], which we published in MPC 2019.

The paper is about semantical concerns of the local/global state interaction. I am grateful to Koen and Tom, who deserve credit for most of the hard work — without their help the paper could not have been completed. The backtracking algorithm, meanwhile, became a motivating example that was only briefly mentioned.

## Tech. Reports

I was still holding out hope that my derivations could be published in a conference or journal, until I noticed, by chance, a submission to MPC 2019 by Affeldt et al. [ANS19]. They formalised a hierarchy of monadic effects in Coq and, for demonstration, needed examples of equational reasoning about monadic programs. They somehow found the draft that was previously withdrawn, and corrected some of its errors. I am still not sure how that happened — I might have put the draft on my web server to communicate with my students, and somehow it showed up in the search engine. The file name was `test.pdf`. And that was how the draft was cited!

“Oh my god,” I thought in horror, “please do not cite an unfinished work of mine, especially when it is called `test.pdf`!”

I quickly wrote to the authors, thanked them for noticing the draft and finding errors in it, and said that I would turn it into tech. reports, which they could cite more properly. That resulted in two tech. reports: *Equational reasoning for non-determinism monad: the case of Spark aggregation* [Mu19a], containing my proofs of Spark programs, and *Calculating a backtracking algorithm: an exercise in monadic program derivation* [Mu19b], containing the derivation of backtracking algorithms.

## Pointwise Relational Program Calculation

There are plenty of potentially interesting topics one can pursue in monadic program derivation. For one, people have been suggesting pointwise notations for relational program calculation (e.g. de Moor and Gibbons [dMG00], Bird and Rabe [BR19]). I believe that monads offer a good alternative: plenty of relational program calculation can be carried out in terms of the non-determinism monad. Program refinement can be defined by

`m1 ⊆ m2 ≡ m1 | m2 = m2`

This definition applies to monads having other effects too. I have a draft demonstrating the idea with quicksort. Sorting is specified by a non-determinism monad returning a permutation of the input that is sorted — when the ordering is not anti-symmetric, there can be more than one way to sort a list, and therefore the specification is non-deterministic. From the specification, one can derive pure quicksort on lists, as well as a quicksort that mutates an array. Let us hope I have better luck publishing it this time.
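As a tiny illustration (mine, for the list monad viewed modulo order and repetition): under that reading, the refinement `m1 ⊆ m2` is just inclusion of possible outcomes.

```haskell
import Data.List (nub, sort)

-- m1 ⊆ m2 iff m1 | m2 = m2. With lists taken modulo order and
-- repetition, (|) is (++) and the equation becomes set equality,
-- i.e. set inclusion of outcomes.
refines :: Ord a => [a] -> [a] -> Bool
refines m1 m2 = norm (m1 ++ m2) == norm m2
  where norm = sort . nub

-- return 2 refines the non-deterministic "pick any element of [1,2,3]":
-- refines [2] [1,2,3] is True, while refines [4] [1,2,3] is False.
```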

With Kleisli composition, there is even a natural definition of factors. Lifting `(⊆)` to functions (that is, `f ⊆ g ≡ (∀ x : f x ⊆ g x)`), and recalling that `(f >=> g) x = f x >>= g`, the left factor `(\)` can be specified by the Galois connection:

`(f >=> g) ⊆ h ≡ g ⊆ (f \ h)`

That is, `f \ h` is the most non-deterministic (least constrained) monadic program that, when run after the postcondition set up by `f`, still meets the result specified by `h`.

If, in addition, we have a proper notion of *converses*, I believe that plenty of optimisation problems can be specified and solved using calculation rules for factors and converses. These directions are worth exploring.

## References

[**ANS19**] Reynald Affeldt, David Nowak and Takafumi Saikawa. A hierarchy of monadic effects for program verification using equational reasoning. In *Mathematics of Program Construction (MPC)*, Graham Hutton, editor, pp. 226-254. Springer, 2019.

[**BR19**] Richard Bird, Florian Rabe. How to calculate with nondeterministic functions. In *Mathematics of Program Construction (MPC)*, Graham Hutton, editor, pp. 138-154. Springer, 2019.

[**CHLM17**] Yu-Fang Chen, Chih-Duo Hong, Ondřej Lengál, Shin-Cheng Mu, Nishant Sinha, and Bow-Yaw Wang. An executable sequential specification for Spark aggregation. In *Networked Systems (NETYS)*, pp. 421-438. 2017.

[**GH11**] Jeremy Gibbons, Ralf Hinze. Just do it: simple monadic equational reasoning. In *International Conference on Functional Programming (ICFP)*, pp 2-14, 2011.

[**KI15**] Oleg Kiselyov, Hiromi Ishii. Freer monads, more extensible effects. In *Symposium on Haskell*, pp 94-105, 2015.

[**dMG00**] Oege de Moor, Jeremy Gibbons. Pointwise relational programming. In *Algebraic Methodology and Software Technology (AMAST)*, Teodor Rus, editor, pp. 371-390. Springer, 2000.

[**Mu19a**] Shin-Cheng Mu. Equational reasoning for non-determinism monad: the case of Spark aggregation. Tech. Report TR-IIS-19-002, Institute of Information Science, Academia Sinica, June 2019.

[**Mu19b**] Shin-Cheng Mu. Calculating a backtracking algorithm: an exercise in monadic program derivation. Tech. Report TR-IIS-19-003, Institute of Information Science, Academia Sinica, June 2019.

[**PSM19**] Koen Pauwels, Tom Schrijvers and Shin-Cheng Mu. Handling local state with global state. In *Mathematics of Program Construction (MPC)*, Graham Hutton, editor, pp. 18-44. Springer, 2019.