[arXiv:2101.09699]

The post Longest segment of balanced parentheses — an exercise in program inversion in a segment problem appeared first on niche computing science.

Given a string of parentheses, the task is to find a longest consecutive segment that is properly bracketed. We find it an interesting problem because it involves two techniques: the usual approach for solving segment problems, and the converse-of-a-function theorem — through which we derived an instance of shift-reduce parsing.

[arXiv:2101.09700 | Haskell Code | Agda Proofs]

The post A greedy algorithm for dropping digits appeared first on niche computing science.

Consider the puzzle: given a number, remove k digits such that the resulting number is as large as possible. Various techniques were employed to derive a linear-time solution to the puzzle: predicate logic was used to justify the structure of a greedy algorithm, a dependently-typed proof assistant was used to give a constructive proof of the greedy condition, and equational reasoning was used to calculate the greedy step as well as the final, linear-time optimisation.

[PDF]

The post Not by equations alone: reasoning with extensible effects appeared first on niche computing science.

The challenge of reasoning about programs with (multiple) effects such as mutation, jumps or IO dates back to the inception of program semantics in the works of Strachey and Landin. Using monads to represent individual effects and the associated equational laws to reason about them proved exceptionally effective. Even then it is not always clear what laws are to be associated with a monad — for a good reason, as we show for non-determinism. Combining expressions using different effects brings challenges not just for monads, which do not compose, but also for equational reasoning: the interaction of effects may invalidate their individual laws, as well as induce emerging properties that are not apparent in the semantics of individual effects. Overall, the problems are judging the adequacy of a law; determining if or when a law continues to hold upon addition of new effects; and obtaining and easily verifying emergent laws.

We present a solution relying on the framework of (algebraic, extensible) effects, which has already proved itself for writing programs with multiple effects. Equipped with a fairly conventional denotational semantics, this framework proves useful, as we demonstrate, also for reasoning about and optimizing programs with multiple interacting effects. Unlike the conventional approach, equational laws are not imposed on programs/effect handlers, but induced from them: our starting point hence is a program (model), whose denotational semantics, besides being used directly, suggests and justifies equational laws and clarifies side-conditions. The main technical result is the introduction of the notion of *equivalence modulo handlers* ("modulo observation") or a particular combination of handlers — and proving it to be a *congruence*. It is hence usable for reasoning in any context, not just evaluation contexts — provided particular conditions are met.

Concretely, we describe several realistic handlers for non-determinism and elucidate their laws (some of which hold in the presence of any other effect). We demonstrate appropriate equational laws of non-determinism in the presence of global state, which have been a challenge to state and prove before.

[PDF | Agda Proofs]

The post Declarative pearl: deriving monadic quicksort appeared first on niche computing science.

To demonstrate derivation of monadic programs, we present a specification of sorting using the non-determinism monad, and derive pure quicksort on lists and state-monadic quicksort on arrays. In the derivation one may switch between point-free and pointwise styles, and deploy techniques familiar to functional programmers such as pattern matching and induction on structures or on sizes. Derivation of stateful programs resembles reasoning backwards from the postcondition.

The post How to Compute Fibonacci Numbers? appeared first on niche computing science.

Let `Nat` be the type of natural numbers. We shall all be familiar with the following definition of the Fibonacci numbers:

```
fib :: Nat -> Nat
fib 0 = 0
fib 1 = 1
fib (n+2) = fib (n+1) + fib n
```

(When defining functions on natural numbers I prefer to see `0` and `(+1)` (and thus `(+2) = (+1) . (+1)`) as constructors that can appear on the LHS, while avoiding subtraction on the RHS. It makes some proofs more natural, and it is not hard to recover the Haskell definition anyway.)
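For instance, since modern GHC no longer accepts `n+k` patterns, one way to recover a standard Haskell definition is the following sketch (mine, using `Integer` in place of `Nat`):

```haskell
-- A transliteration of the (n+2)-pattern definition into standard Haskell.
-- Integer stands in for Nat; this sketch is mine, not code from the post.
fib :: Integer -> Integer
fib 0 = 0
fib 1 = 1
fib n = fib (n - 1) + fib (n - 2)  -- n plays the role of m+2
```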

Executing the definition without other support (such as memoization) gives you a very slow algorithm, due to lots of re-computation. I had some programming textbooks in the 80's that wrongly used this as evidence that "recursion is slow" (`fib` is usually one of the only two examples in a sole chapter on recursion in such books, the other being tree traversal).

By defining `fib2 n = (fib n, fib (n+1))`, one can easily derive an inductive definition of `fib2`:

```
fib2 :: Nat -> (Nat, Nat)
fib2 0 = (0, 1)
fib2 (n+1) = (y, x+y)
  where (x, y) = fib2 n
```

which computes `fib n` (and `fib (n+1)`) in `O(n)` recursive calls. Be warned, however, that this does not imply that `fib2 n` runs in `O(n)` time, as we shall see soon.

To be even faster, some might recall, do we not have a closed-form formula for Fibonacci numbers?

```
fib n = (((1+√5)/2)^n - ((1-√5)/2)^n) / √5
```

It was believed that the formula was discovered by Jacques P. M. Binet in 1843, thus we call it *Binet's formula* by convention, although the formula can be traced back earlier. Proving (or even discovering) the formula is a very good exercise in inductive proofs; on that I recommend this tutorial by Joe Halpern (CS 280 @ Cornell, 2005). Having a closed-form formula gives one the impression that it yields a quick algorithm. Some even claim that it delivers an `O(1)` algorithm for computing Fibonacci numbers. One shall not assume, however, that `((1+√5)/2)^n` and `((1-√5)/2)^n` can always be computed in a snap!
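A quick experiment (my own, not from the post) makes the point: evaluating Binet's formula naively with `Double` agrees with the exact value only while `fib n` fits comfortably within the 53-bit mantissa, roughly up to `n ≈ 70` on a typical implementation.

```haskell
-- Naive Double-based Binet formula versus an exact linear computation.
-- This experiment is mine; it only illustrates the precision caveat above.
fibBinet :: Integer -> Integer
fibBinet n = round ((phi ^ n - psi ^ n) / sqrt 5)
  where phi, psi :: Double
        phi = (1 + sqrt 5) / 2
        psi = (1 - sqrt 5) / 2

-- Exact values, computed iteratively, for comparison.
fibIter :: Integer -> Integer
fibIter n = go n 0 1
  where go 0 a _ = a
        go k a b = go (k - 1) b (a + b)
```

In my test the two agree for small `n` but drift apart once `fib n` outgrows what `Double` can represent exactly.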

When processing large numbers, we cannot assume that arithmetic operations such as addition and multiplication take constant time. In fact, it is fascinating that multiplying large numbers, something that appears to be the most fundamental of operations, is a research topic that could still see a new breakthrough in 2019 [HvdH19].

There is another family of algorithms that manages to compute `fib n` in `O(log n)` recursive calls. To construct such algorithms, one might start by asking oneself: can we express `fib (n+k)` in terms of `fib n` and `fib k` (and some other nearby `fib`s if necessary)? Given such a formula, we can perhaps compute `fib (n+n)` from `fib n`, and design an algorithm that uses only `O(log n)` recursive calls.

Indeed, for `n >= 1`, we have

```
fib (n+k) = fib (n-1) * fib k + fib n * fib (k+1)    -- (Vor)
```

This property can be traced back to Nikolai N. Vorobev, and we therefore refer to it as *Vorobev's Equation*. A proof will be given later. For now, let us see how it helps us.

With Vorobev's equation we can derive a number of (similar) algorithms that compute `fib n` in `O(log n)` recursive calls. For example, letting `n, k` in (Vor) be `n+1, n`, we get

```
fib (2n+1) = (fib (n+1))^2 + (fib n)^2    -- (1)
```

Letting `n, k` be `n+1, n+1`, we get

```
fib (2n+2) = 2 * fib n * fib (n+1) + (fib (n+1))^2    -- (2)
```

Subtracting (1) from (2), we get

```
fib 2n = 2 * fib n * fib (n+1) - (fib n)^2    -- (3)
```

The LHSs of (1) and (3) are respectively odd and even, while their RHSs involve only `fib n` and `fib (n+1)`. Defining `fib2v n = (fib n, fib (n+1))`, we can derive the program below, which uses only `O(log n)` recursive calls.

```
fib2v :: Nat -> (Nat, Nat)
fib2v 0 = (0, 1)
fib2v n | n `mod` 2 == 0 = (c, d)
        | otherwise      = (d, c + d)
  where (a, b) = fib2v (div n 2)
        c = 2 * a * b - a * a
        d = a * a + b * b
```
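As before, the `(n+1)` patterns prevent the program from compiling on modern GHC; below is a transliteration (mine), together with a quick cross-check against the linear-time `fib2`:

```haskell
-- Standard-Haskell transliterations of fib2 and fib2v (mine), so the two
-- can be cross-checked. Integer stands in for Nat.
fib2 :: Integer -> (Integer, Integer)
fib2 0 = (0, 1)
fib2 n = let (x, y) = fib2 (n - 1) in (y, x + y)

fib2v :: Integer -> (Integer, Integer)
fib2v 0 = (0, 1)
fib2v n | even n    = (c, d)
        | otherwise = (d, c + d)
  where (a, b) = fib2v (n `div` 2)
        c = 2 * a * b - a * a  -- (3): fib of the even index
        d = a * a + b * b      -- (1): fib of the odd index
```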

Having so many algorithms, the ultimate question is: which runs faster?

Interestingly, in 1988, James L. Holloway devoted an entire Master’s thesis to analysis and benchmarking of algorithms computing Fibonacci numbers. The thesis reviewed algorithms including (counterparts of) all those mentioned in this post so far, and some more algorithms based on matrix multiplication. I will summarise some of his results below.

For a theoretical analysis, we need to know the number of bits needed to represent `fib n`. Holloway estimated that to represent `fib n` we need approximately `n * 0.69424` bits. We will denote this number by `N n`. That `N n` is linear in `n` is consistent with our impression that `fib n` grows exponentially in `n`.

Algorithm `fib2` makes `O(n)` recursive calls, but that does not mean the running time is `O(n)`. Instead, `fib2 n` needs around `N (n^2/2 - n/2)` bit operations to compute. (Note that we are not talking about big-O here, but an approximate upper bound.)

What about Binet's formula? We can compute `√5` by Newton's method. One can assume that each `n`-bit division needs `n^2` operations. In each round, however, we need only the most significant `N n + log n` bits. Overall, the number of bit operations needed to compute Binet's formula is dominated by `log n * (N n + log n)^2` — not faster than `fib2`.

Holloway studied several matrix-based algorithms. Generally, they need around `(N n)^2` bit operations, multiplied by different constants.

Meanwhile, algorithms based on Vorobev's Equation perform quite well — it takes about `1/2 * (N n)^2` bit operations to compute `fib2v n`!

What about benchmarking? Holloway ran each algorithm for up to five minutes. In one of the experiments, the program based on Binet's formula exceeded 5 minutes when `log n = 7`, while the program based on `fib2` terminated within 5 minutes up to `log n = 15`. In another experiment (using simpler programs that consider only the cases where `n` is a power of `2`), the program based on Binet's formula exceeded 5 minutes when `log n = 13`. Meanwhile the matrix-based algorithms terminated within 3 to 5 seconds, and the program based on Vorobev's Equation terminated within around 2 seconds.

Finally, let us see how Vorobev's Equation can be proved. We perform induction on `n`. The cases where `n := 1` and `2` can be easily established. Assume the equation holds for `n` (that is, (Vor)) and for `n := n+1` (abbreviating `fib` to `f`):

```
f (n+1+k) = f n * f k + f (n+1) * f (k+1)    -- (Vor')
```

We prove the case for `n := n+2`:

```
  f (n+2+k)
=   { definition of f }
  f (n+k) + f (n+k+1)
=   { (Vor) & (Vor') }
  f (n-1) * f k + f n * f (k+1) +
  f n * f k + f (n+1) * f (k+1)
=   { f (n+1) = f n + f (n-1) }
  f (n+1) * f k + f n * f (k+1) + f (n+1) * f (k+1)
=   { f (n+2) = f (n+1) + f n }
  f (n+1) * f k + f (n+2) * f (k+1) .
```
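As a machine-checked complement to the inductive proof (my own sketch, not from the post), the equation can also be spot-checked directly for small arguments:

```haskell
-- Spot-checking Vorobev's equation (Vor) for small n and k.
-- This is my sketch; fib is the naive definition, Integer stands in for Nat.
fib :: Integer -> Integer
fib 0 = 0
fib 1 = 1
fib n = fib (n - 1) + fib (n - 2)

-- (Vor): fib (n+k) = fib (n-1) * fib k + fib n * fib (k+1), for n >= 1
vor :: Integer -> Integer -> Bool
vor n k = fib (n + k) == fib (n - 1) * fib k + fib n * fib (k + 1)
```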

This completes the proof.

Dijkstra derived yet another algorithm that computes `fib n` in `O(log n)` recursive calls in EWD654 [Dij78].

Besides his master's thesis, Holloway and his supervisor Paul Cull also published a journal version of their results [CH89]. I do not know the whereabouts of Holloway — it seems that he did not pursue a career in academia. I wish him all the best. It comforts me imagining that any thesis written with enthusiasm and love, whatever the topic, will eventually be found by some readers who are also enthusiastic about it, somewhere, sometime.

I found much interesting information on this page hosted by Ron Knott at the University of Surrey, and would recommend it too.

After the flamewar, Yoda Lee (李祐棠) conducted many experiments computing Fibonacci numbers, taking into consideration things like the precision of floating-point computation and the choice of suitable floating-point libraries. It is worth a read too. (In Chinese.)

So, what was the flamewar about? It started with someone suggesting that we should store on the moon (yes, the moon. Don't ask me why) some important constants such as `π` and `e`, so that, with the constants available in very large precision, many problems could be solved in constant time. Then people started arguing about what it means to compute something in constant time, whether Binet's formula gives you a constant-time algorithm… and here we are. Silly, but we learned something fun.

[**CH89**] Paul Cull, James L. Holloway. Computing Fibonacci numbers quickly. Information Processing Letters, 32(3), pp. 143-149, 1989.

[**Dij78**] Edsger W. Dijkstra. In honor of Fibonacci. EWD654, 1978.

[**Hol88**] James L. Holloway. Algorithms for Computing Fibonacci Numbers Quickly. Master's thesis, Oregon State University, 1988.

[**HvdH19**] David Harvey, Joris van der Hoeven. Integer multiplication in time `O(n log n)`. 2019. hal-02070778.


The post Adjoint Functors Induce Monads and Comonads appeared first on niche computing science.

Given categories `C` and `D`, we call two functors `L : C → D` and `R : D → C` a pair of *adjoint functors* if, for all objects `A` in `C` and `B` in `D`, we have the following *natural isomorphism*:

```
Hom (L A, B) ≅ Hom (A, R B)
```

This is denoted by `L ⊣ R`. Functors `L` and `R` are respectively called *the left and the right adjoint*.

Concepts such as `Hom (A, B)` and natural isomorphism will be explained in more detail later. For now, it suffices to say that `Hom (A, B)` is the collection of all *morphisms* from `A` to `B`. For example, in Set (the category of sets and total functions), `Hom (A, B)` is the collection of all functions having type `A → B`, and `Hom (L A, B) ≅ Hom (A, R B)` can be understood as

```
L A → B ≅ A → R B
```

That is, given a function `L A → B` there is a unique corresponding function `A → R B`, and vice versa. A typical example is when `L A = A × S` and `R B = S → B` for some `S`. Indeed we have

```
(A × S) → B ≅ A → (S → B)
```

with the mapping from left to right being `curry`, and the reverse mapping being `uncurry`.
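In Haskell this instance of the isomorphism can be written down directly; the names `phi` and `theta` below are mine, anticipating the notation used later in the post:

```haskell
-- The adjunction (× S) ⊣ (S →) in Haskell: curry and uncurry witness
-- Hom (A × S, B) ≅ Hom (A, S → B). The names phi and theta are mine.
phi :: ((a, s) -> b) -> (a -> s -> b)
phi = curry

theta :: (a -> s -> b) -> ((a, s) -> b)
theta = uncurry
```

That `phi . theta` and `theta . phi` are (pointwise) identities is exactly the isomorphism.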

Note that Set is but an instance of a category (one that is easier for me to understand). The notion of adjoint functors is much more general. For example, when the categories are such that the objects are elements of a partially ordered set, and there is a morphism `a → b` iff `a ≼ b` (thus there is either zero or one morphism between any two objects), `L` and `R` being adjoint functors means that they form a Galois connection.

That `Hom (L A, B)` and `Hom (A, R B)` are isomorphic means that there exists a pair of mappings `ϕ : Hom (L A, B) → Hom (A, R B)` and `θ : Hom (A, R B) → Hom (L A, B)` such that `ϕ ∘ θ = id` and `θ ∘ ϕ = id`. Being *naturally isomorphic* refers to an additional constraint: `ϕ` and `θ` must be natural with respect to `A` and `B`. This is an important property that will be explained and used later.

If `L : C → D` and `R : D → C` form a pair of adjoint functors, `R ∘ L` is a monad, while `L ∘ R` is a comonad.

Recall the example `L A = A × S` and `R B = S → B`. Indeed, we have `(R ∘ L) A = S → (A × S)` — the type of the state monad!

Merely having the type does not constitute a monad — we have got to construct the monad operators. In a more programming-oriented definition, a monad `M : * → *` comes with two operators `return : A → M A` and `(>>=) : M A → (A → M B) → M B`. In the traditional, mathematics-oriented definition, a monad `M` comes with three operators: `return`, `map : (A → B) → M A → M B`, and `join : M (M A) → M A` — as a convention, `return` and `join` are often written as `η` and `μ` respectively. Dually, a comonad `N` comes with three operators: `ε : N B → B`, `map : (A → B) → N A → N B`, and `δ : N B → N (N B)`.

As mentioned before, adjoint functors `L` and `R` induce a monad `M = R ∘ L` and a comonad `N = L ∘ R`. The operators `η` and `ε` are given by:

```
η : A → R (L A)
η = ϕ id         -- id : L A → L A
ε : L (R B) → B
ε = θ id         -- id : R B → R B
```

The types of `id` are given in the comments. Operators `μ` and `δ` can then be defined by:

```
μ : R (L (R (L A))) → R (L A)
μ = R ε
δ : L (R B) → L (R (L (R B)))
δ = L η
```

where `η : R B → R (L (R B))` and `ε : L (R (L A)) → L A`.

The operators do have the correct types. But do they satisfy the monad laws? There are six monad laws for `(M, η, μ)`:

1. `M id = id`
2. `M f ∘ M g = M (f ∘ g)`
3. `η ∘ f = M f ∘ η`
4. `M f ∘ μ = μ ∘ M (M f)`
5. `μ ∘ η = id = μ ∘ M η`
6. `μ ∘ μ = μ ∘ M μ`

The first two laws demand that `M` be a functor. Since `L` and `R` are functors, the two laws hold immediately. Laws 3 and 4 demand that `η` and `μ` be natural transformations, while Laws 5 and 6 are important computational rules for monads. We have got to check that they do hold for the definitions of `μ` and `η`.

For comonads there is a collection of dual laws. Since the proofs are dual, we talk only about the laws for monads in this post.

To prove the four remaining monad laws we need more properties of `ϕ` and `θ`. For that we give the concepts of hom-set and natural isomorphism, which we quickly skimmed through, a closer look.

The collection of all morphisms from `A` to `B` (both of them objects in category `C`) is denoted by `Hom(A,B)`. (In general `Hom(A,B)` is not necessarily a set. When it always happens to be a set, `C` is called a *locally small category*. See hom-set on ncatlab for details.)

Given a category `C`, `Hom` is also a functor `Cᵒᵖ × C → Set` (where `Cᵒᵖ` denotes the dual category of `C`). It maps an object `(A,B)` (in `Cᵒᵖ × C`) to `Hom(A,B)`, which is now an object in Set, and maps a pair of morphisms `f : A₂ → A₁` and `g : B₁ → B₂` to a morphism `Hom(A₁,B₁) → Hom(A₂,B₂)` in Set, defined by

```
Hom : (A₂ → A₁ × B₁ → B₂) → Hom(A₁,B₁) → Hom(A₂,B₂)
Hom (f,g) h = g ∘ h ∘ f
```

(The "type" given to `Hom` is not rigorous notation, but an aid to understanding. For more details, see hom-functor on ncatlab.)

Recall that, given functors `F` and `G`, when we say `h : F → G` is a natural transformation (from `F` to `G`), we mean that `h` is a family of morphisms — for each object `A` there is a morphism `h : F A → G A` — such that for all `f : A → B` we have the following:

```
h ∘ F f = G f ∘ h
```

The types of `h` on the left- and right-hand sides are respectively `F B → G B` and `F A → G A`.

Recall also that the definition of adjoint functors demands that `ϕ` and `θ` be *natural with respect to `A` and `B`*. What we mean by being natural here is essentially the same, but slightly complicated by the fact that `Hom` is a more complex functor: for all `f : A₂ → A₁` and `g : B₁ → B₂`, we want

```
ϕ ∘ Hom (L f, g) = Hom (f, R g) ∘ ϕ
θ ∘ Hom (f, R g) = Hom (L f, g) ∘ θ
```

If we expand the definition of `Hom`, apply both sides of the equation for `ϕ` to an argument `h : L A₁ → B₁`, and apply both sides of the equation for `θ` to an argument `k : A₁ → R B₁`, we get

```
ϕ (g ∘ h ∘ L f) = R g ∘ ϕ h ∘ f
θ (R g ∘ k ∘ f) = g ∘ θ k ∘ L f
```

These naturality conditions will be of crucial importance in the proofs below.

Now we are ready to prove the monad laws 3-6:

3. `η ∘ f = M f ∘ η`:

```
  R (L f) ∘ ϕ id
=   { ϕ (g ∘ h ∘ L f) = R g ∘ ϕ h ∘ f, [g, h, f := L f, id, id] }
  ϕ (L f ∘ id ∘ L id)
= ϕ (L f)
= ϕ (id ∘ id ∘ L f)
=   { ϕ (g ∘ h ∘ L f) = R g ∘ ϕ h ∘ f, [g, h := id, id] }
  ϕ id ∘ f
```

4. `M f ∘ μ = μ ∘ M (M f)`:

```
  R (θ id) ∘ R (L (R (L f)))
= R (θ id ∘ L (R (L f)))
=   { θ (R g ∘ k ∘ f) = g ∘ θ k ∘ L f, [g, k, f := id, id, R (L f)] }
  R (θ (R id ∘ id ∘ R (L f)))
= R (θ (R (L f)))
=   { θ (R g ∘ k ∘ f) = g ∘ θ k ∘ L f, [g, k, f := L f, id, id] }
  R (L f ∘ θ id ∘ id)
= R (L f) ∘ R (θ id)
```

5.1 `μ ∘ η = id`:

```
  R (θ id) ∘ ϕ id
=   { ϕ (g ∘ h ∘ L f) = R g ∘ ϕ h ∘ f, [g, h, f := θ id, id, id] }
  ϕ (θ id ∘ id ∘ id)
= ϕ (θ id)
=   { ϕ ∘ θ = id }
  id
```

5.2 `μ ∘ M η = id`:

```
  R (θ id) ∘ R (L (ϕ id))
= R (θ id ∘ L (ϕ id))
=   { θ (R g ∘ k ∘ f) = g ∘ θ k ∘ L f, [g, k, f := id, id, ϕ id] }
  R (θ (R id ∘ id ∘ ϕ id))
= R (θ (ϕ id))
=   { θ ∘ ϕ = id }
  R id
= id .
```

6. `μ ∘ μ = μ ∘ M μ`:

```
  R (θ id) ∘ R (L (R (θ id)))
= R (θ id ∘ L (R (θ id)))
=   { θ (R g ∘ k ∘ f) = g ∘ θ k ∘ L f, [g, k, f := id, id, R (θ id)] }
  R (θ (R id ∘ id ∘ R (θ id)))
= R (θ (R (θ id)))
=   { θ (R g ∘ k ∘ f) = g ∘ θ k ∘ L f, [g, k, f := θ id, id, id] }
  R (θ id ∘ θ id)
= R (θ id) ∘ R (θ id)
```

The proofs above use only the functor laws, the fact that `ϕ` and `θ` are inverses, and the naturality laws of `ϕ` and `θ`. Traditionally, the proofs would proceed by diagram chasing, which is probably easier for those who are familiar with it. I am personally happy about being able to construct these equational proofs, guided mostly by the syntax.

- adjoint functor on ncatlab.
- Anton Hilado, Adjoint Functors and Monads, June 20, 2017.
- Thorsten Wißmann, Adjunctions and monads. Seminar “Categories in Programming”, June 3, 2015.
- Steve Awodey, Monads and algebras. Course Notes of Category Theory, LMU Munich, Sommer Semester 2011.


The post Deriving Monadic Programs appeared first on niche computing science.

That was how I started to take an interest in reasoning and derivation of monadic programs. Several years having passed, I collaborated with many nice people, managed to get some results published, failed to publish some stuff I personally like, and am still working on some interesting tiny problems. This post summarises what was done, and what remains to be done.

Prior to that, all the program reasoning I had done was restricted to pure programs. They are beautiful mathematical expressions suitable for equational reasoning, while effectful programs are the awkward squad not worthy of rigorous treatment — so I thought, and I could not have been more wrong! It turned out that there is plenty of fun reasoning one can do with monadic programs. The rule of the game is that you do not know how the monad you are working with is implemented, thus you rely only on the monad laws:

```
return x >>= f = f x
m >>= return = m
(m >>= f) >>= g = m >>= (\x -> f x >>= g)
```

and the laws of the effect operators. For the non-determinism monad we usually assume two operators: `0` for failure, and `(|)` for non-deterministic choice (usually denoted by `mzero` and `mplus` of the type class `MonadPlus`). It is usually assumed that `(|)` is associative with `0` as its identity element, and that they interact with `(>>=)` by the following laws:

```
0 >>= f = 0 (left-zero)
(m1 | m2) >>= f = (m1 >>= f) | (m2 >>= f) (left-distr.)
m >>= 0 = 0 (right-zero)
m >>= (\x -> f1 x | f2 x) = (m >>= f1) | (m >>= f2) (right-distr.)
```

The four laws are respectively named *left-zero*, *left-distributivity*, *right-zero*, and *right-distributivity*, and we will discuss them more later. These laws are sufficient for proving quite a lot of interesting properties of the non-determinism monad, as well as properties of Spark programs. I find it very fascinating.
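For a concrete sanity check (mine, not from the post): the list monad, the standard model of non-determinism modulo order and repetition, validates these laws, reading `0` as `[]` and `(|)` as `(++)`:

```haskell
-- A spot check (mine) of the four laws in the list monad, for one
-- arbitrary choice of m, f1, f2. Right-distributivity holds only up to
-- reordering of results, hence the sort.
import Data.List (sort)

m :: [Int]
m = [1, 2, 3]

f1, f2 :: Int -> [Int]
f1 x = [x * 10]
f2 x = [x, x + 100]

leftZero, leftDistr, rightZero, rightDistr :: Bool
leftZero   = (([] :: [Int]) >>= f1) == []
leftDistr  = ((m ++ m) >>= f1) == ((m >>= f1) ++ (m >>= f1))
rightZero  = (m >>= const ([] :: [Int])) == []
rightDistr = sort (m >>= \x -> f1 x ++ f2 x)
               == sort ((m >>= f1) ++ (m >>= f2))
```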

Unfortunately, it turned out that monads were too heavy a machinery for the target readers of the Spark paper. The version we eventually published in NETYS 2017 [CHLM17] consists of pure-looking functional programs that occasionally use "non-deterministic functions" in an informal, but probably more accessible, way. Ondřej Lengál should be given credit for most, if not all, of the proofs. My proofs using the non-determinism monad were instead collected in a tech report [Mu19a]. (Why a tech report? We will come to that later.)

Certainly, it would be more fun if, besides non-determinism, more effects were involved. I have also been asking myself: rather than proving properties of given programs, can I *derive* monadic programs? For example, is it possible to start from a non-deterministic specification, and derive a program that solves the problem using states?

The most obvious class of problems that involve both non-determinism and state are backtracking programs. Thus I tried to tackle a problem previously dealt with by Jeremy Gibbons and Ralf Hinze [GH11], the `n`-queens problem: placing `n` queens on an `n` by `n` chess board in such a way that no queen can attack another. The specification non-deterministically generates all chess arrangements before filtering out the safe ones. We wish to derive a backtracking program that remembers the currently occupied diagonals in a state monad.

Jeremy Gibbons suggested generalising the problem a bit: given a problem specification in terms of a non-deterministic `scanl`, is it possible to transform it into a non-deterministic *and* stateful `foldr`?

Assuming all the previous laws and, in addition, laws about `get` and `put` of the state monad (the same as those assumed by Gibbons and Hinze [GH11], omitted here), I managed to come up with some general theorems for such transformations.

The interaction between non-determinism and state turned out to be intricate. Recall the *right-zero* and *right-distributivity* laws:

```
m >>= 0 = 0 (right-zero)
m >>= (\x -> f1 x | f2 x) = (m >>= f1) | (m >>= f2) (right-distr.)
```

While they do not explicitly mention state at all, in the presence of state these two laws imply that *each non-deterministic branch has its own copy of the state*. In the *right-zero* law, if a computation fails, it just fails — all state modifications in `m` are forgotten. In *right-distributivity*, the two `m`s on the RHS each operate on their own local copy of the state, thus locally it appears that the side effects in `m` happen only once.

We call a non-deterministic state monad satisfying these laws a *local state* monad. A typical example is `M a = S -> List (a, S)` where `S` is the type of the state — modulo order and repetition in the list monad, that is. The same monad can be constructed by `StateT s (ListT Identity)` in the Monad Transformer Library. With effect handling [KI15], we get the desired monad if we run the handler for state before that for list.
The local state monad is the ideal combination of non-determinism and state we would like to have. It has nice properties, and is much more manageable. However, there are practical situations in which one may want the state to be shared globally. For example, when the state is a large array that is costly to copy. Typically one then uses operations that explicitly "roll back" the global state to its previous configuration at the end of each non-deterministic branch.

Can we reason about programs that use a global state?

The non-determinism monad with a global state turns out to be a weird beast to tame.

While we are concerned with what laws a monad satisfies, rather than how it is implemented, we digress a little and consider how to implement a global state monad, just to see the issues involved. By intuition one might guess `M a = S -> (List a, S)`, but that is not even a monad — the direct but naive implementation of its `(>>=)` does not satisfy the monad laws! The type `ListT (State s)` generated using the Monad Transformer Library expands to essentially the same implementation, and is flawed in the same way (but the authors of MTL do not seem to have bothered fixing it). For correct implementations, see the discussion on the Haskell wiki. With effect handling [KI15], we do get a monad by running the handler for list before that for state.

Assume that we do have a correct implementation of a global state monad. What can we say about it? We no longer have the *right-zero* and *right-distributivity* laws, but *left-zero* and *left-distributivity* still hold. For now we assume an informal, intuitive understanding of the semantics: a global state is shared among non-deterministic branches, which are executed left-to-right. We will need more laws to, for example, formally specify what we mean by "the state is shared". This will turn out to be tricky, so we postpone that discussion for now.

In backtracking algorithms that keep a global state, it is a common pattern to

- update the current state to its next step,
- recursively search for solutions, and
- roll back the state to the previous step.

To implement such pattern as a monadic program, one might come up with something like the code below:

```
modify next >> search >>= modReturn prev
```

where `next` advances the state, `prev` undoes the modification of `next`, and `modify` and `modReturn` are defined by:

```
modify f = get >>= (put . f)
modReturn f v = modify f >> return v
```

Let the initial state be `st` and assume that `search` finds three choices `m1 | m2 | m3`. The intention is that `m1`, `m2`, and `m3` all start running with state `next st`, and that the state is restored to `prev (next st) = st` afterwards. By *left-distributivity*, however,

```
modify next >> (m1 | m2 | m3) >>= modReturn prev
  = modify next >> ( (m1 >>= modReturn prev) |
                     (m2 >>= modReturn prev) |
                     (m3 >>= modReturn prev) )
```

which, with a global state, means that `m2` starts with state `st`, after which the state is rolled back too early to `prev st`. The computation `m3` then starts with `prev st`, after which the state is rolled back too far, to `prev (prev st)`.

We need a way to say that "`modify next` and `modReturn prev` are run exactly once, respectively before and after all non-deterministic branches in `solve`." Fortunately, we have discovered a curious technique. Since non-deterministic branches are executed sequentially, the program

```
(modify next >> 0) | m1 | m2 | m3 | (modify prev >> 0)
```

executes `modify next` and `modify prev` once, respectively before and after all the non-deterministic branches, even if they fail. Note that `modify next >> 0` does not generate a result; its presence is merely for the side effect of `modify next`.

The reader might wonder: now that we are using `(|)` as a sequencing operator, does it simply coincide with `(>>)`? Recall that we still have left-distributivity and, therefore, `(m1 | m2) >> n` equals `(m1 >> n) | (m2 >> n)`. That is, `(|)` acts as an "insertion point", into which future code following `(>>)` can be inserted! This is certainly a dangerous feature, whose undisciplined use can lead to chaos.

To be slightly disciplined, we can go a bit further by defining the following variation of `put`, which restores the original state when it is backtracked over:
```
putR s = get >>= (\s0 -> put s | (put s0 >> 0))
```

To see how it works, assume that some computation `comp` follows `putR s`. By left-distributivity we get:

```
   putR s >> comp
 = (get >>= (\s0 -> put s | (put s0 >> 0))) >> comp
 =   { monad laws, left-distributivity, left-zero }
   get >>= (\s0 -> (put s >> comp) |
                   (put s0 >> 0))
```

Therefore, `comp` runs with the new state `s`. After it finishes, the original state `s0` is restored.

The hope is that, by replacing all occurrences of `put` with `putR`, we can program as if we were working with local states, while there is actually a shared global state.

(I later learned that Tom Schrijvers had developed similar and more complete techniques, in the context of simulating Prolog boxes in Haskell.)

Such was the idea. I had to find out what laws are sufficient to formally specify the behaviour of a global state monad (note that the discussion above has been informal), and to make sure that there exists a model/implementation satisfying these laws.

I prepared a draft paper containing proofs about Spark functions using the non-determinism monad, a derivation of backtracking algorithms solving problems including `n`-queens using a local state monad, and, after proposing laws a global state monad should satisfy, a derivation of another backtracking algorithm using a shared global state. I submitted the draft and also sent it to some friends for comments. Very soon, Tom Schrijvers wrote back and warned me: the laws I proposed for the global state monad could not be true!

I quickly withdrew the draft, and invited Tom Schrijvers to collaborate and fix the issues. Tom, together with Koen Pauwels, carefully figured out what the laws should be, showed that the laws are sufficient to guarantee that one can simulate local state using a global state (in the context of effect handling), showed that there exists a model/implementation of the monad, and verified the key theorems in Coq. That resulted in the paper Handling local state with global state, which we published in MPC 2019.

The paper is about semantic concerns of the local/global state interaction. I am grateful to Koen and Tom, who deserve credit for most of the hard work; without their help the paper could not have been done. The backtracking algorithm, meanwhile, became a motivating example that was only briefly mentioned.

I was still holding out hope that my derivations could be published in a conference or journal, until I noticed, by chance, a submission to MPC 2019 by Affeldt et al. [ANS19]. They formalised a hierarchy of monadic effects in Coq and, for demonstration, needed examples of equational reasoning about monadic programs. They somehow found the draft that was previously withdrawn, and corrected some of its errors. I am still not sure how that happened: I might have put the draft on my web server to communicate with my students, and somehow it showed up on the search engine. The file name was `test.pdf`. And that was how the draft was cited!

“Oh my god,” I thought in horror, “please do not cite an unfinished work of mine, especially when it is called `test.pdf`!”

I quickly wrote to the authors, thanked them for noticing the draft and finding errors in it, and said that I would turn it into technical reports, which they could cite more properly. That resulted in two technical reports: Equational reasoning for non-determinism monad: the case of Spark aggregation [Mu19a] contains my proofs of Spark programs, and Calculating a backtracking algorithm: an exercise in monadic program derivation [Mu19b] contains the derivation of backtracking algorithms.

There are plenty of potentially interesting topics one can pursue with monadic program derivation. For one, people have been suggesting pointwise notations for relational program calculation (e.g. de Moor and Gibbons [dMG00], Bird and Rabe [BR19]). I believe that monads offer a good alternative: plenty of relational program calculation can be carried out in terms of the non-determinism monad. Program refinement can be defined by

`m1 ⊆ m2 ≡ m1 | m2 = m2`

This definition applies to monads having other effects too. I have a draft demonstrating the idea with quicksort. Sorting is specified by a non-deterministic program returning a permutation of the input that is sorted: when the ordering is not anti-symmetric, there can be more than one way to sort a list, therefore the specification is non-deterministic. From the specification, one can derive pure quicksort on lists, as well as quicksort that mutates an array. Let us hope I have better luck publishing it this time.
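The sorting specification and the refinement ordering can both be sketched in the list monad, reading lists as sets of possible results. The names below (`slowsort`, `refines`) are mine, for illustration only:

```haskell
import Data.List (permutations, sort)

-- Sorting, specified non-deterministically in the list monad:
-- return any permutation of the input that is sorted.
slowsort :: Ord a => [a] -> [[a]]
slowsort xs = [ ys | ys <- permutations xs, ordered ys ]
  where
    ordered (x:y:zs) = x <= y && ordered (y:zs)
    ordered _        = True

-- Refinement m1 ⊆ m2 ≡ m1 | m2 = m2: taking lists up to set
-- equality (with (|) as union), this is inclusion of results.
refines :: Eq a => [a] -> [a] -> Bool
refines m1 m2 = all (`elem` m2) m1
```

For instance, `slowsort [3,1,2]` yields only `[1,2,3]`, and a deterministic sort refines the specification: `[sort xs] `refines` slowsort xs` holds for any `xs`.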

With Kleisli composition, there is even a natural definition of factors. Lifting `(⊆)` to functions (that is, `f ⊆ g ≡ (∀ x : f x ⊆ g x)`), and recalling that `(f >=> g) x = f x >>= g`, the left factor `(\)` can be specified by the Galois connection:

`(f >=> g) ⊆ h ≡ g ⊆ (f \ h)`

That is, `f \ h` is the most non-deterministic (least constrained) monadic program that, when run after the postcondition set up by `f`, still meets the result specified by `h`.
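On a finite universe the left factor can be computed pointwise, and the Galois connection checked by brute force over all monadic functions `g`. The following is my own illustrative sketch in the list monad; none of the names come from the text:

```haskell
-- Brute-force check of (f >=> g) ⊆ h ≡ g ⊆ (f \ h) on a tiny
-- finite universe, with lists-as-sets for non-determinism.
univ :: [Int]
univ = [0, 1]

subset :: Eq c => [c] -> [c] -> Bool
subset xs ys = all (`elem` ys) xs

-- pointwise lifting of (⊆) to monadic functions
below :: Eq c => (Int -> [c]) -> (Int -> [c]) -> Bool
below p q = all (\x -> p x `subset` q x) univ

-- Kleisli composition in the list monad
kleisli :: (a -> [b]) -> (b -> [c]) -> a -> [c]
kleisli f g x = f x >>= g

-- the left factor f \ h, computed pointwise: z is allowed after y
-- exactly when z meets h x for every x that can produce y via f
factor :: (Int -> [Int]) -> (Int -> [Int]) -> Int -> [Int]
factor f h y = [ z | z <- univ, and [ z `elem` h x | x <- univ, y `elem` f x ] ]

-- the Galois connection, checked for every g over the universe
galoisHolds :: (Int -> [Int]) -> (Int -> [Int]) -> Bool
galoisHolds f h =
  and [ below (kleisli f g) h == below g (factor f h) | g <- allFuns ]
  where
    allFuns = [ \b -> if b == 0 then s0 else s1
              | s0 <- subsetsOfUniv, s1 <- subsetsOfUniv ]
    subsetsOfUniv = [[], [0], [1], [0, 1]]
```

For example, `galoisHolds (\x -> [x]) (\x -> [x])` holds, as does `galoisHolds (const univ) (\x -> if x == 0 then [0] else univ)`, in which `factor` must intersect the constraints imposed by both sources.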

If, in addition, we have a proper notion of *converses*, I believe that plenty of optimisation problems can be specified and solved using calculation rules of factors and converses. I believe these are worth exploring.

[**ANS19**] Reynald Affeldt, David Nowak and Takafumi Saikawa. A hierarchy of monadic effects for program verification using equational reasoning. In *Mathematics of Program Construction (MPC)*, Graham Hutton, editor, pp. 226-254. Springer, 2019.

[**BR19**] Richard Bird, Florian Rabe. How to calculate with nondeterministic functions. In *Mathematics of Program Construction (MPC)*, Graham Hutton, editor, pp. 138-154. Springer, 2019.

[**CHLM17**] Yu-Fang Chen, Chih-Duo Hong, Ondřej Lengál, Shin-Cheng Mu, Nishant Sinha, and Bow-Yaw Wang. An executable sequential specification for Spark aggregation. In *Networked Systems (NETYS)*, pp. 421-438. 2017.

[**GH11**] Jeremy Gibbons, Ralf Hinze. Just do it: simple monadic equational reasoning. In *International Conference on Functional Programming (ICFP)*, pp. 2-14, 2011.

[**KI15**] Oleg Kiselyov, Hiromi Ishii. Freer monads, more extensible effects. In *Symposium on Haskell*, pp. 94-105, 2015.

[**dMG00**] Oege de Moor, Jeremy Gibbons. Pointwise relational programming. In *Algebraic Methodology and Software Technology (AMAST)*, Teodor Rus, editor, pp. 371-390. Springer, 2000.

[**Mu19a**] Shin-Cheng Mu. Equational reasoning for non-determinism monad: the case of Spark aggregation. Tech. Report TR-IIS-19-002, Institute of Information Science, Academia Sinica, June 2019.

[**Mu19b**] Shin-Cheng Mu. Calculating a backtracking algorithm: an exercise in monadic program derivation. Tech. Report TR-IIS-19-003, Institute of Information Science, Academia Sinica, June 2019.

[**PSM19**] Koen Pauwels, Tom Schrijvers and Shin-Cheng Mu. Handling local state with global state. In *Mathematics of Program Construction (MPC)*, Graham Hutton, editor, pp. 18-44. Springer, 2019.

The post Deriving Monadic Programs appeared first on niche computing science.

]]>[PDF]

The post Handling local state with global state appeared first on niche computing science.

]]>[PDF]

Equational reasoning is one of the most important tools of functional programming. To facilitate its application to monadic programs, Gibbons and Hinze have proposed a simple axiomatic approach using laws that characterise the computational effects without exposing their implementation details. At the same time Plotkin and Pretnar have proposed algebraic effects and handlers, a mechanism of layered abstractions by which effects can be implemented in terms of other effects.

This paper performs a case study that connects these two strands of research. We consider two ways in which the nondeterminism and state effects can interact: the high-level semantics where every nondeterministic branch has a local copy of the state, and the low-level semantics where a single sequentially threaded state is global to all branches. We give a monadic account of the folklore technique of handling local state in terms of global state, provide a novel axiomatic characterisation of global state and prove that the handler satisfies Gibbons and Hinze’s local state axioms by means of a novel combination of free monads and contextual equivalence. We also provide a model for global state that is necessarily non-monadic.

The post Handling local state with global state appeared first on niche computing science.

]]>[PDF]

The post Calculating a backtracking algorithm: an exercise in monadic program derivation appeared first on niche computing science.

]]>[PDF]

Equational reasoning is among the most important tools that functional programming provides us. Curiously, relatively little attention has been paid to reasoning about monadic programs. In this report we derive a backtracking algorithm for problem specifications that use a monadic unfold to generate possible solutions, which are filtered using a `scanl`-like predicate. We develop theorems that convert a variation of `scanl` to a `foldr` that uses the state monad, as well as theorems constructing hylomorphisms. The algorithm is used to solve the `n`-queens puzzle, our running example. The aim is to develop theorems and patterns useful for the derivation of monadic programs, focusing on the intricate interaction between state and non-determinism.

The post Calculating a backtracking algorithm: an exercise in monadic program derivation appeared first on niche computing science.

]]>[PDF]

The post Equational reasoning for non-determinism monad: the case of Spark aggregation appeared first on niche computing science.

]]>[PDF]

As part of the author’s studies on equational reasoning for monadic programs, this report focuses on the non-determinism monad. We discuss what properties this monad should satisfy, what additional operators and notations can be introduced to facilitate equational reasoning about non-determinism, and put them to the test by proving a number of properties in our example problem, inspired by the author’s previous work on proving properties of Spark aggregation.

The post Equational reasoning for non-determinism monad: the case of Spark aggregation appeared first on niche computing science.

]]>