General Recursion using Coindutive Monad

Agda, like many dependently typed functional languages, recognises a variation of structural recursion. We need some tricks to encode general recursion. With well-founded recursion, you add an extra argument to the general-recursive function you would like to define. The extra argument is a proof that one input of the function and the successive value passed to the recursive call are related by a well-founded ordering. Therefore, the recursion cannot go on forever. The function passes the termination check because it is structural recursive on the proof term.

I like well-founded recursion because it relates well with how, in my understanding, termination proofs are done in relational program derivation. For real programming, however, the extra argument clutters the code. For more complex cases, such as the McCarthian maximum function where the termination proof depends on the result of the function, the function has to be altered in ways making it rather incomprehensible. It would be nice to if we could write the program and prove its termination elsewhere.

Adam Megacz published an interesting paper: A Coinductive Monad for Prop-Bounded Recursion, in which he summarised a number of approaches to perform general recursion in a dependently typed language before presenting his own, an extension to Venanzio Capretta‘s General Recursion via Coinductive Types. As a practice, I tried to port his code to Agda.

A Monad for Non-Terminating Computation

The codatatype Comp, representing a possibly non-terminating computation, is a monad-like structure:

codata Comp (a : Set) : Set where
   return : a -> Comp a
   _⟩⟩=_ : Comp a -> (a -> Comp a) -> Comp a

While the monadic bind operator has type M a -> (a -> M b) -> M b, the _⟩⟩=_ operator for Comp has a more specialised type. This suffices when we use it only at the point of recursive calls. I suspect that we may need to generalise the type when we construct mutually recursive functions.

General-recursive programs are written in monadic style. For example, the function div, for integral division, can be written as:

div : ℕ -> ℕ -> Comp ℕ
div zero n ~ return zero
div (suc m) zero ~ return (suc m)
div (suc m) (suc n) ~
       div (suc m ∸ suc n) (suc n) ⟩⟩= \h ->
       return (suc h)

An example harder to tackle would be McCarthy’s 91 function. To be consistent with my earlier blog entry (well, the real reason is that I am lazy. And Megacz has proved the termination of the 91 function anyway), consider the following simplification:

f : ℕ -> ℕ -> Comp ℕ
f k n with k ≤′′? n
... | inj₁ k≤n ~ return n
... | inj₂ n<k ~ f k (suc n) ⟩⟩= \x -> f k x

The function, seen in Bengt Nordström‘s Terminating General Recursion, returns the maximum of k and n, and will be referred to as the McCarthian Maximum function.

Terminating Computation

In the definitions of div and f, we have merely built some (potentially infinite) codata objects representing the computations. Now we will build an interpreter that would actually carry out a computation, provided that it is terminating. The idea here is similar to what we did in well-founded recursion: construct a proof that the computation terminates, and the interpreter would be structurally recursive on the proof term.

The datatype _↓ characterises terminating computations:

data _↓ {a : Set} : Comp a -> Set where
   ↓-return : forall {x} -> (return x) ↓
   ↓-bind : forall {m f} ->
         m ↓ -> (forall x -> m ↓= x -> f x ↓) -> (m ⟩⟩= f) ↓

Certainly, return x for all x always terminates. For m ⟩⟩= f to terminate, both m and f, given the input from m, must terminate. It would be too strong to demand that f x terminates for all x. We only need f x to terminate for those x that could actually be returned by m. For that we define another datatype _↓=_ such that m ↓= x states that a computation m terminates and yields a value x:

data _↓=_ {a : Set} : Comp a -> a -> Set where
   ↓=-return : forall {x} -> (return x) ↓= x
   ↓=-bind : forall {m f x y} ->
         m ↓= x -> (f x) ↓= y -> (m ⟩⟩= f) ↓= y

I also needed the following lemmas. The lemma ↓=-unique states that if a computation yields only one value, if it terminates at all:

↓=-unique : {a : Set} {m : Comp a} {x y : a} ->
      m ↓= x -> m ↓= y -> x ≡ y
↓=-unique ↓=-return ↓=-return = ≡-refl
↓=-unique {x = x} (↓=-bind m↓=x fx↓=y) (↓=-bind m↓=x' fx'↓=y')
   with ↓=-unique m↓=x m↓=x'
... | ≡-refl with ↓=-unique fx↓=y fx'↓=y'
... | ≡-refl = ≡-refl

The next lemma says that a computation that yields a value terminates:

↓=⇒↓ : {a : Set} {m : Comp a} {x : a} -> m ↓= x -> m ↓
↓=⇒↓ ↓=-return = ↓-return
↓=⇒↓ (↓=-bind {m}{f} m↓=x fx↓=y) =
  ↓-bind (↓=⇒↓ m↓=x)
    (\z m↓=z -> ≡-subst (\w -> f w ↓) (↓=-unique m↓=x m↓=z) (↓=⇒↓ fx↓=y))

Safe Computation

Now we construct a function eval evaluating terminating computations. Recall, however, that an important thing about well-founded recursion is that the datatype Acc consists of only one constructor. Thus, like the newtype construct in Haskell, the constructor is superfluous and can be optimised away by the compiler. For eval, we would like the proof of termination to be a one-constructor type too. We define the following datatype Safe. A computation is safe, if all its sub-computations are:

data Safe {a : Set} : Comp a -> Set where
   safe-intro : forall m ->
      (forall m' -> InvokedBy m' m -> Safe m') -> Safe m

Given computations m and m', the relation InvokedBy m' m holds if m' is a sub-computation of m. The relation InvokedBy can be defind by:

data InvokedBy {a : Set} : Comp a -> Comp a -> Set where
  invokes-prev : forall {m f} -> InvokedBy m (m ⟩⟩= f)
  invokes-fun : forall {m f x} ->
    m ↓= x -> InvokedBy (f x) (m ⟩⟩= f)

Notice that Safe is exactly a specialised version of Acc. Safe m states that m is accessible under relation InvokedBy. That is, there is no infinite chain of computations starting from m.

Programmers prove termination of programs by constructing terms of type _↓. The following lemma states that terminating computations are safe:

↓-safe : {a : Set} -> (m : Comp a) -> m ↓ -> Safe m
↓-safe (return x) ↓-return = safe-intro _ fn
  where fn : (m' : Comp _) -> InvokedBy m' (return x) -> Safe m'
      fn _ ()
↓-safe (m ⟩⟩= f) (↓-bind m↓ m↓=x⇒fx↓) = safe-intro _ fn
  where fn : (m' : Comp _) -> InvokedBy m' (m ⟩⟩= f) -> Safe m'
      fn .m invokes-prev = ↓-safe m m↓
      fn ._ (invokes-fun m↓=x) = ↓-safe _ (m↓=x⇒fx↓ _ m↓=x)

We can then use the lemma to convert m ↓ to Safe m, which will be used by eval-safe:

eval-safe : {a : Set} -> (m : Comp a) -> Safe m -> ∃ (\x -> m ↓= x)
eval-safe (return x) (safe-intro ._ safe) = (x , ↓=-return)
eval-safe (m ⟩⟩= f) (safe-intro ._ safe)
  with eval-safe m (safe m invokes-prev)
... | (x , m↓=x) with eval-safe (f x) (safe (f x) (invokes-fun m↓=x))
...   | (y , fx↓y) = (y , ↓=-bind m↓=x fx↓y)

It is structurally recursive because safe is a sub-component of safe-intro ._ safe. Our eval function simply calls eval-safe:

eval : {a : Set} (m : Comp a) -> m ↓ -> a
eval m m↓ with eval-safe m (↓-safe m m↓)
... | (x , m↓=x) = x

Example: Div

Recall the function div:

div : ℕ -> ℕ -> Comp ℕ
div zero n ~ return zero
div (suc m) zero ~ return (suc m)
div (suc m) (suc n) ~
       div (suc m ∸ suc n) (suc n) ⟩⟩= \h ->
       return (suc h)

To prove its termination we use strong induction: if div k n terminates for all k < m, we have div m n terminates as well. The strong induction principle is given by:

strong-induction : (P : ℕ -> Set) ->
  (forall n -> (forall m -> m <′ n -> P m) -> P n) ->
    (forall n -> P n)
strong-induction P Pind n = Pind n (ind n)
  where ind : forall n -> forall m -> m <′ n -> P m
      ind zero _ ()
      ind (suc n) ._ ≤′-refl = strong-induction P Pind n
      ind (suc n) m (≤′-step m<n) = ind n m m<n

Notice how it resembles the accessibility proof ℕ<-wf. Indeed, accessibility is strong induction. We are doing the same proof, but organised differently.

There is a little extra complication, however. In Agda, co-recursively defined functions do not automatically unfold. We have to unfold them manually, a trick mentioned byUlf Norell and Dan Doel on the Agda mailing list. The following function unfold, through pattern matching, expands the definition of a coinductively defined function:

unfold : {a : Set} -> Comp a -> Comp a
unfold (return x) = return x
unfold (m ⟩⟩= f) = m ⟩⟩= f

To prove a property about m, we could prove the same property for unfold m, and then use the fact that unfold m ≡ m, proved below:

≡-unfold : {a : Set} {m : Comp a} -> unfold m ≡ m
≡-unfold {_}{return x} = ≡-refl
≡-unfold {_}{m ⟩⟩= f} = ≡-refl

Here is the proof that div m n terminates for all m, using strong induction:

div↓ : forall m n -> div m n ↓
div↓ m n = strong-induction (\m -> div m n ↓)
            (\m f -> ≡-subst _↓ ≡-unfold (ind m f)) m
   ind : forall m -> (forall k -> k <′ m -> div k n ↓) ->
              unfold (div m n) ↓
   ind zero f = ↓-return
   ind (suc m) f with n
   ... | zero = ↓-return
   ... | (suc n') = ↓-bind (f (suc m ∸ suc n') (sm-sn<sm m n'))
                      (\h _ -> ↓-return)

where sm-sn<sm : forall m n -> (m ∸ n) <′ suc m.

Example: McCarthian Maximum

Now, let us consider the McCarthian Maximum:

f : ℕ -> ℕ -> Comp ℕ
f k n with k ≤′′? n
... | inj₁ k≤n ~ return n
... | inj₂ n<k ~ f k (suc n) ⟩⟩= \x -> f k x

Two peculiarities made this function special. Firstly, for n < k, the value of n gets closer to k at each recursive call. The function terminates immediately for n ≥ k. Therefore, to reason about its termination, we need a downward induction, where n = k is the base case and termination for n = i is inducted from n = i+1, i+2, .. , k. To make this induction easier, I created another datatype expressing "less than or equal to":

data _≤′′_ : ℕ -> ℕ -> Set where
  ≤′′-refl : forall {n} -> n ≤′′ n
  ≤′′-step : forall {m n} -> suc m ≤′′ n -> m ≤′′ n

_<′′_ : ℕ -> ℕ -> Set
m <′′ n = suc m ≤′′ n

Secondly, to reason about the termination of f we need information about the value it returns. In the branch for inj₂ n<k, termination of the first recursive call follows from downward induction: suc n is closer to k than n. On the other hand, we know that the second call terminates only after we notice that f returns the maximum of its arguments. Therefore, f k (suc n) yields k, and f k k terminates.

I proved the termination of f in two parts, respectively for the case when k ≤ n, and n < k:

k≤n⇒fkn↓=n : forall k n -> k ≤′′ n -> f k n ↓= n
n<k⇒fkn↓=k : forall k n -> n <′′ k -> f k n ↓= k

It then follows immediately that:

f↓ : forall k n -> f k n ↓
f↓ k n with k ≤′′? n
... | inj₁ k≤n = ↓=⇒↓ (k≤n⇒fkn↓=n k n k≤n)
... | inj₂ n<k = ↓=⇒↓ (n<k⇒fkn↓=k k n n<k)

Both k≤n⇒fkn↓=n and n<k⇒fkn↓=k, on the other hand, were proved in two stages. I needed to prove an unfolded version of the lemma, which may or may not be mutually recursive on the lemma to be proved. Take n<k⇒fkn↓=k for example:

n<k⇒⟨fkn⟩↓=k : forall k n -> n<′′ k -> unfold (f k n) ↓= k
n<k⇒⟨fkn⟩↓=k k n n<k = ...{- tedious proof, I don't think you want to read it -}

n<k⇒fkn↓=k : forall k n -> n <′′ k -> f k n ↓= k
n<k⇒fkn↓=k k n n<k =
    ≡-subst (\m -> m ↓= k) (≡-unfold {_}{f k n}) (n<k⇒⟨fkn⟩↓=k k n n

As mentioned above, this two-stage pattern was discussed on the Agda mailing list. I do not know whether this will be a recommended programming pattern. Some follow up discussions seemed to show that this is not always necessary. I have got to read more.


Leave a Comment

Your email address will not be published. Required fields are marked *