Proof Irrelevance, Extensional Equality, and the Excluded Middle

It was perhaps in our first lesson in constructive logic when we learnt about the absence of the law of excluded middle, which in a constructive interpretation would imply a decision procedure for every proposition. Therefore I was puzzled by the fact, stated in a number of places including the Stanford Encyclopedia of Philosophy (SEP), that axiom of choice, proof irrelevance, and extensional equality (definitions to be given later) together entail the law of excluded middle. Since a constructive variant of axiom of choice is provable in intuitionistic logic, and both proof irrelevance and extensional equality are properties we would like to have in a type system, it is worrying that they lead to implausible consequences. Thus I was curious to find out what happened.

The observational type theory promises us the best of everything — extensional equality without losing canonicity (please do implement it in Agda soon!), and it does rely on proof irrelevance. There is even an Agda embedding (or, with the information given it is not hard to reconstruct one), so I conducted some experiments in the embedding. For this post, however, it is sufficient to do everything in plain Agda and postulate the missing properties.

Decidable Equality for All Types?

The construction in SEP makes use of functions on propositions, which is not available in the Agda embedding of observational type theory. Instead I experimented with another construction from Benjamin Werner‘s On the Strength of Proof-Irrelevant Type Theories: with (a particular interpretations of) axiom of choice and proof irrelevance, one can construct a function that, given a and b of any type A, decides whether a equals b.

This could also lead to horrifying consequences — we could even compare whether two infinite structure, or two functions are equal, in a finite amount of time.

Axiom of Choice

The axiom of choice, as described on the very informative homepage for the axiom maintained by Eric Schechter, is considered by some the “last great controversy of mathematics.” The axiom comes in many forms, and one of the simplest could be:

Given a collection of non-empty sets, we can choose a member from each set in that collection.

A set of B is usually represented by a characteristic function B → Set. Let the collection of non-empty sets of B‘s be indexed by A, the collection can be modelled by a function mapping indexes in A to sets of Bs, that is, a relation A → B → Set. The action of “choosing” is modelled by the existence of a function returning the choice, that is, a function taking an index in A and returning a element in B that is chosen. One possible formulation of the axiom of choice would thus be:

ac : {A B : Set} → (R : A → B → Set) → 
       (∀ x → ∃ (λ y → R x y)) → 
         (∃ {A → B} (λ f → ∀ x → R x (f x)))

In words: given a collection of sets represented by A → B → Set, if R x is non-empty for every x : A, there exists a function f : A → B, such that f x is in R x for every x : A. Another way to see it is that the axiom simply says we can refine a relation R : A → B → Set to a function A → B provided that the former is total.

It is surprising to some that in constructive logic, the axiom of choice is in fact provable:

ac R g = ((λ x → proj₁ (g x)) , λ x → proj₂ (g x))

The technical reason is that a proof of ∀ x → ∃ (λ y → R x y) contains a witness, which we can just pick as the result of the choice function. I will leave the mathematical and philosophical explanation and implications to better sources, such as the axiom of choice homepage, and the two pages in SEP on the axiom and the axiom in type theory.

Proof Irrelevance

We define a small language of propositions about equalities: † A is the language of propositions whose atoms are equalities between elements of type A (_≐_), and between Boolean values (_≘_):

data † (A : Set) : Set where
   TT  : † A
   FF  : † A
   _≐_ : A → A → † A
   _≘_ : Bool → Bool → † A
   _∧_ : † A → † A → † A
   _⇒_ : † A → † A → † A
   _∨_ : † A → † A → † A
¬ : ∀ {A} → † A → † A
¬ P = P ⇒ FF

The semantics is more or less what one would expect: TT is the unit type, FF the empty type, conjunction encoded by products, and implication by functions. In particular, disjunction is encoded by the sum type:

⌈_⌉ : ∀ {A} → † A → Set
⌈ TT ⌉ = ⊤
⌈ FF ⌉ = ⊥
⌈ a ≐ b ⌉ = a ≡ b
⌈ a ≘ b ⌉ = a ≡ b
⌈ P ∧ Q ⌉ = ⌈ P ⌉ × ⌈ Q ⌉
⌈ P ⇒ Q ⌉ = ⌈ P ⌉ → ⌈ Q ⌉
⌈ P ∨ Q ⌉ = ⌈ P ⌉ ⊎ ⌈ Q ⌉

We shall define the if-and-only-if relation _⇔_ between terms of † A

_⇔_ : ∀ {A} → † A → † A → Set
P ⇔ Q = ⌈ (P ⇒ Q) ∧ (Q ⇒ P) ⌉

The current example, however, could have worked too if we simply take _⇔_ to be _≡_. Its role will be more significant for the next example.

Proof irrelevance asserts that we do not distinguish between proofs of the same proposition. Let p and q be proofs of P and Q respectively. If P and Q turn out to be equivalent propositions, then p and q must be equal:

postulate irr : ∀ {A} (P Q : † A) → (p : ⌈ P ⌉)(q : ⌈ Q ⌉) → P ⇔ Q → p ≅ q

where _≅_ stands for heterogenous equality, needed here because p and q appear to have different types. Note that in Agda, if we assume uniqueness of identity proof (i.e. refl being the only proof of a ≅ b), the axiom irr holds for TT, FF, _≐_, _≘_, and _∧_, and would be true for _⇒_ if we had extensional equality for functions, but not for disjunction _∨_.

Decidable Equality

For a b : A, let oneof a b be the type of things that are either a or b, paired with a proof of equality:

oneof : {A : Set} → (a b : A) → Set
oneof {A} a b = Σ A (λ x → ⌈ (x ≐ a) ∨ (x ≐ b) ⌉)

The relation Ψ a b : oneof a b → Bool → Set relates a z : oneof a b value and a Boolean e if e tells us which value z actually is:

Ψ : {A : Set} → (a b : A) → oneof a b → Bool → Set
Ψ {A} a b z e = ⌈ ((e ≘ false) ∧ (proj₁ z ≐ a)) ∨
                  ((e ≘ true)  ∧ (proj₁ z ≐ b)) ⌉

Certainly Ψ is total: for any z : oneof a b, either true or false satisfies Ψ z. We can prove the totality:

Ψ-total : {A : Set} → (a b : A) →
               (z : oneof a b) → ∃ (λ e → Ψ a b z e)
Ψ-total a b (x , inj₁ x≡a) = (false , inj₁ (refl , x≡a))
Ψ-total a b (x , inj₂ x≡b) = (true  , inj₂ (refl , x≡b))

and therefore extract, using the axiom of choice, a function that actually computes the Boolean:

Ψ-fun : {A : Set} → (a b : A) →
             ∃ {oneof a b → Bool}
               (λ f → (z : oneof a b) → Ψ a b z (f z))
Ψ-fun a b = ac (Ψ a b) (Ψ-total a b)

Now we show how to construct a decision procedure for a ≐ b. Given a and b, we inject them to oneof a b and call the result a' and b' respectively:

eq_dec : {A : Set} → (a b : A) → ⌈ (a ≐ b) ∨ (¬ (a ≐ b)) ⌉
eq_dec {A} a b = body

   a' : oneof a b
   a' = (a , inj₁ refl)

   b' : oneof a b
   b' = (b , inj₂ refl)

In the function body, we extract a choice function f (and also get a proof that f satisfies the relation Ψ) and apply it to both a' and b'. The results could come in four combinations. If f a' is true, whatever f b' is, from the proof of Ψ a b a' (f a') we have a proof of a ≡ b. If f a' and f b' are both false, we also get a proof of b ≡ a:

   body : (a ≡ b) ⊎ (¬ (a ≡ b))
   body with Ψ-fun a b
   ... | (f , f⊆Ψ) with f⊆Ψ a' | f⊆Ψ b'
   ...  | inj₂ (fa'≡tt , a≡b) | _ = inj₁ a≡b
   ...  | inj₁ (fa'≡ff , a≡a) | inj₁ (fb'≡ff , b≡a) = inj₁ (sym b≡a)
   ...  | inj₁ (fa'≡ff , a≡a) | inj₂ (fb'≡tt , b≡b) = inj₂ (cont fa'≡ff fb'≡tt)

For the case f a' is false but f b' is true, we call cont to create a contradiction if a ≡ b.

How do we get the contradition? If we can somehow conclude a' ≡ b', by congruence we have f a' ≡ f b'. However, that gives us false ≡ f a' ≡ f b' ≡ true, which contradicts the assumption that false and true are two distinct elements in Bool:

     where cont : f a' ≡ false → f b' ≡ true → a ≢ b
           cont fa'≡ff fb'≡tt a≡b = 
             let ....
                 fa'≡fb' : f a' ≡ f b'
                 fa'≡fb' = cong f a'≡b'
             in let open EqR (PropEq.setoid Bool)
                in ff≢tt (begin 
                     false ≈⟨ sym fa'≡ff  ⟩
                     f a'  ≈⟨ fa'≡fb' ⟩
                     f b'  ≈⟨ fb'≡tt ⟩
                     true ∎)

So the task narrows down to showing that a' ≡ b' given a ≡ b, which is where proof irrelevance comes in. Recall that a' is a paired with a proof P : (a ≐ a) ∨ (a ≐ b), and b' is b paired with a proof Q : (b ≐ a) ∨ (b ≐ b). Now that a ≡ b, the two propositions are the same, which by proof irrelevance means that their proofs must be equal too. Here is the actual proof in Agda:

                 P≡Q : ((a ≐ a) ∨ (a ≐ b)) ≡ ((b ≐ a) ∨ (b ≐ b))
                 P≡Q = let open EqR (PropEq.setoid († A))
                       in begin 
                           (a ≐ a) ∨ (a ≐ b)
                        ≈⟨ cong (λ x → (a ≐ a) ∨ (a ≐ x)) (sym a≡b) ⟩
                           (a ≐ a) ∨ (a ≐ a)
                        ≈⟨ cong (λ x → (x ≐ a) ∨ (x ≐ x)) a≡b ⟩
                           (b ≐ a) ∨ (b ≐ b)

                 a'≡b' : a' ≡ b'
                 a'≡b' = cong-× a≡b 
                          (irr ((a ≐ a) ∨ (a ≐ b)) ((b ≐ a) ∨ (b ≐ b))
                               (proj₂ a') (proj₂ b') P≡Q)

So, what happened here? When I posed the question to Conor McBride, his first question was “Which variant of the axiom of choice did you use?” (“The one that has a trivial proof,” I answered). The second question was “How did you encode disjunction?” Indeed, disjunction is where we “smuggled” something that should not be there. The two proofs P : (a ≐ a) ∨ (a ≐ b) and Q : (b ≐ a) ∨ (b ≐ b) can be casted to (a ≐ a) ∨ (a ≐ a) when a ≡ b. In Agda, we should have two proofs for (a ≐ a) ∨ (a ≐ a). But proof irrelevance deploys some magic to unify P and Q to one proof only when a ≡ b, which reveals some information we can exploit.

Will we see the same phenomena in observational equality? “That’s why we don’t have disjunction in the propositions of observational type theory!” Conor replied. Or, when we do need disjunction in propositions, it must be done in a safe manner. Perhaps Conor or Thorsten Altenkirch knows how to do that?

Excluded Middle

The construction in SEP deriving the law of excluded middle works in a similar way but in a different level — we will be reasoning about propositions and predicates. We need one more ingredient: extensional equality. Equality of propositions, as mentioned in the previous section, are based on the if-and-only-if relation _⇔_. Equality of functions, on the other hand, is defined pointwise — two functions are equal if they map (extensionally) equal values to (extensionally) equal results. In particular, equality of predicates (functions that return propositions) is given by:

postulate ext : {A B : Set} → (f g : A → † B) → (∀ a → f a ⇔ g a) → f ≡ g

Let X be the proposition for whose validity we are constructing a decision procedure. We define two predicate constants U X and V X:

U : ∀ {A} → † A → Bool → † A 
U X b = X ∨ (b ≘ false)

V : ∀ {A} → † A → Bool → † A
V X b = X ∨ (b ≘ true)

And a type for predicates (on Booleans) that are known to be either U X or V X:

UorV : ∀ A → † A → Set
UorV A X = Σ (Bool → † A) (λ P → ⌈ (P ≐ U X) ∨ (P ≐ V X) ⌉)

Given a predicate P : UorV A X that is either U X or V X, the relation Φ relates P and Boolean b if P (precisely, proj₁ P) is true at b:

Φ :  ∀ {A X} → (P : UorV A X) → Bool → Set
Φ P b = ⌈ proj₁ P b ⌉

Again Φ can be shown to be total, and we may extract a choice function which, given a proposition P : UorV A X, returns a Boolean for which P is true.

Φ-total : ∀ {A X} → (P : UorV A X) → ∃ (λ b → Φ P b)
Φ-fun : ∀ {A X} → ∃ {UorV A X → Bool} (λ f → ∀ P → Φ P (f P))

Now, like in the previous example, we inject U X and V X to UorV A X:

U' : ∀ {A} X → UorV A X
U' X = (U X , inj₁ refl)

V' : ∀ {A} X → UorV A X
V' X = (V X , inj₂ refl)

To determine the validity of X, we apply f to U' X and V' X:

ex_mid : ∀ {A} → (X : † A) → ⌈ X ∨ (¬ X) ⌉
ex_mid {A} X with Φ-fun {X = X}
... | (f , f⊆Φ) with f⊆Φ (U' X) | f⊆Φ (V' X)
...   | inj₁ pX      | _ =  inj₁ pX
...   | _            | inj₁ pX = inj₁ pX
...   | inj₂ fU'≡ff  | inj₂ fV'≡tt = inj₂ negX

If either yields inj₁, we have a proof of X. For the last case, on the other hand, we have that f (U' X) is false and f (V' X) is true, and we try to create a contradiction given a proof of X. Like in the previous example, we do so by claiming U' X ≡ V' X, therefore f (U' X) ≡ f (V' X), and therefore false ≡ true:

  where negX : ⌈ (¬ X) ⌉
        negX pX = let ...

                  in ff≢tt (begin 
                     false     ≈⟨ sym fU'≡ff   ⟩
                     f (U' X)  ≈⟨ fU'≡fV' ⟩
                     f (V' X)  ≈⟨ fV'≡tt ⟩
                     true ∎)

But this time we must first show that U X ≡ V X. Expending the definitions, we have U X b = X ∨ (b ≘ false) and V X b = X ∨ (b ≘ true), and recall that we have a proof of X. We may prove a lemma that:

lemma : ∀ {A}{P Q R : † A} → ⌈ P ⌉ → (P ∨ Q) ⇔ (P ∨ R)

from which we may conclude U X b ⇔ V X b and, by extensional equality of functions, U X ≡ V X:

        negX pX = let U≡V : U X ≡ V X
                      U≡V = let UXb⇔VXb : ∀ b → U X b ⇔ V X b
                                UXb⇔VXb b = lemma {_}{X}{b ≘ false}{b ≘ true} pX
                            in ext (U X) (V X) UXb⇔VXb

The rest is the same as in the previous example.


References and Things to Read

7 thoughts on “Proof Irrelevance, Extensional Equality, and the Excluded Middle”

  1. >I think there are two kinds of ‘extensional equality’ one being eta-conversion and the other being the ext postulate you used

    Eta-conversion is sometimes called “weak extensionality” while “ext” is called “strong extensionality”. To my knowledge, Zhaohui Luo uses this terminology in his thesis of ECC.

    >Proof irrelevence doesn’t appear necessary, AC and extensional equality seem to suffice.

    If I don’t overlook the axiom ExtAC you defined, this is equivalent to the one introduced in the article “100 years of Zermelo’s axiom of choice: what was the problem with it?” by Martin-Löf. According to this, the variant of AC which is used here can be proved without any postulate, but the ExtAC cannot. Furthermore, the excluded middle is implied by ExtAC according to Diaconescu’s theorem and the constructive type theory with ExtAC (W-operation, and an universe) could interprete all of ZFC.

    They are two different variants of axiom of choice, so are their strengths. Thus, that might be the reason that constructive AC and extensional equality are not strong enough to prove the law of excluded middle ?

    1. Thanks for the info, Danko. That looks very interesting. Should I be looking at the PDF document (Zermelo’s well-ordering theorem in type theory)?

  2. >I guess Agda has Prop as well now, but I didn’t know how to use it…

    Yes, it has a type called Prop and a proof-irrelevant option, but they don’t work by now as far as I know. Maybe things are changed ?

  3. > The construction given by SEP and Werner, which I showed in the post, was perhaps interesting in that they showed even if we are not allowed to inspect proofs, there is still a way to get around that and extract some evil information.
    I see! I was suspecting that but I wanted to double check.

    > By the way, since you appear to be a Coq user… :) “Proof irrelevance” seems to mean something different in Coq. My impression is that it just means that the computational portion of a program cannot inspect the inside of a Prop. Is it so?
    Coq implements every aspect of proof irrelevence except the one I really want. I mean it is prepared for it but it doesn’t allow values of sort Prop to be convertible (only beta conversion and so on).
    The main thing about it is that you can’t pattern match (eliminate) on Prop to produce an object in Set (unless that object in Set has zero or one constructors). It’s exactly what I used to prove _|_, but I think your proof did not make any uses of that. I guess Agda has Prop as well now, but I didn’t know how to use it…

    Thanks for the reply!

  4. Hello Shin-Cheng Mu, Thanks for another very thought provoking and interesting post!

    I think that this irr postulate is inconsistent:

    postulate irr : ∀ {A} (P Q : † A) → (p : ⌈ P ⌉)(q : ⌈ Q ⌉) → (P ≡ Q) → p ≅ q
    test-1 : ∀ {A} (P Q : † A) -> _
    test-1 P Q = irr ((P ∧ Q) ⇒ (P ∨ Q)) ((P ∧ Q) ⇒ (P ∨ Q)) 
                         (\pq -> inj₁ (proj₁ pq)) (\pq -> inj₂ (proj₂ pq)) refl
    test-2 : (\(pq : ⊤ × ⊤) -> inj₁ (proj₁ pq)) ≅ (\pq -> inj₂ (proj₂ pq))
    test-2 = test-1 {⊤} TT TT
    use : ∀ {A B : Set} {f g : A -> B} -> f ≅ g -> ∀ x -> f x ≅ g x
    use refl = \_ -> refl
    use' : ∀ {A B : Set} {x y : A} -> x ≅ y -> ∀ (f : A -> B) -> f x ≅ f y
    use' refl = \_ -> refl
    test-3 = use test-2 ( tt , tt )
    data XY : Set where
     X : XY
     Y : XY
    discr : X ≅ Y -> ⊥
    discr ()
    which :  ∀ {A B} -> (A ⊎ B) -> XY
    which (inj₁ _) = X
    which (inj₂ _) = Y
    test-4 = use' test-3 which
    inconsist : ⊥
    inconsist = discr test-4

    I am not 100% sure if this was intended or not though?
    Also I have a question about OTT, I think there are two kinds of ‘extensional equality’ one being eta-conversion and the other being the ext postulate you used. Maybe this distinction is why OTT is still consistent?

    1. Hi Maud,

      Thanks for the comment. You illustrated a tricky bit using a shorter example. There are two proofs for TT ∨ TT: inj₁ tt and inj₂ tt. The irr postulate claims that we can see them as if they are equal (indeed, test-3 : inj₁ tt ≅ inj₂ tt). However, in Agda nothing stops you from inspecting and discriminating them (like in which), which allows you to prove X ≅ Y.

      As far as I know, in OTT it is a rule that one cannot inspect proofs (things having type † A) like which did (I cannot recall how it is enforced). Therefore they may have irr (in fact irr is provable). The construction given by SEP and Werner, which I showed in the post, was perhaps interesting in that they showed even if we are not allowed to inspect proofs, there is still a way to get around that and extract some evil information. To prevent that, OTT does not have disjunction either.

      Also, if I remember right, extensional equality for functions in OTT refers to something like the ext rule: functions are considered equal if they map equal inputs to equal results. Inconsistency is avoided by the mechanism I mentioned above. But I may not be the best person to answer that… :)

      By the way, since you appear to be a Coq user… :) “Proof irrelevance” seems to mean something different in Coq. My impression is that it just means that the computational portion of a program cannot inspect the inside of a Prop. Is it so?

Leave a Comment

Your email address will not be published. Required fields are marked *