“I cannot prove that if a function is injective, it has an inverse,” Hideki Hashimoto posed this question to me a while ago. It turned out that this was not the property he really wanted, but it got me into thinking: is it possible at all?
Preliminaries
Let us start with some basic definitions. A relation from A to B is denoted by the wavy arrow:
_↝_ : Set → Set → Set1
A ↝ B = A → B → Set
Given a relation R : A ↝ B
we can always take its converse R ˘ : B ↝ A
, and a function can be lifted to a relation by fun
:
_˘ : ∀ {A B} → (A ↝ B) → (B ↝ A)
(R ˘) b a = R a b
fun : ∀ {A B} → (A → B) → (A ↝ B)
fun f a b = f a ≡ b
A relation R : A ↝ B
is simple if it does not map one input to multiple outputs. It is entire if everything in A
is mapped to something in B
— a more familiar word may be “total”.
simple : ∀ {A B} → (A ↝ B) → Set
simple R = ∀ {a b₁ b₂} → R a b₁ → R a b₂ → b₁ ≡ b₂
entire : ∀ {A B} → (A ↝ B) → Set
entire R = ∀ a → ∃ (λ b → R a b)
A function is a relation that is simple and entire. Indeed, one can show that fun f
is simple and entire for every f
. Injectivity and surjectivity are similar notions defined for converse of R
:
injective : ∀ {A B} → (A ↝ B) → Set
injective R = ∀ {a₁ a₂ b} → R a₁ b → R a₂ b → a₁ ≡ a₂
surjective : ∀ {A B} → (A ↝ B) → Set
surjective R = ∀ b → ∃ (λ a → R a b)
The (constructive variant of the) axiom of choice states that an entire relation A ↝ B
can be refined to a function A → B
:
ac : ∀ {A B} → (R : A ↝ B) →
(∀ a → ∃ (λ b → R a b)) → ∃ {A → B} (λ f → ∀ a → R a (f a))
ac R R-entire = ((λ a → proj₁ (R-entire a)) , λ a → proj₂ (R-entire a))
See the axiom of choice homepage, or the Stanford Encyclopedia of Philosophy for more information on this axiom.
Inverting Injective and Surjective Functions
Now, let us restate Hashimoto san’s challenge:
Let
fun f : A → B
be injective. Prove thatf
has a left inverse. That is, somef⁻¹
such thatf⁻¹ (f a) = a
foralla
.
It turned out that he forgot a condition: f
is also surjective. If f
is also (provably) surjective, one can pick some g ⊆ f˘
using the axiom of choice (since f
is surjective if and only if f˘
is total) and further prove that g ∘ f = id
using injectivity:
inv-sur : ∀ {A B} → (f : A → B) →
injective (fun f) → surjective (fun f) →
∃ {B → A} (λ f⁻¹ → (∀ a → f⁻¹ (f a) ≡ a))
inv-sur f f-inj f-sur with ac ((fun f) ˘) f-sur
... | (g , fgb≡b) = (g , λ a → f-inj {g (f a)} {a} {f a} (fgb≡b (f a)) refl)
Like the proof of the constructive axiom of choice, the proof above does not really do much. The proof of surjectivity of f
has already provided, for every b : B
, an a : A
such that f a ≡ b
. So we simply let f⁻¹
return that a
.
Can we lift the restriction that f
must be surjective? That is, can this be proved?
inv : ∀ {A B} → (f : A → B) → injective (fun f) →
∃ {B → A} (λ f⁻¹ → (∀ a → f⁻¹ (f a) ≡ a))
To make the scenario clear: we have a (total) function f : A → B
that is injective but not necessarily surjective. The set B
could be “larger” than A
in the sense that there could be some elements b : B
for which no f a
equals b
— that is, B
may not be “fully covered.” Can we construct f⁻¹ : B → A
such that f⁻¹ (f a) ≡ a
for all a : A
?
At the first glance it did not look like something terribly difficult to do. Given b
, if it equals some f a
, let f⁻¹
simply return that a
. Otherwise f⁻¹
could just map b
to any element in A
, since this b
is not used in any invocation of f⁻¹ (f a)
anyway. It should be possible as long as A
is not empty, right?
I tried to construct this f⁻¹
but could not help noticing something funny going on. It turns out that had this function existed, we could, again, prove the law of excluded middle. That is, for any predicate P : B → Set
and any b : B
, there would be a decision procedure telling us whether P b
is true or not.
Provided that we assume proof irrelevance, that is.
Excluded Middle, Again
Here we go. Let B
be some type having decidable equality. That is, there exists some eqB
:
eqB : (b₁ b₂ : B) → (b₁ ≡ b₂) ⊎ (¬ (b₁ ≡ b₂))
where _⊎_
is disjoint sum.
Now take some predicate P : B → Set
. Let A
be defined by:
A : (B → Set) → Set
A P = Σ B (λ b → P b)
That is, A P
is the subset of B
for which P
holds. Each element of A P
is a pair (b, Pb)
where Pb
is a proof of P b
.
Finally, take
f : ∀ {P} → A P → B
f = proj₁
Thus f (b, Pb) = b
.
The function f
is injective if we assume proof irrelevance. That is, if f (b, Pb) = b
and f (b', Pb') = b
, we must have b = b'
and (due to proof irrelevance) Pb = Pb'
, and therefore (b, Pb) = (b', Pb')
. Indeed, if we postulate proof irrelevance:
postulate irr : (P : B → Set) → ∀ {b} → (p₁ : P b) → (p₂ : P b) → p₁ ≡ p₂
We can construct a proof that f
is injective:
f-inj : ∀ {P} → injective (fun (f {P}))
f-inj {P} {(.b , Pb₁)} {(.b , Pb₂)} {b} refl refl = cong (λ p → (b , p)) (irr P Pb₁ Pb₂)
Assume that we have proved inv
. We can now apply inv
and obtain some f⁻¹
, the left inverse of f
.
However, with this particular choice of A
, f
, and f⁻¹
, we can construct a deciding procedure for P
. That is, for any P
and b
, we can determine P b
holds or not:
em : {P : B → Set} → ∀ b → P b ⊎ ¬ (P b)
This is how em
works. Given some b
, let’s apply f⁻¹
to b
. The result is a pair (b', Pb')
. Let’s compare b
and b'
using eqB
:
em {P} b with inv f (f-inj {P})
... | (f⁻¹ , f⁻¹fa≡a) with inspect (f⁻¹ b)
... | (b' , Pb') with-≡ _ with eqB b b'
If b ≡ b'
, Pb'
is a proof of P b'
and also a proof of P b
. Let us just return it (after some type casting):
em {P} b | (f⁻¹ , f⁻¹fa≡a) | (b' , Pb') with-≡ _ | inj₁ b≡b' =
inj₁ (subst P (sym b≡b') Pb')
Consider the case that b
does not equal b'
. We want to show that P b
is not true. That is, a proof of P b
leads to contradiction. Assume we have a proof Pb
of P b
. Since f (b , P b) ≡ b
, we have f⁻¹ b ≡ (b , Pb)
:
em {P} b | (f⁻¹ , f⁻¹fa≡a) | (b' , Pb') with-≡ b'Pb'≡f⁻¹b | inj₂ ¬b≡b' =
inj₂ (λ Pb →
let f⁻¹b≡bPb : f⁻¹ b ≡ (b , Pb)
f⁻¹b≡bPb = f⁻¹fa≡a (b , Pb)
The assumption says that f⁻¹ b = (b' , Pb')
. By transitivity we have (b , Pb) ≡ (b' , Pb')
.
bPb≡b'Pb' : (b , Pb) ≡ (b' , Pb')
bPb≡b'Pb' = sym (trans b'Pb'≡f⁻¹b f⁻¹b≡bPb)
But if we take the first component of both sides, we get b ≡ b'
. That contradicts our assumption that b
does not equal b'
:
b≡b' : b ≡ b'
b≡b' = cong proj₁ bPb≡b'Pb'
in ¬b≡b' b≡b')
which completes the proof.
In retrospect, f⁻¹
cannot exist because for it to work, it has to magically know whether b
is in the range of f
, that is, whether P b
is true or not.
Nakano’s Challenge
When I talked about this to Keisuke Nakano he posed me another related challenge. Set-theoretically, we understand that if there exists an injective function f : A → B
and another injective function g : B → A
, the sets A
and B
are of the same cardinality and there ought to be a bijection A → B
. Can we construct this bijection? That is, can we prove this theorem?
nakano : {A B : Set} →
(f : A → B) → injective (fun f) →
(g : B → A) → injective (fun g) →
∃ {A → B} (λ h → injective (fun h) × surjective (fun h))
I do not yet have a solution. Any one wanna give it a try?
Yes, well,
Image f
isn’t precisely a correct encoding of ‘the type of elements of b that are in the image of f,’ which is what we want g to have as a domain (andrestrict f
as a range). Rather, it’s ‘the type of proofs that a particular element of b is the image of a particular element of a.’ The former is probably some quotient of the latter, but I’m not sure how (or if it’s even possible) to construct it.Instead, my original proof works for all
a -> Image f
, even ones that (non-provably) target said quotient. So, in the constant function case, we can use it to find an inverse for a function out of the unit type, because it’s provably injective then:const-0 : ∀{a} → a → Nat
const-0 _ = 0
restricted-0 : ∀{a} → a → a → Image (const-0 {a})
restricted-0 x _ = x ⟶ 0 ⟨ ≡-refl ⟩
agree-0 : ∀{a x} → agrees (const-0 {a}) (restricted-0 {a} x)
agree-0 _ = ≡-refl
≡-refl-unit : ∀{x y : Unit} → x ≡ y
≡-refl-unit {unit} {unit} = ≡-refl
injective-0 : injective (const-0 {Unit})
injective-0 0≡0 = ≡-refl-unit
inverse-0 : ∃ (λ g → ∀ x → x ≡ g (restricted-0 unit x))
inverse-0 = inj-inv injective-0 (restricted-0 unit) (agree-0 {_} {unit})
even though
restricted-0 unit
does not use its argumentx
in the proof that 0 is in the image.There is something weird. How could every
restrict f
be injective ? For example, the image of a constant function f must be a one-pointed set, andrestrict f : ℕ → {0}
obviously shouldn’t be a injective function.You’re right. Actually, Image f extends the domain of f. (so, what does the proof prove !?) We have to add one more condition to get the correct formalization.
π-pre : ∀ {a b}{f : a → b} → Image f → a
π-pre (x ⇒ _ ⟨ _ ⟩) = x
cond : ∀ {a b : Set}(f : a → b) → Set
cond f = {x y : Image f} → π-img x ≡ π-img y → π-pre x ≡ π-pre y
However, we couldn’t construct any arbitrary inverses from injuctive functions f without the definition of f. well, maybe I’m wrong …
Liang-Ting wrote:
The type of
restrict f
isn’t right. Givenf : A → B
,restrict f
has typeA → Image f
, whereImage f
is in essence a tuple recording the input, the output, and a proof thatf input = output
. Any function can be transformed to an injective function by returning a copy of the input.Yes, I still haven’t figured out any good way of deciding when to create a new data type instead of building up a definition out of existing types. I believe I first saw something like Image in a tutorial-ish paper on either Agda or Epigram, and those tend to be gung-ho about taking the new data path.
If you’re going to build it out of existing types, it’s probably good to look at it like:
Image {a} {b} f = Σ b (λ y → ∃ (λ x → y ≡ f x))
That is, it’s elements y of b, together with proofs that they are the image of some x in a. The data type has the x first because it looks nice with the fancy syntax. :)
However, it’s also important to note that
Image f
is potentially larger than b, or the image of f proper. For instance if we write:restrict : ∀{a b} (f : a → b) → a → Image f
restrict f x = x ⟶ f x ⟨ ≡-refl ⟩
which ostensibly gives us f restricted to its image, it is trivial to find an inverse of
restrict f
, and prove that it is injective, regardless of f:restrict : ∀{a b} (f : a → b) → a → Image f
restrict f x = x ⟶ f x ⟨ ≡-refl ⟩
restrict-inv : ∀{a b} {f : a → b}
→ ∃ {Image f → a} (λ g → ∀ x → g (restrict f x) ≡ x)
restrict-inv {a} {b} {f} = (g , λ x → ≡-refl)
where
g : Image f → a
g (x ⟶ _ ⟨ _ ⟩) = x
restrict-inj : ∀{a b} {f : a → b} → injective (restrict f)
restrict-inj ≡-refl = ≡-refl
Because
restrict f
is obviously pairing all outputs with the particular input that generated them, and Agda can see through that. I compensated for this in the proof involving injective fs by saying that it needs to work for all restricted f’ that agree with f suitably, but I’m not sure that’s the best way to go about things.hum, an interesting idiom.
By the way, we could just define the following type as Image,
Image : {a b : Set}(f : a → b) → Set
Image {a} {b} f = ∃ {a} λ x → ∃ {b} λ y → y ≡ f x
so
π-img
becomesπ-img {a} {b} {f} = proj₁ ∘ proj₂
and
g
becomesproj₁
.Once we restrict the inverse to working on the image of f, the proof becomes not so difficult (my main difficulty was formulating a notion of f-restricted-to-its-image such that it wasn’t injective irrespective of f being injective). I used functions rather than relations:
Hopefully the formatting isn’t too terribly destroyed on the above. I’m never sure how to appropriately mark things up in blog comments.
Dear Dan Doel,
Thanks for visiting this blog, and thanks for the code! The use of
Image f
is a good pattern I should learn about.I marked up your code using <pre><code>program</code></pre>. I am not whether one can mark up their own comments if they don’t have access to the admin. interface, though…
>I cannot prove that if a function is injective, it has an inverse,
BTW, we usually consider the above lemma as “if a function f A -> B is injective, it has an inverse from f(A) -> A”, so we don’t restrict the surjective property to f as well as the non-emptiness of A.
Just a quick skimming. I thought there is a useful reference. The proof of “if there are two injective functions between A and B, A and B have the same cardinality” is called Cantor-Bernstein-Chroeder Theorem. You could refer the following link.
http://en.wikipedia.org/wiki/Cantor%E2%80%93Bernstein%E2%80%93Schroeder_theorem
I’ll check wether it is constructive or not later, but I , however, remember that we have to use the ordinary set operations like subset inductively. Shall we inteprete the set theory in the type theory as Aczel did ?
and there ought to be a bijection A → B…
this is only true for finite sets, but for finite sets it’s trivial to construct the bijection (brute force search).
Suppose f : A -> B, A is empty and B is not. Then f is trivially injective, not surjective, and there does not exist any function g : B -> A so f has no left inverse.
lilac,
Oh, you are right. The surjectivity constraint in
inv-sur
simples thatA
is not empty ifB
is not. However, it is not guaranteed ininv
.I should perhaps add to
inv
a condition thatA
is non-empty, and thatP b
is true for someb
. Hope that doesn’t make it less interesting.Thanks for pointing that out.