Programming Decisions with Coq's Sumbools

➲ Date: Wed, 31 Aug 2011 09:59:59 +0200; Tags: [ocaml; coq].

Today, let's investigate the sumbool type. This post follows the one on the sig type. It is also a Coq development available on-line. All the disclaimers still apply here (coq-beginner, nothing “new”, etc.).

The `sumbool` Type

It is defined in the Specif module. sumbool ‘is a boolean type equipped with the justification of their value’:

Inductive sumbool (A B:Prop) : Set :=
| left : A -> {A} + {B}
| right : B -> {A} + {B}
where "{ A } + { B }" := (sumbool A B) : type_scope.

Add Printing If sumbool.

Indeed, like bool and many other 2-constructors types it can use the “if ... then ... else ...” construct; with the Add Printing If it is also printed this way.

It is used everywhere in Coq's standard library. Just by importing standard lists, we get quite a bunch of decision procedures:

Set Implicit Arguments.
Require Import List.
SearchAbout sumbool.

The search gives about 100 lines of output. Let's have a closer look at one of them:

Check Compare_dec.zerop.

Compare_dec.zerop
     : forall n : nat, {n = 0} + {0 < n}

We can use it in programs:

Eval compute in (if Compare_dec.zerop 42 then 51 else 69).

= 69
: nat

and extraction works as expected:

Extraction Compare_dec.zerop.

(** val zerop : nat -> sumbool **)

let zerop = function
| O -> Left
| S n0 -> Right

Note that it can be made more practical:

Extract Inductive sumbool => "bool" ["true" "false"].
Extraction Compare_dec.zerop.

(** val zerop : nat -> bool **)

let zerop = function
| O -> true
| S n0 -> false

In proofs, sumbool gives us something more: the “justifications”. Let's compare it with this classical bool version:

Definition bool_zerop (n : nat) :=
  match n with
    | O => true
    | S _ => false
  end.

If we are proving something (even stupid):

Goal (if (Compare_dec.zerop 42) then 51 else 69) = 69.
Proof.
destruct (Compare_dec.zerop 42).

There, we have a hypothesis in the context (i.e. as a Prop with Coq's default “=”):

  e : 42 = 0
  ============================
   51 = 69

So, that's easy.

  inversion e.
  reflexivity.
  (* Proof completed. *)
Qed.

Whereas with bool_zerop:

Goal (if (bool_zerop 42) then 51 else 69) = 69.
Proof.
destruct (bool_zerop 42).

we don't have the knowledge:

  
  ============================
   51 = 69

It is not impossible (requires a remember, and then a few theorems) but much more tedious.

Abort.

Before going further, let's add a few notations:

Notation "'Decide_left'" := (left _ _).
Notation "'Decide_right'" := (right _ _).
Notation "'Decide_with' x" :=
(if x then Decide_left else Decide_right) (at level 42).

A Schoolbook Example

The common example is the decision about equality of natural numbers eq_nat_dec (also explained in CPDT). The fixpoint just compares the numbers like anyone would do. All the goals generated by refine are equalities and non-equalities, they can be solved by the congruence tactic which ‘is a decision procedure for ground equalities with uninterpreted symbols’.

Definition eq_nat_dec : forall n m : nat, {n = m} + {n <> m}.
  refine (fix f (n m : nat) : {n = m} + {n <> m} :=
    match n, m with
      | O, O => Decide_left
      | S n', S m' => Decide_with (f n' m')
      | _, _ => Decide_right
    end); congruence.
Defined.

Let's make a few tests:

Eval compute in eq_nat_dec 42 42.

= Decide_left
: {42 = 42} + {42 <> 42}

Eval compute in eq_nat_dec 42 51.

= Decide_right
: {42 = 51} + {42 <> 51}

and have a look at some OCaml code:

Extract Inductive sumbool => "bool" ["true" "false"].
Extraction eq_nat_dec.

(** val eq_nat_dec : nat -> nat -> bool **)

let rec eq_nat_dec n m =
  match n with
  | O ->
    (match m with
     | O -> true
     | S n0 -> false)
  | S n' ->
    (match m with
     | O -> false
     | S m' -> eq_nat_dec n' m')

For automatically dealing with this kind of functions, there is a tactic called decide equality. It solves ‘a goal of the form forall x y:R, {x=y}+{~x=y}, where R is an inductive type such that its constructors do not take proofs or functions as arguments, nor objects in dependent types’:

Definition eq_nat_autodec (n m : nat) : {n = m} + {n <> m}.
decide equality.
Defined.

In this case, the code is as beautiful as before:

Extraction eq_nat_autodec.

(** val eq_nat_autodec : nat -> nat -> bool **)

let rec eq_nat_autodec n m0 =
  match n with
  | O ->
    (match m0 with
     | O -> true
     | S n0 -> false)
  | S n0 ->
    (match m0 with
     | O -> false
     | S n1 -> eq_nat_autodec n0 n1)

Decisions on Character Strings

Let's try to do something with the sumbool type … In the same time, we start exploring some stuff from the standard library: Ascii and String.

Require Import Ascii String.

It gives us

Inductive string : Set :=
| EmptyString : string
| String : ascii -> string -> string.

where

Inductive ascii : Set := Ascii (_ _ _ _ _ _ _ _ : bool).
Definition zero :=
Ascii false false false false false false false false.
Definition one :=
Ascii true false false false false false false false.

So, strings here are lists of 8-bits integers, lower bits on the left.

The String Equality

Let's write string_eq_dec, the equality between strings, using the lower-lever ascii_dec:

Check ascii_dec.

ascii_dec
     : forall a b : ascii, {a = b} + {a <> b}

Definition string_eq_dec:
  forall sa sb : string, { sa = sb } + { sa <> sb }.
refine (fix frec (sa sb : string):
              { sa = sb } + { sa <> sb } :=
  match sa with
    | EmptyString =>
      match sb with
        | EmptyString => Decide_left
        | String _ _ => Decide_right
      end
    | String ch_a tl_a =>
      match sb with
        | EmptyString => Decide_right
        | String ch_b tl_b =>
          if ascii_dec ch_a ch_b
          then Decide_with (frec tl_a tl_b)
          else Decide_right
      end
  end); congruence.
Defined.

Here one single decide equality does not seem to be enough, but repeating it gives something:

Definition string_eq_autodec:
forall sa sb : string, { sa = sb } + { sa <> sb }.
repeat decide equality.
Defined.

The problem is that the repeat goes too far inside the Ascii characters.

Extraction string_eq_autodec.

It gives an ugly piece of code comparing every single bit of Ascii character:

(** val string_eq_autodec : string -> string -> bool **)

let rec string_eq_autodec s sb0 =
  match s with
  | EmptyString ->
    (match sb0 with
     | EmptyString -> true
     | String (a, s0) -> false)
  | String (a, s0) ->
    (match sb0 with
     | EmptyString -> false
     | String (a0, s1) ->
       let Ascii (x, x0, x1, x2, x3, x4, x5, x6) = a in
       let Ascii (b7, b8, b9, b10, b11, b12, b13, b14) = a0 in
       (match x with
        | True ->
          (match b7 with
           | True ->
             (match x0 with
              | True ->
              
                  (* ... hundreds of lines ... *)

The hand-written one looks much better:

Extraction string_eq_dec.

(** val string_eq_dec : string -> string -> bool **)

let rec string_eq_dec sa sb =
  match sa with
  | EmptyString ->
    (match sb with
     | EmptyString -> true
     | String (a, s) -> false)
  | String (ch_a, tl_a) ->
    (match sb with
     | EmptyString -> false
     | String (ch_b, tl_b) ->
       if ascii_dec ch_a ch_b then
         string_eq_dec tl_a tl_b
       else
         false)

Lexicographic Order

We now try to specify and implement the lexicographic order on strings.

Specification

For the order on Ascii characters, we will use nat_of_ascii and then the order on naturals. There are a few theorems on that embedding:

Check ascii_nat_embedding.

ascii_nat_embedding
     : forall a : ascii, ascii_of_nat (nat_of_ascii a) = a

Check nat_ascii_embedding.

nat_ascii_embedding
     : forall n : nat,
         n < 256 -> nat_of_ascii (ascii_of_nat n) = n

We first specify what it means for 2 strings to be in lexicographic order.

Inductive in_lex_order: string -> string -> Prop :=
| lexord_empty_strings: (* "" <= "" *)
  in_lex_order EmptyString EmptyString
| lexord_chars_different: (* "A..." <= "B..." *)
  forall (c1 c2: ascii) (s1 s2: string),
    nat_of_ascii c1 < nat_of_ascii c2 ->
    in_lex_order (String c1 s1) (String c2 s2)
| lexord_chars_equal: (* s1 <= s2  ->   As1 <= As2 *)
  forall (c: ascii) (s1 s2: string),
    in_lex_order s1 s2 ->
    in_lex_order (String c s1) (String c s2)
| lexord_string_shorter: (* "aaa" <= "aaaaa" *)
  forall s, in_lex_order EmptyString s.

We add the 4 constructors to auto's database.

Hint Constructors in_lex_order.

Let's check it with a few examples(Of course, it is by doing the examples that I got the inductive definition right little by little ;)) …

Example ex_lex_01: in_lex_order "AAA" "AAB".
Proof. auto. Qed.
Example ex_lex_02: in_lex_order "AAA" "AAA".
Proof. auto. Qed.

ex_lex_03 made me fight a bit. Here is the first version:

Example ex_lex_03: ~ in_lex_order "AAB" "AAA".
Proof. unfold not. intros.
  Time repeat match goal with
                H: in_lex_order _ _ |- _ =>
                  inversion H; clear H
              end;
  repeat match goal with
           H: nat_of_ascii _ < nat_of_ascii _ |- _ =>
             compute in H
         end;
  repeat match goal with
           H: S _ <= _ |- _ =>
             inversion H; clear H; subst
         end.
  (* Proof Completed. *)

After forcing the unfolding of not,

we invert all the hypotheses using our inductive proposition;
we force the computation of the nat_of_ascii functions;
we invert the (false) inequalities.

We use Time to see that this makes a lot of steps:

Proof completed.
Finished transaction in 1. secs (0.u,1.71174s)

The type-checking of the proof gets even worse:

Time Qed.

ex_lex_03 is defined
Finished transaction in 8. secs (6.997937u,0.046993s)

The problem is the last repeated step, on the inequalities. Just remember, when we play with naturals, omega is our friend.

Require Import Omega.
Example ex_lex_03': ~ in_lex_order "AAB" "AAA".
Proof.
  unfold not; intros.
  repeat match goal with
           H: in_lex_order _ _ |- _ =>
             inversion H; clear H
         end;
  repeat match goal with
           H: nat_of_ascii _ < nat_of_ascii _ |- _ =>
             compute in H
         end;
  match goal with H: S _ <= _ |- _ => omega end.
Time Qed.

ex_lex_03' is defined
Finished transaction in 0. secs (0.040994u,0.s)

Example ex_lex_04: in_lex_order "AAA" "AAAA".
Proof. auto. Qed.

ex_lex_05 works like ex_lex_03:

Example ex_lex_05: ~ in_lex_order "AAAA" "AAA".
Proof. unfold not. intros.
  repeat match goal with
           H: in_lex_order _ _ |- _ =>
             inversion H; clear H
         end;
  repeat match goal with
           H: nat_of_ascii _ < nat_of_ascii _ |- _ =>
             compute in H
         end;
  match goal with H: S _ <= _ |- _ => omega end.
Qed.

Implementation

That was the specification; now the implementation. We just give a name to the intended return type:

Definition lexicographic_decision (a b : string) :=
{ in_lex_order a b } + { ~ (in_lex_order a b) }.

First, we will need this lemma about nat_of_ascii.

Lemma nat_of_ascii_injective:
  forall c1 c2, nat_of_ascii c1 = nat_of_ascii c2 -> c1 = c2.
Proof.
  intros; simpl.
  assert (ascii_of_nat (nat_of_ascii c1) =
              ascii_of_nat (nat_of_ascii c2))
      as Hinvol. auto.
  repeat rewrite ascii_nat_embedding in Hinvol.
  trivial.
Qed.

nat_of_ascii_injective is used to prove the following one, which will allow intuition auto to finish its job nicely:

Lemma tricky_step_in_decide_lex_order:
  forall c1 c2 s1 s2,
    ~ nat_of_ascii c1 < nat_of_ascii c2 ->
    ~ nat_of_ascii c2 < nat_of_ascii c1 ->
    in_lex_order s1 s2 ->
    in_lex_order (String c1 s1) (String c2 s2).
Proof.
  intros.
  assert (nat_of_ascii c1 = nat_of_ascii c2) as c1_c2_nat_eq.
    omega.
  apply nat_of_ascii_injective in c1_c2_nat_eq; subst; auto.
Qed.

And now is the definition of decide_lex_order. We use Compare_dec.lt_dec to compare naturals, it has a sumbool type too(Note that there is also lt_eq_lt_dec but we are not ready to use it for now (it is a sumbool embedded in a sumor, so, maybe another time :)).):

forall n m : nat, {n < m} + {~ n < m}

Require Import Compare_dec.

Definition decide_lex_order:
  forall a b : string, lexicographic_decision a b.

refine (fix fcontinue a b: lexicographic_decision a b :=
  match a, b with
    | EmptyString, EmptyString => Decide_left
    | EmptyString, _ => Decide_left
    | _, EmptyString => Decide_right
    | String c1 s1, String c2 s2 =>
      match lt_dec (nat_of_ascii c1) (nat_of_ascii c2) with
        | Decide_left => Decide_left
        | Decide_right =>
          match lt_dec (nat_of_ascii c2) (nat_of_ascii c1) with
            | Decide_left => Decide_right
            | Decide_right => Decide_with (fcontinue s1 s2)
          end
      end
  end);
  solve [
    intuition auto using tricky_step_in_decide_lex_order
    | match goal with
        |- ~ in_lex_order _ _ =>
          unfold not; intro H; inversion H;
            subst; intuition auto
      end ].
Defined.

Just for the record, the following is how looked my first “Proof Complete” on that one. The idea is to explore with wild ugly tactics or inline lemmas, and then do the factorisation, cleaning, and optimisations.

unfold not in *; intros; auto.

inversion H.

inversion H. intuition.
subst. intuition.

assert (nat_of_ascii c1 = nat_of_ascii c2). omega.

Lemma nat_of_ascii_injective: forall c1 c2,
  nat_of_ascii c1 = nat_of_ascii c2 -> c1 = c2.
  intros; simpl.
  assert (ascii_of_nat (nat_of_ascii c1) =
         ascii_of_nat (nat_of_ascii c2)).
      auto.
  repeat rewrite ascii_nat_embedding in H0.
  trivial.
Qed.

apply nat_of_ascii_injective in H.
subst.
auto.

inversion H; subst.

intuition auto.
intuition auto.
Defined.

We can give it a try:

Eval compute in (decide_lex_order "AAA" "BBB").

     = Decide_left
     : lexicographic_decision "AAA" "BBB"

Time Eval compute in (decide_lex_order "AAA" "ABB").

     = Decide_left
     : lexicographic_decision "AAA" "ABB"
Finished transaction in 4. secs (0.u,3.544461s)

Time Eval compute in (decide_lex_order "AAA" "AAB").

     = Decide_left
     : lexicographic_decision "AAA" "AAB"
Finished transaction in 9. secs (0.u,8.202753s)

Time Eval compute in (decide_lex_order "AA" "A").

     = Decide_right
     : lexicographic_decision "AA" "A"
Finished transaction in 13. secs (0.u,12.995024s)

The tactic vm_compute does not seem to improve the speed there. But anyway, the important performance-wise is the OCaml code, and it looks pretty good:

Extraction decide_lex_order.

(** val decide_lex_order : 
      string -> string -> lexicographic_decision **)

let rec decide_lex_order a b =
  match a with
  | EmptyString -> true
  | String (c1, s1) ->
    (match b with
     | EmptyString -> false
     | String (c2, s2) ->
       if lt_dec (nat_of_ascii c1) (nat_of_ascii c2)
       then true
       else if lt_dec (nat_of_ascii c2) (nat_of_ascii c1)
            then false
            else decide_lex_order s1 s2)

Conclusion

Now we have subsets ({ _ : _ | _ }) and decisions ({ _ } + { _ }); next time we will try to play with the sumor type.

blog comments powered by Disqus

Sebastien Mondet

Menu:

Site generated thanks to bracetax, camlmix, sebib, and more …
Updated on Sat, 10 Dec 2011 12:46:34 -0500.
Sebastien Mondet