File Imp
Imp: Simple Imperative Programs
Z ::= X;; Y ::= 1;; WHILE ~(Z = 0) DO Y ::= Y * Z;; Z ::= Z - 1 END
Require Import mathcomp.ssreflect.all_ssreflect.
Set Implicit Arguments.
Unset Strict Implicit.
Unset Printing Implicit Defensive.
Set Bullet Behavior "Strict Subproofs".
Arithmetic and Boolean Expressions
Module AExp.
These two definitions specify the abstract syntax of
arithmetic and boolean expressions.
Inductive aexp : Type :=
| ANum (n : nat)
| APlus (a1 a2 : aexp)
| AMinus (a1 a2 : aexp)
| AMult (a1 a2 : aexp).
Inductive bexp : Type :=
| BTrue
| BFalse
| BEq (a1 a2 : aexp)
| BLe (a1 a2 : aexp)
| BNot (b : bexp)
| BAnd (b1 b2 : bexp).
In this chapter, we'll mostly elide the translation from the
concrete syntax that a programmer would actually write to these
abstract syntax trees -- the process that, for example, would
translate the string "1 + 2 * 3" to the AST
APlus (ANum 1) (AMult (ANum 2) (ANum 3))
For comparison, here's a conventional BNF (Backus-Naur Form)
grammar defining the same abstract syntax:
Compared to the Coq version above...
It's good to be comfortable with both sorts of notations: informal
ones for communicating between humans and formal ones for carrying
out implementations and proofs.
a ::= nat | a + a | a - a | a * a b ::= true | false | a = a | a <= a | ~ b | b && b
- The BNF is more informal -- for example, it gives some
suggestions about the surface syntax of expressions (like the
fact that the addition operation is written with an infix
+) while leaving other aspects of lexical analysis and
parsing (like the relative precedence of +, -, and
*, the use of parens to group subexpressions, etc.)
unspecified. Some additional information -- and human
intelligence -- would be required to turn this description
into a formal definition, e.g., for implementing a compiler.
- Conversely, the BNF version is lighter and easier to read.
Its informality makes it flexible, a big advantage in
situations like discussions at the blackboard, where
conveying general ideas is more important than getting every
detail nailed down precisely.
Fixpoint aeval (a : aexp) : nat :=
match a with
| ANum n => n
| APlus a1 a2 => (aeval a1) + (aeval a2)
| AMinus a1 a2 => (aeval a1) - (aeval a2)
| AMult a1 a2 => (aeval a1) * (aeval a2)
end.
Example test_aeval1:
aeval (APlus (ANum 2) (ANum 2)) = 4.
Proof. done. Qed.
Similarly, evaluating a boolean expression yields a boolean.
Fixpoint beval (b : bexp) : bool :=
match b with
| BTrue => true
| BFalse => false
| BEq a1 a2 => (aeval a1) == (aeval a2)
| BLe a1 a2 => (aeval a1) <= (aeval a2)
| BNot b1 => negb (beval b1)
| BAnd b1 b2 => andb (beval b1) (beval b2)
end.
Optimization (i.e., a first compiler pass)
Fixpoint optimize_0plus (a:aexp) : aexp :=
match a with
| ANum n => ANum n
| APlus (ANum 0) e2 => optimize_0plus e2
| APlus e1 e2 => APlus (optimize_0plus e1) (optimize_0plus e2)
| AMinus e1 e2 => AMinus (optimize_0plus e1) (optimize_0plus e2)
| AMult e1 e2 => AMult (optimize_0plus e1) (optimize_0plus e2)
end.
To make sure our optimization is doing the right thing we
can test it on some examples and see if the output looks OK.
Example test_optimize_0plus:
optimize_0plus (APlus (ANum 2)
(APlus (ANum 0)
(APlus (ANum 0) (ANum 1))))
= APlus (ANum 2) (ANum 1).
Proof. done. Qed.
But if we want to be sure the optimization is correct --
i.e., that evaluating an optimized expression gives the same
result as the original -- we should prove it.
Theorem optimize_0plus_sound: forall a,
aeval (optimize_0plus a) = aeval a.
Proof.
elim => //= a1 IH1 a2 IH2.
- case: a1 IH1;
first (case => //=; by rewrite IH2);
by (move=> a1 a0 //= IH1; rewrite IH2 IH1).
- by rewrite IH1 IH2.
- by rewrite IH1 IH2.
Qed.
Evaluation as a Relation
Module aevalR_first_try.
Inductive aevalR : aexp -> nat -> Prop :=
| E_ANum n :
aevalR (ANum n) n
| E_APlus (e1 e2: aexp) (n1 n2: nat) :
aevalR e1 n1 ->
aevalR e2 n2 ->
aevalR (APlus e1 e2) (n1 + n2)
| E_AMinus (e1 e2: aexp) (n1 n2: nat) :
aevalR e1 n1 ->
aevalR e2 n2 ->
aevalR (AMinus e1 e2) (n1 - n2)
| E_AMult (e1 e2: aexp) (n1 n2: nat) :
aevalR e1 n1 ->
aevalR e2 n2 ->
aevalR (AMult e1 e2) (n1 * n2).
Module TooHardToRead.
Inductive aevalR : aexp -> nat -> Prop :=
| E_ANum n :
aevalR (ANum n) n
| E_APlus (e1 e2: aexp) (n1 n2: nat)
(H1 : aevalR e1 n1)
(H2 : aevalR e2 n2) :
aevalR (APlus e1 e2) (n1 + n2)
| E_AMinus (e1 e2: aexp) (n1 n2: nat)
(H1 : aevalR e1 n1)
(H2 : aevalR e2 n2) :
aevalR (AMinus e1 e2) (n1 - n2)
| E_AMult (e1 e2: aexp) (n1 n2: nat)
(H1 : aevalR e1 n1)
(H2 : aevalR e2 n2) :
aevalR (AMult e1 e2) (n1 * n2).
Instead, we've chosen to leave the hypotheses anonymous, just
giving their types. This style gives us less control over the
names that Coq chooses during proofs involving aevalR, but it
makes the definition itself quite a bit lighter.
End TooHardToRead.
It will be convenient to have an infix notation for
aevalR. We'll write e \\ n to mean that arithmetic expression
e evaluates to value n.
Notation "e '\\' n"
:= (aevalR e n)
(at level 50, left associativity)
: type_scope.
End aevalR_first_try.
In fact, Coq provides a way to use this notation in the
definition of aevalR itself. This reduces confusion by avoiding
situations where we're working on a proof involving statements in
the form e \\ n but we have to refer back to a definition
written using the form aevalR e n.
We do this by first "reserving" the notation, then giving the
definition together with a declaration of what the notation
means.
Reserved Notation "e '\\' n" (at level 90, left associativity).
Inductive aevalR : aexp -> nat -> Prop :=
| E_ANum (n : nat) :
(ANum n) \\ n
| E_APlus (e1 e2 : aexp) (n1 n2 : nat) :
(e1 \\ n1) -> (e2 \\ n2) -> (APlus e1 e2) \\ (n1 + n2)
| E_AMinus (e1 e2 : aexp) (n1 n2 : nat) :
(e1 \\ n1) -> (e2 \\ n2) -> (AMinus e1 e2) \\ (n1 - n2)
| E_AMult (e1 e2 : aexp) (n1 n2 : nat) :
(e1 \\ n1) -> (e2 \\ n2) -> (AMult e1 e2) \\ (n1 * n2)
where "e '\\' n" := (aevalR e n) : type_scope.
Inference Rule Notation
| E_APlus : forall (e1 e2: aexp) (n1 n2: nat), aevalR e1 n1 -> aevalR e2 n2 -> aevalR (APlus e1 e2) (n1 + n2)...would be written like this as an inference rule:
e1 \\ n1 e2 \\ n2 -------------------- (E_APlus) APlus e1 e2 \\ n1+n2
----------- (E_ANum) ANum n \\ n e1 \\ n1 e2 \\ n2 -------------------- (E_APlus) APlus e1 e2 \\ n1+n2 e1 \\ n1 e2 \\ n2 --------------------- (E_AMinus) AMinus e1 e2 \\ n1-n2 e1 \\ n1 e2 \\ n2 -------------------- (E_AMult) AMult e1 e2 \\ n1*n2
Equivalence of the Definitions
Theorem aeval_iff_aevalR : forall a n,
(a \\ n) <-> aeval a = n.
Proof.
split.
- elim => //= e1 e2 n1 n2 _ -> _ -> //=.
- elim: a n => //=;
first (by move=> n n0 ->; constructor);
move=> a1 IH1 a2 IH2 n <-;
by (constructor ; [apply: IH1 | apply: IH2]).
Qed.
End AExp.
Computational vs. Relational Definitions
Module aevalR_division.
For example, suppose that we wanted to extend the arithmetic
operations with division:
Inductive aexp : Type :=
| ANum (n : nat)
| APlus (a1 a2 : aexp)
| AMinus (a1 a2 : aexp)
| AMult (a1 a2 : aexp)
| ADiv (a1 a2 : aexp).
Extending the definition of aeval to handle this new operation
would not be straightforward (what should we return as the result
of ADiv (ANum 5) (ANum 0)?). But extending aevalR is
straightforward.
Reserved Notation "e '\\' n"
(at level 90, left associativity).
Inductive aevalR : aexp -> nat -> Prop :=
| E_ANum (n : nat) :
(ANum n) \\ n
| E_APlus (a1 a2 : aexp) (n1 n2 : nat) :
(a1 \\ n1) -> (a2 \\ n2) -> (APlus a1 a2) \\ (n1 + n2)
| E_AMinus (a1 a2 : aexp) (n1 n2 : nat) :
(a1 \\ n1) -> (a2 \\ n2) -> (AMinus a1 a2) \\ (n1 - n2)
| E_AMult (a1 a2 : aexp) (n1 n2 : nat) :
(a1 \\ n1) -> (a2 \\ n2) -> (AMult a1 a2) \\ (n1 * n2)
| E_ADiv (a1 a2 : aexp) (n1 n2 n3 : nat) :
(a1 \\ n1) -> (a2 \\ n2) -> (n2 > 0) ->
(mult n2 n3 = n1) -> (ADiv a1 a2) \\ n3
where "a '\\' n" := (aevalR a n) : type_scope.
End aevalR_division.
Module aevalR_extended.
Or suppose that we want to extend the arithmetic operations by a
nondeterministic number generator any that, when evaluated, may
yield any number. (Note that this is not the same as making a
probabilistic choice among all possible numbers -- we're not
specifying any particular probability distribution for the
results, just saying what results are possible.)
Reserved Notation "e '\\' n" (at level 90, left associativity).
Inductive aexp : Type :=
| AAny
| ANum (n : nat)
| APlus (a1 a2 : aexp)
| AMinus (a1 a2 : aexp)
| AMult (a1 a2 : aexp).
Again, extending aeval would be tricky, since now evaluation is
not a deterministic function from expressions to numbers, but
extending aevalR is no problem...
Inductive aevalR : aexp -> nat -> Prop :=
| E_Any (n : nat) :
AAny \\ n
| E_ANum (n : nat) :
(ANum n) \\ n
| E_APlus (a1 a2 : aexp) (n1 n2 : nat) :
(a1 \\ n1) -> (a2 \\ n2) -> (APlus a1 a2) \\ (n1 + n2)
| E_AMinus (a1 a2 : aexp) (n1 n2 : nat) :
(a1 \\ n1) -> (a2 \\ n2) -> (AMinus a1 a2) \\ (n1 - n2)
| E_AMult (a1 a2 : aexp) (n1 n2 : nat) :
(a1 \\ n1) -> (a2 \\ n2) -> (AMult a1 a2) \\ (n1 * n2)
where "a '\\' n" := (aevalR a n) : type_scope.
End aevalR_extended.
At this point you maybe wondering: which style should I use by
default? In the examples we've just seen, relational definitions
turned out to be more useful than functional ones. For situations
like these, where the thing being defined is not easy to express
as a function, or indeed where it is not a function, there is no
real choice. But what about when both styles are workable?
One point in favor of relational definitions is that they can be
more elegant and easier to understand.
Another is that Coq automatically generates nice inversion and
induction principles from Inductive definitions.
On the other hand, functional definitions can often be more
convenient:
Furthermore, functions can be directly "extracted" from Gallina to
executable code in OCaml or Haskell.
Ultimately, the choice often comes down to either the specifics of
a particular situation or simply a question of taste. Indeed, in
large Coq developments it is common to see a definition given in
both functional and relational styles, plus a lemma stating that
the two coincide, allowing further proofs to switch from one point
of view to the other at will.
- Functions are by definition deterministic and defined on all arguments; for a relation we have to show these properties explicitly if we need them.
- With functions we can also take advantage of Coq's computation mechanism to simplify expressions during proofs.
Expressions With Variables
States
From Coq Require Import Strings.String.
Definition state := string -> nat.
Syntax
Inductive aexp : Type :=
| ANum (n : nat)
| AId (x : string)
| APlus (a1 a2 : aexp)
| AMinus (a1 a2 : aexp)
| AMult (a1 a2 : aexp).
Defining a few variable names as notational shorthands will make
examples easier to read:
Definition W : string := "W".
Definition X : string := "X".
Definition Y : string := "Y".
Definition Z : string := "Z".
The definition of bexps is unchanged (except that it now refers
to the new aexps):
Inductive bexp : Type :=
| BTrue
| BFalse
| BEq (a1 a2 : aexp)
| BLe (a1 a2 : aexp)
| BNot (b : bexp)
| BAnd (b1 b2 : bexp).
Notations
Coercion AId : string >-> aexp.
Coercion ANum : nat >-> aexp.
Definition bool_to_bexp (b : bool) : bexp :=
if b then BTrue else BFalse.
Coercion bool_to_bexp : bool >-> bexp.
Bind Scope imp_scope with aexp.
Bind Scope imp_scope with bexp.
Delimit Scope imp_scope with imp.
Notation "x + y" := (APlus x y) (at level 50, left associativity) : imp_scope.
Notation "x - y" := (AMinus x y) (at level 50, left associativity) : imp_scope.
Notation "x * y" := (AMult x y) (at level 40, left associativity) : imp_scope.
Notation "x <= y" := (BLe x y) (at level 70, no associativity) : imp_scope.
Notation "x = y" := (BEq x y) (at level 70, no associativity) : imp_scope.
Notation "x && y" := (BAnd x y) (at level 40, left associativity) : imp_scope.
Notation "'~' b" := (BNot b) (at level 75, right associativity) : imp_scope.
Definition example_aexp := (3 + (X * 2))%imp : aexp.
Definition example_bexp := (true && ~(X <= 4))%imp : bexp.
One downside of these coercions is that they can make it a little
harder for humans to calculate the types of expressions. If you
get confused, try doing Set Printing Coercions to see exactly
what is going on.
Set Printing Coercions.
Print example_bexp.
Unset Printing Coercions.
Evaluation
Fixpoint aeval (st : state) (a : aexp) : nat :=
match a with
| ANum n => n
| AId x => st x
| APlus a1 a2 => (aeval st a1) + (aeval st a2)
| AMinus a1 a2 => (aeval st a1) - (aeval st a2)
| AMult a1 a2 => (aeval st a1) * (aeval st a2)
end.
Fixpoint beval (st : state) (b : bexp) : bool :=
match b with
| BTrue => true
| BFalse => false
| BEq a1 a2 => (aeval st a1) == (aeval st a2)
| BLe a1 a2 => (aeval st a1) <= (aeval st a2)
| BNot b1 => negb (beval st b1)
| BAnd b1 b2 => andb (beval st b1) (beval st b2)
end.
An empty state returns 0 for all variables.
Definition empty_st := fun (_ : string) => 0.
We can write a definition for extending a state with a new binding.
Definition extend_st a x (st : state) :=
(fun (a' : string) => if String.eqb a a' then x
else st a).
(fun (a' : string) => if String.eqb a a' then x
else st a).
Now we can add a notation for a "singleton state" with just one
variable bound to a value.
Notation "a '!->' x" := (extend_st a x empty_st) (at level 100).
And extending an existing state with a new variable.
Notation "a '!->' x ';' s" := (extend_st a x s)
(at level 100, x at next level, right associativity).
Example aexp1 :
aeval (X !-> 5) (3 + (X * 2))%imp
= 13.
Proof. reflexivity. Qed.
Example bexp1 :
beval (X !-> 5) (true && ~(X <= 4))%imp
= true.
Proof. reflexivity. Qed.
(at level 100, x at next level, right associativity).
Example aexp1 :
aeval (X !-> 5) (3 + (X * 2))%imp
= 13.
Proof. reflexivity. Qed.
Example bexp1 :
beval (X !-> 5) (true && ~(X <= 4))%imp
= true.
Proof. reflexivity. Qed.
Commands
Syntax
c ::= SKIP | x ::= a | c ;; c | TEST b THEN c ELSE c FI | WHILE b DO c END
Z ::= X;; Y ::= 1;; WHILE ~(Z = 0) DO Y ::= Y * Z;; Z ::= Z - 1 END
Inductive com : Type :=
| CSkip
| CAss (x : string) (a : aexp)
| CSeq (c1 c2 : com)
| CIf (b : bexp) (c1 c2 : com)
| CWhile (b : bexp) (c : com).
As for expressions, we can use a few Notation declarations to
make reading and writing Imp programs more convenient.
Bind Scope imp_scope with com.
Notation "'SKIP'" :=
CSkip : imp_scope.
Notation "x '::=' a" :=
(CAss x a) (at level 60) : imp_scope.
Notation "c1 ;; c2" :=
(CSeq c1 c2) (at level 80, right associativity) : imp_scope.
Notation "'WHILE' b 'DO' c 'END'" :=
(CWhile b c) (at level 80, right associativity) : imp_scope.
Notation "'TEST' c1 'THEN' c2 'ELSE' c3 'FI'" :=
(CIf c1 c2 c3) (at level 80, right associativity) : imp_scope.
For example, here is the factorial function again, written as a
formal definition to Coq:
Definition fact_in_coq : com :=
(Z ::= X;;
Y ::= 1;;
WHILE ~(Z = 0) DO
Y ::= Y * Z;;
Z ::= Z - 1
END)%imp.
Desugaring notations
- Unset Printing Notations (undo with Set Printing Notations)
- Set Printing Coercions (undo with Unset Printing Coercions)
- Set Printing All (undo with Unset Printing All)
Unset Printing Notations.
Print fact_in_coq.
Set Printing Notations.
Set Printing Coercions.
Print fact_in_coq.
Unset Printing Coercions.
Finding notations
Locate "&&".
Locate ";;".
Locate "WHILE".
Locate ";;".
Locate "WHILE".
Finding identifiers
Locate aexp.
Definition plus2 : com :=
X ::= X + 2.
Definition XtimesYinZ : com :=
Z ::= X * Y.
Definition subtract_slowly_body : com :=
Z ::= Z - 1 ;;
X ::= X - 1.
Definition subtract_slowly : com :=
(WHILE ~(X = 0) DO
subtract_slowly_body
END)%imp.
Definition subtract_3_from_5_slowly : com :=
X ::= 3 ;;
Z ::= 5 ;;
subtract_slowly.
Definition loop : com :=
WHILE true DO
SKIP
END.
Evaluating Commands
Evaluation as a Function (Failed Attempt)
Open Scope imp_scope.
Fixpoint ceval_fun_no_while (st : state) (c : com)
: state :=
match c with
| SKIP =>
st
| x ::= a1 =>
(x !-> (aeval st a1) ; st)
| c1 ;; c2 =>
let st' := ceval_fun_no_while st c1 in
ceval_fun_no_while st' c2
| TEST b THEN c1 ELSE c2 FI =>
if (beval st b)
then ceval_fun_no_while st c1
else ceval_fun_no_while st c2
| WHILE b DO c END =>
st
end.
Close Scope imp_scope.
Fixpoint ceval_fun_no_while (st : state) (c : com)
: state :=
match c with
| SKIP =>
st
| x ::= a1 =>
(x !-> (aeval st a1) ; st)
| c1 ;; c2 =>
let st' := ceval_fun_no_while st c1 in
ceval_fun_no_while st' c2
| TEST b THEN c1 ELSE c2 FI =>
if (beval st b)
then ceval_fun_no_while st c1
else ceval_fun_no_while st c2
| WHILE b DO c END =>
st
end.
Close Scope imp_scope.
In a traditional functional programming language like OCaml or
Haskell we could add the WHILE case as follows:
Coq doesn't accept such a definition ("Error: Cannot guess
decreasing argument of fix") because the function we want to
define is not guaranteed to terminate. Indeed, it doesn't always
terminate: for example, the full version of the ceval_fun
function applied to the loop program above would never
terminate. Since Coq is not just a functional programming
language but also a consistent logic, any potentially
non-terminating function needs to be rejected. Here is
an (invalid!) program showing what would go wrong if Coq
allowed non-terminating recursive functions:
That is, propositions like False would become provable
(loop_false 0 would be a proof of False), which
would be a disaster for Coq's logical consistency.
Thus, because it doesn't terminate on all inputs,
of ceval_fun cannot be written in Coq -- at least not without
additional tricks and workarounds.
Fixpoint ceval_fun (st : state) (c : com) : state := match c with ... | WHILE b DO c END => if (beval st b) then ceval_fun st (c ;; WHILE b DO c END) else st end.
Fixpoint loop_false (n : nat) : False := loop_false n.
Evaluation as a Relation
Operational Semantics
----------------- (E_Skip) st =[ SKIP ]=> st aeval st a1 = n -------------------------------- (E_Ass) st =[ x := a1 ]=> (x !-> n ; st) st =[ c1 ]=> st' st' =[ c2 ]=> st'' --------------------- (E_Seq) st =[ c1;;c2 ]=> st'' beval st b1 = true st =[ c1 ]=> st' --------------------------------------- (E_IfTrue) st =[ TEST b1 THEN c1 ELSE c2 FI ]=> st' beval st b1 = false st =[ c2 ]=> st' --------------------------------------- (E_IfFalse) st =[ TEST b1 THEN c1 ELSE c2 FI ]=> st' beval st b = false ----------------------------- (E_WhileFalse) st =[ WHILE b DO c END ]=> st beval st b = true st =[ c ]=> st' st' =[ WHILE b DO c END ]=> st'' -------------------------------- (E_WhileTrue) st =[ WHILE b DO c END ]=> st''
Reserved Notation "st '=[' c ']=>' st'"
(at level 40).
Inductive ceval : com -> state -> state -> Prop :=
| E_Skip : forall st,
st =[ SKIP ]=> st
| E_Ass : forall st a1 n x,
aeval st a1 = n ->
st =[ x ::= a1 ]=> (x !-> n ; st)
| E_Seq : forall c1 c2 st st' st'',
st =[ c1 ]=> st' ->
st' =[ c2 ]=> st'' ->
st =[ c1 ;; c2 ]=> st''
| E_IfTrue : forall st st' b c1 c2,
beval st b = true ->
st =[ c1 ]=> st' ->
st =[ TEST b THEN c1 ELSE c2 FI ]=> st'
| E_IfFalse : forall st st' b c1 c2,
beval st b = false ->
st =[ c2 ]=> st' ->
st =[ TEST b THEN c1 ELSE c2 FI ]=> st'
| E_WhileFalse : forall b st c,
beval st b = false ->
st =[ WHILE b DO c END ]=> st
| E_WhileTrue : forall st st' st'' b c,
beval st b = true ->
st =[ c ]=> st' ->
st' =[ WHILE b DO c END ]=> st'' ->
st =[ WHILE b DO c END ]=> st''
where "st =[ c ]=> st'" := (ceval c st st').
The cost of defining evaluation as a relation instead of a function is that
we now need to construct proofs that some program evaluates to some result
state, rather than just letting Coq's computation mechanism do it for us.
This will also significantly impact the structure of proofs about
evaluation.
Determinism of Evaluation
Theorem ceval_deterministic: forall c st st1 st2,
st =[ c ]=> st1 ->
st =[ c ]=> st2 ->
st1 = st2.
Admitted.