Lecture notes 20190326

Denotational Semantics 1

Remark. Some material in this lecture is from Software Foundations, volumes 1 and 2.

Require Import PL.Imp6.

Review: Denotational Semantics of Program Expressions

We have learned how to define the denotational semantics of integer expressions. We can define it as a recursive function in Coq. In this definition, state is the set of program states, and every program state is a function from program variables to integer values, i.e. var → Z.

Fixpoint aeval (a : aexp) (st : state) : Z :=
  match a with
  | ANum n ⇒ n
  | AId X ⇒ st X (* <----- the value of X on program state st *)
  | APlus a₁ a₂ ⇒ (aeval a₁ st) + (aeval a₂ st)
  | AMinus a₁ a₂ ⇒ (aeval a₁ st) - (aeval a₂ st)
  | AMult a₁ a₂ ⇒ (aeval a₁ st) * (aeval a₂ st)
  end.

This time, we swap the order of the two arguments. As a result, aeval can be read either as a two-argument function or as a one-argument function which maps integer expressions to functions from program states to integers. In other words, the denotation of an integer expression a is a function from program states to integer values. To make this explicit, we can redefine it as follows.

Definition dadd (d₁ d₂: state → Z): state → Z :=
fun st ⇒ d₁ st + d₂ st.

Here, we define dadd to be the pointwise sum of two functions. In Coq, fun st ⇒ ... represents a function which takes st as its argument; the expression to the right of ⇒ is the function's value at this specific argument st. In summary, dadd is a function which takes two arguments. Both arguments and the result are themselves functions, and the result is defined using fun st ⇒ ... in Coq.

Definition dsub (d₁ d₂: state → Z): state → Z :=
fun st ⇒ d₁ st - d₂ st.

Definition dmul (d₁ d₂: state → Z): state → Z :=
fun st ⇒ d₁ st * d₂ st.

Similarly, we can define the subtraction and multiplication of two functions. Then, the denotation of an aexp can be defined as:
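The definition of aeval' itself does not appear in these notes at this point. A reconstruction that is consistent with the proof script below (which unfolds dadd, dsub and dmul in the three operator cases) might look like the following sketch. To keep it self-contained, it declares its own var, state and aexp; taking var to be nat is an assumption, and PL.Imp6 may define these differently.

```coq
Require Import ZArith.
Open Scope Z_scope.

(* Assumption: variables are modelled as natural numbers. *)
Definition var := nat.
Definition state := var -> Z.

Inductive aexp : Type :=
  | ANum (n : Z)
  | AId (X : var)
  | APlus (a1 a2 : aexp)
  | AMinus (a1 a2 : aexp)
  | AMult (a1 a2 : aexp).

(* Pointwise operations on denotations, as in the notes. *)
Definition dadd (d1 d2 : state -> Z) : state -> Z :=
  fun st => d1 st + d2 st.
Definition dsub (d1 d2 : state -> Z) : state -> Z :=
  fun st => d1 st - d2 st.
Definition dmul (d1 d2 : state -> Z) : state -> Z :=
  fun st => d1 st * d2 st.

(* Reconstructed: the denotation is built from the denotations
   of the subexpressions, without mentioning the state in the
   operator cases. *)
Fixpoint aeval' (a : aexp) : state -> Z :=
  match a with
  | ANum n => fun st => n
  | AId X => fun st => st X
  | APlus a1 a2 => dadd (aeval' a1) (aeval' a2)
  | AMinus a1 a2 => dsub (aeval' a1) (aeval' a2)
  | AMult a1 a2 => dmul (aeval' a1) (aeval' a2)
  end.

(* Sanity check on 1 + 2 * 3 in the all-zero state. *)
Example aeval_prime_ex :
  aeval' (APlus (ANum 1) (AMult (ANum 2) (ANum 3))) (fun _ => 0) = 7.
Proof. reflexivity. Qed.
```

With this shape, the only place where a state is mentioned explicitly is in the two leaf cases; the operator cases combine functions rather than numbers.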

These two definitions aeval and aeval' are obviously equivalent. We can prove it in Coq.

Fact aeval_aeval': ∀a st, aeval a st = aeval' a st.
Proof.
intros.

To prove such a property, we would like to do induction over the syntax tree of a.

  induction a.
  + simpl.
    reflexivity.
  + simpl.
    reflexivity.
  + simpl.
    rewrite IHa1.
    rewrite IHa2.
    unfold dadd.
    reflexivity.
  + simpl.
    rewrite IHa1.
    rewrite IHa2.
    unfold dsub.
    reflexivity.
  + simpl.
    rewrite IHa1.
    rewrite IHa2.
    unfold dmul.
    reflexivity.
Qed.

Expression Equivalence

Based on this definition, we can define behavioral equivalence between two aexps.

Definition aexp_dequiv (d₁ d₂: state → Z): Prop :=
∀st, d₁ st = d₂ st.

Definition aexp_equiv (a₁ a₂: aexp): Prop :=
aexp_dequiv (aeval a₁) (aeval a₂).

Definition is another keyword for writing definitions in Coq, besides Fixpoint. It can be used to define functions and/or values, but recursion is not allowed in such definitions.

Here, we define that two aexps are behaviorally equivalent if they evaluate to the same result in every state. We first define the equivalence relation between two denotations (which are functions from program states to integers). Then, we define that two integer expressions are equivalent if their denotations are equivalent. Here are some examples.

Module aexp_equiv_example.
Import Abstract_Pretty_Printing.

Example ex₁: ∀(X: var), aexp_equiv (X - X) 0.
Proof.
  unfold aexp_equiv, aexp_dequiv.
  intros.
  simpl.
  omega.
Qed.

Example ex₂: ∀(X: var), aexp_equiv (X + X) (X * 2).
Proof.
  unfold aexp_equiv, aexp_dequiv.
  intros.
  simpl.
  omega.
Qed.

End aexp_equiv_example.

The Constant-Folding Transformation

An expression is constant when it contains no variable references. Constant folding is an optimization that finds constant expressions and replaces them by their values.

Fixpoint fold_constants_aexp (a : aexp) : aexp :=
  match a with
  | ANum n ⇒ ANum n
  | AId x ⇒ AId x
  | APlus a₁ a₂ ⇒
    match fold_constants_aexp a₁, fold_constants_aexp a₂ with
    | ANum n₁, ANum n₂ ⇒ ANum (n₁ + n₂)
    | _, _ ⇒ APlus (fold_constants_aexp a₁) (fold_constants_aexp a₂)
    end
  | AMinus a₁ a₂ ⇒
    match fold_constants_aexp a₁, fold_constants_aexp a₂ with
    | ANum n₁, ANum n₂ ⇒ ANum (n₁ - n₂)
    | _, _ ⇒ AMinus (fold_constants_aexp a₁) (fold_constants_aexp a₂)
    end
  | AMult a₁ a₂ ⇒
    match fold_constants_aexp a₁, fold_constants_aexp a₂ with
    | ANum n₁, ANum n₂ ⇒ ANum (n₁ * n₂)
    | _, _ ⇒ AMult (fold_constants_aexp a₁) (fold_constants_aexp a₂)
    end
  end.

Here, we see that the match expressions in Coq are very flexible. (1) We can apply pattern matching on not only Coq variables but also any Coq expression whose type is inductively defined. (2) We can apply pattern matching on two expressions at the same time. (3) We can use underscore _ to cover default cases.

Module fold_const_example.
Import Abstract_Pretty_Printing.

Example ex₁ : ∀(X: var),
fold_constants_aexp ((1 + 2) * X)
= (3 * X)%imp.
Proof. intros. reflexivity. Qed.

Note that this version of constant folding doesn't eliminate trivial additions, etc. — we are focusing attention on a single optimization for the sake of simplicity. It is not hard to incorporate other ways of simplifying expressions; the definitions and proofs just get longer.

Example fold_aexp_ex₂ : ∀(X: var) (Y: var),
fold_constants_aexp (X - ((0 * 6) + Y))%imp = (X - (0 + Y))%imp.
Proof. intros. reflexivity. Qed.

End fold_const_example.

Soundness of Constant Folding

Now we need to show that what we've done is correct.

Theorem fold_constants_aexp_sound : ∀a,
aexp_equiv a (fold_constants_aexp a).
Proof.
unfold aexp_equiv, aexp_dequiv. intros.

To prove such a property, we would like to do induction over the syntax tree of a.

  induction a.
  + (* ANum case *)
    simpl.
    reflexivity.
  + (* AId case *)
    simpl.
    reflexivity.
  + (* APlus case *)
    simpl.

Here, we have to do case analysis on fold_constants_aexp a₁ and fold_constants_aexp a₂. The tactic destruct can be used to accomplish this. The following line turns the current proof goal into five goals, one for each possible form of fold_constants_aexp a₁. The eqn clause at the end indicates that an extra assumption Ha₁ will be generated, stating what fold_constants_aexp a₁ is equal to in that case.

    destruct (fold_constants_aexp a₁) eqn:Ha₁.
    - (** In this first case, Ha₁ says that fold_constants_aexp a₁ is a
       constant. In this case, we need another destruct for a₂. *)
      destruct (fold_constants_aexp a₂) eqn:Ha₂.
      * rewrite IHa1.
        rewrite IHa2.
        simpl.
        reflexivity.
      * rewrite IHa1.
        rewrite IHa2.
        simpl.
        reflexivity.
      * rewrite IHa1.
        rewrite IHa2.
        simpl.
        reflexivity.
      * rewrite IHa1.
        rewrite IHa2.
        simpl.
        reflexivity.
      * rewrite IHa1.
        rewrite IHa2.
        simpl.
        reflexivity.
    - rewrite IHa1.
      rewrite IHa2.
      simpl.
      reflexivity.
    - rewrite IHa1.
      rewrite IHa2.
      simpl.
      reflexivity.
    - rewrite IHa1.
      rewrite IHa2.
      simpl.
      reflexivity.
    - rewrite IHa1.
      rewrite IHa2.
      simpl.
      reflexivity.
  + (* AMinus case *)
    simpl.

In the previous case, we duplicated our proof script many times. To avoid this, the tactic language in Coq provides the semicolon ;. Specifically, tac1 ; tac2 says: do tac1, then do tac2 in every single proof goal that tac1 generates. Semicolon is left associative: tac1 ; tac2 ; tac3 is parsed as (tac1 ; tac2) ; tac3, so tac3 is run on every goal left by tac1 ; tac2.

    destruct (fold_constants_aexp a₁) eqn:Ha₁;
    destruct (fold_constants_aexp a₂) eqn:Ha₂;
    rewrite IHa1;
    rewrite IHa2;
    reflexivity.
  + (* AMult case *)
    simpl.
    destruct (fold_constants_aexp a₁) eqn:Ha₁;
    destruct (fold_constants_aexp a₂) eqn:Ha₂;
    rewrite IHa1;
    rewrite IHa2;
    reflexivity.
Qed.
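The effect of the semicolon used in the AMinus and AMult cases can be seen in isolation on a much smaller goal. The following sketch proves the same (illustrative, not from the course files) fact about booleans twice: once with one bullet per generated goal, and once with semicolons.

```coq
(* Long form: destruct creates goals that are closed one by one. *)
Lemma andb_comm_long : forall b c : bool, andb b c = andb c b.
Proof.
  intros b c.
  destruct b.
  - destruct c.
    + reflexivity.
    + reflexivity.
  - destruct c.
    + reflexivity.
    + reflexivity.
Qed.

(* Short form: "destruct c" runs in both goals produced by
   "destruct b", and "reflexivity" runs in all four resulting goals. *)
Lemma andb_comm_short : forall b c : bool, andb b c = andb c b.
Proof.
  intros b c.
  destruct b; destruct c; reflexivity.
Qed.
```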

After proving this soundness property, we want to prove that this optimization really improves something. To state this property formally, we first need to count the calculation steps involved in evaluating a program expression; this count is the function cal_count used below.
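The definition of cal_count is not reproduced in these notes. One plausible definition, given here only as a hedged sketch, charges one step per arithmetic operator and nothing for constants and variable reads; the course file may use a different cost model (for example, charging for variable reads). The sketch declares its own aexp, with var assumed to be nat, so that it compiles on its own.

```coq
Require Import ZArith.

(* Assumption: variables are modelled as natural numbers. *)
Definition var := nat.

Inductive aexp : Type :=
  | ANum (n : Z)
  | AId (X : var)
  | APlus (a1 a2 : aexp)
  | AMinus (a1 a2 : aexp)
  | AMult (a1 a2 : aexp).

(* Assumed cost model: every operator node costs one step. *)
Fixpoint cal_count (a : aexp) : nat :=
  match a with
  | ANum _ => O
  | AId _ => O
  | APlus a1 a2 => cal_count a1 + cal_count a2 + 1
  | AMinus a1 a2 => cal_count a1 + cal_count a2 + 1
  | AMult a1 a2 => cal_count a1 + cal_count a2 + 1
  end.

(* (1 + 2) * X contains two operators. *)
Example cal_count_ex :
  cal_count (AMult (APlus (ANum 1%Z) (ANum 2%Z)) (AId 0)) = 2.
Proof. reflexivity. Qed.
```

Under this model, folding (1 + 2) * X to 3 * X lowers the count from 2 to 1, which is the kind of improvement the lemma below bounds.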

Now, we can prove that an optimized expression never takes more calculation steps than the original one.

Lemma fold_constants_aexp_improve : ∀a,
  cal_count (fold_constants_aexp a) ≤ cal_count a.
Proof.
  intros.
  induction a.
  + simpl.
    omega.
  + simpl.
    omega.
  + simpl.
    destruct (fold_constants_aexp a₁) eqn:Ha₁;
    destruct (fold_constants_aexp a₂) eqn:Ha₂;
    simpl in IHa1;
    simpl in IHa2;
    simpl;
    omega.
  + simpl.
    destruct (fold_constants_aexp a₁) eqn:Ha₁;
    destruct (fold_constants_aexp a₂) eqn:Ha₂;
    simpl in IHa1;
    simpl in IHa2;
    simpl;
    omega.
  + simpl.
    destruct (fold_constants_aexp a₁) eqn:Ha₁;
    destruct (fold_constants_aexp a₂) eqn:Ha₂;
    simpl in IHa1;
    simpl in IHa2;
    simpl;
    omega.
Qed.

Evaluating Commands

Next we need to define what it means to evaluate a command. One idea is to define evaluation as a function from a beginning state and a command to an ending state. But WHILE loops do not necessarily terminate, so such an evaluation function cannot be total; it must be a partial function. Although such a definition is fine in theory, computer scientists usually avoid it because it is less extensible. Also, writing it in Coq is nontrivial.

Usually, computer scientists use a set of state pairs S to represent a program c's denotation. Specifically, if a program state pair (st₁, st₂) is an element of S, then executing c from state st₁ may terminate with state st₂. Remark: this is different from Hoare triples. Hoare triples are about pairs of assertions, but a program's denotation is about pairs of program states. In other words, the denotation of a program has type state → state → Prop in Coq. As we did for the denotations of integer expressions, we can define equivalence of command denotations as follows.

Definition com_dequiv (d₁ d₂: state → state → Prop): Prop :=
∀st₁ st₂, d₁ st₁ st₂ ↔ d₂ st₁ st₂.

A set of program state pairs is also called a binary relation between program states. In Coq, we can use state → state → Prop to represent this type. As preparation, we first define some basic concepts about relations.

Module Relation_Operators.

Definition id {A: Type}: A → A → Prop := fun a b ⇒ a = b.

Definition empty {A B: Type}: A → B → Prop := fun a b ⇒ False.

Definition concat {A B C: Type} (r₁: A → B → Prop) (r₂: B → C → Prop): A → C → Prop :=
fun a c ⇒ ∃b, r₁ a b ∧ r₂ b c.

Definition filter1 {A B: Type} (f: A → Prop): A → B → Prop :=
fun a b ⇒ f a.

Definition filter2 {A B: Type} (f: B → Prop): A → B → Prop :=
fun a b ⇒ f b.

Definition union {A B: Type} (r₁ r₂: A → B → Prop): A → B → Prop :=
fun a b ⇒ r₁ a b ∨ r₂ a b.

Definition intersection {A B: Type} (r₁ r₂: A → B → Prop): A → B → Prop :=
fun a b ⇒ r₁ a b ∧ r₂ a b.

In these definitions, we sometimes use braces "{}" instead of parentheses "()". Arguments declared in braces are called implicit arguments, i.e. you do not need to write them when you use the function; Coq infers them from the other arguments.

End Relation_Operators.

Import Relation_Operators.
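As a small illustration of these operators and of implicit arguments, the following self-contained sketch repeats the definition of concat and applies it to a hypothetical relation succ_rel on natural numbers (succ_rel is not part of the course files).

```coq
(* Local copy of concat from the notes, so that the sketch
   compiles on its own. *)
Definition concat {A B C : Type}
  (r1 : A -> B -> Prop) (r2 : B -> C -> Prop) : A -> C -> Prop :=
  fun a c => exists b, r1 a b /\ r2 b c.

(* Hypothetical example relation: b is the successor of a. *)
Definition succ_rel : nat -> nat -> Prop := fun a b => b = S a.

(* The implicit arguments A, B, C are inferred to be nat. *)
Example concat_succ : forall n, concat succ_rel succ_rel n (S (S n)).
Proof.
  intros n.
  unfold concat, succ_rel.
  exists (S n).            (* the intermediate element b *)
  split.
  - reflexivity.
  - reflexivity.
Qed.

(* The same term with all implicit arguments written out. *)
Check (@concat nat nat nat succ_rel succ_rel).
```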

Arguments beval b st: simpl never.

Fixpoint loop_free_ceval (c: com): state → state → Prop :=
  match c with
  | CSkip ⇒ id
  | CAss X E ⇒
                fun st₁ st₂ ⇒ st₂ X = aeval E st₁ ∧
                               ∀Y, X ≠ Y → st₁ Y = st₂ Y
  | CSeq c₁ c₂ ⇒
                  concat
                    (loop_free_ceval c₁) (loop_free_ceval c₂)
  | CIf b c₁ c₂ ⇒
                   union
                     (intersection
                       (loop_free_ceval c₁)
                       (filter1 (beval b)))
                     (intersection
                       (loop_free_ceval c₂)
                       (filter1 (beval (BNot b))))
  | CWhile _ _ ⇒ empty
  end.

Using these basic definitions about relations, we can easily define the denotations of empty commands (CSkip), assignment commands, sequential composition and if-then-else commands. We hope that the evaluation function ceval has the following property:

    ceval (CWhile b c) =
      union
        (intersection
          (concat (ceval c)
            (ceval (CWhile b c)))
          (filter1 (beval b)))
        (intersection
          id
          (filter1 (beval (BNot b))))

But it is not obvious how to find a definition that satisfies this property. We define it using the Bourbaki-Witt fixpoint theorem.

Here, we first demonstrate a concrete semantic definition which fulfills our requirement. Afterwards, we will introduce the general theory of the Bourbaki-Witt fixpoint.

The following recursive function defines the semantics of executing the loop body exactly n times. In Coq, nat represents natural numbers. Coq users can define functions by recursion on natural numbers just as they do on lists. Specifically, a natural number n is either zero, written O (the capital letter "O"), or the successor of another natural number n', written S n'.

Fixpoint iter_loop_body (b: bexp)
                         (loop_body: state → state → Prop)
                         (n: nat): state → state → Prop :=
  match n with
  | O ⇒
         intersection
           id
           (filter1 (beval (BNot b)))
  | S n' ⇒
            intersection
              (concat
                loop_body
                (iter_loop_body b loop_body n'))
              (filter1 (beval b))
  end.

In short, it says iter_loop_body b loop_body n is defined as:

if n = 0, the identity relation, restricted to states in which b is not true;
if n = n' + 1, first do loop_body and then do iter_loop_body b loop_body n', restricted to beginning states in which b is true.

The union of these binary relations over all n is exactly the meaning of the while loop. The following relation operator omega_union defines the union of countably many relations.

Module Relation_Operators2.

Definition omega_union {A: Type} (rs: nat → A → A → Prop): A → A → Prop :=
fun st₁ st₂ ⇒ ∃n, rs n st₁ st₂.

End Relation_Operators2.

Import Relation_Operators2.

Definition loop_sem (b: bexp) (loop_body: state → state → Prop):
state → state → Prop :=
omega_union (iter_loop_body b loop_body).
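To see how the index n counts loop iterations, here is a self-contained toy version of the same construction. It is a simplified sketch, not the course's definitions: states are plain natural numbers, the loop test is a predicate on states, and the names rel, iter, loop, nonzero and dec are all hypothetical. The example shows that the loop "while s is nonzero, decrement s" relates the state 2 to the state 0, with witness n = 2.

```coq
Definition rel := nat -> nat -> Prop.

(* Same shape as iter_loop_body, with the relation operators inlined. *)
Fixpoint iter (test : nat -> Prop) (body : rel) (n : nat) : rel :=
  match n with
  | O => fun s1 s2 => s1 = s2 /\ ~ test s1
  | S n' => fun s1 s2 =>
      (exists s, body s1 s /\ iter test body n' s s2) /\ test s1
  end.

(* Same shape as loop_sem: some number of iterations works. *)
Definition loop (test : nat -> Prop) (body : rel) : rel :=
  fun s1 s2 => exists n, iter test body n s1 s2.

Definition nonzero : nat -> Prop := fun s => s <> 0.
Definition dec : rel := fun s1 s2 => s1 = S s2.

Example loop_2_0 : loop nonzero dec 2 0.
Proof.
  unfold loop.
  exists 2.                       (* the body runs exactly twice *)
  simpl.
  unfold dec, nonzero.
  split.
  - exists 1. split.              (* state after the first iteration *)
    + reflexivity.
    + split.
      * exists 0. split.          (* state after the second iteration *)
        -- reflexivity.
        -- split.
           ++ reflexivity.
           ++ intros H. apply H. reflexivity.   (* test fails at 0 *)
      * discriminate.             (* test holds at 1 *)
  - discriminate.                 (* test holds at 2 *)
Qed.
```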

And we can prove that this definition satisfies the recursive equation that we want.

Theorem loop_recur: ∀b loop_body,
  com_dequiv
    (loop_sem b loop_body)
    (union
      (intersection
        (concat loop_body
          (loop_sem b loop_body))
        (filter1 (beval b)))
      (intersection
        id
        (filter1 (beval (BNot b))))).
Proof.
  intros.
  unfold com_dequiv.
  intros.
  split.
  + intros.
    unfold loop_sem, omega_union in H.
    unfold union.
    destruct H as [n H].

Now we need to do case analysis over whether n is zero or not.

    destruct n as [| n'].
    - right.
      simpl in H.
      exact H.
    - left.
      simpl in H.
      unfold concat, intersection in H.
      unfold concat, intersection.
      destruct H as [[st' [? ?]] ?].
      split.
      * ∃st'.
        split.
        { exact H. }
        unfold loop_sem, omega_union.
        ∃n'.
        exact H₀.
      * exact H₁.
  + intros.
    unfold loop_sem, omega_union.
    unfold union in H.
    destruct H.
    - unfold intersection, concat in H.
      destruct H as [[st' [? ?]] ?].
      unfold loop_sem, omega_union in H₀.
      destruct H₀ as [n ?].
      ∃(S n).
      simpl.
      unfold intersection, concat.
      split.
      * ∃st'.
        split.
        { exact H. }
        { exact H₀. }
      * exact H₁.
    - ∃O.
      simpl.
      exact H.
Qed.

With loop_sem defined, we are finally ready to complete our definition of ceval.

Fixpoint ceval (c: com): state → state → Prop :=
  match c with
  | CSkip ⇒ id
  | CAss X E ⇒
                fun st₁ st₂ ⇒ st₂ X = aeval E st₁ ∧
                               ∀Y, X ≠ Y → st₁ Y = st₂ Y
  | CSeq c₁ c₂ ⇒
                  concat (ceval c₁) (ceval c₂)
  | CIf b c₁ c₂ ⇒
                   union
                     (intersection
                       (ceval c₁) (filter1 (beval b)))
                     (intersection
                       (ceval c₂) (filter1 (beval (BNot b))))
  | CWhile b c ⇒
                  loop_sem b (ceval c)
  end.

Bourbaki-Witt Theorem

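So far, we have successfully defined a fixpoint construction loop_sem which satisfies the recursive equation loop_recur. It is actually a special case of the Bourbaki-Witt fixpoint theorem (the version stated below, for monotonic continuous functions, is more commonly known as the Kleene fixed-point theorem).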

Partial Order

A partial order on a set A is a binary relation R (usually written as ≤) which is reflexive, transitive, and antisymmetric. Formally,

    ∀x: A, x ≤ x;
    ∀x y z: A, x ≤ y → y ≤ z → x ≤ z;
    ∀x y: A, x ≤ y → y ≤ x → x = y.
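These three laws can be packaged as a Coq record. The following is a minimal sketch; the record partial_order and its field names are hypothetical and not part of the course files. As an instance, it checks that the usual order on nat satisfies the laws, using the lia tactic from the standard library.

```coq
Require Import Arith Lia.

(* The three laws of a partial order on A, for a relation R. *)
Record partial_order {A : Type} (R : A -> A -> Prop) : Prop := {
  po_refl : forall x, R x x;
  po_trans : forall x y z, R x y -> R y z -> R x z;
  po_antisym : forall x y, R x y -> R y x -> x = y
}.

(* The standard order on natural numbers is a partial order. *)
Lemma nat_le_partial_order : partial_order le.
Proof.
  constructor.
  - intros x. lia.
  - intros x y z Hxy Hyz. lia.
  - intros x y Hxy Hyx. lia.
Qed.
```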

The least element of A w.r.t. a partial order ≤ is also called bottom:

∀x: A, bot ≤ x

Chain

A subset of elements in A is called a chain w.r.t. a partial order ≤ if any two elements in this subset are comparable. For example, if a sequence xs: nat → A is monotonically increasing:

∀n: nat, xs n ≤ xs (n + 1),

then it forms a chain.
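The claim that an increasing sequence forms a chain can be checked in Coq. The sketch below works with an abstract reflexive and transitive relation ord (a hypothetical name) and writes S n for n + 1. It first shows that every element is below all later ones, and then that any two elements are comparable.

```coq
Require Import Arith Lia.

Section IncreasingSequence.
  Variable A : Type.
  Variable ord : A -> A -> Prop.
  Hypothesis ord_refl : forall x, ord x x.
  Hypothesis ord_trans : forall x y z, ord x y -> ord y z -> ord x z.

  Variable xs : nat -> A.
  Hypothesis xs_incr : forall n, ord (xs n) (xs (S n)).

  (* xs n is below every element k positions later. *)
  Lemma xs_le_later : forall n k, ord (xs n) (xs (k + n)).
  Proof.
    intros n k.
    induction k as [| k IH].
    - simpl. apply ord_refl.
    - simpl. apply ord_trans with (y := xs (k + n)).
      + exact IH.
      + apply xs_incr.
  Qed.

  (* Hence any two elements of the sequence are comparable. *)
  Lemma xs_comparable :
    forall n m, ord (xs n) (xs m) \/ ord (xs m) (xs n).
  Proof.
    intros n m.
    destruct (Nat.le_ge_cases n m) as [H | H].
    - left. replace m with ((m - n) + n) by lia. apply xs_le_later.
    - right. replace n with ((n - m) + m) by lia. apply xs_le_later.
  Qed.
End IncreasingSequence.
```

Antisymmetry is not needed for this argument; only reflexivity and transitivity are used.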

A partial order ≤ is called complete if every chain has a least upper bound lub and a greatest lower bound glb. In that case, the set A (together with the order ≤) is called a complete partial order, CPO. Some textbooks require chains to be nonempty. We do not put such a restriction on the definition of a chain here. Thus, the empty set is a chain. Its least upper bound is the least element of A, in other words, bot.

Monotonic and Continuous Functions

Given two CPOs (A, ≤_A) and (B, ≤_B), a function F: A → B is called monotonic if it preserves order. Formally,

∀x y: A, x ≤_A y → F(x) ≤_B F(y).

A function F: A → B is called continuous if it preserves least upper bounds of chains. Formally,

∀xs: chain(A), lub(F(xs)) = F(lub(xs))

Here, the lub on the left hand side is the least upper bound in B, and the one on the right hand side is the least upper bound in A.

The definition of continuity does not require the preservation of glb because CPOs are usually ordered so that larger elements are more defined.

Least Fixpoint

Given a CPO A with least element bot and a function F: A → A, we can always construct a sequence of elements as follows:

bot, F(bot), F(F(bot)), F(F(F(bot))), ...

Obviously, bot ≤ F(bot) is true due to the definition of bot. If F is monotonic, it immediately follows that F(bot) ≤ F(F(bot)). Similarly,

F(F(bot)) ≤ F(F(F(bot))), F(F(F(bot))) ≤ F(F(F(F(bot)))) ...

In other words, if F is monotonic, this sequence is a chain.
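The induction behind this chain argument can be written out in Coq. In the following sketch, ord, bot, F and iterF are hypothetical names for the order, its least element, the monotonic function and the n-fold iterate of F on bot.

```coq
Section IterateChain.
  Variable A : Type.
  Variable ord : A -> A -> Prop.
  Variable bot : A.
  Hypothesis bot_least : forall x, ord bot x.

  Variable F : A -> A.
  Hypothesis F_mono : forall x y, ord x y -> ord (F x) (F y).

  (* iterF n is F applied n times to bot. *)
  Fixpoint iterF (n : nat) : A :=
    match n with
    | O => bot
    | S n' => F (iterF n')
    end.

  (* Each element of the sequence is below the next one. *)
  Lemma iterF_incr : forall n, ord (iterF n) (iterF (S n)).
  Proof.
    induction n as [| n IH].
    - simpl. apply bot_least.        (* bot is below F bot *)
    - simpl. apply F_mono. exact IH. (* apply F to both sides of IH *)
  Qed.
End IterateChain.
```

The base case uses only that bot is the least element, and the inductive step uses only monotonicity, exactly as in the informal argument.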

Main theorem: given a CPO A with a least element bot, every monotonic continuous function F: A → A has a fixpoint, and the least fixpoint of F is:

lub [bot, F(bot), F(F(bot)), F(F(F(bot))), ...].

Proof.

On one hand, this least upper bound is a fixpoint:

    F (lub [bot, F(bot), F(F(bot)), F(F(F(bot))), ...]) =
    lub [F(bot), F(F(bot)), F(F(F(bot))), F(F(F(F(bot)))), ...] =
    lub [bot, F(bot), F(F(bot)), F(F(F(bot))), ...].

The first equality is true because F is continuous. The second equality is true because bot is less than or equal to every other element of the sequence, so adding it to the front does not change the least upper bound.

On the other hand, this fixpoint is the least one. Let x be any fixpoint; in other words, suppose F(x) = x. Then,

bot ≤ x

Thus,

F(bot) ≤ F(x) = x

due to the fact that F is monotonic and x is a fixpoint. And so on,

F(F(bot)) ≤ x, F(F(F(bot))) ≤ x, F(F(F(F(bot)))) ≤ x, ...

That means x is an upper bound of bot, F(bot), F(F(bot)), .... Hence it must be greater than or equal to

lub [bot, F(bot), F(F(bot)), F(F(F(bot))), ...].

QED.
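The second half of this proof, that every fixpoint is an upper bound of the sequence, is again a short induction. The sketch below uses the hypothetical names ord, bot, F and iterF, and leaves out least upper bounds, which would require formalizing chains and lub first.

```coq
Section BelowFixpoint.
  Variable A : Type.
  Variable ord : A -> A -> Prop.
  Variable bot : A.
  Hypothesis bot_least : forall x, ord bot x.

  Variable F : A -> A.
  Hypothesis F_mono : forall x y, ord x y -> ord (F x) (F y).

  (* iterF n is F applied n times to bot. *)
  Fixpoint iterF (n : nat) : A :=
    match n with
    | O => bot
    | S n' => F (iterF n')
    end.

  (* Every fixpoint x of F is above every element of the sequence. *)
  Lemma iterF_below_fixpoint :
    forall x, F x = x -> forall n, ord (iterF n) x.
  Proof.
    intros x Hx n.
    induction n as [| n IH].
    - simpl. apply bot_least.
    - simpl.
      rewrite <- Hx.       (* replace x by F x on the right *)
      apply F_mono.
      exact IH.
  Qed.
End BelowFixpoint.
```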

(* Wed Mar 27 17:33:26 UTC 2019 *)