Lecture notes 20210224 Hoare Logic 1
In this lecture, we start learning our first approach to describing program
specifications and/or program semantics.
Assertions
To talk about specifications of programs, the first thing we need is a way of making assertions about properties that hold at particular points during a program's execution, i.e., claims about the current state of the memory when execution reaches that point. --- << Software Foundations, Volume 2 >>
In order to talk about what properties a program does or should satisfy, we
have to be able to talk about properties of program states (程序状态) first.
Informally, an assertion (断言) is a proposition (命题) which describes a
particular property of program states. Take the following C function as an
example.
int fib(int n) {
  int a0 = 0, a1 = 1, a2;
  int i;
  for (i = 0; i < n; ++ i) {
    a2 = a0 + a1;
    a0 = a1;
    a1 = a2;
  }
  return a0;
}
In this C function, there are only 5 program variables: a0, a1, a2, i
and n. A program state is determined by the values of these program variables,
and the following are typical program assertions.
[[a0]] = 0 AND [[a1]] = 1
[[a0]] < [[a1]]
EXISTS k, [[a0]] = fib(k) AND [[a1]] = fib(k+1) AND [[a2]] = fib(k + 2)
In more general cases, a C program state contains the values of program
variables, the addresses of program variables, memory contents, etc. In the
last lecture, we saw an incorrect program for computing the sum of the elements
of a linked list. Here is a correct program.
struct list {unsigned int head; struct list *tail;};
unsigned int sumlist (struct list * t) {
  unsigned int s = 0;
  while (t) {
    s = s + (t->head);
    t = t->tail;
  }
  return s;
}
And here is an assertion.
( [[t]] ⟼ 0 ) AND ( [[t]] + 4 ⟼ NULL ) AND ( [[s]] = 0 )
We introduce a new predicate in this assertion: X ⟼ Y. It means that the
value stored at address X is Y.
What is a "proposition"?
As mentioned above, an assertion is a proposition which describes a property
of program states, and we have seen several assertions already. You may still
ask: what is a proposition, formally?
This is mainly a philosophical question, and there are two answers to it.
Answer 1: a proposition is the sentence itself which describes the property.
Answer 2: a proposition is the meaning of the sentence. The mathematical
definitions of "proposition" behind these two answers are different. For
example, assertions may be defined as syntax trees (sentences) or as sets of
program states (meanings of sentences). Both approaches are accepted by
mathematicians and computer scientists. In this course, we will just say
"propositions" when we do not need to distinguish these two representations.
Assertions vs. boolean functions
On the one hand, assertions and boolean functions are different.
1. Not all assertions can be represented as boolean functions. Here is an
example:
FORALL k, k < [[n]] OR (k is_prime) OR (fib(k) is_not_prime)
2. Not all boolean functions can be represented as assertions. There can be
side effects.
3. Assertions and boolean functions are categorically different. Assertions
describe properties, but boolean functions are mainly about computation.
On the other hand, there are some connections between them. Many dynamic
program analysis tools do use boolean functions to represent assertions.
Assertion equivalence and comparison
Given two assertions P and Q, if every program state m which satisfies
P also satisfies Q, we say that P is stronger than Q, written
P ⊢ Q. If P is stronger than Q and Q is stronger than P at the same
time, we say that P and Q are equivalent to each other, written
P ⊣⊢ Q.
A formally defined toy language
Before we go on and introduce more advanced concepts, it is important that
we make things really formal. Specifically, we will have a formal
programming language (but a simple one) and a formal assertion language. Since
this is the first time that we use Coq formal definitions in this course, we
hide most of the Coq code and only show some examples.
Require Import PL.Imp.
We pack those definitions in another Coq file and "import" it in Coq with
the line of code above.
The following instructions tell you how to do that on your
own laptop. You can also find these instructions in << Software Foundations >>
volume 1, Chapter 2, Induction (slightly different).
BEGINNING of instructions from << Software Foundations >>.
For the Require Import to work, Coq needs to be able to
find a compiled version of Imp.v, called Imp.vo, in a directory
associated with the prefix PL. This file is analogous to the .class
files compiled from .java source files and the .o files compiled from
.c files.
First create a file named _CoqProject containing the following line
(if you obtained the whole volume "Logical Foundations" as a single
archive, a _CoqProject should already exist and you can skip this step):
-Q . PL
This maps the current directory (".", which contains Imp.v,
RTClosure.v, etc.) to the prefix (or "logical directory") "PL".
PG and CoqIDE read _CoqProject automatically, so they know where to
look for the file Imp.vo corresponding to the library PL.Imp.
Once _CoqProject is thus created, there are various ways to build
Imp.vo:
- In Proof General: The compilation can be made to happen automatically
when you submit the Require line above to PG, by setting the emacs
variable coq-compile-before-require to t.
- In CoqIDE: Open RTClosure.v; then in the "Compile" menu, click
on "Compile Buffer"; Open Imp.v; then, in the "Compile" menu, click
on "Compile Buffer".
- From the command line: Generate a Makefile using the coq_makefile
utility, which comes installed with Coq (if you obtained the whole
volume as a single archive, a Makefile should already exist
and you can skip this step):
If you have trouble (e.g., if you get complaints about missing
identifiers later in the file), it may be because the "load path"
for Coq is not set up correctly. The Print LoadPath. command
may be helpful in sorting out such issues.
In particular, if you see a message like
Compiled library Foo makes inconsistent assumptions over
library Bar
check whether you have multiple installations of Coq on your machine.
It may be that commands (like coqc) that you execute in a terminal
window are getting a different version of Coq than commands executed by
Proof General or CoqIDE.
- Another common reason is that the library Bar was modified and recompiled without also recompiling Foo which depends on it. Recompile Foo, or everything if too many files are affected. (Using the third solution above: make clean; make.)
END of instructions from << Software Foundations >>.
Import Assertion_S.
Import Concrete_Pretty_Printing.
Module Playground_for_Program_Variables_and_Assertions.
This toy language has only one kind of program variable: variables of
integer type. We can introduce some new program variables as below.
Local Instance a0: var := new_var().
Local Instance a1: var := new_var().
Local Instance a2: var := new_var().
And now, we can use assertions to talk about some properties.
Definition assert1: Assertion := [[a0]] = 0 AND [[a1]] = 1.
Definition assert2: Assertion := [[a0]] < [[a1]].
Fibonacci numbers can be easily defined in Coq. But we do not bother to
define them here; we assume that such a function exists.
Hypothesis fib: Z -> Z.
Z is the type of mathematical integers, and this hypothesis says that fib is a
function from integers to integers. We can use this function in Coq-defined
Assertions as well.
Definition assert3: Assertion :=
EXISTS k, [[a0]] = fib(k) AND [[a1]] = fib(k+1) AND [[a2]] = fib(k + 2).
End Playground_for_Program_Variables_and_Assertions.
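To get a feel for assertions as properties of program states, here is a minimal sketch in Python (rather than Coq, so that it runs without the PL library). It models a state as a dictionary and an assertion as a predicate on states, which corresponds to the "sets of program states" reading of propositions. The names entails and SAMPLE and the cut-off bound are assumptions made for this illustration, and fib uses the convention fib(0) = 0, fib(1) = 1, which these notes do not fix.

```python
from itertools import product

def fib(k):
    """Fibonacci numbers, assuming the convention fib(0) = 0, fib(1) = 1."""
    a, b = 0, 1
    for _ in range(k):
        a, b = b, a + b
    return a

# A program state maps variable names to integers.
# An assertion is modelled as a predicate on states.
def assert1(s):
    return s["a0"] == 0 and s["a1"] == 1

def assert2(s):
    return s["a0"] < s["a1"]

def assert3(s, bound=30):
    # EXISTS k, ...: the search is cut off at `bound` (an assumption of this sketch).
    return any(
        s["a0"] == fib(k) and s["a1"] == fib(k + 1) and s["a2"] == fib(k + 2)
        for k in range(bound)
    )

def entails(p, q, states):
    """P |- Q, restricted to the given finite collection of states."""
    return all(q(s) for s in states if p(s))

# A small finite sample of states; real states range over all integers.
SAMPLE = [dict(a0=x, a1=y, a2=z) for x, y, z in product(range(-2, 4), repeat=3)]

print(entails(assert1, assert2, SAMPLE))  # True
print(entails(assert2, assert1, SAMPLE))  # False
print(assert3(dict(a0=3, a1=5, a2=8)))    # True
```

Checking on a finite sample can refute P ⊢ Q (one counterexample state is enough) but cannot prove it, so a True result here is only evidence.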
To keep things simple, we only allow two different kinds of expressions in
this toy language: integer expressions (a) and boolean expressions (b). Also,
only a limited set of arithmetic operators, logical operators and program
commands (c) is supported. Here is a brief illustration of its syntax.
a ::= Z
| var
| a + a
| a - a
| a * a
b ::= true
| false
| a == a
| a <= a
| ! b
| b && b
c ::= Skip
| var ::= a
| c ;; c
| If b Then c Else c EndIf
| While b Do c EndWhile
There are no function calls, pointers, memory space, or break and continue
commands in this language. Also, we assume that there is no bound on arithmetic
results.
Although this language is simple, it is enough for us to write some interesting
programs.
Module Playground_for_Programs.
Local Instance A: var := new_var().
Local Instance B: var := new_var().
Local Instance TEMP: var := new_var().
Definition swap_two_int: com :=
TEMP ::= A;;
A ::= B;;
B ::= TEMP.
Definition decrease_to_zero: com :=
While ! (A ≤ 0) Do
A ::= A - 1
EndWhile.
Definition ABSOLUTE_VALUE: com :=
If A ≤ 0
Then B ::= 0 - A
Else B ::= A
EndIf.
End Playground_for_Programs.
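The intended behavior of these three programs can be tried out with a small interpreter. The following is a minimal sketch in Python, not the course's Coq definition: the tuple encoding of expressions and commands, the function names aeval, beval and run, and the choice that unset variables read as 0 are all assumptions made for this illustration.

```python
def aeval(a, s):
    """Integer expression: an int, a variable name, or (op, a1, a2)."""
    if isinstance(a, int):
        return a
    if isinstance(a, str):
        return s.get(a, 0)  # assumption: unset variables read as 0
    op, left, right = a
    x, y = aeval(left, s), aeval(right, s)
    if op == "+":
        return x + y
    if op == "-":
        return x - y
    if op == "*":
        return x * y
    raise ValueError(f"unknown integer expression: {a!r}")

def beval(b, s):
    """Boolean expression: True/False, ("==", a, a), ("<=", a, a), ("!", b), ("&&", b, b)."""
    if isinstance(b, bool):
        return b
    tag = b[0]
    if tag == "==":
        return aeval(b[1], s) == aeval(b[2], s)
    if tag == "<=":
        return aeval(b[1], s) <= aeval(b[2], s)
    if tag == "!":
        return not beval(b[1], s)
    if tag == "&&":
        return beval(b[1], s) and beval(b[2], s)
    raise ValueError(f"unknown boolean expression: {b!r}")

def run(c, s):
    """Run command c from state s and return the final state.
    A non-terminating While makes this function loop forever."""
    tag = c[0]
    if tag == "skip":
        return s
    if tag == "assign":                  # ("assign", var, a)
        return {**s, c[1]: aeval(c[2], s)}
    if tag == "seq":                     # ("seq", c1, c2)
        return run(c[2], run(c[1], s))
    if tag == "if":                      # ("if", b, c1, c2)
        return run(c[2] if beval(c[1], s) else c[3], s)
    if tag == "while":                   # ("while", b, body)
        while beval(c[1], s):
            s = run(c[2], s)
        return s
    raise ValueError(f"unknown command: {c!r}")

swap_two_int = ("seq", ("assign", "TEMP", "A"),
                ("seq", ("assign", "A", "B"), ("assign", "B", "TEMP")))
decrease_to_zero = ("while", ("!", ("<=", "A", 0)), ("assign", "A", ("-", "A", 1)))
absolute_value = ("if", ("<=", "A", 0),
                  ("assign", "B", ("-", 0, "A")), ("assign", "B", "A"))

print(run(swap_two_int, {"A": 1, "B": 2}))   # {'A': 2, 'B': 1, 'TEMP': 1}
print(run(decrease_to_zero, {"A": 3}))       # {'A': 0}
print(run(absolute_value, {"A": -7}))        # {'A': -7, 'B': 7}
```

Python integers are unbounded, which matches the assumption that there is no bound on arithmetic results.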
One important property of this simple programming language is that it is
type-safe, i.e. programs never cause run-time errors. We intentionally leave out
"/" and pointer operations to achieve this. This enables us to introduce new
concepts and theories in a concise way. But these theories can all be
generalized to complicated real programming languages, like C.
Pre/postconditions
Remark. Some material in this section and the next section is from <<
Software Foundations >> volume 2.
Next, we need a way of making formal claims about the behavior of commands.
In general, the behavior of a command is to transform one state to another, so
it is natural to express claims about commands in terms of assertions that are
true before and after the command executes:
- "If command c is started in a state satisfying assertion P, and if c eventually terminates in some final state, then this final state will satisfy the assertion Q."
Such a claim is called a Hoare Triple (霍尔三元组). The assertion P is
called the precondition (前条件) of c, while Q is the postcondition
(后条件).
Claims of this kind about programs are widely used as specifications.
Computer scientists use the following notation to represent them.
{{ P }} c {{ Q }}
Quiz:
Does this specific program satisfy its specification?
{{True}} X ::= 5 {{ [[X]] = 5 }}
Yes.
What about this one?
{{ [[X]] = 2 AND [[X]] = 3 }} X ::= 5 {{ [[X]] = 0 }}
Yes, because no program state satisfies this precondition.
What about this one?
{{True}} Skip {{False}}
No.
What about this one?
{{False}} Skip {{True}}
Yes.
What about this one?
{{True}} While true Do Skip EndWhile {{False}}
Yes. The loop never terminates, so the claim about final states holds vacuously.
This one?
{{ [[X]] = 0 }}
While X == 0 Do X ::= X + 1 EndWhile
{{ [[X]] = 1 }}
Yes.
This one?
{{ [[X]] = 1 }}
While !(X == 0) Do X ::= X + 1 EndWhile
{{ [[X]] = 100 }}
Yes. Starting from X = 1, X only increases and there is no bound on arithmetic
results, so the loop never terminates.
Hoare triples as program semantics
So far, we have learnt to use pre/postconditions to make formal claims
about programs. In other words, a pair of a precondition and a postcondition
gives a program specification.
Now, we turn to the other side. We will use Hoare triples to describe
program behavior. Formally speaking, we will use Hoare triples to define the
program semantics of our simple imperative programming language (指令式编程语言).
Remark 1. We have not yet described how a program of type com executes! We
only have some intuition about it from the similarity between this simple
language and some practical languages. Now we will do it formally for the first
time.
Remark 2. When we talk about "program specification", we say whether a
specific program satisfies a program specification or not. When we talk about
"program semantics", we mean the program semantics of some programming language,
which defines the behavior of specific programs.
Sequence
The following axiom defines the behavior of sequential compositions.
Axiom hoare_seq : ∀(P Q R: Assertion) (c1 c2: com),
{{P}} c1 {{Q}} ->
{{Q}} c2 {{R}} ->
{{P}} c1;;c2 {{R}} .
This axiom says, if the command c1 takes any state where P holds to a
state where Q holds, and if c2 takes any state where Q holds to one where
R holds, then doing c1 followed by c2 will take any state where P holds
to one where R holds.
Remark. If we instantiate P, Q, R and c1, c2 with concrete
assertions and commands, this rule is only about the logical relation among
three concrete Hoare triples; in other words, it only describes how the behavior
of two concrete programs c1 and c2 relates to that of their sequential
composition.
But this rule is not about concrete programs and concrete assertions! It talks
about sequential combination in general. That's why we say that we are using
the relation among Hoare triples to define the semantics of this simple
programming language.
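The reading of {{ P }} c {{ Q }} given earlier ("if c terminates, the final state satisfies Q") can be exercised on a finite set of states. The sketch below is in Python and is only an approximation of that definition: commands are modelled as functions on states, the helper names (assign, seq, while_, holds) are assumptions of this illustration, and a loop that exceeds a fixed step budget is treated as non-terminating. It re-checks the seven quiz triples and one concrete instance of hoare_seq.

```python
DIVERGES = None  # stands for "no final state reached within the step budget"

def skip(s):
    return s

def assign(x, e):
    return lambda s: {**s, x: e(s)}

def seq(c1, c2):
    def run(s):
        mid = c1(s)
        return DIVERGES if mid is DIVERGES else c2(mid)
    return run

def while_(b, body, fuel=1000):
    def run(s):
        for _ in range(fuel):
            if not b(s):
                return s
            s = body(s)
            if s is DIVERGES:
                return DIVERGES
        return DIVERGES  # budget exhausted: treated as non-termination
    return run

def holds(pre, c, post, states):
    """{{pre}} c {{post}} on the sampled states (partial correctness)."""
    for s in states:
        if pre(s):
            t = c(s)
            if t is not DIVERGES and not post(t):
                return False
    return True

STATES = [{"X": x} for x in range(-5, 6)]
true = lambda s: True
false = lambda s: False
incr = assign("X", lambda s: s["X"] + 1)

quiz = [
    (true, assign("X", lambda s: 5), lambda s: s["X"] == 5),
    (lambda s: s["X"] == 2 and s["X"] == 3, assign("X", lambda s: 5),
     lambda s: s["X"] == 0),
    (true, skip, false),
    (false, skip, true),
    (true, while_(true, skip), false),
    (lambda s: s["X"] == 0, while_(lambda s: s["X"] == 0, incr),
     lambda s: s["X"] == 1),
    (lambda s: s["X"] == 1, while_(lambda s: s["X"] != 0, incr),
     lambda s: s["X"] == 100),
]
answers = [holds(p, c, q, STATES) for p, c, q in quiz]
print(answers)  # [True, True, False, True, True, True, True]

# One instance of hoare_seq: both premises hold, and so does the conclusion.
double = assign("X", lambda s: s["X"] * 2)
P = lambda s: s["X"] == 1
Q = lambda s: s["X"] == 2
R = lambda s: s["X"] == 4
seq_check = (holds(P, incr, Q, STATES), holds(Q, double, R, STATES),
             holds(P, seq(incr, double), R, STATES))
print(seq_check)  # (True, True, True)
```

Such a check only tests instances of a rule on sampled states; the axiom itself is a statement about all assertions and all commands.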
Example: Swapping
We want to prove that the following program always swaps the values of
variables X and Y. Or, formally, for any x and y,
{{ [[X]] = x AND [[Y]] = y }}
TEMP ::= X;;
X ::= Y;;
Y ::= TEMP
{{ [[X]] = y AND [[Y]] = x }} .
First, the following three triples are obviously true.
1. {{ [[X]] = x AND [[Y]] = y }}
TEMP ::= X
{{ [[Y]] = y AND [[TEMP]] = x }}
2. {{ [[Y]] = y AND [[TEMP]] = x }}
X ::= Y
{{ [[X]] = y AND [[TEMP]] = x }}
3. {{ [[X]] = y AND [[TEMP]] = x }}
Y ::= TEMP
{{ [[X]] = y AND [[Y]] = x }} .
Then, from 2 and 3, we know by hoare_seq:
4. {{ [[Y]] = y AND [[TEMP]] = x }}
X ::= Y;;
Y ::= TEMP
{{ [[X]] = y AND [[Y]] = x }} .
In the end, from 1 and 4:
5. {{ [[X]] = x AND [[Y]] = y }}
TEMP ::= X;;
X ::= Y;;
Y ::= TEMP
{{ [[X]] = y AND [[Y]] = x }} .
Example: Swapping Using Addition and Subtraction
Here is a program that swaps the values of two variables using addition and
subtraction instead of by assigning to a temporary variable.
X ::= X + Y;;
Y ::= X - Y;;
X ::= X - Y
Again, we can prove it correct by three triples for assignments and hoare_seq.
1. {{ [[X]] = x AND [[Y]] = y }}
X ::= X + Y
{{ [[X]] = x + y AND [[Y]] = y }}
2. {{ [[X]] = x + y AND [[Y]] = y }}
Y ::= X - Y
{{ [[X]] = x + y AND [[Y]] = x }}
3. {{ [[X]] = x + y AND [[Y]] = x }}
X ::= X - Y
{{ [[X]] = y AND [[Y]] = x }} .
Skip
Since Skip doesn't change the state, it preserves any assertion P.
Axiom hoare_skip : ∀P,
{{P}} Skip {{P}} .
Condition
Here is a first attempt at a rule for if-then-else commands.
Axiom hoare_if_first_try : ∀P Q b c1 c2,
{{P}} c1 {{Q}} ->
{{P}} c2 {{Q}} ->
{{P}} If b Then c1 Else c2 EndIf {{Q}} .
However, this is rather weak. For example, using this rule, we will fail to
show that the following program satisfies the following Hoare triple, since the
rule above tells us nothing about the state in which the assignments take place
in the "then" and "else" branches.
In other words, the axiom above does not define the program semantics in a
complete sense.
Module Playground_for_Counterexample.
Local Instance X: var := new_var().
Local Instance Y: var := new_var().
Definition a_counterexample :=
{{ True }}
If X == 0
Then Y ::= 2
Else Y ::= X + 1
EndIf
{{ [[X]] ≤ [[Y]] }} .
End Playground_for_Counterexample.
If we try to use hoare_if_first_try here, we have to show that
{{ True }}
Y ::= 2
{{ [[X]] ≤ [[Y]] }}
and
{{ True }}
Y ::= X + 1
{{ [[X]] ≤ [[Y]] }} .
They correspond to the two assumptions of hoare_if_first_try. But it is obvious
that the first of these triples is not true; for instance, it fails from a state
where X is 3.
That means we need a better proof rule which can reason about if-then-else
in a more precise manner. For example, in the "then" branch, we know that the
boolean expression b evaluates to true, and in the "else" branch, we know it
evaluates to false. Making this information available in the premises of the
rule forms a more complete definition of program semantics. Here is the Coq
formalization:
Axiom hoare_if : ∀P Q b c1 c2,
{{ P AND [[b]] }} c1 {{ Q }} ->
{{ P AND NOT [[b]] }} c2 {{ Q }} ->
{{ P }} If b Then c1 Else c2 EndIf {{ Q }} .
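The difference between the two rules for conditionals can be seen by testing their premises on the counterexample program. This is a minimal sketch in Python over a small finite set of states; the helper names (assign, if_, holds) and the sampled range are assumptions of this illustration, and since the program has no loops, termination is not an issue.

```python
# States assign integers to X and Y; only a small range is sampled.
STATES = [{"X": x, "Y": y} for x in range(-3, 4) for y in range(-3, 4)]

def assign(var, expr):
    return lambda s: {**s, var: expr(s)}

def if_(b, c1, c2):
    return lambda s: c1(s) if b(s) else c2(s)

def holds(pre, c, post):
    """{{pre}} c {{post}} on the sampled states (c always terminates here)."""
    return all(post(c(s)) for s in STATES if pre(s))

b = lambda s: s["X"] == 0
then_branch = assign("Y", lambda s: 2)
else_branch = assign("Y", lambda s: s["X"] + 1)
P = lambda s: True
Q = lambda s: s["X"] <= s["Y"]

# Premises of hoare_if_first_try: the precondition is P in both branches.
weak = (holds(P, then_branch, Q), holds(P, else_branch, Q))

# Premises of hoare_if: the branch condition is added to the precondition.
strong = (holds(lambda s: P(s) and b(s), then_branch, Q),
          holds(lambda s: P(s) and not b(s), else_branch, Q))

# The triple for the whole conditional.
whole = holds(P, if_(b, then_branch, else_branch), Q)

print(weak, strong, whole)  # (False, True) (True, True) True
```

The whole triple holds on every sampled state, yet the first premise of hoare_if_first_try fails, which is why that rule cannot be used here while hoare_if can.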
Program correctness proof in Coq
Example: Swapping
Module swapping.
Import Axiomatic_semantics.
Local Instance X: var := new_var().
Local Instance Y: var := new_var().
Local Instance TEMP: var := new_var().
We are going to prove:
{{ [[X]] = x AND [[Y]] = y }}
TEMP ::= X;;
X ::= Y;;
Y ::= TEMP
{{ [[X]] = y AND [[Y]] = x }} .
This is based on 3 Hoare triples about assignment commands. In fact, we have not
proved them yet in any precise way. They are just true by our intuition. Thus we
write them down as hypotheses here.
Hypothesis triple1: ∀x y: Z,
{{ [[X]] = x AND [[Y]] = y }}
TEMP ::= X
{{ [[Y]] = y AND [[TEMP]] = x }} .
Hypothesis triple2: ∀x y: Z,
{{ [[Y]] = y AND [[TEMP]] = x }}
X ::= Y
{{ [[X]] = y AND [[TEMP]] = x }} .
Hypothesis triple3: ∀x y: Z,
{{ [[X]] = y AND [[TEMP]] = x }}
Y ::= TEMP
{{ [[X]] = y AND [[Y]] = x }} .
Then we start our theorem proving. Usually, a theorem statement starts with
"Theorem", "Lemma", "Corollary", "Fact" or "Example".
Fact swaping_correct:
∀x y: Z,
{{ [[X]] = x AND [[Y]] = y }}
TEMP ::= X;;
X ::= Y;;
Y ::= TEMP
{{ [[X]] = y AND [[Y]] = x }} .
Proof.
intros.
We have seen this command "intros" in our introduction to this course. It
will move universally quantified variables in the conclusion into assumptions.
apply hoare_seq with ([[Y]] = y AND [[TEMP]] = x)%assert.
This tactic says: we choose to use hoare_seq to prove our conclusion. The
"with" clause indicates the middle condition. We use %assert to let Coq
know that this argument is an assertion. This tactic reduces the original proof
goal into two smaller ones: one is a Hoare triple for the first command and the
other is a Hoare triple for the last two assignment commands. They correspond
to the two assumptions of hoare_seq respectively. This is reasonable: in order
to prove something using hoare_seq, one has to prove its assumptions first.
apply triple1.
The first proof goal is our first hypothesis.
apply hoare_seq with ([[X]] = y AND [[TEMP]] = x)%assert.
The second proof goal needs hoare_seq again.
apply triple2.
apply triple3.
In the end, we write "Qed" to complete our proof.
Qed.
If you go through the proof above, you may feel that it goes in a backward
direction: we reduce our proof goal step by step and reach our assumptions
in the end. In fact, you can also write forward proofs in Coq.
Fact swaping_correct_again:
∀x y: Z,
{{ [[X]] = x AND [[Y]] = y }}
TEMP ::= X;;
X ::= Y;;
Y ::= TEMP
{{ [[X]] = y AND [[Y]] = x }} .
Proof.
intros.
pose proof triple1 x y.
pose proof triple2 x y.
pose proof triple3 x y.
pose proof hoare_seq
([[Y]] = y AND [[TEMP]] = x)
([[X]] = y AND [[TEMP]] = x)
([[X]] = y AND [[Y]] = x)
(X ::= Y)
(Y ::= TEMP)
H0
H1.
When you are able to derive a new conclusion from assumptions, the
"pose proof" tactic can be used to put that conclusion above the line.
clear H0 H1.
At this point, if you feel that some assumptions are redundant above the
line, you can use clear to remove them.
pose proof hoare_seq _ ([[Y]] = y AND [[TEMP]] = x) _ _ _ H H2.
You do not need to type all arguments manually. Use an underscore _ wherever
Coq can infer the argument.
exact H0.
In the end, what we want to prove is already among our assumptions, so we use
"exact".
Qed.
End swapping.
(* 2021-02-24 19:29 *)