Lecture notes 20190301
Hoare Logic 1
In this lecture, we start to learn the first approach to describing program
specifications and/or program semantics.
Assertions
To talk about specifications of programs, the first thing we need is a way of making assertions about properties that hold at particular points during a program's execution -- i.e., claims about the current state of the memory when execution reaches that point. --- << Software Foundations, Volume 2 >>
In order to talk about what properties a program does/should satisfy, we
have to be able to talk about properties of program states (程序状态) first.
Informally, an assertion (断言) is a proposition (命题) which describes a
particular property of program states. Take the following C function as an
example.
int fib(int n) {
    int a0 = 0, a1 = 1, a2;
    int i;
    for (i = 0; i < n; ++ i) {
        a2 = a0 + a1;
        a0 = a1;
        a1 = a2;
    }
    return a0;
}
In this C function, there are only 5 program variables: a0, a1, a2, i
and n. A program state is determined by the values of these program variables,
and the following are typical program assertions.
[[a0]] == 0 AND [[a1]] == 1
[[a0]] < [[a1]]
EXISTS k, [[a0]] == fib(k) AND [[a1]] == fib(k+1) AND [[a2]] == fib(k + 2)
In more general cases, a C program state contains the values of program
variables, the addresses of program variables, memory contents, etc. In the
last lecture, we saw an incorrect program that was meant to compute the sum of
the elements in a linked list. Here is a correct program.
struct list {unsigned int head; struct list *tail;};
unsigned int sumlist (struct list * t) {
    unsigned int s = 0;
    while (t) {
        s = s + (t->head);
        t = t->tail;
    }
    return s;
}
And here is an assertion.
( [[t]] ⟼ 0 ) AND ( [[t]] + 4 ⟼ NULL ) AND ( [[s]] == 0 )
This assertion uses a new predicate, X ⟼ Y. It means that the value stored
at address X is Y.
What is a "proposition"?
As mentioned above, an assertion is a proposition which describes a property
of program states, and we have seen many assertions already. You may still ask:
what is a proposition, formally?
This is mainly a philosophical question, and there are two answers to it.
Answer 1: a proposition is the sentence itself which describes the property.
Answer 2: a proposition is the meaning of the sentence. The mathematical
definitions of "proposition" behind these two answers are different. For
example, assertions may be defined as syntax trees (sentences) or as sets of
program states (meanings of sentences). Both approaches are accepted by
mathematicians and computer scientists. In this course, we will just say
"propositions" when we do not need to distinguish these two representations.
Assertions vs. boolean functions
On one hand, assertions and boolean functions are different.
1. Not all assertions can be represented as boolean functions. Here is an
example:
FORALL k, k < [[n]] OR (k is_prime) OR (fib(k) is_not_prime)
Evaluating it directly would mean checking infinitely many values of k.
2. Not all boolean functions can be represented as assertions. A boolean
function may have side effects, whereas an assertion only describes a state.
3. Assertions and boolean functions are categorically different. Assertions
describe properties, but boolean functions are mainly about computation.
On the other hand, there are some connections between them. Many dynamic
program analysis tools do use boolean functions to represent assertions.
Assertion equivalence and comparison
Given two assertions P and Q, if every program state m which satisfies
P also satisfies Q, we say that P is stronger than Q, written
P ⊢ Q. If P is stronger than Q and Q is stronger than P at the same
time, we say that P and Q are equivalent to each other, written
P ⊣⊢ Q.
A formally defined toy language
Before we go on and introduce more advanced concepts, it is important that
we can and do make things really formal. Specifically, we will have a formal
programming language (but a simple one) and a formal assertion language. Since
this is the first time that we use formal Coq definitions in this course, we
hide the Coq definitions themselves and only show some examples.
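To make the "set of program states" reading of assertions concrete, here is a minimal self-contained sketch in plain Coq. It does not use the course library: the module name, the record state (which only tracks a0 and a1), and the names derives and assn_equiv are hypothetical stand-ins for the definitions that the course keeps hidden. It shows one way P ⊢ Q and P ⊣⊢ Q could be defined, and checks that the first example assertion is stronger than the second.

```coq
Require Import ZArith.
Require Import Lia.

Module AssertionSketch.
Local Open Scope Z_scope.

(* Hypothetical model: a program state records the values of a0 and a1 only. *)
Record state : Type := { a0 : Z; a1 : Z }.

(* Semantic view: an assertion is the set of states that satisfy it. *)
Definition Assertion : Type := state -> Prop.

(* P is stronger than Q: every state satisfying P also satisfies Q. *)
Definition derives (P Q : Assertion) : Prop :=
  forall m : state, P m -> Q m.

(* P and Q are equivalent when each one is stronger than the other. *)
Definition assn_equiv (P Q : Assertion) : Prop :=
  derives P Q /\ derives Q P.

(* [[a0]] == 0 AND [[a1]] == 1 *)
Definition assert1 : Assertion := fun m => a0 m = 0 /\ a1 m = 1.

(* [[a0]] < [[a1]] *)
Definition assert2 : Assertion := fun m => a0 m < a1 m.

(* assert1 is stronger than assert2. *)
Lemma assert1_derives_assert2 : derives assert1 assert2.
Proof.
  unfold derives, assert1, assert2.
  intros m [H0 H1].
  lia.
Qed.

End AssertionSketch.
```

The converse does not hold: a state with a0 = 3 and a1 = 5 satisfies assert2 but not assert1, so the two assertions are not equivalent.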
Require Import PL.Imp1.
We pack those definitions in another Coq file and we "import" it in Coq by
this line of code above.
The following instructions tell you how to do that on your
own laptop. You can also find these instructions (in a slightly different form)
in << Software Foundations >> volume 1, Chapter 2, Induction.
BEGINNING of instructions from << Software Foundations >>.
For the Require Import to work, Coq needs to be able to
find a compiled version of Imp1.v, called Imp1.vo, in a directory
associated with the prefix PL. This file is analogous to the .class
files compiled from .java source files and the .o files compiled from
.c files.
First create a file named _CoqProject containing the following line
(if you obtained the whole volume "Logical Foundations" as a single
archive, a _CoqProject should already exist and you can skip this step):
-Q . PL
This maps the current directory (".", which contains Imp1.v,
Induction.v, etc.) to the prefix (or "logical directory") "PL".
Proof General (PG) and CoqIDE read _CoqProject automatically, so they know
where to look for the file Imp1.vo corresponding to the library PL.Imp1.
Once _CoqProject is thus created, there are various ways to build
Imp1.vo:
- In Proof General: The compilation can be made to happen automatically
when you submit the Require line above to PG, by setting the emacs
variable coq-compile-before-require to t.
- In CoqIDE: Open Imp1.v; then, in the "Compile" menu, click
on "Compile Buffer".
- From the command line: Generate a Makefile using the coq_makefile
utility, which comes installed with Coq (if you obtained the whole
volume as a single archive, a Makefile should already exist
and you can skip this step), typically with
coq_makefile -f _CoqProject *.v -o Makefile, and then run make.
If you have trouble (e.g., if you get complaints about missing
identifiers later in the file), it may be because the "load path"
for Coq is not set up correctly. The Print LoadPath. command
may be helpful in sorting out such issues.
In particular, if you see a message like
Compiled library Foo makes inconsistent assumptions over
library Bar
check whether you have multiple installations of Coq on your machine.
It may be that commands (like coqc) that you execute in a terminal
window are getting a different version of Coq than commands executed by
Proof General or CoqIDE.
- Another common reason is that the library Bar was modified and recompiled without also recompiling Foo, which depends on it. Recompile Foo, or everything if too many files are affected. (Using the third solution above: make clean; make.)
END of instructions from << Software Foundations >>.
Module Playground_for_Program_Variables_and_Assertions.
This toy language has only one kind of program variable: variables of
integer type. We can introduce some new program variables as below.
Local Instance a0: var := new_var().
Local Instance a1: var := new_var().
Local Instance a2: var := new_var().
And now, we can use assertions to talk about some properties.
Definition assert1: Assertion := [[a0]] == 0 AND [[a1]] == 1.
Definition assert2: Assertion := [[a0]] < [[a1]].
Fibonacci numbers can be easily defined in Coq, but we do not bother to
define them here; we simply assume that such a function exists.
Hypothesis fib: Z → Z.
Z denotes the mathematical integers, and this hypothesis says that fib is a
function from integers to integers. We can use this function in Coq-defined
assertions as well.
Definition assert3: Assertion :=
EXISTS k, [[a0]] == fib(k) AND [[a1]] == fib(k+1) AND [[a2]] == fib(k + 2).
End Playground_for_Program_Variables_and_Assertions.
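The playground above only assumes that a Fibonacci function exists. For readers who want to see what a definition could look like, here is a minimal sketch in plain Coq. The names FibSketch, fib_nat and fib_sketch are hypothetical and are not part of PL.Imp1. The choice to send negative arguments to 0 is an assumption of this sketch, not something the lecture specifies.

```coq
Require Import ZArith.

Module FibSketch.
Local Open Scope Z_scope.

(* Fibonacci numbers indexed by natural numbers, by structural recursion. *)
Fixpoint fib_nat (n : nat) : Z :=
  match n with
  | O => 0
  | S n' =>
      match n' with
      | O => 1
      | S n'' => fib_nat n' + fib_nat n''
      end
  end.

(* Extension to Z. Z.to_nat sends negative integers to 0%nat, so
   fib_sketch k = 0 whenever k is negative. *)
Definition fib_sketch (k : Z) : Z := fib_nat (Z.to_nat k).

(* A small sanity check, proved by computation. *)
Example fib_sketch_10 : fib_sketch 10 = 55.
Proof. reflexivity. Qed.

End FibSketch.
```

A function of type Z → Z like fib_sketch could play the role of the hypothesized fib in assertions such as assert3.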
To make things simple, we only allow two different kinds of expressions in
this toy language: integer expressions (a) and boolean expressions (b). Also,
only a limited set of arithmetic operators, logical operators and program
commands (c) is supported. Here is a brief illustration of its syntax.
a ::= Z
| var
| a + a
| a - a
| a * a
b ::= true
| false
| a == a
| a <= a
| ! b
| b && b
c ::= Skip
| var ::= a
| c ;; c
| If b Then c Else c EndIf
| While b Do c EndWhile
This language has no function calls, no pointers, no memory space, and no
break or continue commands. Also, we assume that there is no bound on
arithmetic results.
Although this language is simple, it is enough for us to write some interesting
programs.
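The grammar above can be read as three inductive types. The following is a minimal sketch of that reading in plain Coq. The constructor names and the choice of nat as the type of variables are hypothetical, and the actual definitions in PL.Imp1 may differ.

```coq
Require Import ZArith.

Module SyntaxSketch.

(* Hypothetical stand-in for the course's type of program variables. *)
Definition var : Type := nat.

(* a ::= Z | var | a + a | a - a | a * a *)
Inductive aexp : Type :=
  | ANum (n : Z)
  | AId (x : var)
  | APlus (a1 a2 : aexp)
  | AMinus (a1 a2 : aexp)
  | AMult (a1 a2 : aexp).

(* b ::= true | false | a == a | a <= a | ! b | b && b *)
Inductive bexp : Type :=
  | BTrue
  | BFalse
  | BEq (a1 a2 : aexp)
  | BLe (a1 a2 : aexp)
  | BNot (b : bexp)
  | BAnd (b1 b2 : bexp).

(* c ::= Skip | var ::= a | c ;; c | If ... EndIf | While ... EndWhile *)
Inductive com : Type :=
  | CSkip
  | CAss (x : var) (a : aexp)
  | CSeq (c1 c2 : com)
  | CIf (b : bexp) (c1 c2 : com)
  | CWhile (b : bexp) (c : com).

(* The command "T ::= A;; A ::= B;; B ::= T" as a syntax tree, with the
   variables 0, 1 and 2 playing the roles of A, B and T. Reading ";;" as
   right-associative is an assumption of this sketch. *)
Definition swap_ast : com :=
  CSeq (CAss 2%nat (AId 0%nat))
       (CSeq (CAss 0%nat (AId 1%nat))
             (CAss 1%nat (AId 2%nat))).

End SyntaxSketch.
```

Notations such as ::= and ;; would then be syntactic sugar over these constructors.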
Module Playground_for_Programs.
Local Instance A: var := new_var().
Local Instance B: var := new_var().
Local Instance TEMP: var := new_var().
Definition swap_two_int: com :=
  TEMP ::= A;;
  A ::= B;;
  B ::= TEMP.
Definition decrease_to_zero: com :=
  While ! (A ≤ 0) Do
    A ::= A - 1
  EndWhile.
Definition ABSOLUTE_VALUE: com :=
  If A ≤ 0
  Then B ::= 0 - A
  Else B ::= A
  EndIf.
End Playground_for_Programs.
One important property of this simple programming language is that it is
type-safe, i.e. programs cannot cause run-time errors. We intentionally omit
"/" and pointer operations to achieve this. This enables us to introduce new
concepts and theories in a concise way. These theories can all be
generalized to complicated real programming languages, like C.
Pre/postconditions
Remark. Some material in this section and the next section is from <<
Software Foundations >> volume 2.
Next, we need a way of making formal claims about the behavior of commands.
In general, the behavior of a command is to transform one state to another, so
it is natural to express claims about commands in terms of assertions that are
true before and after the command executes:
- "If command c is started in a state satisfying assertion P, and if c eventually terminates in some final state, then this final state will satisfy the assertion Q."
Such a claim is called a Hoare Triple (霍尔三元组). The assertion P is
called the precondition (前条件) of c, while Q is the postcondition
(后条件).
Claims of this kind are widely used as program specifications.
Computer scientists use the following notation to represent them.
{{ P }} c {{ Q }}
Hoare triples as program semantics
So far, we have learnt to use pre/postconditions to make formal claims
about programs. In other words, given a pair of precondition and postcondition,
we get a program specification.
Now, we turn to the other side. We will use Hoare triples to describe
program behavior. Formally speaking, we will use Hoare triples to define the
program semantics of our simple imperative programming language (指令式编程语言).
Remark 1. We have not yet described how a program of type com executes! We
only have some intuition about it from the similarity between this simple
language and some practical languages. Now we will do it formally for the
first time.
Remark 2. When we talk about "program specification", we say whether a
specific program satisfies a program specification or not. When we talk about
"program semantics", we mean the program semantics of some programming language,
which defines the behavior of specific programs.
Sequence
The following axiom defines the behavior of sequential compositions.
Axiom hoare_seq : ∀(P Q R: Assertion) (c1 c2: com),
{{P}} c1 {{Q}} →
{{Q}} c2 {{R}} →
{{P}} c1;;c2 {{R}}.
This axiom says, if the command c1 takes any state where P holds to a
state where Q holds, and if c2 takes any state where Q holds to one where
R holds, then doing c1 followed by c2 will take any state where P holds
to one where R holds.
Remark. If we instantiate P, Q, R and c1, c2 with concrete
assertions and commands, this rule is only about the logical relation among
three concrete Hoare triples. In other words, it only describes how the behavior
of two concrete programs c1 and c2 relates to their sequential combination.
But this rule is not about concrete programs and concrete assertions! It talks
about sequential combination in general. That's why we say that we are using
the relation among Hoare triples to define the semantics of this simple
programming language.
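In these notes hoare_seq is an axiom. As a sanity check that the axiom is reasonable, the following self-contained sketch shows that the same statement can be proved as a lemma in a simple relational model. The model is hypothetical and is not the definition used by PL.Imp1: states are left abstract, a command is modeled directly as an input/output relation on states, and seq and hoare_triple are names introduced here for illustration.

```coq
Module HoareSeqSketch.

(* Hypothetical model: states are left abstract. *)
Parameter state : Type.

Definition Assertion : Type := state -> Prop.

(* A command is modeled by the relation between its initial and final states. *)
Definition com : Type := state -> state -> Prop.

(* Sequential composition: c1 reaches some middle state, from which c2 runs. *)
Definition seq (c1 c2 : com) : com :=
  fun s1 s3 => exists s2 : state, c1 s1 s2 /\ c2 s2 s3.

(* Partial-correctness reading of {{P}} c {{Q}}. *)
Definition hoare_triple (P : Assertion) (c : com) (Q : Assertion) : Prop :=
  forall s1 s2 : state, P s1 -> c s1 s2 -> Q s2.

Lemma hoare_seq_sketch : forall (P Q R : Assertion) (c1 c2 : com),
  hoare_triple P c1 Q ->
  hoare_triple Q c2 R ->
  hoare_triple P (seq c1 c2) R.
Proof.
  unfold hoare_triple, seq.
  intros P Q R c1 c2 H1 H2 s1 s3 HP [s2 [Hc1 Hc2]].
  apply (H2 s2 s3).
  - apply (H1 s1 s2).
    + exact HP.
    + exact Hc1.
  - exact Hc2.
Qed.

End HoareSeqSketch.
```

The proof follows the informal explanation step by step: the middle state s2 satisfies Q because of the first triple, and the final state then satisfies R because of the second.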
Skip
Since Skip doesn't change the state, it preserves any assertion P.
Axiom hoare_skip : ∀P,
{{P}} Skip {{P}}.
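Condition
A natural first attempt at a rule for conditional commands is the following:
if Q holds after executing either branch from a state satisfying P, then Q
also holds after the whole conditional.
Axiom hoare_if_first_try : ∀P Q b c1 c2,
{{P}} c1 {{Q}} →
{{P}} c2 {{Q}} →
{{P}} If b Then c1 Else c2 EndIf {{Q}}.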
But we can say something more precise. In the "then" branch, we know that
the boolean expression b evaluates to true, and in the "else" branch, we
know it evaluates to false. Making this information available in the premises
of the rule forms a more complete definition of program semantics. Here is the
Coq formalization:
Axiom hoare_if : ∀P Q b c1 c2,
{{ P AND [[b]] }} c1 {{ Q }} →
{{ P AND NOT [[b]] }} c2 {{ Q }} →
{{ P }} If b Then c1 Else c2 EndIf {{ Q }}.
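The rules for Skip and for conditionals can also be checked against a simple relational model, in which they become lemmas rather than axioms. The sketch below is self-contained and hypothetical: it is not the definition used by PL.Imp1, commands are modeled as relations on abstract states, and the condition b is modeled directly as an assertion, playing the role of [[b]].

```coq
Module HoareIfSketch.

(* Hypothetical model: states are left abstract. *)
Parameter state : Type.

Definition Assertion : Type := state -> Prop.
Definition com : Type := state -> state -> Prop.

(* Partial-correctness reading of {{P}} c {{Q}}. *)
Definition hoare_triple (P : Assertion) (c : com) (Q : Assertion) : Prop :=
  forall s1 s2 : state, P s1 -> c s1 s2 -> Q s2.

(* Skip relates every state to itself. *)
Definition skip : com := fun s1 s2 => s1 = s2.

(* Conditional: run c1 when b holds initially, and c2 otherwise. *)
Definition ite (b : Assertion) (c1 c2 : com) : com :=
  fun s1 s2 => (b s1 /\ c1 s1 s2) \/ (~ b s1 /\ c2 s1 s2).

Lemma hoare_skip_sketch : forall P : Assertion, hoare_triple P skip P.
Proof.
  unfold hoare_triple, skip.
  intros P s1 s2 HP Heq.
  rewrite <- Heq.
  exact HP.
Qed.

Lemma hoare_if_sketch : forall (P Q b : Assertion) (c1 c2 : com),
  hoare_triple (fun s => P s /\ b s) c1 Q ->
  hoare_triple (fun s => P s /\ ~ b s) c2 Q ->
  hoare_triple P (ite b c1 c2) Q.
Proof.
  unfold hoare_triple, ite.
  intros P Q b c1 c2 H1 H2 s1 s2 HP [[Hb Hc] | [Hnb Hc]].
  - apply (H1 s1 s2).
    + exact (conj HP Hb).
    + exact Hc.
  - apply (H2 s1 s2).
    + exact (conj HP Hnb).
    + exact Hc.
Qed.

End HoareIfSketch.
```

In each branch of the proof, the extra fact about b is exactly what the strengthened precondition of the corresponding premise needs. The first-try rule has no such fact available in its premises.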
(* Fri Mar 1 17:58:19 CST 2019 *)