With quantified variables, we can write rich equality propositions over functions. However, there is an essential detail in our claims that we have glossed over until now. Recall our list-append
claim from the previous readings:
Claim: for any lists l1 and l2, (+ (length l1) (length l2)) ≡ (length (list-append l1 l2)).
We snuck under the radar that we assumed that l1 and l2 were lists! This fact is essential because Racket is happy to let us call our functions with any values that we want. The only caveat is that we may not like the results!
> (list-append "banana" "orange")
car: contract violation
  expected: pair?
  given: "banana"
Sometimes we might get lucky and get a runtime error indicating that our assumption was violated. However, in other cases, we aren’t so lucky:
> (list-append '() "wrong")
"wrong"
Here we feed list-append a string for its second argument. Recall that list-append performs case analysis on its first argument. When that first argument is null?, the second argument is returned immediately. So even though the second argument has an incorrect type, the function still “works.”
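For reference, here is one way list-append might be defined in this style (a sketch; the definition from the previous readings may differ in its details):

;; Appends l2 onto the end of l1 by case analysis on l1. Note that when
;; l1 is null, l2 is returned immediately, with no check that it is a list.
(define list-append
  (lambda (l1 l2)
    (if (null? l1)
        l2
        (cons (car l1) (list-append (cdr l1) l2)))))

This sketch exhibits both behaviors above: a non-list first argument eventually reaches car and raises a contract violation, while a non-list second argument is passed through unchecked.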
Nothing stops the user of list-append from passing inappropriate arguments. list-append assumes that the arguments passed obey specific properties, in this particular case, that the arguments are lists. These assumptions are integral to the correct behavior of the function. Consequently, when we analyze a program, we will also need to track and utilize these assumptions in our reasoning. This realization will lead us to a formal definition of proof that will guide our remaining study of the foundation of mathematics.
Review: Preconditions and Postconditions
In CSC 151, we captured assumptions about our functions’ requirements and behavior as preconditions and postconditions.
Precondition: a property about the inputs to a function that (a) the caller of the function must fulfill and (b) the function assumes are true.
Postcondition: a property about the behavior of the function that (a) the caller can assume holds provided they fulfilled the preconditions of the function and (b) the function ensures during its execution.
Preconditions and postconditions form a contract between the caller of the function and the function itself, also called the callee. The caller agrees to fulfill the preconditions of the function. In return, the callee fulfills its postconditions under the assumption that the preconditions have been fulfilled.
Preconditions and postconditions are an integral aspect of program design. They formalize the notion that while a function, in theory, can guard against every possible erroneous input, it is impractical to do so. For example, recall that mergesort relies on a helper function merge that takes two lists and combines them into a single sorted list:
;;; (merge l1 l2) -> listof number?
;;; l1 : ??
;;; l2 : ??
;;; Merges lists l1 and l2 into a sorted whole.
(define merge
  (lambda (l1 l2)
    ...))
The postcondition of merge is that the list it returns is sorted, i.e., using our language of propositions:

Postcondition: (sorted? (merge l1 l2)) ≡ #t.
sorted? is a custom function that takes a list l as input and returns #t if and only if l is sorted.
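For concreteness, here is one possible definition of sorted? (a sketch over lists of numbers; the actual definition may differ):

;; Returns #t if and only if the list l is in non-decreasing order.
(define sorted?
  (lambda (l)
    (or (null? l)
        (null? (cdr l))
        (and (<= (car l) (cadr l))
             (sorted? (cdr l))))))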
However, what preconditions do we place upon the inputs, l1 and l2?
- l1 and l2 must both be lists.
- The elements of l1 and l2 must all be comparable to each other (presumably with the < operator).
- l1 and l2 must be sorted.
If the implementor of merge were paranoid, they might consider implementing explicit checks for these three preconditions. However, imagine what these checks might look like in code. All the checks require that we either:
- Scan the lists l1 and l2 multiple times, once for each precondition we wish to check.
- Integrate all three checks into the merging behavior of the function.
The former option is modular (we can write one helper function per check) but inefficient. The latter option is more efficient but highly invasive to the merge function and leads to an unreadable implementation.
Both options are undesirable. This fact is why we usually choose a third option: make the preconditions documented assumptions about the inputs of the function.
Checking Preconditions in Code: note that how we account for preconditions in our programs is ultimately an engineering decision. You have to consider the context you are in (development ecosystem, company culture) and perform a cost-benefit analysis to determine the “correct” approach for your code.
In CSC 151, we introduced a “kernel-husk” implementation that separated precondition checks from the function’s primary behavior. While inefficient, this approach is desirable in some situations because Racket provides little static error checking. Other languages provide different mechanisms to capture precondition checks without incurring runtime costs when they are important. For example, Java has an assert statement whose checks run only when assertions are enabled during development and testing. When you release your software, you can leave assertions disabled (the default), which removes the cost of all the assert checks.
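As a refresher, here is what the kernel-husk pattern might look like for merge (a sketch: the husk checks each precondition with a separate scan, then hands off to a hypothetical merge-kernel that assumes the preconditions hold):

;; Husk: validate the three preconditions, then delegate to the kernel.
(define merge
  (lambda (l1 l2)
    (cond [(not (and (list? l1) (list? l2)))
           (error 'merge "expected two lists")]
          [(not (and (andmap real? l1) (andmap real? l2)))
           (error 'merge "expected lists of comparable numbers")]
          [(not (and (sorted? l1) (sorted? l2)))
           (error 'merge "expected sorted lists")]
          [else (merge-kernel l1 l2)])))

;; Kernel: the actual merging logic, free of precondition checks.
(define merge-kernel
  (lambda (l1 l2)
    (cond [(null? l1) l2]
          [(null? l2) l1]
          [(<= (car l1) (car l2))
           (cons (car l1) (merge-kernel (cdr l1) l2))]
          [else (cons (car l2) (merge-kernel l1 (cdr l2)))])))

Note how the husk scans each list several times before any merging happens, which is precisely the inefficiency discussed above.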
Preconditions as Assumptions During Proof
When we reason about concrete propositions, preconditions are unnecessary because we work with actual values. However, with abstract propositions, preconditions become assumptions about (universally quantified) variables.
We initially obtain these assumptions as part of the proposition we are verifying. For example, in our list-append claim:

Claim: for any lists l1 and l2, (+ (length l1) (length l2)) ≡ (length (list-append l1 l2)).
The text “for any lists …” introduces the precondition that l1 and l2 are lists.
As another example, consider the postcondition of merge now realized as a proposition:

Claim: for any lists of numbers l1 and l2, if (sorted? l1) ≡ #t and (sorted? l2) ≡ #t then (sorted? (merge l1 l2)) ≡ #t.
In addition to the assumptions we have about the type of the variables, we also have arbitrary properties about these variables. The text “if … then …” captures the preconditions between the “if” and “then” that we assume hold when we go to prove the claim which follows “then.” In this example, the additional preconditions are that:
- (sorted? l1) ≡ #t
- (sorted? l2) ≡ #t

And then the claim we prove with these assumptions is (sorted? (merge l1 l2)).
Short-hand for Propositions Involving Booleans
As we discuss preconditions throughout this reading and beyond, we will frequently work with preconditions that assert an equivalence between two expressions of boolean type. For example, to state the proposition that x is less than y, we would declare that it is equivalent to #t:

(< x y) ≡ #t
This notation is quite taxing to write when we have many preconditions in our reasoning. Instead, we’ll use short-hand, taking advantage of the similarities between boolean expressions and propositions. When we write the proposition (< x y), we really mean the complete equivalence (< x y) ≡ #t. With this notation, we can rewrite the above claim more elegantly as:
Claim: for any lists of numbers l1 and l2, if (sorted? l1) and (sorted? l2) then (sorted? (merge l1 l2)).
On Mathematical Short-hand: this short-hand is our first example of a common mathematics practice. In our pursuit of a rigorous, formal definition, we might create a situation where it is highly inconvenient to use this definition in prose. Perhaps this definition is too verbose. Or maybe it is highly parameterized, but in the common case, the parameterization is unnecessary.
In these situations, we introduce short-hand notation in mathematics to concisely write down our ideas. One example of this notation is symbols, e.g., \(x | y\) for “\(x\) divides \(y\).” Alternatively, we may introduce puns, seemingly invalid mathematical syntax given special meaning in context. For example, a Racket expression is not a program equivalence. However, in the above example, we introduce the pun that a boolean Racket expression is, implicitly, a program equivalence with #t.

Mathematical short-hand is a secret sauce of writing mathematics. It allows us to write beautifully concise yet thoroughly precise prose. However, if you aren’t aware of the meaning of symbols and puns in a piece of mathematical writing, you can quickly become lost. Keep an eye out for such short-hand when reading mathematical prose, and you will find that your comprehension improves rapidly as a result.
Exercise: Write a claim about the behavior of the following function that implies its correctness. Make sure to include any necessary preconditions in your claim.
;;; Returns the (0-based) index of element x in list l.
(index-of x l)
Tracking and Utilizing Assumptions
Assumptions about our variables initially come from preconditions we write into our claims. How do we use these assumptions in our proofs of program correctness? It turns out we have been doing this already without realizing it!
For example, in our analysis of the boolean my-and function from our previous reading, we said the following:
Because b is a boolean, either b is #t or b is #f.
This line of reasoning is only correct because we assumed, via a precondition of the claim:

Claim: for all booleans b, (my-and b #f) ≡ #f.

that b is indeed a boolean. If we did not have such a precondition, this line of reasoning would be invalid!
When we have an assumption that asserts the type of a universally quantified variable, we can use that assumption to conclude that the variable is of a particular shape according to that type. For example:
- A variable that is a number? is indeed a number, e.g., -1, \(\frac{3}{4}\), and 3.4170.
- A variable that is a procedure? is bound to a lambda value, e.g., (lambda (n) (+ n 1)).
- A variable that is a boolean? is either #t or #f.
Furthermore, if an operation requires a value of a particular type as a precondition, a type assumption about a variable allows us to conclude that the variable fulfills that precondition. For example, if we know that x is a number?, then we know that (> x 3) will produce a boolean result instead of potentially throwing an error.
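We can see this at the REPL (a sketch; the exact error text may vary between Racket versions):

> (> 5 3)
#t
> (> "five" 3)
>: contract violation
  expected: real?
  given: "five"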
Utilizing General Assumptions
Besides type assumptions, we can also assert general properties about our variables as preconditions. In our merge example, we assumed that the input lists satisfied the sorted? predicate, i.e., were sorted. We can then use this fact to deduce other properties of our lists that will help us ascertain correctness, e.g., that the smallest elements of each list are at the front.
Because these properties are general, we have to reason about them in a context-specific manner. Let’s look at a simple example of reasoning about one common kind of assumption: numeric constraints. Consider the following simple numeric function:
(define double
  (lambda (n)
    (* n 2)))
And suppose we want to prove the following claim about this function:
Claim: For all numbers n, if (> n 0) then (> (double n) 0).
Employing our symbolic techniques, we first assume that n is an arbitrary number. However, the claim that follows the quantification of n includes a precondition: it is of the form “if … then ….” Therefore, in addition to n, we also gain the assumption that (> n 0), i.e., n is positive.
When we go to prove that (> (double n) 0), we know from our model of computation that:

(> (double n) 0) --> (> (* n 2) 0)
This resulting expression is not always true. In particular, if n were negative, then (* n 2) would be negative and thus less than 0. However, our assumption that (> n 0) tells us this is not the case!
Here is a complete proof of the claim that employs this observation.
Claim: For all numbers n, if (> n 0) then (> (double n) 0).
Proof. Let n be a number and assume that (> n 0). By the definition of double, we have that:

(> (double n) 0) --> (> (* n 2) 0)
However, by our assumption that (> n 0), we know that (* n 2) is a positive quantity: multiplying two positive numbers results in a positive number. Therefore, (> (* n 2) 0) holds.
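While no substitute for the proof, we can also gather evidence for the claim by testing double on a handful of positive inputs (a quick sketch using andmap):

;; Evidence, not proof: check the claim on a few sample positive numbers.
> (andmap (lambda (n) (> (double n) 0))
          (list 1 2 3 1/2 0.001 1000000))
#t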
Being Explicit When Invoking Assumptions
In the above proof, I was explicit about:
- When I used an assumption in a step of reasoning, and
- How I used that assumption to infer new information.
At this stage of your development as a mathematician, you should do the same for any other assumption you employ in a proof. As an example of what not to do, here is the same proof but without the explicit call-out of the assumption:
Claim: For all numbers n, if (> n 0) then (> (double n) 0).
Proof. Let n be a number and assume that (> n 0). By the definition of double, we have that:

(> (double n) 0) --> (> (* n 2) 0)
Therefore, (> (* n 2) 0) holds.
This proof is valid. The steps presented and the conclusion are sound. However, this is a less formal proof because it has left some steps of reasoning implicit. In this course, we aim for rigorous, formal proof, and so we should not leave any steps implicit in our reasoning.
That being said, we will quickly find that dogmatically following this guidance will lead us to ruin. Analogous to programming, if we write our proofs at too low of a level, we will get lost in the weeds with low-level details and lose sight of the bigger picture.
For now, let’s be ultra-explicit about our reasoning and follow the formula above whenever we invoke an assumption. However, because reasoning about type constraints happens so frequently, it is okay to elide when you use a type assumption in your proofs. For example, instead of saying:
By our assumptions, we know b is a boolean. Therefore, b must either be #t or #f.
You can, instead, say:
b is either #t or #f.
However, you should still be clear when you are assuming the type of a variable using a “let”-style statement, e.g.,
Let b be a boolean.
Assumptions that Arise During Analysis
Assumptions arise not only when we initially process our claim. We also gain new assumptions during the proving process. For example, let’s consider the following simple function:
(define nat-sub
  (lambda (x y)
    (if (< x y)
        0
        (- x y))))
And the following claim about this function:
Claim: for all numbers x and y, (>= (nat-sub x y) 0).
Here is a partial proof of this claim:
Proof. Let x and y be numbers. Then we have that:

(nat-sub x y) --> (if (< x y) 0 (- x y))
But now, we’re stuck! We need to evaluate the guard to proceed, but we don’t know how the expression (< x y) will evaluate. However, we do know the following:
- x and y are numbers.
- (< x y) will produce a result because x and y are numbers.
- The result of (< x y) is either #t or #f because the expression has boolean type.
Because of this, we can proceed by case analysis on the result of (< x y): it evaluates to either #t or #f. Let us consider each case in turn:
- (< x y) --> #t: In this case:

  (if (< x y) 0 (- x y)) --> (if #t 0 (- x y)) --> 0

  So (nat-sub x y) -->* 0, which is greater than or equal to 0.
- (< x y) --> #f: In this case:

  (if (< x y) 0 (- x y)) --> (if #f 0 (- x y)) --> (- x y)
Here, we seem to be stuck again. We don’t know how to proceed with the subtraction since x and y are held abstract. However, we must remember that in this case, we are assuming that (< x y) is #f. Therefore, we know that x is greater than or equal to y. Because we know the difference between a larger (or equal) number and a smaller number is non-negative, we can conclude that (>= (- x y) 0) as desired.
Let’s see this reasoning together in a complete proof of our claim.
Claim: for all numbers x and y, (>= (nat-sub x y) 0).
Proof. Let x and y be numbers. Then we have that:

(nat-sub x y) --> (if (< x y) 0 (- x y))
Either (< x y) --> #t or (< x y) --> #f.
- (< x y) --> #t. Then we have:

  (if (< x y) 0 (- x y)) --> (if #t 0 (- x y)) --> 0

  Thus, (nat-sub x y) -->* 0, a non-negative result.
- (< x y) --> #f. Then we have:

  (if (< x y) 0 (- x y)) --> (if #f 0 (- x y)) --> (- x y)
From our case analysis, we assume that (< x y) evaluates to #f, i.e., that x is greater than or equal to y. Because we know subtracting a smaller (or equal) number from a larger number results in a non-negative quantity, we can conclude that (>= (- x y) 0).
In summary, we can obtain assumptions from places other than our claims, e.g., through case analysis or the postcondition of an applied function. We can then use these newly acquired facts to complete our proofs.
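As with double, we can supplement the proof with quick spot-checks of nat-sub over sample inputs (a sketch; again, evidence rather than proof):

;; Spot-check that nat-sub never returns a negative result.
> (andmap (lambda (p) (>= (nat-sub (car p) (cadr p)) 0))
          (list '(1 5) '(5 1) '(3 3) '(0 10)))
#t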
Exercise (Nesting): Recall that a cond expression is like a collection of nested conditionals. We test each branch of a cond in top-down order until we find a true branch. We then substitute the branch’s expression for the overall cond expression.
Consider the following cond expression:
(cond [(< x 0) e1]
      [(<= x 3) e2]
      [(= x 5) e3]
      [(<= x 8) e4]
      [else e5])
For each of the branch expressions e1 through e5 above, describe the assumptions about x you can make by entering that branch of the cond.
(Hint: make sure you account for the fact that to reach a branch, all the previous branches must have returned false.)
Proof States and Proving
In summary, we have introduced assumptions into our proofs of program correctness. These assumptions come from preconditions or through analysis of our code. We use these assumptions to prove our claim. Because our claim evolves throughout the proof, it is more accurate to say that we use assumptions to prove a goal proposition, where the initial goal proposition is our original claim.
Surprisingly, it turns out that all mathematical proofs can be thought of in these terms!
Definition (Proof State): the state of a proof or proof state is a pair of a set of assumptions and a goal proposition to prove.
Definition (Proof): a proof is a sequence of logical steps that manipulate a proof state towards a final, provable result.
When either reading or writing a mathematical proof, we should keep these definitions in mind. In particular, we have to be aware at every point of the proof:
- What is our set of assumptions?
- What are we trying to prove?
Exercise (Entry Fees): Consider the following function that calculates the entry fee of a single ticket to Disney World in 2021 (mined from the Disney website):
(define calculate-price
  (lambda (day-of-week age park-hopper?)
    (+ 115
       (if (< age 10) -5 0)
       (if (friday-or-weekend? day-of-week) 14 0)
       (if park-hopper? 65 0))))
Prove the following claim:
Claim: for all days of the week d, natural numbers n, and booleans b, if b ≡ #t then (>= (calculate-price d n b) 175).
(Note: yea, Disney World is expensive, huh?)