Where have we been?
Impcore:
-
no new programming-language ideas
-
programming with assignments, etc.
-
function calls
-
-
lots of new math
-
operational semantics (judgements and inference rules)
-
operational semantics (theory and metatheory)
-
Where are we going?
Scheme:
-
programming with recursive data structures
-
programming with first-class functions
For a new language, five powerful questions
As a lens for understanding, you can ask these questions about any language:
-
What is the abstract syntax? What are the syntactic categories and what are the terms in each category?
-
What are the values? What do expressions/terms evaluate to?
Small aside: Why consider values separately from abstract syntax? Abstract syntax corresponds to what we write down in source code, while values correspond to what is manipulated during the evaluation of a program. Sometimes these overlap; for example, a number may be both abstract syntax (a numeric literal written in source code) and a value (a machine integer manipulated during evaluation). But there are important cases where the two do not overlap. Consider objects in Java. We write new C() in source code, but at run time we manipulate an actual object (a combination of class name and instance variables). We cannot write down an object value directly. Also consider pointers in C. We can write malloc(4) in source code, but at run time we manipulate an actual pointer. In proper C, we cannot write down a pointer value directly; pointer values exist only during the evaluation of a program.
-
What are the environments? What can names stand for?
-
How are terms evaluated? What are the judgments and inference rules?
-
What is in the initial basis? Primitives and otherwise, what is built in?
Introduction to Scheme
Question #2: What are the values?
Two new kinds of data:
-
cons cell: pointer to an automatically managed (i.e., garbage-collected) pair of values
-
function closure: first-class functions; a powerful new feature, not just a datum
Values of Scheme
Values are S-expressions (symbolic expressions).
Simplification for now:
-
An S-expression is an integer literal, a boolean literal, a symbol, or a list of S-expressions.
-
A list of S-expressions is either '() (the empty list) or an S-expression followed by a list of S-expressions.
Like any other abstract data type
-
creators create new values of the type: 1, #t, 'a, '()
-
producers make new values from existing values: (+ i j), (not b), (cons x xs)
-
observers examine values of the type: number?, boolean?, symbol?, null?, pair?, car, cdr
-
mutators change values of the type: none in uScheme
Lists
Lists are a subset of S-expressions.
Definition of lists of numbers:
Two ways of defining lists of numbers (as S-expressions):
-
\(\mathrm{IntList}\) is the smallest set satisfying
\[\mathrm{IntList} = \{ \mathtt{'()} \} \cup \{ \mathtt{(}\mathtt{cons}~a~as\mathtt{)} ~|~ a \in \mathrm{Int}, as \in \mathrm{IntList} \}\]
where \(\mathrm{Int}\) is the set of integer literal values.
-
\(z \in \mathrm{IntList}\) is a judgement defined by the inference rules
\[\frac{ }{ \mathtt{'()} \in \mathrm{IntList} }~(\mathit{Empty}) \quad\quad\quad \frac{ a \in \mathrm{Int} \quad as \in \mathrm{IntList} }{ \mathtt{(}\mathtt{cons}~a~as\mathtt{)} \in \mathrm{IntList} }~(\mathit{Cons})\]
Definition of lists:
More generally, two ways of defining lists of \(A\)s, where \(A\) is some other well-defined set of values:
-
\(\mathrm{List}(A)\) is the smallest set satisfying
\[\mathrm{List}(A) = \{ \mathtt{'()} \} \cup \{ \mathtt{(}\mathtt{cons}~a~as\mathtt{)} ~|~ a \in A, as \in \mathrm{List}(A) \}\]
-
\(z \in \mathrm{List}(A)\) is a judgement defined by the inference rules
\[\frac{ }{ \mathtt{'()} \in \mathrm{List}(A) }~(\mathit{Empty}) \quad\quad\quad \frac{ a \in A \quad as \in \mathrm{List}(A) }{ \mathtt{(}\mathtt{cons}~a~as\mathtt{)} \in \mathrm{List}(A) }~(\mathit{Cons})\]
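The two inference rules can be read directly as a recursive membership test. The following is only a sketch in Python, not uScheme: the names Pair, NIL, is_list_of, and is_int are assumptions made for illustration, with a named tuple standing in for a cons cell and None standing in for '().

```python
from collections import namedtuple

Pair = namedtuple("Pair", ["car", "cdr"])  # assumed model of a cons cell
NIL = None                                  # assumed model of '()

def is_list_of(in_a, z):
    """Decide whether z is in List(A), where in_a(v) decides whether v is in A."""
    if z is NIL:                       # rule Empty: no premises
        return True
    if isinstance(z, Pair):            # rule Cons: check both premises
        return in_a(z.car) and is_list_of(in_a, z.cdr)
    return False                       # no rule applies, so z is not a list

def is_int(v):
    # bool is a subclass of int in Python, so exclude it explicitly
    return isinstance(v, int) and not isinstance(v, bool)

good = Pair(1, Pair(2, NIL))    # models (cons 1 (cons 2 '()))
bad = Pair(1, Pair("a", NIL))   # second element is not an integer
```

Because the set is the smallest one satisfying the equation, a value belongs to it only when some rule derives it, which is why the final case returns False.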
Lists as an abstract datatype
-
creators/producers: '(), (cons x xs)
-
observers: null?, pair?, car, cdr (car and cdr are also known as "first"/"rest", "head"/"tail", and many other names)
-
algebraic laws
-
(null? '()) == #t
-
(null? (cons v vs)) == #f
-
(pair? '()) == #f
-
(pair? (cons v vs)) == #t
-
(car (cons v vs)) == v
-
(cdr (cons v vs)) == vs
-
Why are lists useful?
-
Sequences are a frequently used abstraction
-
Can easily approximate a set
-
Can implement finite maps with association lists (aka dictionaries)
-
You don’t have to manage memory
These "cheap and cheerful" representations are less efficient than balanced search trees, but they are very easy to implement and work with; the book has many examples.
The only thing new here is automatic memory management. Everything else you could do in C. (You can have automatic memory management in C as well.)
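To make the list abstraction concrete, here is a small Python model of cons cells, offered as a sketch rather than as uScheme itself. It checks the six algebraic laws on one sample value and then uses a list of pairs as an association list. The names Pair, NIL, is_null, is_pair, and find are assumptions chosen for this illustration.

```python
from collections import namedtuple

# Assumed model: a cons cell is an immutable two-field record; '() is None.
Pair = namedtuple("Pair", ["car", "cdr"])
NIL = None

def cons(v, vs):
    return Pair(v, vs)

def car(p):
    return p.car

def cdr(p):
    return p.cdr

def is_null(v):
    return v is NIL

def is_pair(v):
    return isinstance(v, Pair)

# The six algebraic laws, checked on one sample value.
v, vs = 1, cons(2, NIL)
assert is_null(NIL) is True
assert is_null(cons(v, vs)) is False
assert is_pair(NIL) is False
assert is_pair(cons(v, vs)) is True
assert car(cons(v, vs)) == v
assert cdr(cons(v, vs)) == vs

def find(key, alist):
    """Look up key in an association list of (key . value) pairs; NIL if absent."""
    if is_null(alist):
        return NIL
    if car(car(alist)) == key:
        return cdr(car(alist))
    return find(key, cdr(alist))

# Models the association list ((x . 10) (y . 20)).
table = cons(cons("x", 10), cons(cons("y", 20), NIL))
```

Lookup walks the list from the front, so it takes time proportional to the length of the list, which is the inefficiency the comparison with balanced search trees refers to.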
Recursive functions on lists
Lists are inductively defined; lists are recursively processed.
Any list is constructed with either '() or cons.
-
What observers allow you to tell the difference?
length
(define length (xs)
  (if (null? xs) 0
      (+ 1 (length (cdr xs)))))
-> (length '(1 2 3 4))
4
-> (length '(1 (2 3 (4 5 6) 7 8) 9))
3
total-length
Note that the length defined above only counts the number of elements in the outermost list structure. If one element of the outermost list is itself a list, then it only counts as 1 towards the length.
What if we wanted the total length of a list, counting not just the number of elements in the outermost list structure, but also the number of elements of lists that are themselves elements of other lists?
(define total-length (xs)
  (if (null? xs) 0
      (if (list? (car xs))
          (+ (total-length (car xs)) (total-length (cdr xs)))
          (+ 1 (total-length (cdr xs))))))
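The difference between the two functions is easy to see on the nested example used earlier. The sketch below models both in Python; Pair, NIL, from_py, and is_list are assumed helper names, and is_list only checks the outermost shape of its argument, which is an assumption about what list? is meant to test here.

```python
from collections import namedtuple

Pair = namedtuple("Pair", ["car", "cdr"])  # assumed model of a cons cell
NIL = None                                  # assumed model of '()

def from_py(items):
    """Build a cons list from a Python list; nested Python lists become nested cons lists."""
    result = NIL
    for item in reversed(items):
        if isinstance(item, list):
            item = from_py(item)
        result = Pair(item, result)
    return result

def is_list(v):
    # Shallow check: the empty list or a cons cell.
    return v is NIL or isinstance(v, Pair)

def length(xs):
    return 0 if xs is NIL else 1 + length(xs.cdr)

def total_length(xs):
    if xs is NIL:
        return 0
    if is_list(xs.car):
        # The first element is itself a list: count its elements, then the rest.
        return total_length(xs.car) + total_length(xs.cdr)
    return 1 + total_length(xs.cdr)

nested = from_py([1, [2, 3, [4, 5, 6], 7, 8], 9])  # models '(1 (2 3 (4 5 6) 7 8) 9)
```

Note that under this definition an element that is the empty list contributes 0 to the total.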
append
Consider the algebraic laws that we want append to satisfy. We use informal "math" notation with .. for "followed by" and e for the empty sequence:
-
xs .. e == xs
-
e .. ys == ys
-
(x .. xs) .. ys == x .. (xs .. ys)
-
xs .. (y .. ys) == (xs .. y) .. ys
Some of these .. correspond to append (of two lists, xs .. ys), some correspond to cons (of an element and a list, x .. xs), and some correspond to snoc (of a list and an element, xs .. y).
-
But we have no snoc; strike the last law.
-
The first law is extraneous, since the second and third laws form a complete case analysis on the first argument.
Use the second and third laws to guide the Scheme implementation:
(define append (xs ys)
  (if (null? xs) ys
      (cons (car xs) (append (cdr xs) ys))))
The dominant cost is cons (i.e., allocation of a new list element).
How many cons cells are allocated by (append xs ys), in terms of the lengths of xs and ys?
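One way to explore the question is to instrument cons with a counter. The following Python sketch does that for a model of append; the global allocated, the helper from_py, and the Pair and NIL names are all assumptions made for the experiment, and the input lists are built without going through the counted cons.

```python
from collections import namedtuple

Pair = namedtuple("Pair", ["car", "cdr"])  # assumed model of a cons cell
NIL = None                                  # assumed model of '()
allocated = 0                               # running count of cons cells created

def cons(v, vs):
    global allocated
    allocated += 1
    return Pair(v, vs)

def append(xs, ys):
    if xs is NIL:
        return ys
    return cons(xs.car, append(xs.cdr, ys))

def from_py(items):
    result = NIL
    for item in reversed(items):
        result = Pair(item, result)   # built directly, so not counted
    return result

xs = from_py([1, 2, 3])
ys = from_py([4, 5])
allocated = 0
zs = append(xs, ys)
```

In this run the counter ends at the length of xs, and the tail of the result is the very same structure as ys: append copies its first argument and shares its second.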
naive reverse
Consider the algebraic laws that we want reverse to satisfy:
-
reverse e == e
-
reverse (x .. xs) == (reverse xs) .. x
Some of these .. correspond to cons (of an element and a list, x .. xs), and some correspond to snoc (of a list and an element, (reverse xs) .. x).
-
We can define snoc in terms of append:
(define snoc (xs x) (append xs (list1 x)))
Use the two laws to guide the Scheme implementation:
(define reverse (xs)
  (if (null? xs) '()
      (append (reverse (cdr xs)) (list1 (car xs)))))
How many cons cells are allocated by (reverse xs), in terms of the length of xs?
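One way to set up the count is as a recurrence. The sketch below assumes that (list1 x) allocates exactly one cons cell, that is, (list1 x) behaves like (cons x '()), and that append allocates one cell per element of its first argument, as its definition suggests. Here \(R(n)\) is a name introduced only for this derivation: the number of cells allocated by the naive reverse on a list of length \(n\).

```latex
\begin{align*}
R(0) &= 0 \\
R(n) &= \underbrace{R(n-1)}_{\text{recursive call}}
      + \underbrace{(n-1)}_{\text{append}}
      + \underbrace{1}_{\text{list1}}
      = R(n-1) + n \qquad (n \ge 1) \\
R(n) &= \sum_{k=1}^{n} k = \frac{n(n+1)}{2}
\end{align*}
```

Under these assumptions the naive reverse allocates a number of cells that is quadratic in the length of its argument.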
accumulating reverse
Consider a different set of algebraic laws that we want reverse to satisfy:
-
(reverse e) .. zs == zs
-
(reverse (x .. xs)) .. zs == (reverse xs) .. (x .. zs)
Some of these .. correspond to append (of two lists) and some correspond to cons (of an element and a list).
(define revapp (xs zs)
  (if (null? xs) zs
      (revapp (cdr xs) (cons (car xs) zs))))
(define reverse (xs)
  (revapp xs '()))
Parameter zs is the accumulating parameter. (A powerful, general technique.)
How many cons cells are allocated by (reverse xs), in terms of the length of xs?
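The two versions of reverse can be compared experimentally by counting calls to cons. This Python sketch models both; reverse_naive, reverse_accum, count_cells, from_py, and the global allocated are assumed names, and list1 is modeled as a single cons onto the empty list.

```python
from collections import namedtuple

Pair = namedtuple("Pair", ["car", "cdr"])  # assumed model of a cons cell
NIL = None                                  # assumed model of '()
allocated = 0                               # running count of cons cells created

def cons(v, vs):
    global allocated
    allocated += 1
    return Pair(v, vs)

def append(xs, ys):
    return ys if xs is NIL else cons(xs.car, append(xs.cdr, ys))

def reverse_naive(xs):
    if xs is NIL:
        return NIL
    # (list1 x) is modeled as (cons x '()), which costs one cell
    return append(reverse_naive(xs.cdr), cons(xs.car, NIL))

def revapp(xs, zs):
    return zs if xs is NIL else revapp(xs.cdr, cons(xs.car, zs))

def reverse_accum(xs):
    return revapp(xs, NIL)

def count_cells(f, xs):
    """Run f(xs) and report (result, number of cons cells it allocated)."""
    global allocated
    allocated = 0
    result = f(xs)
    return result, allocated

def from_py(items):
    result = NIL
    for item in reversed(items):
        result = Pair(item, result)   # built directly, so not counted
    return result

xs = from_py([1, 2, 3, 4])
naive_result, naive_cells = count_cells(reverse_naive, xs)
accum_result, accum_cells = count_cells(reverse_accum, xs)
```

On this four-element input the naive version allocates 10 cells and the accumulating version allocates 4, while both produce the same reversed list.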
Algebraic Laws, Equational Reasoning, and Calculational Proofs
One might question whether the accumulating reverse is really the same function as the naive reverse. We can use equational reasoning to prove that the two functions really are equivalent.
Equational reasoning is a simple, but powerful, proof technique that
only requires expanding (or contracting) the definitions of functions
and substituting equals for equal. When applied to a recursive
structure like lists, the proofs are by structural induction.
Structural induction simply requires proving the algebraic law for the empty list (the base case) and, assuming that the algebraic law holds for the list zs (the induction hypothesis), proving that the law holds for (cons z zs) (the step case).
The key to proving that the accumulating reverse is equivalent to the naive reverse is proving that revapp is equivalent to the append of the (naive) reverse of the first argument and the second argument. Thus, we prove
Theorem: (revapp xs zs) == (append (reversenaive xs) zs)
Proof: by structural induction on the list xs and equational reasoning
-
Case xs == '()
  (revapp xs zs)
= { xs == '() }
  (revapp '() zs)
= { defn of revapp }
  (if (null? '()) zs (revapp (cdr '()) (cons (car '()) zs)))
= { null?-empty law }
  (if #t zs (revapp (cdr '()) (cons (car '()) zs)))
= { if-#t law }
  zs
= { if-#t law }
  (if #t zs (cons (car '()) (append (cdr '()) zs)))
= { null?-empty law }
  (if (null? '()) zs (cons (car '()) (append (cdr '()) zs)))
= { defn of append }
  (append '() zs)
= { if-#t law }
  (append (if #t '() (append (reversenaive (cdr '())) (cons (car '()) '()))) zs)
= { null?-empty law }
  (append (if (null? '()) '() (append (reversenaive (cdr '())) (cons (car '()) '()))) zs)
= { defn of reversenaive }
  (append (reversenaive '()) zs)
= { xs == '() }
  (append (reversenaive xs) zs)
-
Case xs == (cons a as) with IH (revapp as bs) == (append (reversenaive as) bs) (for any bs)
  (revapp xs zs)
= { xs == (cons a as) }
  (revapp (cons a as) zs)
= { defn of revapp }
  (if (null? (cons a as)) zs (revapp (cdr (cons a as)) (cons (car (cons a as)) zs)))
= { null?-cons law }
  (if #f zs (revapp (cdr (cons a as)) (cons (car (cons a as)) zs)))
= { if-#f law }
  (revapp (cdr (cons a as)) (cons (car (cons a as)) zs))
= { cdr-cons law }
  (revapp as (cons (car (cons a as)) zs))
= { car-cons law }
  (revapp as (cons a zs))
= { IH, with bs == (cons a zs) }
  (append (reversenaive as) (cons a zs))
= { append-sing-left law (proved in PL:BPC, p. 115) }
  (append (reversenaive as) (append (cons a '()) zs))
= { append-associative law (assumed below) }
  (append (append (reversenaive as) (cons a '())) zs)
= { car-cons law }
  (append (append (reversenaive as) (cons (car (cons a as)) '())) zs)
= { cdr-cons law }
  (append (append (reversenaive (cdr (cons a as))) (cons (car (cons a as)) '())) zs)
= { if-#f law }
  (append (if #f '() (append (reversenaive (cdr (cons a as))) (cons (car (cons a as)) '()))) zs)
= { null?-cons law }
  (append (if (null? (cons a as)) '() (append (reversenaive (cdr (cons a as))) (cons (car (cons a as)) '()))) zs)
= { defn of reversenaive }
  (append (reversenaive (cons a as)) zs)
= { xs == (cons a as) }
  (append (reversenaive xs) zs)
Note that this proof assumes that append is associative, which we leave as an exercise for the reader.
Theorem: (append xs (append ys zs)) == (append (append xs ys) zs)
Proof: by structural induction on the list xs
Our final proof will rely on the fact that appending the empty list on the right is the identity:
Theorem: (append xs '()) == xs
Proof: by structural induction on the list xs and equational reasoning
-
Case xs == '()
  (append xs '())
= { xs == '() }
  (append '() '())
= { defn of append }
  (if (null? '()) '() (cons (car '()) (append (cdr '()) '())))
= { null?-empty law }
  (if #t '() (cons (car '()) (append (cdr '()) '())))
= { if-#t law }
  '()
= { xs == '() }
  xs
-
Case xs == (cons a as) with IH (append as '()) == as
  (append xs '())
= { xs == (cons a as) }
  (append (cons a as) '())
= { defn of append }
  (if (null? (cons a as)) '() (cons (car (cons a as)) (append (cdr (cons a as)) '())))
= { null?-cons law }
  (if #f '() (cons (car (cons a as)) (append (cdr (cons a as)) '())))
= { if-#f law }
  (cons (car (cons a as)) (append (cdr (cons a as)) '()))
= { car-cons law }
  (cons a (append (cdr (cons a as)) '()))
= { cdr-cons law }
  (cons a (append as '()))
= { IH }
  (cons a as)
= { xs == (cons a as) }
  xs
Finally, we can complete our argument that reverseaccum and reversenaive are equivalent:
Theorem: (reverseaccum xs) == (reversenaive xs)
Proof: by equational reasoning
  (reverseaccum xs)
= { defn of reverseaccum }
  (revapp xs '())
= { revapp-specification law (proved above) }
  (append (reversenaive xs) '())
= { append-empty-right law (proved above) }
  (reversenaive xs)
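Testing is no substitute for the proofs, but it is a cheap sanity check on the laws they establish. The Python sketch below evaluates each law on a handful of small lists; Pair, NIL, from_py, samples, and the four result names are assumptions made for this check, and list1 is modeled as a single cons onto the empty list.

```python
from collections import namedtuple
from itertools import product

Pair = namedtuple("Pair", ["car", "cdr"])  # assumed model of a cons cell
NIL = None                                  # assumed model of '()

def cons(v, vs):
    return Pair(v, vs)

def append(xs, ys):
    return ys if xs is NIL else cons(xs.car, append(xs.cdr, ys))

def reverse_naive(xs):
    if xs is NIL:
        return NIL
    return append(reverse_naive(xs.cdr), cons(xs.car, NIL))

def revapp(xs, zs):
    return zs if xs is NIL else revapp(xs.cdr, cons(xs.car, zs))

def reverse_accum(xs):
    return revapp(xs, NIL)

def from_py(items):
    result = NIL
    for item in reversed(items):
        result = cons(item, result)
    return result

# Lists of length 0 through 4: '(), '(0), '(0 1), ...
samples = [from_py(list(range(n))) for n in range(5)]

# Each law is evaluated on every combination of sample lists.
revapp_law = all(revapp(xs, zs) == append(reverse_naive(xs), zs)
                 for xs, zs in product(samples, samples))
append_nil_law = all(append(xs, NIL) == xs for xs in samples)
append_assoc_law = all(append(xs, append(ys, zs)) == append(append(xs, ys), zs)
                       for xs, ys, zs in product(samples, repeat=3))
reverse_law = all(reverse_accum(xs) == reverse_naive(xs) for xs in samples)
```

A check like this covers only the sampled inputs; the structural induction is what extends the result to every list.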
More Truth about S-expressions
Correcting our simplification of S-expressions.
-
An S-expression is an integer literal, a boolean literal, a symbol, the empty list, or a pair of two S-expressions.
A cons can pair any two values, not just an element and a list.
-
a "list" might have elements of different types:
(cons 1 (cons #t (cons 'a '())))
-
a "cons" need not have a list as its second element:
(cons 1 2)
A proper list is either the empty list or a pair whose second element is a proper list.
Definition of S-expressions:
Two ways of defining S-expressions:
-
\(\mathrm{Atom}\) and \(\mathrm{SExp}\) are the smallest sets satisfying
\[\begin{array}{l} \mathrm{Atom} = \mathrm{Num} \cup \mathrm{Bool} \cup \mathrm{Sym} \cup \{ \mathtt{'()} \} \\ \mathrm{SExp} = \mathrm{Atom} \cup \{ \mathtt{(}\mathtt{cons}~v_1~v_2\mathtt{)} ~|~ v_1 \in \mathrm{SExp}, v_2 \in \mathrm{SExp} \} \end{array}\]
-
\(z \in \mathrm{Atom}\) and \(z \in \mathrm{SExp}\) are judgements defined by the inference rules
\[\begin{array}{c} \frac{ z \in \mathrm{Num} }{ z \in \mathrm{Atom} } \quad\quad\quad \frac{ z \in \mathrm{Bool} }{ z \in \mathrm{Atom} } \quad\quad\quad \frac{ z \in \mathrm{Sym} }{ z \in \mathrm{Atom} } \quad\quad\quad \frac{ }{ \mathtt{'()} \in \mathrm{Atom} } \\ \frac{ z \in \mathrm{Atom} }{ z \in \mathrm{SExp} } \quad\quad\quad \frac{ v_1 \in \mathrm{SExp} \quad v_2 \in \mathrm{SExp} }{ \mathtt{(}\mathtt{cons}~v_1~v_2\mathtt{)} \in \mathrm{SExp} } \end{array}\]
Structural Equality of S-expressions:
uScheme provides a primitive = that works on numbers, booleans, symbols, and the empty list, but never on cons cells. It is only useful for comparing atoms.
Define equal?, which will identify isomorphic S-expressions, including lists as a special case.
(define atom? (x)
  (or (number? x) (or (symbol? x) (or (boolean? x) (null? x)))))
(define equal? (s1 s2)
  (if (or (atom? s1) (atom? s2))
      (= s1 s2)
      (if (and (pair? s1) (pair? s2))
          (and (equal? (car s1) (car s2))
               (equal? (cdr s1) (cdr s2)))
          #f)))
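The same case analysis can be modeled in Python as a sketch. Here prim_eq stands in for the primitive = and is written so that it never equates two cons cells; Pair, NIL, is_atom, prim_eq, and equal are assumed names, symbols are modeled as Python strings, and every value that is not a Pair is treated as an atom.

```python
from collections import namedtuple

Pair = namedtuple("Pair", ["car", "cdr"])  # assumed model of a cons cell
NIL = None                                  # assumed model of '()

def is_atom(x):
    # In this model, anything that is not a cons cell counts as an atom.
    return not isinstance(x, Pair)

def prim_eq(a, b):
    """Model of the primitive =: compares atoms, never equates cons cells."""
    if isinstance(a, Pair) or isinstance(b, Pair):
        return False
    # Compare types too, since Python treats True == 1 as true.
    return type(a) is type(b) and a == b

def equal(s1, s2):
    if is_atom(s1) or is_atom(s2):
        return prim_eq(s1, s2)
    # Both are cons cells: compare the two components structurally.
    return equal(s1.car, s2.car) and equal(s1.cdr, s2.cdr)

same_a = Pair(1, Pair("a", NIL))   # models (cons 1 (cons 'a '()))
same_b = Pair(1, Pair("a", NIL))   # a separately built, isomorphic value
improper = Pair(1, 2)              # models (cons 1 2)
```

The primitive comparison rejects the two isomorphic lists because they are cons cells, while the recursive definition accepts them, and it also handles pairs that are not proper lists.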