Why formalize a programming language?
(Or, why have Section 1.5 of the textbook in addition to Sections 1.2 and 1.6?)
Consider some of the ways to "present" a programming language:
- Prose tutorial (like Section 1.2 of the textbook)
- Reference implementation (like Section 1.6 of the textbook)
- Formal (i.e., mathematical) definition (like Section 1.5 of the textbook)
Why isn’t a prose tutorial enough? A tutorial usually gives a number of concrete examples. However, a tutorial clearly cannot give all programs (and their behaviors). A tutorial is also usually organized from simple concepts to more complex concepts, but this sometimes requires revisiting a previous topic when more advanced concepts are available. (For example, revisit the topic on method declarations in Java to discuss the exception list after exceptions have been introduced.) A tutorial may therefore miss explaining some "interesting" interactions between language features.
Why isn’t a reference implementation enough? An implementation necessarily makes particular choices about how to implement language features, but it would be a mistake to consider these implementation details to be part of the language. For example, an implementation will (usually) generate informative error messages when the source program has syntax errors, but the exact form of the error messages is not part of the definition of the programming language. (Indeed, different compilers can "compete" to provide the best error messages, while remaining implementations of the same programming language.)
What does a formal definition provide? A formal definition is a focused, concise, and precise specification of a programming language. It should be unambiguous, meaning that different people will conclude the same things about the programming language from the definition and different implementations based on the definition will behave the same.
Note that all of these ways of presenting a programming language have their place in the programming language’s ecosystem. An implementation (reference or otherwise) is needed to actually execute programs. And prose tutorials are needed to introduce the language; it would be inappropriate to teach CS1 and CS2 by just presenting a formal definition of Python and/or Java. A formal definition does not tell us how to use the programming language or how to use it well.
Operational Semantics
Operational semantics is one way to formally define a programming language. In essence, an operational semantics for a programming language is a set of formal rules for interpreting programs on an abstract machine.
At a high level, an operational semantics answers three questions:
- What are the expressions (source programs)?
- What are the values (final results)?
- What are the rules for turning expressions into values?
Typically, an operational semantics consists of:
- abstract syntax of programs (and program components) --- what gets evaluated
  In Impcore, we have expressions
- abstract syntax of values --- the final results of evaluation
  In Impcore, we have integers
- abstract machine states (and initial and final abstract machine states) --- program/value plus any other components necessary for evaluation
  In Impcore, we have \(\langle e, \xi, \phi, \rho \rangle\) (an expression plus three environments) and \(\langle v, \xi, \phi, \rho \rangle\) (a value plus three environments)
- judgements and inference rules --- how to evaluate
  In Impcore, we have \(\langle e, \xi, \phi, \rho \rangle \Downarrow \langle v, \xi', \phi, \rho' \rangle\).
The Chess analogy
Consider how the game of Chess is "defined" in the pamphlet rulebook that accompanies a typical chess board. There are usually stylized diagrams that demonstrate how each piece moves. These diagrams do not usually draw the entire chess board; rather, they use a small (3x3 or 4x4) square with rough-cut edges to indicate that this represents any position on the board. Similarly, these diagrams do not usually draw each and every move of a chess piece; rather, they use arrows to indicate the directions that a piece may move. Thus, each of these diagrams serves as a "template" that can be instantiated at any position on the board and for any length of move.
For example, a single diagram depicting how a rook moves can represent the 896 possible moves of a rook (64 positions on the board, 7 horizontal and 7 vertical positions to move). The small number of diagrams/rules capture a much larger number of legal moves in a game. This concise specification that gives rise to a complex game is certainly a part of the lasting appeal of chess.
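The count of 896 can be checked mechanically. The sketch below assumes an otherwise empty 8x8 board (no blocking pieces), which is the reading under which every square offers 7 horizontal and 7 vertical destinations; the function name `rook_moves` is only illustrative.

```python
# Count rook moves on an otherwise empty 8x8 board (assumption: no blocking pieces).
SIZE = 8

def rook_moves(row, col):
    """All squares a rook on (row, col) can reach: same row or same column."""
    horizontal = [(row, c) for c in range(SIZE) if c != col]
    vertical = [(r, col) for r in range(SIZE) if r != row]
    return horizontal + vertical

# One "template" (the function above) instantiated at every square of the board.
total = sum(len(rook_moves(r, c)) for r in range(SIZE) for c in range(SIZE))
print(total)  # 64 squares * (7 + 7) destinations = 896
```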
The stylized diagrams also help to clarify that while there are certain physical characteristics of a given chess board (e.g., size of the board, shapes of the pieces), such details are not important to the "game" of chess.
Environments
An environment maps names to meanings.
Impcore uses three environments:
- \(\xi\): the global variable environment, mapping variables to values (machine integers)
- \(\phi\): the function environment, mapping function names to primitives or function definitions
- \(\rho\): the formal variable environment, mapping variables to values (machine integers)
The formal variable environment is written \(\rho = \{ x_1 \mapsto n_1, \ldots, x_k \mapsto n_k \}\), mapping variable \(x_i\) to value \(n_i\).
Operations on environments:
- \(x \in \mathrm{dom}~\rho\): variable \(x\) is defined in environment \(\rho\)
- \(\rho(x)\): the meaning of variable \(x\) in environment \(\rho\)
- \(\rho\{x \mapsto n\}\): extends/modifies environment \(\rho\) to map \(x\) to \(n\)
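One way to picture these three operations is with ordinary dictionaries. This is a minimal sketch, not how any particular Impcore interpreter represents environments; the helper names `is_bound`, `lookup`, and `extend` are hypothetical.

```python
# Environments as dictionaries (an illustrative sketch; helper names are hypothetical).

def is_bound(x, rho):
    """x in dom(rho)"""
    return x in rho

def lookup(x, rho):
    """rho(x); only meaningful when x is in dom(rho)"""
    return rho[x]

def extend(rho, x, n):
    """rho{x -> n}: builds a new environment and leaves rho itself unchanged"""
    return {**rho, x: n}

rho = {"x": 3, "y": 4}
rho2 = extend(rho, "x", 10)   # modifies an existing binding
rho3 = extend(rho, "z", 0)    # adds a new binding
print(is_bound("z", rho), lookup("x", rho2), sorted(rho3))
```

Returning a fresh dictionary from `extend` matches the mathematical reading, in which \(\rho\{x \mapsto n\}\) names a new environment rather than updating \(\rho\) in place.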
Judgements and Inference Rules
A judgement is just notation for a proposition, a true/false statement.
Impcore uses two judgements:
- \(\langle e, \xi, \phi, \rho \rangle \Downarrow \langle v, \xi', \phi, \rho' \rangle\) meaning "Evaluating the expression \(e\) in the global-variable environment \(\xi\), the function-definition environment \(\phi\), and the formal-parameter environment \(\rho\) produces the value \(v\) and the new global-variable environment \(\xi'\), the unchanged function-definition environment \(\phi\), and the new formal-parameter environment \(\rho'\)."
- \(\langle d, \xi, \phi \rangle \rightarrow \langle \xi', \phi' \rangle\) meaning "Evaluating the definition \(d\) in the global-variable environment \(\xi\) and the function-definition environment \(\phi\) produces the new global-variable environment \(\xi'\) and the new function-definition environment \(\phi'\)."
A set of inference rules forms a proof system that defines when a judgement is true (or, more accurately, provable).
Note: There is no "right" or "wrong" set of inference rules to define a judgement. That is to say, there is no inherently correct way to define a judgement. The meaning of a programming language is up to the language designer. On the other hand, this doesn’t mean that some definitions/designs aren’t (subjectively) better than others. For example, we would normally consider a semantics/language that made the + symbol perform multiplication to be "worse" than one that made the + symbol perform addition. Similarly, we might prefer a semantics/language that interpreted the condition expression of if and while the same way.
Anatomy of an inference rule
Consider an inference rule for the "\(\langle e, \xi, \phi, \rho \rangle \Downarrow \langle v, \xi', \phi, \rho' \rangle\)" judgement:
- The item below the line is the conclusion. It must be an instance of the judgement being defined.
- The items above the line are the hypotheses. Each must be an instance of the judgement being defined, an instance of a judgement previously defined, or a (normal, "math") logical formula.
- Judgements in the conclusion or hypotheses may be constrained to elements of a specific syntactic form.
- The parenthesized item to the side is the name of the inference rule. It simply allows us to reference this particular inference rule by a short memorable name.
Operational Semantics for Impcore: Expressions
An expression is evaluated in an environment (really, three environments) to produce a value and a modified environment.
Abstract Syntax for Impcore: Expressions
Exp = LITERAL (Int)
| VAR (Name)
| SET (Name, Exp)
| IFX (Exp, Exp, Exp)
| WHILEX (Exp, Exp)
| BEGIN (Explist)
| APPLY (Name, Explist)
Operational Semantics: Judgement form for Expressions
The "judgement" for evaluation of expressions is written \(\langle e, \xi, \phi, \rho \rangle \Downarrow \langle v, \xi', \phi, \rho' \rangle\) and reads
Evaluating the expression \(e\) in the global-variable environment \(\xi\), the function-definition environment \(\phi\), and the formal-parameter environment \(\rho\) produces the value \(v\) and the new global-variable environment \(\xi'\), the unchanged function-definition environment \(\phi\), and the new formal-parameter environment \(\rho'\).
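Read operationally, the judgement describes a function from \((e, \xi, \phi, \rho)\) to \((v, \xi', \rho')\). The sketch below is one possible rendering, under several assumptions that are not part of these notes: expressions are tagged tuples, environments are dictionaries, primitives are Python callables, user functions are `(formals, body)` pairs, and machine-integer overflow is ignored. The name `eval_exp` is hypothetical.

```python
# A big-step evaluator sketch for Impcore expressions (assumptions noted above).
def eval_exp(e, xi, phi, rho):
    """Return (v, xi2, rho2) such that <e, xi, phi, rho> evaluates to <v, xi2, phi, rho2>."""
    tag = e[0]
    if tag == "LITERAL":
        return e[1], xi, rho
    if tag == "VAR":
        x = e[1]
        if x in rho:                       # formals take precedence over globals
            return rho[x], xi, rho
        if x in xi:
            return xi[x], xi, rho
        raise NameError(x)                 # no rule applies: evaluation is stuck
    if tag == "SET":
        x = e[1]
        v, xi, rho = eval_exp(e[2], xi, phi, rho)
        if x in rho:
            return v, xi, {**rho, x: v}
        if x in xi:
            return v, {**xi, x: v}, rho
        raise NameError(x)
    if tag == "IFX":
        v1, xi, rho = eval_exp(e[1], xi, phi, rho)
        branch = e[2] if v1 != 0 else e[3]
        return eval_exp(branch, xi, phi, rho)
    if tag == "WHILEX":
        while True:
            v1, xi, rho = eval_exp(e[1], xi, phi, rho)
            if v1 == 0:
                return 0, xi, rho          # a finished loop yields 0
            _, xi, rho = eval_exp(e[2], xi, phi, rho)
    if tag == "BEGIN":
        v = 0                              # an empty begin yields 0
        for sub in e[1]:
            v, xi, rho = eval_exp(sub, xi, phi, rho)
        return v, xi, rho
    if tag == "APPLY":
        args = []
        for sub in e[2]:                   # arguments are evaluated left to right
            v, xi, rho = eval_exp(sub, xi, phi, rho)
            args.append(v)
        f = phi[e[1]]
        if callable(f):                    # primitive function
            return f(*args), xi, rho
        formals, body = f                  # user-defined function
        v, xi, _ = eval_exp(body, xi, phi, dict(zip(formals, args)))
        return v, xi, rho                  # the caller keeps its own formals
    raise ValueError(tag)

# Usage: count a global n up to 3, then return it.
phi = {"+": lambda a, b: a + b, "<": lambda a, b: int(a < b)}
xi = {"n": 0}
prog = ("BEGIN", [
    ("WHILEX", ("APPLY", "<", [("VAR", "n"), ("LITERAL", 3)]),
               ("SET", "n", ("APPLY", "+", [("VAR", "n"), ("LITERAL", 1)]))),
    ("VAR", "n"),
])
v, xi2, rho2 = eval_exp(prog, xi, phi, {})
print(v, xi2, rho2)
```

Note that \(\phi\) is passed in but never returned, mirroring the fact that the judgement leaves the function environment unchanged.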
Operational Semantics: Literals
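A sketch of the literal rule in the notation above, assuming the usual Impcore formulation (the rule name \(\mathit{Literal}\) is an assumption): a literal evaluates to itself and leaves every environment unchanged, so the rule is an axiom with no hypotheses.

```latex
\[
\frac{\ }
     {\langle \mathrm{LITERAL}(v), \xi, \phi, \rho \rangle \Downarrow \langle v, \xi, \phi, \rho \rangle}
\quad (\mathit{Literal})
\]
```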
Operational Semantics: Variables
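A sketch of the variable rules, assuming the usual Impcore formulation in which a formal parameter takes precedence over a global variable of the same name (the rule names are assumptions). When \(x\) is in neither \(\mathrm{dom}~\rho\) nor \(\mathrm{dom}~\xi\), no rule applies and no derivation exists.

```latex
\[
\frac{x \in \mathrm{dom}~\rho}
     {\langle \mathrm{VAR}(x), \xi, \phi, \rho \rangle \Downarrow \langle \rho(x), \xi, \phi, \rho \rangle}
\quad (\mathit{FormalVar})
\]
\[
\frac{x \notin \mathrm{dom}~\rho \qquad x \in \mathrm{dom}~\xi}
     {\langle \mathrm{VAR}(x), \xi, \phi, \rho \rangle \Downarrow \langle \xi(x), \xi, \phi, \rho \rangle}
\quad (\mathit{GlobalVar})
\]
```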
Operational Semantics: Assignment
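A sketch of the assignment rules, assuming the usual Impcore formulation (the rule names are assumptions): the right-hand side is evaluated first, and the resulting value is both the result of the expression and the new binding for \(x\) in whichever environment defines it.

```latex
\[
\frac{x \in \mathrm{dom}~\rho \qquad
      \langle e, \xi, \phi, \rho \rangle \Downarrow \langle v, \xi', \phi, \rho' \rangle}
     {\langle \mathrm{SET}(x, e), \xi, \phi, \rho \rangle \Downarrow \langle v, \xi', \phi, \rho'\{x \mapsto v\} \rangle}
\quad (\mathit{FormalAssign})
\]
\[
\frac{x \notin \mathrm{dom}~\rho \qquad x \in \mathrm{dom}~\xi \qquad
      \langle e, \xi, \phi, \rho \rangle \Downarrow \langle v, \xi', \phi, \rho' \rangle}
     {\langle \mathrm{SET}(x, e), \xi, \phi, \rho \rangle \Downarrow \langle v, \xi'\{x \mapsto v\}, \phi, \rho' \rangle}
\quad (\mathit{GlobalAssign})
\]
```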
Operational Semantics: If
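A sketch of the conditional rules, assuming the usual Impcore convention that any non-zero value counts as true (the rule names are assumptions). Only the selected branch appears as a hypothesis, so the other branch is never evaluated.

```latex
\[
\frac{\langle e_1, \xi, \phi, \rho \rangle \Downarrow \langle v_1, \xi', \phi, \rho' \rangle \qquad
      v_1 \neq 0 \qquad
      \langle e_2, \xi', \phi, \rho' \rangle \Downarrow \langle v_2, \xi'', \phi, \rho'' \rangle}
     {\langle \mathrm{IFX}(e_1, e_2, e_3), \xi, \phi, \rho \rangle \Downarrow \langle v_2, \xi'', \phi, \rho'' \rangle}
\quad (\mathit{IfTrue})
\]
\[
\frac{\langle e_1, \xi, \phi, \rho \rangle \Downarrow \langle v_1, \xi', \phi, \rho' \rangle \qquad
      v_1 = 0 \qquad
      \langle e_3, \xi', \phi, \rho' \rangle \Downarrow \langle v_3, \xi'', \phi, \rho'' \rangle}
     {\langle \mathrm{IFX}(e_1, e_2, e_3), \xi, \phi, \rho \rangle \Downarrow \langle v_3, \xi'', \phi, \rho'' \rangle}
\quad (\mathit{IfFalse})
\]
```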
Operational Semantics: While
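A sketch of the loop rules, assuming the usual Impcore formulation in which a finished loop produces 0 (the rule names are assumptions). The iterating rule mentions the same \(\mathrm{WHILEX}\) expression in a hypothesis, evaluated in the environments left behind by the body.

```latex
\[
\frac{\begin{array}{c}
      \langle e_1, \xi, \phi, \rho \rangle \Downarrow \langle v_1, \xi', \phi, \rho' \rangle \qquad v_1 \neq 0 \qquad
      \langle e_2, \xi', \phi, \rho' \rangle \Downarrow \langle v_2, \xi'', \phi, \rho'' \rangle \\
      \langle \mathrm{WHILEX}(e_1, e_2), \xi'', \phi, \rho'' \rangle \Downarrow \langle v_3, \xi''', \phi, \rho''' \rangle
      \end{array}}
     {\langle \mathrm{WHILEX}(e_1, e_2), \xi, \phi, \rho \rangle \Downarrow \langle v_3, \xi''', \phi, \rho''' \rangle}
\quad (\mathit{WhileIterate})
\]
\[
\frac{\langle e_1, \xi, \phi, \rho \rangle \Downarrow \langle v_1, \xi', \phi, \rho' \rangle \qquad v_1 = 0}
     {\langle \mathrm{WHILEX}(e_1, e_2), \xi, \phi, \rho \rangle \Downarrow \langle 0, \xi', \phi, \rho' \rangle}
\quad (\mathit{WhileEnd})
\]
```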
Operational Semantics: Begin
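A sketch of the sequencing rules, assuming the usual Impcore formulation in which an empty sequence produces 0 (the rule names are assumptions). The \(\cdots\) threads the environments from each subexpression to the next, and the value of the last subexpression is the result.

```latex
\[
\frac{\ }
     {\langle \mathrm{BEGIN}(), \xi, \phi, \rho \rangle \Downarrow \langle 0, \xi, \phi, \rho \rangle}
\quad (\mathit{EmptyBegin})
\]
\[
\frac{\langle e_1, \xi_0, \phi, \rho_0 \rangle \Downarrow \langle v_1, \xi_1, \phi, \rho_1 \rangle
      \quad \cdots \quad
      \langle e_n, \xi_{n-1}, \phi, \rho_{n-1} \rangle \Downarrow \langle v_n, \xi_n, \phi, \rho_n \rangle}
     {\langle \mathrm{BEGIN}(e_1, \ldots, e_n), \xi_0, \phi, \rho_0 \rangle \Downarrow \langle v_n, \xi_n, \phi, \rho_n \rangle}
\quad (\mathit{Begin})
\]
```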
Operational Semantics: Application
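A sketch of the rule for applying a user-defined function, assuming the usual Impcore formulation (the rule name and the \(\mathrm{USER}\) constructor are assumptions). Arguments are evaluated left to right, the body runs in a fresh formal environment containing only the parameters, and the caller's formal environment \(\rho_n\) is what survives the call. Primitive functions would each need a rule of their own.

```latex
\[
\frac{\begin{array}{c}
      \phi(f) = \mathrm{USER}(\langle x_1, \ldots, x_n \rangle, e) \qquad x_1, \ldots, x_n \text{ all distinct} \\
      \langle e_1, \xi_0, \phi, \rho_0 \rangle \Downarrow \langle v_1, \xi_1, \phi, \rho_1 \rangle \\
      \vdots \\
      \langle e_n, \xi_{n-1}, \phi, \rho_{n-1} \rangle \Downarrow \langle v_n, \xi_n, \phi, \rho_n \rangle \\
      \langle e, \xi_n, \phi, \{ x_1 \mapsto v_1, \ldots, x_n \mapsto v_n \} \rangle \Downarrow \langle v, \xi', \phi, \rho' \rangle
      \end{array}}
     {\langle \mathrm{APPLY}(f, e_1, \ldots, e_n), \xi_0, \phi, \rho_0 \rangle \Downarrow \langle v, \xi', \phi, \rho_n \rangle}
\quad (\mathit{ApplyUser})
\]
```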
Operational Semantics for Impcore: Definitions
Abstract Syntax for Impcore: Definitions
Def = VAL (Name, Exp)
| DEFINE (Name, Namelist, Exp)
| EXP (Exp)
Operational Semantics: Judgement form for Definitions
The "judgement" for evaluation of definitions is written \(\langle d, \xi, \phi \rangle \rightarrow \langle \xi', \phi' \rangle\) and reads
Evaluating the definition \(d\) in the global-variable environment \(\xi\) and the function-definition environment \(\phi\) produces the new global-variable environment \(\xi'\) and the new function-definition environment \(\phi'\).
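The definition judgement can likewise be read as a function from \((d, \xi, \phi)\) to \((\xi', \phi')\). The sketch below rests on assumptions not stated in these notes: definitions are tagged tuples, a top-level expression stores its result in a global named `it`, expressions are evaluated with an empty formal environment, and the expression evaluator is supplied as a parameter. The names `eval_def` and `literal_only` are hypothetical.

```python
# A sketch of definition evaluation (assumptions noted above; names are hypothetical).
def eval_def(d, xi, phi, eval_exp):
    """Return (xi2, phi2) such that <d, xi, phi> -> <xi2, phi2>.

    eval_exp(e, xi, phi, rho) is assumed to return (v, xi2, rho2).
    """
    tag = d[0]
    if tag == "VAL":                       # (val x e): bind a global variable
        _, x, e = d
        v, xi2, _ = eval_exp(e, xi, phi, {})
        return {**xi2, x: v}, phi
    if tag == "DEFINE":                    # (define f (x1 ... xn) e): bind a function
        _, f, formals, body = d
        if len(set(formals)) != len(formals):
            raise ValueError("formal parameters must be distinct")
        return xi, {**phi, f: (tuple(formals), body)}
    if tag == "EXP":                       # top-level expression: result stored in `it`
        v, xi2, _ = eval_exp(d[1], xi, phi, {})
        return {**xi2, "it": v}, phi
    raise ValueError(tag)

# Stand-in expression evaluator that handles only integer literals, for illustration.
def literal_only(e, xi, phi, rho):
    assert e[0] == "LITERAL"
    return e[1], xi, rho

xi, phi = {}, {}
xi, phi = eval_def(("VAL", "x", ("LITERAL", 5)), xi, phi, literal_only)
xi, phi = eval_def(("DEFINE", "id", ["a"], ("VAR", "a")), xi, phi, literal_only)
xi, phi = eval_def(("EXP", ("LITERAL", 7)), xi, phi, literal_only)
print(xi, sorted(phi))
```

Each case adds at most one binding to \(\xi\) or \(\phi\) and never removes one.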
Operational Semantics: Variable Definition
Operational Semantics: Function Definition
Operational Semantics: Top-level Expression
Exercises
Do-Until
Suppose that a do-until expression is added to Impcore, with the following concrete syntax:
exp ::= ... | (do-until exp exp)
and the following abstract syntax:
Exp = ... | DoUntil (Exp, Exp)
Informally, (do-until e1 e2) evaluates e1 and then evaluates e2; if the evaluation of e2 is non-zero, then return the result of the evaluation of e1; if the evaluation of e2 is zero, then loop.
Give two inference rules to concisely and precisely specify the behavior of the do-until expression.
Could you define the do-until expression using syntactic sugar?
Solution
Note that in the \(\mathit{DoUntilEnd}\) rule, the final value comes from the evaluation of \(e_1\) but the final environments come from the evaluation of \(e_2\).
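One pair of rules consistent with the informal description and with the note on \(\mathit{DoUntilEnd}\) can be sketched as follows. The name \(\mathit{DoUntilLoop}\) and the exact layout are assumptions; other equivalent formulations are possible.

```latex
\[
\frac{\langle e_1, \xi, \phi, \rho \rangle \Downarrow \langle v_1, \xi', \phi, \rho' \rangle \qquad
      \langle e_2, \xi', \phi, \rho' \rangle \Downarrow \langle v_2, \xi'', \phi, \rho'' \rangle \qquad
      v_2 \neq 0}
     {\langle \mathrm{DoUntil}(e_1, e_2), \xi, \phi, \rho \rangle \Downarrow \langle v_1, \xi'', \phi, \rho'' \rangle}
\quad (\mathit{DoUntilEnd})
\]
\[
\frac{\begin{array}{c}
      \langle e_1, \xi, \phi, \rho \rangle \Downarrow \langle v_1, \xi', \phi, \rho' \rangle \qquad
      \langle e_2, \xi', \phi, \rho' \rangle \Downarrow \langle v_2, \xi'', \phi, \rho'' \rangle \qquad
      v_2 = 0 \\
      \langle \mathrm{DoUntil}(e_1, e_2), \xi'', \phi, \rho'' \rangle \Downarrow \langle v_3, \xi''', \phi, \rho''' \rangle
      \end{array}}
     {\langle \mathrm{DoUntil}(e_1, e_2), \xi, \phi, \rho \rangle \Downarrow \langle v_3, \xi''', \phi, \rho''' \rangle}
\quad (\mathit{DoUntilLoop})
\]
```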
The do-until expression can almost be defined as syntactic sugar:
(do-until e1 e2) = (begin e1 (while (if e2 0 1) e1))
but this desugaring always returns 0 (not the value of the final evaluation of e1). Note that replacing (if e2 0 1) with (not e2) would require that the not function is never redefined.
Repeat
Suppose that a repeat expression is added to Impcore, with the following concrete syntax:
exp ::= ... | (repeat exp exp)
and the following abstract syntax:
Exp = ... | Repeat (Exp, Exp)
Informally, (repeat e1 e2) evaluates e1 to an integer \(n\) and then evaluates e2 \(n\) times and returns 0. If \(n\) is negative, then e2 is not evaluated.
Give one or more inference rules to concisely and precisely specify the behavior of the repeat expression.
Could you define the repeat expression using syntactic sugar?
Solution
One set of inference rules expresses the looping with \(\cdots\), similar to that used in the \(\mathit{Begin}\) rule.
Another set of inference rules loops with \(\mathrm{LITERAL}(n-1)\) replacing \(e_1\).
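The second formulation can be sketched as follows; the rule names and the choice of side conditions \(n \le 0\) and \(n > 0\) are assumptions. Because \(\mathrm{LITERAL}(n-1)\) evaluates to \(n-1\) without changing any environment, the recursive hypothesis counts down without re-evaluating \(e_1\).

```latex
\[
\frac{\langle e_1, \xi, \phi, \rho \rangle \Downarrow \langle n, \xi', \phi, \rho' \rangle \qquad n \le 0}
     {\langle \mathrm{Repeat}(e_1, e_2), \xi, \phi, \rho \rangle \Downarrow \langle 0, \xi', \phi, \rho' \rangle}
\quad (\mathit{RepeatEnd})
\]
\[
\frac{\begin{array}{c}
      \langle e_1, \xi, \phi, \rho \rangle \Downarrow \langle n, \xi', \phi, \rho' \rangle \qquad n > 0 \qquad
      \langle e_2, \xi', \phi, \rho' \rangle \Downarrow \langle v_2, \xi'', \phi, \rho'' \rangle \\
      \langle \mathrm{Repeat}(\mathrm{LITERAL}(n-1), e_2), \xi'', \phi, \rho'' \rangle \Downarrow \langle 0, \xi''', \phi, \rho''' \rangle
      \end{array}}
     {\langle \mathrm{Repeat}(e_1, e_2), \xi, \phi, \rho \rangle \Downarrow \langle 0, \xi''', \phi, \rho''' \rangle}
\quad (\mathit{RepeatLoop})
\]
```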
The repeat expression can almost be defined as syntactic sugar:
(repeat e1 e2) = (begin (set cnt e1) (while (> cnt 0) (begin e2 (set cnt (- cnt 1)))))
but this desugaring only works if the variable cnt is not used in e2 and if the functions > and - are never redefined.
Theory and Metatheory
Theory: Proofs about evaluations; for example, prove a fact about a specific program by instantiating inference rules.
Metatheory: Proofs about derivations; for example, prove a fact about all programs by considering the applicable inference rules.
Theory
Reasoning about a particular evaluation is theory.
A derivation (or derivation tree) is a syntactic object (data structure) that composes inference rules so as to prove a judgement. At the "bottom" (root) of the derivation is the judgement being proven, at the "tops" (leaves) of the derivation are axioms (inference rules with no hypotheses), and in the "middle" (branches) of the derivation are inference rules (with hypotheses).
To prove that (+ 3 4) evaluates to 7 (which is a fact about a specific program), we give a derivation:
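A derivation of this shape can be sketched as below. It assumes a rule for the primitive \(+\) (called \(\mathit{ApplyAdd}\) here) whose hypotheses look up the function, evaluate both arguments, and check that the sum fits in a 32-bit machine integer; the rule name, the \(\mathrm{PRIMITIVE}\) constructor, and the range check are assumptions. The two leaves are instances of the axiom for literals.

```latex
\[
\frac{\begin{array}{c}
      \phi(+) = \mathrm{PRIMITIVE}(+) \qquad -2^{31} \le 3 + 4 < 2^{31} \\[1ex]
      \dfrac{\ }{\langle \mathrm{LITERAL}(3), \xi, \phi, \rho \rangle \Downarrow \langle 3, \xi, \phi, \rho \rangle}
      \qquad
      \dfrac{\ }{\langle \mathrm{LITERAL}(4), \xi, \phi, \rho \rangle \Downarrow \langle 4, \xi, \phi, \rho \rangle}
      \end{array}}
     {\langle \mathrm{APPLY}(+, \mathrm{LITERAL}(3), \mathrm{LITERAL}(4)), \xi, \phi, \rho \rangle \Downarrow \langle 7, \xi, \phi, \rho \rangle}
\quad (\mathit{ApplyAdd})
\]
```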
Metatheory
Reasoning about entire classes of evaluations (or even all evaluations) is metatheory.
Examples:
- If \(\langle e, \xi, \phi, \rho \rangle \Downarrow \langle v_1, \xi_1, \phi, \rho_1 \rangle\) and \(\langle e, \xi, \phi, \rho \rangle \Downarrow \langle v_2, \xi_2, \phi, \rho_2 \rangle\), then \(v_1 = v_2\) and \(\xi_1 = \xi_2\) and \(\rho_1 = \rho_2\).
  This property establishes that Impcore evaluation is deterministic. We can be confident that if we run an Impcore program multiple times, then we will always get the same answer. Furthermore, if we run an Impcore program under multiple interpreters, then we will always get the same answer. (And, if we don’t, then we know that an interpreter is incorrect.)
- If \(\langle e, \xi, \phi, \rho \rangle \Downarrow \langle v', \xi', \phi, \rho' \rangle\), then \(\mathrm{dom}~\xi = \mathrm{dom}~\xi'\) and \(\mathrm{dom}~\rho = \mathrm{dom}~\rho'\).
  This property convinces us that Impcore evaluation doesn’t change the domain of the global and formal environments. Therefore, we could use an array (of length equal to the number of formals of a function) to efficiently implement the formal environment.
- If \(\langle d, \xi, \phi \rangle \rightarrow \langle \xi', \phi' \rangle\), then \(\mathrm{dom}~\xi \subseteq \mathrm{dom}~\xi'\) and \(\mathrm{dom}~\phi \subseteq \mathrm{dom}~\phi'\).
  We could be even more precise and prove that \(\xi'\) and \(\phi'\) are at most one mapping bigger than \(\xi\) and \(\phi\).
Each of these is a (meta)theorem about derivations. (Note that in (meta)theorem statements, when we mention a judgement, we mean that there is a derivation of that judgement.) Since a derivation is a (recursive) data structure, a metatheoretic proof proceeds by structural induction on the derivation.
Induction Principle
To prove a property \(\mathcal{P}\) for all derivations \(\mathcal{D}\) of a judgement, perform case analysis, with one case per rule that can end the derivation \(\mathcal{D}\):
- foreach \((\mathit{Rule})\),
  - Consider all derivations \(\mathcal{D}\) ending in \((\mathit{Rule})\)
  - Each hypothesis of \((\mathit{Rule})\) corresponds to a subderivation of \(\mathcal{D}\)
  - Assume \(\mathcal{P}(\mathcal{D}_i)\) for each subderivation; these are the induction hypotheses
  - Prove \(\mathcal{P}(\mathcal{D})\)
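Because a derivation is a recursive data structure, structural induction over derivations has the same shape as a recursive function over a tree. The sketch below is only an analogy: judgements are kept as plain strings, the class name `Derivation` and the rule names `Literal` and `ApplyAdd` are assumptions, and the recursive calls play the role of the induction hypotheses.

```python
from dataclasses import dataclass, field

@dataclass
class Derivation:
    rule: str                                      # name of the rule that ends the derivation
    conclusion: str                                # the judgement being proven (kept as text here)
    premises: list = field(default_factory=list)   # subderivations, one per judgement hypothesis

def size(d):
    """Number of rule instances in d, computed by structural recursion."""
    return 1 + sum(size(p) for p in d.premises)

def height(d):
    """Length of the longest path from the root to a leaf (an axiom)."""
    return 1 + max((height(p) for p in d.premises), default=0)

# The derivation that (+ 3 4) evaluates to 7: two axioms under one rule instance.
three = Derivation("Literal", "<LITERAL(3), xi, phi, rho> evaluates to <3, xi, phi, rho>")
four = Derivation("Literal", "<LITERAL(4), xi, phi, rho> evaluates to <4, xi, phi, rho>")
seven = Derivation("ApplyAdd", "<APPLY(+, 3, 4), xi, phi, rho> evaluates to <7, xi, phi, rho>",
                   [three, four])
print(size(seven), height(seven))
```

A metatheoretic proof differs in that it establishes the property for every derivation at once, one rule at a time, rather than computing it for one particular tree.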
Summary
Theory involves proofs about individual derivations (the evaluation of single programs).
Metatheory involves proofs about collections of derivations (the evaluation of entire classes of programs or even all programs).
Theoretic proofs are derivations.
Metatheoretic proofs proceed by induction over derivations.