Types: Type Systems

Require Export Smallstep.

Hint Constructors multi.

Our next major topic is type systems — static program analyses that classify expressions according to the "shapes" of their results. We'll begin with a typed version of a very simple language with just booleans and numbers, to introduce the basic ideas of types, typing rules, and the fundamental theorems about type systems: type preservation and progress. Then we'll move on to the simply typed lambda-calculus, which lives at the core of every modern functional programming language (including Coq).

Typed Arithmetic Expressions

To motivate the discussion of type systems, let's begin as usual with an extremely simple toy language. We want it to have the potential for programs "going wrong" because of runtime type errors, so we need something a tiny bit more complex than the language of constants and addition that we used in chapter Smallstep: a single kind of data (just numbers) is too simple, but just two kinds (numbers and booleans) already gives us enough material to tell an interesting story.

The language definition is completely routine. The only thing to notice is that we are not using the asnum/aslist trick that we used in chapter HoareList to make all the operations total by forcibly coercing the arguments to + (for example) into numbers. Instead, we simply let terms get stuck if they try to use an operator with the wrong kind of operands: the step relation doesn't relate them to anything.

Syntax

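Informally:

    t ::= true
        | false
        | if t then t else t
        | 0
        | succ t
        | pred t
        | iszero t

Formally:

Inductive tm : Type :=
  | ttrue : tm
  | tfalse : tm
  | tif : tm → tm → tm → tm
  | tzero : tm
  | tsucc : tm → tm
  | tpred : tm → tm
  | tiszero : tm → tm.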

Values are true, false, and numeric values...

Inductive bvalue : tm → Prop :=
  | bv_true : bvalue ttrue
  | bv_false : bvalue tfalse.

Inductive nvalue : tm → Prop :=
  | nv_zero : nvalue tzero
  | nv_succ : ∀t, nvalue t → nvalue (tsucc t).

Definition value (t:tm) := bvalue t ∨ nvalue t.

Hint Constructors bvalue nvalue.
Hint Unfold value.
Hint Unfold extend.
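As a quick sanity check of these definitions, here is a small sketch showing that the numeral 2 counts as a value. The example name is ours, and we assume the constructors tzero and tsucc of tm used throughout this chapter.

```coq
(* Hypothetical example name; relies on tm, nvalue and value as
   defined above. *)
Example two_is_a_value :
  value (tsucc (tsucc tzero)).
Proof.
  unfold value.          (* value t is bvalue t \/ nvalue t *)
  right.                 (* pick the numeric-value disjunct *)
  apply nv_succ. apply nv_succ. apply nv_zero.
Qed.
```

Note that a term such as tsucc ttrue is not a value under this definition: it is not a boolean value, and nv_succ requires its argument to be a numeric value.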

Operational Semantics

Informally:

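    ------------------------------  (ST_IfTrue)
    if true then t₁ else t₂ ⇒ t₁

    -------------------------------  (ST_IfFalse)
    if false then t₁ else t₂ ⇒ t₂

                     t₁ ⇒ t₁'
    -----------------------------------------------  (ST_If)
    if t₁ then t₂ else t₃ ⇒ if t₁' then t₂ else t₃

         t₁ ⇒ t₁'
    --------------------  (ST_Succ)
    succ t₁ ⇒ succ t₁'

    ------------  (ST_PredZero)
    pred 0 ⇒ 0

     numeric value v₁
    ---------------------  (ST_PredSucc)
    pred (succ v₁) ⇒ v₁

         t₁ ⇒ t₁'
    --------------------  (ST_Pred)
    pred t₁ ⇒ pred t₁'

    -----------------  (ST_IszeroZero)
    iszero 0 ⇒ true

        numeric value v₁
    --------------------------  (ST_IszeroSucc)
    iszero (succ v₁) ⇒ false

           t₁ ⇒ t₁'
    ------------------------  (ST_Iszero)
    iszero t₁ ⇒ iszero t₁'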

Formally:

Reserved Notation "t₁ '⇒' t₂" (at level 40).

Inductive step : tm → tm → Prop :=
  | ST_IfTrue : ∀t₁ t₂,
      (tif ttrue t₁ t₂) ⇒ t₁
  | ST_IfFalse : ∀t₁ t₂,
      (tif tfalse t₁ t₂) ⇒ t₂
  | ST_If : ∀t₁ t₁' t₂ t₃,
      t₁ ⇒ t₁' →
      (tif t₁ t₂ t₃) ⇒ (tif t₁' t₂ t₃)
  | ST_Succ : ∀t₁ t₁',
      t₁ ⇒ t₁' →
      (tsucc t₁) ⇒ (tsucc t₁')
  | ST_PredZero :
      (tpred tzero) ⇒ tzero
  | ST_PredSucc : ∀t₁,
      nvalue t₁ →
      (tpred (tsucc t₁)) ⇒ t₁
  | ST_Pred : ∀t₁ t₁',
      t₁ ⇒ t₁' →
      (tpred t₁) ⇒ (tpred t₁')
  | ST_IszeroZero :
      (tiszero tzero) ⇒ ttrue
  | ST_IszeroSucc : ∀t₁,
       nvalue t₁ →
      (tiszero (tsucc t₁)) ⇒ tfalse
  | ST_Iszero : ∀t₁ t₁',
      t₁ ⇒ t₁' →
      (tiszero t₁) ⇒ (tiszero t₁')

where "t₁ '⇒' t₂" := (step t₁ t₂).

Notice that the step relation doesn't care about whether expressions make global sense — it just checks that the operation in the next reduction step is being applied to the right kinds of operands.

For example, the term succ true (i.e., tsucc ttrue in the formal syntax) cannot take a step, but the almost as obviously nonsensical term

succ (if true then true else true)

can take a step (once, before becoming stuck).
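To make the last claim concrete, here is a minimal sketch of the single step that this term can take. The example name is ours, and we write the relation in prefix form (step t t') so that the snippet does not depend on how the infix notation is typed in the source file.

```coq
(* Hypothetical example name; assumes tm and step as defined above. *)
Example succ_if_takes_one_step :
  step (tsucc (tif ttrue ttrue ttrue)) (tsucc ttrue).
Proof.
  apply ST_Succ.      (* reduce underneath succ *)
  apply ST_IfTrue.    (* if true then true else true steps to true *)
Qed.
```

The result, tsucc ttrue, is exactly the term that the preceding paragraph describes as unable to take a step.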

Normal Forms and Values

The first interesting thing about the step relation in this language is that the strong progress theorem from the Smallstep chapter fails! That is, there are terms that are normal forms (they can't take a step) but not values (because we have not included them in our definition of possible "results of evaluation"). Such terms are stuck.

Notation step_normal_form := (normal_form step).

Definition stuck (t:tm) : Prop :=
  step_normal_form t ∧ ¬ value t.

Hint Unfold stuck.

Exercise: 2 stars (some_term_is_stuck)

Example some_term_is_stuck :
  ∃t, stuck t.
Proof.
  (* FILL IN HERE *) Admitted.

☐

However, although values and normal forms are not the same in this language, the former set is included in the latter. This is important because it shows we did not accidentally define things so that some value could still take a step.

Exercise: 3 stars, advanced (value_is_nf)

Hint: You will reach a point in this proof where you need to use an induction to reason about a term that is known to be a numeric value. This induction can be performed either over the term itself or over the evidence that it is a numeric value. The proof goes through in either case, but you will find that one way is quite a bit shorter than the other. For the sake of the exercise, try to complete the proof both ways.

Lemma value_is_nf : ∀t,
  value t → step_normal_form t.
Proof.
  (* FILL IN HERE *) Admitted.

☐

Exercise: 3 stars, optional (step_deterministic)

Using value_is_nf, we can show that the step relation is also deterministic...

Theorem step_deterministic:
  deterministic step.
Proof with eauto.
  (* FILL IN HERE *) Admitted.

☐

Typing

The next critical observation about this language is that, although there are stuck terms, they are all "nonsensical", mixing booleans and numbers in a way that we don't even want to have a meaning. We can easily exclude such ill-typed terms by defining a typing relation that relates terms to the types (either numeric or boolean) of their final results.

Inductive ty : Type :=
  | TBool : ty
  | TNat : ty.

In informal notation, the typing relation is often written ⊢ t ∈ T, pronounced "t has type T." The ⊢ symbol is called a "turnstile". (Below, we're going to see richer typing relations where an additional "context" argument is written to the left of the turnstile. Here, the context is always empty.)

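    ----------------  (T_True)
    ⊢ true ∈ Bool

    -----------------  (T_False)
    ⊢ false ∈ Bool

    ⊢ t₁ ∈ Bool    ⊢ t₂ ∈ T    ⊢ t₃ ∈ T
    --------------------------------------  (T_If)
         ⊢ if t₁ then t₂ else t₃ ∈ T

    ------------  (T_Zero)
    ⊢ 0 ∈ Nat

       ⊢ t₁ ∈ Nat
    ------------------  (T_Succ)
    ⊢ succ t₁ ∈ Nat

       ⊢ t₁ ∈ Nat
    ------------------  (T_Pred)
    ⊢ pred t₁ ∈ Nat

        ⊢ t₁ ∈ Nat
    ----------------------  (T_Iszero)
    ⊢ iszero t₁ ∈ Bool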

Reserved Notation "'⊢' t '∈' T" (at level 40).

Inductive has_type : tm → ty → Prop :=
  | T_True :
       ⊢ ttrue ∈ TBool
  | T_False :
       ⊢ tfalse ∈ TBool
  | T_If : ∀t₁ t₂ t₃ T,
       ⊢ t₁ ∈ TBool →
       ⊢ t₂ ∈ T →
       ⊢ t₃ ∈ T →
       ⊢ tif t₁ t₂ t₃ ∈ T
  | T_Zero :
       ⊢ tzero ∈ TNat
  | T_Succ : ∀t₁,
       ⊢ t₁ ∈ TNat →
       ⊢ tsucc t₁ ∈ TNat
  | T_Pred : ∀t₁,
       ⊢ t₁ ∈ TNat →
       ⊢ tpred t₁ ∈ TNat
  | T_Iszero : ∀t₁,
       ⊢ t₁ ∈ TNat →
       ⊢ tiszero t₁ ∈ TBool

where "'⊢' t '∈' T" := (has_type t T).

Hint Constructors has_type.

Examples

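It's important to realize that the typing relation is a conservative (or static) approximation: it does not calculate the type of the normal form of a term. A term can be rejected by the typing rules even though evaluating it would produce a perfectly good value, as the example has_type_not below shows.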

Example has_type_1 :
  ⊢ tif tfalse tzero (tsucc tzero) ∈ TNat.

Proof.
  apply T_If.
    apply T_False.
    apply T_Zero.
    apply T_Succ.
      apply T_Zero.
Qed.

(Since we've included all the constructors of the typing relation in the hint database, the auto tactic can actually find this proof automatically.)
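For comparison, here is a sketch of the automated version. It assumes, as the remark above states, that the constructors of has_type have been registered with Hint Constructors; the example name is ours, and the statement is written in prefix form (has_type t T).

```coq
(* Hypothetical example name. Relies on the constructors of has_type
   being in the hint database, as described in the text. *)
Example has_type_1_auto :
  has_type (tif tfalse tzero (tsucc tzero)) TNat.
Proof. auto. Qed.
```

If the hints are not available in your development, replacing auto with repeat constructor should find the same derivation, since each subgoal is matched by exactly one typing rule.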

Example has_type_not :
  ¬ (⊢ tif tfalse tzero ttrue ∈ TBool).
Proof.
  intros Contra. solve by inversion 2. Qed.

Exercise: 1 star, optional (succ_hastype_nat__hastype_nat)

Example succ_hastype_nat__hastype_nat : ∀t,
  ⊢ tsucc t ∈ TNat →
  ⊢ t ∈ TNat.
Proof.
  (* FILL IN HERE *) Admitted.

☐

Canonical forms

The following two lemmas capture the basic property that defines the shape of well-typed values. They say that the definition of value and the typing relation agree.

Lemma bool_canonical : ∀t,
  ⊢ t ∈ TBool → value t → bvalue t.

Proof.
  intros t HT HV.
  inversion HV; auto.

  induction H; inversion HT; auto.
Qed.

Lemma nat_canonical : ∀t,
  ⊢ t ∈ TNat → value t → nvalue t.

Proof.
  intros t HT HV.
  inversion HV.
  inversion H; subst; inversion HT.

  auto.
Qed.
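The typical way to use these lemmas is to turn "well-typed value" into a concrete case split. The following is a minimal sketch of that pattern for booleans; the lemma name is ours, and it relies only on bool_canonical as stated above.

```coq
(* Hypothetical lemma name; illustrates how bool_canonical is used. *)
Lemma bool_value_true_or_false : forall t,
  has_type t TBool -> value t -> t = ttrue \/ t = tfalse.
Proof.
  intros t HT HV.
  (* Canonical forms: a value of type TBool is a boolean value. *)
  assert (HB : bvalue t) by (apply bool_canonical; assumption).
  (* A boolean value is either ttrue or tfalse. *)
  inversion HB; subst; auto.
Qed.
```

The same two moves (apply the canonical forms lemma, then invert the resulting evidence) are what the T_If case of the progress proof below performs.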

Progress

The typing relation enjoys two critical properties. The first is that well-typed normal forms are values (i.e., not stuck).

Theorem progress : ∀t T,
  ⊢ t ∈ T →
  value t ∨ ∃t', t ⇒ t'.

Exercise: 3 stars (finish_progress)

Complete the formal proof of the progress property. (Make sure you understand the informal proof fragment in the following exercise before starting — this will save you a lot of time.)

Proof with auto.
  intros t T HT.
  has_type_cases (induction HT) Case...
  (* The cases that were obviously values, like T_True and
     T_False, were eliminated immediately by auto *)
  Case "T_If".
    right. inversion IHHT1; clear IHHT1.
    SCase "t₁ is a value".
    apply (bool_canonical t₁ HT1) in H.
    inversion H; subst; clear H.
      ∃t₂...
      ∃t₃...
    SCase "t₁ can take a step".
      inversion H as [t₁' H1].
      ∃(tif t₁' t₂ t₃)...
  (* FILL IN HERE *) Admitted.

☐

Exercise: 3 stars, advanced (finish_progress_informal)

Complete the corresponding informal proof:

Theorem: If ⊢ t ∈ T, then either t is a value or else t ⇒ t' for some t'.

Proof: By induction on a derivation of ⊢ t ∈ T.

If the last rule in the derivation is T_If, then t = if t₁ then t₂ else t₃, with ⊢ t₁ ∈ Bool, ⊢ t₂ ∈ T and ⊢ t₃ ∈ T. By the IH, either t₁ is a value or else t₁ can step to some t₁'.
- If t₁ is a value, then by the canonical forms lemmas and the fact that ⊢ t₁ ∈ Bool we have that t₁ is a bvalue — i.e., it is either true or false. If t₁ = true, then t steps to t₂ by ST_IfTrue, while if t₁ = false, then t steps to t₃ by ST_IfFalse. Either way, t can step, which is what we wanted to show.
- If t₁ itself can take a step, then, by ST_If, so can t.

(* FILL IN HERE *)
☐

This is more interesting than the strong progress theorem that we saw in the Smallstep chapter, where all normal forms were values. Here, a term can be stuck, but only if it is ill typed.
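To see what the theorem buys us, here is a small sketch that applies progress to one concrete well-typed term. The example name is ours; it relies on progress as stated above, and the statement is written in prefix form.

```coq
(* Hypothetical example name; uses the progress theorem stated above. *)
Example progress_instance :
  value (tpred (tsucc tzero)) \/
  exists t', step (tpred (tsucc tzero)) t'.
Proof.
  apply progress with (T := TNat).
  (* All that remains is to show the term is well typed. *)
  apply T_Pred. apply T_Succ. apply T_Zero.
Qed.
```

Notice that we never had to exhibit the step itself: a typing derivation is enough to conclude that the term is not stuck.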

Exercise: 1 star (step_review)

Quick review. Answer true or false. In this language...

- Every well-typed normal form is a value.
- Every value is a normal form.
- The single-step evaluation relation is a partial function (i.e., it is deterministic).
- The single-step evaluation relation is a total function.

☐

Type Preservation

The second critical property of typing is that, when a well-typed term takes a step, the result is also a well-typed term.

This theorem is often called the subject reduction property, because it tells us what happens when the "subject" of the typing relation is reduced. This terminology comes from thinking of typing statements as sentences, where the term is the subject and the type is the predicate.

Theorem preservation : ∀t t' T,
  ⊢ t ∈ T →
  t ⇒ t' →
  ⊢ t' ∈ T.

Exercise: 2 stars (finish_preservation)

Complete the formal proof of the preservation property. (Again, make sure you understand the informal proof fragment in the following exercise first.)

Proof with auto.
  intros t t' T HT HE.
  generalize dependent t'.
  has_type_cases (induction HT) Case;
         (* every case needs to introduce a couple of things *)
         intros t' HE;
         (* and we can deal with several impossible
            cases all at once *)
         try (solve by inversion).
    Case "T_If". inversion HE; subst; clear HE.
      SCase "ST_IfTrue". assumption.
      SCase "ST_IfFalse". assumption.
      SCase "ST_If". apply T_If; try assumption.
        apply IHHT1; assumption.
    (* FILL IN HERE *) Admitted.

☐
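Here is a sketch of preservation at work on a concrete term: from a typing derivation for a conditional and one reduction step, we obtain a typing for the result without building it by hand. The example name is ours; it relies on preservation as stated above.

```coq
(* Hypothetical example name; uses the preservation theorem stated above. *)
Example preservation_instance :
  has_type (tsucc tzero) TNat.
Proof.
  apply preservation with (t := tif ttrue (tsucc tzero) tzero).
  (* First premise: the conditional has type TNat. *)
  apply T_If. apply T_True. apply T_Succ. apply T_Zero. apply T_Zero.
  (* Second premise: the conditional steps to its then-branch. *)
  apply ST_IfTrue.
Qed.
```

Of course, for a term this small the direct derivation (T_Succ followed by T_Zero) is shorter; the point is only to show how the two premises of the theorem are discharged.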

Exercise: 3 stars, advanced (finish_preservation_informal)

Complete the following proof:

Theorem: If ⊢ t ∈ T and t ⇒ t', then ⊢ t' ∈ T.

Proof: By induction on a derivation of ⊢ t ∈ T.

If the last rule in the derivation is T_If, then t = if t₁ then t₂ else t₃, with ⊢ t₁ ∈ Bool, ⊢ t₂ ∈ T and ⊢ t₃ ∈ T.

Inspecting the rules for the small-step reduction relation and remembering that t has the form if ..., we see that the only ones that could have been used to prove t ⇒ t' are ST_IfTrue, ST_IfFalse, or ST_If.
- If the last rule was ST_IfTrue, then t' = t₂. But we know that ⊢ t₂ ∈ T, so we are done.
- If the last rule was ST_IfFalse, then t' = t₃. But we know that ⊢ t₃ ∈ T, so we are done.
- If the last rule was ST_If, then t' = if t₁' then t₂ else t₃, where t₁ ⇒ t₁'. We know ⊢ t₁ ∈ Bool so, by the IH, ⊢ t₁' ∈ Bool. The T_If rule then gives us ⊢ if t₁' then t₂ else t₃ ∈ T, as required.

(* FILL IN HERE *)
☐

Exercise: 3 stars (preservation_alternate_proof)

Now prove the same property again by induction on the evaluation derivation instead of on the typing derivation. Begin by carefully reading and thinking about the first few lines of the above proof to make sure you understand what each one is doing. The set-up for this proof is similar, but not exactly the same.

Theorem preservation' : ∀t t' T,
  ⊢ t ∈ T →
  t ⇒ t' →
  ⊢ t' ∈ T.
Proof with eauto.
  (* FILL IN HERE *) Admitted.

☐

Type Soundness

Putting progress and preservation together, we can see that a well-typed term can never reach a stuck state.

Definition multistep := (multi step).
Notation "t₁ '⇒*' t₂" := (multistep t₁ t₂) (at level 40).

Corollary soundness : ∀t t' T,
  ⊢ t ∈ T →
  t ⇒* t' →
  ¬ (stuck t').

Proof.
  intros t t' T HT P. induction P; intros [R S].
  Case "multi_refl".
    destruct (progress x T HT); auto.
  Case "multi_step".
    apply IHP. apply (preservation x y T HT H).
    unfold stuck. split; auto. Qed.
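As an illustration, the following sketch uses soundness to conclude that the result of reducing a particular well-typed term is not stuck. The example name is ours, and the snippet assumes the definitions of stuck, multistep and soundness given above.

```coq
(* Hypothetical example name; uses the soundness corollary above. *)
Example soundness_instance :
  ~ stuck (tsucc tzero).
Proof.
  apply soundness with (t := tsucc (tpred (tsucc tzero))) (T := TNat).
  (* The starting term is well typed. *)
  apply T_Succ. apply T_Pred. apply T_Succ. apply T_Zero.
  (* It reduces in one step to tsucc tzero. *)
  unfold multistep.
  eapply multi_step.
  apply ST_Succ. apply ST_PredSucc. apply nv_zero.
  apply multi_refl.
Qed.
```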

Aside: the normalize Tactic

When experimenting with definitions of programming languages in Coq, we often want to see what a particular concrete term steps to — i.e., we want to find proofs for goals of the form t ⇒* t', where t is a completely concrete term and t' is unknown. These proofs are simple but repetitive to do by hand. Consider for example reducing an arithmetic expression using the small-step relation astep.

Definition amultistep st := multi (astep st).
Notation " t '/' st '⇒a*' t' " := (amultistep st t t')
  (at level 40, st at level 39).

Example astep_example1 :
  (APlus (ANum 3) (AMult (ANum 3) (ANum 4))) / empty_state
  ⇒a* (ANum 15).
Proof.
  apply multi_step with (APlus (ANum 3) (ANum 12)).
    apply AS_Plus2.
      apply av_num.
      apply AS_Mult.
  apply multi_step with (ANum 15).
    apply AS_Plus.
  apply multi_refl.
Qed.

We repeatedly apply multi_step until we get to a normal form. The proofs that the intermediate steps are possible are simple enough that auto, with appropriate hints, can solve them.

Hint Constructors astep aval.
Example astep_example1' :
  (APlus (ANum 3) (AMult (ANum 3) (ANum 4))) / empty_state
  ⇒a* (ANum 15).
Proof.
  eapply multi_step. auto. simpl.
  eapply multi_step. auto. simpl.
  apply multi_refl.
Qed.

The following custom Tactic Notation definition captures this pattern. In addition, before each multi_step we print out the current goal, so that the user can follow how the term is being evaluated.

Tactic Notation "print_goal" := match goal with ⊢ ?x ⇒ idtac x end.
Tactic Notation "normalize" :=
   repeat (print_goal; eapply multi_step ;
             [ (eauto 10; fail) | (instantiate; simpl)]);
   apply multi_refl.

Example astep_example1'' :
  (APlus (ANum 3) (AMult (ANum 3) (ANum 4))) / empty_state
  ⇒a* (ANum 15).
Proof.
  normalize.
  (* At this point in the proof script, the Coq response shows
     a trace of how the expression evaluated.

   (APlus (ANum 3) (AMult (ANum 3) (ANum 4)) / empty_state ==>a* ANum 15)
   (multi (astep empty_state) (APlus (ANum 3) (ANum 12)) (ANum 15))
   (multi (astep empty_state) (ANum 15) (ANum 15))
*)
Qed.

The normalize tactic also provides a simple way to calculate what the normal form of a term is, by proving a goal with an existential variable in it.

Example astep_example1''' : ∃e',
  (APlus (ANum 3) (AMult (ANum 3) (ANum 4))) / empty_state
  ⇒a* e'.
Proof.
  eapply ex_intro. normalize.

(* This time, the trace will be:

    (APlus (ANum 3) (AMult (ANum 3) (ANum 4)) / empty_state ==>a* ??)
    (multi (astep empty_state) (APlus (ANum 3) (ANum 12)) ??)
    (multi (astep empty_state) (ANum 15) ??)

   where ?? is the variable "guessed" by eapply.
*)
Qed.
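The same tactic can be tried on the typed arithmetic language of this chapter. The sketch below assumes that the constructors of step are available to eauto; we declare the hint explicitly here in case the development has not already done so. The example name is ours, and the goal is stated with multi step in prefix form.

```coq
(* Assumption: step's constructors must be usable by eauto. If this
   hint is already declared elsewhere, the next line is redundant. *)
Hint Constructors step.

(* Hypothetical example name; uses the normalize tactic defined above. *)
Example tm_normalize_example : exists t',
  multi step (tif (tiszero tzero) (tsucc tzero) tzero) t'.
Proof.
  eapply ex_intro. normalize.
Qed.
```

If the hints fire as expected, the printed trace should show the guard being reduced by ST_If and ST_IszeroZero, then the conditional being resolved by ST_IfTrue, with the normal form tsucc tzero chosen for the existential variable.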

Exercise: 1 star (normalize_ex)

Theorem normalize_ex : ∃e',
  (AMult (ANum 3) (AMult (ANum 2) (ANum 1))) / empty_state
  ⇒a* e'.
Proof.
  (* FILL IN HERE *) Admitted.

☐

Exercise: 1 star, optional (normalize_ex')

For comparison, prove it using apply instead of eapply.

Theorem normalize_ex' : ∃e',
  (AMult (ANum 3) (AMult (ANum 2) (ANum 1))) / empty_state
  ⇒a* e'.
Proof.
  (* FILL IN HERE *) Admitted.

☐

Additional Exercises

Exercise: 2 stars (subject_expansion)

Having seen the subject reduction property, it is reasonable to wonder whether the opposite property, subject expansion, also holds. That is, is it always the case that, if t ⇒ t' and ⊢ t' ∈ T, then ⊢ t ∈ T? If so, prove it. If not, give a counter-example. (You do not need to prove your counter-example in Coq, but feel free to do so if you like.)

(* FILL IN HERE *)
☐

Exercise: 2 stars (variation1)

Suppose that we add this new rule to the typing relation:

      | T_SuccBool : ∀t,
           ⊢ t ∈ TBool →
           ⊢ tsucc t ∈ TBool

Which of the following properties remain true in the presence of this rule? For each one, write either "remains true" or else "becomes false." If a property becomes false, give a counterexample.

- Determinism of step
- Progress
- Preservation

☐