CSC448: Phases

Contents [0/8]

Phases: Abstract Syntax [1/8]
Phases: Three-Address-Code Abstract Syntax [2/8]
Phases: Well-formedness [3/8]
Phases: Typechecking [4/8]
Phases: Simple Transforms [5/8]
Phases: Eliminate High Level Control [6/8]
Phases: Introduce Three Address Code [7/8]
Phases: Code Generation [8/8]

Phases: Abstract Syntax [1/8]

Clogs abstract syntax (for well-formed, typed programs)

            n : Integer         (integer)
        f,l,x : Identifier      (identifier: function, label, variable)
          T,S ::= int | T[]     (non-void type)

      program ::= extdecl1 ... extdecln

      extdecl ::= decl
                | T f (S1 x1,...,Sn xn) compoundStat      (FunDef)
                | void f (T1 x1,...,Tn xn) compoundStat   (FunDef)

         decl ::= T x;                                    (Decl)
                | T x = exp;                              (Decl)

 compoundStat ::= { decl1 ... decln stat1 ... statm }     (StatCompound)

         stat ::= compoundStat                            
                | exp;                                    (StatExp)
                | goto l;                                 (StatGoto)
                | if (exp) statT else statF               (StatIf)
                | return;                                 (StatReturn)
                | return exp;                             (StatReturn)
                | skip;                                   (StatSkip)
                | while (exp) stat                        (StatWhile)

          exp ::= expA[expI]                              (ExpArrayAccess)
                | x = expR                                (ExpAssign)
                | expA[expI] = expR                       (ExpAssign)
                | expL binop expR                         (ExpBinOp)
                | expL , expR                             (ExpComma)
                | f(exp1,...,expn)                        (ExpFunCall)
                | n                                       (ExpInt)
                | new T[exp]                              (ExpNew)
                | "value"                                 (ExpString)
                | unop exp                                (ExpUnOp)
                | x                                       (ExpVar)

Phases: Three-Address-Code Abstract Syntax [2/8]

Abstract syntax for three-address code:

            n : Integer         (integer)
        f,l,x : Identifier      (identifier: function, label, variable)
            v ::= n | x         (integer or variable)
          T,S ::= int | T[]     (non-void type)

      program ::= extdecl1 ... extdecln

      extdecl ::= decl
                | T f (S1 x1,...,Sn xn) compoundStat      (FunDef)
                | void f (T1 x1,...,Tn xn) compoundStat   (FunDef)

         decl ::= T x;                                    (Decl)

 compoundStat ::= { decl1 ... decln stat1 ... statm }     (StatCompound)

         stat ::= compoundStat                            
                | exp;                                    (StatExp)
                | goto l;                                 (StatGoto)
                | if (exp) goto l1; else goto l2;         (StatIf)
                | return;                                 (StatReturn)
                | return exp;                             (StatReturn)
                | skip;                                   (StatSkip)

          exp ::= rhs
                | lhs = rhs                               (ExpAssign)

          lhs ::= x                                       (ExpVar)
                | x[v]                                    (ExpArrayAccess)

          rhs ::= x[v]                                    (ExpArrayAccess)
                | vL binop vR                             (ExpBinOp)
                | f(v1,...,vn)                            (ExpFunCall)
                | n                                       (ExpInt)
                | new T[v]                                (ExpNew)
                | "value"                                 (ExpString)
                | unop v                                  (ExpUnOp)
                | x                                       (ExpVar)

Phases: Well-formedness [3/8]

Well-formedness addresses the following issues:

Phases: Typechecking [4/8]

Contexts have the form:

    G ::= empty
        | G,x:T                (variable declaration)
        | G,f:(S1,...,Sn)->T   (function declaration)

The judgments are written:

      |- extdecl1 ... extdecln : Program
    G |- extdecl : Declaration
    G |- stat : Statement<declaredReturnType>
    G |- exp : T
      |- exp is lvalue

Every variable and array access is an lvalue. Nothing else is an lvalue.

    -----------------------------
      |- (expA[expI]) is lvalue
    ----------------
      |- x is lvalue

To check a bunch of global variable and function declarations, we put them in the context, then start checking them; this allows mutually recursive definitions.

    context(extdecl1 ... extdecln) |- extdecl1 : Declaration
    ...
    context(extdecl1 ... extdecln) |- extdecln : Declaration
    ------------------------------------------------------------------
                                   |- extdecl1 ... extdecln : Program

To check a variable declaration, we just need to make sure that the initializer has the correct type:

    ------------------------------
    G |- (T x;) : Declaration
    G |- exp : T
    -------------------------------
    G |- (T x = exp;) : Declaration

To check a function declaration, we check the body:

    G,x1:S1,...,xn:Sn |- compoundStat : Statement<T>
    -------------------------------------------------------
    G |- (T f (S1 x1,...,Sn xn) compoundStat) : Declaration

To check a statement, do a case analysis.

case StatCompound:

    G,x1:T1,...,xn:Tn |- (T1 x1 = exp1;) : Declaration
    ...
    G,x1:T1,...,xn:Tn |- (Tn xn = expn;) : Declaration
    G,x1:T1,...,xn:Tn |- stat1 : Statement<T>
    ...
    G,x1:T1,...,xn:Tn |- statm : Statement<T>
    -----------------------------------------------------------------------
    G |- { T1 x1 = exp1; ... Tn xn = expn; stat1 ... statm } : Statement<T>

case StatExp:

    G |- exp : S    (for some type S)
    ---------------------------------------
    G |- (exp;) : Statement<T>

case StatGoto:

    -----------------------------------
    G |- (goto l;) : Statement<T>

case StatIf:

    G |- exp : int
    G |- statT : Statement<T>
    G |- statF : Statement<T>
    -----------------------------------------------------
    G |- (if (exp) statT else statF) : Statement<T>

case StatReturn:

    ---------------------------------------
    G |- (return;) : Statement<void>

    G |- exp : T
    ---------------------------------------
    G |- (return exp;) : Statement<T>

case StatSkip:

    ---------------------------------
    G |- (skip;) : Statement<T>

case StatWhile:

    G |- exp : int
    G |- statW : Statement<T>
    -----------------------------------------------------
    G |- (while (exp) statW) : Statement<T>

To check an expression, do a case analysis.

case ExpInt:

    ------------
    G |- n : int

case ExpVar:

    G(x) = T
    ----------
    G |- x : T

case ExpString:

    --------------------
    G |- "value" : int[]

case ExpNew:

    G |- exp : int
    --------------------------------------------
    G |- (new T[exp]) : T[]

case ExpArrayAccess:

    G |- expA : T[]
    G |- expI : int
    -----------------------------
    G |- (expA[expI]) : T

case ExpAssign:

    G |- expL : T
    G |- expR : T
      |- expL is lval
    ------------------------------
    G |- (expL = expR) : T

case ExpUnOp:

    G |- exp : int
    ---------------------
    G |- (unop exp) : int

case ExpBinOp:

    G |- expL : int
    G |- expR : int
    -----------------------------------
    G |- (expL binop expR) : int

case ExpComma:

    G |- expL : S
    G |- expR : T
    -----------------------------
    G |- (expL , expR) : T

case ExpFunCall:

    G(f) = (S1,...,Sn)->T
    G |- exp1 : S1
    ...
    G |- expn : Sn
    ------------------------------------
    G |- (f(exp1,...,expn)) : T

Phases: Simple Transforms [5/8]

Eliminate Initializers

Rename Labels Uniquely

Phases: Eliminate High Level Control [6/8]

case StatIf:

        if (exp) statT else statF
    -->
        if (exp)
          goto l1;
        else
          goto l2;
      l1:
        skip;
        statT
        goto l3;
      l2:
        skip;
        statF
        goto l3;
      l3:
        skip

case StatWhile:

        while (exp) stat
    -->
        goto l2;
      l1:
        skip;
        stat
      l2:
        if (exp)
          goto l1
        else
          goto l3;
      l3:
        skip;

Phases: Introduce Three Address Code [7/8]

Write compound statements as

    { decls stats }

Write instances of DeclsStats as

    < decls | stats >

Write instances of DeclsStatsVariable as

    < decls | stats | Optional<v> >

If the exp must be present, then:

    < decls | stats | v >

The relevant functions are:

    transformStat (stat) = < decls | stats >
    transformExp (exp) = < decls | stats | Optional<v> >

To transform a statement, do a case analysis.

case StatCompound:

    transformStat (stat_0) = < decls_0 | stats_0 >
    ...
    transformStat (stat_n) = < decls_n | stats_n >
    -----------------------------------------------------
    transformStat (({decls stat_0 ... stat_n}))
      = < nil | { decls decls_0 ... decls_n stats_0 ... stats_m } >

case StatExp:

    transformExp (exp) = < decls | stats | ... >
    -----------------------------------------------------
    transformStat ((exp;)) = < decls | stats >

case StatGoto:

    -----------------------------------------------------
    transformStat ((goto l;)) = < nil | (goto l;) >

case StatIf:

    transformExp (exp) = < decls | stats | v >
    -----------------------------------------------------
    transformStat ((if (exp) goto l1; else goto l2;))
      = < decls | stats  (if (v) goto l1; else goto l2;) >

case StatReturn:

    -----------------------------------------------------
    transformStat ((return;)) = < nil | (return;) >

    transformExp (exp) = < decls | stats | v >
    -----------------------------------------------------------------
    transformStat ((return exp;)) = < decls | stats (return v;) >

case StatSkip:

    -----------------------------------------------------
    transformStat ((skip;)) = < nil | (skip;) >

To transform an expression, do a case analysis.

case ExpInt:

    -----------------------------------------------------
    transformExp (n) = < nil | nil | n >

case ExpVar:

    -----------------------------------------------------
    transformExp (x) = < nil | nil | x >

case ExpString:

    -----------------------------------------------------
    transformExp (("value"))
      = < (int[] x;) | (x = ("value");) | x >

case ExpNew:

    transformExp (exp) = < decls | stats | v >
    -----------------------------------------------------
    transformExp ((new T[exp];))
      = < decls (T[] x;) | stats (x = (new T[v]);) | x >

case ExpArrayAccess:

    transformExp (expA) = < decls1 | stats1 | v1 >
    transformExp (expI) = < decls2 | stats2 | v2 >
    ------------------------------------------------------------------------------
    transformExp ((expA[expI]) :: T)
      = < decls1 decls2 (T x;) | stats1 stats2 (x = (v1[v2]);) | x >

case ExpAssign:

    transformExp (expR) = < decls3 | stats3 | v3 >
    -----------------------------------------------------
    transformExp ((x = expR))
      = < decls3 | stats3 (x = v3;) | v3 >

    transformExp (expA) = < decls1 | stats1 | v1 >
    transformExp (expI) = < decls2 | stats2 | v2 >
    transformExp (expR) = < decls3 | stats3 | v3 >
    -------------------------------------------------------------------------------
    transformExp ((expA[expI] = expR))
      = < decls1 decls2 decls3 | stats1 stats2 stats3 (v1[v2] = v3;) | v3 >

case ExpUnOp:

    transformExp (exp) = < decls | stats | v >
    ------------------------------------------------------------
    transformExp ((unop exp) :: T)
      = < decls (T x;) | stats (x = (unop v);) | x >

case ExpBinOp:

    transformExp (expL)  = < decls1 | stats1 | v1 >
    transformExp (expR) = < decls2 | stats2 | v2 >
    ----------------------------------------------------------------------------------
    transformExp ((expL op expR) :: T)
      = < decls1 decls2 (T x;) | stats1 stats2 (x = (v1 op v2);) | x >

case ExpComma:

    transformExp (expL)  = < decls1 | stats1 | Optional<exp1> >
    transformExp (expR) = < decls2 | stats2 | Optional<exp2> >
    -----------------------------------------------------------
    transformExp ((expL , expR))
      = < decls1 decls2 | stats1 stats2 | Optional<exp2> >

case ExpFunCall:

    transformExp (exp1) = < decls1 | stats1 | v1 >
    ...
    transformExp (expn) = < declsn | statsn | vn >
    -------------------------------------------------------------------------
    transformExp ((f(exp1,...,expn)) :: void)
      = < decls1 ... declsn | stats1 ... statsn (f(v1,...,vn)) | nil >

    transformExp (exp1) = < decls1 | stats1 | v1 >
    ...
    transformExp (expn) = < declsn | statsn | vn >
    -------------------------------------------------------------------------
    transformExp ((f(exp1,...,expn)) :: T)
      = < decls1 ... declsn (T x;) | stats1 ... statsn (x = (f(v1,...,vn))) | x >

Phases: Code Generation [8/8]

For programs, suppose that all the global variable declarations precede the function declarations, then we have:

    code (T1 x1) = asmd1
    ...
    code (Tn xn) = asmdn
    context((decl1 ... decln fundef1 ... fundefm)) |- code (fundef1) = asmf1
    ...
    context((decl1 ... decln fundef1 ... fundefm)) |- code (fundefm) = asmfm
    ------------------------------------------------------------------------
    code ((decl1 ... decln fundef1 ... fundefm)) =
      .text
      asmf1
      ...
      asmfm
      .data
      asmd1
      ...
      asmdm

For global variable declarations:

    --------------------------------------------
    code ((T x;)) =
      .align 4
      .global x
      x:
      .long  0

For function declarations:

    G,x1:S1,...,xn:Sn |- code (compoundStat) = asm
    -------------------------------------------------
    G |- code ((T f(S1 x1,...,Sn xn) compoundStat)) =
          .align 4
          .global f
          f:
          pushl   %ebp
          movl    %esp, %ebp
          asm

Recall the three-address syntax for statements:

    stat ::= { decl1 ... decln stat1 ... statm }     (StatCompound)
           | exp;                                    (StatExp)
           | goto l;                                 (StatGoto)
           | if (exp) goto l1; else goto l2;         (StatIf)
           | return;                                 (StatReturn)
           | return exp;                             (StatReturn)
           | skip;                                   (StatSkip)

case StatCompound:

    G,x1:T1,...,xn:Tn |- code (stat1) = asm1
    ...
    G,x1:T1,...,xn:Tn |- code (statm) = asmm
    ---------------------------------------------------
    G |- code ({ T1 x1; ... Tn xn; stat1 ... statm }) =
          subl $(n*intSize), %esp
          asm1
          ...
          asmm
          addl $(n*intSize), %esp

case StatExp:

    G |- code (exp) = asm
    ---------------------------------------------------
    G |- code ((exp;)) = asm

case StatGoto:

    ---------------------------------------------------
    G |- code ((goto l;)) = jmp l

case StatIf:

    G |- location (v) = v''
    ---------------------------------------------
    G |- code ((if (v) goto l1; else goto l2;)) =
          movl   v'', %eax
          testl  %eax, %eax
          jne    %l1
          jmp    %l2

case StatReturn:

    ---------------------------------------------
    G |- code ((return;)) =
          movl   %ebp, %esp
          popl   %ebp
          ret
    
    G |- location (v) = v''
    ---------------------------------------------
    G |- code ((return v;)) =
          movl   v'', %eax
          movl   %ebp, %esp
          popl   %ebp
          ret

case StatSkip:

    ---------------------------------------------
    G |- code ((skip;)) = nop

Recall the three-address syntax for expressions:

    exp ::= rhs
          | lhs = rhs                               (ExpAssign)

    lhs ::= x                                       (ExpVar)
          | x[v]                                    (ExpArrayAccess)

    rhs ::= n                                       (ExpInt)
          | x                                       (ExpVar)
          | "value"                                 (ExpString)
          | new T[v]                                (ExpNew)
          | x[v]                                    (ExpArrayAccess)
          | unop v                                  (ExpUnOp)
          | vL binop vR                             (ExpBinOp)
          | f(v1,...,vn)                            (ExpFunCall)

First generate code for the RHS, and put the result in %eax

case ExpInt:

    ---------------------------------------------
    G |- code (n) =
          movl   $n, %eax

case ExpVar:

    G |- location (x) = x''
    ---------------------------------------------
    G |- code (x) =
          movl   x'', %eax

case ExpString:

    ---------------------------------------------
    G |- code (("value")) =
          movl lab_string, %eax
    
          lab_string:
          .int  'v'
          .int  'a'
          .int  'l'
          .int  'u'
          .int  'e'
          .int  0

case ExpNew:

    G |- location (v) = v''
    ---------------------------------------------
    G |- code ((new T[v])) =
          movl   v'', %eax
          sall   $(log_2(intSize)), %eax
          push   %eax
          call   _malloc
          addl   $(1*intSize), %esp

case ExpArrayAccess:

    G |- location (x) = x''
    G |- location (v) = v''
    ---------------------------------------------
    G |- code (x[v]) =
          movl   x'', %edx
          movl   v'', %ecx
          sall   $(log_2(intSize)), %ecx
          addl   %ecx, %edx
          movl   0(%edx), %eax

case ExpUnOp:

    G |- location (v) = v''
    ---------------------------------------------
    G |- code ((- v)) =
          movl   v1'', %eax
          negl   %eax

case ExpBinOp:

    G |- location (v1) = v1''
    G |- location (v2) = v2''
    ---------------------------------------------
    G |- code ((v1 + v2)) =
          movl   v1'', %eax
          movl   v2'', %ecx
          addl   %ecx, %eax

case ExpFunCall:

    G |- location (v1) = v1''
    ...
    G |- location (vn) = vn''
    ---------------------------------------------
    G |- code ((f(v1,...,vn))) =
          pushl  vn''
          ...
          pushl  v1''
          call   f
          addl   $(n*intSize), %esp

Then generate code for the LHS, if any, getting the result from %eax

case ExpVar:

      G |- location (x) = x''
      ---------------------------------------------
      G |- code (x) =
            movl   %eax, x''

case ExpArrayAccess:

      G |- location (x) = x''
      G |- location (v) = v''
      ---------------------------------------------
      G |- code (x[v]) =
            movl   x'', %edx
            movl   v'', %ecx
            sall   $(log_2(intSize)), %ecx
            addl   %ecx, %edx
            movl   %eax, 0(%edx)

Revised: 2008/01/31 17:19