
Static Single Assignment Compiler Theory

  1. A. V. Aho, R. Sethi, and J. D. Ullman. Code Optimization and Finite Church-Rosser Systems. In Design and Optimization of Compilers, R. Rustin, ed. Prentice-Hall, 1971, pp. 89–105.

  2. B. Alpern, M. N. Wegman, and F. K. Zadeck. Detecting Equality of Variables in Programs. Proceedings of the Fifteenth Annual ACM Symposium on Principles of Programming Languages, 1988, pp. 1–11.

  3. A. W. Appel. SSA is Functional Programming. ACM SIGPLAN Notices 33, 4 (April 1998), pp. 17–20.

  4. A. W. Appel. Modern Compiler Implementation in Java. Cambridge University Press, 1998.

  5. M. M. Brandis and H. Mössenböck. Single-Pass Generation of Static Single-Assignment Form for Structured Languages. ACM TOPLAS 16, 6 (November 1994), pp. 1684–1698.

  6. P. Briggs, T. Harvey, and T. Simpson. Static Single Assignment Construction, Version 1.0. Unpublished document, 1995.

  7. J.-D. Choi, R. Cytron, and J. Ferrante. Automatic Construction of Sparse Data Flow Evaluation Graphs. ACM POPL '91, pp. 55–66.

  8. F. Chow, S. Chan, R. Kennedy, S.-M. Liu, R. Lo, and P. Tu. A New Algorithm for Partial Redundancy Elimination based on SSA Form. ACM PLDI '97, pp. 273–286.

  9. R. K. Cytron and J. Ferrante. Efficiently Computing φ-Nodes On-The-Fly. ACM TOPLAS 17, 3 (May 1995), pp. 487–506.

  10. R. Cytron, J. Ferrante, B. K. Rosen, M. N. Wegman, and F. K. Zadeck. Efficiently Computing Static Single-Assignment Form and the Control Dependence Graph. ACM TOPLAS 13, 4 (October 1991), pp. 451–490.

  11. R. Cytron, A. Lowry, and K. Zadeck. Code Motion of Control Structures in High-Level Languages. Proceedings of the Thirteenth Annual ACM Symposium on Principles of Programming Languages, 1986, pp. 70–85.

  12. M. J. Fischer. Efficiency of Equivalence Algorithms. In Complexity of Computer Computations, R. E. Miller and J. W. Thatcher, eds. Plenum Press, 1972.

  13. M. S. Hecht. Flow Analysis of Computer Programs. North-Holland, 1977.

  14. M. S. Hecht and J. D. Ullman. Flow Graph Reducibility. SIAM Journal on Computing 1, 2 (June 1972), pp. 188–202.

  15. J. E. Hopcroft and J. D. Ullman. Set Merging Algorithms. SIAM Journal on Computing 2, 4 (December 1973), pp. 294–303.

  16. R. Johnson, D. Pearson, and K. Pingali. The Program Structure Tree: Computing Control Regions in Linear Time. ACM PLDI '94, pp. 171–185.

  17. D. E. Knuth. An Empirical Study of FORTRAN Programs. Software: Practice and Experience 1, 1971, pp. 105–133.

  18. D. E. Knuth. The Art of Computer Programming, Volume 1: Fundamental Algorithms. Addison-Wesley, 1997.

  19. D. B. Loveman and R. A. Faneuf. Program Optimization: Theory and Practice. Conference on Programming Languages and Compilers for Parallel and Vector Machines, 1975, pp. 97–102.

  20. R. Morgan. Building an Optimizing Compiler. Digital Press, 1998.

  21. S. S. Muchnick. Advanced Compiler Design and Implementation. Morgan Kaufmann, 1997.

  22. B. K. Rosen, M. N. Wegman, and F. K. Zadeck. Global Value Numbers and Redundant Computations. Proceedings of the Fifteenth Annual ACM Symposium on Principles of Programming Languages, 1988, pp. 12–27.

  23. R. M. Shapiro and H. Saint. The Representation of Algorithms. Rome Air Development Center TR-69-313, Volume II, September 1969.

  24. V. C. Sreedhar and G. R. Gao. A Linear Time Algorithm for Placing φ-Nodes. Proceedings of the Twenty-Second Annual ACM Symposium on Principles of Programming Languages, 1995, pp. 62–73.

  25. R. E. Tarjan. Efficiency of a Good But Not Linear Set Union Algorithm. JACM 22, 2 (April 1975), pp. 215–225.

  26. J. D. Ullman. Fast Algorithms for the Elimination of Common Subexpressions. Acta Informatica 2, 1973, pp. 191–213.

  27. M. N. Wegman and F. K. Zadeck. Constant Propagation with Conditional Branches. ACM TOPLAS 13, 2 (April 1991), pp. 181–210.


13.3 Static Single Assignment

    Most of the tree optimizers rely on the data flow information provided by the Static Single Assignment (SSA) form. We implement the SSA form as described in R. Cytron, J. Ferrante, B. Rosen, M. Wegman, and K. Zadeck. Efficiently Computing Static Single Assignment Form and the Control Dependence Graph. ACM Transactions on Programming Languages and Systems, 13(4):451-490, October 1991.

    The SSA form is based on the premise that program variables are assigned in exactly one location in the program. Multiple assignments to the same variable create new versions of that variable. Naturally, actual programs are seldom in SSA form initially because variables tend to be assigned multiple times. The compiler modifies the program representation so that every time a variable is assigned in the code, a new version of the variable is created. Different versions of the same variable are distinguished by subscripting the variable name with its version number. Variables used in the right-hand side of expressions are renamed so that their version number matches that of the most recent assignment.
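
    For instance, a straight-line sequence might be renamed as follows (an illustrative sketch, not actual GCC dump output; version numbers are shown per variable for readability):

        Original code          SSA form
        -------------          --------
        a = 4;                 a_1 = 4;
        b = a + 2;             b_1 = a_1 + 2;
        a = b * 3;             a_2 = b_1 * 3;
        return a;              return a_2;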

    We represent variable versions using SSA_NAME nodes. The renaming process wraps every real and virtual operand with an SSA_NAME node which contains the version number and the statement that created the SSA_NAME. Only definitions and virtual definitions may create new SSA_NAME nodes.

    Sometimes, flow of control makes it impossible to determine the most recent version of a variable. In these cases, the compiler inserts an artificial definition for that variable, called a PHI function or PHI node. This new definition merges all the incoming versions of the variable to create a new name for it. For instance,

        if (…)
          a_1 = 5;
        else if (…)
          a_2 = 2;
        else
          a_3 = 13;

        # a_4 = PHI <a_1, a_2, a_3>
        return a_4;

    Since it is not possible to determine which of the three branches will be taken at runtime, we don't know which of a_1, a_2 or a_3 to use at the return statement. So, the SSA renamer creates a new version a_4 which is assigned the result of "merging" a_1, a_2 and a_3. Hence, PHI nodes mean "one of these operands; I don't know which".

    The following functions can be used to examine PHI nodes; a usage sketch follows the list:

    Function: gimple_phi_result (phi)

    Returns the SSA_NAME created by PHI node phi (i.e., phi's LHS).

    Function: gimple_phi_num_args (phi)

    Returns the number of arguments in phi. This number is exactly the number of incoming edges to the basic block holding phi.

    Function: gimple_phi_arg (phi, i)

    Returns the ith argument of phi.

    Function: gimple_phi_arg_edge (phi, i)

    Returns the incoming edge for the ith argument of phi.

    Function: gimple_phi_arg_def (phi, i)

    Returns the SSA_NAME for the ith argument of phi.
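
    As promised above, here is a usage sketch that walks the PHI nodes of a basic block with these accessors and dumps each one. This is not code from GCC itself: dump_phis is a hypothetical helper, and the iterator types (gphi_iterator, gphi) and header set assume a reasonably recent GCC release.

        #include "config.h"
        #include "system.h"
        #include "coretypes.h"
        #include "backend.h"
        #include "tree.h"
        #include "gimple.h"
        #include "gimple-iterator.h"
        #include "dumpfile.h"
        #include "tree-pretty-print.h"

        /* Print each PHI node in BB as "lhs = PHI <arg (bb N), ...>".  */

        static void
        dump_phis (basic_block bb)
        {
          for (gphi_iterator gsi = gsi_start_phis (bb); !gsi_end_p (gsi);
               gsi_next (&gsi))
            {
              gphi *phi = gsi.phi ();

              /* The SSA_NAME created by the PHI node (its LHS).  */
              print_generic_expr (stderr, gimple_phi_result (phi), TDF_SLIM);
              fprintf (stderr, " = PHI <");

              /* There is one argument per incoming edge of BB.  */
              for (unsigned i = 0; i < gimple_phi_num_args (phi); i++)
                {
                  edge e = gimple_phi_arg_edge (phi, i);
                  print_generic_expr (stderr, gimple_phi_arg_def (phi, i),
                                      TDF_SLIM);
                  fprintf (stderr, " (bb %d)", e->src->index);
                  if (i + 1 < gimple_phi_num_args (phi))
                    fprintf (stderr, ", ");
                }
              fprintf (stderr, ">\n");
            }
        }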

    13.3.1 Preserving the SSA form

    Some optimization passes make changes to the function that invalidate the SSA property. This can happen when a pass has added new symbols or changed the program so that variables that were previously aliased no longer are. Whenever something like this happens, the affected symbols must be renamed into SSA form again. Transformations that emit new code or replicate existing statements will also need to update the SSA form.

    Since GCC implements two different SSA forms for register and virtual variables, keeping the SSA form up to date depends on whether you are updating register or virtual names. In both cases, the general idea behind incremental SSA updates is similar: when new SSA names are created, they typically are meant to replace other existing names in the program.

    For instance, given the following code:

        1   L0:
        2   x_1 = PHI (0, x_5)
        3   if (x_1 < 10)
        4     if (x_1 > 7)
        5       y_2 = 0
        6     else
        7       y_3 = x_1 + x_7
        8     endif
        9     x_5 = x_1 + 1
        10    goto L0;
        11  endif

    Suppose that we insert new names x_10 and x_11 (lines 4 and 8):

        1   L0:
        2   x_1 = PHI (0, x_5)
        3   if (x_1 < 10)
        4     x_10 = …
        5     if (x_1 > 7)
        6       y_2 = 0
        7     else
        8       x_11 = …
        9       y_3 = x_1 + x_7
        10    endif
        11    x_5 = x_1 + 1
        12    goto L0;
        13  endif

    We want to replace all the uses of x_1 with the new definitions of x_10 and x_11. Note that the only uses that should be replaced are those at lines 5, 9 and 11. Also, the use of x_7 at line 9 should not be replaced (this is why we cannot just mark symbol x for renaming).

    Additionally, we may need to insert a PHI node at line 11 because that is a merge point for x_10 and x_11. So the use of x_1 at line 11 will be replaced with the new PHI node. The insertion of PHI nodes is optional. They are not strictly necessary to preserve the SSA form, and depending on what the caller inserted, they may not even be useful for the optimizers.

    Updating the SSA form is a two-step process. First, the pass has to identify which names need to be updated and/or which symbols need to be renamed into SSA form for the first time. When new names are introduced to replace existing names in the program, the mapping between the old and the new names is registered by calling register_new_name_mapping (note that if your pass creates new code by duplicating basic blocks, the call to tree_duplicate_bb will set up the necessary mappings automatically).

    After the replacement mappings have been registered and new symbols marked for renaming, a call to update_ssa makes the registered changes. This can be done with an explicit call or by setting TODO flags in the descriptor for your pass; a usage sketch follows the list below. There are several TODO flags that control the behavior of update_ssa:

    • TODO_update_ssa. Update the SSA form inserting PHI nodes for newly exposed symbols and virtual names marked for updating. When updating real names, only insert PHI nodes for a real name O_j in blocks reached by all the new and old definitions for O_j. If the iterated dominance frontier for O_j is not pruned, we may end up inserting PHI nodes in blocks that have one or more edges with no incoming definition for O_j. This would lead to uninitialized warnings for O_j's symbol.
    • TODO_update_ssa_no_phi. Update the SSA form without inserting any new PHI nodes at all. This is used by passes that have either inserted all the PHI nodes themselves or passes that need only to patch use-def and def-def chains for virtuals (e.g., DCE).
    • TODO_update_ssa_full_phi. Insert PHI nodes everywhere they are needed. No pruning of the IDF is done. This is used by passes that need the PHI nodes for O_j even if it means that some arguments will come from the default definition of O_j's symbol (e.g., pass_linear_transform).

      WARNING: If you need to use this flag, chances are that your pass may be doing something wrong. Inserting PHI nodes for an old name where not all edges carry a new replacement may lead to silent codegen errors or spurious uninitialized warnings.

    • TODO_update_ssa_only_virtuals. Passes that update the SSA form on their own may want to delegate the updating of virtual names to the generic updater. Since FUD chains are easier to maintain, this simplifies the work they need to do. NOTE: If this flag is used, any OLD->NEW mappings for real names are explicitly destroyed and only the symbols marked for renaming are processed.
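
    Here is the usage sketch promised above. It is not from the manual: the pass name "example" and the helper finish_transform are placeholders, and the pass_data layout assumes a recent GCC release.

        #include "config.h"
        #include "system.h"
        #include "coretypes.h"
        #include "backend.h"
        #include "tree-pass.h"
        #include "tree-into-ssa.h"

        /* Option 1: let the pass manager run the update after the pass
           finishes, by setting TODO_update_ssa in todo_flags_finish.  */
        const pass_data pass_data_example =
        {
          GIMPLE_PASS,      /* type */
          "example",        /* name (placeholder) */
          OPTGROUP_NONE,    /* optinfo_flags */
          TV_NONE,          /* tv_id */
          PROP_ssa,         /* properties_required */
          0,                /* properties_provided */
          0,                /* properties_destroyed */
          0,                /* todo_flags_start */
          TODO_update_ssa,  /* todo_flags_finish */
        };

        /* Option 2: update explicitly, once all replacement mappings have
           been registered and new symbols marked for renaming.  */
        static void
        finish_transform (void)
        {
          if (need_ssa_update_p (cfun))
            update_ssa (TODO_update_ssa);
        }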

    13.3.2 Examining SSA_NAME nodes

    The following macros can be used to examine SSA_NAME nodes; a short usage sketch follows:

    Macro: SSA_NAME_DEF_STMT (var)

    Returns the statement s that creates the SSA_NAME var. If s is an empty statement (i.e., IS_EMPTY_STMT (s) returns true), it means that the first reference to this variable is a USE or a VUSE.

    Macro: SSA_NAME_VERSION (var)

    Returns the version number of the SSA_NAME object var.
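
    Combining the two macros, a pass might inspect a name as in the following sketch. This is illustrative only: show_ssa_name is a hypothetical helper, and gimple_nop_p is used here as the gimple-level counterpart of the empty-statement test mentioned above.

        #include "config.h"
        #include "system.h"
        #include "coretypes.h"
        #include "backend.h"
        #include "tree.h"
        #include "gimple.h"
        #include "dumpfile.h"
        #include "gimple-pretty-print.h"

        static void
        show_ssa_name (tree name)
        {
          gcc_assert (TREE_CODE (name) == SSA_NAME);

          /* The version number distinguishes this name from all other
             versions in the function.  */
          fprintf (stderr, "version %u: ", SSA_NAME_VERSION (name));

          /* The defining statement; an empty statement means the first
             reference to the variable is a USE or a VUSE.  */
          gimple *def = SSA_NAME_DEF_STMT (name);
          if (gimple_nop_p (def))
            fprintf (stderr, "no defining statement (default definition)\n");
          else
            print_gimple_stmt (stderr, def, 0, TDF_SLIM);
        }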

    13.3.3 Walking the dominator tree

    Tree SSA function: void walk_dominator_tree (walk_data, bb)

    This function walks the dominator tree for the current CFG calling a set of callback functions defined in walk_data in domwalk.h. The callback functions you need to define give you hooks to execute custom code at various points during traversal:

    1. Once to initialize any local data needed while processing bb and its children. This local data is pushed into an internal stack which is automatically pushed and popped as the walker traverses the dominator tree.
    2. Once before traversing all the statements in bb.
    3. Once for every statement inside bb.
    4. Once after traversing all the statements and before recursing into bb's dominator children.
    5. It then recurses into all the dominator children of bb.
    6. After recursing into all the dominator children of bb it can, optionally, traverse every statement in bb again (i.e., repeating steps 2 and 3).
    7. Once after walking the statements in bb and bb's dominator children. At this stage, the block local data stack is popped. A sketch of this traversal order follows the list.
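
    The traversal order can be summarized with the sketch below. The hook names (push_block_local_data, before_statements, on_statement, after_statements, pop_block_local_data) are hypothetical stand-ins for the walk_data callbacks; first_dom_son and next_dom_son are the real dominance queries.

        #include "config.h"
        #include "system.h"
        #include "coretypes.h"
        #include "backend.h"
        #include "gimple.h"
        #include "gimple-iterator.h"
        #include "dominance.h"

        /* Hypothetical hooks standing in for the walk_data callbacks.  */
        extern void push_block_local_data (basic_block);
        extern void before_statements (basic_block);
        extern void on_statement (gimple *);
        extern void after_statements (basic_block);
        extern void pop_block_local_data (basic_block);

        static void
        walk (basic_block bb)
        {
          push_block_local_data (bb);            /* step 1 */
          before_statements (bb);                /* step 2 */

          for (gimple_stmt_iterator gsi = gsi_start_bb (bb); !gsi_end_p (gsi);
               gsi_next (&gsi))
            on_statement (gsi_stmt (gsi));       /* step 3 */

          after_statements (bb);                 /* step 4 */

          for (basic_block son = first_dom_son (CDI_DOMINATORS, bb);
               son;
               son = next_dom_son (CDI_DOMINATORS, son))
            walk (son);                          /* step 5: recurse */

          /* Step 6: optionally repeat steps 2 and 3 on bb here.  */

          pop_block_local_data (bb);             /* step 7 */
        }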