Many hyperlinks are disabled.
Use anonymous login
to enable hyperlinks.
Overview
Comment: | Tidy up commentary and calculation of INSERT and DELETE. Build the place to start code rewriting, and add some methods to start on renaming variables wholesale. |
---|---|
Timelines: | family | ancestors | descendants | both | kbk-pre |
Files: | files | file ages | folders |
SHA3-256: |
e901ec10df548165be3e9064c3bd7f57 |
User & Date: | kbk 2018-11-12 16:38:54.751 |
Context
2018-11-13
| ||
02:57 | Code complete for PRE - except that I don't know how to introduce a phi if PRE has made an expression available on two afferent flow graph edges rather than in a dominator. check-in: fa9f7e0eb3 user: kbk tags: notworking, kbk-pre | |
2018-11-12
| ||
16:38 | Tidy up commentary and calculation of INSERT and DELETE. Build the place to start code rewriting, and add some methods to start on renaming variables wholesale. check-in: e901ec10df user: kbk tags: kbk-pre | |
01:38 | Change signatures in preparation for code rewriting: include enough information in signatures so that INSERT can reconstruct an instruction check-in: c098311242 user: kbk tags: kbk-pre | |
Changes
Changes to quadcode/duchain.tcl.
︙ | ︙ | |||
24 25 26 27 28 29 30 31 32 33 34 35 36 37 | # run any time after creating the SSA representation. Many quadcode # transformations update the variables incrementally, using 'addUse', # 'removeUse' and 'renameUses' to do the job. A few make sufficiently # violent changes to the control flow that it is more effective simply # to discard and rebuild the relations. oo::define quadcode::transformer { # ud_du_chain -- # # Records ud- and du-chains for quadcode in SSA form # # Results: # None. | > > > > > > > > > > > > > > > > > > | 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 | # run any time after creating the SSA representation. Many quadcode # transformations update the variables incrementally, using 'addUse', # 'removeUse' and 'renameUses' to do the job. A few make sufficiently # violent changes to the control flow that it is more effective simply # to discard and rebuild the relations. oo::define quadcode::transformer { # reset_ud_du_chains -- # # Resets the ud- and du-chains # # Results: # None. # # When a pass such as partial redundancy elimination runs, it # renames all variables. Rather than unlinking individual variables, # it simply blows the ud- and du-chains away and starts afresh. method reset_ud_du_chains {} { set duchain {} set udchain {} } # ud_du_chain -- # # Records ud- and du-chains for quadcode in SSA form # # Results: # None. |
︙ | ︙ | |||
53 54 55 56 57 58 59 | method ud_du_chain {} { my debug-duchain { puts "before duchain" my dump-bb } | | < | 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 | method ud_du_chain {} { my debug-duchain { puts "before duchain" my dump-bb } my reset_ud_du_chains # Walk through the basic blocks, and the instructions in each block set b -1 foreach content $bbcontent { incr b set pc -1 foreach q $content { |
︙ | ︙ |
Changes to quadcode/pre.tcl.
︙ | ︙ | |||
13 14 15 16 17 18 19 | # The basic idea behind this optimization is that quadcode results are # partitioned into a set of equivalence classes, corresponding with # the values that they compute. Variables in the same class are known # to be equal, and so code that computes them can be removed if the values # are already available; loop-invariant values can be hoisted out of the # corresponding loops, and so on. # | | > > > > | 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 | # The basic idea behind this optimization is that quadcode results are # partitioned into a set of equivalence classes, corresponding with # the values that they compute. Variables in the same class are known # to be equal, and so code that computes them can be removed if the values # are already available; loop-invariant values can be hoisted out of the # corresponding loops, and so on. # # Sources of particular note that are cited with procedures that employ # their algorithms include: # # [DRES93] Drechsler, Karl-Heinz, and Manfred P. Stadel. "A variation of # Knoop, RĂ¼thing and Steffen's _Lazy Code Motion._ _SIGPLAN Notices_ 28:5 # (May, 1993), pp. 29-38. # # [SIMP96] Simpson, Loren Taylor. "Value-driven redundancy elimination." # PhD thesis, Rice University, Houston, Texas, April 1996. # https://www.clear.rice.edu/comp512/Lectures/Papers/SimpsonThesis.pdf namespace eval quadcode { |
︙ | ︙ | |||
72 73 74 75 76 77 78 | # Results: # Returns 1 if modifications were made, 0 if the method # accomplished nothing. # # Side effects: # Redundant calculations are removed. # | < | | | | | | | > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > < | > | | 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 | # Results: # Returns 1 if modifications were made, 0 if the method # accomplished nothing. # # Side effects: # Redundant calculations are removed. # # The removal of redundant calculations may expose additional # opportunities for optimization. In particular, it is possible # that phi operations will have become worthless, either because # two such operations become the same operation, or because # all inputs to a phi become the same input. It may be necessary # to repeat this optimization after cleaning up useless phi's. # # The partial redundancy elimination done here is a combination of the # 'lazy code motion' of [DRES93] (and explained again in [SIMP96] and # straigntforward available expression analysis. It follows the general # plan: # # Identify expressions that can be proven to compute the same value. # (Global Value Numbering - https://en.wikipedia.org/wiki/Value_numbering) # # Calculate 'defined' - a function from basic blocks to sets of values # that enumerates the values that have values assigned in a block # # Calculate 'available expressions' - a pair of functions from # basic blocks to sets of values that enumerate the values that # are known to be in quadcode registers at the start and the end of # the blocks. https://en.wikipedia.org/wiki/Available_expression # # Calculate the 'altered' function, which enumerates the set of # values that are changed in a basic block and therefore may # not flow through it. ([DRES93] uses the function TRANSP, which # is the complement of 'altered'). # # Calculate the 'ANTLOC' function, which is the set of computations # in a basic block that do not depend on fixed values assigned in the # block and could therefore be moved to a predecessor block. # # Using ANTLOC and 'altered' calculate the 'ANTIN' and 'ANTOUT' # functions. ANTIN is the set of values anywhere later in the program # that could be moved to a point before a basic block. ANTOUT is # the set of values anywhere later in the program that could be # pulled up into the basic block. # # Calculate EARLIEST, which is a function on flowgraph edges # rather than basic blocks, and identifies the set of values for which # the edge is the earliest point at which the calculation might appear. # # Calculate LATER, which is a function on flowgraph edges, and # LATERIN, which is a function on basic blocks. They are sets of # computations that could be done on the given edge or at the start # of the given basic block with the same effect as if they were # done at EARLIEST. # # Calculate INSERT, which is a function from flowgraph edges to sets # of values, that enumerates the places to insert computations that # have been moved upward. # # Calculate DELETE, which is a function from basic blocks to sets of # values, that enumerates the places to remove values that have # been covered by INSERT (or naturally by being computed redundantly # in an earler block.) # # Finally, rewrite all the variable names in the program, while # performing the required insertions and deletions. In addition to # deleting the calculations identified by DELETE, delete any that # are performed redundantly within a single basic block. oo::define quadcode::transformer method partialredundancy {} { my debug-pre { puts "Before partial redundancy elimination:" my dump-bb my variable pre_exemplar } # Compute Global Value Numbering for the values present in the quadcode lassign [my pre_gvn] VN pre_exemplar cands my debug-pre { puts "Value numbering:" dict for {k n} [lsort -stride 2 -index 0 -ascii -increasing $VN] { set v [lindex $pre_exemplar $n] if {[lindex $v 0] ne $k} { puts " found congruence $k -> $n ($v)" } } puts "Candidates for PRE: [my pre_format_bitset $cands]" } # Compute the 'defined' function for the basic blocks set defined [my pre_defined $VN] my debug-pre { puts "Defined expressions:" set b -1 foreach vset $defined { puts "[incr b]: [my pre_format_bitset $vset]" |
︙ | ︙ | |||
189 190 191 192 193 194 195 | # Compute LATER, which is an indication that an expression # may be moved later than an edge (LATER) or the start of # a block (LATERIN) lassign [my pre_later $antloc $EARLIEST] LATERIN LATER | > > | > > > | > | | < | > > > > | | > > | | > | | | | < < < < < < < | < < < | > | > > | | | | 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 | # Compute LATER, which is an indication that an expression # may be moved later than an edge (LATER) or the start of # a block (LATERIN) lassign [my pre_later $antloc $EARLIEST] LATERIN LATER # Compute INSERT, which records where we want to insert new # definitions of values set INSERT [my pre_insert $LATER $LATERIN] # Compute DELETE, which records where we want to remove existing # definitions of values set DELETE [my pre_delete $antloc $LATERIN] my debug-pre { puts "Actions to take:" set b -1 foreach del $DELETE { incr b if {$del != 0} { puts "Basic block $b:\n delete [my pre_format_bitset $del]" } foreach s [my bbsucc $b] { set edge [list $b $s] set ins [dict get $INSERT $edge] if {$ins != 0} { puts "Graph edge $b->$s:\n \ insert [my pre_format_bitset $ins]" } } } } # OK, we're ready to do the variable rewriting set changed [my pre_rewrite_vars $INSERT $DELETE $pre_exemplar] unset pre_exemplar return $changed } # quadcode::transformer method pre_gvn -- # # Calculates a Global Value Numbering for values in a quadcode # sequence. # # Results: # Returns a three-element list. # # The first element is a dictionary whose keys are the names of # program variables and whose values are value numbers. Two value # numbers are distinct if it cannot be shown that they result # from the same computation. # # The second element is a dictionary whose keys are the value # numbers and whose values are ordered pairs: program variables # that represent the values, and the expressions that compute the # values. # # The third element is a bit set containing those values that are # candidates - that is, could conceivably be moved from block to block # # This procedure implements the 'RPO' algorithm presented in # Figure 4.3 of [SIMP96] (see references in the file header. # While perhaps not as fast, it is significantly simpler than # the 'SCC' algorithm and yields the same results. # # This procedure assumes that 'deadbb' has been run before it, # so that it will find basic blocks already in depth-first order. oo::define quadcode::transformer method pre_gvn {} { |
︙ | ︙ | |||
434 435 436 437 438 439 440 | # AVIN is a list of bit vectors giving the sets of values available # at the start of the basic blocks. # # AVOUT is a list of bit vectors giving thes sets of values available # at the end of the basic blocks # # This procedure solves the data flow equations given in Figure 5.6 | | | 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 | # AVIN is a list of bit vectors giving the sets of values available # at the start of the basic blocks. # # AVOUT is a list of bit vectors giving thes sets of values available # at the end of the basic blocks # # This procedure solves the data flow equations given in Figure 5.6 # on page 63 of [SIMP96]. oo::define quadcode::transformer method pre_avail {defined} { my debug-pre { puts "Calculate available expressions:" } |
︙ | ︙ | |||
546 547 548 549 550 551 552 | # on exit from the program's blocks # # Results: # Returns a list indexed by basic block number. The list values # are dictionaries whose keys are the exemplars of altered values # and whose values are immaterial # | | | 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 | # on exit from the program's blocks # # Results: # Returns a list indexed by basic block number. The list values # are dictionaries whose keys are the exemplars of altered values # and whose values are immaterial # # This is the algorithm of Figure 7.1 on page 75 of [SIMP96]. Simpson # is quite unclear on the issue that (AVOUT-AVIN) should apply only to # fixed values, and then the altered candidates appear only as a # consequence of tracking the dependents on thosed fixed # values. Nevertheless, if this filtering is not done, nothing ever # becomes locally anticipable, because its own calculation turns up in # (AVOUT-AVIN). |
︙ | ︙ | |||
611 612 613 614 615 616 617 | # altered - List of bit sets, indexed by basic block number, indicating # what values are altered within the block # # Results: # Returns a list of bit sets, indexed by basic block number, indicating # what values are locally anticipable within the block. Note that # | | | 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 | # altered - List of bit sets, indexed by basic block number, indicating # what values are altered within the block # # Results: # Returns a list of bit sets, indexed by basic block number, indicating # what values are locally anticipable within the block. Note that # # Page 76 of [SIMP96]: antloc_b = defined_b - altered_b # Simpson does not appear to constrain this to candidates, but it doesn't # ever make sense to move a non-candidate. oo::define quadcode::transformer method pre_antloc {cands defined altered} { set ANTLOC {} set b -1 |
︙ | ︙ | |||
649 650 651 652 653 654 655 | # entry into basic blocks. # # ANTOUT is a list indexed by basic block number of bit vectors # that enumerate the candidate values that are anticipable on # exit from basic blocks # # This procedure solves data flow equations given in Figure 6.3 | | | 712 713 714 715 716 717 718 719 720 721 722 723 724 725 726 | # entry into basic blocks. # # ANTOUT is a list indexed by basic block number of bit vectors # that enumerate the candidate values that are anticipable on # exit from basic blocks # # This procedure solves data flow equations given in Figure 6.3 # on page 69 of [SIMP96]. oo::define quadcode::transformer method pre_ant_global {altered antloc} { my debug-pre { puts "Calculate global anticipable expressions:" } |
︙ | ︙ | |||
763 764 765 766 767 768 769 | # Returns a dictionary whose keys are ordered pairs {from to} # representing edges in the control flow graph (where 'from' and # 'to' are basic block numbers), and whose values are bit vectors # representing the sets of values for which the edges are the # earliest points where the values may be computed. # # This analysis is part of the data flow equations in Figure 6.3 on | | | 826 827 828 829 830 831 832 833 834 835 836 837 838 839 840 | # Returns a dictionary whose keys are ordered pairs {from to} # representing edges in the control flow graph (where 'from' and # 'to' are basic block numbers), and whose values are bit vectors # representing the sets of values for which the edges are the # earliest points where the values may be computed. # # This analysis is part of the data flow equations in Figure 6.3 on # page 69 of [SIMP96]. oo::define quadcode::transformer method pre_earliest {AVOUT altered ANTIN ANTOUT} { set EARLIEST {} # Walk through flowgraph edges (i -> j) |
︙ | ︙ | |||
819 820 821 822 823 824 825 | # in the program. # # LATERIN is a list indexed by basic block number of bit vectors # that enumerate the candidate values that may be calculated # on entry to the corresponding basic block or later in the program. # # This procedure solves part of the dataflow equations shown in | | | 882 883 884 885 886 887 888 889 890 891 892 893 894 895 896 | # in the program. # # LATERIN is a list indexed by basic block number of bit vectors # that enumerate the candidate values that may be calculated # on entry to the corresponding basic block or later in the program. # # This procedure solves part of the dataflow equations shown in # Figure 6.4 on page 70 of [SIMP96]. oo::define quadcode::transformer method pre_later {ANTLOC EARLIEST} { my debug-pre { puts "Calculate LATER:" } |
︙ | ︙ | |||
924 925 926 927 928 929 930 931 932 933 934 935 936 937 | } $worklist destroy return [list $LATERIN $LATER] } # quadcode::transformer method pre_format_bitset -- # # Formats a set of values for debug printing # # Parameters: # vset - Set of values to be printed | > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > | 987 988 989 990 991 992 993 994 995 996 997 998 999 1000 1001 1002 1003 1004 1005 1006 1007 1008 1009 1010 1011 1012 1013 1014 1015 1016 1017 1018 1019 1020 1021 1022 1023 1024 1025 1026 1027 1028 1029 1030 1031 1032 1033 1034 1035 1036 1037 1038 1039 1040 1041 1042 1043 1044 1045 1046 1047 1048 1049 1050 1051 1052 1053 1054 1055 1056 1057 1058 1059 1060 1061 1062 1063 1064 1065 1066 1067 1068 1069 1070 1071 1072 1073 1074 1075 1076 1077 1078 1079 1080 1081 1082 1083 1084 1085 1086 1087 1088 1089 1090 1091 1092 1093 1094 1095 1096 1097 1098 1099 1100 1101 1102 1103 1104 1105 1106 1107 1108 1109 | } $worklist destroy return [list $LATERIN $LATER] } # quadcode::transformer method pre_insert -- # # Calculates where new definitions of values must be inserted # in partial redundancy elimination. # # Parameters: # LATER - Dictionary whose keys are ordered pairs {i j} # identifying edges {i -> j} in the control flow graph. # The values are bit sets that enumerate the candidate values # that may be computed on the corresponding edge or a later # point in the program. # # LATERIN - List indexed by basic block number of bit vectors # that enumerate the candidate values that may be calculated # on entry to the corresponding basic block or later in the # program. # # Results: # Returns a dictionary whose keys are ordered pairs {i j} identifying # edges (i -> j) in the control flow graph. The values are bit sets # that enumerate candidate values that need newly-created assignments # on those edges oo::define quadcode::transformer method pre_insert {LATER LATERIN} { set INSERT {} # Walk through the edges and set INSERT on any edge for which # the value is evaluated LATER, but not LATER-IN on the following node dict for {edge exprs} $LATER { lassign $edge from to dict set INSERT $edge [expr {$exprs & ~[lindex $LATERIN $to]}] } return $INSERT } # quadcode::transformer method pre_delete -- # # Calculates the points in the program where assignments to a given # value must be deleted when performing partial redundancy elimination. # # Parameters: # ANTLOC - List of bit sets, indexed by basic block number, indicating # what values are locally anticipable within the block. # LATERIN - List indexed by basic block number of bit vectors # that enumerate the candidate values that may be calculated # on entry to the corresponding basic block or later in the # program. # # Results: # Returns a list indexed by basic block number whose values are # bit vectors identifying values whose assignments in the given # block are superfluous. oo::define quadcode::transformer method pre_delete {ANTLOC LATERIN} { set DELETE {} set b -1 foreach antloc $ANTLOC laterin $LATERIN { incr b if {$b == 0} { lappend DELETE 0 } else { lappend DELETE [expr {$antloc & ~$laterin}] } } return $DELETE } # quadcode::transformer method pre_rewrite_vars # # Once all the information is available, rewrites the variables # in a program, inserting and deleting computations of values, # to perform partial redundancy elimination. # # Parameters: # INSERT - Dictionary whose keys are ordered pairs {from to} # identifying flowgraph edges and whose values are # bit sets identifying the values whose computation should # be inserted onto the edges. # DELETE - List indexed by basic block number whose values are bit # sets identifying the values whose computation may be # removed from the blocks. # exemplar - List indexed by value number containing ordered pairs, # {exemplar signature} that give an exemplar variable name # and the signature of a quadcode instruction that computes # the value. # # Results: # Returns 1 if the program has been rewritten in any substantial way, # 0 if only renamings have been performed # # Side effects: # All program variables are renumbered, redundant computations are # removed, and invariant computations are hoisted. # # Note that this transformation can leave meaningless phi operations # behind, which will be tidied up in dead code elimination. oo::define quadcode::transformer method pre_rewrite_vars {INSERT DELETE exemplar} { return 0 } # quadcode::transformer method pre_format_bitset -- # # Formats a set of values for debug printing # # Parameters: # vset - Set of values to be printed |
︙ | ︙ |
Changes to quadcode/ssa.tcl.
︙ | ︙ | |||
661 662 663 664 665 666 667 668 669 670 671 672 673 674 | if {[dict exists $r $v]} { return [dict get $r $v] } set nv [my newVarInstance $v] dict set r $v $nv return $nv } # quadcode::transformer method newVarInstance # # Creates a new instance of the given variable # # Parameters: # name - Name of the variable or of an existing instance | > > > > > > > > > > > > > > > > | 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 689 690 | if {[dict exists $r $v]} { return [dict get $r $v] } set nv [my newVarInstance $v] dict set r $v $nv return $nv } # quadcode::transformer method resetVarCounts # # Resets all instance counts of all variables. # # Results: # None. # # When a pass such as partial redundancy elimination runs, it rewrites # all variable names throughout the program. Rather than having runaway # variable indices, it calls this routine to reset all counts for # variable names. oo::define quadcode::transformer method resetVarCounts {} { unset varcount } # quadcode::transformer method newVarInstance # # Creates a new instance of the given variable # # Parameters: # name - Name of the variable or of an existing instance |
︙ | ︙ |