Many hyperlinks are disabled.
Use anonymous login
to enable hyperlinks.
Overview
Comment: | More about code motion |
---|---|
Timelines: | family | ancestors | descendants | both | kbk-refactor-callframe |
Files: | files | file ages | folders |
SHA3-256: |
2f51c8d73bce7855797f67aa11f7a5dc |
User & Date: | kbk 2019-02-25 04:41:47.424 |
Context
2019-03-26
| ||
20:45 | Oops - didn't commit images for callframe.md! check-in: a19796b4e5 user: kbk tags: kbk-refactor-callframe | |
2019-02-25
| ||
04:41 | More about code motion check-in: 2f51c8d73b user: kbk tags: kbk-refactor-callframe | |
2019-02-18
| ||
04:25 | Write up first couple of tasks check-in: 643eefa7e6 user: kbk tags: kbk-refactor-callframe | |
Changes
Changes to doc/20190216callframe/callframe.md.
︙ | ︙ | |||
236 237 238 239 240 241 242 243 244 245 246 247 248 249 | the instruction, and all other names that potentially alias the given name lose their values. + `invoke[Expanded]` - The available values in the source callframe become available in the destination callframe. Then, any names that the invoked command might change and any names that might alias them, lose their available values. + `directSet` etc. - Any name that might be an alias of the changing variable loses its available value. (If the changing variable name cannot be determined, all names might be aliases, because in this case we cannot determine that the name is fully qualified.) + `variable`, `upvar`, `nsupvar` - Any existing value for the given | > > > | 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 | the instruction, and all other names that potentially alias the given name lose their values. + `invoke[Expanded]` - The available values in the source callframe become available in the destination callframe. Then, any names that the invoked command might change and any names that might alias them, lose their available values. + `extractCallframe` - The available values in the source callframe become available in the destination callframe. + `directSet` etc. - Any name that might be an alias of the changing variable loses its available value. (If the changing variable name cannot be determined, all names might be aliases, because in this case we cannot determine that the name is fully qualified.) + `variable`, `upvar`, `nsupvar` - Any existing value for the given |
︙ | ︙ | |||
276 277 278 279 280 281 282 | block's new available set. The calculation of this set may in turn change the available values on φ instructions at the start of loops, so the calculation must be applied to convergence over all blocks in the program. Those who are versed in formal data flow analysis are welcome to separate the descriptions of the operations given above into separate | | | 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 | block's new available set. The calculation of this set may in turn change the available values on φ instructions at the start of loops, so the calculation must be applied to convergence over all blocks in the program. Those who are versed in formal data flow analysis are welcome to separate the descriptions of the operations given above into separate sets `AVAIL\_GEN` and `AVAIL\_KILL` so that the problem can be given as the forward analysis: ![\texttt{AVAIL\_OUT}[B] = \texttt{AVAIL\_GEN}[B] ~ \cup ~ \left( \overline{\texttt{AVAIL\_KILL}[B]} ~ \cap \bigcap_{P\in \texttt{pred}[B]} \texttt{AVAIL\_OUT}[P] \right) |
︙ | ︙ | |||
358 359 360 361 362 363 364 365 366 367 368 369 370 371 | input callframe. We start with the set required relative to the output callframe. All variables that may be read or written, and all possible aliases, are added to the set. (Since we do not in general examine whether a procedure must alter a given variable, but only whether it may alter it, we require in general that the previous value of the variable must already be in sync.) + `directGet` etc. - The variables required will be relative to the input callframe. We start with the set required relative to the output callframe. Any variable that might alias the variable being retrieved or set in the instruction is added to the set. + `variable`, `upvar`, `nsupvar` - The variables required will be relative to the input callframe. We start with the set required | > > > | 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 | input callframe. We start with the set required relative to the output callframe. All variables that may be read or written, and all possible aliases, are added to the set. (Since we do not in general examine whether a procedure must alter a given variable, but only whether it may alter it, we require in general that the previous value of the variable must already be in sync.) + `extractCallframe` - The variables required in the source callframe are precisely those anticipated in the destination callframe. + `directGet` etc. - The variables required will be relative to the input callframe. We start with the set required relative to the output callframe. Any variable that might alias the variable being retrieved or set in the instruction is added to the set. + `variable`, `upvar`, `nsupvar` - The variables required will be relative to the input callframe. We start with the set required |
︙ | ︙ | |||
388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 | designates a variable that is not live in the given callframe may be safely removed. Either it is a dead STORE (that is, neither it nor any potential alias will ever be accessed again), or it participates in a STORE-STORE pair. ## TASK 3: Anticipability analysis and loop-invariant loads ## TASK 4: Loop-invariant stores ## References | > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > | | | > | > > > > > > | 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 | designates a variable that is not live in the given callframe may be safely removed. Either it is a dead STORE (that is, neither it nor any potential alias will ever be accessed again), or it participates in a STORE-STORE pair. ## TASK 3: Anticipability analysis and loop-invariant loads With what has been developed in Tasks 1 and 2, we can get rid of locally redundant `moveToCallFrame` and `moveFromCallFrame` instructions. What remains is to attack loop-invariant ones: that is, `moveFromCallFrame` that can be safely moved above a loop or `moveToCallFrame` that can be safely be moved below. We use a plan of attack based loosely on \[DREC93\]. ### Anticipability We already have the concept of _availability,_ developed in Task 1. Let us introduce some notation for available values. `AVAIL_OUT[B]` will be notation for the set of ordered pairs (_cf_. _name_) available in SSA registers on exit from block `B` of the program. (In the pairs, _cf_ names an SSA variable designating a callframe, and _name_ is the name of a variable in that frame. Similarly, `AVAIL_IN[B]`will be the set of (_cf_, _name_) pairs available on entry to block `B`, and we know that ![\texttt{AVAIL\_IN}[B] = \bigcap_{P\in \texttt{pred}[B]} \texttt{AVAIL\_OUT}[P] \right)](./availin.svg) (This discussion glosses over the fact that the callframe reference must be translated every time that it passes through a φ instruction. The translation is straightforward - for the intersection above, the available value will be the output of the phi; for the anticipability analysis below, the anticipable value will be the input of the phi on the code path being analyzed.) Availability by itself is not sufficient for loop-invariant code motion. In addition, we need the concept of _anticipability_. A (_cf_, _name_) pair is _anticipable_ at a given point in the program if every execution path forward from that point contains a `moveFromCallFrame` retrieving that value prior to the value's being modified. Calculating anticipability requires multiple traversals of the program in postdominator order. Nothing is anticipable on 'return'. For blocks that do not return, we begin by taking the intersection of values anticipable at their successors: ![\texttt{ANTIC\_OUT}[B] = \bigcap_{P\in \texttt{succ}[B]} \texttt{ANTIC\_IN}[P] \right)](./anticout.svg) (Note that in computing this function, we translate the callframe reference if it appears in a φ operation at the start of the block.) and then traverse the instructions of the block in reverse order. + `moveFromCallFrame` - The given name and callframe reference become anticipable. + `moveToCallFrame` - The given name and callframe reference, if they are anticipable, are removed from the anticipable set, along with any names that might potentially be aliases of the given name .This becomes the anticipable set for the input callframe. The output callframe is no longer relevant and may be forgotten. + `invoke[Expanded]` - Any variables potentially modified by the invoked command, and any potential aliases, are removed from the anticipable set, and the new set becomes the anticipable set for the input callframe. The output callframe is no longer relevant and may be forgotten. + `extractCallFrame` - The store operations anticipated in the source callframe are precisely those anticipated in the destination callframe, and the destination callframe is no longer relevant. + `directGet` - Since (as stated above) we ignore the possibility that a read trace on one variable may modify another, we simply copy all anticipable references. + `directSet`, etc. - Any local variable that might alias the variable being modified is no longer anticipable, and once again the callframes are swapped as with `moveToCallFrame` and `invoke`. + `variable`, `upvar`, `nsupvar` - The local variable and any aliases are no longer anticipable. + φ - Analysis stops at a φ instruction that references a callframe. When analyzing a predecessor block, `ANTIC_IN` for its successors will be translated to refer to the callframe that is input to the φ. If a block has no φ instruction for the callframe, the same callframe will be used when constructing its `ANTIC_IN` when analyzing a predecessor. ### Placement of load instructions Once the `AVAIL_OUT` and `ANTIC_IN` sets are constructed, we can determine where to place additional `moveFromCallFrame` instructions that will perform loop-independent code motion. Since the entry block of a loop is always a merge point in the control flow, its predecessor blocks will be single-exit blocks owing to critical edge splitting. We need consider only insertion of `moveFromCallFrame` instructions at these points. The general plan is sketched in Section 4.3.2 of \[VanD04\]. We examine merge points in the program (blocks with more than one predecessor), in dominator order. For each (_cf_, _name_) that is anticipable at the entry to the block, we check availability in the predecessor blocks. If the expression is available in at least one predecessor, but not all predecessors, we insert `moveFromCallFrame` instructions for it in the predecessors where it is unavailable. This is another analysis that must be iterated to convergence. Inserting these 'moveFromCallFrame' operations produces new available variables, which can in turn expose further opportunities for code motion. \[VAND04\] discusses this in more detail in chapter 4. ### Possible combination of passes This analysis closely mirrors what happens in partial redundancy elimination (`quadcode/pre.tcl`). It may be possible to combine the two into a single pass. The principal change will be that availability analysis will need some refactoring, to discipline the size of the `AVAIL` sets. Unlike the description in \[SIMP96\], we do not need to shrink any of these sets for safety, because every time we perform an operation that might affect a value in the callframe, we assign a new SSA value to the callframe. Nevertheless, the quadcode sequence is structured so that only one callframe is in flight at any given time. Even if procedure inlining gives rise to additional callframes, we still have the weaker condition that if there is an instruction like `doSomething cfout cfin moreArgs` the input callframe will never again be used. ## TASK 4: Loop-invariant stores ## References \[DREC93\] Karl-Heinz Drechsler and Manfred P. Stadel. "A variation of Knoop, Rüthing, and Steffen's _Lazy Code Motion_". _ACM SIGPLAN Notices_ 28:5 (May, 1993), pp. 29-38. [<https://dl.acm.org/citation.cfm?id=152823>] \[SIMP96\] Loren Taylor Simpson. 1996. "Value-Driven Redundancy Elimination". Ph.D. Dissertation. Rice University, Houston, TX, USA. AAI9631092. [<https://www.clear.rice.edu/comp512/Lectures/Papers/SimpsonThesis.pdf>] \[VanD04\] VanDrunen, Thomas J. "Partial redundancy elimination for global value numbering." PhD thesis, Purdue University, West Lafayette, Indiana (August, 2004) [<ftp://ftp.cs.purdue.edu/pub/hosking/papers/vandrunen.pdf>] |