Check-in [ab89bbf879]
Bounty program for improvements to Tcl and certain Tcl packages.
Tcl 2019 Conference, Houston/TX, US, Nov 4-8
Send your abstracts to [email protected]
or submit via the online form by Sep 9.

Many hyperlinks are disabled.
Use anonymous login to enable hyperlinks.

Overview
Comment:Fix a bunch of Markdown formatting issues in callframe motion implementation plan
Timelines: family | ancestors | kbk-refactor-callframe
Files: files | file ages | folders
SHA3-256: ab89bbf87963cfd1cfed4b955de205558e3d8cd27be566b1c820610a8893f712
User & Date: kbk 2019-07-21 19:56:02
Context
2019-07-21
19:56
Fix a bunch of Markdown formatting issues in callframe motion implementation plan Leaf check-in: ab89bbf879 user: kbk tags: kbk-refactor-callframe
2019-06-08
22:21
Remove cfRedundancy' header - added prematurely before method structure actually designed. check-in: 74a853f4cd user: kbk tags: kbk-refactor-callframe
Changes

Changes to doc/20190216callframe/callframe.md.

303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
...
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
...
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
...
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
...
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
optimization. In particular, LOAD-LOAD and STORE-LOAD redundancies can
be eliminated by checking each `moveFromCallFrame` operation in turn
for whether it is accessing an available value. If it is, it is
eliminated and its result is replaced with the available value.

### Availability analysis: equivalence sets

LOAD-STORE pairs can also be detected, by examining 'moveToCallFrame'
instructions and eliminating them if the variable being written is
available, and known to be equal to the Tcl value in the callframe.

Therefore, in the availability analysis, as well as accumulating the
sets of available Tcl values and the names in the callframe that
correspond to them, we must track the sets of names in a given
callframe that correspond correspond to each SSA value. That way, 
................................................................................
Availability by itself is not sufficient for loop-invariant code
motion. In addition, we need the concept of _anticipability_. A (_cf_,
_name_) pair is _anticipable_ at a given point in the program if every
execution path forward from that point contains a `moveFromCallFrame`
retrieving that value prior to the value's being modified.

Calculating anticipability requires multiple traversals of the program
in postdominator order. Nothing is anticipable on 'return'. For blocks
that do not return, we begin by taking the intersection of values
anticipable at their successors:

![\texttt{ANTIC\_OUT}[B] =  
\bigcap_{P\in \texttt{succ}[B]} \texttt{ANTIC\_IN}[P] 
\right)](./anticout.svg)

................................................................................
predecessor), in dominator order. For each (_cf_, _name_) that is 
anticipable at the entry to the block, we check availability in the
predecessor blocks. If the expression is available in at least one
predecessor, but not all predecessors, we insert `moveFromCallFrame`
instructions for it in the predecessors where it is unavailable. 

This is another analysis that must be iterated to
convergence. Inserting these 'moveFromCallFrame' operations produces
new available variables, which can in turn expose further
opportunities for code motion. \[VAND04\] discusses this in more
detail in chapter 4.

### Possible combination of passes

This analysis closely mirrors what happens in partial redundancy
................................................................................

`doSomething cfout cfin moreArgs`

the input callframe will never again be used.
  
## TASK 4: Loop-invariant stores

We have seen in Task 3 that'moveFromCallFrame' instructions can be
placed speculatively to convert partial availability of a needed
expression to full availability (and remove a 'moveFromCallFrame'
later in the program on a 'busier' execution path). In a similar manner,
'moveToCallFrame' instructions can be placed speculatively at the
head of blocks in order to make partially dead assignments ("faint"
assignments in the terminology of \[KNOO94\]) fully dead and eliminate
a 'moveToCallFrame' on an eariler path of the program. 

\[LO98\] demonstrates how this partial liveness analysis is precisely
dual to partial availability analysis, and introduces a dual form to
Static Single Assignment (SSA), called Static Single Use (SSU). This
latter form has similar advantages to SSA in that the dependency
chain, here from definition to use rather than from use to definition,
is factored so that there are explicit deviation points (dual to the merge
................................................................................
reason, the more computationally complex analysis of \[KNOO94\] will
be followed, but modified to operate on basic blocks rather than
individual instructions.

We begin with the liveness analysis of Task 2. We augment 'liveness'
with the concept of partial liveness. The data flow equations will look like

![\textt{LIVE\_OUT}[B] = \begin{Bmatrix} \displaystyle \bigcap_{S \in \textt{succ}[B]} \textt{LIVE\_IN}[S],& S \text{ is not an exit block} \\ \displaystyle \left\{ v \in \textt{LOCALVARS}~|~v~\text{might alias a nonlocal variable}  \right\},&\text{otherwise}\end{matrix}](liveout.svg)

![\textt{PLIVE\_OUT}[B] = \begin{Bmatrix} \displaystyle \bigcup_{S \in \textt{succ}[B]} \textt{PLIVE\_IN}[S],& S \text{ is not an exit block} \\ \displaystyle \left\{ v \in \textt{LOCALVARS}~|~v~\text{might alias a nonlocal variable}  \right\},&\text{otherwise}\end{matrix}](pliveout.svg)

![\textt{LIVE\_IN}[B] = \textt{process}(B, \textt{LIVE\_OUT}[B])](livein.svg)

![\textt{PLIVE\_IN}[B] = \textt{process}(B, \textt{PLIVE\_OUT}[B])](plivein.svg)

In these equations, _process_ represents the basic-block-level upward
analysis from Task 2. The _LIVE_ and _PLIVE_ sets are sets of
(callframe, variable) pairs, and it is assumed that the arguments to
the intersection and union are translated if the callframe appears in
a φ operation.

Liveness corresponds to the 'availability' of the partial redundancy
elimination. Just as a 'moveFromCallFrame' may not be hoisted over the
point where it is available, a 'moveToCallFrame' may not be sunk past the
point where it is live.

Similarly, 'moveToCallFrame' creates values and corresponds to
anticipability. Just as it is of no benefit to insert a
'moveFromCallFrame' for a value that is not anticipated, it is of no
benefit to insert a 'moveToCallFrame' for a value that is already present.



## References

\[DREC93\] Karl-Heinz Drechsler and Manfred P. Stadel. "A variation of
Knoop, Rüthing, and Steffen's _Lazy Code Motion_". _ACM SIGPLAN






|







 







|







 







|







 







|

|

|


|







 







|

|












|
|


|

|
|







303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
...
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
...
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
...
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
...
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
optimization. In particular, LOAD-LOAD and STORE-LOAD redundancies can
be eliminated by checking each `moveFromCallFrame` operation in turn
for whether it is accessing an available value. If it is, it is
eliminated and its result is replaced with the available value.

### Availability analysis: equivalence sets

LOAD-STORE pairs can also be detected, by examining `moveToCallFrame`
instructions and eliminating them if the variable being written is
available, and known to be equal to the Tcl value in the callframe.

Therefore, in the availability analysis, as well as accumulating the
sets of available Tcl values and the names in the callframe that
correspond to them, we must track the sets of names in a given
callframe that correspond correspond to each SSA value. That way, 
................................................................................
Availability by itself is not sufficient for loop-invariant code
motion. In addition, we need the concept of _anticipability_. A (_cf_,
_name_) pair is _anticipable_ at a given point in the program if every
execution path forward from that point contains a `moveFromCallFrame`
retrieving that value prior to the value's being modified.

Calculating anticipability requires multiple traversals of the program
in postdominator order. Nothing is anticipable on `return`. For blocks
that do not return, we begin by taking the intersection of values
anticipable at their successors:

![\texttt{ANTIC\_OUT}[B] =  
\bigcap_{P\in \texttt{succ}[B]} \texttt{ANTIC\_IN}[P] 
\right)](./anticout.svg)

................................................................................
predecessor), in dominator order. For each (_cf_, _name_) that is 
anticipable at the entry to the block, we check availability in the
predecessor blocks. If the expression is available in at least one
predecessor, but not all predecessors, we insert `moveFromCallFrame`
instructions for it in the predecessors where it is unavailable. 

This is another analysis that must be iterated to
convergence. Inserting these `moveFromCallFrame` operations produces
new available variables, which can in turn expose further
opportunities for code motion. \[VAND04\] discusses this in more
detail in chapter 4.

### Possible combination of passes

This analysis closely mirrors what happens in partial redundancy
................................................................................

`doSomething cfout cfin moreArgs`

the input callframe will never again be used.
  
## TASK 4: Loop-invariant stores

We have seen in Task 3 that `moveFromCallFrame` instructions can be
placed speculatively to convert partial availability of a needed
expression to full availability (and remove a `moveFromCallFrame`
later in the program on a 'busier' execution path). In a similar manner,
`moveToCallFrame` instructions can be placed speculatively at the
head of blocks in order to make partially dead assignments ("faint"
assignments in the terminology of \[KNOO94\]) fully dead and eliminate
a `moveToCallFrame` on an eariler path of the program. 

\[LO98\] demonstrates how this partial liveness analysis is precisely
dual to partial availability analysis, and introduces a dual form to
Static Single Assignment (SSA), called Static Single Use (SSU). This
latter form has similar advantages to SSA in that the dependency
chain, here from definition to use rather than from use to definition,
is factored so that there are explicit deviation points (dual to the merge
................................................................................
reason, the more computationally complex analysis of \[KNOO94\] will
be followed, but modified to operate on basic blocks rather than
individual instructions.

We begin with the liveness analysis of Task 2. We augment 'liveness'
with the concept of partial liveness. The data flow equations will look like

![\textt{LIVE\_OUT}[B] formula](liveout.svg)

![\textt{PLIVE\_OUT}[B] formula](pliveout.svg)

![\textt{LIVE\_IN}[B] = \textt{process}(B, \textt{LIVE\_OUT}[B])](livein.svg)

![\textt{PLIVE\_IN}[B] = \textt{process}(B, \textt{PLIVE\_OUT}[B])](plivein.svg)

In these equations, _process_ represents the basic-block-level upward
analysis from Task 2. The _LIVE_ and _PLIVE_ sets are sets of
(callframe, variable) pairs, and it is assumed that the arguments to
the intersection and union are translated if the callframe appears in
a φ operation.

Liveness corresponds to the 'availability' of the partial redundancy
elimination. Just as a `moveFromCallFrame` may not be hoisted over the
point where it is available, a `moveToCallFrame` may not be sunk past the
point where it is live.

Similarly, `moveToCallFrame` creates values and corresponds to
anticipability. Just as it is of no benefit to insert a
`moveFromCallFrame` for a value that is not anticipated, it is of no
benefit to insert a `moveToCallFrame` for a value that is already present.



## References

\[DREC93\] Karl-Heinz Drechsler and Manfred P. Stadel. "A variation of
Knoop, Rüthing, and Steffen's _Lazy Code Motion_". _ACM SIGPLAN