Bill Pugh was ill. Instead of that discussion to warm up about this topic.
Moores law:
Proebstings law of optimization:
In OCT almost everything is dynamic: indirect addressing always.
nice framework CRL for generic analyses
Uses ILP. worst case execution times. estimation of bounds.
TDL target description language for target description.
Postpass techniques seem to be a la mode since they do not touch the compiler
AVIV Hanano, ICG Bashford/leupers 99 Search techniques
PROPAN uses ILP to integrate several techniques (RA, IS, unit binding)
time or order-based ILP.
constraints that model data dependencies or flow modelling, register allocations.
Number of constraints and variables is quadratic.
Is there a techique besides ILP to solve constraints?
Wolf meint, im Scheduling wären bessere Abschätzungen möglich, wie gut man an das Optimium herankommt.
first work on PLDI 96 by Ramalingan
quite different; finding out with which probability a DFA result is valuable
Rita, an elimination framework for solving dfa problems
A technique to group and order rule sets of rewrite systems, to select one of the many normal forms of an indeterministic system.
Jalapeno is a set of compilers (fast one, good one).
IBM gave up its static Java compiler since it was too since it was too complicated to introduce dynamism into it.
Quicksilver generates parametrizable binary code for Java VMs.
Uses Relocation techniques. to bind variable and function addresses
very interesting.
Good register allocation. Must be fast. Enhanced left-edge algorithm from Tucker 75. very interesting. SSA-based control flow graph. PACT99 paper. Unitof allocation and optimizuation: tree region ofblocks.
latte.snu.ac.kr
Data flow may not follow all paths, therefor bidirectionality is advantageous.
Critique on Muchnicks book because it bans bidirectional flows...
Distinguish node and edge flow functions. That give a general framework, from which unidirectional analyses are special cases.
Concept of graph width helps to improve the complexity estimate for bidirectional flow. Toplas 94. pldi 93.
PRE is not true bidirectional; true bidirectional problems cannot be made unidirectional.
Rüthing has also a paper in SAS 98.
It is the same problems as in AGs, replacing unificationby passes.
Assignment motion introduces bidirectional flows.
(Debugging optimized programs). One needs to build execution histories, trajectories.
Huffman-encoding of the program. The most probabble program gets the least bits.
idea: Annotate type info to ssa-language to be more sure about effects.
Experiments in global scheduling, trace scheduling, superblock scheduling
in the Cameleon compiler of IBM YH, many optimizations
Overview on optimal scheduling algorithms for basic blocks. Search-based, topsort based (wavefront), ILP based algorithms. Important for Boris.
has an equivalence class construction, folding common subtrees of schedules.
then saving only the minimal cost paths for wavefronts.
Induction over wavefronting.
Using wavefronts for optimal problems! for ``being ready'' problems
Whats the difference between dynamci programming and wavefront? the former is cost-based.
Wolf: scheduling of parallel machines is similar.
multithreaded processors.
Use threads for speculative execution of branches. Kill other branch as soon as possible. Make a common project from firm/klirflow.
lazy code motion plus code-size awareness.
Computing down-safety closure. For those nodes thatt do not introduce code replications in insertion. then compute a bipartite graph and from that a tight set from which the insertion points can be guessed.
Extend JVM and raise VM Exceptions instead of explicit checks..
Array of array bounds (array protectioin register), 2 new instructions protected-load and protected-store use this array to raise an exception. Validity check is performed in parallel to the memory operation.
Special optimization for APregisters, to optimize their allocation
insertion of set_bound instructions, just before ploads, then moving them with code motion
AP registers are virtual, i.e. must be allocated, and spilling means software checking instead of hardwarechecking.
Incrementalization:
Incrementalization is the opposite of finding a closed form for recurrences
The trick is to use inverses or associativity. I dont like her classification, it is without names and not precise enough:
In case of multiple recursions, use for incrementalization that recursion with the minimal put change
Was bedeutet das in Termalgebra??
Project dynamo at HP
Similar: Redstone/Wiggins at Compaq
Goal: adapting the binary to its execution environment
principle heuristic: keep the dynamic optimizations simple (hard testing,debugging, long runtime of complicated algorithms)
looking for hot program paths.
Comparison.
Path profiling metric. Compute hit rate and noise. Optimize noize and reduce hit rate.
Noise is worse with NEt, but the rest is the same or better.
Profiling for longer intervals does not improve prediction wuality due to rising missed opportunity cost
roughly identical hit rates.
Dynamo is part of the runtime system. dynamo looks lke a software interpreter that executes the same instruction set as the underlying cpu.
Related work: boa binary translator IBM
How can that be used for dynamic reconfiguration of architectural code?
some classes of optimizations in backend specifications. Disappointing.
Coalescing can increase the chromatic number if an inference graph, if dependency graphs are rebuilt.
Improving the linear Scan register allocation algorithm with splitting live ranges. Use the usage density. Looks very interesting as fast nice allocator.
almost 30% of reduction of code size.
Cache-conscious data structures (Larus/chilimbi)
Locality enhanced servers.
Techniques:
Aber: das alles reicht nicht, man sollte neue Sprachen entwickeln, die cachesensitiv sind. Wär das nicht was für AOP?
new algorithm for node splitting in irreducible graphs
works with DJ-graphs to recognize regions in VPO compiler
interesting method to save def-use-analysis for dsps
first for basic bllocks calculate ``delta values''
for procedures be sure by a dfa framework that guarantees that delta values are the same for neighbored basic blocks.
FLaSH project. first-order language with tail recursion. Conflicts if a function is called twice - then the first result must be memorized in order to reuse it again.
Good compilers should only prefetch that data which will actually be used.
Finding pointer increment variables
Does a profitability analysis, based on a recurrence measure and estimated memory bandwidth.
Builds a Miss Interference Graph
VERY GOOD: 10 years of development. Gnu Public license for Linux IA 64.
Phases:
IRs: very high (incl. fortran 90), high, mid, low WHIRL intermediate representation.
gcc, g++ are the base.
IPA:
ipa optimizations are re-called by the linker before linking.
loop nest optimizer mit cost model
Parallelization
global optimizations: SSA (version of CC96), all traditional optimizations
design for testability and debugability.
backend goes back to quadrupels.
target information table, several processors modeled IA-63, IA-32, MIPS
Hyperblock optimizer, based on Mahlke
predicate query subsystem make queries about predicate values
loop preparation for sw pipelining
get it:
Anton Ertl.
Sam Mifkin, IBM
Jan-Erik /Swedes AIR:
Max Hailperin:
Jim Dehnert:
S. Pande:
Dynamic optimizations, trends inarchitecture research. different measures: power, code size, incremental (scaling) optimizations.
D. Dhamdere: Execution histories. also for dynamic slicing.