TODO
Improvements
Firm
- It would be a cool thing to make Firm tarvals actually represent expressions (add, sub) of SymConsts and constants, limited by what the backend/linker can support. This would make constant folding and address mode matching in the backend a lot easier, since you would only have to look for Const nodes and would no longer have to combine Const+SymConst into ia32_Const. It would also help the combo optimisation.
- Implement a new (fixpoint based) control flow optimization
- Optimize Cmps in deconv (find criteria when this is possible)
- (X & c1) >>s Y => (X & c1) >> Y if c1 doesn't have sign bit set
- i >= 2 && i <= 9 can be done as (unsigned) (i-2) <= 7
- if (~x & 1) -> if (!(x & 1))
- X + (signbit) --> X ^ signbit
- combinations of (X ==,<,>,!= C1 & X ==,<,>,!= C2) -> (X op C)
- DeMorgan phase for boolean expressions
- Optimise Cmp(&global, NULL), Cmp(address + offset, NULL) and Sel(FramePtr) != NULL. This would make a nice Confirm candidate (confirming such addresses to be != NULL).
- A new if-optimisation phase which should improve the effect of localopts like (a == 0 && b == 0 -> (a | b) == 0). The optimisation should:
  - detect if(cond1 && cond2) and if(cond1 || cond2) constructs (cond1/cond2 having no side effects, of course) and test whether we get simpler code when we construct a single if with AndB/OrB for the conditions
  - for a (maybe limited) range of && ... && ... and || ... || ... conditions, permute them to test whether some of them get optimized away
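Several of the local rewrites listed above can be sanity-checked with a few lines of C. The following is a minimal sketch and not libFirm code: all function names are hypothetical, 32-bit two's-complement integers are assumed, and the range check is written with unsigned subtraction so that the C illustration itself avoids signed overflow.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* All names below are hypothetical helpers for illustration, not libFirm API. */

/* i >= 2 && i <= 9, written as two signed comparisons. */
static bool in_range_two_cmps(int32_t i) { return i >= 2 && i <= 9; }

/* Same test as one unsigned comparison: values below 2 wrap around to
 * large unsigned numbers and therefore fail the <= 7 test. */
static bool in_range_one_cmp(int32_t i) { return (uint32_t)i - 2u <= 7u; }

/* X + signbit versus X ^ signbit: the carry out of the top bit is discarded. */
static uint32_t add_signbit(uint32_t x) { return x + UINT32_C(0x80000000); }
static uint32_t xor_signbit(uint32_t x) { return x ^ UINT32_C(0x80000000); }

/* if (~x & 1) versus if (!(x & 1)) */
static bool low_bit_clear_not(uint32_t x)  { return (~x & 1u) != 0; }
static bool low_bit_clear_lnot(uint32_t x) { return !(x & 1u); }

/* (X & c1) >>s Y versus (X & c1) >> Y. c1 must not have the sign bit set,
 * so the masked value is non-negative and both shifts agree. */
static uint32_t masked_shr_signed(uint32_t x, uint32_t c1, unsigned y)
{
    return (uint32_t)((int32_t)(x & c1) >> y);
}

static uint32_t masked_shr_unsigned(uint32_t x, uint32_t c1, unsigned y)
{
    return (x & c1) >> y;
}

/* a == 0 && b == 0 versus (a | b) == 0 */
static bool both_zero_and(uint32_t a, uint32_t b) { return a == 0 && b == 0; }
static bool both_zero_or(uint32_t a, uint32_t b)  { return (a | b) == 0; }

/* Compare each pair of forms on a handful of boundary values. */
static void check_firm_identities(void)
{
    const int32_t ints[] = { INT32_MIN, -3, 0, 1, 2, 5, 9, 10, INT32_MAX };
    const uint32_t c1 = UINT32_C(0x7FFFFF00); /* sign bit clear */

    for (size_t k = 0; k < sizeof ints / sizeof ints[0]; ++k) {
        int32_t  i = ints[k];
        uint32_t x = (uint32_t)i;

        assert(in_range_two_cmps(i) == in_range_one_cmp(i));
        assert(add_signbit(x) == xor_signbit(x));
        assert(low_bit_clear_not(x) == low_bit_clear_lnot(x));
        for (unsigned y = 0; y < 32; ++y)
            assert(masked_shr_signed(x, c1, y) == masked_shr_unsigned(x, c1, y));
        assert(both_zero_and(x, x ^ 1u) == both_zero_or(x, x ^ 1u));
        assert(both_zero_and(x, 0) == both_zero_or(x, 0));
    }
}
```

Calling check_firm_identities() from a test driver exercises all five pairs. A sample check like this is not a proof, but it catches sign and off-by-one mistakes before a rule is written as a localopt.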
Backend
- We don't want to schedule Phis. The only problem is that the backend often needs to find the list of Phis in a block, so we need another (fast) way to do this if they aren't scheduled anymore. (See also the linked list of Phis proposal in the Firm section.)
- The spillslot coalescer is suboptimal when merging spillslots of different sizes.
- Create an amd64 backend
- Belady spiller: fix_block_borders: If a value is reloaded in lots of empty predecessor blocks (with high execfreq) and is only available in less frequently executed blocks, then reloading in the current block should be better, as we can keep the critical-edge-split blocks empty and remove them later.
- We need a must_be_same register constraint (in addition to the should_be_same constraint). This usually means creating copies in front of the instruction if a value doesn't die at the instruction.
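The must_be_same item above can be illustrated with a toy lowering step. This is only a sketch under assumptions: the insn_t structure, the lower_must_be_same function and the virtual register numbers are hypothetical and do not correspond to libFirm data structures. It shows the case described above, where the first operand does not die at the instruction, so a copy is created in front of it and the instruction operates on the copy.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy IR, not libFirm API. */
typedef enum { OP_COPY, OP_SUB } opcode_t;

typedef struct {
    opcode_t op;
    int dst;  /* virtual register written */
    int src1; /* first operand; must end up in the same register as dst */
    int src2; /* second operand; -1 for OP_COPY */
} insn_t;

/* Prepare `in` for a must_be_same(dst, src1) constraint.
 * If src1 dies at the instruction, the allocator is free to give dst the
 * register of src1 and nothing has to change. Otherwise a copy into
 * fresh_reg is emitted first; that copy dies at the instruction.
 * Writes 1 or 2 instructions to out and returns the count. */
static size_t lower_must_be_same(insn_t in, bool src1_dies_here,
                                 int fresh_reg, insn_t out[2])
{
    size_t n = 0;

    if (!src1_dies_here) {
        insn_t copy = { OP_COPY, fresh_reg, in.src1, -1 };
        out[n++] = copy;
        in.src1 = fresh_reg;
    }
    out[n++] = in;
    return n;
}

/* Small self-check of both cases. */
static void demo_must_be_same(void)
{
    insn_t out[2];
    insn_t sub = { OP_SUB, 3, 1, 2 }; /* v3 = v1 - v2 */

    /* v1 dies here: instruction is kept as is. */
    assert(lower_must_be_same(sub, true, 4, out) == 1);
    assert(out[0].op == OP_SUB && out[0].src1 == 1);

    /* v1 is still live afterwards: v4 = copy v1; v3 = v4 - v2. */
    assert(lower_must_be_same(sub, false, 4, out) == 2);
    assert(out[0].op == OP_COPY && out[0].dst == 4 && out[0].src1 == 1);
    assert(out[1].op == OP_SUB && out[1].src1 == 4 && out[1].dst == 3);
}
```

After this step the constrained operand always dies at the instruction, which is what lets a register allocator assign the same register to input and output.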
x86
- We could allocate stack space for function calls in advance instead of doing an IncSP before and after each function call
- Transform (Const - X) to -X + Const. Both forms take 2 instructions on x86, but add has more potential for other optimisations. Don't do this when we can use SourceAM.
- When we have float parameters in private functions, use the fp stack for them (needs a sim_RegParams)
- Add support code to make it possible to fold reloads into leas (which will end up in an add node)
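The (Const - X) to -X + Const rewrite listed above relies on both forms producing the same bits under wrap-around arithmetic. A minimal sketch of that equivalence, assuming 32-bit registers modelled as uint32_t; the function names are hypothetical and not part of libFirm:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers for illustration, not libFirm API. */

/* Original form: Const - X */
static uint32_t const_minus_x(uint32_t c, uint32_t x)
{
    return c - x;
}

/* Rewritten form: -X + Const (negation written as 0 - x) */
static uint32_t neg_x_plus_const(uint32_t c, uint32_t x)
{
    return (0u - x) + c;
}

/* Compare both forms on boundary values of c and x. */
static void check_const_minus_x(void)
{
    const uint32_t vals[] = {
        0u, 1u, 2u, 42u, UINT32_C(0x7FFFFFFF),
        UINT32_C(0x80000000), UINT32_C(0xFFFFFFFF)
    };
    const size_t n = sizeof vals / sizeof vals[0];

    for (size_t i = 0; i < n; ++i)
        for (size_t j = 0; j < n; ++j)
            assert(const_minus_x(vals[i], vals[j]) ==
                   neg_x_plus_const(vals[i], vals[j]));
}
```

The sketch only shows that the two forms are interchangeable for integer values. Whether the rewrite pays off still depends on the SourceAM condition mentioned in the list.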
Bugs
- Block out edges are wrong after local_optimize merged 2 blocks.

