SW pipelining by register rotation: optimizations and limitations
Register rotation removes the requirement that kernel loops be unrolled to allow software renaming of the registers.
Speculation can further increase loop performance by removing dependence barriers.
The technique also works for while loops.
It also works for loops whose instructions are already predicated (those instructions keep their own predicates instead of being assigned stage predicates).
It is also possible for multiple-exit loops (the epilog gets more complicated).
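To illustrate how rotation replaces unrolling, here is a minimal sketch in Python of a three-stage pipelined loop (load, add, store) computing out[i] = a[i] + 1. The `RotatingFile` class, the `pipelined_increment` function, and the three-stage split are all assumptions made for illustration; they model the idea, not any particular instruction set. Each stage always names the same logical register, and the rotation at the end of every kernel iteration does the renaming that unrolling would otherwise provide. The `if` conditions play the role of stage predicates: they switch stages on during the prolog and off during the epilog.

```python
class RotatingFile:
    """Toy rotating register file (illustrative model, not real hardware)."""

    def __init__(self, size):
        self.phys = [None] * size  # physical registers
        self.base = 0              # rotating register base

    def read(self, logical):
        return self.phys[(logical + self.base) % len(self.phys)]

    def write(self, logical, value):
        self.phys[(logical + self.base) % len(self.phys)] = value

    def rotate(self):
        # After rotation, the value that was in logical register i
        # is visible as logical register i + 1.
        self.base = (self.base - 1) % len(self.phys)


def pipelined_increment(a):
    """Compute [x + 1 for x in a] with a 3-stage software pipeline."""
    n = len(a)
    stages = 3
    regs = RotatingFile(stages)
    out = []
    # n + stages - 1 kernel iterations cover prolog, kernel, and epilog.
    for k in range(n + stages - 1):
        if 0 <= k < n:        # stage 0 (load) for source iteration k
            regs.write(0, a[k])
        if 0 <= k - 1 < n:    # stage 1 (add) for source iteration k - 1
            regs.write(1, regs.read(1) + 1)
        if 0 <= k - 2 < n:    # stage 2 (store) for source iteration k - 2
            out.append(regs.read(2))
        regs.rotate()         # renames registers; no unrolling needed
    return out


print(pipelined_increment([10, 20, 30]))
```

Note that the loop body is written once: the same three statements serve every iteration, and only the register base changes.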
- Pipelining loops with very small trip counts may decrease performance.
- It is not desirable to pipeline a floating-point loop that contains a function call (the number of floating-point registers needed is not known in advance, and it may be hard to find empty slots for the instructions needed to save and restore the caller-saved floating-point registers across the function call).
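The trip-count limitation can be made concrete with a rough cycle model, sketched below in Python. Everything here is an assumption for illustration: the function names, the formula, and the numbers (4 stages, initiation interval of 2 cycles, 3 cycles of loop setup, 6 cycles per non-pipelined iteration) are hypothetical and not measured on any machine. The model charges the pipelined loop a fixed setup cost plus n + stages - 1 kernel iterations, so the fill and drain iterations dominate when n is small.

```python
def pipelined_cycles(n, stages, ii, setup):
    """Hypothetical cost model: setup plus (n + stages - 1) kernel iterations."""
    return setup + (n + stages - 1) * ii


def plain_cycles(n, body):
    """Hypothetical cost model: n iterations of a non-pipelined loop body."""
    return n * body


def break_even(stages, ii, setup, body, max_n=1000):
    """Smallest trip count at which pipelining wins, or None if not found."""
    for n in range(1, max_n + 1):
        if pipelined_cycles(n, stages, ii, setup) < plain_cycles(n, body):
            return n
    return None


# Assumed example parameters (not from any real processor).
for n in (1, 2, 3, 10):
    print(n, pipelined_cycles(n, 4, 2, 3), plain_cycles(n, 6))
print("break-even trip count:", break_even(4, 2, 3, 6))
```

Under these assumed numbers the pipelined loop is slower for one or two iterations and faster from three iterations on; with other parameters the crossover point moves accordingly.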