If the fetch engine is providing 16 to 32 instructions per cycle, then the execution core must consume instructions just as rapidly.
To avoid unnecessary delays due to false dependencies, logical registers must be renamed.
To compensate for the delays imposed by the true data dependencies, instructions must be executed out of order.
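To make the renaming step concrete, here is a minimal sketch. It is not taken from Patt et al.; it assumes an unbounded pool of physical registers and a hypothetical instruction format of `(destination, sources)`. Each write to a logical register is given a fresh physical register, which removes the false (write-after-read and write-after-write) dependencies, while sources are read through the current mapping so the true dependencies survive.

```python
from itertools import count

def rename(instructions, num_logical=8):
    """Rename logical destination registers to fresh physical registers.

    `instructions` is a list of (dest, srcs) pairs using logical names
    such as "r1". This format is an assumption made for illustration.
    """
    # Initial mapping: logical register i lives in physical register i.
    mapping = {f"r{i}": f"p{i}" for i in range(num_logical)}
    fresh = count(num_logical)  # next unused physical register number
    renamed = []
    for dest, srcs in instructions:
        # Sources read the current mapping, preserving true (RAW) dependencies.
        new_srcs = tuple(mapping[s] for s in srcs)
        # Each write gets a new physical register, removing WAR and WAW hazards.
        mapping[dest] = f"p{next(fresh)}"
        renamed.append((mapping[dest], new_srcs))
    return renamed

program = [
    ("r1", ("r2", "r3")),  # r1 = r2 op r3
    ("r4", ("r1", "r5")),  # r4 = r1 op r5  (true dependence on r1)
    ("r1", ("r6", "r7")),  # r1 = r6 op r7  (false dependences on r1)
]
result = rename(program)
```

After renaming, the third instruction writes a different physical register than the first, so it no longer has to wait for the first two and can execute out of order. The second instruction still reads the register written by the first, because that dependence is real.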
Patt et al. envision an execution core comprising 24 to 48 functional units, supplied with instructions from large reservation stations that together hold 2,000 or more instructions.
Functional units will be partitioned into clusters of three to five units. Each cluster will maintain an individual register file. Each functional unit will have its own reservation station.
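The figures above imply a range for the number of clusters. The short calculation below is only a back-of-the-envelope sketch; it assumes every functional unit belongs to exactly one cluster and that every cluster holds between three and five units. The function name `cluster_count_range` is made up for this illustration.

```python
import math

def cluster_count_range(total_units, min_size=3, max_size=5):
    """Return (fewest, most) clusters that can hold `total_units` units."""
    fewest = math.ceil(total_units / max_size)  # fill clusters as full as possible
    most = total_units // min_size              # keep clusters as small as possible
    return fewest, most

small_core = cluster_count_range(24)
large_core = cluster_count_range(48)
```

Under these assumptions a 24-unit core would have 5 to 8 clusters and a 48-unit core 10 to 16, each with its own register file.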
Instruction scheduling will be done in stages.