The main problem in instruction fetch is control transfer, caused by jump, branch, call, return, and interrupt instructions:
- If the starting PC address is not aligned to the start of a cache line, fewer instructions than the fetch width are returned.
- Instructions that follow a taken control transfer instruction in the fetched block are invalidated.
- Fetching multiple cache lines from different locations may be needed in future very wide-issue processors, where a single contiguous fetch block will often contain more than one branch.
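The first two effects can be illustrated with a minimal sketch. It assumes fixed-size 4-byte instructions, a 32-byte cache line, and a fetch width of 8; these values and the function names are hypothetical and chosen only for illustration.

```python
def fetched_from_one_line(pc, line_bytes=32, instr_bytes=4, fetch_width=8):
    """Instructions a single-line fetch starting at pc can return.

    All parameter defaults are assumed values, not taken from a real design.
    """
    offset = pc % line_bytes                          # byte offset of pc within its line
    remaining = (line_bytes - offset) // instr_bytes  # instructions left up to the line end
    return min(fetch_width, remaining)


def useful_after_taken_branch(fetched, taken_branch_slot=None):
    """Instructions that stay valid when a taken control transfer sits in the block.

    taken_branch_slot is the 0-based position of the first taken control
    transfer instruction in the fetched block, or None if there is none.
    """
    if taken_branch_slot is None or taken_branch_slot >= fetched:
        return fetched
    return taken_branch_slot + 1                      # everything after the branch is invalidated


aligned = fetched_from_one_line(0x1000)      # pc at the start of a line
misaligned = fetched_from_one_line(0x1014)   # pc 20 bytes into a line
print(aligned, misaligned, useful_after_taken_branch(aligned, taken_branch_slot=2))
```

Under these assumptions, an aligned fetch returns the full width, a fetch that starts 20 bytes into the line returns only the three instructions left in that line, and a taken branch in slot 2 leaves three useful instructions out of eight.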
Handling target instruction addresses that are not aligned to cache line boundaries:
- A self-aligned instruction cache reads and concatenates two consecutive lines within one cycle so that it can always return the full fetch bandwidth. It can be implemented:
  - either with a dual-port I-cache,
  - by performing two separate cache accesses in a single cycle,
  - or with a two-banked I-cache (preferred).
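The two-banked variant can be sketched as follows. This is a simplified model under stated assumptions, not a description of any particular processor: cache lines are interleaved so that even-numbered lines go to bank 0 and odd-numbered lines to bank 1, the fetch width does not exceed the line size, and tags, misses, and the end of memory are ignored. The names `build_banks` and `self_aligned_fetch` are hypothetical.

```python
def build_banks(memory, line_instrs=8):
    """Split a flat instruction list into cache lines and interleave them.

    Assumed layout: bank 0 holds even-numbered lines, bank 1 odd-numbered lines.
    """
    lines = [memory[i:i + line_instrs] for i in range(0, len(memory), line_instrs)]
    return lines[0::2], lines[1::2]


def self_aligned_fetch(banks, pc_index, fetch_width=8, line_instrs=8):
    """Return fetch_width instructions starting at instruction index pc_index.

    Two consecutive lines always fall into different banks, so both can be
    read in the same cycle and then concatenated.
    No bounds handling: pc_index must not point into the last line.
    """
    line, offset = divmod(pc_index, line_instrs)
    nxt = line + 1
    first = banks[line % 2][line // 2]    # line containing the PC
    second = banks[nxt % 2][nxt // 2]     # following line, from the other bank
    window = first + second               # concatenate the two lines
    return window[offset:offset + fetch_width]


memory = list(range(64))                  # stand-in "instructions" 0..63
banks = build_banks(memory)
print(self_aligned_fetch(banks, 13))      # fetch starting mid-line
```

In this model, a fetch that starts in the middle of a line still yields the full fetch width, because the missing instructions come from the next line held in the other bank.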