CMP or SMT?
The performance race between SMT and CMP is not yet decided.
CMP is easier to implement, but only SMT can hide latencies, by issuing instructions from other threads while one thread is stalled.
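The latency-hiding argument can be illustrated with a toy issue model. This is only a sketch under strong assumptions: a core has a single issue slot per cycle (a real SMT core is superscalar and can issue from several threads in the same cycle), each instruction is reduced to a latency during which its thread cannot issue again, and the name `run_core` and the latency values are hypothetical.

```python
def run_core(threads):
    """Return the cycle at which all instructions of all threads have completed.

    Each thread is a list of latencies (>= 1): after issuing an instruction
    with latency n, that thread cannot issue again for n cycles.
    The core has one issue slot per cycle, shared by all its threads.
    """
    next_instr = [0] * len(threads)  # index of the next instruction per thread
    ready_at = [0] * len(threads)    # cycle at which each thread may issue again
    cycle = 0
    while any(next_instr[t] < len(threads[t]) for t in range(len(threads))):
        for t, stream in enumerate(threads):
            if next_instr[t] < len(stream) and ready_at[t] <= cycle:
                ready_at[t] = cycle + stream[next_instr[t]]
                next_instr[t] += 1
                break  # only one issue slot per cycle
        cycle += 1
    return max(ready_at) if threads else 0


stream = [1, 4, 1, 4]  # hypothetical thread: two short ops, two long-latency ops

cmp_cycles = run_core([stream])          # one thread per core: 10 cycles
smt_cycles = run_core([stream, stream])  # two threads on one core: 12 cycles
```

In this model a two-core CMP finishes both threads in 10 cycles but uses only 8 of its 20 issue slots, because each core idles while its thread waits. The shared core needs 12 cycles but uses 8 of 12 slots, since the second thread fills the stall cycles of the first.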
Functional partitioning is not easily achieved within an SMT processor because instruction issue is centralized.
- A separation of the thread queues is a possible solution, although it does not remove the central instruction issue.
- A combination of SMT with CMP may be superior to either organization alone.
Research: combine an SMT or CMP organization with the ability to create threads out of a single thread, either with compiler support or fully dynamically.
- thread-level speculation
- close to multiscalar
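The idea behind thread-level speculation can be sketched in software. The following is a toy model, not a description of any real hardware scheme: loop iterations stand in for speculative threads, each one buffers its writes and records its reads, and iterations commit in program order. An iteration that read an address written by an earlier iteration is squashed and re-executed. The names `run_speculative` and `speculative_loop` are hypothetical.

```python
def run_speculative(body, i, state):
    """Run iteration i against `state`, buffering writes and recording reads."""
    reads, writes = set(), {}

    def read(addr):
        if addr in writes:        # forward the iteration's own buffered write
            return writes[addr]
        reads.add(addr)
        return state[addr]

    def write(addr, value):
        writes[addr] = value      # buffered until commit

    body(i, read, write)
    return reads, writes


def speculative_loop(body, n_iters, memory):
    """Execute the loop speculatively; return the number of squashed iterations."""
    snapshot = dict(memory)       # every speculative thread starts from this state
    results = [run_speculative(body, i, snapshot) for i in range(n_iters)]

    squashed = 0
    dirty = set()                 # addresses written by already committed iterations
    for i, (reads, writes) in enumerate(results):
        if reads & dirty:
            # Dependence violation: the iteration read a stale value.
            squashed += 1
            reads, writes = run_speculative(body, i, memory)
        memory.update(writes)     # commit in program order
        dirty.update(writes)
    return squashed


def prefix_sum(i, read, write):
    # Each iteration depends on the result of the previous one.
    if i > 0:
        write(i, read(i) + read(i - 1))


memory = {0: 1, 1: 2, 2: 3, 3: 4}
squashes = speculative_loop(prefix_sum, 4, memory)
# memory now holds the sequential result {0: 1, 1: 3, 2: 6, 3: 10}; squashes == 2
```

A loop with independent iterations commits without any squash, which is the case where speculation pays off. The prefix-sum loop shows the opposite extreme: most iterations are re-executed, yet the final state still matches sequential execution.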