Vitaly Parnas - Speculative execution in Computer Architecture

In computer architecture, speculative execution is one of many strategies that processors have employed for decades to maximize computational throughput (work completed per unit of time). Specifically, it is an instruction-level parallelism (ILP) technique in which the CPU executes a series of instructions before it knows whether they are needed. The speculative instructions are those that follow a branch (a conditional statement). Based on past execution patterns, the CPU predicts the most probable branch outcome and begins executing the instructions that follow it.
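To make "based on execution pattern" concrete, here is a minimal sketch of one classic prediction scheme, a 2-bit saturating counter. This is only an illustration: real CPUs use far more elaborate predictors, and the names TwoBitPredictor and count_correct are hypothetical, chosen for this example.

```python
class TwoBitPredictor:
    """Hypothetical 2-bit saturating counter.

    States 0 and 1 predict "not taken"; states 2 and 3 predict "taken".
    """

    def __init__(self, state: int = 2) -> None:
        # Starting at "weakly taken" is an arbitrary choice for this sketch.
        self.state = state

    def predict(self) -> bool:
        return self.state >= 2

    def update(self, taken: bool) -> None:
        # Move one step toward the observed outcome, saturating at 0 and 3.
        if taken:
            self.state = min(3, self.state + 1)
        else:
            self.state = max(0, self.state - 1)


def count_correct(outcomes) -> int:
    """Return how many branch outcomes a fresh predictor guesses correctly."""
    predictor = TwoBitPredictor()
    correct = 0
    for taken in outcomes:
        if predictor.predict() == taken:
            correct += 1
        predictor.update(taken)
    return correct


# A loop branch that is taken nine times and then falls through once.
loop_branch = [True] * 9 + [False]
print(count_correct(loop_branch), "of", len(loop_branch), "predicted correctly")
```

For this loop-like pattern the sketch mispredicts only the final exit, which is the kind of regularity a predictor relies on.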

Why is this necessary? Or rather, how does this strategy increase throughput? A pipelined CPU datapath consists of many units, each dedicated to a specific kind of work, all of which can operate in parallel. These include integer arithmetic units, floating-point units (FPUs), memory load and store units, instruction decoders, branch units, bit shifters, etc., many of which are present in multiples. The more units are kept busy, the higher the overall throughput. By analogy, you can wash one load of laundry while simultaneously drying another.
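The laundry analogy can be put into numbers. The sketch below assumes one washer and one dryer, with made-up durations in minutes; the function names are hypothetical and the model ignores everything except the two stages.

```python
def sequential_time(loads: int, wash: int, dry: int) -> int:
    """Total time if each load is washed and dried before the next one starts."""
    return loads * (wash + dry)


def overlapped_time(loads: int, wash: int, dry: int) -> int:
    """Total time if the washer starts the next load while the dryer is running.

    After the first load fills the pipeline, one load completes every
    max(wash, dry) minutes, because the slower stage is the bottleneck.
    """
    if loads == 0:
        return 0
    return wash + dry + (loads - 1) * max(wash, dry)


# Illustrative numbers only: 4 loads, 30-minute wash, 45-minute dry.
print(sequential_time(4, 30, 45))  # 300
print(overlapped_time(4, 30, 45))  # 210
```

Keeping both machines busy changes nothing about how long a single load takes, but it shortens the total, which is the sense in which occupied units raise throughput.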

Because certain types of computation demand far more cycles than others, it behooves the CPU to initiate them as early as possible, especially when a vacant unit is available in the datapath. A likely branch of execution may involve a floating-point operation (expensive relative to simple integer arithmetic). Provided the datapath contains a vacant FPU, the CPU should task it with the floating-point operation, even though the branch has not yet been resolved and the result may later have to be discarded.
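A small timing sketch shows what the early start buys when the guess turns out to be right. The cycle counts are invented for illustration, the function name is hypothetical, and the model assumes the FPU is free from cycle 0 and that the result cannot be committed before the branch is resolved.

```python
def fp_result_ready(branch_resolves_at: int, fp_latency: int, speculate: bool) -> int:
    """Cycle at which the floating-point result on the predicted path is usable.

    Assumes the prediction is correct. With speculation the FPU starts at
    cycle 0; without it, the FPU waits until the branch is resolved.
    """
    start = 0 if speculate else branch_resolves_at
    # The result cannot be committed before the branch outcome is known.
    return max(branch_resolves_at, start + fp_latency)


# Illustrative numbers only: branch resolves at cycle 6, FP operation takes 10 cycles.
print(fp_result_ready(6, 10, speculate=False))  # 16
print(fp_result_ready(6, 10, speculate=True))   # 10
```

In this toy case the speculative start hides the six cycles that would otherwise be spent waiting for the branch.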

While a misprediction costs additional cycles to discard the speculative work and backtrack to the correct path of execution, and the speculative logic itself involves additional bookkeeping, the idea is that speculation succeeds more often than it fails, so the average throughput is higher than that of a non-speculative CPU.
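The trade-off can be sketched as a simple expected-cost calculation. This is a deliberately crude model with hypothetical names and numbers: it assumes a non-speculative CPU stalls a fixed number of cycles at every branch, while a speculative CPU pays nothing on a correct guess and a fixed penalty on a wrong one.

```python
def expected_branch_cost(accuracy: float, mispredict_penalty: float) -> float:
    """Average extra cycles per branch with speculation.

    A correct prediction is assumed to cost nothing extra; a wrong one
    costs mispredict_penalty cycles (discarding work and restarting).
    """
    return (1.0 - accuracy) * mispredict_penalty


def break_even_accuracy(stall_cycles: float, mispredict_penalty: float) -> float:
    """Prediction accuracy above which speculation beats always stalling."""
    return 1.0 - stall_cycles / mispredict_penalty


# Illustrative numbers only, not measurements of any real CPU.
stall = 10.0      # cycles a non-speculative CPU waits at each branch
penalty = 15.0    # cycles lost on each misprediction
accuracy = 0.9    # fraction of branches predicted correctly

print("speculative cost per branch:", expected_branch_cost(accuracy, penalty))
print("non-speculative cost per branch:", stall)
print("break-even accuracy:", break_even_accuracy(stall, penalty))
```

Under these assumptions speculation wins whenever the accuracy exceeds the break-even value, so a higher misprediction penalty demands a better predictor.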
