Programs tend to be predictable and follow patterns. We can use that to be able to predict whether or not a branch will be taken, and increase instruction throughput. This post discusses how the Gshare branch predictor works.
Gshare uses two structures to make predictions. A pattern history table and a global history register.
Pattern History Table
A pattern history table (PHT) is an array of 2-bit saturating counters. Saturating arithmetic is when the output of any operation has a fixed minimum and maximum value. With a 2-bit saturating counter, this means that there is no overflow. If the value of the counter is 11
and we increase the value, it is still 11
. We use this saturating arithmetic to represent a finite state machine that represents four states: strongly taken, weakly taken, weakly not-taken and strongly not-taken.
Here is the finite state machine of a 2-bit saturating counter:
Every time the output of a branch instruction is evaluated, the PHT gets updated. If it’s current state is weakly taken, and that branch was evaluated and was actually taken, we update the counter to be in the strongly taken state.
This structure keeps track of a branches local history. If it’s state is strongly not-taken, then it is very likely that that branch will not be taken. We can use this structure to make predictions on a given branch. We could index into this array of counters with the program counter (PC) and make a prediction based on this local history.
Global History Register
It has been shown that taking global history and local history into account can get you better predictions. Global branch history is a typically sequence of 1’s and 0’s, where each bit represents whether the ith last branch was taken. Instead of looking at what the current branch has done so far, like a PHT, global history looks at what the last N branches has done.
We can represent this as a shift register. Every time a branch executes, we shift the bits to the left, and add the branch outcome to the end.
Instead of indexing with program counter, gshare XORs the global history with the program counter, and uses this output to index into the PHT.
Here is the full diagram: