Browsing by Subject "Value prediction"

Advancing value prediction
(2019-05-09) Subramanian, Anjana; Lin, Yun Calvin
Read-after-write dependencies form a key bottleneck in single-thread performance. Value prediction [9][10][18] is a speculative technique that overcomes these dependencies by predicting the results of instruction execution, thereby preventing dependent instructions from stalling. Usually, the penalties for value mispredictions are extremely high. As a result, value predictors have evolved to prioritize accuracy over coverage. To improve upon the state of the art, our goals are: (i) to develop more powerful prediction mechanisms that have a better accuracy-coverage tradeoff, and (ii) to maximize the performance gains obtained from correct predictions. We present two independent pieces of work, one addressing each goal. To achieve the first goal, we design a Heterogeneous Context-based Value Predictor (HCVP) that combines branch history with value history to represent program context information. We demonstrate that this combination provides better predictability than using either of them individually, and that it allows for the use of relatively short value history lengths, which provide more coverage than very long ones. HCVP does not maintain speculative value histories, as it is more tolerant of the update problem that occurs when back-to-back instances of the same instruction are predicted. Our predictor performs better than the state-of-the-art value predictors (E-VTAGE and DFCM++), achieving a 29% speedup over a baseline with no value prediction. When combined with the E-Stride predictor, it achieves a speedup of 46%, which is 9% higher than that achieved by E-VTAGE E-Stride (EVES), the winner of the First Championship Value Prediction. To achieve the second goal, we exploit the fact that some instructions are more performance-critical than others. We categorize instructions by various parameters to find one or more classes of instructions that provide high performance benefits for correct predictions.
We find that loads, address-producing instructions, and high-fanout instructions are extremely beneficial for value prediction.
Techniques for advancing value prediction
(2019-05-09) Joshi, Pawan Balakrishna; Lin, Yun Calvin
Sequential performance is still an issue in computing. While some prediction mechanisms such as branch prediction and prefetching have been widely adopted in modern, general-purpose microprocessors, others such as value prediction have not been accepted due to their high area and misprediction overheads. True data dependences form a major bottleneck in sequential performance, and value prediction can be employed to speculatively resolve these dependences. Accurate predictors [1] [2] have been shown to provide performance benefits, albeit while requiring a large predictor state. We argue that a first step in making value prediction practical is to manage the metadata associated with the predictor effectively. Inspired by irregular prefetchers that store their metadata in off-chip memory, we propose the use of an improved prefetching mechanism for value prediction that provides not only performance benefits but also a means to off-load predictor state to the memory hierarchy. We show an average IPC improvement of 5.3% across a set of Qualcomm-provided traces [3]. The result of a static instruction can be predicted by mapping runtime context information to the value produced by the instruction. To that end, existing value predictors use either branch history contexts [2] or value history contexts [1] to make predictions. As long histories are needed to achieve high accuracy, these approaches lengthen the training time of the predictor, negatively impacting coverage. We identify that branch and value histories each provide distinct advantages to a value predictor, and therefore combine them in a novel predictor design called the Relevant Context-based Predictor (RCP) that maintains high accuracy while improving training time. We show an average speedup of 38% over a baseline that performs no value prediction on the Qualcomm-provided traces, compared to 34% for the previous best.
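Both abstracts above describe predictors that map a context built from branch history and value history to an instruction's result, and that favor accuracy over coverage. The following is a minimal illustrative sketch of that general idea, not the HCVP or RCP design from either thesis. The class name, table layout, history lengths, and confidence threshold are all assumptions chosen for illustration.

```python
from collections import deque


class CombinedContextPredictor:
    """Toy value predictor keyed by PC, branch history, and value history.

    Hypothetical structure for illustration only; not HCVP or RCP.
    """

    def __init__(self, branch_bits=8, value_hist_len=2,
                 conf_threshold=3, conf_max=7):
        self.branch_bits = branch_bits
        self.value_hist_len = value_hist_len
        self.conf_threshold = conf_threshold
        self.conf_max = conf_max
        self.branch_history = 0   # recent global branch outcomes, one bit each
        self.value_history = {}   # pc -> recent results of that instruction
        self.table = {}           # context -> [predicted value, confidence]

    def record_branch(self, taken):
        """Shift one branch outcome into the global branch history."""
        mask = (1 << self.branch_bits) - 1
        self.branch_history = ((self.branch_history << 1) | int(taken)) & mask

    def _context(self, pc):
        values = tuple(self.value_history.get(pc, ()))
        return (pc, self.branch_history, values)

    def predict(self, pc):
        """Return a value only when confidence is high, otherwise None.

        Withholding low-confidence predictions trades coverage for accuracy.
        """
        entry = self.table.get(self._context(pc))
        if entry is not None and entry[1] >= self.conf_threshold:
            return entry[0]
        return None

    def train(self, pc, actual):
        """Update the table with the committed result, then the value history."""
        ctx = self._context(pc)
        entry = self.table.get(ctx)
        if entry is None:
            self.table[ctx] = [actual, 0]
        elif entry[0] == actual:
            entry[1] = min(entry[1] + 1, self.conf_max)
        else:
            entry[0], entry[1] = actual, 0   # replace value, reset confidence
        hist = self.value_history.setdefault(
            pc, deque(maxlen=self.value_hist_len))
        hist.append(actual)


def run_trace(predictor, pc, values):
    """Predict then train on each value; return (predictions made, correct)."""
    made = correct = 0
    for actual in values:
        guess = predictor.predict(pc)
        if guess is not None:
            made += 1
            correct += int(guess == actual)
        predictor.train(pc, actual)
    return made, correct
```

In this sketch, a longer value history or more branch history bits creates more distinct contexts, each of which must be seen several times before its confidence counter allows a prediction. That is one way to picture why long histories can hurt coverage, as the abstracts argue.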