Software prefetching for memory-level parallelism




Kwon, Yongkee




In computer systems, latency tolerance is the use of concurrency to achieve high performance despite high latency. Existing techniques for tolerating long memory latencies include data prefetching, out-of-order instruction execution, and multithreading. However, because their buffers are limited, purely hardware-based mechanisms cannot exploit enough memory-level parallelism (MLP) to tolerate long memory access latencies in irregular applications with little locality. As a result, when high-performance general-purpose processors that support out-of-order execution run irregular applications, they often severely underutilize their underlying resources, whose design is balanced for regular applications with some amount of locality. This dissertation demonstrates that software specialization, with minimal hardware support in general-purpose processors, can efficiently tolerate high latency and make better use of the resources available for concurrent and parallel execution. It does so by exploiting the abundant MLP available in irregular applications, thereby improving performance.

