Main-memory near-data acceleration with concurrent host access

Date

2021-04-13

Authors

Cho, Benjamin Youngjae

Abstract

Processing-in-memory (PIM) is attractive for applications that exhibit low temporal locality and low arithmetic intensity. By bringing computation close to data, PIMs exploit proximity to overcome the bandwidth bottleneck of the main memory bus. Unlike discrete accelerators such as GPUs, PIMs can accelerate computation directly within main memory, avoiding the overhead of loading data from main memory into processor or accelerator memories. Realizing processing in the main memory of conventional CPUs poses a set of challenges, including: (1) mitigating contention and interference between the CPU and the PIM as both access the same shared memory devices, and (2) sharing the same address space between the CPU and the PIM for efficient in-place acceleration. In this dissertation, I present solutions to these challenges that achieve high PIM performance without significantly affecting CPU performance (up to 2.4% degradation). Another major contribution is the identification of killer applications that cannot be effectively accelerated with discrete accelerators: I introduce two compelling use cases in the AI domain for main-memory accelerators, where the unique advantage of a PIM over other acceleration schemes can be leveraged.
