Disabled Core Patterns and Core Defect Rates in Xeon Phi x200 ("Knights Landing") Processors

Date

2021-10-18

Authors

McCalpin, John D.

Abstract

The Intel Xeon Phi x200 (“Knights Landing”, “KNL”) processor was Intel’s second-generation commercial many-core processor offering and the first offered as a standalone processor. Each processor die has 76 cores arranged in 38 pairs. Unlike Intel’s mainstream multicore processors, there were no product offerings with less than 84% of the cores enabled, making issues of yield critical. The Texas Advanced Computing Center deployed its 4200 Xeon Phi 7250 (68-core) processors in two phases: 504 nodes in June of 2016 and the remaining 3696 nodes in April 2017. Over 1100 different patterns of disabled cores are observed across the systems, with approximately 75% appearing only once. The most common pattern is seen in over 30% of nodes, with cores disabled at the tiles immediately above and below the two memory controllers. Interpreting these as the “default” cores to be disabled in the absence of defective cores allows disambiguation of cores that are disabled due to defects and those disabled to meet the target enabled core count. Analysis of the statistics of disabled cores in each of these two deployments supports the hypothesis that core defects are random and independent, with a statistically significant reduction in the probability of defects between the first and second deployments.
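As an illustration of what the "random and independent" hypothesis implies, the following minimal sketch treats the number of defective cores on a die as a binomial random variable. The core counts (76 per die, 68 enabled) come from the abstract; the per-core defect probability `p` and all function names are hypothetical and are not taken from the report. The sketch also ignores the pairing of cores into tiles, so it should be read as a simplified model, not as the author's analysis.

```python
from math import comb

TOTAL_CORES = 76    # cores per die, from the abstract
ENABLED_CORES = 68  # enabled cores on a Xeon Phi 7250, from the abstract


def prob_at_most_k_defects(n, k, p):
    """Return P(X <= k) for X ~ Binomial(n, p).

    Assumes each of the n cores is defective independently with
    probability p (the hypothesis named in the abstract).
    """
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def usable_fraction(p, total=TOTAL_CORES, enabled=ENABLED_CORES):
    """Fraction of dies with enough good cores to reach the enabled count.

    A die qualifies if it has at most (total - enabled) defective cores.
    """
    return prob_at_most_k_defects(total, total - enabled, p)


def prob_no_defects(p, total=TOTAL_CORES):
    """Probability that a die has no defective cores at all."""
    return (1 - p) ** total


if __name__ == "__main__":
    # Hypothetical defect probabilities, chosen only for illustration.
    for p in (0.005, 0.01, 0.02, 0.05):
        print(f"p={p:.3f}  usable={usable_fraction(p):.4f}  "
              f"no_defects={prob_no_defects(p):.4f}")
```

Under this simplified model, a lower per-core defect probability raises both the fraction of dies that can reach 68 enabled cores and the fraction with no defects, which is the kind of shift the abstract describes between the two deployments.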
