Real-space Methods for Electronic Structure Calculations of over 100,000 Atoms
Two factors limit our ability to accurately describe the properties of materials: (1) our ability to characterize many-electron interactions, and (2) the computational tools available to solve the resulting equations. With density functional theory (DFT) and pseudopotentials, the electronic structure problem can be solved effectively for many weakly coupled systems. The computational cost of solving the Kohn–Sham equations remains a bottleneck, however, frequently restricting the systems of interest to a few thousand atoms or fewer. Here, we discuss novel methods that let us solve systems containing more than 100,000 atoms. We concentrate on new computational algorithms based on real-space DFT and pseudopotentials. Our strategy has several benefits. Real-space formalisms, such as finite differences and finite elements, avoid the global communication required for fast Fourier transforms and scale better for large calculations across hundreds or thousands of compute nodes. Furthermore, finite-difference techniques on a uniform real-space grid are simple to implement; for instance, the grid spacing alone controls the convergence of the Kohn–Sham solution. Based on the Chebyshev-filtered subspace iteration method (CheFSI), we developed a promising approach for solving the Kohn–Sham equations in real space. We illustrate two improvements to CheFSI that enhance scalability and accelerate the calculations: (1) a hybrid method that combines spectrum slicing with CheFSI, dividing the Kohn–Sham eigenvalue problem into subproblems that can each be solved in parallel using CheFSI; and (2) a grid partitioning method based on space-filling curves, which improves the efficiency of the sparse matrix–vector multiplication, the key component of CheFSI.
We show, through computations on confined systems with over 100,000 atoms or 400,000 electrons, that this method effectively reduces the communication overhead and improves the utilization of the vector processing capabilities provided by most modern parallel computers.
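
To make the filtering idea behind CheFSI concrete, the following is a minimal sketch and not the authors' implementation. It assumes a toy one-dimensional, kinetic-energy-only Hamiltonian discretized with second-order finite differences on a uniform grid, and it shows only the Chebyshev filtering step; the orthonormalization, Rayleigh-Ritz step, and self-consistency loop of a full calculation are omitted. All function names and parameter values (`apply_hamiltonian`, `chebyshev_filter`, `cutoff`, the grid size, the polynomial degree) are hypothetical choices made for illustration.

```python
import random


def apply_hamiltonian(v, h):
    """Apply H = -(1/2) d^2/dx^2 on a uniform grid (spacing h), matrix-free.

    Uses second-order central differences; the function is taken to vanish
    outside the grid, as for a confined system (toy assumption).
    """
    n = len(v)
    scale = 0.5 / (h * h)
    out = [0.0] * n
    for i in range(n):
        left = v[i - 1] if i > 0 else 0.0
        right = v[i + 1] if i < n - 1 else 0.0
        out[i] = scale * (2.0 * v[i] - left - right)
    return out


def chebyshev_filter(v, degree, a, b, h):
    """Return T_degree((H - c) / e) v using the three-term recurrence.

    [a, b] is the unwanted part of the spectrum. It is mapped to [-1, 1],
    where Chebyshev polynomials are bounded by 1 in magnitude, while
    components with eigenvalues below a are amplified.
    """
    e = 0.5 * (b - a)  # half-width of the unwanted interval
    c = 0.5 * (b + a)  # center of the unwanted interval

    def shifted(x):
        hx = apply_hamiltonian(x, h)
        return [(hx_i - c * x_i) / e for hx_i, x_i in zip(hx, x)]

    if degree == 0:
        return list(v)
    y_prev = list(v)   # T_0 term
    y = shifted(v)     # T_1 term
    for _ in range(2, degree + 1):
        s = shifted(y)
        y_next = [2.0 * s_i - p_i for s_i, p_i in zip(s, y_prev)]
        y_prev, y = y, y_next
    return y


def rayleigh_quotient(v, h):
    """Return <v|H|v> / <v|v>, an energy estimate for the vector v."""
    hv = apply_hamiltonian(v, h)
    num = sum(a_i * b_i for a_i, b_i in zip(v, hv))
    den = sum(a_i * a_i for a_i in v)
    return num / den


n, h = 50, 1.0
upper = 2.0 / (h * h)  # Gershgorin upper bound on the spectrum of this H
cutoff = 0.2           # hypothetical boundary between wanted and unwanted states

rng = random.Random(0)
v0 = [rng.uniform(-1.0, 1.0) for _ in range(n)]

filtered = chebyshev_filter(v0, 20, cutoff, upper, h)
rq_before = rayleigh_quotient(v0, h)
rq_after = rayleigh_quotient(filtered, h)
print("Rayleigh quotient before filtering:", rq_before)
print("Rayleigh quotient after filtering: ", rq_after)
```

The filter touches the Hamiltonian only through repeated applications of `apply_hamiltonian`, the stencil analogue of a sparse matrix-vector product. In this sketch that is why the cost of the product dominates, which is consistent with the abstract's emphasis on making that operation efficient.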