Beamforming adaptive arrays with graphics processing units
Beamforming is a signal processing technique by which the outputs of an array of receivers, each sensitive to signals from all directions, are combined to form one larger, more sensitive receiver that can identify the direction from which signals originate. Conventional beamforming methods can allow signals from noisy interferers to mask signals of interest if these interferers lie close to the directions to which the beamformer is sensitive. Adaptive beamforming (ABF) attempts to overcome this by minimizing the beamformer’s output subject to certain constraints. At its core ABF is an optimization problem, and a robust ABF procedure that consistently provides an optimal solution is computationally expensive. Nevertheless, ABF is of particular interest to the US Navy, where personnel trained to analyze acoustic data from sonar receivers can locate and track quiet targets of interest, e.g. submarines, that may be masked by sources of both ambient and directional noise in the ocean. ABF can be implemented to operate concurrently on sets of frequency-independent data, which makes it well suited to parallel processing. Additionally, because of its high density of arithmetic operations, ABF is a suitable candidate for implementation on modern graphics processing units (GPUs). GPUs are designed to perform many concurrent arithmetic operations quickly on large amounts of data. Furthermore, as of early 2007 they have reached a point at which they are not only capable of performing general-purpose tasks completely unrelated to graphics but can also be programmed to do such tasks far more easily and naturally than was previously possible. I show a method for parallelizing an existing serial ABF algorithm on an NVIDIA GeForce 8800 GTX, one of the first GPUs to use a generalized stream processor-based architecture. I take a single program, multiple data (SPMD) approach in which the same software kernel executes over multiple blocks of frequency-independent data in parallel. 
Further parallelism is exploited by subdividing each block into smaller subsets, independent of the direction in which the array is steered, and operating on these concurrently. Although initial results indicate that the GPU-based beamformer yields lower throughput than its serial counterpart, a number of possible optimizations are discussed that could allow the GPU implementation to match, if not exceed, the serial implementation.
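
As a rough illustration of the SPMD idea described above, the following Python sketch applies one adaptive-beamforming kernel independently to several frequency bins. It is not the algorithm evaluated in this work. It assumes a minimum-variance distortionless-response (MVDR) formulation as one common instance of "minimizing the output subject to constraints", a synthetic covariance matrix (white noise plus a single interferer), and a hypothetical 8-element line array. The names abf_kernel, steering, solve and response, and all numeric parameters, are illustrative assumptions, and a thread pool merely stands in for GPU thread blocks.

```python
import cmath
import math
from concurrent.futures import ThreadPoolExecutor

SOUND_SPEED = 1500.0  # m/s, nominal value for seawater (assumed)
SPACING = 0.5         # m, hypothetical sensor spacing
N_SENSORS = 8         # hypothetical array size


def steering(freq_hz, angle_rad):
    """Plane-wave steering vector for a uniform line array."""
    step = 2 * math.pi * SPACING * freq_hz / SOUND_SPEED * math.sin(angle_rad)
    return [cmath.exp(-1j * step * k) for k in range(N_SENSORS)]


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(row) + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0j] * n
    for r in range(n - 1, -1, -1):
        s = M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / M[r][r]
    return x


def response(w, v):
    """Beamformer response w^H v to a signal with array signature v."""
    return sum(wi.conjugate() * vi for wi, vi in zip(w, v))


def abf_kernel(freq_hz, look=0.0, jam=math.radians(30.0),
               jam_power=100.0, loading=1e-3):
    """MVDR weights for one frequency bin: w = R^-1 d / (d^H R^-1 d)."""
    d = steering(freq_hz, look)  # look-direction constraint vector
    a = steering(freq_hz, jam)   # interferer signature
    # Synthetic covariance: unit white noise, diagonal loading, one interferer.
    R = [[(1.0 + loading if i == j else 0.0)
          + jam_power * a[i] * a[j].conjugate()
          for j in range(N_SENSORS)] for i in range(N_SENSORS)]
    u = solve(R, d)
    denom = response(d, u)       # d^H R^-1 d
    w = [ui / denom for ui in u]
    return w, d, a


# Same kernel, multiple independent frequency bins (the SPMD pattern).
freqs = [500.0, 750.0, 1000.0, 1250.0, 1500.0]
with ThreadPoolExecutor() as pool:
    results = list(pool.map(abf_kernel, freqs))

for f, (w, d, a) in zip(freqs, results):
    print(f, abs(response(w, d)), abs(response(w, a)))
```

Because no bin reads another bin's data, the bins can be processed in any order or all at once, which is the property that makes the per-frequency work a natural fit for a GPU. In this toy setup each bin keeps unit gain in the look direction while strongly attenuating the interferer.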