Distributions of sums of binary variables in survey research
In survey research, researchers often sum a finite number of binary responses to form an index of a political attitude or behavior, such as political knowledge or political participation. In political science research, indices of this sort are called grouped binary variables: each is the sum of a fixed number of binary items and takes only integer values from zero to the total number of survey items. The distributions commonly used to model such indices are the binomial, beta binomial, and extended beta binomial distributions. Whether these distributions are appropriate, however, depends on the assumptions that the binary responses are independent and identically distributed Bernoulli random variables. If these assumptions are violated, the binomial, beta binomial, and extended beta binomial models become questionable, and it may be more useful to turn to other distributions of sums of Bernoulli variables, called generalized binomial distributions.
To facilitate the use of generalized binomial distributions in political science research, this report reviews the probability distributions of grouped binary variables. It clarifies the nature of the distributions of sums of Bernoulli variables in survey research by considering whether the Bernoulli variables are independent and/or identically distributed, whether there is heterogeneity across survey items and/or across respondents, and what these considerations imply for the relative dispersion of each generalized binomial distribution.
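
As a rough illustration of why heterogeneity matters for dispersion, the following minimal Python sketch compares the exact variance of a sum of independent Bernoulli items with unequal success probabilities against a binomial with the same number of items and the same mean. The item probabilities and the function names (`sum_of_bernoullis_pmf`, `mean_and_variance`) are hypothetical, chosen only for illustration; they are not taken from the report. The sketch covers only the case of independent items that differ in their probabilities, not dependence among items or heterogeneity across respondents.

```python
from math import comb


def sum_of_bernoullis_pmf(probs):
    """Exact PMF of a sum of independent Bernoulli(p_i) variables.

    Built by convolving one item at a time; pmf[k] is P(sum == k).
    """
    pmf = [1.0]  # with zero items, the sum is 0 with probability 1
    for p in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1.0 - p)  # item answered 0
            nxt[k + 1] += mass * p      # item answered 1
        pmf = nxt
    return pmf


def mean_and_variance(pmf):
    """Mean and variance of an integer-valued distribution given its PMF."""
    mean = sum(k * m for k, m in enumerate(pmf))
    var = sum((k - mean) ** 2 * m for k, m in enumerate(pmf))
    return mean, var


# Hypothetical success probabilities for four survey items (assumed values).
item_probs = [0.2, 0.5, 0.8, 0.9]
n = len(item_probs)
p_bar = sum(item_probs) / n

# Heterogeneous items: independent, but not identically distributed.
hetero_mean, hetero_var = mean_and_variance(sum_of_bernoullis_pmf(item_probs))

# Binomial benchmark: same n, every item uses the average probability.
binom_pmf = [comb(n, k) * p_bar**k * (1 - p_bar) ** (n - k) for k in range(n + 1)]
binom_mean, binom_var = mean_and_variance(binom_pmf)

print(f"heterogeneous items: mean={hetero_mean:.2f}, variance={hetero_var:.2f}")
print(f"binomial benchmark:  mean={binom_mean:.2f}, variance={binom_var:.2f}")
```

Under these assumed values the two distributions share the same mean, but the heterogeneous sum has the smaller variance. For independent items this always holds, because the variance of the sum equals the binomial variance minus the sum of squared deviations of the item probabilities from their average.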