
# When Small Numbers Lead to Big Errors

A statistician weighs in on the pitfalls of estimating the sizes of small groups

As the U.S. military embarks on its review of Don’t Ask, Don’t Tell, to be delivered in a final report later this year, the question arises: How many service members are affected by the policy? To help answer that question, the Pentagon this past summer surveyed its troops, asking them if they served or had ever served with someone they believe to be gay. Leaving aside an obvious problem with the survey—that it is based on pure speculation—it also raises a common statistical challenge: asymmetry in population sizes. Because the vast majority of service members are heterosexual, many more straights will be misclassified as gays than vice versa.

This is a general problem in survey research. For example, Harvard University researcher David Hemenway has shown how some well-publicized studies have overestimated the number of guns used in the U.S. for self-defense by 10 times. Even if only 1 percent of respondents answer the survey incorrectly, the error is large compared with the proportion of the general population in any given year that uses guns in self-defense, which reasonable studies show to be about 0.1 percent.* In other words, the misclassification rate far exceeds the actual population size. To get around this problem, we would be wiser to trust surveys of crime victims, which restrict the gun use question to a smaller pool of subjects.
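The arithmetic behind this point can be sketched in a few lines of Python. This is an illustrative sketch, not a calculation from the studies cited: it assumes that every incorrect answer is a false "yes" from someone who did not use a gun in self-defense, and the function name `observed_rate` and its parameters are hypothetical.

```python
def observed_rate(true_rate, false_positive_rate, false_negative_rate=0.0):
    """Share of respondents who answer "yes" when some answers are wrong."""
    # People who truly belong to the small group and say so.
    true_yes = true_rate * (1.0 - false_negative_rate)
    # People outside the group who are misclassified into it.
    false_yes = (1.0 - true_rate) * false_positive_rate
    return true_yes + false_yes


# Figures taken from the text: about 0.1 percent true rate, 1 percent error.
true_rate = 0.001
# Assumption: all of the 1 percent error consists of false "yes" answers.
false_positive_rate = 0.01

estimate = observed_rate(true_rate, false_positive_rate)
inflation = estimate / true_rate

print(f"observed rate: {estimate:.4%}")
print(f"overestimate factor: {inflation:.1f}x")
```

Under these assumptions the survey would report a rate roughly 11 times the true one, in line with the tenfold overestimate described above. Almost all of the "yes" answers come from the large group, because 1 percent of 99.9 percent swamps the 0.1 percent who truly belong to the small group.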

As to our original question, a reasonable (though still imperfect) way to figure out what percentage of military service members are gay is by combining two estimates: the percentage of gays in the general population (easy enough to estimate from national surveys) and an estimate of the percentage of individuals in same-sex unmarried partner couples who report ever having served in the military (known as a probability). By extrapolating from the general population to service members, you are restricting your analysis to same-sex unmarried couples and thus narrowing the pool of potential false positives. Gary J. Gates of the University of California, Los Angeles, estimates using this method that 1.5 percent of men and 6.2 percent of women in the military are gay or bisexual.
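One way to read the combination of the two estimates is as an application of Bayes' rule. This is an interpretive sketch, not necessarily the exact calculation Gates performed. The overall share of the population that has served, written P(served) below, is a third quantity that the text does not mention and is assumed here.

$$
P(\text{gay} \mid \text{served}) = \frac{P(\text{served} \mid \text{gay}) \, P(\text{gay})}{P(\text{served})}
$$

Here P(gay) is the share of gays in the general population, and P(served | gay) is the share of individuals in same-sex unmarried partner couples who report having served.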

*Correction (10/19/10): This sentence was edited after posting. It originally stated that the 1 percent error is large compared with the proportion of the general population that owns guns for self-defense.


Gelman is a professor of statistics and political science at Columbia University.

1. JesseC 07:55 PM 10/7/10

“the error is large compared with the proportion of the general population that owns guns for self-defense, which reasonable studies show to be about 0.1 percent”

In 2007 there were 9880 applications for concealed carry permits in the state of Colorado. The population in that year was 4.7M for an annual rate of 0.2% and the permits are good for 5 years. By extension, the minimum percentage of the population that owns guns for self defense is at least 10 times the number cited – and that is just the ones who feel that they need to actually carry a concealed firearm.

When Gelman repeats such self evident drivel, he calls the rest of his report into question.

2. andrewgelman 11:07 PM 10/9/10

JesseC:

Thanks for pointing out my typo. What I meant to say was, "the proportion of the general population in any given year that uses guns in self-defense, which reasonable studies show to be about 0.1 percent." It should be "uses," not "owns." I sent a message to my editor at Scientific American and I assume this will be corrected soon.

3. Kompost in reply to andrewgelman 08:26 AM 10/19/10

Again, a minor mistake that leads to big error :)

4. jbairddo 08:54 AM 10/19/10

A) Gun survey vs. guessing someone is gay: how is this even close?
B) I would never attempt to guess if someone is gay, but I am pretty sure I would know if I bought a gun for SD or used it for SD.
C) Every study has a bias, which is why samples are listed with a standard deviation and p values. If they are not statistically significant based on sample size they are not published, a fact I would think that a professor of statistics would know. The fact that it is not stated suggests this is an anti gun piece and has nothing to do with gays in the military or statistics.

5. TTLG 01:20 PM 10/19/10

While it was interesting to see the effects of numerical disparities, it seems that the more fundamental problem is the large errors associated with surveys, especially those addressing topics which are highly emotional for many people (as can be seen from the comments on any article addressing gun control or homosexuality).

It seems to me that the ultimate solution will have to be going from surveys to actual observations. For example, in the case of gun use, obtaining and examining any available security camera footage of crimes would be much more work than any survey, but would go a long way to resolving the question, especially if these observations were combined with interviews of the victims and comparing the results.

Making observations on sexuality is even more difficult while keeping within ethical limits, but even this can be done to some extent. For example, why did the surveyed military personnel believe another was gay? Were they told, or was it some observation of how they behaved? What behavior? In principle, any public observations could be duplicated by an independent observer and evaluated on whether they do, in fact, correlate to homosexuality.

While these suggestions also clearly have their limitations, I think getting past people's biases and erroneous perceptions is necessary to get truly trustworthy answers to many of these types of questions.

6. R T Nicholson 08:38 PM 10/20/10

It seems to me that the ultimate solution will have to be going from surveys to actual observations. For example, in the case of gays in the military, obtaining and examining any available security camera footage of homosexual liaisons would be much more work than any survey, but would go a long way to resolving the question, especially if these observations were combined with interviews of the participants and comparing the results.

7. cjrisi88 in reply to jbairddo 12:28 PM 11/1/10

Every study does not necessarily have bias. Also, bias has nothing to do with standard deviation or p-values. Standard deviation measures the variation in a random variable.
If you sample something, it is nearly impossible to get the exact same result every time; the results are usually distributed around the mean of a set of values.

A p-value measures the probability of getting data at least as extreme as the data you sampled if the distribution you are testing against is actually the true distribution of the data. You can still get a very low p-value, draw conclusions from it and have a large amount of bias in your data.

Also, statisticians usually do not call these things bias; they call them sampling error or measurement error. Bias is something different: when statisticians refer to bias, they mean that the expected value of an estimator is not equal to the parameter it is trying to estimate.

See http://en.wikipedia.org/wiki/Estimator for more details on this.
