Scientific American Magazine, October 2010

When Small Numbers Lead to Big Errors

A statistician weighs in on the pitfalls of estimating the sizes of small groups

As the U.S. military embarks on its review of Don’t Ask, Don’t Tell, to be delivered in a final report later this year, the question arises: How many service members are affected by the policy? To help answer that question, the Pentagon this past summer surveyed its troops, asking them if they served or had ever served with someone they believe to be gay. Leaving aside an obvious problem with the survey—that it is based on pure speculation—it also raises a common statistical challenge: asymmetry in population sizes. Because the vast majority of service members are heterosexual, many more straights will be misclassified as gays than vice versa.

This is a general problem in survey research. For example, Harvard University researcher David Hemenway has shown how some well-publicized studies have overestimated the number of guns used in the U.S. for self-defense by a factor of 10. Even if only 1 percent of respondents answer the survey incorrectly, the error is large compared with the proportion of the general population in any given year that uses guns in self-defense, which reasonable studies show to be about 0.1 percent.* In other words, the misclassification rate far exceeds the actual population size. To get around this problem, we would be wiser to trust surveys of crime victims, which restrict the gun use question to a smaller pool of subjects.
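The arithmetic behind that claim can be sketched in a few lines of Python. This is an illustration only, not a reconstruction of any particular study: it assumes the 1 percent error applies equally in both directions and does not depend on a respondent's true status, and the function and variable names are made up for the example.

```python
def observed_rate(true_rate, false_positive_rate, false_negative_rate=0.0):
    """Share of respondents who answer "yes" once misclassification is included."""
    # People who truly belong to the group and say so.
    true_yes = true_rate * (1.0 - false_negative_rate)
    # People outside the group who are wrongly counted as members.
    false_yes = (1.0 - true_rate) * false_positive_rate
    return true_yes + false_yes


true_rate = 0.001   # about 0.1 percent, the figure cited in the article
error_rate = 0.01   # 1 percent of respondents answer incorrectly (assumed symmetric)

estimate = observed_rate(true_rate, error_rate, error_rate)
inflation = estimate / true_rate

print(f"observed rate: {estimate:.3%}")
print(f"overestimate:  {inflation:.1f} times the true rate")
```

Under these assumptions the survey would report roughly 1.1 percent, about 11 times the true rate, because nearly all of the "yes" answers come from the much larger group that was misclassified. Restricting the question to a smaller pool, such as crime victims, shrinks the `(1.0 - true_rate) * false_positive_rate` term.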

As to our original question, a reasonable (though still imperfect) way to figure out what percentage of military service members are gay is by combining two estimates: the percentage of gays in the general population (easy enough to estimate from national surveys) and an estimate of the percentage of individuals in same-sex unmarried partner couples who report ever having served in the military (a conditional probability). By extrapolating from the general population to service members, you are restricting your analysis to same-sex unmarried couples and thus narrowing the pool of potential false positives. Gary J. Gates of the University of California, Los Angeles, estimates using this method that 1.5 percent of men and 6.2 percent of women in the military are gay or bisexual.

*Correction (10/19/10): This sentence was edited after posting. It originally stated that the 1 percent error is large compared with the proportion of the general population that owns guns for self-defense.



This article was originally published with the title When Small Numbers Lead to Big Errors.




ABOUT THE AUTHOR(S)

Gelman is a professor of statistics and political science at Columbia University.


7 Comments

  1. JesseC 07:55 PM 10/7/10

    “the error is large compared with the proportion of the general population that owns guns for self-defense, which reasonable studies show to be about 0.1 percent”

    In 2007 there were 9880 applications for concealed carry permits in the state of Colorado. The population in that year was 4.7M for an annual rate of 0.2% and the permits are good for 5 years. By extension, the minimum percentage of the population that owns guns for self defense is at least 10 times the number cited – and that is just the ones who feel that they need to actually carry a concealed firearm.

    When Gelman repeats such self evident drivel, he calls the rest of his report into question.

  2. andrewgelman 11:07 PM 10/9/10

    JesseC:

    Thanks for pointing out my typo. What I meant to say was, "the proportion of the general population in any given year that uses guns in self-defense, which reasonable studies show to be about 0.1 percent." It should be "uses," not "owns." I sent a message to my editor at Scientific American and I assume this will be corrected soon.

  3. Kompost in reply to andrewgelman 08:26 AM 10/19/10

    Again, a minor mistake that leads to big error :)

  4. jbairddo 08:54 AM 10/19/10

    A) Gun survey vs. guessing someone is gay: how is this even close?
    B) I would never attempt to guess if someone is gay, but I am pretty sure I would know if I bought a gun for SD or used it for SD.
    C) Every study has a bias, which is why samples are listed with a standard deviation and p values. If they are not statistically significant based on sample size they are not published, a fact I would think that a professor of statistics would know. The fact that it is not stated suggests this is an anti-gun piece and has nothing to do with gays in the military or statistics.

  5. TTLG 01:20 PM 10/19/10

    While it was interesting to see the effects of numerical disparities, it seems that the more fundamental problem is the large errors associated with surveys, especially those addressing topics which are highly emotional for many people (as can be seen from the comments on any article addressing gun control or homosexuality).

    It seems to me that the ultimate solution will have to be going from surveys to actual observations. For example, in the case of gun use, obtaining and examining any available security camera footage of crimes would be much more work than any survey, but would go a long way to resolving the question, especially if these observations were combined with interviews of the victims and comparing the results.

    Making observations on sexuality is even more difficult while keeping within ethical limits, but even this can be done to some extent. For example, why did the surveyed military personnel believe another was gay? Were they told, or was it some observation of how they behaved? What behavior? In principle, any public observations could be duplicated by an independent observer and evaluated on whether they do, in fact, correlate to homosexuality.

    While these suggestions also clearly have their limitations, I think getting past people's biases and erroneous perceptions is necessary to get truly trustworthy answers to many of these types of questions.

  6. R T Nicholson 08:38 PM 10/20/10

    It seems to me that the ultimate solution will have to be going from surveys to actual observations. For example, in the case of gays in the military, obtaining and examining any available security camera footage of homosexual liaisons would be much more work than any survey, but would go a long way to resolving the question, especially if these observations were combined with interviews of the participants and comparing the results.

  7. cjrisi88 in reply to jbairddo 12:28 PM 11/1/10

    Not every study necessarily has bias; also, bias has nothing to do with standard deviation or p-values. Standard deviation measures the variation in a random variable.
    If you sample something, it is nearly impossible to get the exact same result every time; results are usually distributed around the mean of a set of values.

    A p-value measures the probability of getting data at least as extreme as the set you sampled if the distribution you are testing against is actually the true distribution of the data. You can still get a very low p-value, draw conclusions from it and have a large amount of bias in your data.

    Also, statisticians usually do not call these things bias; they call them sampling error or measurement error. Bias is something different: when statisticians refer to bias, they mean that the expected value of an estimator is not equal to the parameter it is trying to estimate.

    See http://en.wikipedia.org/wiki/Estimator for more details on this.
