A recent study by researchers at the University of Utah suggested that the amount of food diners in a restaurant consumed was influenced by fork size. I haven’t seen details of the study, but it does remind me that people can draw diametrically opposite conclusions from the same raw data by altering definitions ever so slightly.
If only such contradictory results were contrived and isolated phenomena, but they’re not. When dealing with weakly correlated quantities, we often can come up with spurious trends and associations by artfully defining the size of the categories we use. This has been done recently in studies of violent crime to show that certain categories of crime were changing in the desired direction, and I intend to illustrate the point here with a similar story.
Using the fork study for inspiration only, let’s see how small variations in definitions can make all the difference. Imagine 10 diners at a buffet and consider the possible influence of plate size on how much they consume. Three diners were provided with plates that were deemed small, say, less than 8 inches in diameter, and they consumed 9, 11 and 10 ounces of food, for an average of 10 ounces. Now further assume that four diners were provided with medium-size plates, say, between 8 and 11 inches in diameter, and they consumed 18, 7, 15 and 4 ounces of food, for an average of 11 ounces.
Finally, we’ll assume that the remaining three diners were provided with plates deemed large, say, larger than 11 inches in diameter, and they consumed 13, 11 and 12 ounces, for an average of 12 ounces.
Spot the trend? As the plate sizes increased from small to medium to large, the average amount consumed increased from 10 to 11 to 12 ounces. Aha, a nice result!
But wait. What if the medium-size plates were very slightly redefined to be between 8.2 and 10.8 inches, and the small and large plates were redefined accordingly? And what if this redefinition resulted in the reclassification of two diners? The diner who ate 18 ounces of food is now counted as having a small plate (say, 8.1 inches in diameter), and the diner who ate only 4 ounces is now counted as having a large plate (say, 10.9 inches in diameter).
Let’s do the numbers once again under this assumption. Four (rather than three) diners were provided with small plates, and they consumed 9, 11, 10 and 18 ounces of food, for an average of 12 ounces. Two (rather than four) diners were provided with medium-size plates, and they consumed 7 and 15 ounces of food, for an average of 11 ounces. Four (rather than three) were provided with large plates, and they consumed 4, 13, 11 and 12 ounces of food, for an average of 10 ounces.
Spot the trend? As the plate sizes increased from small to medium to large, the average amount consumed decreased from 12 to 11 to 10 ounces. Aha, a nice result!
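For readers who want to check the arithmetic, here is a minimal sketch in Python that recomputes both sets of averages. The ounces consumed come from the example above. The plate diameters are illustrative assumptions, except for the 8.1-inch and 10.9-inch plates mentioned in the text, and the function name and cutoff parameters are hypothetical.

```python
# Each diner is (plate diameter in inches, ounces eaten).
# Ounces are taken from the example; diameters are ASSUMED for
# illustration, except 8.1 and 10.9, which the example supplies.
diners = [
    (7.0, 9), (7.5, 11), (7.8, 10),              # clearly small plates
    (8.1, 18), (9.0, 7), (10.0, 15), (10.9, 4),  # near the medium range
    (11.5, 13), (12.0, 11), (12.5, 12),          # clearly large plates
]


def average_by_plate_size(diners, medium_min, medium_max):
    """Average ounces eaten per category, where a medium plate is any
    plate with medium_min <= diameter <= medium_max."""
    groups = {"small": [], "medium": [], "large": []}
    for diameter, ounces in diners:
        if diameter < medium_min:
            groups["small"].append(ounces)
        elif diameter <= medium_max:
            groups["medium"].append(ounces)
        else:
            groups["large"].append(ounces)
    return {size: sum(eaten) / len(eaten) for size, eaten in groups.items()}


# Medium defined as 8 to 11 inches: consumption rises with plate size.
print(average_by_plate_size(diners, 8.0, 11.0))
# {'small': 10.0, 'medium': 11.0, 'large': 12.0}

# Medium defined as 8.2 to 10.8 inches: consumption falls with plate size.
print(average_by_plate_size(diners, 8.2, 10.8))
# {'small': 12.0, 'medium': 11.0, 'large': 10.0}
```

The data never change between the two calls. Only the two cutoffs move, by a fifth of an inch each, and that is enough to reverse the apparent trend.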
Moreover, small samples are not the problem here. A large number of data points makes this sleight of hand even easier because it provides more opportunity to fiddle with the categories. Anyone for sunspot intensity or Super Bowl outcomes?