Wednesday, November 02, 2011

This is a load of manure - ignore these reports.

It seems that everyone is playing with the "data" collected in the Nurses' Health Study. Yes, hundreds of thousands of people were subjects in that study and there were some significant findings. Sadly, huge amounts of the data were generated by self-reporting. That means that the participants told the investigators their perception of their actions. This is a wildly inaccurate form of data collection and it is inappropriate and unwise to use it, especially when you are evaluating rare events within the sample. Why?



First, the "data" is unreliable. Quickly, can you tell me how many ounces of alcohol you consumed each week for the past month? Do you know how much alcohol is in a glass of wine or a bottle of beer? Or even how much alcohol is in an ounce of whiskey or vodka? No, unless you are well versed in percentages and the concept of proof, you won't know those numbers. Therefore, you are only going to report the number of drinks, glasses of wine, or beers. The investigators will convert to ounces of alcohol.



Maybe that's good until I ask you to accurately report how many glasses of wine, martinis, or beers you drank per week during the previous month. Heck, I can't even tell you what I had for lunch last Saturday, let alone describe the number of drinks I might have consumed.
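For the curious, here is roughly what that drinks-to-ounces conversion looks like. This is only a sketch: the serving sizes and strengths in `SERVINGS` are my own assumed typical values, not the figures the study's investigators used, and the function names are made up for illustration. The point is that the arithmetic is trivial; the answer is only as good as the drink count you feed it.

```python
# Assumed typical US servings: (fluid ounces per drink, alcohol by volume).
# Illustrative values only; not taken from the Nurses' Health Study.
SERVINGS = {
    "beer": (12.0, 0.05),
    "wine": (5.0, 0.12),
    "spirits": (1.5, 0.40),  # 0.40 ABV is the same as 80 proof
}


def proof_to_abv(proof):
    """Convert US proof to alcohol by volume (US proof is twice the percent ABV)."""
    return proof / 200.0


def alcohol_ounces(drink, count):
    """Fluid ounces of pure alcohol in `count` servings of `drink`."""
    volume_oz, abv = SERVINGS[drink]
    return count * volume_oz * abv


# A self-reported "about four glasses of wine a week":
reported = alcohol_ounces("wine", 4)
# If the real number was six, the estimate is off by a third:
actual = alcohol_ounces("wine", 6)
print(round(reported, 2), round(actual, 2))
```

Misremember two glasses a week and the "measured" exposure shifts by 50 percent, which is exactly the problem with self-reported numbers.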



Perhaps the numbers work to the advantage of the bean counters who were playing with the "data" (not real data, btw). So what? The reported risk is 333 per 100,000 person-years, or about 0.0033 per year, or about 0.167 over 50 years. It may be statistically significant, but who can honestly worry about risks that are so tiny? Maybe the researchers, but not me.
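If you want to check that arithmetic yourself, here is a quick sketch. The variable names are mine, and the last line rests on an assumption of my own: that the yearly risk stays constant and each year is independent, which is a simplification and not anything the study claims.

```python
# Reported rate: 333 cases per 100,000 (person-)years.
cases = 333
years_at_risk = 100_000

per_year = cases / years_at_risk   # about 0.0033 per year
over_50_years = per_year * 50      # simple multiplication, about 0.167

# Assumption (mine): constant, independent yearly risk. Under that
# assumption the chance of at least one event in 50 years is a bit
# lower than the simple product, because the years compound.
compounded_50 = 1 - (1 - per_year) ** 50

print(round(per_year, 4), round(over_50_years, 3), round(compounded_50, 3))
```

Either way you slice it, the inputs are still those self-reported numbers.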

