Writing with data – Takeaways

Across the two readings for today, there were a number of important points that we should think about anytime we are writing from data.

1) What is the quality of your data? The sources and methods we use to collect data can lead to incorrect assumptions about the true state of the variables of interest. The best thing we can do to deal with this problem is to ignore the cliche “numbers don’t lie.” Numbers can lie just as easily as any other source we deal with in journalism. The people who create or compile the data might have a vested interest in what it shows, or they might just be big dummies. Alternatively, if you are collecting the data yourself, you might not understand the variables you are interested in well enough to find all the relevant information.

2) Correlation is not causation. Just because two variables are related doesn’t mean that there is a causal relationship between them. The best way to deal with this problem is to be skeptical of yourself and your reporting. Is there any reasonable alternative explanation for the relationship you are looking at? Are you sure the relationship goes in the direction you think it goes?

3) Do your results generalize? Generalization refers to how well your results hold for groups beyond the one you actually measured. Anytime we are looking at data that covers less than a full population or census, we need to be concerned about whether our results accurately reflect the variables/relationships of interest. Looking at the restaurant inspection data, we can easily see how a lack of generalizability can be a problem.

If I called the health department and pulled data for January through March of 2013, I would believe the health department does about 28 routine inspections a month, or about 330 per year. As we know, they actually did more than 1500 in 2013. The sample that we chose (i.e., January through March) was not representative of the rest of the year.
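The arithmetic behind that mistake is easy to sketch. The short Python example below uses only the two figures given above (about 28 inspections a month in the sample, and more than 1500 for the year); the function name and the use of 1500 as a floor are choices made for illustration, since the exact annual total is not stated here.

```python
# Figures from the paragraph above; the sample covers January through March 2013.
SAMPLE_MONTHLY_RATE = 28     # routine inspections per month seen in the sample
ACTUAL_2013_MINIMUM = 1500   # the text says "more than 1500", so treat this as a floor


def annualize(monthly_rate: float, months: int = 12) -> float:
    """Project a monthly rate over a year, assuming every month looks like the sample."""
    return monthly_rate * months


projected = annualize(SAMPLE_MONTHLY_RATE)
shortfall_factor = ACTUAL_2013_MINIMUM / projected

print(f"Projected from the sample: {projected:.0f} inspections")
print(f"The actual total is at least {shortfall_factor:.1f} times the projection")
```

The projection comes out to 336, in line with the rough figure of 330, and the real total is more than four times larger. The calculation itself is fine; the hidden assumption that January through March look like every other month is what fails.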

4) Statistical significance. If we find a relationship between two variables, we must ask ourselves, “Does the strength of this relationship rise above that of chance?” Tests of statistical significance answer just that question, which is great but also limiting. Above chance is not a high bar to pass, so beyond being concerned about significance we should also be concerned with the strength of the effect. This can be addressed via measures of effect size, which can be calculated using most statistics programs.
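To see why significance alone is a low bar, here is a minimal Python sketch. It assumes a hypothetical weak correlation of r = 0.1 (not a number from the readings) and uses the standard t statistic for testing a Pearson correlation against zero. The cutoff of 1.96 is the approximate two-sided 5% threshold for large samples, so treat it as a rough guide rather than an exact test.

```python
import math


def t_statistic(r: float, n: int) -> float:
    """t statistic for testing whether a Pearson correlation r differs from zero."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))


CRITICAL_VALUE = 1.96  # approximate two-sided 5% cutoff for large samples (assumption)
r = 0.1                # hypothetical weak correlation, chosen for illustration

for n in (100, 1000, 10000):
    t = t_statistic(r, n)
    verdict = "significant" if abs(t) > CRITICAL_VALUE else "not significant"
    # r squared is a simple effect size: the share of variance the relationship explains
    print(f"n={n}: t={t:.2f} ({verdict}), variance explained={r ** 2:.0%}")
```

The effect size never changes: the relationship explains about 1% of the variance at every sample size. Only the verdict changes, flipping to "significant" once the sample is large enough. That is the gap between "above chance" and "worth reporting."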

5) Am I providing the proper context for the number I am reporting? A quote taken out of context can completely change the meaning/intention of the source. The same is true of numbers. If we do not provide the reader with the proper context to interpret a statistic, he or she is left to guess at its importance.
