We are bombarded by numbers and information every day, and that can create a lot of stress. While eating well, getting enough sleep and exercising are a great start to handling stress, the real secret to a happy and meaningful life is learning statistics. Hear me out --
Unlike the rest of the tools we use, statistics can tell you which pieces of the endless stream of information coming at you can be ignored. And that gives you more time and energy to focus on what you care about. So, really, the science of statistics is just like yoga or meditation. It clears your mind and sets your priorities. Really.
And stats is not hard. Not confusing. It's just math, and not at all the worst kind. While it can get pretty complicated, the basic stuff is very accessible. And the basic stuff is all you really need to find happiness.
There are two concepts that statisticians care about more than anything else – central tendency and variability. These are not words we use every day, for sure, but they are good words. Once you learn them, lots of things become easier. This post will just touch on them so you can get a sense of what they are. The next posts will take on each one separately. In the last post of this series, I will pull the two concepts back together in a discussion of statistical significance.
Think about all the variety of plants in the world. There are almost 300,000 known species of plants on earth. If you wanted to teach someone about all those plants, where would you start? How could you summarize all of that variety into a manageable amount of information that can be taught?
Carl Linnaeus was faced with exactly that problem and built a taxonomy that simplifies and summarizes all plants into twelve 'Divisions.' 300,000 plants down to twelve Divisions. That is some serious simplifying. But, of course, it was just the beginning. There are further categories, such as classes, genera, and species.
The point is, Linnaeus took a wild mess of data on hundreds of thousands of plants and found a way to organize it into digestible pieces. And in addition to these pieces being bite-sized, they are also informative – where a plant lands in that categorization process tells you a fair amount about the plant. For example, it tells you whether the plant has seeds, how it reproduces, or whether it lives on land or in water.
Enter statistics, which performs precisely this function for information that we can count (such as number of admissions, number of patient days, or number of infections). The sole purpose of statistics is to summarize, simplify and inform.
As it has become popular to say, statistics is a way 'to separate the signal from the noise.' Think of the data in our EMR as being similar to the wild mess of data that Linnaeus faced. Every time you enter a single piece of information into a patient's chart, you are adding to that huge pile of data. Statistics is a key tool in turning it into something that is informative.
When Linnaeus sat down to categorize plants, he grouped them according to types (land plants, water plants, woody plants, etc.). And even though he grouped every plant, he kept all the varieties represented – he found a place for every plant.
In statistics, central tendency and variability are the counterparts to those two moves: central tendency simplifies something, and variability allows that thing to remain complicated. Let me tell you what I mean.
Below is a graph of the ages of children born in my family, my husband's family and my cousins' family when we were young – all 16 of us. At one point in time, these three families, all together, had two one-year-olds, two four-year-olds, etc.
Without doing any math, you could probably conclude two things. First, just eyeballing it, the 'center' age of this group (or average) is around 7 or 8. Second, the average does not really 'summarize' all the ages very well – there is a 14-year spread (or range) – so the average doesn't seem to tell the whole truth. The truth is more complicated than the 'average' suggests.
Now imagine if our ages looked like Table B. You would still say the center is around 7 or 8, but you would probably feel like the average told more of the truth about how old all of us were in 1966. There is only a six-year spread, which means there is less variability in the ages in the second chart than in the first.
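If you would like to check the eyeballing with a little arithmetic, here is a minimal Python sketch. The two lists of ages are made up for illustration (they are not the actual ages from the charts); they are chosen only so that both groups have 16 children, one group has a 14-year spread and the other a six-year spread. The names `spread_out_ages`, `clustered_ages` and `summarize` are my own inventions for this example.

```python
from statistics import mean

# Hypothetical ages, invented for illustration only.
# They are NOT the real ages from the charts in this post.
spread_out_ages = [1, 1, 2, 3, 4, 4, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]
clustered_ages = [4, 6, 6, 6, 7, 7, 7, 7, 8, 8, 8, 8, 9, 9, 10, 10]


def summarize(ages):
    """Return (average, range) for a list of ages."""
    average = mean(ages)            # central tendency: the 'center' of the group
    spread = max(ages) - min(ages)  # variability: oldest minus youngest
    return average, spread


for label, ages in [("spread out", spread_out_ages), ("clustered", clustered_ages)]:
    average, spread = summarize(ages)
    print(f"{label}: average = {average}, range = {spread} years")
```

With these made-up numbers, both groups have exactly the same average (7.5), but the ranges are 14 years and 6 years. Same center, very different stories.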
Another way to look at it is this: What age is the 'typical' child in this group? In the first chart, no age is really typical because there is so much variability. In the second, there seems to be a pretty clear typical age because there is a lot less variability.
Central tendency is simply a number that describes what value you would 'typically' expect to see in any group of things. And variability takes account of all the crazy differences in that group of things that are NOT 'typical.'
The tension between typical and not typical is at the heart of all statistical procedures. Statisticians want to describe what you would expect to see in any group of data AND they want to tell you how likely that expectation is to be wrong. Statistics is always about these two things, regardless of the complex language we use.
Indeed, when you see an average reported with a confidence interval around it, that is exactly what you are being told: The average is the 'typical' number and the confidence interval tells you HOW typical it is. If the confidence interval is small, the average is pretty darned typical and you can be confident it describes the group of data well; if it is big, it isn't and you can't be. Hence, the term 'confidence' interval (can you be confident that the average represents the truth of what is in the data?).
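For the curious, here is a rough Python sketch of how a confidence interval around an average can be computed. It rests on several assumptions that are mine, not part of the story above: the ages are invented, the 16 children are treated as a sample from some larger group, and the interval is the common textbook 95% interval based on the t distribution (2.131 is the standard table value for 15 degrees of freedom). The function name `confidence_interval_95` is hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical ages, invented for illustration only.
spread_out_ages = [1, 1, 2, 3, 4, 4, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]
clustered_ages = [4, 6, 6, 6, 7, 7, 7, 7, 8, 8, 8, 8, 9, 9, 10, 10]

# Table value of the t distribution for a 95% interval with 16 - 1 = 15
# degrees of freedom. It only applies to samples of exactly 16 values.
T_CRITICAL_DF15 = 2.131


def confidence_interval_95(ages):
    """Return (low, high): a textbook 95% interval around the average."""
    if len(ages) != 16:
        raise ValueError("this sketch hard-codes the t value for 16 values")
    average = mean(ages)
    standard_error = stdev(ages) / sqrt(len(ages))  # more spread -> larger error
    margin = T_CRITICAL_DF15 * standard_error
    return average - margin, average + margin


for label, ages in [("spread out", spread_out_ages), ("clustered", clustered_ages)]:
    low, high = confidence_interval_95(ages)
    print(f"{label}: average = {mean(ages)}, 95% interval = {low:.1f} to {high:.1f}")
```

The group with more variability ends up with the wider interval, which is the arithmetic version of saying you can be less confident in its average.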
But we get ahead of ourselves. Let's dig more into each of these things and then we will get to statistical significance.