It’s All Greek

When Humpty Dumpty uses a word, it means just what he chooses it to mean, neither more nor less. To people not conversant in a technical specialty, it seems that all the experts are Humpty Dumptys. Statistics is no exception.

It doesn’t look like a mouse to me.

If you’re a beginner at data analysis, it will seem like there is a superabundance of esoteric statistical slang. You’ll hear it even from friendly statisticians. It gets worse when you start reading websites, books, and worst of all, journal articles. If you want to see what I mean, read some of the article titles in the Journal of the American Statistical Association (at http://pubs.amstat.org/loi/jasa). The statisticians who write those obfuscatory tracts believe they are writing to people who know as much as they do. This seems odd given that those authors are supposed to be the experts in what they are writing about. Even other statisticians can’t decipher some of those articles without spending time with the reference books. So don’t feel like you’re alone in a foreign country. We stand befuddled together.

To simplify statistical jargon, think of three categories: statistical concepts named after someone, special words created to convey a special meaning, and common words and phrases with alternative meanings. We’ll leave the acronyms out of it for now.

Named Things

Statistical procedures, especially statistical tests, are often modified to accommodate some special circumstance or to have some desirable property. When this occurs, the new procedure is commonly named after the originators. Thus, there are statistical tests named after Dixon, Tukey, Wilcoxon, Scheffe, Kolmogorov, Fisher, Levene, Hotelling, Dunnett, and Bonferroni. And those are just some well-known ones. Dig into the literature, and you’ll find scores more.

It’s not just tests that get named. Bayesian statistics is a branch of statistics based on Bayes’ Theorem, formulated in the 1700s by Reverend Thomas Bayes. Kriging, the interpolation algorithm of geostatistics, was named after Daniel Krige, a South African mining engineer who pioneered the field in the 1950s. The Normal distribution is also called the Gaussian distribution after Carl Friedrich Gauss, who introduced it in 1809, and sometimes the Laplace-Gauss distribution after Pierre-Simon Laplace, who in 1810 tied the distribution to the central limit theorem. There are also theoretical frequency distributions named after Benford, Weibull, Rayleigh, Cauchy, Poisson, and Bernoulli.

If someone mentions a named distribution, test, or other statistical procedure, don’t panic. Nobody knows everything. Just ask what the distribution or procedure is supposed to do. If you took an introductory course in statistics and know about probability, the Normal distribution, and hypothesis testing, you’re in great shape for understanding most of the named stat terms you might run into. This type of statistical jargon could be much worse. When biologists name something after someone, they do it in Latin.

Created Words

Some statistical jargon might just as well be a foreign language because the words have no common meaning in the English language outside of statistics (or math). Examples of such words include: kurtosis, leptokurtic, platykurtic, skewness, covariance, autoregressive, variogram, logit, probit, eigenvalue, median, outlier, stationarity, winsorizing, communality, multicollinearity, and my personal favorite, homoscedasticity. If you’re at a bar and you hear any of these words being bandied around, slip quietly out the door and run for your life. Any statistician who uses these words with innocent civilians without explanation either doesn’t understand his or her audience or is a sadist. Dealing with created statistical terms is straightforward; just ask the statistician using them what they mean. Preferably ask in a foreign language just to prove the point.

Alternative Meanings

The most confusing statistical jargon just might be words in most people’s everyday vocabulary that have a very different statistical meaning. For example, when you hear the word mean, your mind has to sort out the word’s connotation. It can signify to intend, as in say what you mean. It can be used to associate, as in spring means flowers. It can refer to resources or methods, as in by any means. It can indicate character, as in she has a mean streak. It can imply exceptional skill, as in he has a mean fastball. And of course, in statistics, mean means average. If you don’t realize that some words in English have different meanings in statistics, you can get confused very quickly. I’ve had well-meaning report editors change median to medium and nonsignificant to insignificant.

Here are a few more examples:

Word	Meaning to a Statistician	Meaning to a Nonstatistician
bagging	A method for combining predictions from many data mining models	What the cashier does with your groceries when you’re done paying
blocking	A technique for controlling variation in ANOVA	What the offensive line does during football season
brushing	Interactively selecting data points on an on-screen graph to access other information associated with the points	What you do with your toothpaste and toothbrush
breakdown	Splitting data into groups to calculate descriptive statistics and correlations	What happens to your car when you’re in a hurry to get somewhere
censoring	Data with a real but undetermined value, usually less than or greater than all other values in a dataset	Restricting free speech; removing material considered to be offensive from books or other media
confidence	The probability of not making a Type I error	Ego stability
discriminate	Classify observations by a statistical model; a good thing	To make distinctions based on race, creed, ethnicity, age, or other category without regard to individual merit; a bad thing
errors	Differences between observed values and values predicted from a statistical model; residuals	Mistakes
mode	The most frequently appearing number in a set of numbers	A manner of acting, such as being in “relaxation mode”
Monte Carlo	A simulation procedure for evaluating the properties or performance of a statistic	The quarter of Monaco known for its resorts and casinos; a hotel in Las Vegas
Normal	Follows a Gaussian (bell-shaped) distribution	Typical, routine, sane
residuals	Differences between observed values and values predicted from a statistical model; errors	Money made by musicians and actors when their works are replayed
sample	An individual observation or multiple observations that are part of a population	A piece, a bit, a taste
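
If it helps to see a few of these words doing their statistical jobs, here is a minimal sketch in Python using only the standard library. The numbers are made up purely for illustration, and using the mean as the "model" for the residuals is an assumption made to keep the example short, not a recommendation.

```python
import statistics

# Hypothetical data, invented purely for illustration.
values = [2, 3, 3, 5, 7, 10]

mean = statistics.mean(values)      # the "average": sum divided by count
median = statistics.median(values)  # the middle of the sorted values (not "medium")
mode = statistics.mode(values)      # the most frequently appearing value

# "Errors" or "residuals": observed values minus the values a model predicts.
# Assumption: the model here is just the mean, the simplest prediction possible.
residuals = [v - mean for v in values]

print("mean:", mean)
print("median:", median)
print("mode:", mode)
print("residuals:", residuals)
```

Note that none of the residuals are mistakes in the everyday sense. They are just the part of each observation that the model did not account for.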

Don’t feel that you’re alone in the quagmire of statistical jargon. Like dialects of the English language, different statistical specialties have their own jargon and ways of expressing ideas. Data mining, time-series forecasting, quality control, nonlinear modeling, biometrics, econometrics, and geostatistics are all examples of statistical specialties that use terms not used in the other specialties. Imagine a Louisiana Cajun talking to a Pennsylvania Dutch. They both speak dialects of English, but it might as well be Greek.

Read more about using statistics at the Stats with Cats blog. Join other fans at the Stats with Cats Facebook group and the Stats with Cats Facebook page. Order Stats with Cats: The Domesticated Guide to Statistics, Models, Graphs, and Other Breeds of Data Analysis at Wheatmark, amazon.com, barnesandnoble.com, or other online booksellers.

10 Responses to It’s All Greek

Jon Lee says:

July 4, 2010 at 1:51 AM

A good one is also ‘boot strapping’

- statswithcats says:
  
  July 4, 2010 at 2:35 PM
  
  Yup, I didn’t think of that one. It makes me think of jack knifing, too.
  
  - evelyn nave says:
    
    October 7, 2010 at 12:45 PM
    
    I’m in the process of blogging on parallel universes: Life Sciences vs. Statistics.
    
    Some favorite stats. jargon of mine:
    Ergodicity
    Heteroskedasticity–It just doesn’t get any better than that.

Wade says:

September 19, 2012 at 7:52 PM

I was looking into modeling some data using a multinomial logistic distribution and found this one:
“Irrelevance of Independent Alternatives” or “IIA”.
