We’ve all seen it before.
“Since 2005, underdogs of more than six points have gone 162-138 (54%) against the spread in the first round of conference tournaments.” – some crappy “analyst”.
I get it. Trends are easy to comprehend and fit a narrative to. Media companies and sportsbooks (which are becoming increasingly intertwined) love them because they can be queried very quickly, and they give the illusion that you have an edge.
But that’s exactly what trends are. An illusion. If you look hard enough, you can slice almost any data set to craft a narrative about a trend that doesn’t actually exist.
Most trends are a product of variance, which is why you can find a trend for just about anything. Thankfully, we have tools to assess variance, and they help explain why trends can be 1) so prevalent yet 2) so meaningless at the same time.
A binomial distribution is a discrete probability distribution of the number of successes in n independent trials, each with the same chance of success. Binomial distributions help us answer questions such as “what’s the probability of getting 7 or more ‘heads’ if I flip a coin 10 times?” (17.2% for those wondering). Using a binomial distribution, we can also answer the question: “If a monkey picked the ATS winner in 300 games, what’s the probability that Mr. Monkey wins at least 162 games (54%)?” The answer is 9.2%.
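If you want to check those numbers yourself, here is a minimal sketch that sums the binomial probabilities directly. The function name `prob_at_least` is made up for illustration, and it assumes every flip or pick is an independent 50/50 proposition.

```python
from math import comb

def prob_at_least(k: int, n: int, p: float = 0.5) -> float:
    """P(X >= k) for X ~ Binomial(n, p), summed exactly from the PMF."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

coin = prob_at_least(7, 10)       # 7 or more heads in 10 fair coin flips
monkey = prob_at_least(162, 300)  # 162 or more ATS wins in 300 coin-flip picks

print(f"Coin:   {coin:.1%}")
print(f"Monkey: {monkey:.1%}")
```

The first line should print 17.2%, and the second should land right around the 9.2% figure used in the next section.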
Back to Trends
So how do we apply this to trends? Can you say that if you pull a random 300-game sample, there is a 9.2% probability of finding a trend that wins at a 54% clip? Not quite.
You see, 9.2% represents only a single tail of the distribution of outcomes, as displayed below.
When someone is seeking a trend, they aren’t picky about which tail of the distribution it comes from. While we’ve correctly identified that there is a 9.2% probability that Mr. Monkey wins 162 or more games (out of 300), there is also a 9.2% probability that Mr. Monkey loses 162 or more games (out of 300). Therefore, there is an 18.4% chance that Mr. Monkey’s win percentage is either 1) ≥ 54% or 2) ≤ 46% (in which case someone might say it is a ‘trend’ to fade Mr. Monkey).
Thus, if someone takes a random 300-game sample, we know that there is an 18.4% chance that they will find a ‘trend’ that hits at 54% or better, either by backing one side or by fading it.
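The doubling step can be sketched in a few lines. This is only an illustration: `two_sided_prob` is a hypothetical helper, and it leans on the assumption that each game is a true coin flip, which is what makes the two tails identical.

```python
from math import comb

def two_sided_prob(wins: int, games: int) -> float:
    """Chance that a coin-flip picker finishes with at least `wins` wins
    OR at least `wins` losses. Requires wins > games / 2 so the tails don't overlap."""
    if 2 * wins <= games:
        raise ValueError("wins must be more than half of games")
    # Upper tail: count the outcomes with `wins` or more successes out of 2**games.
    upper = sum(comb(games, i) for i in range(wins, games + 1)) / 2**games
    # At p = 0.5 the distribution is symmetric, so the lower tail is the same size.
    return 2 * upper

either_side = two_sided_prob(162, 300)
print(f"{either_side:.1%}")
```

For the 162-138 example this should come out at roughly 18.4%, double the one-tailed figure.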
Finding a Trend
If you pull a 300-game sample and don’t find a trend, are you going to stop looking, or try again? Try again, of course. So how long will it take you to find a trend?
Not very long.
Since there is an 18.4% chance that you will find a trend, we also know that there is an 81.6% chance that you don’t find a trend on your first try. So you try a different ‘system’ (I hate that word). How many tries does it take before you are more likely than not to have found a trend? We solve for the smallest value of n (number of tries) that satisfies the following formula:

$$1 - 0.816^{n} > 0.5$$
Solving for n, we determine that if you try 4 different potential ‘trends’, you will likely (>50%) find one. Because the number of tries needed has a long right tail, the average is higher than that: it will take you 5.4 (1 / 18.4%) tries, on average, to find a trend with a 54% win percentage over a 300-game sample.
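Both numbers are easy to reproduce. The sketch below assumes each new ‘system’ you test is an independent draw with the same 18.4% chance of showing a trend; the variable and function names are invented for the example.

```python
def tries_for_even_odds(p_find: float) -> int:
    """Smallest n for which the chance of at least one hit in n tries exceeds 50%."""
    n = 1
    while 1 - (1 - p_find) ** n <= 0.5:
        n += 1
    return n

p_find = 0.184  # two-sided chance of a 54% 'trend' in 300 games, from the text

median_like = tries_for_even_odds(p_find)  # tries until you're better than even money
mean_tries = 1 / p_find                    # average tries (mean of a geometric distribution)

print(median_like)
print(f"{mean_tries:.1f}")
```

This should print 4 and 5.4. The gap between them is the long tail at work: a few unlucky searchers need a lot of tries, and they drag the average up.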
Average Number of Queries to Find Various Trends
Below we’ve computed the average queries required to find various trends. For example, if you’re looking for a trend that wins in 18 out of 30 games (60%), it will only take you 2.8 queries, on average.
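To run the same calculation for any record you like, something along these lines should work. Treat it as a sketch: `avg_queries` is a hypothetical helper, it assumes every game is a 50/50 proposition, and only the first two records in the list come from this article. The other two are made-up inputs to show how the numbers move.

```python
from math import comb

def avg_queries(wins: int, games: int) -> float:
    """Average number of random samples you'd need to pull before one shows
    at least `wins` wins (or at least `wins` losses) in `games` coin-flip games."""
    if 2 * wins <= games:
        raise ValueError("wins must be more than half of games")
    upper_tail = sum(comb(games, i) for i in range(wins, games + 1)) / 2**games
    p_find = 2 * upper_tail  # either tail counts as a 'trend'
    return 1 / p_find        # mean of a geometric distribution

# (wins, games): the first two are from the text, the last two are hypothetical.
records = [(18, 30), (162, 300), (60, 100), (550, 1000)]
for wins, games in records:
    losses = games - wins
    print(f"{wins}-{losses} ({wins / games:.0%}): {avg_queries(wins, games):.1f} queries")
```

The pattern to watch for is that small samples with flashy win percentages are the cheapest trends to find, while the same win percentage over a much larger sample takes far more digging.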
So What’s the Big Takeaway?
The big takeaway is that most trends thrown out by ‘analysts’ are completely meaningless, since many of them can be found with just a little bit of querying across various game samples. Even trends that are “statistically significant” can be found this way, which does not make them predictive. Trends are more than likely the result of data mining and have little predictive value. When evaluating a supposed trend, ask yourself: is there something fundamentally driving this anomaly, or is it simply noise?