Archive of posts filed under the Statistics category.

## Yet Another Way to Lie With Statistics

This is a nice takedown of some spurious economic analysis, courtesy of Freakonomics:

Looking at the graph at the right, it’s hard not to notice the negative correlation between the two given variables, and the economist in question uses that correlation to bolster his policy argument.

The graph looks a lot different, however, when you look at all the available data, not just the data between today and the arbitrarily chosen cut-off of 1990.  That fuller chart doesn’t support the argument nearly as decisively.

As the author suggests, “Be wary of economists wielding short samples.”

www.MrHonner.com

## The Year in NFL Scoring

As the books close on the 2011 NFL regular season, it’s time to revisit my pre-season prediction that the new kickoff rule would result in a slight decrease in per-game scoring.

The pre-season predictions on the number of touchbacks turned out to be fairly accurate.  In 2011, about 43% of kickoffs (922 out of 2151) resulted in touchbacks; in 2010, only 16% of kickoffs (359 out of 2221) resulted in touchbacks (thanks to NFL.com for the data).
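For anyone who wants to check the touchback percentages, here is a minimal Python sketch using only the counts quoted above.  The names `kickoffs` and `touchback_rate` are my own choices for illustration, not anything from NFL.com.

```python
# (touchbacks, total kickoffs) per season, as quoted in the post
kickoffs = {2010: (359, 2221), 2011: (922, 2151)}


def touchback_rate(touchbacks, total):
    """Return the fraction of kickoffs that ended in a touchback."""
    return touchbacks / total


rates = {year: touchback_rate(*counts) for year, counts in kickoffs.items()}

for year in sorted(rates):
    print(f"{year}: {rates[year]:.1%} of kickoffs were touchbacks")

# How many times more likely a touchback was in 2011 than in 2010
ratio = rates[2011] / rates[2010]
print(f"2011 rate is about {ratio:.1f} times the 2010 rate")
```

Rounded to the nearest percent, the rates come out to 16% and 43%, matching the figures above.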

Did the increase in touchbacks reduce overall scoring in 2011, as hypothesized?  No.  In 2011, around 44.4 points were scored per game in the NFL; in 2010, around 44.1 points were scored per game.  Per-game scoring actually increased slightly this year!
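To put the size of that change in perspective, here is a rough Python sketch that works only from the two rounded per-game averages quoted above.  Since the inputs are rounded, the results should be read as approximate; the variable names are just illustrative.

```python
# Approximate league-wide points per game, as quoted in the post
ppg_2010 = 44.1
ppg_2011 = 44.4

# Absolute and relative change from 2010 to 2011
change = ppg_2011 - ppg_2010
percent_change = change / ppg_2010 * 100

print(f"Change in scoring: {change:+.1f} points per game")
print(f"Relative change:   {percent_change:+.1f}%")
```

A shift of about 0.3 points per game is well under one percent of the league average, so "slightly" is the right word.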

One issue worth mentioning, however, is the disproportionate effect the top three scoring teams have on the data.  During the 2010 season, New England was the highest-scoring team in the league with 518 total points; this was nearly 80 points more than the second-highest-scoring team.  In 2011, the Packers, Saints, and Patriots all scored over 500 points!  If we remove the three highest-scoring teams from each season, scoring for the rest of the league actually drops about 0.7 points per game.

It’s been fun drilling down into the data this year, and many other interesting questions popped up along the way.  And off-season changes always create new opportunities for analysis.


## More on NFL Scoring

As the Detroit Lions prepare for their first compelling Thanksgiving Day game in 15 years, I thought I would revisit my pre-season hypothesis that scoring in the NFL would be down in 2011 due to the new kickoff rule.

A quick recap of my argument:  the new kickoff rule will result in more touchbacks, which will reduce overall starting field position, which will result in fewer points being scored.  An elementary analysis suggested that scoring would be down by about 2 points per game.

The first two weeks of the season saw record-setting offensive production:  scoring was actually up by 2.5 points per game!  But now, with more than half the 2011 NFL season in the books, the average points-per-game is 44.07.  During the 2010 NFL season, the average points-per-game was around 44.16.

A TV analyst recently suggested that scoring decreases as the season progresses, due to factors like weather and injury.  Not only does this give me another idea for a math and sports analysis, it also gives me hope that my pre-season prediction may still come true!


Google’s Public Data Explorer is a great, free resource for students and teachers interested in data science and statistics.

The site allows you to create custom graphs of available data sets, making it easy to experiment with different representations and explore the meaning of data.

There are several data sets available to play around with.   The OECD Factbook alone provides a wealth of raw data on education, energy, employment, population and migration, and many other categories.  There are also data sets available from the U.S. Census and the U.S. Bureau of Economic Analysis.  There appears to be support for using your own data sets, as well.

The data can be represented in a variety of ways:  histograms, line graphs, and even dynamic time series are all available.  It’s a great way to play around with data, and to build skill and intuition in data analysis, interpretation, and representation.


## Bike Data Visualization

This is a cool visualization of bicycle usage in London:

http://goo.gl/ChPIB

On October 4th, 2010, there was a public transit strike in London, which pushed bike usage in the city to a peak.

The creators claim that the flashes of light that represent the bikes move in accordance with the actual speed and path of the bikes they represent.

An innovative representation that puts me in mind of graph theory and networking!