Statistics

Representation Statistics Technology

Facebook Formulas

The graph on the right shows breakups per day, as inferred from an analysis of Facebook status changes. The data suggest that breakups occur most frequently in mid-February and late November.

Drawing conclusions from data is always dicey, and there are probably a lot of holes to poke in the methodology here, but it certainly is fun trying to attach meaning to these numbers!

This graph was featured in a TED Talk given by David McCandless, who runs the wonderful website www.informationisbeautiful.net.

The whole talk can be found here; this chart comes up at around the 6:50 mark.

The amount of data available through social networking sites is mind-blowing, and it can’t be long before it is used in some significant way. Indeed, a group of MIT students has already devised a system, cleverly titled Project Gaydar, that identifies, with some accuracy, the sexual orientation of a Facebook user based on friends, likes, and other connections.

What will they compute about us next?

By patrick honner, 14 years9 years ago

Resources Statistics

Proofiness

This is a short interview in the NYT with Charles Seife, the author of “Proofiness: The Dark Art of Mathematical Deception”.

http://well.blogs.nytimes.com/2010/10/29/the-dark-art-of-statistical-deception/

Trading on Colbert’s clever coinage–Truthiness–Seife’s book apparently addresses the myriad ways that the misrepresentation and misinterpretation of statistics negatively affect medicine, economics, politics, justice, and other aspects of society.

It’s not clear that this book is covering any ground that hasn’t already been covered in, say, How to Lie With Statistics (an amusing classic!) or the engaging and readable work of John Allen Paulos, but hopefully the more the issue is raised, the more seriously it will be taken. The consequences of innumeracy, and general scientific illiteracy, are profound and far-reaching, and they affect us all.

By patrick honner, 14 years9 years ago

Application Statistics

Benford’s Law

This is an article about the discovery of new sets of data that seem to obey Benford’s Law–a curious mathematical characteristic of the numbers we collect from the world that is really more conjecture than law.

http://www.newscientist.com/article/mg20827824.700-curious-mathematical-law-is-rife-in-nature.html

It seems that in scores of data sets collected from natural phenomena, the numbers we see tend to start with the digit 1 far more often than, say, with the digit 6. Indeed, statistical analysis shows that when you look at population numbers, death rates, street addresses, lengths of rivers, stock prices, and more recently, depths of earthquakes and brightness of gamma rays, the observed numbers start with the digit 1 about 30% of the time. Each larger leading digit occurs less often than the one before it, with 9 the rarest.
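The usual statement of Benford’s Law (not spelled out in the article summary above) is that a leading digit d turns up with probability log10(1 + 1/d). Here is a minimal Python sketch of the expected frequencies; the function name is just illustrative.

```python
import math


def benford_probability(d: int) -> float:
    """Expected share of numbers whose leading digit is d, under Benford's Law."""
    if not 1 <= d <= 9:
        raise ValueError("leading digit must be between 1 and 9")
    return math.log10(1 + 1 / d)


# Print the expected share for each possible leading digit.
for digit in range(1, 10):
    print(f"{digit}: {benford_probability(digit):.1%}")
```

The digit 1 comes out at about 30.1%, in line with the figure quoted above, while 9 comes out at about 4.6%.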

Apart from being a natural curiosity, Benford’s Law has proven to have some very useful applications. Scientists can use Benford’s Law to help predict phenomena and look for trends in data, as the rule gives number-crunchers an idea of what they might be looking at from the start.

Additionally, Benford’s Law has been successfully used to identify all kinds of numerical fraud–tax fraud, voter fraud–because when people are faking numbers, they tend to distribute leading digits evenly. Benford’s Law tells the data-police that if only about 1/9 of the numbers they are looking at start with 1, then something fishy is going on.
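To see what the data-police are looking for, here is a rough Python sketch that tallies leading digits and measures how far they stray from Benford’s prediction. The two data sets are stand-ins chosen for illustration, not real financial records: powers of 2, which are known to follow Benford’s Law, and the three-digit numbers 100 through 999, whose leading digits are perfectly even, much like naively faked figures. The helper names are hypothetical, and a real fraud screen would use a proper statistical test rather than this simple gap.

```python
import math
from collections import Counter


def leading_digit(n: int) -> int:
    """First digit of a nonzero integer, ignoring its sign."""
    if n == 0:
        raise ValueError("zero has no leading digit")
    return int(str(abs(n))[0])


def leading_digit_shares(values):
    """Share of the values that start with each digit from 1 to 9."""
    counts = Counter(leading_digit(v) for v in values)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("need at least one value")
    return {d: counts[d] / total for d in range(1, 10)}


def max_gap_from_benford(values):
    """Largest difference between an observed share and Benford's prediction."""
    shares = leading_digit_shares(values)
    return max(abs(shares[d] - math.log10(1 + 1 / d)) for d in range(1, 10))


# Illustrative data only (assumptions, not taken from the article).
powers_of_two = [2 ** k for k in range(1, 1001)]
evenly_spread = list(range(100, 1000))

print("powers of 2, share starting with 1:", leading_digit_shares(powers_of_two)[1])
print("powers of 2, largest gap:", max_gap_from_benford(powers_of_two))
print("100-999, share starting with 1:", leading_digit_shares(evenly_spread)[1])
print("100-999, largest gap:", max_gap_from_benford(evenly_spread))
```

The evenly spread numbers put exactly 1/9 of their values on each leading digit, which is the telltale pattern described above; the powers of 2 stay close to Benford’s curve.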

Keep that in mind next April.

By patrick honner, 15 years9 years ago

Statistics Teaching Testing

Who Tests the Testers?

It’s tricky business, curving state exams.

An audit by Harvard researchers compared student results on NY State exams (Regents, et al.) with results on corresponding national exams, and it seems that much of the “progress” made by NY students over the past few years was probably illusory.

There are several telling statistics in the report, but none clearer than this: in 2007, the minimum proficient score on the NY state math exam corresponded to the 36th percentile nationwide. In 2009, that minimum score corresponded to the 19th percentile nationwide. This effectively defined proficiency as “do better than 19 percent of students across the country”.

In theory, curves for tests can drop if exams get harder, but no one with any knowledge of NY State math exams would make that argument. Indeed, these exams have been getting easier and easier to pass. For example, to pass the Integrated Algebra Regents Exam in 2009, a student only needed 30 raw points out of 88. A passing score of 34% seems pretty low to begin with, but keep in mind that a student guessing randomly on the multiple-choice questions alone should get about 1/4 of those questions right, which amounts to 15 points. Halfway to proficiency.
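The arithmetic behind that comparison can be checked in a few lines of Python. The 30 raw points and the 88-point total come from the paragraph above; the breakdown of 30 multiple-choice questions worth 2 points each, with 4 answer choices apiece, is an assumption added here so that the numbers work out, so treat it as illustrative.

```python
# Figures quoted in the post.
RAW_POINTS_TO_PASS = 30
TOTAL_RAW_POINTS = 88

# Assumed exam layout (not stated in the post).
MULTIPLE_CHOICE_QUESTIONS = 30
POINTS_PER_QUESTION = 2
CHOICES_PER_QUESTION = 4

# Share of the raw points needed to pass.
passing_share = RAW_POINTS_TO_PASS / TOTAL_RAW_POINTS

# Expected points from guessing on every multiple-choice question:
# each guess is right with probability 1 / CHOICES_PER_QUESTION.
expected_guess_points = (
    MULTIPLE_CHOICE_QUESTIONS * POINTS_PER_QUESTION / CHOICES_PER_QUESTION
)

print(f"Share of raw points needed to pass: {passing_share:.0%}")
print(f"Expected points from pure guessing: {expected_guess_points}")
print(f"Fraction of the passing score: {expected_guess_points / RAW_POINTS_TO_PASS}")
```

Under those assumptions, pure guessing is expected to earn 15 of the 30 points needed.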

By patrick honner, 15 years12 years ago

Application Statistics

Maybe Monkeys Can’t Do Algebra

There is a controversy brewing around Harvard scientist Marc Hauser, his research students, and his published results. Some of Hauser’s work–which apparently focuses on the cognitive ability of non-human primates–has come under scrutiny, and one of his articles has been retracted.

According to the abstract, the retracted article (“Rule-Learning By Cotton Top Tamarins”) discussed how tests usually performed on human infants were used to “assess whether cotton-top tamarin monkeys can extract abstract algebraic rules”. Obviously the results of the study are now in question, but it’s a fascinating idea nonetheless.

By patrick honner, 15 years10 years ago
