## Regents Recap — August 2015: Modeling Data

Here is another installment in my series reviewing the NY State Regents exams in mathematics.

Data and statistics play a much bigger role in algebra courses now, due in part to their increased emphasis in the Common Core standards.  I am generally supportive of this, but I do worry about how statistical concepts are presented and assessed in these courses and on their exams.

For example, here is question 27 from the August 2015 Common Core Algebra exam.

Evaluating mathematical models is an extremely important skill in many aspects of life.  But properly evaluating mathematical models is subtle and complex.

The following sample response, provided by New York state as an example of an answer deserving of full credit, does not respect that complexity.  And it makes me worry about what we are teaching our students about this important topic.

It’s true that the given data does not grow at a constant rate.  But that isn’t a good reason to reject a linear model for this set of data.  Models are used to approximate data, not represent them perfectly.  It would be unusual if a linear model fit a real set of data perfectly.

The weakness of this argument becomes even more apparent when we notice that the data isn’t perfectly fit by an exponential model, either.  Therefore, how could it be wrong for a student to say “We should use a linear model, because the data doesn’t grow at an exponential rate and thus isn’t exponential”?
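To make the symmetry concrete, here is a minimal Python sketch. The table values are invented for illustration (they are not the exam's data), and the names `hours`, `counts`, and `is_constant` are my own. It runs both "constant rate" checks: constant first differences for a linear model, and constant successive ratios for an exponential one.

```python
import math

# Hypothetical table, invented for this sketch (NOT the exam's data):
# roughly 25% growth per hour, rounded to the nearest ten.
hours = list(range(1, 11))
counts = [250, 310, 390, 490, 610, 760, 950, 1190, 1490, 1860]


def is_constant(values, rel_tol=1e-9):
    """Return True if every value equals the first one (up to rel_tol)."""
    return all(math.isclose(v, values[0], rel_tol=rel_tol) for v in values)


# Linear growth would mean constant first differences.
differences = [b - a for a, b in zip(counts, counts[1:])]

# Exponential growth would mean constant successive ratios.
ratios = [b / a for a, b in zip(counts, counts[1:])]

print("first differences:", differences)
print("successive ratios:", [round(r, 3) for r in ratios])
print("perfectly linear?     ", is_constant(differences))
print("perfectly exponential?", is_constant(ratios))
```

On values like these, both checks fail. The ratios are much closer to constant than the differences are, which is a reason to prefer the exponential model, but "the rate isn't constant" cannot by itself rule out one model without ruling out the other.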

This is another example of the problems we are seeing with how statistics concepts are being handled on these high-stakes exams, which is a consequence of both the rushed implementation of new standards and an ever-increasing emphasis on high-stakes testing in education.  It is also an example of how high-stakes tests often encourage terrible mathematical habits in students, something I address in my talk “g = 4, and Other Lies the Test Told Me”.

1. Curmudgeon says:

I went to Desmos …

The linear fit for all the data is R = .96
Knock off the first two, R=.977

This data also fits a quadratic model fairly closely. y=16.515x^2-26.758x+259.33, R=.999

It does give R=1 for an exponential though, 179.37*e^(x/4.4718). There’s a measure of rounding error in the value of R, of course, but “the data isn’t perfectly fit by an exponential model, either” isn’t accurate.

Having said that, I agree with your statement about the approximation usage of models but asking for “the better model” is an okay question since a cursory look sees that the data are increasing at an increasing rate. Their sample response is problematic, but not the question itself.

• MrHonner says:

Of course it’s accurate to say the data isn’t fit perfectly by an exponential model. There is no exponential function on which all the given points lie.

• Curmudgeon says:

It’s still a better fit than a linear one.

• MrHonner says:

But the point is that the given argument rejects the linear model because it’s not a perfect fit. The exact same argument can be used to reject the exponential model.

Of course the exponential model is better, but as you point out, the linear model is pretty good.

2. Kate Belin says:

Each week, the principal sends out a list of things he’s read. This week he sent a link to this post of yours and invited conversation (which is especially cool since he doesn’t come out of math, so I’m not sure how he got to following your blog). I wrote this back to the staff so leaving it here too.

I agree that NYS did not choose a convincing sample of a full-credit response. On one hand, it is easy to look at this sample response and critique that it rejects the linear model because of the non-constant rate of change rather than showing a student argue for the exponential model (or that the student doesn’t elaborate on what he/she means by non-constant rate of change). However, I think we should also look at the question. By narrowing it down to two choices for the student (linear or exponential), instead of asking a question that elicits student thinking about how they see the growth of this function, it pretty much puts the student in a position where they can argue for one choice by dismissing the other.

• MrHonner says:

Hi Kate-

Thanks for duplicating your response here, and for letting me know that your principal shared the piece. It’s great hearing that my work inspires conversations elsewhere!

Yes, the question clearly presents the student with an “either ... or” decision. However, as noted above, the argument provided in the sample response applies equally to *both* options, so it cannot possibly be a valid argument in support of either. We should not be supporting or encouraging such unsound reasoning.

A deeper issue here is that students in this course actually have no mathematical tools available to make an informed judgment about which model is better. Correlation coefficients between different kinds of regressions are incomparable, so the one legitimate procedure students are taught to apply in this situation is of no help.
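As a rough sketch of why the comparison is shaky: I am assuming here that the tool reports r for an exponential regression by fitting a line to (x, ln y), which is how many calculators do it but is not something the exam states. The data and the helper `pearson_r` are hypothetical, made up for this illustration.

```python
import math

# Hypothetical table, invented for this sketch (NOT the exam's data).
hours = list(range(1, 11))
counts = [250, 310, 390, 490, 610, 760, 950, 1190, 1490, 1860]


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# r for the linear regression: measured against the raw counts.
r_linear = pearson_r(hours, counts)

# r for the exponential regression (ASSUMED method): measured against
# ln(counts), i.e. on a different scale than the linear r.
r_exponential = pearson_r(hours, [math.log(c) for c in counts])

print("r, linear fit on (x, y):         ", round(r_linear, 4))
print("r, exponential fit on (x, ln y): ", round(r_exponential, 4))
```

The two numbers describe straight-line fits to two different data sets, the raw values and their logarithms, so placing them side by side is not a like-for-like comparison of the two models.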

3. Bob Lochel says:

As an AP Stats teacher, I teach students that context should give the first clue to the type of model. Using technology to find a model which fits the best can be useful, but the model isn’t very helpful if it doesn’t provide additional insight into the problem. I’m wondering how this would have been scored if a student simply appealed to context being growth of a population over time, which would imply an exponential model.