Balkinization  

Tuesday, January 06, 2009

The Art of SATergy

Ian Ayres

Crosspost from Freakonomics:

My son took the SSAT exam this past Saturday. And while I was sitting in the Choate athletic facility waiting for him to finish, I remembered that Avinash Dixit and Barry Nalebuff’s new book, The Art of Strategy, has a great example concerning standardized testing. Game theory is so powerful it can help you figure out the correct answer without even knowing what the question is.


Consider the following question for the GMAT (the test given to MBA applicants). Unfortunately, issues of copyright clearance have prevented us from reproducing the question, but that shouldn’t stop us.



Which of the following is the correct answer?


a) 4π sq. inches


b) 8π sq. inches


c) 16 sq. inches


d) 16π sq. inches


e) 32π sq. inches


O.K., we recognize that you’re at a bit of a disadvantage not having the question. Still, we think that by putting on your game-theory hat you can still figure it out.


Before reading their analysis, take a shot at trying to reason your way to the correct answer.



Here’s what they said:



The odd answer in the series is c. Since it is so different from the other answers, it is probably not right. The fact that the units are in square inches suggests an answer that has a perfect square in it, such as 4π or 16π.


This is a fine start and demonstrates good test-taking skills, but we haven’t really started to use game theory. Think of the game being played by the person writing the question. What is that person’s objective?


He or she wants people who understand the problem to get the answer right and those who don’t to get it wrong. Thus wrong answers have to be chosen carefully so as to be appealing to folks who don’t quite know the answer. For example, in response to the question: “How many feet are in a mile?” an answer of “Giraffe,” or even 16π, is unlikely to attract any takers.


Turning this around, imagine that 16 square inches really is the right answer. What kind of question might have 16 square inches as the answer but would lead someone to think 32π is right? Not many. People don’t often go around adding π to answers for the fun of it. “Did you see my new car — it gets 10π miles to the gallon.” We think not. Hence we can truly rule out 16 as being the correct solution.


Let’s now turn to the two perfect squares, 4π and 16π. Assume for a moment that 16π square inches is the correct solution. The problem might have been: “What is the area of a circle with a radius of 4?” The correct formula for the area of a circle is πr². However, the person who didn’t quite remember the formula might have mixed it up with the formula for the circumference of a circle, 2πr. (Yes, we know that the circumference is in inches, not square inches, but the person making this mistake would be unlikely to recognize this issue.)



Note that if r = 4, then 2πr is 8π, and that would lead the person to the wrong answer of b. The person could also mix and match and use the formula 2πr², and hence believe that 32π or e was the right answer. The person could leave off the π and come up with 16 or c, or the person could forget to square the radius and simply use πr as the area, leading to 4π or a. In summary, if 16π is the correct answer, then we can tell a plausible story about how each of the other answers might be chosen. They are all good wrong answers for the test maker.


What if 4π is the correct solution (so that r = 2)? Think now about the most common mistake: mixing up circumference with area. If the student used the wrong formula, 2πr, he or she would still get 4π, albeit with incorrect units. There is nothing worse, from a test maker’s perspective, than allowing the person to get the right answer for the wrong reason. Hence 4π would be a terrible right answer, as it would allow too many people who didn’t know what they were doing to get full credit.


At this point, we are done. We are confident that the right answer is 16π. And we are right. By thinking about the objective of the person writing the test, we can suss out the right answer, often without even seeing the question.


Now, we don’t recommend that you go about taking the GMAT and other tests without bothering to even look at the questions. We appreciate that if you are smart enough to go through this logic, you most likely know the formula for the area of a circle. But you never know. There will be cases where you don’t know the meaning of one of the answers or the material for the question wasn’t covered in your course. In those cases, thinking about the testing game may lead you to the right answer.
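
For readers who want to check the excerpt's elimination argument mechanically, here is a minimal Python sketch. It assumes the unseen question asks for the area of a circle of radius r, as the excerpt guesses. The names CHOICES, FORMULAS, analyze, and is_good_question are made up for this illustration and do not come from the book.

```python
# Answer choices from the excerpt, stored as (coefficient, includes_pi)
# so that no floating-point comparison is needed.
CHOICES = {
    "a": (4, True),
    "b": (8, True),
    "c": (16, False),
    "d": (16, True),
    "e": (32, True),
}

# Formulas a test taker might apply when asked for the area of a circle.
# Only the first is correct; the others are the slips named in the excerpt.
FORMULAS = {
    "area (pi r^2)": lambda r: (r * r, True),
    "circumference (2 pi r)": lambda r: (2 * r, True),
    "mixed up (2 pi r^2)": lambda r: (2 * r * r, True),
    "dropped pi (r^2)": lambda r: (r * r, False),
    "forgot to square (pi r)": lambda r: (r, True),
}


def analyze(r):
    """Map each formula to the answer letter it produces, or None if no choice matches."""
    by_value = {value: letter for letter, value in CHOICES.items()}
    return {name: by_value.get(formula(r)) for name, formula in FORMULAS.items()}


def is_good_question(r):
    """True if every choice is explained by some formula and no slip hits the right answer."""
    result = analyze(r)
    correct = result["area (pi r^2)"]
    slips = [letter for name, letter in result.items() if name != "area (pi r^2)"]
    return (
        correct is not None
        and correct not in slips
        and set(result.values()) == set(CHOICES)
    )


if __name__ == "__main__":
    for radius in (4, 2):
        print(radius, analyze(radius), is_good_question(radius))
```

Under these assumptions, a radius of 4 sends each of the five formulas to a different choice, while a radius of 2 lets the circumference slip land on the correct answer, which is the excerpt's reason for rejecting 4π.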


If you want a fun way to learn a ton of useful game theory, this is the book for you. How good is it? Steve Levitt has a blurb on the book saying it’s so good, he read it twice.


Comments:

I've long used this kind of strategy when I didn't know a multiple choice test answer. More frequently I used it when doing practice questions in class--AP calculus stands out in my memory for this. On a lot of these tests you can almost always winnow your way down to three answers, usually can winnow your way down to two answers, and often only one. Tests are rarely designed with test-taking strategies taken into account.

Occasionally standardized multiple choice tests are well designed so as to avoid this kind of thing.
 

This would be a better example if your symbols were more readable. The pi looks like an "n". In that case, "c" is a plausible answer.
 

Tests are rarely designed with test-taking strategies taken into account.

Perhaps it's better to say that tests are rarely designed to block such test-taking strategies. I encourage my students to develop such strategies, because it's a vital survival skill in the ivy jungle. Of course, it doesn't help them as much on short answer / essay questions...

Frankly, in searching for the answer to the above unknown question, we needed to deploy knowledge that would have been quite sufficient to answer correctly if we had read the question to start.
 

Heh, all the way through that I thought the pi was an "n". Right up until I read Mark's comment. But that misunderstanding made absolutely no difference in my analysis. Which is an interesting bit in support of the general thesis. I'm underwhelmed however. The correct answer was obvious at a glance and is second nature to anyone who has ever thought about multiple-choice tests (i.e., anyone with a brain who has taken them). The lengthy analysis was interesting, but not required. It boils down to: The "n" is obviously part of the right answer (excluding the "16"), because "n" appears in 4 of 5 choices. The "16n" is the right answer, because they try to fool you with the "16".
 

The analysis rests on the assumptions (which were not given in the problem set-up, btw) that the test-constructor

a) knows the subject with absolute precision
b) is willing to spend the time and effort to make the problem "hard" to answer correctly.

In my experience, both of these conditions seldom hold.

Any experienced teacher understands that constructing and giving an exam is merely an exercise in probability and statistics.

A strategy to defeat the game-theoretic analysis is simply to give the questions to a group of college students (in remedial classes, probably) without the answers, and take the four most common incorrect answers as alternatives to the right answers for your future multiple-choice question.
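
A rough Python sketch of that procedure follows. The response tallies are invented purely for illustration and are not real student data, and pick_distractors is a hypothetical helper name.

```python
from collections import Counter


def pick_distractors(responses, correct, k=4):
    """Return the k most common wrong answers from free-response submissions."""
    wrong = Counter(answer for answer in responses if answer != correct)
    return [answer for answer, _count in wrong.most_common(k)]


if __name__ == "__main__":
    # Made-up tallies for "area of a circle with radius 4", answered without choices.
    responses = (
        ["16π"] * 6 + ["8π"] * 5 + ["32π"] * 4 + ["16"] * 3 + ["4π"] * 2 + ["64π"] * 1
    )
    print(pick_distractors(responses, correct="16π"))
```

Because the wrong answers come from what students actually wrote rather than from the test maker's guesses about their mistakes, there is less of a pattern for a game theorist to read.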
 

The correct formula for the area of a circle is (pi)r². However, the person who didn't quite remember the formula might have mixed it up with the formula for the circumference of a circle, 2πr.

What if they mix it up with the formula for the surface area of a sphere (which makes more sense than mixing it up with the circumference)? If the radius is 2 then the student might think the answer is 16(pi) when the correct answer is 4(pi). If one uses game-theory, and the assumption that the student knows the difference between square inches and linear inches, the correct guess should be "a". "d" is just there to confuse students who use the formula for the surface of a sphere.
 

You're testing the device driver you're writing to make sure it conforms to power saver standards. It works fine except every now and then it doesn't wake up from a suspend. What is the most productive debugging strategy?

Bonus question: what admirable test taking skills apply here? Answer: none. This is the real world. Your code doesn't give a rat's ass how close you come to finding the bug without finding it. Your driver will remain unreliable until you do.

In short, at the risk of impermissibly widening the discussion, I'd like to rain a little on the testing parade. And suggest that a good course has the student solve real world problems.
 

I'd add to jpk's point that I interpret this post as undermining the whole project of standardized tests. If knowing the right answer isn't essential, then what are we actually testing?
 

I'd add to jpk's point that I interpret this post as undermining the whole project of standardized tests. If knowing the right answer isn't essential, then what are we actually testing?

# posted by Mark Field : 3:12 PM


If you could figure out the answer to every question using game theory, and do it within the timeframe allowed for the test, then the testing would be a waste.

However, using game theory takes far too long, and is not nearly as reliable as actually knowing, or being able to figure out, the answers. All it does is make your guesses more likely to be right. Maybe.
 

If we did have to solve all the questions with game theory, then I'd agree. What usually happens, though, is that we know the answer to the majority of questions and can answer them fairly quickly. Game theory, though it takes longer, could allow someone to get more correct answers on the remainder than chance alone.

Whether this is good or bad depends on what you're trying to test. If you want to know whether students have mastered the subject, then the answers achieved in some alternate way will distort the results. If you want to test the gamesmanship of the test-takers, then fine.
 

C2H5OH (and Prof. Ayres):

Any experienced teacher understands that constructing and giving an exam is merely an exercise in probability and statistics.

A strategy to defeat the game-theoretic analysis is simply to give the questions to a group of college students (in remedial classes, probably) without the answers, and take the four most common incorrect answers as alternatives to the right answers for your future multiple-choice question.


This is one approach (which may well work as well as any other).

How about meta-"game" theory that takes into account those people that use these ad hoc methods, and throws in answers that would be attractive to those who 'know' game theory and test construction strategy but who don't actually know the subject matter? I'd think any properly constructed alternative incorrect answers should take such "gaming" into account, and throw in some answers that will hook these people as well. Then the "gaming theory" people will have to evaluate a "likely" answer as being either correct, or as being just one meta-level above the other superficially "reasonable" but incorrect ones.

And then it's turtles all the way down....

Cheers,
 

Mark, I think the SATs try to discourage guessing/gaming by punishing you more for wrong answers than for no answer.
 

Arne,

Your suggestion inevitably leads to what we might term a "Vizzini infinite recursion", while my suggestion is more in the style of the man in black, in that it cannot be gamed in general.

Using game theory on an SAT question is, as BB pointed out, like swatting gnats with a scoop shovel. What I would like to see is an analysis of a "real-world" problem, say the Hamas/Israel struggle.

After all, if the thesis of the discussion is that detailed knowledge of the subject matter is rendered unnecessary by using game theory and simplistic assumptions about the motivations of the players, then surely we have an example before us.
 

Mark, I think the SATs try to discourage guessing/gaming by punishing you more for wrong answers than for no answer.

They do penalize for guessing, but if gaming the test allows you to better the odds, then your score will improve beyond mere chance even though you don't actually know the answer.
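
The arithmetic behind this exchange can be sketched in a few lines of Python. The scoring rule is an assumption, not something stated in the thread: one point for a right answer, a quarter point off for a wrong one, and nothing for a blank on a five-choice question. The function names are hypothetical.

```python
def expected_score(p_correct, penalty=0.25):
    """Expected points from answering: +1 with probability p_correct, -penalty otherwise."""
    return p_correct - (1.0 - p_correct) * penalty


def break_even(penalty=0.25):
    """Chance of being right at which answering scores the same as leaving a blank (zero)."""
    return penalty / (1.0 + penalty)


if __name__ == "__main__":
    # Blind guess among five, then guesses after ruling out two or three choices.
    for remaining in (5, 3, 2):
        print(remaining, round(expected_score(1.0 / remaining), 4))
    print("break-even probability:", break_even())
```

On that assumed rule a blind guess among five choices nets zero on average, and the guess pays only if reasoning about the test maker lifts your chance of being right above one in five. If a wrong assumption pushes it below that, the trap costs points.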
 

C2H5OH:

Your suggestion inevitably leads to what we might term a "Vizzini infinite recursion", ...

... and as I said, "It's turtles all the way down". ;-)

Cheers,
 

They do penalize for guessing, but if gaming the test allows you to better the odds, then your score will improve beyond mere chance even though you don't actually know the answer.

# posted by Mark Field : 6:10 PM


It may improve your odds over guessing blindly, but probably not enough to be worth the risk. I'm not saying that is definitely the case, but I suspect it is. In fact, I strongly suspect it isn't an improvement on blind guessing. In this example the professor finds the answer by making several assumptions about the test preparation that very well might be false. One wrong assumption and you might actually be hurting your chances of gaming the right answer. You might be walking into a trap.
 

One wrong assumption and you might actually be hurting your chances of gaming the right answer. You might be walking into a trap.

Like this:

Man in Black: All right. Where is the poison? The battle of wits has begun. It ends when you decide and we both drink, and find out who is right... and who is dead.
Vizzini: But it's so simple. All I have to do is divine from what I know of you: are you the sort of man who would put the poison into his own goblet or his enemy's? Now, a clever man would put the poison into his own goblet, because he would know that only a great fool would reach for what he was given. I am not a great fool, so I can clearly not choose the wine in front of you. But you must have known I was not a great fool, you would have counted on it, so I can clearly not choose the wine in front of me.
Man in Black: You've made your decision then?
Vizzini: Not remotely. Because iocane comes from Australia, as everyone knows, and Australia is entirely peopled with criminals, and criminals are used to having people not trust them, as you are not trusted by me, so I can clearly not choose the wine in front of you.
Man in Black: Truly, you have a dizzying intellect.
Vizzini: Wait til I get going! Now, where was I?
Man in Black: Australia.
Vizzini: Yes, Australia. And you must have suspected I would have known the powder's origin, so I can clearly not choose the wine in front of me.
Man in Black: You're just stalling now.
Vizzini: You'd like to think that, wouldn't you? You've beaten my giant, which means you're exceptionally strong, so you could've put the poison in your own goblet, trusting on your strength to save you, so I can clearly not choose the wine in front of you. But, you've also bested my Spaniard, which means you must have studied, and in studying you must have learned that man is mortal, so you would have put the poison as far from yourself as possible, so I can clearly not choose the wine in front of me.
Man in Black: You're trying to trick me into giving away something. It won't work.
Vizzini: IT HAS WORKED! YOU'VE GIVEN EVERYTHING AWAY! I KNOW WHERE THE POISON IS!
Man in Black: Then make your choice.
Vizzini: I will, and I choose - What in the world can that be?
Vizzini: [Vizzini gestures up and away from the table. Roberts looks. Vizzini swaps the goblets]
Man in Black: What? Where? I don't see anything.
Vizzini: Well, I- I could have sworn I saw something. No matter. First, let's drink. Me from my glass, and you from yours.
Man in Black, Vizzini: [they drink]
Man in Black: You guessed wrong.
Vizzini: You only think I guessed wrong! That's what's so funny! I switched glasses when your back was turned! Ha ha! You fool! You fell victim to one of the classic blunders! The most famous is never get involved in a land war in Asia, but only slightly less well-known is this: never go in against a Sicilian when death is on the line! Ha ha ha ha ha ha ha! Ha ha ha ha ha ha ha! Ha ha ha...
Vizzini: [Vizzini stops suddenly, and falls dead to the right]
Buttercup: And to think, all that time it was your cup that was poisoned.
Man in Black: They were both poisoned. I spent the last few years building up an immunity to iocane powder.
 

Back in the real world, and on the verbal, not math, portion of the test, it's occasionally the case that TWO of the answers could be "right", depending on how you approach the question. Pretty much have to game it then, or just give up and pick randomly between them.

It has even happened on the math portion of the SAT, once or twice, IIRC: a geometry question having to do with how many surfaces were left when you assembled certain solids, where you had to decide whether they meant two original surfaces which ended up adjacent and co-planar to count as one or two.
 

C2H5OH said:

After all, if the thesis of the discussion is that detailed knowledge of the subject matter is rendered unnecessary by using game theory and simplistic assumptions about the motivations of the players, then surely we have an example before us.

Isn't this how most economic advisors operate?

a) detailed knowledge of the subject matter is ... unnecessary CHECK!

b) simplistic assumptions about the motivations of the players CHECK!

Oh, I forgot; these are economists advising us on how to take tests.
 
