Sunday, February 22, 2009

Interpreting Polls: An AngryMath Open Letter

Below, I present the very last question from the last test in the statistics class that I teach, from two weeks ago:

An online forum says "Most of our members are not in favor of switching to a new product. We polled 900 members, and only 37% are in favor of switching (margin of error +/-3%)." Orius reads this and responds, "That poll doesn't convince me of anything. This forum has over 74,000 members, so that's only a small fraction of forum members that you polled." Do you agree with Orius' reasoning? Explain why you do or do not agree. Refer to one of our statistical formulas in your explanation.

Here is the best possible answer to that test question:
No, Orius is mistaken. Population size is not a factor in the margin-of-error formula: E = z*σ/√n (z-score from desired confidence level, σ population standard deviation, n sample size).

Now, I make a point of asking a question like this right at the end of my statistics class because it's an enormously common criticism of survey results. It's also enormously flat-out wrong. (In this case, the quotes I used in the test question were copied directly from a discussion thread at gaming site ENWorld from last year.)

Two weeks later, I get up on Sunday morning and eat a donut while reading famed technical news site Slashdot. Here's what I get to read in an article summary on the front page:
Adobe claims that its Flash platform reaches '99% of internet viewers,' but a closer look at those statistics suggests it's not exactly all-encompassing. Adobe puts Flash player penetration at 947 million users out of a total 956 million internet-connected devices, but the total number of PCs is based on a forecast made two years ago. What's more, the number of Flash users is based on a questionable internet survey of just 4,600 people — around 0.0005% of the suggested 956,000,000 total. Is it really possible that 99% penetration could have been reached?

Below, I present my open response to this Slashdot summary:

What's more, the number of Flash users is based on a questionable internet survey of just 4,600 people — around 0.0005% of the suggested 956,000,000 total.

That's the single dumbest thing you can say about polling results. Two weeks ago I asked this very question on the last test of the statistics class I teach. Neither the population size nor the sampling fraction (the share of the population surveyed) is in any way a factor in the accuracy of a poll.

Opinion polling margin of error is computed as follows (95% level of confidence): E = 1/sqrt(n) = 1/sqrt(4600) = +/-1.5% (approximately). So from this information alone, the actual percent of Flash users is 95% likely to be somewhere between about 97.5% and 100%. Again, note that population size is not a factor in the formula for margin of error.
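To make the point concrete, here is a minimal Python sketch of the two margin-of-error calculations. The function names are made up for illustration. The 1/sqrt(n) shortcut is the usual 95% rule of thumb (z of about 2 and the worst-case proportion p = 0.5); the longer version uses z = 1.96. Neither function takes a population-size argument, because neither formula has one.

```python
import math

def quick_margin(n):
    """Rule-of-thumb 95% margin of error: 1/sqrt(n)."""
    return 1 / math.sqrt(n)

def margin_of_error(n, z=1.96, p=0.5):
    """Margin of error for a sample proportion, E = z * sqrt(p(1-p)/n).

    p = 0.5 is the worst case (largest margin); note there is no
    population-size parameter anywhere in the formula.
    """
    return z * math.sqrt(p * (1 - p) / n)

# The two sample sizes discussed in this post.
for n in (900, 4600):
    print(f"n={n}: quick={quick_margin(n):.4f}, z-based={margin_of_error(n):.4f}")
```

Running this for n=900 and n=4600 covers both the forum poll and the Flash survey; whether the population is 74,000 or 956,000,000 changes nothing in either calculation.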

As a side note, polling calculations are actually most accurate when the population size is infinite (that's one of the standard mathematical assumptions in the model). If anything, a complication arises when the population gets too small: a correction formula can be applied once the sampling fraction rises above 5% of the population or so.
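As a rough sketch of that correction, assume it means the standard finite population correction factor, sqrt((N - n)/(N - 1)), which multiplies the margin of error. That is an assumption here, since the formula isn't spelled out above, and the function name is made up for illustration.

```python
import math

def finite_population_correction(population, n):
    """Standard finite population correction factor sqrt((N - n)/(N - 1)).

    Multiply the usual margin of error by this factor; it only matters
    when the sample is a sizable fraction of the population.
    """
    return math.sqrt((population - n) / (population - 1))

# Flash survey: 4,600 out of 956,000,000 (factor is essentially 1).
print(finite_population_correction(956_000_000, 4_600))
# Forum poll: 900 out of 74,000 (still very close to 1).
print(finite_population_correction(74_000, 900))
# A sample of half the population: here the correction is substantial.
print(finite_population_correction(1_000, 500))
```

The factor is always at most 1, so a small population can only shrink the margin of error; a tiny sampling fraction leaves it untouched.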

There might be other legitimate critiques of any poll (like perhaps a biased sampling method). But a small sampling fraction is not one of them. It's about as ignorant a thing as you can say when interpreting poll results (on the order of "the Internet is not a truck").

http://en.wikipedia.org/wiki/Margin_of_error#Effect_of_population_size

Thursday, February 19, 2009

Oops: Margins of Error

Yesterday I posted an AngryMath blog about polling margin of error, asserting that the following claim was invalid: "If Candidates A and B differ by a number less than the margin of error, you can't be sure who is really ahead."

Well, it turns out that was a mistake on my part. As I groggily woke up this morning (the time when most of my best thinking occurs), I realized I'd made a mistake with a hidden assumption that the percentages of people supporting Candidates A and B were independent... when obviously (on reflection) they're not; in the simplest example every vote for A is a vote taken away from B. You can take A's support and directly compute B's support.

So if I do a proper hypothesis test with this understanding (H0: pA = 0.5 versus HA: pA > 0.5), with polling size n=100 and 55% polled support for A (as an example), I get a test statistic of z = 1.0 and a P-value of P = 0.1587 (well above the cutoff of alpha = 0.05 at the 95% confidence level), showing indeed that we cannot reject the null hypothesis.
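For anyone who wants to check that arithmetic, here is a small Python sketch of the one-sided z-test for a proportion. The function name is hypothetical, and the normal tail probability is computed with the standard library's complementary error function rather than a z-table.

```python
import math

def one_sided_proportion_test(p_hat, n, p0=0.5):
    """One-sided z-test of H0: p = p0 versus HA: p > p0.

    Returns the z statistic and the upper-tail P-value.
    """
    standard_error = math.sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / standard_error
    # Upper-tail area of the standard normal: P(Z > z) = erfc(z / sqrt(2)) / 2
    p_value = 0.5 * math.erfc(z / math.sqrt(2))
    return z, p_value

z, p_value = one_sided_proportion_test(0.55, 100)
print(f"z = {z:.2f}, P = {p_value:.4f}")
print("reject H0" if p_value < 0.05 else "cannot reject H0")
```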

In short, it turns out that the statement "A and B are within the margin of error, so we can't be sure exactly who is ahead", is actually correct at the same level of confidence as the margin of error was reported. In fact, to extend that result, there will be a window even if A and B are beyond the margin of error where you still can't pass a hypothesis test to conclude who is really ahead. (Visually, intervals formed by the margins-of-error overlap a little bit too much.)

Mea culpa. I removed the erroneous post from yesterday and left this one.

Friday, February 13, 2009

Basic Teaching Motivation

I'm constantly obsessing about the best, most important thing I can deliver at the very beginning of the very first meeting of any class. In the past I've basically said that "Abstraction: Familiarity and Use" is the single overarching principle that I'm teaching in all my classes (math or CS), and therefore that should be the introductory lecture, in some sense, in every single class. I think now I might need to get a bit more topically specific for each class.

For the intermediate algebra class that I regularly teach (which is truly an enormous challenge for most of the students I get), I'm considering this very short mission statement: "Can you follow rules? (Can you remember them?)"

(Here's how I might develop this:) When I say that, I don't mean to come off as some kind of control freak. There are both Good and Bad rules in the world. You should take a philosophy course or some kind of ethical training to identify for yourself what rules are Good (and effective, and you should dedicate yourself to following), and what rules are Bad (and you should dedicate yourself to challenging and overthrowing).

But this course is specifically about the skill of, when you're handed a Good rule, do you have the capacity to quickly digest it and remember it and follow it? If you can't do that, then you're not allowed to graduate from college. The purpose is twofold: (1) testing and training in following rules in general, and (2) an introduction to mathematical logic in specific. The first is a requirement before you're expected to be given responsibility in any professional environment. The second gives you a platform to understand principles of mathematics, which are usually the best, most effective, and most powerful rules that we know of.

So, if you can't follow rules, or if you simply can't remember them, it will be frankly impossible to pass a course like this, and you'll get trapped into a cycle of taking this course over and over again without success.

(Honestly, as an aside, I think the primary challenge to students in my intermediate algebra course is simply an incapacity to remember things from day to day. I know now that we can literally end one day with a certain exercise, and have everyone able to do it, and start the very next day with the exact same exercise and have half the classroom unable to do it.)

I conclude, as I've expressed before, with a possible epitaph:

I want to foster a sense of justice.
A love of following rules that are good.
A love of destroying rules that are bad.

Division By Zero

Usually in my intermediate algebra classes someone has to ask "why is division by zero undefined?" This is really highly desirable, because we usually want a moment of discussion to fix the fact for the majority of students who either: (a) have never heard of it before, (b) didn't remember it, (c) don't think it makes sense, or (d) flat-out disagree with it.

One problem that occurs to me now is that, when discussing it extemporaneously, I myself tend to forget that there are two very separate and distinct cases involved: dividing any run-of-the-mill number by zero, versus dividing zero by zero. I think the following is a pretty complete proof:

All variables below are in some set U with multiplication.
Define: Zero (0) as, for all x: 0x = 0.
Define: Division (a/b) as the solution of a = bx (that is, a/b=x iff a=bx).
To prove that division by zero is undefined, let's assume the opposite: there exists some number n such that n/0 = x, that is, n = 0x (def. of division), where x is a unique element of U.
Case 1: Say n≠0. Then n = 0x => n=0 (def. of zero). This is a contradiction that shows this case cannot really occur.
Case 2: Say n=0. So 0 = 0x (def. of division), which is true for all x (def. of zero). Thus x is not a unique number, contradicting our initial assumption.

To summarize: If you try to divide any nonzero number by zero, the answer is "Not A Number". However, if you try to divide zero by zero, the answer is "All Numbers Simultaneously".

(Side note: I guess there's an interesting gap in what I just wrote there, that maybe in Case #2 if the set U just has one single element [zero], then actually you can identify x=0, and do in fact have definable division by zero for that one, degenerate case. Hmm, never noticed that.)

Anyway, that's a fairly intricate amount of basic logic for my intermediate "didn't even know about division by zero" algebra students (2 kind-of-exotic definitions, two cases from an implied "or" statement, and doubled proof by contradiction). So I think what I wind up doing, to give them at least some sense of it, is to consider a single numerical example and gloss over the secondary n=0 case:

Say x = 6/0.
So: x*0 = 6 <-- Multiply both sides by zero. Violates rule (for all x: x*0 = 0), so x is not any number (NAN). (Of course, normally you can't multiply both sides of an equation by zero. But you could if division by zero were defined [see degenerate case above, for example] and that actually is the assumption here. It's a sneaky proof-by-contradiction, for one numerical case, without calling it out as such.)

Now, I pick the number "6" above for a very specific reason; if that still leaves someone unconvinced, it gives me the option to do the following. Take six bits of chalk (or torn-up pieces of paper, or whatever), put them in my hand, and write 6/2 = ? on the board. Ask one student to take two pieces out, then another student, and another, until they're all gone. How many students can I do this to before the chalk is gone? That's the answer to the division problem: 3 (obviously enough).

Now take back the chalk and write 6/0 = ? on the board. Ask one student to take zero pieces out, and then another, and another. How many students can I do this to before the chalk is gone? That's the answer to this division problem: Infinitely many, which is (again) not a number defined in our real number system. (If we have a short discussion about infinity at this point, and even if I leave students with the message "I want to say this should be infinity", I think that's okay.)

This is all part of an early lecture on basic operations with 1 and 0, serving to (a) remind students who routinely trip up over them, (b) serve as a foundation for simplifying exercises (why you're expected to simplify x*1 but not x+1), and (c) serve as an example of how much more convenient/fast it is to express rules in algebraic notation, versus regular English. It's also around that time that we have to categorize different sets of numbers, sometimes resulting in the question "what's not in the set of Real numbers?", for which I recommend the examples of infinity (∞) and the square root of a negative number (like √(-4)). So, a lot of those ideas segue together.

Mathematicus Fhtagn

Routinely I wake up from dreams, explaining myself to someone in this fashion: I'm trying to wrap up my master's degree. I am always "wrapping up", in the last year of that program. It's like some part of my brain, bent into a closed loop, perpetually running on a treadmill; a Sisyphean ordeal, unable to finish. That is, in some sense, where my heart always is.

Tuesday, February 10, 2009

Dice Distributions

I got a bit obsessed with finding complete pictures for a sequence of sum-of-dice distributions. I finally completed a nice spreadsheet with charts of the distributions of everything from 1d6 to 10d6, here:

http://www.superdan.net/download/DiceSamples.xls

It provides a nice picture of the evolution of the distribution, as more dice are added, into one that (a) more closely matches a normal curve, as per the Central Limit Theorem, and also (b) gets narrower and narrower, as the standard deviation of the dice average falls. I printed out the first page (n=1 to 3) for my statistics class, in an attempt to intuitively anticipate the CLT.

The other nice thing here is that all the numbers come out of a programmed macro function for summed dice frequency (which I picked up from the Wikipedia article on Dice, and I wanted to see implemented in code): F(s,i,k) = sum n=0 to floor((k-i)/s): (-1)^n * choose(i, n) * choose(k-s*n-1, i-1).
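For readers without the spreadsheet, here is a minimal Python sketch of the same formula. It reads s as the number of sides, i as the number of dice, and k as the target sum (the line above doesn't spell these out, so treat that reading as an assumption). The function name is made up, and this is not the spreadsheet macro itself.

```python
from math import comb  # requires Python 3.8+

def dice_sum_frequency(s, i, k):
    """Number of ways that i dice with s sides each can sum to k.

    F(s,i,k) = sum over n = 0..floor((k-i)/s) of
               (-1)^n * C(i, n) * C(k - s*n - 1, i - 1)
    """
    if k < i or k > s * i:
        return 0  # sum is out of range for this many dice
    return sum(
        (-1) ** n * comb(i, n) * comb(k - s * n - 1, i - 1)
        for n in range((k - i) // s + 1)
    )

# Frequencies and probabilities for the sum of 2d6.
sides, dice = 6, 2
for total in range(dice, sides * dice + 1):
    ways = dice_sum_frequency(sides, dice, total)
    print(total, ways, round(ways / sides ** dice, 4))
```

Dividing each frequency by s^i (the total number of equally likely outcomes) turns the counts into probabilities.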

http://en.wikipedia.org/wiki/Dice#Probability

Sunday, February 8, 2009

Sharp-Edged Dice

Over on my gaming blog, I wrote a post on how to test for balanced dice (including any common polyhedral dice used in role-playing games, such as 4, 6, 8, 12, and 20-sided dice). Basically it's an application of Pearson's chi-squared test, and I computed a rejection cutoff value for the SSE (sum-squared error) at a 5% significance level, if you're rolling the die in question 5 times per side. I hadn't seen that specifically presented anywhere else, and I really wanted to see it in a simplified format. I won't repeat it here, but it's over in my gaming blog at the link below:
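As a rough illustration of the idea (not the table from the linked post), here is a Python sketch of the test. It assumes every face has the same expected count, so Pearson's statistic reduces to SSE divided by that expected count. The critical values are the standard 5% entries from a chi-squared table for the relevant degrees of freedom (sides minus one). All names are hypothetical.

```python
# Standard 5% critical values of the chi-squared distribution,
# keyed by degrees of freedom (number of sides minus one).
CHI2_CRITICAL_5PCT = {3: 7.815, 5: 11.070, 7: 14.067, 11: 19.675, 19: 30.144}

def sum_squared_error(counts):
    """SSE between observed face counts and the equal expected count."""
    expected = sum(counts) / len(counts)
    return sum((c - expected) ** 2 for c in counts)

def sse_cutoff(sides, rolls_per_side=5):
    """SSE above which the die is rejected as unbalanced at the 5% level.

    With equal expected counts, chi-squared = SSE / expected,
    so the SSE cutoff is expected * critical value.
    """
    return rolls_per_side * CHI2_CRITICAL_5PCT[sides - 1]

def looks_unbalanced(counts):
    """True if the observed counts fail the chi-squared test at 5%."""
    sides = len(counts)
    expected = sum(counts) / sides
    chi_squared = sum_squared_error(counts) / expected
    return chi_squared > CHI2_CRITICAL_5PCT[sides - 1]

# Example: a d6 rolled 30 times (5 per side), with made-up counts per face.
observed = [12, 3, 3, 4, 4, 4]
print(sum_squared_error(observed), sse_cutoff(6), looks_unbalanced(observed))
```

The observed counts in the example are invented purely to exercise the functions; real rolls would go in their place.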

http://deltasdnd.blogspot.com/2009/02/testing-balanced-die.html

As a follow-up to that, I tried to do a little research on what game manufacturers do for quality assurance on their dice. While I didn't find that specific information, I did find something that absolutely warmed and delighted my heart. It's the long-time owner of the dice company Gamescience, "Colonel" Lou Zocchi, engaging in a 20-minute rant about how crappy his competitors' dice are, because they use a sand-blasting process that rounds off all the edges of their dice in an irregular fashion, and therefore generates slightly incorrect probability distributions. (As an aside, this actually does match my own independent test of the style of dice he produces.)

Lou's retiring this year and selling the company, but he's still got the fire, and I'll give him an official AngryMath salute for his rant on dice probabilities. Watch it here:

http://www.gamescience.com/