- Start with a hand of four cards: {A, 2, 3, 4}
- I'll turn my back and secretly do one of two things:

H_{0}: Leave the Ace in, or

H_{A}: Take the Ace out

- Now shuffle the hand and deal out 3 cards.

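- What is the probability of drawing {2,3,4} if I left the Ace in (H_{0})? Note that under H_{0} the possible draws are {{A,2,3}, {A,2,4}, {A,3,4}, {2,3,4}}, all equally likely after a shuffle, so the probability of seeing {2,3,4} would be P = f/N = 1/4 = 0.25.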

Conclusion: If I draw {2,3,4} then we have some evidence that I did change the deck (H_{A}) -- because it's unlikely to see that result if I didn't (P = 0.25).
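If you want to double-check the count, here is a minimal sketch in Python that lists every 3-card deal under H_{0} and computes P = f/N. The variable names (`hand`, `deals`, `observed`, `p_value`) are just illustrative choices, and the sketch assumes every deal is equally likely after the shuffle.

```python
from fractions import Fraction
from itertools import combinations

# The hand under H0 (Ace left in).
hand = ["A", "2", "3", "4"]

# Every 3-card deal from the 4-card hand; order does not matter,
# and each deal is assumed equally likely after a fair shuffle.
deals = [frozenset(c) for c in combinations(hand, 3)]

# The result we care about: a deal with no Ace in it.
observed = frozenset({"2", "3", "4"})

# P = f / N: favorable outcomes over total outcomes.
p_value = Fraction(deals.count(observed), len(deals))

print(len(deals))  # 4
print(p_value)     # 1/4
```

Under H_{A} the hand is just {2,3,4}, so the same deal comes up every time; the 0.25 above is only the chance of seeing it if the Ace was left in.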

Now -- You can actually demonstrate this and ask the class each time whether they think I left the Ace in or took it out. I'd recommend 3 run-throughs: leave it, leave it, then take it out. (In the last case, also ask: Is it possible that I left the Ace in?) In practice, you should probably hold the cards against the otherwise full box, so it isn't obvious if your hand becomes empty in the take-it-out case. (And otherwise practice the prestidigitation in advance so your handwork doesn't give it away.)

Open Question: Should I actually reveal to the class which one I did each time (for confirmation), or leave that as a mystery (modeling real-world usage)?

I'm struggling with your suggested demonstration, and I think it's because you mention hypothesis testing for means and then proceed with a demonstration that isn't about means. Also, I don't think it's as simple as defining a population then considering all the possible samples. That might make for an effective demonstration, but (as I understand it) hypothesis testing is totally unaware of the size of the population (i.e., your samples of 3 cards do not "know" they're sampling a population of 4 cards). By trying to define all possible samples, I fear students might be misled about the population-sample relationship in hypothesis testing and the theoretical nature of a sampling distribution.

I'm glad you're making me think about this, because in my limited experience I haven't used much to explain the concept other than drawings of overlapping sampling distributions, and the general explanation that lots of overlap means higher p-values, and little overlap means small p-values. I'm guessing there might be some computer simulations that would be helpful, but I haven't explored enough (yet) to find them.

Raymond -- Thanks for the comment, really good stuff to think about!

Now, I actually think one of the advantages here is to have an example that is about something other than testing a population mean. One of the things I struggle with in the introductory class is trying to communicate that the concepts of confidence intervals and hypothesis tests apply to a whole universe of parameters other than just a mean (median, standard deviation, proportion, odds ratio, etc.). So dealing with those general concepts in isolation, prior to introducing the machinery of means-testing, I think might give valuable added perspective.

And I think that part of the demonstration is that somehow you do indeed have to categorize all possible sampling results under the null hypothesis. For this brief example, you can list them individually. For the case of a mean from an unknown population, the analogy is to use the Central Limit Theorem, and conclude that the sample means are at least approximately normally distributed (for a sufficiently large sample). So there is a correspondence there that I'm consciously trying to highlight.
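As a rough illustration of that correspondence, here is a small simulation sketch in Python. Everything specific in it is an assumption made for the example and not part of the card demonstration: the population is a fair six-sided die (chosen because it is clearly not normal), the sample size is 30, and 10,000 samples are drawn. Instead of listing every possible sample, it simulates many of them and compares the spread of the sample means with what the Central Limit Theorem predicts.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical, deliberately non-normal population: a fair six-sided die.
population = [1, 2, 3, 4, 5, 6]
mu = statistics.mean(population)       # population mean
sigma = statistics.pstdev(population)  # population standard deviation

n = 30           # sample size (assumed "sufficiently large")
trials = 10_000  # number of simulated samples

# Simulated sampling distribution of the mean: draw a sample of size n
# with replacement, record its mean, and repeat.
sample_means = [
    sum(random.choices(population, k=n)) / n for _ in range(trials)
]

# The CLT predicts the sample means center on mu with
# standard error sigma / sqrt(n).
predicted_se = sigma / n ** 0.5
observed_se = statistics.stdev(sample_means)

print("mean of sample means:", round(statistics.mean(sample_means), 3))
print("observed standard error:", round(observed_se, 3))
print("predicted standard error:", round(predicted_se, 3))
```

A histogram of `sample_means` should look roughly bell-shaped, which plays the same role here that the list of four possible deals plays in the card example.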