Bayesian or Orthodox: Where do your intuitions fall?
Consider the following scenarios and see what your intuitions tell you. You might reject all the answers or feel attracted to more than one. Real research questions do not have pat answers. See if, nonetheless, you have clear preferences for one or a couple of answers over the others. Almost all answers are consistent either with some statistical approach or with what a large section of researchers do in practice, so do not worry about picking out the one ‘right’ answer (though, given certain assumptions, I will argue that there is one right answer!).
1) You have run the 20 subjects you planned and obtain a p value of .08. Despite predicting a difference, you know this won’t be convincing to any editor and run 20 more subjects. SPSS now gives a p of .01. Would you:
a) Submit the study with all 40 participants and report an overall p of .01?
b) Regard the study as non-significant at the 5% level and stop pursuing the effect in question, as each individual 20-subject study had a p of .08?
c) Use a method of evaluating evidence that is not sensitive to your intentions concerning when you planned to stop collecting subjects, and base conclusions on all the data?
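Why the stopping rule matters for a p value in scenario 1 can be illustrated with a small simulation. The sketch below is not part of the original scenario: it assumes a true null effect, normally distributed scores with a known standard deviation of 1 (so a z-test stands in for the t-test SPSS would run), and a rule of testing at 20 subjects and adding 20 more only if the first test is non-significant. The function names and defaults are illustrative.

```python
import random
from statistics import NormalDist, fmean


def two_sided_p(sample, sd=1.0):
    """Two-sided z-test p value for H0: mean = 0, with sd assumed known."""
    z = fmean(sample) / (sd / len(sample) ** 0.5)
    return 2 * (1 - NormalDist().cdf(abs(z)))


def false_positive_rate(n_sims=20000, n_first=20, n_extra=20, alpha=0.05, seed=1):
    """Share of null-effect studies declared significant under the stopping rule."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        # First batch of subjects; the true effect is zero.
        data = [rng.gauss(0, 1) for _ in range(n_first)]
        # Only collect more subjects if the first test is non-significant.
        if two_sided_p(data) >= alpha:
            data += [rng.gauss(0, 1) for _ in range(n_extra)]
        # Final test on whatever data were collected.
        if two_sided_p(data) < alpha:
            hits += 1
    return hits / n_sims


if __name__ == "__main__":
    print("fixed n = 20:     ", false_positive_rate(n_extra=0))
    print("test, then top up:", false_positive_rate())
```

Under these assumptions the fixed-sample rate should sit near the nominal 5%, while the test-then-top-up rate comes out higher, which is why a p value computed as if 40 subjects had been planned all along does not mean what it appears to mean.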
2) After collecting data in a three-way design you find an unexpected partial two-way interaction: specifically, you obtain a two-way interaction (p = .03) for just the males and not the females. After talking to some colleagues and reading the literature you realise there is a neat way of accounting for these results: certain theories can be used to predict the interaction for the males but they say nothing about females. Would you:
a) Write up the introduction based on the theories leading to a planned contrast for the males, which is then significant?
b) Treat the partial two-way as non-significant, as the three-way interaction was not significant, and the partial interaction won’t survive any corrections for post hoc testing?
c) Determine how strong the evidence of the partial two-way interaction is for the theory you put together to explain it, with no regard to whether you happen to think of the theory before seeing the data or afterwards, as all sorts of arbitrary factors could influence when you thought of a theory?
3) You explore five possible ways of inducing subliminal perception as measured with priming. Each method interferes with vision in a different way. The test for each method has a power of 80% for a 5% significance level to detect the size of priming produced by conscious perception. Of these methods, the results for four are non-significant and one, the Continuous Flash Suppression, is significant, p = .03, with a priming effect numerically similar in size to that found with conscious perception. Would you:
a) Report the test as p = .03 and conclude there is subliminal perception for this method?
b) Note that when a Bonferroni-corrected significance value of .05/5 is used, all tests are non-significant, and conclude subliminal perception does not exist by any of these methods?
c) Regard the strength of evidence provided by these data for subliminal perception produced by Continuous Flash Suppression to be the same regardless of whether or not four other rather different methods were tested?
4) A theory predicts a difference in reaction time between two conditions. A previous study finds a significant difference between the conditions of 20 seconds, with a Cohen’s dz of 0.5. You wish to replicate in your lab. In order to obtain a conventional power of 80% you run 35 subjects and find a t of 1.0 and a p of .32. Would you:
a) conclude that under the conditions of your replication experiment there is no effect?
b) conclude that null results are never informative and withhold judgment about whether there is an effect or not?
c) realise that while 20 seconds is a likely value given the theory being tested, the difference could in fact be 15 seconds either side of this value and still be consistent with the theory. You treat the evidence as inconclusive; e.g. your certainty in the theory might go down modestly from being about 65% to a bit more than 50%, and so you decide to run more subjects until the evidence more strongly supports the null over the theory or the theory over the null?
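The figures in scenario 4 can be checked with a few lines of arithmetic. The sketch below is illustrative only: it approximates the power of a two-sided within-subject test with a normal distribution (the exact value from the noncentral t distribution is slightly lower), and it obtains the p value by numerically integrating the t density with 34 degrees of freedom. The helper names are assumptions, not anything from the original text.

```python
import math
from statistics import NormalDist


def approx_power(dz, n, alpha=0.05):
    """Normal approximation to the power of a two-sided within-subject test."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return NormalDist().cdf(dz * math.sqrt(n) - z_crit)


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p(t, df, steps=2000):
    """Two-sided p value via Simpson's rule on the t density over [0, |t|]."""
    t = abs(t)
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_half = total * h / 3  # area between 0 and |t|
    return 1 - 2 * central_half


if __name__ == "__main__":
    print("approximate power:", round(approx_power(0.5, 35), 2))
    print("p for t = 1.0, df = 34:", round(two_sided_p(1.0, 34), 2))
```

Both numbers should agree with the scenario: power a little above 80% for 35 subjects, and a p value of about .32.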
5) You look up the evidence for a new expensive weight loss pill. Use of the pill resulted in significant weight loss after 3 months of daily ingestion, with a before-after Cohen’s dz of 1.0 with n = 10 subjects giving a p of .01. In addition, you accept that there are no adverse side effects. Would you:
a) Reject the null hypothesis of no change and buy a 3-month supply?
b) Decide 10 subjects does not provide enough evidence to base a decision on when it comes to taking a drug, withhold judgment for the time being, and help sponsor a further study?
c) Decide that in a 3-month period you would like to lose between 10 and 15 kg. In fact, despite the high standardised effect size, the raw mean weight loss in the study was 2 kg. The evidence that the pill uses a mechanism producing 0-10 kg loss (which you are not interested in) rather than 10-15 kg (which you are) is overwhelming. You have sufficient data to decide not to buy the pill?
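One way to put a number on "overwhelming" in answer 5c is to compare how well each range of weight loss predicts the observed 2 kg. The sketch below rests on assumptions that are not in the scenario: each hypothesis is represented by a uniform distribution over its range, and the sample mean is treated as normally distributed with a standard error recovered from dz and n. It is an illustration of the reasoning, not a definitive analysis.

```python
import math


def upper_tail(z):
    """P(Z > z) for a standard normal; erfc stays accurate far into the tail."""
    return 0.5 * math.erfc(z / math.sqrt(2))


def mean_likelihood(obs, se, low, high):
    """Average normal likelihood of obs over a uniform prior on [low, high]."""
    area = upper_tail((low - obs) / se) - upper_tail((high - obs) / se)
    return area / (high - low)


n, dz, mean_loss = 10, 1.0, 2.0      # values given in the scenario
sd_diff = mean_loss / dz             # dz = mean difference / SD of differences
se = sd_diff / math.sqrt(n)          # standard error of the mean loss
t = mean_loss / se                   # equals dz * sqrt(n)

# Ratio of evidence for a 0-10 kg mechanism over a 10-15 kg mechanism.
bayes_factor = (mean_likelihood(mean_loss, se, 0, 10)
                / mean_likelihood(mean_loss, se, 10, 15))

if __name__ == "__main__":
    print("t =", round(t, 2))
    print("evidence ratio (0-10 kg vs 10-15 kg): %.1e" % bayes_factor)
```

With a standard error of roughly 0.6 kg, a true loss of 10 kg or more lies more than twelve standard errors from the observed mean, so under these assumptions the ratio is astronomically large even though the study has only 10 subjects.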