
Making the most of your data with Bayes

This page will give you the means for performing simple Bayesian analyses.

For a tutorial on using Bayes factors see:
Dienes, Z. (2014). Using Bayes to get the most out of non-significant results. Frontiers in Psychology, 5: 781. doi: 10.3389/fpsyg.2014.00781



For advice on specifying your “prior” (i.e. model of H1):

Dienes, Z. (2019). How do I know what my theory predicts? Advances in Methods and Practices in Psychological Science, 2, 364-377. https://doi.org/10.1177/2515245919876960

Dienes, Z. (2021). Obtaining evidence for no effect. Collabra: Psychology, 7(1): 28202. https://doi.org/10.1525/collabra.28202



On how to report Bayes factors:

Dienes, Z. (2021). How to use and report Bayesian hypothesis tests. Psychology of Consciousness: Theory, Research, and Practice, 8, 9–26.

___________________________________________________________

1. Bayes factor.

The Bayes factor tells you how strongly the data support one theory (e.g. your pet scientific theory under test) over another (e.g. the null hypothesis). It is a simple, intuitive way of performing the Bayesian equivalent of significance testing, giving you the sort of answer that many people mistakenly think they obtain from significance testing, but cannot. A "null" result in significance testing, for example, does not automatically mean you should reduce your confidence in the theory under test; often you should actually increase your confidence. A non-significant p-value does not tell you whether you have evidence for the null, evidence against the null, or no evidence for any conclusion at all. Yet people routinely take a non-significant result as indicating they should reduce their confidence in a theory that predicts a difference.

The Bayes factor needs two types of input: 1) a summary of the data and 2) a specification of what the theories predict. In total you will only need to enter about four numbers!

1) In a situation where you could do a t-test, the data summary is exactly the same as would be used for a t-test:

a) the sample mean difference between conditions, or between a mean and a baseline, call this meandiff; and
b) the standard error of the difference, call this SE.
Note that t = meandiff/SE. Thus, if you know t and the mean difference between conditions, you can recover the relevant SE from SE = meandiff/t. This applies for any type of t-test.
(Note more generally: a) could be any sample statistic, such as a regression coefficient, and b) is its standard error, so long as a) is distributed roughly normally.)

In sum: For the first step, enter the difference between conditions in the "sample mean" box and its standard error in the "standard error" box.
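For example, here is a minimal R sketch of recovering the standard error from a reported t value. The numbers are made up purely for illustration, not taken from any real study:

    # Hypothetical reported values, purely for illustration
    meandiff <- 5      # mean difference between conditions, in raw units
    t_value  <- 2.30   # reported t statistic for that difference

    SE <- meandiff / t_value   # because t = meandiff / SE
    SE                         # about 2.17: enter this as the "standard error"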

2) Next you specify the theory you are testing against the null hypothesis. Specifying the theory means saying what range of effects is consistent with the theory and whether any are particularly likely. The 2008 calculator calls the plot of the different plausibilities of population effects given the theory "p(population effect|theory)", and asks if this is uniform. A simple rule is that if you can say what the maximum plausible effect is, say "yes"; otherwise say "no".

a) If you can specify a plausible maximum effect, use a uniform from 0 to that maximum. Enter "0" in the lower limit box and the maximum in the upper limit box. Then click "Go!"
b) If you can specify a plausible predicted effect size P (e.g. based on a previous study or meta-analysis), say "No" to a uniform. Three new boxes will come up in the 2008 calculator, asking for the mean, standard deviation and number of tails (of a normal). A simple rule: If your theory allows effects to go in either direction, set the mean to 0, the standard deviation to P, and the tails to 2. If the theory makes a directional prediction, set the mean to 0, the SD to P and the tails to 1. Then click "Go". (A minimal R sketch of this calculation is given below, after the evidence thresholds.)

In sum: For the second step, you enter two numbers to describe your theory.

Note: A Bayes factor that is not based on the predictions of your theory will be irrelevant to your theory.

A Bayes factor of 3 or more can be taken as moderate evidence for your theory (and against the null) and of 1/3 or less as moderate evidence for the null (and against your theory). Bayes factors between 1/3 and 3 show the data do not provide much evidence to distinguish your theory from the null: The data are insensitive.
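To make the calculation concrete, below is a minimal R sketch of this kind of Bayes factor. It is not the official calculator, and the function and argument names are my own; it assumes a normal likelihood for the data summary and models H1 as either a uniform or a (half-)normal, averaging the likelihood over the model of H1 by numerical integration:

    # A minimal sketch, not the official calculator.
    bf_normal <- function(meandiff, SE, H1 = c("uniform", "normal"),
                          lower = 0, upper = 1,               # used when H1 = "uniform"
                          H1mean = 0, H1sd = 1, tails = 1) {  # used when H1 = "normal"
      H1 <- match.arg(H1)

      # Likelihood of the observed mean difference as a function of the
      # population effect theta (normal approximation, known-variance assumption)
      likelihood <- function(theta) dnorm(meandiff, mean = theta, sd = SE)

      # p(data | H0): the null hypothesis is a point at zero
      pD_H0 <- likelihood(0)

      # p(data | H1): average the likelihood over the model of H1
      if (H1 == "uniform") {
        pD_H1 <- integrate(function(th) dunif(th, lower, upper) * likelihood(th),
                           lower, upper)$value
      } else if (tails == 1) {
        # directional prediction: normal truncated at its mean and doubled
        # (a half-normal when H1mean = 0)
        pD_H1 <- integrate(function(th) 2 * dnorm(th, H1mean, H1sd) * likelihood(th),
                           H1mean, Inf)$value
      } else {
        pD_H1 <- integrate(function(th) dnorm(th, H1mean, H1sd) * likelihood(th),
                           -Inf, Inf)$value
      }

      pD_H1 / pD_H0   # the Bayes factor B for H1 over H0
    }

    # With the made-up numbers from the earlier sketch and a uniform from 0 to 10:
    bf_normal(5, 2.17, H1 = "uniform", lower = 0, upper = 10)   # about 7.6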


Notes:

  1. If the theory predicts a direction, the program assumes the predicted difference is in the positive direction. If your mean difference was in the direction opposite to that predicted by the theory, enter it as negative.

  2. The 2008 calculator assumed that data will be normally distributed around the population mean with known variance. Typically the population variance is unknown and only estimated from the data (so the standard error calculated from your data is only an estimate), and the assumption of known variance will be problematic for small sample sizes (say less than 30). In that case use the correction given in Box 4.4 on page 94 of Dienes (2008): increase the standard error, SE, to SE × (1 + 20/df²), where df is the degrees of freedom. Or, a better option, use one of the other calculators below with a t-distribution likelihood; when it asks for the degrees of freedom of the data, enter the degrees of freedom that would appear in the corresponding t-test. (A small R sketch of this correction appears after these notes.)

  3. To test a Pearson's correlation r, first transform it to make it normal with Fisher's z transform. This site will do that for you. The transformed value has standard error SE = 1/√(df − 1), where df = N − 2. For example, suppose a correlation of 0.20 between mindfulness and hypnotisability is found with 30 participants. The Fisher z transform of 0.20 is 0.20. It has degrees of freedom = 28, so standard error = 0.19. From past research, correlates of hypnotisability, if they exist, are often around r = .30; the Fisher z transform of .30 is .31. Enter sample mean = .20, standard error = .19, use a normal likelihood, 2 for tails (if that is the theory), 0 for the mean of the normal and .31 for the standard deviation. B = 0.78 (i.e. insensitive). (This arithmetic is also shown in the R sketch after these notes.)

  4. The sample mean and standard error (for the data summary), and the limits of the uniform, or the mean and standard deviation of a normal, t-distribution or Cauchy (for specifying the theory), must all be in the same units. If your mean is on a Likert scale, the predictions of your theory will also be in terms of that Likert scale. If you need to use standardized effect sizes, then r = √( t² / ( t² + df ) ). Then analyze r according to note 3.

  5. Browsers now block Flash programs by default, but you can use a stand-alone Flash player. Go here: https://www.adobe.com/support/flashplayer/debug_downloads.html and download the “flash player projector”. It is a .exe file; run it and a Flash player opens.
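The following is a minimal R sketch of the arithmetic in notes 2 and 3. The numbers are the hypothetical ones used above, and bf_normal is the sketch function defined earlier on this page:

    # Note 2: the small-sample correction of Dienes (2008, Box 4.4),
    # illustrated with an arbitrary SE of 2.0 and df = 20
    se_raw <- 2.0
    df_t   <- 20
    se_adj <- se_raw * (1 + 20 / df_t^2)   # = 2.1

    # Note 3: testing a correlation via Fisher's z transform
    n   <- 30                  # participants in the hypothetical mindfulness study
    r   <- 0.20                # observed correlation
    df  <- n - 2               # degrees of freedom for a correlation
    z   <- atanh(r)            # Fisher's z, about 0.20
    SEz <- 1 / sqrt(df - 1)    # its standard error, about 0.19
    zP  <- atanh(0.30)         # transform of the predicted r = .30, about 0.31

    # Bayes factor with a two-tailed normal model of H1 (mean 0, SD zP),
    # using the bf_normal sketch above; gives roughly 0.8, close to the B = 0.78 quoted
    bf_normal(z, SEz, H1 = "normal", H1mean = 0, H1sd = zP, tails = 2)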

 A 5-minute instructional on using the calculator



1) Web Bayes factor calculator (likelihood: normal, t, binomial, noncentral t or d; model of H1: normal, t, beta, Cauchy, uniform; model of H0: point, normal, t, beta, Cauchy, uniform. Programmed by Lincoln Colling.)

2) A ShinyApp Bayes factor calculator (likelihood: default is normal; extra choice: t-distribution. Model of H1: default is normal; extra choices: t-distribution, Cauchy, uniform. Created by Harry Tattan-Birch).



3) For the original 2008 calculator: click here to download the Flash Bayes factor calculator (use the flash player described above to run it)



4) A ShinyApp Bayes factor calculator (likelihood: t-distribution; model of H1: t-distribution or uniform. Created by Bence Palfi)

5) The same as 4) but in French



For those who use Matlab, here is Matlab code for calculating Bayes factors in the same way as the Flash program above. Baguley and Kaye (2010) provide equivalent R code. John Christie has also provided R code for the calculator, modified so that one can adjust the accuracy of the numerical estimation of the area under the curve (I used some fairly primitive numerical integration). Stefan Wiens provides R code here, including for using the t-distribution to model H1. For an example of R code using Bayes factors with logistic mixed effects models (glmer), see the code written by Elizabeth Wonnacott for Wonnacott, Brown, & Nation (2017). Bence Palfi pulled various bits of my R code together to make one function with a choice of likelihoods (normal or t) and a choice of models of H1 (uniform, normal, t or Cauchy), which is used in the ShinyApps above.

For Bayes factor calculators for the binomial situation see here for two groups and here for one group, and Lincoln Colling’s calculator above for greater flexibility with models of H1.

 

Five minute Bayes:

The weakness of power

How many participants might I need?

How to analyze a 2X2 contingency table

 

2. Prior and Posterior distributions.

As well as a Bayes factor, it is usually useful to determine the most plausible set of population mean differences, given your data and other constraints.

You start with prior beliefs about the population parameter. Assume you can represent your prior by a normal distribution (without grave misrepresentation) and also that your data are normal. Once you have determined the mean and standard deviation of your prior, collected data, and hence found the mean and standard deviation of your likelihood (i.e. the mean difference in your data and its standard error), use this Flash program to determine the mean and standard deviation of your posterior and to look at graphs of the prior, likelihood and posterior distributions (use the Flash player described above; or this ShinyApp created by Neil McLatchie). If your prior is quite vague, the posterior is largely determined by the data. Thus your new prior before looking at the next study will be a normal distribution with a mean equal to the mean of Study 1 and a standard deviation equal to the standard error of the mean from Study 1.

Thus you can meta-analytically combine evidence across a series of studies with the same DV in the following way: For the mean of the prior enter the mean (mean difference etc.) of Study 1, and for the standard deviation of the prior enter the standard error of that mean difference. (This distribution represents the rational beliefs to have about the population parameter value after seeing Study 1, given a vague prior beforehand.) Enter the mean difference for Study 2 as the mean of the likelihood and the standard error of Study 2 as the standard deviation of the likelihood. The posterior then indicated by the program gives the best estimate of the population parameter and its uncertainty in the light of both Studies 1 and 2. This can form the new prior for combining with a Study 3, and so on iteratively. If you have several studies with the same DV, this procedure can be followed to obtain an overall estimate of the mean difference and its standard error, which can be used in a Bayes factor calculator to determine the overall strength of evidence for H1 versus H0, or to evaluate credibility intervals overall (see Dienes, 2014, for the principles of inference by interval).

Example: a previous study found that asking people to perform maths problems for 5 minutes a day increased their self-discipline generally, so they ended up doing the washing up two extra days each week. You replicate with different will-power interventions in three studies, finding the following increases in number of days of doing the washing up each week: 0.5 (SE = 1.2), 2 (SE = 0.9), -0.5 (SE = 1.5). After Study 1, and before Study 2, one's prior would have a mean of 0.5 days and an SD of 1.2 (assuming before Study 1 one had a very vague prior). After Study 2 and before Study 3, one's posterior from Study 2, and hence one's prior for Study 3, has mean 1.46 (SD = 0.72). Finally, after Study 3, one's posterior has mean 1.09 (SD = 0.65). To perform a Bayes factor on the three studies as a whole, enter 1.09 as the sample mean and 0.65 as the standard error. Using a half-normal with SD = 2 days (the effect size from the original study), B = 2.08, indicating that the three studies do not provide substantial evidence either for or against the theory that practising will-power increases washing-up episodes. (It might be worth exactly replicating Study 2 or the original study to see how that affects the overall evidence.)
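The updating in this example is just precision-weighted averaging of normal distributions. Here is a minimal R sketch that reproduces the numbers above (the function name is my own, not the Flash program's):

    # Combine a normal prior with a normal likelihood: the posterior precision is
    # the sum of the two precisions, and the posterior mean is the
    # precision-weighted average of the two means.
    update_normal <- function(prior_mean, prior_sd, lik_mean, lik_sd) {
      w_prior <- 1 / prior_sd^2          # precision of the prior
      w_lik   <- 1 / lik_sd^2            # precision of the likelihood
      post_sd   <- sqrt(1 / (w_prior + w_lik))
      post_mean <- (w_prior * prior_mean + w_lik * lik_mean) / (w_prior + w_lik)
      c(mean = post_mean, sd = post_sd)
    }

    # Study 1 (0.5, SE 1.2) acts as the prior; combine it with Study 2 (2, SE 0.9):
    after2 <- update_normal(0.5, 1.2, 2, 0.9)                            # about 1.46, SD 0.72
    # ...then combine that posterior with Study 3 (-0.5, SE 1.5):
    after3 <- update_normal(after2[["mean"]], after2[["sd"]], -0.5, 1.5) # about 1.09, SD 0.65

    # The combined estimate can then go into the Bayes factor sketch above, with a
    # half-normal model of H1 scaled by the original 2-day effect:
    bf_normal(after3[["mean"]], after3[["sd"]], H1 = "normal",
              H1mean = 0, H1sd = 2, tails = 1)   # about 2.1, matching B = 2.08 above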

_____________________________________________________________________________________________________________________________

To test your intuitions concerning Bayesian versus Orthodox statistics try this QUIZ.



For practical examples of using Bayes factors:
Dienes, Z. (2015). How Bayesian statistics are needed to determine whether mental states are unconscious. In M. Overgaard (Ed.), Behavioural Methods in Consciousness Research. Oxford: Oxford University Press.

For a discussion of Bayes and the credibility crisis in Psychology:
Dienes, Z. (2016). How Bayes factors change scientific practice. Journal of Mathematical Psychology, 72, 78-89.

For a discussion of why Bayes factors are ideally suited for severely testing theories:
Dienes, Z. (2023). Testing theories with Bayes factors. In Austin Lee Nichols & John E. Edlund (Eds), Cambridge Handbook of Research Methods and Statistics for the Social and Behavioral Sciences, Volume 1: Building a program of research, pp. 494-512. Cambridge University Press.

 A talk on how to test theories severely using Bayes factors given in Paris in 2022

A talk on Four reasons to be Bayesian given at Oxford in 2017; and a follow up workshop on Principles for Bayes factors.

A talk on how to use Bayes given at Lancaster earlier in 2015.

This is a lecture I gave on Bayes to Masters students at University of Sussex in 2014.

 

An essay I set students is: "Perform a Bayesian analysis on a part of the data from your project or from a paper published this year (consider an interesting question tested by a t-test – one test will do). Compare and contrast the conclusions from your analysis with those that follow from an analysis using Neyman-Pearson (orthodox) statistics. "

See also this assessment of several topics from the book.