Popper

This is a lecture I gave undergraduates introducing Popper and a lecture I gave postgraduates on Popper (both late 2014). This is a version of the postgraduate lecture given a couple of years later that discusses the relation of Popper to the current credibility crisis in psychology.

Other online material useful for understanding Popper:

The Karl Popper Web
Stanford encyclopedia entry on Popper. Many universities have access to the online Stanford encyclopedia - if you have problems accessing it, try from a university PC.
Meehl, P. (1967). Theory-testing in psychology and physics: A methodological paradox. Philosophy of Science, 34, 103-115.

Applying Popper's ideas to research in psychology.

When reading a paper or designing your own research, training yourself to constantly bear in mind several aspects of Popper's philosophy will greatly help your critical skills and hence ability to evaluate and conduct research. Popper said good science involves:

1) A substantial theory being put up to test
2) Safe background knowledge to try to crack it against
3) A test that is severe

Taking each in turn:

1) Can you isolate a substantial theory that is being put up to test?

A substantial theory means some unifying idea from which many predictions follow. A prediction should say what should happen as well as what should not happen (a prediction that rules nothing out is no prediction at all). Note the difference between a substantial theory and a statistical hypothesis. “I am going to see if any of these variables correlate with each other” involves no substantial theory but a number of statistical hypotheses. “Coffee improves concentration only because people believe that it does” is a substantial theory. “People will spot more spelling errors out of 30 within half an hour of drinking one rather than zero cups of decaf coffee” is a statistical hypothesis derived from the substantial theory - together with some background knowledge. A substantial theory is likely to be a statement of a mechanism of how a phenomenon is produced. It is not just the statement of a null or alternative hypothesis. Popper said theories should be bold unifying ideas.

The introduction to a paper may mention many theories. Some of these theories will not be ones the authors regard themselves as testing as such. For example, in a paper on evolutionary psychology, Darwin's theory of natural selection may be mentioned, but it is unlikely the authors would give up this theory no matter how their results turned out. So some theories are treated as unproblematic background knowledge which is used to motivate other theories - and one of these further theories may indeed be held to ransom by the data. A useful exercise is to see if you can simply state a substantial theory that the authors would give up if a possible set of observations in their study were obtained.

Popper's basic criterion of good science is falsifiability. The point here is to determine *what* if any theory is being potentially falsified by the paper.

2) Is the background knowledge safe?

A substantial theory makes predictions by using auxiliary hypotheses, which are part of what Popper called 'background knowledge'. One part of background knowledge is the theories mentioned above which are used to motivate the ideas tested. Another part of background knowledge is the assumptions needed for the test to be a test. If we decide to test the substantial theory (that caffeine is a placebo) by comparing no versus one cup of coffee and of decaf, what auxiliary hypotheses are being used? Try to list some. (They are 'auxiliary' meaning secondary, in the sense that the substantial theory under test is the primary hypothesis; the auxiliaries are the other things that need to be assumed for the test to be a test.)

There are many, but one is “Decaf tastes like coffee and will make people expect the same consequences to the same extent.” Is it safe to assume this? If we fail to confirm predictions, could we just as plausibly reject this auxiliary as reject the substantial theory? If so, the test is not a good one. We need to critically evaluate auxiliaries. For example, we could take expectancy ratings, or use other ways of disguising the taste of caffeine in active and placebo drinks if we are worried that decaf coffee is not up to the job. In general, in evaluating a paper or designing research, list relevant auxiliary hypotheses and check you are happy to treat them as safe. Auxiliaries that are generally assumed in psychology experiments include that the dependent and independent variables measure the things they say they measure. It is only if the dependent and independent variables measure what they say (e.g. that a measure of concentration actually measures concentration) that the test could test the theory.

The critical examination of background knowledge can involve testing it. For example, you could compare a caffeine and no-caffeine drink and see if people can taste the difference when given a sip in a psychophysical study. You can always open up results to further critical scrutiny. For example, even if you are happy that the drinks tasted the same, caffeine may have side effects when taken as a whole drink (e.g. face flushes). And maybe these side effects motivate greater expectancies. Appropriate controls then need to be taken. But eventually there comes a time when auxiliary analyses have survived sufficient criticism that you decide to treat background knowledge as safe. A failure of the prediction can then be transmitted to the substantial theory.

In talking about the safety of background knowledge, we are interested in the safety of that part of background knowledge that is needed to make the test a test of the theory identified in (1). The safety of the background knowledge inspiring that theory matters a lot less because a theory is only ever a guess anyway.

In order to implement the criterion of falsifiability we must take some things to be safe and secure in order to attempt to break the substantial theory against them.

3) Is the test severe?

Just because a theory is falsifiable, it does not mean that a particular test severely tests the theory. Popper wanted theories to be made to stick their necks out. A test is severe if it makes the theory stick its neck out, that is, if it is likely to falsify the theory if the theory is false.

Consider a test to determine if caffeine and placebo drinks result in the same or different performance on a concentration task (spotting spelling mistakes). Your substantial theory predicts no difference. According to Popper, the test is severe if the predicted outcome is likely given the theory and unlikely given the rest of background knowledge. Take again the theory that a cup of coffee will improve concentration because of the unconscious power of expectations. So we give a subject a test of concentration. Then we say "Now drink this cup of coffee!" Then we give the subject the test again. If coffee improved test performance - as the theory predicts - would we be happy that we have confirmed this prediction of the theory? The prediction is also likely on another theory that is part of background knowledge, namely that if subjects can guess the experimenter's hypothesis and "help" the experimenter get a "good" result they will do so (the problem of "demand characteristics"). This problem would need to be dealt with for the test to be severe. (For example, maybe drinking the coffee is incidental, not apparently part of the experimental procedure.) In sum, the test is not severe if the prediction of the theory is likely on other established theories. In effect, the theory is not very falsifiable in the experimental context that we have used: we expect the same results whether or not the theory is true. Thus, we have not really put the theory at risk. We have not made it stick its neck out.
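The comparison behind severity can be made concrete with a rough sketch in Python. This is only an illustration, not Popper's own formal measure: the function name severity_ratio is hypothetical, and the probabilities are invented numbers standing in for how likely an improvement in performance is with the theory and on background knowledge alone.

```python
def severity_ratio(p_outcome_given_theory, p_outcome_given_background):
    """How much more likely the predicted outcome is given the theory
    than given the rest of background knowledge alone.

    A ratio near 1 means we expect the outcome either way, so the test
    puts the theory at little risk. (Illustrative measure only.)
    """
    if not 0 <= p_outcome_given_theory <= 1:
        raise ValueError("p_outcome_given_theory must be a probability")
    if not 0 < p_outcome_given_background <= 1:
        raise ValueError("p_outcome_given_background must be in (0, 1]")
    return p_outcome_given_theory / p_outcome_given_background


# Hypothetical numbers, chosen only for illustration.
# Overt procedure: demand characteristics also predict an improvement,
# so the outcome is likely even without the expectancy theory.
overt = severity_ratio(0.9, 0.8)

# Disguised procedure: drinking the coffee looks incidental, so
# background knowledge alone gives little reason to expect improvement.
disguised = severity_ratio(0.9, 0.2)

print("overt procedure:    ", round(overt, 2))
print("disguised procedure:", round(disguised, 2))
```

On these made-up numbers the disguised procedure gives the larger ratio, which is the sense in which dealing with demand characteristics makes the test more severe.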

All of the above points apply no matter whether the study is experimental, quantitative, or qualitative, observational or a case study. Nothing about Popper's philosophy demands quantitative studies. If potential observations, of whatever form, could cause you to give up on some of your ideas, then we are in Popperian business, and all the above points apply.

In quantitative studies, test severity can be analysed further. It has a natural interpretation in terms of likelihood statistics (see chapter five) but we will for current purposes interpret it in terms of orthodox Neyman Pearson statistics (see chapter three). Popper did not explicitly specify which if either interpretation, likelihood or Neyman Pearson, he supported, though his own personal interpretation of probability was not entirely consistent with the Neyman Pearson assumptions (in ways that do not affect the arguments here). A test is severe if the predicted outcome is likely given the theory and unlikely given the rest of background knowledge. So if the theory predicts a difference, a significant difference should be likely given the theory. That is, the test must have high POWER! If power is low, getting a null result does not falsify the theory. Conversely, if a theory predicts no difference, a significant difference should be likely if the theory is false - that is, the test must have high power (= prediction unlikely given rest of background knowledge). Passing a low power test does not strongly corroborate the theory when it predicts no difference. But failing this test (getting a relevant significant difference) still falsifies it. And passing a high powered test does strongly corroborate a theory predicting no difference.
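To see how power depends on sample size, here is a minimal sketch in Python using only the standard library. It rests on assumptions not in the text above: a two-group comparison, a normal (z-test) approximation rather than an exact t-test, and an assumed true effect of half a standard deviation (d = 0.5). The function name power_two_sample is hypothetical.

```python
from statistics import NormalDist


def power_two_sample(effect_size_d, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-group z-test.

    effect_size_d: assumed true standardized mean difference
    n_per_group:   number of subjects in each group
    alpha:         significance level
    """
    z = NormalDist()  # standard normal distribution
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Expected value of the test statistic if the assumed effect is real
    shift = effect_size_d * (n_per_group / 2) ** 0.5
    # Probability that the statistic falls in either rejection region
    return (1 - z.cdf(z_crit - shift)) + z.cdf(-z_crit - shift)


# Assumed effect size of d = 0.5, chosen only for illustration
for n in (10, 20, 64, 100):
    print(n, "per group: power =", round(power_two_sample(0.5, n), 2))
```

With small groups the approximate power is low, so a non-significant result would be likely even if the theory predicting a difference were true; such a null result would not count against the theory. The same calculation applies to a theory predicting no difference: power then tells you how likely the test was to catch the theory out if it were false.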

Note there are three types of theories discussed above: The theory you want to test; the theories that inspired that theory, whose safety is irrelevant to the experiment being good science; and the auxiliary theories needed for the test to be a test and which we must regard as safe for the experiment to be good science. In designing your own experiment think carefully about all three: What are the big ideas inspiring your particular theory? (Try to connect your experiment to a larger, interesting context.) What theory are you actually testing? (Is it interesting itself and could it be shown wrong?) And what assumptions do you need to make for the test to be a test?

In reality, most papers will fall short: For example, few will make only assumptions that are really safe. But remember Popper's criterion of falsifiability is not black and white. What Popper provided was norms which make sense of what we are trying to do as scientists, even if we don't always reach the ideal. Without knowing the ideals, we couldn't understand how science works.

In summary:

The test is good if:
An explicit substantial theory puts its neck out by
Having safe background knowledge in constructing the test
Being subjected to a severe test in the sense that the predicted outcome is likely given the theory and unlikely given the rest of background knowledge.

An exercise I give to my class is to take a paper and analyse it according to these considerations. What is the main claim (substantial theory) under test, what are some auxiliaries, are they safe, could you make the same predictions from other background knowledge, and what if any results could have falsified the main claim?

Note that a paper can contribute to the scientific process without satisfying all the above: For example, it might provide a hypothesis that is in principle falsifiable even if the paper itself did not test the hypothesis.

This lecture to undergraduates and this one to postgraduates go over similar material.

An essay I set students bearing in mind the above points is the following: "Discuss to what extent your project or an empirical paper published this year is scientific according to Popper's demarcation criterion (you may include some discussion of the extent to which the domain of psychology of which the paper is a part is scientific according to Popper)."

See also this assessment of several topics from the book.

Popper and politics

Popper first became well known in the English speaking world as a philosopher of politics rather than science, with the publication in 1945 of The Open Society and Its Enemies, which Popper regarded as his “war effort”. It is an argument for democracy rather than fascism or communism, the latter being examples of “closed societies”. While a decade or two ago it might have looked like the case for democracy was a foregone conclusion, this is no longer so, and Popper’s arguments have a renewed relevance.

Popper’s political philosophy, the championing of democracy as the “open society”, shows a remarkable analogy between democracy and science. The analogy shows why some popular criticism against democracy misses the point. For example, it is sometimes said a failure of democracy is that it cannot guarantee choosing the best leader (think of any elected leader you consider a disaster). But Popper argued that posing the problem for a political system to solve as "how to choose the best leader" is ill conceived, because we cannot know who is the best leader, just as we cannot know if we have a true scientific theory. (NO system guarantees the best or even a good leader, least of all the alternatives to democracy.)

Science does not guarantee that we find the best of all theories, it only allows us to remove a theory by rational criticism (rather than force). Democracy does not guarantee that we select the best of all leaders, it only allows us to remove a faulty leader by rational criticism rather than force.

In removing false theories hopefully we can make progress towards better ones. In removing bad leaders hopefully we can get a better one next time (but even if we don't, the act of keeping leaders with only a precarious grip on power, in a transparent government, keeps them relatively honest!)

In science anyone can in principle advance an argument that influences the fate of a theory; in a democracy anyone can in principle advance an argument that influences the fate of a leader.

In science all relevant information useful for evaluating a theory should be publicly available; in a democracy all relevant information useful for evaluating a leader (as a leader) should be publicly available.

The scientific duty of a researcher is to criticise theories just as the patriotic duty of a citizen is to criticise their leaders. (Nationalistic governments will often use the excuse of patriotism to argue the opposite, in order to save their hold on power.)

The essence of both science and democracy is the open society, i.e. the society that actively encourages the critical tradition (see Dienes, 2008, p 6) in a way that allows theories and leaders to be discarded by reason rather than bloodshed. Popper’s views provide simple and powerful arguments against the rising influence of totalitarian politics (e.g. Grayling, 2009; Kampfner, 2009; Wolf, 2007). We as scientists and academics should resist the influence of any politics that closes the open society (or maintains a closed one) because such influence contradicts the core of our beings as scientists and academics (cf Ferris, 2010) – and as distinctively human beings (uniquely amongst animals, according to Popper, capable of critical discussion, and thus of true freedom). Of course, as Popper points out, as human beings and subjects we will always be tempted by the apparent certainties of the closed society (and its absolute authority), just as rulers will be tempted by the greater power that a closed society brings.

Citizens do have a duty of loyalty to their state, but Popper argues that such duty must be combined with “ … a certain degree of vigilance and even a certain degree of distrust of the state and its officers … All power has a tendency to entrench itself and the tendency to corrupt and in the last resort it is only the traditions of a free society – which include a tradition of almost jealous watchfulness on the part of its citizens – which can balance the power of the state by providing those checks and balances on which all freedom depends” (cited in Notturno, 2003, p 44). Thus the critical attitude is all important, and voting by itself (often identified as definitional of democracy) does not make an open society. It is but a tiny component. “Democracy will work fairly well in a society that values freedom and tolerance, but not in a society that does not understand those values. Democracy … can never create freedom if the individual citizen does not care for it” (ibid).
Compare: The external trappings of science (peer review, glossy journals, learned institutions, mathematical or statistical methods) do not by themselves produce science. Science will work fairly well in a community that values critical discussion (contrast Kuhn, 1962). If however power becomes entrenched, or reviewers accept or reject papers purely based on social psychological factors (in-groups/out-groups etc), then what goes on may have little to do with science. We must always be vigilant for the breakdown of the open society in both science and democracy.

One of the current arguments against democracy is the proposal that greater wealth can be created by a stable society created by strong (undemocratic) government, given a relatively free market (a combination that has created wealth for e.g. Singapore and China). At a time when it was thought democracy was necessary for wealth production, in 1958 Popper argued that “democracy does not ensure that anything is accomplished – certainly not an economic miracle … we should choose political freedom not because we hope for an easier life but because freedom is itself an ultimate value that cannot be reduced to material values … [As Democritus once said] ‘the poverty of a democracy is better than every wealth under an aristocracy or autocracy, for freedom is better than slavery’” (one could elaborate, freedom for all is better than slavery for some). Of course, while a free society may produce good solutions because of its very freedom in producing ideas (witness science; also Manville & Ober, 2003), and totalitarian societies are prone to rigid thinking and waste through corruption, Popper proposes that “… it is wrong to think a belief in freedom will always lead to victory. We choose freedom not because it promises this or that but because it offers the only dignified form of human co-existence” (reprinted in Popper 2001, pp 91-92).
Compare: We value science not because it necessarily creates wealth, but simply because we value freedom and truth, and science is the expression of our uniquely human ability to advance our understanding of the world by free critical discussion. If one values science, then one values democracy, because science can only flourish in an open society.

Are some countries not ready for democracy? That is like asking if some potentially empirical areas of investigation are not ready for science. The best way to hone people's critical skills in politics and science is to give them the freedom to openly discuss and evaluate all options, all the while implementing the infrastructure needed for transparency and openness.

Bibliography and further reading:
Ferris, T. (2010). The science of liberty: Democracy, reason and the laws of nature. Harper Perennial.
Grayling, A. C. (2009). Liberty in the age of terror: A defence of civil liberties and enlightenment values. Bloomsbury.
Kampfner, J. (2009). Freedom for sale: How we made money and lost our liberty. Simon and Schuster.
Manville, B., & Ober, J. (2003). A Company of Citizens: What the World's First Democracy Teaches Leaders About Creating Great Organizations. Harvard Business School Press.
Popper, K. R. (1945/1993). The open society and its enemies, volumes 1 and 2. Routledge.
Popper, K. (2001). The myth of the framework: In defence of science and rationality. Routledge.
Notturno, M. A. (2003). The Open Society And Its Enemies: authority, community, and bureaucracy. In I. Jarvie and S. Pralong (Eds), Popper’s Open Society after 50 years: The continuing relevance of Karl Popper. Routledge, pp 41-55.
Wolf, N. (2007). The end of America: Letter of warning to a young patriot. Chelsea Green.

Popper argued the critical tradition may have started in ancient Greece. For a discussion of what we might learn from Athenian democracy, listen to this lecture by Josh Ober.