Last changed 3 Dec 2016. Length about 6,000 words (40,000 bytes).
(Document started on 6 Oct 2012.) This is a WWW document maintained by Steve Draper. You may copy it.


Critical thinking as a general mental ability: Flynn's ideas

By Steve Draper,   Department of Psychology,   University of Glasgow.

This page originally (part A) presented some notes and links on James Flynn's distinctive take on critical thinking (CT): notably that a few standard concepts, each underpinning a common argument schema, are most of what is needed for assessing any topic; and that this suggests how to test CT. Part B then offers a few notes on the range of ideas about what critical thinking as a general mental ability or skill is, as a "contrast set" to Flynn's, and adds some new points.

In terms of issues I already have web pages on, this page is obviously related to advice to students writing Critical Reviews and to exercises where students critique each other's work.

See also Stephen Downes' Guide to the Logical Fallacies, and his article on How to Evaluate Websites.

These ideas and abilities are all about skill at CT. A broader idea is the set of skills that everyone should acquire (but which are typically not taught in formal education), of which CT is at most one.

Contents

  • Part A: Flynn's ideas on critical thinking
  • Part B: A brief outline of the range of contrasting ideas on CT
  • References

    Part A: Flynn's ideas on critical thinking

    Flynn (2009) — Prelude to Flynn on CT

  • Flynn's framework is that the last 100 years have shown a steady rise in IQ scores; this is now beyond reasonable doubt.
  • Yet how is this possible? Were our grandparents all morons? Educationally subnormal? Flynn collects these general problems into four paradoxes, and spends the main part of the book working out resolutions to them. The resolutions are based on analysis of the subscales in IQ tests, and the fact that some subscales have shown no improvement, others big improvements. He concludes that some cognitive skills have not improved (maths, English Literature), others have.
  • He summarises the improvements as essentially a great spread in the capacity for formal operational thinking: what Piaget called the "formal operational stage" as opposed to the "concrete operational stage": basically, manipulating abstract categories independently of specific and known examples.
  • He then asks what further cognitive improvements across the population we can hope or wish for. He argues that, for real benefit, two more abilities need to spread in addition to intelligence / formal thinking. The third and final one is "wisdom", which amounts to a capacity for empathy, commitment to non-selfish ideals, and self-control, while the immediate next target is the spread of critical thinking.

    Critical thinking

    Jim Flynn has developed a view of what general critical thinking is needed by everyone:

    He thinks critical thinking is supported in practice by schemas or patterns of thought which you need to have ready in order to apply CT routinely and promptly. In 2009 he called these shorthand abstractions (SHAs); in 2012 he switched to calling them "keys" (and I shall call them "schemas"). He has a list of these, and of what I would call "mal-schemas": schemas that are fallacious but often used in arguments. Cf. Downes' Guide to the Logical Fallacies.

    Flynn's first ten schemas

    In Flynn (2009) he lists 10 schemas (and 20 in Flynn, 2012). Here's my summary of the ten (you need to read the book to see longer explanations).
    1. Market (or the law of supply and demand).
    2. Percentage (i.e. proportions are often a more relevant measure than absolute quantities).
    3. Natural selection.
    4. Control group. Before and after comparisons of some intervention are usually misleading.
    5. Random sample (vs. biassed sampling): how a small sample really can represent a huge population.
    6. Naturalistic fallacy. (You can't argue from facts to values. Neither tradition nor trends in nature can make something good (or bad).)
    7. Charisma effect (Hawthorne effect). An innovative method may work as long as its inventor is there applying it, but may not transfer to others using it.
    8. Placebo effect. You can get good effects not from the drug or intervention, but from the participants' expectations.
    9. Falsifiability / tautology. They assert "All As are tall"; you produce an example A who is short; they say the example isn't a real A. In effect they change the definition of As to require being tall, at the price of circularity.
    10. "Tolerance school fallacy". An argument against ethical relativism.
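
    As a rough illustration of schema 5 (random sample), the following minimal Python sketch draws a random sample of 1,000 from a made-up population of 100,000 in which exactly 30% hold some opinion, and compares it with a biased sample of the same size. The population, the 30% figure, and the helper name `proportion` are all assumptions invented for this illustration; none of them comes from Flynn.

```python
import random


def proportion(sample):
    """Fraction of True values in a sample."""
    return sum(sample) / len(sample)


# Hypothetical population: 100,000 people, the first 30,000 of whom hold the opinion.
population = [i < 30_000 for i in range(100_000)]

rng = random.Random(0)  # fixed seed so the run is repeatable

# Random sample: every person is equally likely to be picked.
random_sample = rng.sample(population, 1_000)

# Biased sample: only the people who happen to come first in the list.
biased_sample = population[:1_000]

true_share = proportion(population)
random_estimate = proportion(random_sample)
biased_estimate = proportion(biased_sample)

print(f"true share    : {true_share:.3f}")
print(f"random sample : {random_estimate:.3f}")
print(f"biased sample : {biased_estimate:.3f}")
```

    In this toy setup the random sample, although it covers only 1% of the population, should land within a few percentage points of the true share, while the biased sample of the same size reports 100%.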

    One important feature of Flynn's thinking is his suggestion that the character of human cognitive functioning in the general population has changed during the 20th century, and could change still further. He thinks the Industrial Revolution was responsible for the spread of scientific modes of thought (abstract categories and reasoning) from a minority to the majority. An aspect of this, then, is that a few specific concepts and patterns of reasoning have arisen which turn out to be widely applicable to many issues far beyond their original specialist scope. His schemas are examples of these. Given the length of time formal operational thinking took to spread, it is reasonable to consider that there could be more to come, and that a general higher education (HE) should identify and promote these. In other words, just as a general education in the classics was once thought to educate learners for government, so Flynn is suggesting that our new scientific culture (taken very broadly) should be re-viewed for new general patterns of thinking.

    Looking for new schemas in other disciplines

    All disciplines should be looked at for possible new CT schemas that would be useful to everybody. We need to consult experts in each of these disciplines, not just reproduce what we happen to have learned ourselves. Conversely, this implies a view that disciplines are not solely specialist areas for insiders only (although that is how we almost exclusively teach), but can occasionally produce ideas that are really useful outside the discipline: a truly educational (and optimistic) vision.

    So for (say) physics, we need to ask: What might a physicist add by way of universally useful schemas?

    One candidate schema from EDUCATION

    Clinchy's distinction between separate and connected ways of knowing. An agile critical thinker should be able to apply both ways of knowing to any given topic: but many are stuck with one or the other.

    Three candidate schemas from PHYSICS

    What might a physicist add by way of universally useful schemas?

    Three candidate mal-schemas from HISTORY


    If we are going to be serious about graduate attributes, we need to be able to test for them: for their presence or absence in an individual. Flynn addresses this: most people don't. There are however now one or two well regarded tests of graduate attributes for students, plus Flynn's.

    1. FISC: Flynn proposes a test of CT, which he calls the "Index of Social Criticism" or "FISC: Flynn's index of social criticism".

      It consists of 5-minute essays, one per schema or mal-schema, which are scored not for the conclusion reached but for whether the implicitly relevant schema is addressed, i.e. deployed in the argument. E.g. if the question is "Motorway driving is safer than city driving because the latter causes more accidents", does the discussion use "percentage", i.e. argue on the basis of proportions of accidents per vehicle or per vehicle-mile or per journey, not absolute accident numbers?

    2. The CLA (Collegiate Learning Assessment) test.
      • A key reference about it is:
        Klein,S., Benjamin,R., Shavelson,R. & Bolus,R. (2007) "The Collegiate Learning Assessment: Facts and fantasies" Evaluation Review vol.31 no.5 pp.415-439

    3. Ennis' 1985 critical thinking test is available from him.

    4. The Cornell Critical Thinking Tests (from Ennis' test).
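
    The "percentage" reasoning that the FISC motorway question (item 1 above) looks for can be sketched with a little arithmetic. This is only an illustration: every figure below is invented, and the helper name `rate_per_million_miles` is hypothetical. It shows how absolute accident counts and accident rates can point in opposite directions.

```python
def rate_per_million_miles(accidents, vehicle_miles):
    """Accidents per million vehicle-miles travelled."""
    return accidents / (vehicle_miles / 1_000_000)


# Hypothetical figures, invented purely for illustration.
city = {"accidents": 900, "vehicle_miles": 300_000_000}
motorway = {"accidents": 600, "vehicle_miles": 100_000_000}

city_rate = rate_per_million_miles(**city)          # 900 / 300
motorway_rate = rate_per_million_miles(**motorway)  # 600 / 100

# Which road type looks worse depends on the measure chosen.
more_accidents = "city" if city["accidents"] > motorway["accidents"] else "motorway"
higher_rate = "city" if city_rate > motorway_rate else "motorway"

print("more accidents in absolute terms     :", more_accidents)
print("higher rate per million vehicle-miles:", higher_rate)
```

    With these made-up numbers the city has more accidents in total, yet the motorway has the higher rate per vehicle-mile. An answer that only compares totals has therefore not yet established which kind of driving is safer.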

    One study which used Ennis' test is Timmons, Luke (2014) "The uncritical commute: The impact of students' living situations while at university".

    FISC tests 20 schemas. Cornell level Z says it tests: Induction, Deduction, Credibility, Identification of Assumptions, Semantics, Definition, Prediction in Planning Experiments. (My impression is that all this amounts to three kinds of thing: the logic of argument structures, analysing the meaning of words/usages, and logic applied to conclusions from an experiment. I also feel unhappy with some of the answers (forced choice between 3 options) for not including what I would regard as the most important analysis of the problem. I suppose it just shows that I can feel competent to write a web page on CT, yet either not possess competence at other people's version of it, or at least not understand their views on it.)

    Assessment of Flynn

    Flynn is a political scientist, a moral philosopher and a psychologist (in the sense that he has done a lot of work on IQ measures: both analysing the numbers, and critiquing their meaning). We might well say that the above list of schemas, which he presents as educationally universal, just reflects the disciplines he is trained in. What has he omitted, due to his own blinkers and narrow education? In fact, his framework for CT may still be perfectly good, but that doesn't mean this (meta-) critical question isn't worth exploring.

    What does EACH and EVERY discipline have to add? See above for some suggested additional schemas.

    General problems with the CT literature

    1. A problem underlying almost all CT literature is that it presupposes that the author is 100% expert at CT themselves. While natural in the context of, say, graduate attributes (the wishful, untested assumption that CT is something all graduates have acquired anyway, without specific training), this leads them to take the task to be one of them introspecting about what general patterns of thought they apply over and beyond their specialism, without considering either what they lack or what other people and other education can contribute. This in turn leads to patronising advice, self-satisfaction, and failure to see value in anyone else's discipline.
    2. The single biggest problem with most of the literature on critical thinking (CT): authors all assume that their own discipline is identical to critical thinking, and they lack the insight to even consider for a single moment that their own education might be deficient. Until you see a paper that works out what the authors themselves lack, and says they are now remedying their own deficiencies in CT, then you haven't yet found any that are capable of self-criticism.
    3. There is almost no discussion of CT in science: this is a second big symptom of blind parochialism. (Abercrombie (1960) is the important counterexample.)

    How does Flynn do?
    Flynn's proposals still suffer from these blindnesses. Nevertheless he improves the discussion to date in several original and important ways.

    1. Flynn at least (at last) includes suggestions from more than one discipline.
    2. Several of these are clearly from science. Some of his schemas are about standard criticisms of experiments and their interpretation; one is from Biology; and another is about using ratios (percentages) that compare numbers, to give meaning to raw numbers.
    3. He addresses the problem of testing CT.
    4. He identifies how most CT is the application of just a few arguments.
    5. And that each of these arguments originates in one discipline but is unusual in having a wide utility well beyond its parent discipline: a major observation and step forward. This implies a refreshing perspective on how disciplines might contribute to general intellectual culture, as well as to specialist expertise.
    6. Although he doesn't say this, his ideas make it easy to see that we should actively "mine" every discipline in turn for the one or two schemas that it may have which are useful well beyond the discipline. This is a large development programme for the future: Flynn has mined perhaps 4% of current disciplines for CT.

    Part B: A brief outline of the range of contrasting ideas on CT

    I first presented Flynn's ideas on CT: the first purpose of this web document. Then I assessed Flynn against the standard of typical writings on CT before him: a narrow view of CT. Now in Part B, I sketch a much wider range of conceptions of CT; not only to cite some landmark publications on CT, but also to indicate big contrasts in (often implicit) attitudes to what CT really is.

    1. An old but valuable text on critical thinking (for zoology students) is: Abercrombie,M.L.J. (1960) The anatomy of judgement: An investigation into the processes of perception and reasoning (Free association books: London).

      She found that science students were bad at scientific reasoning (possibly in part because they regarded science as a matter of fact). She set out to train a general skill that she found wanting in her students. Interestingly, she characterises the issue as being about the nature of "judgement": cf. Nicol and Pollitt (discussed below) and "evaluative judgement". Her views anticipate Perry's (item 2 below): the students unconsciously believe their subject is all about facts, not about using evidence to make decisions, and so are unaware (and so uncritical) that they are judging, and that judgements are the basis of the observations on which they do science: "scientific judgement". Her views also anticipate Nicol's (item 4 below): that judgement is even more fundamental than constructing (or critiquing) argument structures.

      A non-CT view of the content of a topic is of a set of answers to be learned; and if slightly more sophisticated, then answers each with a "reason" which is also to be learned like a catechism: like learning what Pythagoras' theorem states plus how to prove it, but with no practice at proving new conclusions.

    2. Perry (1968) can be read as implying that the content of each topic should instead be the range of tenable conclusions, the relevant reasons that can partially support these, and the skill to deploy these elements into arguments supporting or asserting any one of many alternative conclusions. (A brief description of Perry's views and some references are included in this web document.) Thus Perry sees CT as about the nature of the content of knowledge (epistemology). It makes no difference how brilliant a person is at CT as a method if they only know conclusions on a topic, not the evidence and range of tenable views.

      Perry (1968) believed this was a universal philosophical and educational truth about all knowledge, and so all disciplines / topics. Later work has shown that this presupposition is quite wrong. Educationally, some disciplines have no certainty and train students throughout on diversity of opinions, while others (including sciences) reach stable confidence on many topics while using active CT in researching new topics. In everyday life too, even quite young children are tacitly familiar with this. In discussing music, people generally assume that there are diverse tastes in music, and little prospect of convincing others by evidence and argument to change their taste; while contrariwise, no-one discusses alternative opinions about what valid arithmetic is. A "neo-Perry" view, then, is that for every topic, we should always learn reasons as well as conclusions. Additionally we need to learn, as part of the knowledge of the area, whether it is regarded as stable (certain) or as contested. If the latter, then we need to learn the more common rival conclusions, and the arguments supporting each. An extension of this would be to add some knowledge of who (what groups) support which conclusions.

      So, contrary to Perry's original views, in the neo-Perry view CT is not a universal feature of intellectual life, but is highly subject-dependent: both in terms of content and of procedure (the common techniques for applying CT to that discipline's content).

    3. There is also a topic in educational and psychological circles called "critical thinking", which views it as a teachable generic mental skill. A good reference is Kuhn,D. (1991) The skills of argument (Cambridge University Press: Cambridge). This literature seems to assume, without evidence, that critical thinking is a general mental skill unrelated to context or discipline; that it "should" be acquired by everyone without teaching (like your first language); and that (they look for evidence of this point) shockingly not all students have automatically got it. Kuhn's scheme (as instantiated in the marking scheme for her experimental test tasks) is basically to give one mark for mentioning more than one possible conclusion, one for giving reasons, one more if you acknowledge reasons going against your conclusions, and one for not sitting on the fence but nevertheless committing to a final preferred conclusion. It thus sees CT as about authoring an argument; and focusses on the structure of the argument, regardless of the content or validity of reasoning.

      One aspect of Flynn's approach is paying attention to the validity of the reasoning, or the "reasons" given, while expecting most topics to concern only a handful of argument types, each based on a concept. In this respect, it is actually about the validity of the argument structure, not its mere surface format; as was Toulmin,S.E. (1958) The uses of argument (CUP). That is, it distinguishes the form and appearance of reason from actual rational substance.

    4. Recently however David Nicol has argued that CT, or "evaluative judgement", underpins essentially all the general qualities an HE graduate should have which have been identified as graduate attributes.
      Nicol,D.J. (2010) The foundation for Graduate Attributes: developing self-regulation through self and peer assessment (Quality Assurance Agency for Higher Education). This view is basically that we commonly have to make and communicate judgements of the value of products (writings) by oneself or others. I.e. that making judgements, rather than constructing arguments, is a crucial skill in work and, arguably, in degree programmes. This is a distinct skill from that focussed on by Kuhn.

      This in turn opens the way to realising that evaluative judgements often have an implicit character: we can make them, but we do not use explicit reasoning to do so; nor can we always, or even often, fully justify them. This is the opposite position to that of Kuhn, Toulmin, and Flynn, who presuppose that explicit reasoning and rationality are at the heart of CT. This has important implications for assessment in education, but is actually also pervasive in professional life. (Pollitt,A. (2012) "The method of Adaptive Comparative Judgement" Assessment in Education: principles, policy and practice vol.19 no.3 pp.281-300)

    5. Flynn's approach is genuinely novel. It amounts to the view that there are just a few ideas, schemas, which are hugely more valuable than most ideas for critical thinking outside their original area: and these are the schemas it might be worth every student learning. And these schemas are more specific than the claims of logicians that logic describes the eternal truths about all reasoning. Most of his schemas are relatively recent discoveries of ideas. He thinks humans are getting better at intellectual life.

      It implies (if we go beyond Flynn's own disciplinary egocentrism — he writes as if the disciplines he happens to know about are really the only ones anyone has any need to learn from for critical thinking) that every discipline might have one or two of these: what are they?

      He has in effect written a short textbook for teaching his version of critical thinking: this is an enterprise we could take up and apply to a broader canon of cross-disciplinary ideas. A model for a way forward.

      And if we react against his approach, then that too is interesting. He is arguing for all citizens being equipped with general critical thinking. But in reaction, I could point out that all humans (from a nuclear family to industrial society today) use specialisation of intellectual labour. We are defined by it. So any assumption that everyone should know the same seems profoundly blind: a madly out of date view of knowledge. In fact we operate by relying on others for almost all the knowledge which underpins our lives.

    My own views

    On methods: if CT is a useful transferable skill, then like nearly everything it comes from direct practice (not osmosis, infection or spontaneous generation), not as a byproduct of education. If we take this seriously, we need to consider that each subpart may need its own practice, including both authoring and reading separately: making holistic implicit judgements, making reasoned judgements on argument validity, constructing new arguments, and making careful analyses of arguments, not to dismiss them as invalid or dishonest but to decide which of two different arguments should get more weight.

    On disciplines: what Perry's work ends up implying (although Perry himself failed to recognise this), but all the others ignore, is that CT is highly discipline dependent. It implies that it is part of the knowledge content of each discipline: nothing transferable about it. When someone doing their accounts goes over their spreadsheet tracking down an inconsistency, they are exercising CT for arithmetic. The methods for doing that (checking each calculation step separately, estimating approximate magnitudes in your head, calculating things in two independent ways, ...) are quite different from those that Flynn or Toulmin are considering. When physicists are confronted with a report implying that neutrinos travel faster than light, contrary to Einstein, they apply their own CT: e.g. is it the equipment, the calculations, or our theoretical assumptions that must give way? Many disciplines do not call it CT: but that doesn't mean it isn't, just that it is less general than some others would like.
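
    The arithmetic checks mentioned above (calculating a figure in two independent ways, and estimating its approximate magnitude) can be sketched as follows. The table of figures, the recorded total, and the helper name `plausible` are all hypothetical, made up only to show the shape of the check.

```python
# Hypothetical spending table: rows are months, columns are categories.
table = [
    [120, 45, 30],
    [110, 50, 25],
    [130, 40, 35],
]

# Route 1: add up the row (monthly) totals.
total_by_rows = sum(sum(row) for row in table)

# Route 2: add up the column (category) totals.
total_by_columns = sum(sum(column) for column in zip(*table))

# Magnitude check: 9 cells of very roughly 60 each suggests a total near 540.
rough_estimate = 9 * 60


def plausible(total, estimate, tolerance=0.25):
    """True if total lies within the given fraction of the rough estimate."""
    return abs(total - estimate) <= tolerance * estimate


# A grand total as (hypothetically) typed into the spreadsheet by hand.
recorded_total = 595

print("two routes agree       :", total_by_rows == total_by_columns)
print("magnitude is plausible :", plausible(total_by_rows, rough_estimate))
print("recorded total matches :", recorded_total == total_by_rows)
```

    Here the two independent routes agree with each other and with the rough estimate, so the mismatch with the hand-entered total points to that entry as the place to look for the inconsistency.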

    Summary of the space spanned by these views on CT

    The range of the above different approaches to CT could be re-considered as involving a number of independent dimensions or topics.

    1. Is CT general, or discipline-specific? (We may have to answer this differently for factual and conceptual knowledge, than for procedural skills.)

      Only users of the term "critical thinking" think it is general. Yet many disciplines spend no time at all teaching CT as a general skill. Instead they focus on the discipline-specific techniques of probing and achieving high quality conclusions. As judged by academics voting with their curriculum time, the vast majority believe that CT is not general but specific. In contrast to both, Flynn's view is that each discipline contributes one or two general CT schemas which everyone should learn regardless.

      Is the content a learner must acquire for a given topic changed by a requirement for CT? Perry's theory ends up implying that the requirement for CT has a major effect on the kind of content learners need to acquire about each topic. For instance with CT the learner must now learn not only conclusions but reasons for them, and additionally a "weight" representing how confident you can be of each conclusion.

      Kuhn on the other hand assumes there is nothing at all to learn differently in each area. Flynn suggests that a very few technical ideas should be extracted from different disciplines, and applied everywhere.

    2. If CT is, or entails, a special skill or procedure, then what is the skill, or skills?
      • Is it holistic judgement?
      • Is it authoring new arguments?
      • Is it critiquing arguments?

    3. Format vs. validity. Is there a distinction between the format (surface structure?) of an argument, and its validity/meaning?
      • Kuhn and Toulmin offer a general account of arguments and can be used to do one kind of CT e.g. something isn't a valid argument because it doesn't give any reasons for its assertions.

      • But Flynn stresses another kind of CT which nevertheless attacks some arguments, which do give reasons, for being fallacious or invalid. This forces us to acknowledge that the approach of Kuhn and Toulmin can never reject major common fallacies which we see frequently in the media.

    4. CT in science. I haven't come across any account of CT in science. This probably reflects badly on CT advocates, rather than on science. If we are going to take the notion of CT seriously, we probably need to develop one. Here are a few notes towards this project.

      • Scientific method. Feynman said "The first principle is that you must not fool yourself -- and you are the easiest person to fool. So you have to be very careful about that. After you've not fooled yourself, it's easy not to fool other scientists." (Cargo cult science, Richard Feynman, 1974.)
        The grandiose claim would be that the scientific method is the best CT anyone has ever come up with. (And is not a cheap rhetorical trick, as CT can seem to be in some areas.)

      • Trouble shooting. Whenever an engineer or experimental scientist does trouble shooting on their equipment or experimental design, that is CT practically applied. I.e. to produce not words, but actions that solve a problem based on combining analysis and actual data/observations.

      • Real data and how to handle it. Most science degrees are poor (actually, are total failures) at getting undergraduates to wrestle with gathering real, messy data. But experimentalists have to apply extended and severe CT to their data: how to interpret it, improve the instrument, etc. Cf. Abercrombie (1960); and the physicist Deutsch observed by Turkle (1988, p.133ff).

      • Deduction, maths. The biggest part of CT in much of science and engineering is done in maths. Some of the most famous physicists were theoreticians, and essentially spent time only on mathematical deduction. However it is not called CT but checking the calculation or proof. You don't learn rival views, you learn how to do proofs, and how to find errors in them.

    5. These considerations lead me to suggest that there are really not three but four main Perry stages, and using these shows how some critical thinking is in fact rather undeveloped in character.
      • PerryA: learning true not false material. Easy because you don't need to learn reasons, just what is "right". Obviously not critical.
      • PerryB: Learning that there are alternative tenable views on a topic. Much of what passes for reasoning is in fact limited to working out the logical consequences of evidence for, and evidence against, a given view: but such deduction cannot by itself help decide the relative merits of two opposing views.
      • PerryC: Coming to a judgement on which view is best. Evaluative judgement is the heart of critical thinking. Whether it is a philosopher debating but not resolving issues; law students in adversarial arguments; or mathematicians / physicists working out the consequences of a theory; none of them are resolving issues in most of their work. They are in fact stuck at the PerryB level.
      • PerryD: But there is a stage beyond just making a judgement under uncertainty. And that is designing a scientific experiment to resolve the question. Very few students learn to do this, to go beyond reasoning and get new information to settle a question. Yet this is the whole and entire point of science (though science students are seldom trained to do it). It is all that makes modern science better than Aristotle. It should be viewed as the highest critical thinking, not least because it escapes the connotation of critical being destructive and negative.

        To rephrase this away from the perhaps parochial context of science disciplines, and into terms of everyday decision making and management: emergency decision making can only rely on information already to hand (PerryC), but if that is all you do you leave yourself, like a literary critic as opposed to a creative author, at the mercy of others' work. Better (PerryD) is to collect measures that tell you what you need to know (not just what is easy to collect). This applies in accountancy: do you just measure the balance of incomings and outgoings, or also cash flow (the time delay between these)? It applies in manufacturing, where a great part is finding the means to measure quality in each item, rather than waiting for complaints ...

    References


    Abercrombie,M.L.J. (1960) The anatomy of judgement: An investigation into the processes of perception and reasoning (Free association books: London).

    Downes, Stephen (2001) Guide to the Logical Fallacies

    Downes, Stephen (2005) How to Evaluate Websites

    Feynman,Richard (1974) Cargo cult science A Caltech commencement address. Also in Feynman, Richard P. (1985) "Surely you're joking, Mr. Feynman!": adventures of a curious character (New York, London: Norton)

    Flynn,J.R. (2009) What is intelligence?: beyond the Flynn effect (Cambridge : Cambridge University Press)

    Flynn,J.R. (2010) The torchlight list (Awa press: Wellington, New Zealand) [published, but currently expensive]

    Flynn,J.R. (2012) How To Improve Your Mind: 20 Keys to Unlock the Modern World (Wiley-Blackwell)

    Flynn,J.R. (2012) Fate & Philosophy: A Journey through Life's Great Questions (AWA press)

    Flynn,J.R. (2016) Does Your Family Make You Smarter?: Nature, Nurture, and Human Autonomy (CUP)

    Klein,S., Benjamin,R., Shavelson,R. & Bolus,R. (2007) "The Collegiate Learning Assessment: Facts and fantasies" Evaluation Review vol.31 no.5 pp.415-439 doi: 10.1177/0193841X07303318

    Kuhn,D. (1991) The skills of argument (Cambridge University Press: Cambridge).

    Nicol,D.J. (2010) The foundation for Graduate Attributes: developing self-regulation through self and peer assessment (Quality Assurance Agency for Higher Education).

    Perry, W.G. (1968/70) Forms of intellectual and ethical development in the college years (New York: Holt, Rinehart and Winston)

    Pollitt,A. (2012) "The method of Adaptive Comparative Judgement" Assessment in Education: principles, policy and practice vol.19 no.3 pp.281-300 doi:10.1080/0969594X.2012.665354

    Timmons,Luke (2014) "The uncritical commute: The impact of students' living situations while at university"

    Toulmin,S.E. (1958) The uses of argument (CUP).

    Turkle, Sherry (1988) Report on Project Athena. See p.133ff on the physicist Deutsch observed by Turkle.
