
What Use are Numbers in Environmental Social Science?





Tim Harford: How to Make the World Add Up. Ten Rules for Thinking Differently About Numbers. London: The Bridge Street Press, 2020.

On Numbers
Tim Harford’s (aka the Undercover Economist, a weekly columnist in the Financial Times) new book “How to Make the World Add Up” is about numbers. Numbers are – well, numbers. Nothing could be clearer; so we might think. However, if we always take numbers at face value, in every context, we end up on a slippery slope. Before getting to Harford’s book, I’ll begin with a small excursion into the world of numbers.

So, what are numbers? Preliminarily: numbers offer a device for being precise when making claims in which quantity matters. Quantity in the most direct sense is the number of items we are concerned with; in a slightly less direct sense, quantity is about measures describing an entity we are concerned with. Measures, in turn, can be “external” to the entity, like length or volume, or “internal”, like density or weight; but these grade into one another smoothly, depending on the situation. To sum up: numbers provide specifications for us concerning stuff in the world.

But this note invites another important specification: when describing an entity with numbers, we should make sure that the character of the numbers corresponds to the character of the entity of concern. Quite simply: whether a direct count of items in a collection is informative depends on what items are counted and for what purpose. In other words, there is a relation between what the items really are and what we want to know about them; an ontological connection, if you wish.

Simple examples are easy to come by: How many persons can enter a lift at the same time? – Are they all adults, or are they a couple with a number of small children? A more complicated (and much more relevant) example was provided by French physicist Henri Poincaré (1854-1912) with a note on why classical calculus is usable for describing the behaviour of atoms and other particles in gases:
“It might be asked, why in physical science generalisation so readily takes the mathematical form. … It is because the observable phenomenon is due to the superposition of a large number of elementary phenomena which are all similar to each other, and in this way differential equations are quite naturally introduced” (emphasis in the original).[1]

With the help of Poincaré’s aphoristic note we can articulate a principle: numbers, to make sense, must correspond in some ways with the nature and behaviour of the items of concern. Nobody is interested in the exact number of particles in a volume of gas; instead, we are interested in how the particles interact with one another, or, more precisely, what results from such interactions: what does a whole lot of gas do (as to volume, temperature, density)? This question opens the path toward systematic research in the field of statistical mechanics.

Poincaré’s note was about the behaviour of molecules in a volume of gas, but the same idea has been expanded to the realm of life, to population dynamics; this was initiated by the foundational figures of the field, Alfred Lotka (1880-1949) and Vito Volterra (1860-1940). Lotka’s justification for this solution is always worth citing:

“It would seem, then, that what is needed is an altogether new instrument; one that shall envisage the units of biological populations as the established statistical mechanics envisage molecules, atoms and electrons. …”[2]

Volterra came up with a similar analogy, most likely independently. They both realised that their object of research – biological populations – can be abstracted into vast collections of identical entities, analogous in this sense to a volume of gas molecules. – Later, a whole crowd of other population ecologists have relaxed this assumption in various ways, but that is a different story (albeit quite typical of how mathematics is used in the sciences).
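
To make the analogy concrete: the classic Lotka-Volterra model describes prey and predator populations as two coupled rates of change, treating the individuals as identical interacting units, just as statistical mechanics treats molecules. Below is a minimal numerical sketch in Python; the parameter values and starting populations are purely illustrative assumptions, not taken from Lotka or Volterra:

```python
# Minimal Euler-method sketch of the Lotka-Volterra predator-prey model.
# All parameter values here are illustrative assumptions.

def lotka_volterra(prey0, pred0, alpha=1.0, beta=0.1, delta=0.075, gamma=1.5,
                   dt=0.001, steps=10_000):
    """Simulate dx/dt = alpha*x - beta*x*y, dy/dt = delta*x*y - gamma*y."""
    x, y = prey0, pred0
    xs, ys = [x], [y]
    for _ in range(steps):
        dx = (alpha * x - beta * x * y) * dt  # prey grow, predation removes them
        dy = (delta * x * y - gamma * y) * dt  # predators grow by eating, then die off
        x, y = x + dx, y + dy
        xs.append(x)
        ys.append(y)
    return xs, ys

prey, predators = lotka_volterra(prey0=10.0, pred0=5.0)
# Both populations oscillate: prey peaks are followed by predator peaks.
```

The point of the sketch is the abstraction itself: no individual organism appears anywhere, only aggregate densities and interaction rates.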

By evoking statistical mechanics and population dynamics we move firmly into the realm of environmental research: to problems such as air pollution and demography. Everybody cognisant of environmental discourses knows the relevance of these problems, both historically – as foundational issues of the field – and in the light of current discourses. But how do we articulate the specific relevance of these problem areas, and what are the contexts that matter?

Philosopher Alfred North Whitehead (1861-1947) provided a framing for exploring this question, as follows:

“There are two contrasted ideas which seem inevitably to underlie all width of experience, one of them is the notion of importance, the sense of importance, the presupposition of importance. The other is the notion of matter-of-fact. There is no escape from sheer matter-of-fact. It is the basis of importance; and importance is important because of the inescapable character of matter-of-fact. We concentrate by reason of a sense of importance. And when we concentrate, we attend to matter-of-fact.” (emphasis added).[3]
This binary stimulates us to orientate further, following its two dimensions, as it were. The first dimension is specifying matters-of-fact; this is about finding out what actually exists or has happened, and invites the use of numbers whenever needed. Simple existence statements may refer to singletons, items that are unique and do not add up to any larger units. This is exceptional, however; much more commonly, numbers are useful for describing and analysing patterns and processes in collections of different sorts.

The second dimension is specifying reasonable criteria for importance. This is no simple task because several disjoint perspectives are on offer; Whitehead takes up four: morality, logic, religion, art (p. 11; italicised in the original). Relevant questions here include: Are these criteria important together, or separately? What is the role of context? Can numbers help to specify what the sense of importance is grounded in, according to these or other criteria and in variable situations? How reliable are numbers in the first place, and what sorts of numbers?

Tim Harford’s aim is to throw light on questions of this sort: he has organised his book as a series of ten rules, with an introductory story and an epilogue. The book includes splendid stories which make it a delight to read (literally). Let’s start the journey.

Addressing the World with Numbers
Introductory story: “How to lie with statistics”
Harford begins the journey by taking up a famous book, “How to Lie with Statistics”, published by the American journalist Darrell Huff in 1954. The book was received with great acclaim (and also read with enthusiasm by a teenager named Tim Harford, in company with innumerable other teenagers; well, I too was at some stage an enthusiast[4]). – However, Harford grew more and more uncomfortable with the book and its dramatic but carefully selected stories of the misuse of statistical reasoning. Moreover, as he later learned, something else happened on the statistical front in that very same year, 1954: “two British researchers Richard Doll and Austin Bradford Hill produced one of the first convincing studies to demonstrate that smoking cigarettes caused lung cancer.” (p. 3).
The thing is, demonstrating the connection between smoking and lung cancer required a good command of sophisticated statistics. Doll and Hill began to collect data on patients from London hospitals in 1948; the resulting data showed a positive correlation between smoking and cancer. To improve the credibility of the data, they sharpened the design by sending a ‘questionary’ to all doctors in the UK (59,600 altogether), asking them to report on their smoking habits. Doctors were a suitable target group as they would remain on the medical register, and their causes of death would also be preserved in archives held by the authorities. After a couple of years of waiting, Doll and Hill had a rich enough data set to show that smoking does indeed cause lung cancer; they published the results in The British Medical Journal.[5]

All environmental social scientists are familiar with what happened next: the tobacco industry launched a fierce counter-attack on the credibility of the research, emphasising all sorts of uncertainties that could possibly be evoked concerning the case; “correlation is not causation” is the basic refrain in this struggle, which was carried over first to air pollution and acidification, then to climate change.[6] The claim is, of course, valid as such, but over the years the statistical reasoning has been built upon much more varied and convincing materials than mere correlations; sophisticated statistical methods have been an irreplaceable help in the course of the fight. – So, ironically, the (in)famous foundational book playing down statistical evidence and the brilliant pioneering use of statistical evidence to cast light on a health calamity came out in the same year.

Indeed, numbers can gain the best kind of publicity and support from cases in which their use has been invaluable for the common good. We are living in the middle of another model case: the COVID pandemic. Originally, in early 2020, very little was known of the nature and behaviour of the virus. Surprisingly quickly, however, data detectives were able to dig up data that offered answers to a whole range of crucial questions; we all know the main features of the story. We have also learned what complicated problems the real uncertainties inherent in the COVID story can create.

After offering these two case descriptions as an introduction, Harford spells out his aim (p. 20):

“Many of us refuse to look at statistical evidence because we are afraid of being tricked. We think we are being worldly-wise by adopting the Huff approach of cynically dismissing all statistics. But we’re not. We’re admitting defeat to the populists and propagandists who want us to shrug, give up on logic and evidence, and retreat into believing whatever makes us feel good.”

The first rule: Search your feelings
The theme of the stories collected under this rule is motivated reasoning: cases in which people have believed what they wanted to believe. A more direct name for the phenomenon is wishful thinking.

Two Dutch gentlemen and their intermingled fates frame the description of the rule: the famous art historian Abraham Bredius, a specialist on the seventeenth-century master Johannes Vermeer, and the forger Han van Meegeren, who had succeeded in producing a new painting that was presented to Bredius as an original Vermeer. When presented with the forgery, Bredius fell into the trap, convincing himself that it was genuine (this happened in 1937). The case was later revealed, but only after this painting and several others had been acquired by Dutch art museums at high prices.

The story weaves together two deep ironies. First, Bredius was fooled precisely because of his deep knowledge of Vermeer’s known works, not in spite of it; this was “wishful thinking” in action. Second, van Meegeren lived through the war years of the Nazi occupation as a highly respected specialist in art history and enjoyed a luxurious position with plentiful privileges. After the occupation ended he was arrested, but in court he presented himself as somebody who had “fooled” the Nazis and successfully avoided their repression, not as the collaborator he actually had been. So successful was this forger of his own identity that just before he died in 1947, he was rated in a popularity poll as “(except for the prime minister) the most popular man in the country” (p. 49).
This double story brings into focus the importance of emotions in the shaping and ossifying of opinions on important issues: on the personal level in the case of the Vermeer specialist Bredius, fooled because of his specialism, and on the collective level in the case of van Meegeren, promoted to a hero because of the wish of the Dutch public to move forward from the destruction – physical and mental – of the occupation. Harford goes through other, more general and well-known cases of wishful thinking, such as denial of the AIDS–HIV link (or of the reality of HIV), of climate change, and of the COVID pandemic. These make heavy reading, but all these cases lead to a similar conclusion (p. 49):

“If wishful thinking can turn a rotten fake into a Vermeer, or a sleazy Nazi into a national hero, then it can turn a dubious statistic into solid evidence, and solid evidence into fake news. But it doesn’t have to. There is hope. We’re about to go on a journey of discovery, finding out how numbers can make the world add up. The first step, then, is to stop and think when we are presented with a new piece of information, to examine our emotions and to notice we’re straining to reach a particular conclusion.”

The second rule: Ponder your personal experience
This rule is about the relationship between personal experience and impressions of events in one’s surroundings, on the one hand, and statistical data describing the same events, on the other. The relation can work both ways: the statistical view (also called the “bird’s-eye view”) and the personal view (also called the “worm’s-eye view”) sometimes cohere, but sometimes they conflict. Harford asks (p. 53): “what should we do when the numbers tell one story, and day-to-day life tells us something different?”
There is no simple and straightforward answer. First of all, personal views can be distorted for several reasons. Opinions and beliefs among our peers also have an influence. An example is suspicion concerning vaccines. Harford takes up the widely held belief that the vaccination against measles, mumps and rubella (MMR)[7] may cause autism, although this belief has been shown to be wrong in several well-designed studies; vaccine refusal during the COVID pandemic is a current example. Also, the stream of news in different media can create an expectation that colours everything that happens next, or sensational pieces can gain overwhelming weight. Harford calls this source of error naive realism: “the sense that we are seeing reality as it truly is”.
An insidious conflict of perspectives can arise if a particular type of statistic is collected in order to build up norms for activities within a particular field. For instance, several studies have demonstrated that if doctors are remunerated on the basis of successful operations, they tend to avoid taking on risky cases. Another similar example: when colleges in the USA were rewarded according to how selective they were in accepting students, the colleges improved their ranking by inviting as many applicants as possible, so that the percentage of acceptances declined. The danger inherent in using complicated statistical measures as indicators of the success of particular activities was described by psychologist Donald T. Campbell as follows:
“The more any quantitative social indicator is used for social decision-making, the more subject it will be to corruption pressures and the more apt it will be to distort and corrupt the social processes it is intended to monitor.”[8]
Campbell’s aphoristic maxim aptly captures controversies concerning the level of inflation, unemployment, and other difficult-to-measure social phenomena. As to overcoming the problem of the bird’s-eye view vs. the worm’s-eye view, Harford’s advice is (p. 69): “try to take both perspectives… They will usually show you something different, and they will sometimes pose a puzzle: how could both views be true?” – And as fuel for the imagination of us readers he offers the address of the website Dollar Street, which offers photos and short videos of the lives of families from different countries and different social positions.[9]

The third rule: Avoid premature enumeration; don’t put things into numbers too hastily
Numbers are in a key position in all statistical reasoning; but the questions to always ask are: What do the numbers stand for? What is being counted, and for what purpose? If such questions are not clarified at the outset, the resulting figures may lead into a blind alley. Harford’s term for this type of blunder is premature enumeration: coming up with numbers without being clear about what they mean.

Harford takes up a couple of dramatic examples. The first is an observation, dating to the mid-2010s, that there was considerable variation in the rate of deaths among newborn babies between different counties in England. An instant suspicion arose that there was corresponding variation in the quality of the care offered at different hospitals. However, it turned out that the difference was due to variation in the “cut-point” between stillbirths and live births adopted by different hospital groups, the critical difference being twenty-two vs. twenty-four weeks of pregnancy. If babies born within this margin are counted as live births rather than stillbirths, the increase in the fatality rate is sufficient to explain the difference. Of course, the death of a newborn is a tragedy in any case, but the implications for the quality of care are quite different depending on how the “cut-point” is defined.
Another case open to misinterpretation is the number of gun deaths in the USA: the total was 39,773 in 2017. The figure is certainly an indicator of widespread gun violence, but when it is complemented with the note that about 60 % of the fatalities were suicides, the implications change. On our home turf in Finland, we could take the number of fatal traffic accidents per year as a comparative case. The figure remains tragically high, no doubt, but we also know that a number of the accidents are actually suicides: does this matter when the focus is on traffic safety?

Harford offers a detailed exploration of indicators that are used to demonstrate differences in wealth across countries; “an important issue, but also an issue about which many people have very strong beliefs, yet a weak grasp of the definitions involved.” – The whole section (pp. 82-92) is worth reading several times to make sure that we understand where the potential errors and pitfalls in this case are hidden. The question is: What are relevant indicators of wealth, and what do they indicate?
To introduce the issue, Harford takes up a widely publicised statistical claim put forth by Oxfam in 2014 and reported by the Guardian with the following title: “85 Richest People as Wealthy as Poorest Half of the World”; he also cites other strikingly misleading pieces of news based on the same Oxfam numbers. Next he shows in detail that the figures are highly misleading. The key problem is the way “richness” was measured. “What’s being measured is net wealth – that is, assets such as houses, shares and cash in the bank, minus any debts.” (p. 85).

While the figures Oxfam used were as good as can be dug from public sources – which are quite good – the measure itself leads astray. As Harford shows in detail, there is nothing surprising in the fact that “net wealth” has accumulated to people in the upper classes of rich societies; it is a good indicator of their economic security. However, it is worthless for bringing out significant variation in types of poverty. After going through the data used by Oxfam (a slightly updated version), Harford concludes (p. 88): “Very roughly speaking, the richest half a billion people have most of the money in the world, and the next billion have the rest. The eighty-five staggeringly wealthy super-billionaires are just a handful, so they own less than 1 per cent of this total.”
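The pitfall in the “net wealth” measure is easy to demonstrate with a toy calculation. All figures below are invented for illustration (they are not Oxfam’s data): since net wealth is simply assets minus debts, an indebted person in a rich country can rank “poorer” than someone who owns almost nothing.

```python
# Toy illustration of the "net wealth" measure (assets minus debts).
# All figures are invented for this example, not taken from Oxfam.

people = {
    "recent graduate with student loans": {"assets": 5_000, "debts": 60_000},
    "subsistence farmer": {"assets": 300, "debts": 0},
    "mid-career professional": {"assets": 250_000, "debts": 100_000},
}

net_wealth = {name: p["assets"] - p["debts"] for name, p in people.items()}
poorest_first = sorted(net_wealth, key=net_wealth.get)

# By this measure the indebted graduate is the "poorest" person here,
# which says little about actual living standards.
```

The sketch shows why aggregating the bottom of a net-wealth distribution mixes the destitute with the comfortably indebted.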
Harford continues by looking at some other possible ways of measuring inequality – which is worth reading carefully, as I already mentioned – and concludes (pp. 91-2):

“Statisticians are sometimes dismissed as bean-counters. The sneering term is misleading as well as unfair. Most of the concepts that matter in policy are not like beans; they are not merely difficult to count, but difficult to define. … The solution, then: ask what is being counted, what stories lie behind the statistics. … What I hope we’ve learned over the past few pages is that the truth is more subtle yet in some ways easier [than offered by sheer numbers]: our confusion often lies less in numbers than in words.”

The fourth rule: Step back and enjoy the view
This rule is about the relation between ‘matter-of-fact’ and ‘importance’ that I took up above, citing Whitehead – namely: Harford’s advice is that we always look at the bigger picture at the same time as we focus on a remarkable single fact or event. Bad news tends to dominate the news stream, but increasing the temporal or spatial horizon, as the case may be, puts pieces of news into better perspective. Typically, people think the situation on any indicator is better in their own environment than in the nation as a whole; for instance, in the UK immigration is considered a problem on the national scale more commonly than locally. As we know, the setting is quite similar on our home turf in Finland. – Which of these perspectives is more appropriate?

There is an interesting asymmetry between good and bad news – both as they come through in the media and as they can be imagined as potentialities in the immediate future. In the media: dramatic accidents are almost instantaneous whereas positive developments unfold slowly. Thus, it is to be expected that media competing for their share of the public attention tend to emphasise dramatic events. On the personal level: it is much easier to imagine what bad could happen tomorrow or next week than to think what improvements in one’s life could happen within the same time window. Good statistics presenting long-term trends or developments across larger areas might offer a remedy by providing a proper perspective; Harford’s advice (p. 109):

“So however much news you choose to read, make sure you spend time looking for longer-term, slower-paced information. You will notice things – good and bad – that others ignore.”

The fifth rule: Get the back story
The main theme of this rule is survivorship bias, i.e. the tendency for only those cases to be recorded that have “survived” an earlier, unintended screening – which, of course, means that the sample finally analysed tends to be biased. This rule, too, originates from a story: during WW2, the mathematician Abraham Wald was asked by the US air force to give advice on how to reinforce their planes. The material available comprised all the planes returning from their sorties and the traces of damage – bullet holes – that showed up on them. The problem with this material was precisely survivorship bias: planes that were fatally damaged never returned to base. – The point of the rule is (p. 115):

“[W]hat we see around us is not representative of the world; it is biased in systematic ways. Normally, when we talk of bias we think of a conscious ideological slant. But many biases emerge from the way the world presents some stories to us while filtering out others.”
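
Wald’s insight can be illustrated with a toy simulation; the hit locations and survival probabilities below are invented for illustration, not historical data. Hits land uniformly across the plane, but engine hits are usually fatal, so the returning planes show few engine hits:

```python
import random

# Toy simulation of Wald's survivorship-bias insight.
# Sections and lethality probabilities are invented assumptions.

random.seed(0)
SECTIONS = ["engine", "fuselage", "wings", "tail"]
LETHALITY = {"engine": 0.8, "fuselage": 0.1, "wings": 0.1, "tail": 0.1}

all_hits = {s: 0 for s in SECTIONS}       # what actually happened
returned_hits = {s: 0 for s in SECTIONS}  # what the analysts get to see

for _ in range(10_000):
    section = random.choice(SECTIONS)     # hits land uniformly at random
    all_hits[section] += 1
    if random.random() > LETHALITY[section]:  # plane survives and returns
        returned_hits[section] += 1

# Among returning planes, engine hits are rare -- not because engines are
# rarely hit, but because engine hits rarely come home to be counted.
```

Looking only at `returned_hits` would suggest armouring the fuselage; Wald’s point was the opposite: armour the places where the returning planes show no damage.
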
From this presentation of the principle of the rule, Harford moves to a story at the heart of academic research: the reproducibility – or not – of results published in prestigious journals. The problem first rose to attention in psychological research. In 2005 the epidemiologist John Ioannidis published a paper entitled “Why Most Published Research Findings are False”; he has later addressed what is known as the “replication crisis”, i.e. the fact that a large proportion of the results of empirical research published in scientific journals cannot be reproduced. Harford was originally sceptical of the “hyperbole” of the title but later became more and more convinced that the case is real; his description of the discussion that followed is well worth reading and learning from (pp. 127-143). He ends the section by recommending two invaluable internet resources that publish critiques of medical and social policy research: warm recommendations![10]

How come? – Harford notes that most of the problems are due to publication bias, a close kin of survivorship bias – i.e., journals are “simply” more likely to publish research presenting new, preferably sensational results than research trying to replicate old results. Ordinarily, replications do not survive long enough to get published. But in this mess there is also hidden the possibility of less innocent malpractice: purposeful fudging of specific aspects of the research procedure. Harford offers as an example an informative, demonstrative test case set up by three psychologists (p. 126), and comments on their purposefully fudged “results”: “All utter nonsense, of course – but utter nonsense that bore an eerie resemblance to research that had been published and taken seriously.”

The sixth rule: Ask who is missing
Quantitative empirical research in the social sciences is relatively new; by and large, it dates back to the decades following the Second World War. As an introduction to his sixth rule, Harford takes up the famous experiments conducted by a pioneer of the field, Solomon Asch, in the early 1950s; the research was about social conformity in opinion formation. The results are fascinating, but are they representative of “human nature”? – Well, there is a problem; as Harford writes (p. 147): “Was it really necessary that not a single one of his experimental participants, neither the stooges nor the subjects, was female?”

The question this observation triggers is: who is included in the samples? An all too common bias is that some part of the target population is simply left out of the population actually sampled. Gender bias has been rampant in medical research up to recent times. A current example comes from the early weeks of the COVID pandemic; as Harford narrates (p. 150):

“The gender blind-spot has yet to be banished. A few weeks into the coronavirus epidemic, researchers started to realise that men might be more susceptible than women, both to infection and to death. Was it because of difference in behaviour, in diligent hand-washing, in the prevalence of smoking, or perhaps a deep difference in the biology of the male and female immune system? It wasn’t easy to say, particularly since of the twenty-five countries with the largest number of infections, more than half – including the UK and the US – did not disaggregate the cases by gender.”

In more technical terms, the bias represented here can be described as the difference between sample error and sample bias. The former is primarily about sample size: the larger the sample, the more reliable the analysis that is possible. But the size of a sample is not all there is to it; a more insidious distortion is created if the sample is systematically biased, if some part of the target population is missing. This is a problem haunting all international-level efforts to measure, for instance, the success of sustainable development using any of the indicators developed for this purpose. Harford takes up examples in which disaggregation according to gender would be necessary but cannot be carried out because of the nature of the survey data.
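
The distinction can be made concrete with a small simulation (all numbers below are invented for illustration): a modest but representative sample lands near the true mean, while an enormous sample that systematically misses one subgroup is precise about the wrong answer.

```python
import random

# Toy contrast between sample error and sample bias; all numbers invented.
random.seed(1)

# A population with a hard-to-reach subgroup (30%) whose values differ.
population = [10.0] * 70_000 + [20.0] * 30_000   # true mean = 13.0

def survey(n, reach_everyone):
    """Average n random draws; a biased frame misses the last 30%."""
    frame = population if reach_everyone else population[:70_000]
    return sum(random.choice(frame) for _ in range(n)) / n

small_fair = survey(100, reach_everyone=True)        # noisy but unbiased
huge_biased = survey(100_000, reach_everyone=False)  # precise but wrong

# The huge sample converges confidently on 10.0, far from the true 13.0:
# more data does not cure a biased sampling frame.
```

Growing the sample shrinks sample error; it does nothing about sample bias.
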

He also notes that some proponents of the use of ‘big data’ claim that in some data sets the sample covers the whole population. However, the assumption “N = All” is generally bonkers; there are almost always some corners of the population that are not reached by the survey or poll. As we have recently learned, this problem concerns election polls: a portion of the respondents are less likely than the rest to respond honestly.

The lesson? – Be alert (p. 162):
“[O]ften a quick investigation will reveal that the study has a blind spot. If an experiment studies only men, we can’t assume it would have pointed to the same conclusion if it had also included women. If a government statistic measures the income of a household, we must recognise that we’re learning little about the sharing of that income within a household.”

The seventh rule: Demand transparency when the computer says no
Quite a few of us probably remember the hype produced back in 2009 by “Google Flu Trends”, a search-based algorithm published by Google that was supposed to reliably predict flu epidemics well in advance of the responsible institution, the Centers for Disease Control and Prevention. The algorithm was based on statistics describing how often Google came across searches using terms that suggested people were anxious about falling ill with flu. Google Flu Trends was even hailed by the science journal Nature as a revolutionary invention and use of ‘big data’. However, the performance of Google Flu Trends declined over the next few years, and the algorithm was shut down in 2015. This is the story Harford uses as the introduction to his discussion of computers and ‘big data’.

So, what happened? – The details are still obscure, as Google famously does not reveal details of the algorithms it builds and uses. What is obvious, however, is that the correlation detected by Google was spurious, probably for several, perhaps interconnected, reasons; a mere correlation in how people use a search engine does not provide reliable insight. “If you have no idea about what is behind a correlation, you have no idea what might cause that correlation to break down.” (p. 167).
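
Why a trawled-up correlation breaks down can be shown with a toy experiment; the “flu rates” and “search terms” below are invented random noise, not real data. If you search through enough unrelated series, one of them will track your target impressively over a short window, and then fail on new data:

```python
import random

# Toy demonstration of a spurious correlation breaking down out of sample.
# "Flu" and "search term" series are pure invented noise.
random.seed(2)

def corr(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

flu = [random.random() for _ in range(12)]  # 12 "months" of flu rates
terms = [[random.random() for _ in range(12)] for _ in range(2000)]

# Trawl 2000 unrelated "search terms" and keep whichever happens to
# track flu best over the first six months.
best = max(terms, key=lambda t: corr(t[:6], flu[:6]))

in_sample = corr(best[:6], flu[:6])      # looks impressive by construction
out_of_sample = corr(best[6:], flu[6:])  # nothing behind it, so it breaks down
```

Since there is no mechanism linking the winning series to the target, its in-sample fit is an artefact of the search itself, which is one plausible reading of what happened to Google Flu Trends.
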

In a discussion with an experienced statistician, Harford summarises the cheerleading claims concerning ‘big data’ as follows: the algorithms supposedly show the merits “… of uncanny accuracy, of making sampling irrelevant because every data point was captured, and of consigning scientific models to the junk-heap because ‘the numbers speak for themselves’.” – Quite expectedly, the statistician comments on these claims (pp. 173-4): “complete bollocks. Absolute nonsense. … There are lots of small data problems that occur in big data. They don’t disappear because you’ve got lots of the stuff. They get worse.”

So much for ‘big data’; but, as Harford notes in the stories in the rest of the chapter, the use of algorithms is well defensible in many choice situations: it is quite possible that an algorithm performs better than a human specialist worn down by stress, excessive working hours, momentary depression, or the like. Diagnosis of a medical condition is a case that was scrutinised already in the pre-computer era, using simple checklists, with quite good success. In any case, it is important that the criteria of choice be made public, whether the agent is a person, a checklist, or a computer algorithm. – As Harford notes, this maxim has an analogy in the early stages of modern science: a critical difference between alchemy and scientific experimentation in the early modern era was precisely the secrecy of the former and the publicity of the latter.

The eighth rule: Don’t take statistical bedrock for granted
As an introduction to this rule, Harford tells a story about a car crash in Washington, DC on Monday, 9th October, 1974. The driver (pathologically drunk): Wilbur Mills, “one of the most powerful men in the United States … an Arkansas congressman since the 1930s, who as the long-serving chair of the House Ways and Means Committee effectively had veto power over most legislation.” The passenger: Annabelle Battistella (aka Fanne Foxe), “an erotic dancer at the Silver Slipper nightclub”.
The story functions as an introduction to a description of – or rather an homage to – the importance of the Congressional Budget Office (CBO), established in the aftermath of the event. (How on earth were the car crash and the establishment of the CBO entwined? – You’ll find out on pp. 197-201.)

The thing is, as Harford shows, that reliable official statistics cannot be taken for granted. Quite the contrary: both in the US and in Britain there have been efforts to change statistics in accordance with the wishes of this or that political agent. This is a permanent worry in authoritarian countries. Most recently – as of this writing – the Turkish president Recep Tayyip Erdoğan sacked the head of the state statistics agency because it released data showing that last year’s inflation rate hit a 19-year high of 36 % (Financial Times, January 29, 2022; this was for December 2021, and for January 2022 the rise was 48.7 %).[11] – In the genre represented by Erdoğan, Donald Trump was in a class of his own, but Tim Harford tells hair-raising stories also from Greece, Argentina and India.

Political pressures on the people in charge of official statistics have at one time or another been heavy almost everywhere. Here, too, publicity is the remedy Harford promulgates; but there are no guarantees. He wraps up the rule with a paean to labouring statisticians (p. 226):

“When a country’s national statistics fall short, an international community of statisticians will complain. When an independent statistician is attacked or threatened by politicians, that same community will rally to his or her defence. Statisticians are capable of greater courage than most of us appreciate. Their independence is not something to take for granted, or casually undermine.”

The ninth rule: Remember that misinformation can be beautiful too
Visual presentation is increasingly used as an argument in all media – increasingly because efficient methods for producing striking graphs are commonly available and easy to use. Harford’s next rule is that we had better look under the surface: stylish illustrations may carry serious biases in the message they deliver.

An unexpected hero of the story introducing the rule is Florence Nightingale, better known as the dedicated nurse she was among the British troops during the Crimean War in the mid-19th century. But she was primarily a statistician, the first woman to become a fellow of the Royal Statistical Society. Moreover, she was a pioneer in presenting statistical data as diagrams (p. 230): “Her ‘rose diagram’ was arguably the first ever infographics. That makes her perhaps the first person to grasp that busy, influential people would pay more attention to a vivid diagram than to a table of numbers.”[12]

Harford states a clear starting point for this rule (p. 231): “Most of the data visualisation that bombards us is decoration at best, and distraction or even misinformation at worst.” He turns out to be a great fan of clear visualisation; so, this perspective offers him a good chance to provide us with a critical overview of all the ways in which visualisation can lead astray.

First of all, we should be clear about what the numbers a figure draws together actually stand for, and make sure that the categories are comparable in the way the picture suggests. A particular error he brings forth is comparing stocks with flows when assessing the costs of some particular procedure; as an example he takes up estimates of the cost of the Iraq war (pp. 237-8): “That’s the equivalent of comparing the total cost of buying a house with the annual cost of renting one; it’s not a trivial confusion.”

Florence Nightingale is present in the text throughout, as an example of someone who understood that ultimately the soundness of the underlying data is a key question. Visualisations may be deceptive precisely because the visual force tends to take over and, as a consequence, when looking at a striking image we are easily carried away by subjective preferences and moods – the themes Harford takes up under previous rules.

The tenth rule: Keep an open mind
This rule is a warning against blind stubbornness. The anti-hero of the story is Irving Fisher, one of the greatest economists of the last century. Fisher, like many other prominent economists, was an active participant in the financial markets, originally with great success. However, when the economy began to slide toward the Great Depression in 1929-30, he stubbornly believed that the downturn was temporary – and suffered total financial ruin.

The story is a strong reminder that even specialists – if not specialists in particular – filter new data through their established prejudices. Harford takes up a couple of surprising examples from the exact sciences. They circle around the gradual stabilisation of the estimates of several important constants of nature (the electric charge of a single electron; Avogadro’s number; Planck’s constant). We need not go into detail here, but the background is that such constants are extremely difficult to measure; hence, it is no surprise that the estimates have become more and more accurate over the decades. The stories demonstrate that this improvement in accuracy was retarded by an attachment to previous estimates; in other words, when scientists interpret the results of their meticulous, instrument-driven work, they are likely to be influenced by previous experience, just like any of us in our everyday lives.

Another example Harford takes up is the forecasting of social changes. His hero is the psychologist Philip Tetlock, who has applied ingenious empirical methods to assess how well specialists predict political and geopolitical processes. The results: forecasting usually fails, sometimes miserably, and typically the specialists come up with ad hoc explanations for their failures that seem credible afterwards but are actually cooked up.

Furthermore, there is a public dimension to this phenomenon, too, originally (most probably) recognised by psychologist Kurt Lewin: making a public argument “freezes” attitudes in place concerning that question – the more strongly if it later turns out that the argument was false.

At the end of the chapter, Harford takes up another great 20th-century economist, John Maynard Keynes, as a sort of antidote to Irving Fisher’s stubbornness. As a piece of advice, he offers a quotation attributed to Keynes (which Keynes may or may not actually have said, but the idea is important anyway; p. 279):
“When my information changes, I alter my conclusions. What do you do, sir?”

The final story: Be curious!
To get going with the final story, Harford goes briefly through the titles of his ten rules. Then he takes up some examples of a theme that has cropped up here and there in the previous chapters: the prevalence of motivated reasoning, i.e. stabilised views on controversial matters that are glued to existing prejudices and strengthened by tribalism.

Depressing? Yes, certainly. – But Harford offers a remedy: curiosity. He reviews studies in social psychology confirming that curiosity is, indeed, an identifiable characteristic that helps those who have it to break loose from preconceived, fixed ideas. To wrap up the chapter, he offers a maxim for all of us who want to clarify important but often difficult ideas to other people in such a way that we ourselves learn something new, too (p. 293):

“Those of us in the business of communicating ideas need to go beyond the fact-check and the statistical smackdown. Facts are valuable things, and so is fact-checking. But if we really want people to understand complex issues, we need to engage their curiosity. If people are curious, they will learn.”

What About Environmental Social Science?
I am curious about what Tim Harford’s ten rules have to do with environmental social science. He does not offer much material that refers directly to environmental issues; rather, it is his approach that is highly relevant. He takes up perspectives and issues related to numbers that we, as environmental social scientists, often come across.

Harford’s stories offer an antidote to a common view that qualities cannot be put into numbers, not convincingly anyway; I have held this prejudice as firmly as anybody else. The problem is real, but Tim Harford’s ten rules offer inspiration to explore the issue more deeply. Numbers can be used in more imaginative ways than most of us have been able to believe.

Our task is to carry the work forward. In this context, I’ll be satisfied with naming some problem fields that I think deserve focused attention:

[1] Estimates of the current human ecological predicament are most commonly presented as numerical aggregates on the global scale. Familiar themes include ‘Anthropocene’; ‘footprints’ of various types; ‘Safe Operating Space’ of humanity; ‘Millennium Ecosystem Assessment’; sixth mass extinction; and so on. These estimates give rise to three kinds of questions:
First: What is the quality of the data sets underlying the estimates? Is there significant variation “within” the aggregates? Are various factors combined in a balanced way when forming the aggregates? “Who is missing” [cf. Rule 6]?
Second: Are the numbers and calculations sound? Are there “back stories” such as “survivorship bias” [cf. Rule 5]? How about “premature enumeration” [cf. Rule 3]?
Third: Can we identify dynamic processes that maintain the internal unity of the phenomena described? – Climate change is driven by the global dynamics of weather; how about the others? If the underlying processes are variable by nature, aggregates may be misleading.
[2] Do we know credible ways of assessing, using numbers, the success vs. failure of particular policy measures? – This works, at least in principle, in the case of cap-and-trade procedures, but it would be much more difficult in the case of qualitative targets such as Best Available Technology (BAT). Do vested interests (“greenwashing”) interfere with assessment? How about indicators that become corrupted when used [see Rule 2]?
[3] What numbers are usable for describing and analysing the criteria people use for choices in their personal lives? Well-being and affluence are clearly important standards. Can we credibly claim, using numbers as a support, that a shift to more sustainable ways of life doesn’t undermine targets people have set for themselves? Is this important?
[4] Finally, a question relevant for all qualitative evaluation: Can we avoid tribalism – “motivated reasoning” [cf. Rule 1] and “naive realism” [Rule 2]?

A beautiful thing about numbers is that they never stand still. Quite to the contrary, numbers have references that are in constant movement. I take one more citation from Whitehead’s Modes of Thought (p. 93): “All mathematical notions have reference to process of intermingling. … There is no such entity as a mere static number. There are only numbers playing their parts in various processes conceived in abstraction from the world-process.”

In the spirit of Harford’s book we can adopt a pragmatic perspective on the usability of numbers: on how to work with numbers, and how to let numbers work on us. But it is wise to strengthen our understanding of what the potentialities are. My late colleague, the evolutionary biologist Richard Levins (1930-2016), explained his view of mathematics as follows:

“Mathematics has various tasks in science. It allows us to make predictions that can be tested. But it is something more: its most important task is to educate the intuition so that the obscure becomes obvious. Complexity is overwhelming not because it is intrinsically incomprehensible but because we have posed the problems badly, and with a change of vision it becomes more manageable.” (emphasis in the original).[13]

I have a personal take on this theme, too: One of my frustrations when developing the curriculum of environmental policy in Tampere was that there was no space (i.e., no resources) to develop a decent course on understanding numbers; instead, our (poor) students had to sit through a basic course in statistics offered by professionals in the field. Introduction to statistical theory dominated the course; the teachers seemed not to have any interest in helping the students to grasp what numbers mean in our field. An educational opportunity wasted, if there ever was one.

Yrjö Haila

[1] Henri Poincaré: Science and Hypothesis. New York: Dover, 1952, 158.
[2] Alfred Lotka: Elements of Mathematical Biology. New York: Dover, 1956, 39.
[3] Alfred North Whitehead: Modes of Thought. New York: The Free Press, 1968, 4.
[4] Most Finns probably know the stupid rhyme: “vale – emävale – tilasto” [lie – big lie – statistics].
[5] Harford’s source is Conrad Keating: Smoking Kills. Oxford: Signal Books, 2009.
[6] A classic exposition: Naomi Oreskes & Erik Conway: Merchants of Doubt. How a Handful of Scientists Obscured the Truth on Issues from Tobacco Smoke to Global Warming. London: Bloomsbury Press, 2010.
[7] The MMR “triple vaccine”, i.e. the combined measles, mumps and rubella vaccine.
[8] Original in his “Assessing the impact of planned social change.” Evaluation and Program Planning, 2(1), 1979; Harford p. 64.
[9] – warmest recommendations!
[11] and, respectively.
[12] Nightingale’s “rose diagram” presented the losses among the troops in the Crimea arranged in a circular graph as sectors, each one of which represented a causal category. A most casual glimpse would reveal that the sector representing deaths by epidemic disease captured a far higher proportion of the graph than deaths by battlefield wounds.
[13] “Educating the intuition to cope with complexity”, in Richard Lewontin & Richard Levins: Biology Under the Influence. Dialectical Essays on Ecology, Agriculture, and Health. New York: Monthly Review Press, 2007, 185.