Section 6: We need a “soft arithmetic” for causal maps

The forthcoming sections are the main theoretical contribution of this guide. They are an attempt to put together a kind of “soft arithmetic” for causal maps: a set of rules which tell us how to combine different pieces of causal information and how to make deductions with them. To do so, we have to address a wide and challenging range of issues. Many of these issues are familiar to practitioners like QuIP coders, indeed to anyone who tries to piece together fragments of causal information, but in most cases there is no consensus on how they should be resolved. I’ll be arguing that if we want to address even the most basic needs of evaluation users, we are compelled to make use of some sort of numbers when encoding causal information – but in such a way as to keep things as fuzzy as they really are, rather than demanding or claiming an unrealistic level of precision.

6.1 An ambitious project

Yes, this attempt to find a common logic behind the different approaches that come under the rubric of causal maps – from Theories of Change to Structural Equation Models (see the list here) – is ambitious.

6.2 Users of a causal map expect to be able to deduce some kind of comparative information from it

On the basis of a causal map, we must in principle, at least some of the time and with suitable qualifications and caveats, be able to ask typical (evaluative?) questions like these:

Was the influence of C on E in some sense positive, or upwards, or increasing?

Was the influence of C on E not just theoretically present but of some meaningful size?

Is the influence of C on E bigger / more important than the influence of D on E?

Is C one of the biggest / most important influences on E?

Is the influence of C on E meaningfully bigger / more important than the influence of D on E?

… as well as, of course, questions about combinations, like:

Is the combined influence of C and D on E important?

Do both C and D act to increase E?

Will C affect E if D is high/present?

and so on.

In particular, we would also like to be able to ask them when E is not an immediate daughter of C but lies further downstream of C.

(Going a step further, we’d like to be able to say things like “show me only the strongest influences in this map” or “show me a summary overview with only the most important things”. This kind of summarising and aggregation presupposes the ability to make statements like those above.)
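To make this concrete, here is a minimal sketch in Python of what such queries implicitly require. Everything in it – the `Link` structure, the `strength` and `direction` fields, and the rule of multiplying strengths along a path – is purely illustrative and invented for this example, not a feature of any particular causal mapping tool; the point is only that a “strongest influences” filter, or a question about a factor further downstream, presupposes something number-like on each arrow and some rule for combining it.

```python
from dataclasses import dataclass

@dataclass
class Link:
    """One arrow in a causal map. The numeric fields are illustrative:
    'strength' is some rough, fuzzy weight and 'direction' is +1 or -1."""
    source: str
    target: str
    strength: float   # hypothetical hidden property, however fuzzy
    direction: int    # +1 = increases the target, -1 = decreases it

links = [
    Link("C", "E", 0.7, +1),
    Link("D", "E", 0.3, +1),
    Link("E", "F", 0.5, +1),
]

# "Show me only the strongest influences" already presupposes comparability.
strongest = [l for l in links if l.strength >= 0.5]
print([(l.source, l.target) for l in strongest])  # [('C', 'E'), ('E', 'F')]

# "Does C influence F further downstream?" presupposes some rule for
# combining strengths along a path; here, purely for illustration,
# we multiply them (one possible rule among many).
def path_strength(path):
    result = 1.0
    for link in path:
        result *= link.strength * link.direction
    return result

print(path_strength([links[0], links[2]]))  # C -> E -> F: 0.35
```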

I’d claim that these are among the central reasons clients commission QuIP reports (or indeed most other kinds of evaluation); in summary:

  • to get overview findings about the influence of C (an intervention) on some valued downstream factor like well-being — and in comparison with other interventions;
  • and to get more in-depth details about the whole causal map, what causes what, what helps the intervention, what hinders it, and so on.

… again, we can have caveats and discussions about who decides, who participates, etc.; and we should tirelessly emphasise the high level of fuzziness and uncertainty involved. Often we will answer questions like those above with “sorry, but we can’t really be sure”. But still, if we always answered “there were a lot of links between C and E, but we have no further information”, we would not be doing our job. It would be like a telecoms agency delivering only metadata about the communications history of a criminal – certainly not useless, but a thousandth as useful as knowing what was actually said, whether in detail or in summary. Nor would we be doing the main job of an evaluator, which according to Michael Scriven (Scriven 2012) is not just to describe what happened but to (help) judge whether it was good enough.

By all means we can let our evaluation stakeholders play with the encoded raw data and let them “draw their own conclusions”. But in order to draw their conclusions they have to know at least implicitly how to draw conclusions from causal maps. They might have better information than we do about the content, but our job as experts in causal maps is to advise them specifically on how to draw conclusions, how to make decisions about which parts to focus on, how to summarise them, and which parts to filter out.

6.3 Asking and answering those kinds of “typical questions” of a causal map boils down to assigning some kinds of numbers to its elements

The point is that these statements all presuppose, and make use of, a kind of “soft arithmetic” for causal maps. To be able to make any kind of “bigger than / smaller than” claims about causal maps, we need the individual arrows to have, as a minimum, some (probably hidden) properties like “strength” and “direction”, in some sense, as well as a way of comparing those properties. That is, we are already doing a kind of vague arithmetic. To be able to make comparisons like “the influence of B on E is bigger than the influence of C on E”, we are essentially implying that the arrows have some kind of numbers attached to them. That is, in a sense, the definition of what numbers are: attributes which justify comparisons. It remains to be seen what kinds of numbers those would be.
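Note that the numbers need not be precise measurements. Even purely ordinal, fuzzy labels behave like numbers the moment we use them to compare – here is a small sketch, where the labels and the rank mapping are entirely my own invention, not a proposed coding scheme:

```python
# Even vague ordinal labels become numbers the moment we compare them.
# These labels and ranks are purely illustrative, not any fixed scheme.
RANK = {"weak": 1, "moderate": 2, "strong": 3}

influences_on_E = {"B": "strong", "C": "moderate", "D": "weak"}

def bigger_than(a, b):
    """Is the influence of a on E bigger than the influence of b on E?"""
    return RANK[influences_on_E[a]] > RANK[influences_on_E[b]]

print(bigger_than("B", "C"))  # True

# "Which are the biggest influences on E?" relies on the same hidden ranks.
print(sorted(influences_on_E, key=lambda f: RANK[influences_on_E[f]], reverse=True))
# ['B', 'C', 'D']
```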

But, you say, you really don’t want to actually put numbers on the arrows or the variables on your maps; you really don’t want to claim that this is an effect of size 7.51, or -102, etc.

I don’t want to put numbers on the maps either. But numbers in some form have to be there in the background, however flexible and fuzzy and inaccurate, or we couldn’t make useful comparisons.

Maybe you often try to withhold judgement when asked to summarise a causal map. But if you see a hundred thick and well-evidenced arrows from smoking to cancer, and only a few from other causes, you’d be wrong to stay silent. The point is that at least in extreme cases like that, you’re implicitly doing soft arithmetic; you’re using some ideas about how to make a summary on the basis of the amount of evidence and how to combine that evidence. But what are the implicit rules of this soft arithmetic which you must be using?
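As a crude illustration of the kind of implicit rule this might involve – the evidence-counting and the 50% threshold below are just one guess at what such a summary judgement could rest on, not a proposed standard:

```python
from collections import Counter

# Each tuple is one reported causal claim from one piece of evidence.
claims = (
    [("smoking", "cancer")] * 100   # a hundred well-evidenced links
    + [("diet", "cancer")] * 3      # only a few links from other causes
    + [("genetics", "cancer")] * 2
)

# One candidate rule of 'soft arithmetic': count the pieces of evidence per
# claimed cause, and only report the causes that clearly dominate.
evidence = Counter(cause for cause, effect in claims if effect == "cancer")
total = sum(evidence.values())

summary = [cause for cause, n in evidence.items() if n / total > 0.5]
print(evidence)  # Counter({'smoking': 100, 'diet': 3, 'genetics': 2})
print(summary)   # ['smoking']
```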

6.4 Aren’t there strategies to encode causal information without using any kind of number?

You might say “ah, but I have a strategy for encoding causal information which does not involve any kind of number or even any idea of strength or gradation – for example, C just contributes positively somehow to E, basta.”

There might indeed be some such strategies, but they have various severe limitations. I’ve addressed the issue here.


A technical note

The enormous advantage of quantitative approaches in social science, which use ordinary integers and rational numbers, is that their arithmetic is straightforward and very well understood. This lets them jump over many of the issues which we have to tackle in the forthcoming sections.

But we have some advantages over the statisticians too. For one thing, the special status of causal maps within quantitative social science – the fact that they are a completely different kind of creature from mere correlative structures (Pearl 2000) – is still controversial in statistics.

Also, their central problem is “how can we get from these mere correlations to actual causal knowledge?”, whereas our causal maps already consist of (putative) causal knowledge; our problem is validating it, manipulating it, combining it and summarising it.

What I am trying to do in the next sections is apply a modest subset of Pearl’s ideas to the kind of ill-defined and fuzzy data we get in most actual social science, rather than the kind of quantitative data he (mostly) presupposes.