Section 63 Summary of the rules for inference in causal maps, aka “Soft Arithmetic”

63.1 The inference rules for causal maps

These are rules for reasoning, i.e. drawing inferences, with causal maps and/or ordinary causal narrative sentences. They tell us how to go from one set of maps and/or narratives to another.

63.2 The mini-map coding rule

Information like “the influence variables B, C and D all have some kind of causal influence on the consequence variable E” can be coded with a mini-map in which one or more variables (the “influence variables”) are shown with arrows leading to another (“the consequence variable”). The information and the map are equivalent.

I call this a “coding rule” but more generally it is the first rule of inference for causal maps. This first rule tells us how to go from a fragment of causal narrative to a diagram (coding) and back again (interpretation).

Mini-maps are the atoms of causal maps. One action of coding something produces one mini-map. You can build up any causal network from them. In 90% of practical applications, a mini-map will just contain a single influence variable. But we don’t want to get stuck when we need a package of two or more influence variables, e.g. when someone says “Both B and C affect E” or even “B and C interact to affect E”.

It’s crucial that a mini-map codes information about causality, not co-incidence. So the causal map “C → E” should not be interpreted along the pattern of “if you observe (a high level of) C you are more likely to observe (a high level of) E”, though that may or may not be a corollary of the causal information. The strongest and most correct interpretation is “if you intervene in the system and manipulate C, which may involve breaking any causal links from other factors to C itself, then this manipulation will produce a corresponding effect in E”.

The mini-map is also equivalent to a functional expression: \[E_{posterior} = f(B, C, D, E_{prior})\] This just says that the value of E is influenced via B, C, and D according to a function \(f\), which shifts the value of E away from our previous or prior best guess about E to a value determined not only by the influence variables but also that prior value.

63.3 Focus on “lo/hi” variables and functions between them

The function \(f\) above could be anything. The app itself is not fussy and should be able to cope with any function for which a definition exists or can be written in R. BUT in Gary / James’s article, and I think in QuIP, we focus on what I call “lo/hi” variables and functions between them. lo/hi variables vary between 0 “as low as you can get” and 1 “as high as you can get”.

So in the world of lo/hi variables, we will focus on a limited set of functions.

63.4 Recording the actual values of variables

It is also possible to record the reported actual value of a variable. So this involves coding information about the variable, not about the causal link, i.e. the coder is using a different widget in the app / a different part of the coding sheet.

In the case of lo/hi variables, this value can have a quite general meaning:

  • the value within an empirical range e.g. “this summer is as hot as we have ever had” (this summer temperature ≈ .95)
  • the value without reference to an empirical range e.g. “the teachers’ skills were appalling” (teachers’ skills ≈ .1)
  • the proportion of a set of things which have a binary property e.g. “most of the children seem happy” (happiness of the children ≈ .75)
  • the strength of the membership of something in a certain set e.g. “country X is only partly democratic” (country X democracy level ≈ .5)
  • the probability of a binary variable e.g. “the chances of war are now very low” (chance of war ≈ .1)
  • the strength of our information about a binary variable

63.6 The rules for coding different types of influence; single influence variable

These are the most important functions when (as usual) we just have a single influence variable:

  • PLUS: \[E_{posterior} = B\]
  • MINUS: \[E_{posterior} = 1-B\]
  • NECC: \[E_{posterior} = B* E_{prior}\]
  • SUFF: \[E_{posterior} = 1 - (1 - B)* (1 - E_{prior})\]

… it is easy to add more.

“PLUS” is just your normal “positive” influence; if B is high, E will be high; if B is low, E will be low, and so on.

“MINUS” is the opposite. High B causes low E.

“NECC” is just a necessary condition. For example: “you can’t have E without B”. Or “if you have E, you must have / have had B”. But NECC is generalised to cope with cases where B (and E) are not exactly 0 or 1. So you might know that a democratic government is necessary for (say) racial tolerance, but the country in focus is only partially democratic.

“SUFF” is just a sufficient condition. For example: “if you have B, you will get E too”. SUFF is also generalised.
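As a rough sketch of the four single-influence rules above, here they are as plain Python functions. The function names, the argument order and the example values are my own assumptions for illustration; only the formulas come from the text, and this is not the app's actual code.

```python
# Hypothetical helper names; only the four formulas come from the text.

def plus(b: float, e_prior: float) -> float:
    """PLUS: E takes the value of B (the prior is not used)."""
    return b


def minus(b: float, e_prior: float) -> float:
    """MINUS: high B gives low E (the prior is not used)."""
    return 1.0 - b


def necc(b: float, e_prior: float) -> float:
    """NECC: E cannot exceed B; with B = 1 the prior is left unchanged."""
    return b * e_prior


def suff(b: float, e_prior: float) -> float:
    """SUFF: E cannot fall below B; with B = 0 the prior is left unchanged."""
    return 1.0 - (1.0 - b) * (1.0 - e_prior)


# Illustrative values only: a partly democratic country (B = 0.5)
# and a prior best guess of 0.8 for racial tolerance.
democracy, tolerance_prior = 0.5, 0.8
for rule in (plus, minus, necc, suff):
    print(rule.__name__, round(rule(democracy, tolerance_prior), 3))
```

With these made-up values NECC pulls the prior of 0.8 down to 0.4, while SUFF pushes it up to 0.9, which is the "generalised" behaviour described above for values that are not exactly 0 or 1.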

63.7 The rules for coding different types of influence; packages of multiple influence variables

The most obvious thing to do is just generalise the above functions, so for example:

  • NECC: \[E_{posterior} = B*C*D*E_{prior}\]

This is equivalent to the frequently mentioned “necessary AND”. When any of B, C, D … are zero, the value is zero. When all of them are 1, the value is just the prior value of the consequence variable, i.e. the influence variables have no effect.

Similarly for SUFF:

  • SUFF: \[E_{posterior} = 1 - (1 - B)* (1 - C)* (1 - D)* (1 - E_{prior})\]
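The package versions of NECC and SUFF can be sketched in the same way. The function names and the choice to pass the influence variables as a list are assumptions made for this illustration; the formulas are the ones given above.

```python
from math import prod
from typing import Sequence


def necc_package(influences: Sequence[float], e_prior: float) -> float:
    """Package NECC: product of all influence values times the prior."""
    return prod(influences) * e_prior


def suff_package(influences: Sequence[float], e_prior: float) -> float:
    """Package SUFF: one minus the product of all the 'shortfalls'."""
    return 1.0 - prod(1.0 - x for x in influences) * (1.0 - e_prior)


# Hypothetical values for B, C, D and for the prior of E.
package = [0.9, 0.5, 1.0]
print(round(necc_package(package, 0.8), 3))
print(round(suff_package(package, 0.8), 3))
```

Two boundary checks follow directly from the formulas: if every influence variable is 1, `necc_package` returns the prior unchanged, and if every influence variable is 0, `suff_package` returns the prior unchanged.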

We can extend PLUS and MINUS in a similar way, but obviously the value can too easily hit 1 for PLUS and 0 for MINUS. There are different possible ways to deal with this, for example HARDPLUS aka HARDADD just truncates the value at 1. Another possibility is SOFTADD, and I am looking at Bayesian ways to combine this kind of information.

There are other, simpler functions for more than one influence variable which do not take into account the prior value of the consequence variable:

  • MIN: \[E_{posterior} = minimum(B,C)\]
  • MAX: \[E_{posterior} = maximum(B,C)\]
  • MULTIPLY: \[E_{posterior} = B*C\]

MIN and MULTIPLY are continuous analogues of Boolean AND. So we don’t need AND, because for binary variables you can use either, but you should think about the difference too. The diagram above suggests that crop quality is only as good as the weakest of the three influence variables. So if rainfall is poor, there is no point trying to improve soil or increase training. MULTIPLY embodies the same idea but is more aggressive.
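To see the difference between MIN and MULTIPLY, here is a small sketch using the crop example. The variable values are invented, the function names are hypothetical, and I have generalised the two-variable formulas above to any number of influence variables, which is an assumption on my part.

```python
from math import prod
from typing import Sequence


def min_rule(influences: Sequence[float]) -> float:
    """MIN: the result is only as good as the weakest influence."""
    return min(influences)


def max_rule(influences: Sequence[float]) -> float:
    """MAX: the result is as good as the strongest influence."""
    return max(influences)


def multiply_rule(influences: Sequence[float]) -> float:
    """MULTIPLY: every shortfall drags the result down further."""
    return prod(influences)


# Invented lo/hi values for the three influences on crop quality.
crop_influences = {"rainfall": 0.2, "soil": 0.9, "training": 0.8}
values = list(crop_influences.values())

print("MIN     ", round(min_rule(values), 3))
print("MAX     ", round(max_rule(values), 3))
print("MULTIPLY", round(multiply_rule(values), 3))
```

Here MIN gives 0.2 (the poor rainfall) whereas MULTIPLY gives about 0.144, because the less-than-perfect soil and training reduce the result even further. For values between 0 and 1, MULTIPLY can never be higher than MIN, and for strictly binary values the two agree.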

Any other function which makes sense over lo/hi variables, i.e. between 0 and 1, can be used, for example measures of central tendency and dispersion, like the average of the variables in a package. These can be entered directly into the app / coding sheet.

It is important to note that all these functions are applied to a package (even if the package contains only one variable) at once; they are a property of the whole package, not the individual variables. For this reason it is a bit misleading to display the information on the arrows.

63.8 INUS

Personally I find INUS a big yawn, and there are different ways to render it. Its main interest is in trying to emulate the ordinary-language expression “B is a cause of E”. But as we think harder about causation, this kind of expression loses its interest. The best answer to the question “what causes what here” is just a causal map, or a simplified causal map (derived from the original using the rules of inference for causal maps), perhaps with any pedagogical / visual aids we can give the reader to understand connections and phenomena which are not obvious.

The INUS-like configuration below is somewhat equivalent to Mackie’s original. One can argue about whether the ANDs should be NECC or whether the ORs should be SUFFs. Insofar as there are other real conditions Y mentioned, there is no need to provide an option to code INUS directly because the Y (or its constituent parts) need to be coded anyway, and INUS is by its nature a complex beast with multiple parts.

I guess this is a possibility too:

It would be possible to offer a specific option to directly code this kind of combination if that was deemed important.

63.8.1 SOFTADD

One additional multi-function is called SOFTADD. I think this may be the most frequent case. It fills a gap for an incremental, addition-like function for lo/hi variables. Every additional positive parameter to the function potentially adds at least something to the result. So if someone says just “the training helps increase crop yield, but so does the pre-existing skill level and of course the weather”, SOFTADD is the best option. If any of these influence variables has a value equal to 1, the consequence variable will also be 1; if they are all below 1, the consequence variable will never reach 1 but will get ever closer to it.
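The text does not give a formula for SOFTADD, so the sketch below is only one candidate that happens to have the properties just described: the "probabilistic sum" \(1 - \prod_i (1 - x_i)\). Treat both the formula and the function names as assumptions. For comparison it sits next to HARDADD, which, as mentioned earlier, just truncates an ordinary sum at 1.

```python
from math import prod
from typing import Sequence


def hardadd(influences: Sequence[float]) -> float:
    """HARDADD: ordinary sum, truncated at 1."""
    return min(1.0, sum(influences))


def softadd(influences: Sequence[float]) -> float:
    """One possible SOFTADD (an assumption, not the author's definition):
    each extra positive value closes part of the remaining gap to 1."""
    return 1.0 - prod(1.0 - x for x in influences)


# Invented values for training, pre-existing skill level and weather.
package = [0.6, 0.5, 0.4]
print("HARDADD", hardadd(package))
print("SOFTADD", round(softadd(package), 3))
```

With these values HARDADD is already stuck at 1, while this version of SOFTADD gives 0.88 and would still rise a little if a fourth positive influence were added. It returns exactly 1 only when at least one influence variable is itself 1.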

63.9 Coding the strength of the influence of a package

We can also code the strength of the influence of a package, also between 0 and 1.

63.10 Causal claim coding form

Pick / select / name the “from” variable(s) and the “to” variable.

  • Code the type of the influence(s). Default is PLUS, but also possible are NECC, SUFF, MINUS, maybe others.
  • Strength of link (according to the source) between 0 and 1
    • If (rarely) one or more arrows within this package have a different influence type or a different strength, the coder needs to click / select / specify the differing influence and/or strength for the variable(s) concerned
  • Add specific / “broken” aka cited text
  • Source and context (can be automatically coded by the app. Can then also be linked back to the source and to characteristics of the source)
  • Add the “Complete text” (also added automatically)
  • Is this a conceptual link? Default is NO, but possible also are “directed” or “undirected”.
  • Certainty (default is .5). Can this also optionally differ between links within the same package?
  • Possibly other kinds of qualifier e.g. trust, explicitness, but only really useful if we have inference rules to tell us what difference they make
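One way to picture what a single act of coding stores is as a small record. The sketch below is hypothetical: the class name, field names and validation are my own choices for illustration, and only the fields and defaults listed in the form above (PLUS as default influence, NO as default for conceptual links, .5 as default certainty) are taken from the text.

```python
from dataclasses import dataclass
from typing import List

# Option sets taken from the form; "maybe others" could be added later.
INFLUENCE_TYPES = {"PLUS", "MINUS", "NECC", "SUFF"}
CONCEPTUAL_OPTIONS = {"no", "directed", "undirected"}


@dataclass
class CausalClaim:
    """One mini-map: a package of 'from' variables and one 'to' variable."""
    from_vars: List[str]          # the influence variable(s)
    to_var: str                   # the consequence variable
    strength: float               # strength of link according to the source, 0..1
    influence: str = "PLUS"       # type of influence for the whole package
    cited_text: str = ""          # the specific / "broken" quote
    source: str = ""              # could be filled in automatically
    context: str = ""             # could be filled in automatically
    complete_text: str = ""       # could be filled in automatically
    conceptual: str = "no"        # "no", "directed" or "undirected"
    certainty: float = 0.5        # default certainty

    def __post_init__(self) -> None:
        if not self.from_vars:
            raise ValueError("at least one 'from' variable is needed")
        if self.influence not in INFLUENCE_TYPES:
            raise ValueError(f"unknown influence type: {self.influence}")
        if self.conceptual not in CONCEPTUAL_OPTIONS:
            raise ValueError(f"unknown conceptual option: {self.conceptual}")
        for name in ("strength", "certainty"):
            if not 0.0 <= getattr(self, name) <= 1.0:
                raise ValueError(f"{name} must lie between 0 and 1")


# Invented example of one coded claim.
claim = CausalClaim(["training", "weather"], "crop yield", strength=0.7)
print(claim.influence, claim.certainty)
```

Note that this record keeps one influence type and one strength for the whole package; the rare case where individual arrows differ would need extra per-arrow fields.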

There is no reason to specifically code a causal chain as this can be constructed from several atomic pieces of coding. Trying to code multiple links at the same time would just mean repeating the interface n times. An app should give feedback and show where each new link fits in the grander scheme, and anyway records the fact that the pieces all come from the same quote.

There is no reason to specifically code cycles or loops because this can be done just by coding A = f(A) or A = f(B) and then B = g(A) etc.
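To make the A = f(B), B = g(A) case concrete, here is a toy sketch in which f is MINUS and g is PLUS, and both variables are simply updated together, over and over. The updating scheme is purely an assumption for illustration; the text does not say how loops should be evaluated.

```python
from typing import List, Tuple


def iterate_loop(a: float, b: float, steps: int) -> List[Tuple[float, float]]:
    """Repeatedly apply A = MINUS(B) and B = PLUS(A), both at once."""
    history = [(a, b)]
    for _ in range(steps):
        # Right-hand side uses the old values of a and b.
        a, b = 1.0 - b, a
        history.append((a, b))
    return history


# Invented starting values for A and B.
for a, b in iterate_loop(0.9, 0.2, 4):
    print(round(a, 3), round(b, 3))
```

Under this naive scheme the two values chase each other round a four-step cycle and never settle, unless they both start at exactly 0.5. So coding a loop is easy, but deciding what can be inferred from it is another matter.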

63.11 Problems

Some of these are really thorny.

  • what happens when coding creates a cycle? The only situation where one might want to code such a cycle directly is in the case of so-called “self-loops” i.e. “A influences A” (or “A and B influence A”). At the moment this is allowed in the app, and the coder can select the requisite function (e.g. MINUS) as required, as usual.
    • a coder might code “C influences D” and, immediately after or later, “B influences A”. This is also fine and also does not need a special interface. However in both these cases, causal inference becomes potentially very tricky!
  • Variables vs. events
  • what to do when variables have a specific time attached to them
  • how to combine “package-free” combinations of influences mentioned separately by different sources. With OR? With SOFTADD?
  • how to code information about a zero influence
  • what role does the number of mentions of a causal link play when a causal map is aggregated / simplified?
  • Can’t we do it simpler? e.g. just restricting ourselves to PLUS and MINUS influences? Do we really need NECC and SUFF?
  • It’s much easier if the strength and type of influence remain the same for every arrow within a single causal package.
    • (otherwise we need a separate set of tables)

Yes: (the default)

Yes:

Yes:

Harder to code directly, because it is necessary to store separate information for each arrow, not just about the package. How frequent?

There is a workaround:


  1. It is also possible to think in terms of a negative strength, but as we already have MINUS influences, we don’t strictly need this.