Section 17 The shared consequence rule, functional version

Suppose we have coded the fact that something (call it B) affects E according to one function \(f\) (e.g. it contributes positively to it), and we have also coded the fact that something else (call it C) affects E according to another function \(g\). The shared consequence rule says we can snap these two pieces of information together. That seems innocuous.

But also, according to the equally innocuous-seeming “mini-map coding rule”, the new map says that there is a single rule which (at least partially) determines E given information about B and C. Suddenly this is a big deal. What is that rule? How can we construct the correct function from \(f\) and \(g\)?

For example, if we have the information:

you can’t have well-being without good health

and

income contributes to well-being

we can encode them like this:
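
As a purely illustrative sketch (these field names are hypothetical, not the app’s actual encoding), the two claims might be written out as mini-map records roughly like this:

```python
# Purely illustrative field names, not the app's actual data format.
claim_1 = {
    "consequence": "well-being",
    "influences": ["good health"],
    "function": "necessary",     # no well-being without good health
}
claim_2 = {
    "consequence": "well-being",
    "influences": ["income"],
    "function": "contributes",   # income contributes positively to well-being
}

# The shared consequence rule lets us snap the two together because they
# share the same consequence variable ...
assert claim_1["consequence"] == claim_2["consequence"]
combined_influences = claim_1["influences"] + claim_2["influences"]
# ... but the combined causal function is exactly what is still missing.
```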

What is the new causal function implied by this new mini-map?

Here we have two single-variable packages which influence a variable “hunger” (or “amount of hunger”): the first perhaps expresses a necessary condition, the second perhaps a continuous relationship. How can we combine this information to give us a single function of the form

\[\text{hunger} = h(\text{drought}, \text{poverty})\]

What is \(h\)?

In the special case of the “quantitative” paradigm (rational numbers and linear effects), creating this function is assumed to be easy. The rule is that we simply add the two effects, assuming there is no interaction between them unless we are told otherwise. We also assume that the effect of B on E does not depend on the current value of E.
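
Under those linearity assumptions, a minimal sketch of the default rule (the coefficients are arbitrary, purely for illustration) would be:

```python
def f(b: float) -> float:
    """Hypothetical linear effect of B on E."""
    return 0.5 * b

def g(c: float) -> float:
    """Hypothetical linear effect of C on E."""
    return 0.25 * c

def h(b: float, c: float) -> float:
    """The default 'quantitative' combination: simply add the two effects,
    assuming no interaction and no dependence on the current value of E."""
    return f(b) + g(c)

print(h(1.0, 1.0))  # 0.75
```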

But in social science we often cannot fulfil these stringent assumptions. For example, we may be working with binary/Boolean “false/true” variables, and the effect of B on E can depend on E itself, for example in the case of necessary conditions (see section xx below).

Later we will make a lot of use of variables and functions which range between 0 and 1, so-called “lo/hi” variables, as in fuzzy cognitive maps. Here, if we repeatedly applied a function which increments E by an equal amount, the value of E would sooner or later exceed 1. So the increment has to get smaller as E gets larger, or at least get curtailed at 1. So we have to use a different way to combine functions.
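
For instance, one bounded combination sometimes used for 0-to-1 variables is the probabilistic sum; it is only one possible choice, shown here as a sketch of how increments can shrink as E approaches 1:

```python
def bounded_combine(x: float, y: float) -> float:
    """Probabilistic sum (fuzzy OR): one common way to combine two lo/hi
    values so the result stays within 0..1."""
    return 1 - (1 - x) * (1 - y)

print(bounded_combine(0.5, 0.5))                        # 0.75, not 1.0
print(bounded_combine(bounded_combine(0.5, 0.5), 0.5))  # 0.875: each extra increment is smaller
```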

In this section I suggest that there is no perfect solution for calculating \(h\), though I do make a suggestion later.

We could ask whether the constituent maps are considered to complement or contradict one another; and I recommend understanding them as contradictory by default. That means that if someone says “drought drives hunger” we assume that they are making an exclusive claim, and would object to the simultaneous claim “poverty drives hunger”; although if there is both drought and poverty, we would probably expect the level of hunger to be higher than if there was only drought or only poverty.

Some kinds of influence, like a necessary condition, are by their nature liable to contradict others. So if C is necessary for E, B cannot be sufficient for it.

One alternative would be to assume that the information is complementary only when the maps we are merging come from the same source¹. When they come from different sources, it is less clear how we should combine the information, but perhaps we would prefer to default to treating them as contradictory.


Sometimes we might feel that the information in the influencing packages contains a complete or a partial contradiction, in other cases not. Our common sense pulls us in different ways about what to do in different cases.

We could try

\[E = f(B) + g(C)\]

But even if arithmetical addition makes sense in this case, there is no reason to suppose that we can just “add” the influences. Why shouldn’t they interact so that the influence of f and g together is much stronger than either alone? On the other hand, why shouldn’t they work against one another? The right answer might equally be multiplication or anything else:

\[E = f(B) \times g(C)\]

Or, perhaps the effect of \(g\) is higher when \(f\) is low, or vice versa? The problem is that we are missing the additional causal information about how to combine these two (sets of) influences.
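
To make the gap concrete, here is a sketch (with arbitrary numbers) of how differently the same two influence values combine under a few candidate rules:

```python
# Arbitrary illustrative values for f(B) and g(C), both in 0..1.
fb, gc = 0.75, 0.5

candidates = {
    "additive":       fb + gc,                  # 1.25  (can overshoot 1)
    "multiplicative": fb * gc,                  # 0.375 (each influence damps the other)
    "probabilistic":  1 - (1 - fb) * (1 - gc),  # 0.875 (bounded, diminishing returns)
    "weakest link":   min(fb, gc),              # 0.5   (e.g. a necessary condition)
}
for name, value in candidates.items():
    print(name, value)
```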

And yet, we can hardly insist that for every random two pieces of causal information there has to be another which tells us how to combine them. If a farmer tells us “drought leads to hunger” we can’t go back to them and say, aha, but how does this rule apply in the presence of information about how poverty leads to hunger? Or, worse, in the presence of each and every specific piece of information about how anything leads to anything else?

The whole point of the mini-map approach to coding, aka the graphical approach to causal inference, as Pearl insists, is that mini-maps are the portable, robust atoms of our causal understanding. They are rules which we carry around with us and can apply and combine whenever they become relevant. We can’t appeal to another whole set of rules which govern how to apply the first set of rules. There must be some default understanding of how we combine separate pieces of information about the influences on a variable.

17.1 Contradictory combinations

To take the less usual case of contradictory combinations first, if we hear:

the victim was murdered with a hammer

and

the victim was murdered by strangling

we might encode them like this:

… but this might feel weird. We would perhaps prefer to work out which putative cause was actually effective. Or we might suggest XOR as the relevant function.
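
Read as XOR, the combined function would be something like this sketch:

```python
def murdered(hammer: bool, strangled: bool) -> bool:
    """Illustrative XOR reading of the two contradictory claims:
    exactly one of the two methods was the actual cause."""
    return hammer != strangled

print(murdered(True, False))  # True
print(murdered(True, True))   # False: under XOR both claims cannot hold at once
```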

In the absence of further information, we might look at the two alternatives

\[E = f(B)\]

\[E = g(C)\]

and simply take the average of the two, over all values of B and C. Again, we could weight this average.
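
A minimal sketch of that averaging rule (the candidate functions and weights here are purely hypothetical):

```python
def combine_contradictory(f, g, w_f: float = 0.5, w_g: float = 0.5):
    """Sketch of the averaging idea for contradictory claims: return a
    combined function h which is a (possibly weighted) average of the
    two candidate functions."""
    def h(b: float, c: float) -> float:
        return w_f * f(b) + w_g * g(c)
    return h

h = combine_contradictory(lambda b: b, lambda c: 0.5 * c)
print(h(1.0, 1.0))  # 0.75: the unweighted average of the two claims
```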

17.2 But which case is which?

In a later section I present a default rule for combining complementary causal maps. And above I started to sketch one for combining contradictory causal maps. The big problem is that we don’t necessarily know which one to use. This has to do with whether we think of “G -> E” as meaning

G (and possibly other things) affects E

or

only G affects E

In the first case the mini-maps complement one another; in the second, they contradict one another. The problem is particularly hard because one causal claim might be considered exclusive (to contradict others) in one context but not in another. For example, in this pair, the second claim would probably be taken to contradict the first, but not necessarily vice versa:

Heart attacks are caused by chakra imbalance

Heart attacks are caused by a blockage of arteries

… and on the other hand, in this case

Heart attacks are caused by a blockage of arteries

Stress contributes to heart attacks

the person asserting the first would most likely not reject the second claim but would see the two as complementary in some sense, and vice versa.

These are all real problems. If we want to be able to “zoom out of” and simplify our causal maps automatically, we have to do one of the following:

  • wherever necessary, code information like “be careful, these are contradictory fragments, don’t combine them with the default combination rule” during the atomic coding of individual causal packages.
  • assume that all causal fragments combine in complementary fashion and simply refuse to code any which do not.
  • resign ourselves to the gruelling process of reviewing “by hand”, variable by variable, every time we want to view a causal map or create a new view (filter, merge, combine, zoom out) of a causal map.

In conclusion I will repeat this (somewhat arbitrary) decision:

When maps are combined using the “shared consequence” rule, we will assume the constituent maps are contradictory rather than complementary unless we are told otherwise.

Nadkarni and Shenoy (2004) make the same assumption. Most approaches to causal maps implicitly adopt some default, though mostly without explicitly stating or arguing for it.
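
As a trivial sketch, the default could be expressed as a rule like this (the flag is a stand-in for however the coding actually records “we are told otherwise”):

```python
def combination_mode(told_complementary: bool = False) -> str:
    """Sketch of the default adopted above: mini-maps merged under the
    shared consequence rule count as contradictory unless we are explicitly
    told otherwise (e.g. perhaps because they come from the same source)."""
    return "complementary" if told_complementary else "contradictory"

print(combination_mode())      # 'contradictory' by default
print(combination_mode(True))  # 'complementary' only when told so
```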


17.3 Extension to causal packages

Above we have only considered mini-maps with a single influence variable. In general, mini-maps show how several influence variables together influence the consequence variable as a causal package. The logic is the same, but if we combine two causal packages, we have to remember the packages and which variable belongs to which. We can’t just smash them all together (unless we make very strong linearity assumptions). This means that in this example there are two causal packages: in one, the variables combine to influence E via logical AND; in the other, via logical OR (but the same applies to any kind of functional package).

Here the variables have been coloured to visually record which variable belongs to which package (and in general, there is nothing to stop the same variable belonging to both packages). Our app has to remember this too (and indeed it does).
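
A rough sketch of what “remembering the packages” might mean in code (the variable names and structure are hypothetical, not the app’s implementation):

```python
from typing import Dict, List

# Illustrative sketch only: two causal packages influencing the same
# consequence E, with boolean variables. Each package keeps its own
# member variables and its own combining function; they are not merged
# into one undifferentiated list of influences.
packages: List[Dict] = [
    {"variables": ["A", "B"], "combine": all},  # package 1: logical AND
    {"variables": ["C", "D"], "combine": any},  # package 2: logical OR
]

def package_outputs(values: Dict[str, bool]) -> List[bool]:
    """Evaluate each package separately. How the two package outputs are
    then combined into E is the question discussed in this section."""
    return [p["combine"](values[v] for v in p["variables"]) for p in packages]

print(package_outputs({"A": True, "B": False, "C": False, "D": True}))
# [False, True]: the AND package says no, the OR package says yes
```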


  1. We will discuss the concept of a source in section xx