Wednesday, February 26, 2020

Probability, complexity, reflexivity & Goodhart's law (2020)

Goodhart's law states that “any observed statistical regularity will tend to collapse once pressure is placed upon it for control purposes”. Or, more popularly: “when a measure becomes a target, it ceases to be a good measure” (Strathern). Similarly, Campbell’s law holds that “the more any quantitative social indicator is used for social decision-making, the more subject it will be to corruption pressures and the more apt it will be to distort and corrupt the social processes it is intended to monitor” (Muller 2018). It is also related to the McNamara fallacy, named after US Secretary of Defense Robert McNamara: an exclusive focus on what can be quantified frequently misses what is important, even crucial, but cannot be readily quantified (e.g. body counts versus enemy morale in the Vietnam War). Last but not least, the McNamara fallacy has affinities with the so-called streetlight effect (also known as the drunkard’s search).

Social scientists and policy-makers need to be circumspect when devising and implementing policies on the basis of quantification and correlation. The assumed correlations may weaken, vanish or even invert post-intervention. The quantified target variables may cease to be a good measure of the social phenomena the policies are meant to influence. This may be due to fundamental, system-inherent instability, or it may be due to actors adjusting their behaviour in view of changed incentives, costs and benefits, thereby changing the outcome of the policy intervention, which in turn affects expectations and behaviour, and so on. Policy intervention is sometimes better conceptualised as a strategic interaction between policy-makers (via the intervention tool) and the target group (and its behaviour). In international politics and game theory, this is simply called strategic interaction (Schelling 1960; Luttwak 2001). It can also be thought of in the context of self-fulfilling prophecies and multiple equilibria, or even in terms of a Keynesian beauty contest. In different ways, Karl Popper, Ernest Nagel and George Soros have highlighted the problem of predictions and changed expectations affecting outcomes and behaviour, which in turn affect individual or collective behaviour and expectations. This notion of adjusting expectations and behaviour tomorrow due to a change in expectations today also features prominently in rational expectations theory. The general idea is most recently and popularly referred to as reflexivity. The affinities with Goodhart's law are obvious. (As an aside, quantum physics allows for an even more fundamental instability, where the mere fact of observation/ non-observation alters/ determines what is in fact the case (Schrödinger’s cat).) This is of course not relevant to all or even the majority of policy interventions. But when it is relevant, its implications can be very significant.
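
To make the reflexivity/ Goodhart mechanism concrete, here is a minimal, purely illustrative simulation (not drawn from any of the sources above, written in Python with made-up parameters): a proxy metric initially tracks an underlying outcome quite well, but once the proxy becomes a target and agents divert effort into gaming it, the correlation collapses.

    import numpy as np

    rng = np.random.default_rng(0)
    n_agents = 10_000

    # Latent "true" outcome and a proxy metric that initially tracks it.
    quality = rng.normal(size=n_agents)
    proxy = quality + rng.normal(scale=0.5, size=n_agents)
    print("correlation before targeting:", np.corrcoef(quality, proxy)[0, 1])

    # Once the proxy becomes a target, agents divert effort into gaming the
    # metric (a stylised behavioural response, not a model of any actual policy).
    gaming = rng.normal(loc=1.0, scale=1.5, size=n_agents)
    quality_after = quality - 0.3 * gaming     # effort diverted away from the outcome
    proxy_after = quality_after + gaming + rng.normal(scale=0.5, size=n_agents)
    print("correlation after targeting: ", np.corrcoef(quality_after, proxy_after)[0, 1])

On a typical run the correlation drops from roughly 0.9 to around 0.3: the measure is still being reported, but it no longer tracks what it was meant to track.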

Policy intervention can, and not infrequently does, lead to unintended consequences. Government regulation that forces drivers to wear seatbelts, for example, may lead them to drive faster. This may produce an increase rather than the intended decrease in traffic-related injuries and deaths. According to Robert Merton (1936), unintended consequences can be attributed to (1) complexity, (2) perverse incentives, (3) stupidity, (4) self-deception, (5) weak human nature and (6) cognitive and emotional biases. An increase in traffic-accident-related deaths following the introduction of a seatbelt law may be attributed to perverse incentives: driving faster is (seen as) less risky than before. When designing policy interventions, it is therefore important to generate defensible and plausible hypotheses about the target group’s likely reaction function.

Complex systems theory suggests that there are fundamental – rather than merely epistemic – obstacles to successful policy interventions. If a social system is a complex system – not all systems are complex – it is typically characterised by fundamental uncertainty (in the Knightian sense). Even if it is not, it may be prone to (1) sudden transitions (non-linearity or tipping points), (2) limited predictability, (3) large events, (4) evolutionary dynamics and (5) self-organisation. All these characteristics make it difficult, sometimes impossible, to rely on correlations to inform policy interventions. Moreover, when designing policy interventions to deal with rare and large-scale events, it may be difficult, often impossible, to establish any “significant” correlation or relationship whatsoever between intervention and target variables. While a Bayesian approach may be more useful than a frequentist one, successful policy intervention often relies on deductive models (e.g. what causes world wars?) rather than inductive logic and probability. Even if a fair amount of data exists, it may be difficult to generate reliable probability estimates and correlations given the complex nature of the system that is to be intervened in.
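
The “limited predictability” point can be illustrated with a toy example that is not from the text: the logistic map, a one-line non-linear system. In its chaotic regime, two trajectories that start one part in a million apart become completely decorrelated within a few dozen steps, which is the sense in which an estimated regularity can say very little about future behaviour.

    # Toy illustration: the logistic map x(t+1) = r * x(t) * (1 - x(t)).
    def logistic_trajectory(x0, r=3.9, steps=50):
        """Iterate the logistic map from x0 and return the full trajectory."""
        xs = [x0]
        for _ in range(steps):
            xs.append(r * xs[-1] * (1 - xs[-1]))
        return xs

    a = logistic_trajectory(0.400000)
    b = logistic_trajectory(0.400001)   # initial condition differs by one part in a million

    for t in (0, 10, 25, 50):
        print(f"t={t:2d}  a={a[t]:.4f}  b={b[t]:.4f}  gap={abs(a[t] - b[t]):.4f}")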

Policy interventions typically take place on the basis of estimated, established correlations that are meant to be exploited. To the extent that causation (and policy intervention) is about nomological expectability rather than necessity, it is not surprising that correlations often weaken and vanish – given the absence of necessity. Nomological expectability may be statistical in nature (Friedman 1953). That may be good enough, provided Goodhart’s law is not operative. It turns out that some phenomena are normally distributed (e.g. height), while others follow a power-law distribution (e.g. Zipf’s law, earthquakes). Yet it is often not obvious which type of probability distribution the social phenomena that are to be intervened in should be modelled on. This further complicates successful policy intervention, even if the problem is epistemic rather than fundamental uncertainty.
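
The distributional question matters quantitatively. As a rough sketch (the distributions and parameters below are assumptions chosen for illustration, not estimates of any social phenomenon), compare how much of the total a thin-tailed, normal-like quantity concentrates in its largest observations versus a Pareto (power-law) quantity of the kind associated with Zipf’s law or earthquake magnitudes:

    import numpy as np

    rng = np.random.default_rng(1)
    n = 1_000_000

    # Thin-tailed benchmark: absolute values of a standard normal.
    thin = np.abs(rng.normal(size=n))
    # Heavy-tailed benchmark: Pareto with tail index 1.5 (finite mean, infinite
    # variance), a stand-in for Zipf- or earthquake-like phenomena.
    heavy = rng.pareto(1.5, size=n) + 1.0

    def top_share(x, frac=0.001):
        """Share of the total accounted for by the largest `frac` of observations."""
        k = int(len(x) * frac)
        return np.sort(x)[-k:].sum() / x.sum()

    print(f"normal-like: top 0.1% holds {top_share(thin):.1%} of the total")
    print(f"pareto-like: top 0.1% holds {top_share(heavy):.1%} of the total")

Under the thin-tailed benchmark the top 0.1% of observations accounts for well under 1% of the total; under the Pareto benchmark it accounts for something on the order of 10%, so a policy calibrated to the “typical” case can be dominated by a handful of extreme observations.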

A normal distribution may capture a policy variable accurately – until it doesn’t. Equity market returns may look normally distributed, especially if one excludes extreme moves. Extreme moves can and do occur in the case of herd-type behaviour (e.g. panic). If investors sell assets and prices overshoot, falling asset prices erode capital buffers, which forces further selling. Financial market participants then behave like rioters in a riot model (Page 2018). This raises the issue of the validity versus the reliability of statistical models that seek to capture social phenomena. If the underlying, assumed model is wrong, the most sophisticated statistical analysis won’t be of much help. Estimating the wrong model accurately is less useful than estimating the right model less accurately. Monte Carlo simulations, for example, suggested that a major financial crisis should occur only once every 10,000 (?) years or so. In reality, financial crises occur much more frequently. The models were probably wrong because they were based on wrong assumptions.
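
The point about wrong assumptions can be made with a back-of-the-envelope comparison (the 1% daily volatility and the Student-t with three degrees of freedom below are illustrative assumptions, not the models banks actually used): under a normal model a -10% trading day is essentially impossible, while under a fat-tailed model with the same volatility it is merely rare.

    from scipy import stats

    daily_vol = 0.01   # assumed ~1% daily volatility (illustrative)
    crash = -0.10      # a -10% single-day move

    # Probability of such a day under a normal model with that volatility...
    p_normal = stats.norm.cdf(crash, loc=0.0, scale=daily_vol)

    # ...and under a fat-tailed Student-t (3 degrees of freedom), rescaled so
    # that its standard deviation matches the normal model's.
    nu = 3
    scale = daily_vol / (nu / (nu - 2)) ** 0.5
    p_t = stats.t.cdf(crash / scale, df=nu)

    for name, p in [("normal", p_normal), ("student-t, nu=3", p_t)]:
        years_between = 1 / (p * 252)   # ~252 trading days per year
        print(f"{name:16s} P(-10% day) = {p:.1e}  ~ once every {years_between:.1e} years")

On these assumptions the normal model implies such a day roughly once in 10^20 years, while the fat-tailed model implies once every century or two – far closer to what markets have actually delivered.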

All of this makes policy intervention, and especially large-scale policy intervention, a precarious undertaking – to say the least (Scott 1998). Basing policy intervention on an established correlation is problematic due to fundamental uncertainty, causal complexity and/ or reflexivity, which often lead to unintended consequences. Policy interventions, and especially large-scale interventions, were responsible for some of the greatest humanitarian catastrophes of the 20th century. Stalinist and Maoist economic experiments possibly killed as many people as the two world wars. But it is also worth bearing in mind that initially successful intervention on a smaller scale can lead to outsized catastrophe in the long term. The use of antibiotics helps address selected health issues but may create drug-resistant bacteria with possibly devastating consequences later on. This strongly suggests that policy interventions need to be conceptualised in terms of a dynamic understanding of systems (Perrow 2007), and that correlations and target-variable responses should be evaluated very critically and cautiously.

To the extent that some social systems are complex systems, policy intervention can prove hazardous – whether due to the limited (epistemic) understanding of, or the fundamental (metaphysical) uncertainty about, the effects of policy interventions. Stable correlations should not be assumed to hold in light of policy intervention. The proverbial burden of proof is on those arguing that the correlations will hold. Second-round effects – both short- and long-term – should be estimated and evaluated (e.g. clearing underbrush to prevent small forest fires often leads to catastrophic forest fires later on). In the face of potential non-linearity, causal complexity and/ or reflexivity, policy intervention is far from straightforward and not infrequently ranges from precarious to hazardous in terms of outcomes (e.g. lowering interest rates is supposed to increase demand through higher investment but may actually increase household savings and reduce aggregate demand due to the lower return on assets and lower household income).
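
As a stylised illustration of why second-round effects matter in the interest-rate example (every parameter below is an assumption chosen only to make the arithmetic concrete, not an empirical estimate), the intended first-round boost to demand via investment can be partly or fully offset once lower household interest income and higher target saving are counted:

    # Stylised decomposition of a 1 percentage-point rate cut
    # (all parameters are illustrative assumptions, not estimates).
    rate_cut = 1.0             # percentage points

    investment_response = 0.4  # assumed first-round boost to demand, % of GDP per pp cut
    lost_interest_income = 0.5 # assumed fall in household interest income, % of GDP per pp cut
    mpc = 0.6                  # assumed marginal propensity to consume out of that income
    extra_target_saving = 0.2  # assumed extra saving to offset lower returns, % of GDP per pp cut

    first_round = investment_response * rate_cut
    second_round = -(mpc * lost_interest_income + extra_target_saving) * rate_cut

    print(f"first-round effect:  {first_round:+.2f}% of GDP")
    print(f"second-round effect: {second_round:+.2f}% of GDP")
    print(f"net effect:          {first_round + second_round:+.2f}% of GDP")

With these made-up numbers the net effect is slightly negative; with others it would be positive. The sign of the outcome hinges on behavioural parameters that the simple “lower rates, higher demand” correlation does not pin down.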

Policy interventions need to incorporate a dynamic conceptualisation of systems. This will not always lead to success, but it is less likely to end in failure than a more static approach based on the assumption of correlational stability. This is not at all an argument against government policy intervention, just a plea for greater circumspection and intellectual farsightedness. Albert Hirschman (1991) pointed out that the rhetoric of reaction seeks to shoot down policy reforms by first pointing to their perversity, then their futility and finally their danger. Policy intervention can prove very successful; significant improvements in human development indicators and lower levels of (intra-state) violence (Pinker 2011) are just a few prominent examples. It is important to bear in mind that human cognition is not very adept at dealing with non-linearity and causal complexity. A more thoughtful approach to policy interventions needs to take into account reflexivity, the interactive nature of social phenomena and potential non-linearities. After all, policy interventions can be successful; but they can also be catastrophic. While aligning oneself with a libertarian or a reformist-interventionist ideological position may be psychologically satisfying, it is also intellectually lazy. What is needed is a more balanced, thoughtful, evidence-driven debate and a recognition that epistemic and metaphysical uncertainty are major causes of the failure of policy interventions. This recognition is crucial to devising successful policies, not least in view of the major risks facing mankind (Bostrom 2011; Best 2018).