The mathematics of mind-time
The special trick of consciousness is being able to project action and time into a range of possible futures
I have a confession. As a physicist and psychiatrist, I find it difficult to engage with conversations about consciousness. My biggest gripe is that the philosophers and cognitive scientists who tend to pose the questions often assume that the mind is a thing, whose existence can be identified by the attributes it has or the purposes it fulfils.
But in physics, it’s dangerous to assume that things ‘exist’ in any conventional sense. Instead, the deeper question is: what sorts of processes give rise to the notion (or illusion) that something exists? For example, Isaac Newton explained the physical world in terms of massive bodies that respond to forces. However, with the advent of quantum physics, the real question turned out to be the very nature and meaning of the measurements upon which the notions of mass and force depend – a question that’s still debated today.
As a consequence, I’m compelled to treat consciousness as a process to be understood, not as a thing to be defined. Simply put, my argument is that consciousness is nothing more and nothing less than a natural process such as evolution or the weather. My favourite trick to illustrate the notion of consciousness as a process is to replace the word ‘consciousness’ with ‘evolution’ – and see if the question still makes sense. For example, the question What is consciousness for? becomes What is evolution for? Scientifically speaking, of course, we know that evolution is not for anything. It doesn’t perform a function or have reasons for doing what it does – it’s an unfolding process that can be understood only on its own terms. Since we are all the product of evolution, the same would seem to hold for consciousness and the self.
My view on consciousness resonates with that of the philosopher Daniel Dennett, who has spent his career trying to understand the origin of the mind. Dennett is concerned with how mindless, mere ‘causes’ (A leads to B) can give rise to the species of mindful ‘reasons’ as we know them (A happens so that B can happen). Dennett’s solution is what he calls ‘Darwin’s dangerous idea’: the insight that it’s possible to have design in the absence of a designer, competence in the absence of comprehension, and reasons (or ‘free-floating rationales’) in the absence of reasoners. A population of beetles that has outstripped another has probably done so for some ‘reason’ we can identify – a favourable mutation which produces a more camouflaging colour, for example. ‘Natural selection is thus an automatic reason-finder, which “discovers” and “endorses” and “focuses” reasons over many generations,’ Dennett writes in From Bacteria to Bach and Back: The Evolution of Minds (2017). ‘The scare quotes are to remind us that natural selection doesn’t have a mind, doesn’t itself have reasons, but is nevertheless competent to perform this “task” of design refinement.’
I hope to show you that nature can drum up reasons without actually having them for herself. In what follows, I’m going to argue that things don’t exist for reasons, but certain processes can nonetheless be cast as engaged in reasoning. I use ‘reasoning’ here to mean explanations that arise from inference or abduction – that is, trying to account for observations in terms of latent causes, rules or principles.
This perspective on process leads us to an elegant, if rather deflationary, story about why the mind exists. Inference is actually quite close to a theory of everything – including evolution, consciousness, and life itself. It is abduction all the way down. We are thrown into the world as a process already in motion; and processes can only reason towards what is ‘out there’ based on sparse (if carefully selected) samples of the world. This view dissolves familiar dialectics between mind and matter, self and world, and representationalism (we depict reality as it is) and emergentism (reality comes into being through our abductive encounters with the world). But just how did inference happen before there were inferrers around to do it? How did inert matter ever begin the processes that led to consciousness?
Let’s first establish a few ground rules about the nature of processes, and see how far we get. We’re interested only in the processes that make up complex systems, those objects of study that are more than the sum of their parts. A good way to understand this notion is to look at its opposite. If you fire a gun at a target, it’s easy enough for a physicist to anticipate which part of the target the bullet will hit, based on its angle and momentum as it leaves the barrel. That’s because the firing range is nearly a linear system, whose overall behaviour is determined by the interaction of its constituent bits, in a one-way fashion. But you can’t pinpoint the precise position of an electron when it’s circling an atom, or say for sure if and when a hurricane will hit New York next year. That’s because the weather and atoms – like all natural processes – are not reliably determined by their initial conditions, but by the system’s own behaviour as it feeds back into the interactions of its component parts. In other words, they are complex systems.
According to physicists, complex systems can be characterised by their states, captured by variables with a range of possible values. In quantum systems, for example, the state of a particle can be described by a wave function that entails its position, momentum, energy and spin. For larger systems, such as ourselves, our state encompasses all the positions and motions of our bodily parts, the electrochemical states of the brain, the physiological changes in the organs, and so on. Formally speaking, the state of a system corresponds to its coordinates in the space of possible states, with different axes for different variables.
The way something moves through this space depends on its Lyapunov function. This is a mathematical quantity that describes how a system is likely to behave under specific conditions. It returns a number that scores the improbability of being in any particular state, as a function of that state (or, put differently, as a function of the system’s position in the state space, similar to how air pressure is a function of the density of air molecules at the point at which it’s measured). Lower values mark more probable states. If we know the Lyapunov function for each state of the system, we can write down its flow from one state to the next – and so characterise the existence of the whole system in terms of that flow. It’s like knowing the height of a mountainous landscape at every location, and then being able to describe how a stream of water will run over its surface. The topography of the mountain stands for the Lyapunov function, and the movement of water describes how the system evolves over time.
Now, an important feature of complex systems is that they look like they are using their Lyapunov function to move towards more and more probable states. That is, the number returned by the function gets smaller and smaller. In turn, this means that such systems tend to occupy only a small number of states and, moreover, that those states tend to be frequented again and again. To pursue the mountain stream analogy, water flows downwards to the sea, after which it evaporates and returns to the mountainside by rainclouds. Or you might take your own body as an example: your temperature hovers within certain confined bounds, your heart beats rhythmically, you breathe in and out – and you probably have a daily or weekly routine.
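To make the mountain-stream picture concrete, here is a minimal sketch in Python. It assumes a one-dimensional state (body temperature) and a hypothetical quadratic Lyapunov function that is lowest at a preferred value of 37 degrees; the names `surprise`, `surprise_gradient` and `flow`, and all the numbers, are illustrative choices rather than anything the argument depends on. The state simply flows downhill on the function, and the value the function returns shrinks at every step.

```python
def surprise(x, preferred=37.0):
    """Hypothetical Lyapunov function: zero at the preferred state, rising with distance."""
    return 0.5 * (x - preferred) ** 2


def surprise_gradient(x, preferred=37.0):
    """Slope of the hypothetical Lyapunov function at state x."""
    return x - preferred


def flow(x0, step=0.1, n_steps=50):
    """Euler-integrate dx/dt = -d(surprise)/dx and return every state visited."""
    states = [x0]
    for _ in range(n_steps):
        x = states[-1]
        states.append(x - step * surprise_gradient(x))
    return states


# Start from a feverish 40 degrees and let the state run downhill.
trajectory = flow(40.0)
values = [surprise(x) for x in trajectory]
```

In this toy case the state settles close to 37 and stays there, which is the simplest possible version of a system returning again and again to a small set of states.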
What’s remarkable about this sort of repetitive, self-organising behaviour is that it’s contrary to how the Universe usually behaves. Everything should actually get more random, dispersed and chaotic as time marches on. That’s the second law of thermodynamics – everything tends towards chaos, and entropy generally increases. So what’s going on?
Complex systems are self-organising because they possess attractors. These are cycles of mutually reinforcing states that allow processes to achieve a point of stability, not by losing energy until they stop, but through what’s known as dynamic equilibrium. An intuitive example is homeostasis. If you’re startled by a predator, your heartbeat and breathing will speed up, but you’ll automatically do something to restore your cardiovascular system to a calmer state (following the so-called ‘fight or flight’ response). Any time there’s a deviation from the attractor, this triggers flows of thoughts, feelings and movements that eventually take you back to your cycle of attracting, familiar states. In humans, all the excitations of our body and brain can be described as moving towards our attractors, that is, towards our most probable states.
On this view, humans are little more than ‘strange loops’, as the philosopher Douglas Hofstadter puts it. We all flow through an enormous, high-dimensional state-space of manifold possibilities, but are forced by our attractors to move around in confined circles. We are like an autumn leaf, tracing out a never-ending trajectory in the turbulent eddies of a stream, thinking our little track is the whole world. This description of ourselves as playful loops might sound teleologically barren – but it has profound implications for the nature of any complex system with a set of attracting states, such as you or me.
To recap: we’ve seen that complex systems, including us, exist insofar as our Lyapunov function accurately describes our own processes. Furthermore, we know all our processes, all our thoughts and behaviours – if we exist – must be decreasing the output from our Lyapunov function, pushing us to more and more probable states. So what would this look like, in practice? The trick here is to understand the nature of the Lyapunov function. If we understand this function, then we know what drives us.
It turns out that the Lyapunov function has two revealing interpretations. The first comes from information theory, which says that the Lyapunov function is surprise – that is, the improbability of being in a particular state. The second comes from statistics, which says that the Lyapunov function is (negative) evidence, where evidence is the marginal likelihood: the probability of that state under a given explanation or model. Put simply, this means that if we exist, we must be increasing our model evidence or self-evidencing in virtue of minimising surprise. Equipped with these interpretations, we can now endow existential dynamics with a purpose and teleology.
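The two readings can be written down side by side. The sketch below assumes the standard information-theoretic definition of surprise as the negative log probability of a state, and uses a made-up probability table over three temperature bands; the table and the function names are hypothetical. It shows that surprise and log evidence are the same number with opposite signs, so lowering one is raising the other.

```python
import math

# Hypothetical probabilities of finding a living body in each temperature band.
state_probability = {"35C": 0.01, "37C": 0.95, "39C": 0.04}


def surprise(state):
    """Information-theoretic reading: improbability as negative log probability."""
    return -math.log(state_probability[state])


def log_evidence(state):
    """Statistical reading: log probability of the state under the model."""
    return math.log(state_probability[state])


# The least surprising state is also the one with the most evidence.
least_surprising = min(state_probability, key=surprise)
best_evidenced = max(state_probability, key=log_evidence)
```

Both lookups pick out the same state, which is all that ‘self-evidencing in virtue of minimising surprise’ amounts to in this toy setting.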
It’s at this point that we can talk about inference, the process of figuring out the best principle or hypothesis that explains the observed states of that system we call ‘the world’. Technically, inference entails maximising the evidence for a model of the world. Because we are obliged to maximise evidence, we are – effectively – making inferences about the world using ourselves as a model. That’s why every time you have a new experience, you engage in some kind of inference to try to fit what’s happening into a familiar pattern, or to revise your internal states so as to take account of this new fact. This is just the kind of process a statistician goes through in trying to decide whether she needs new rules to account for the spread of a disease, or whether the collapse of a bank ought to affect the way she models the economy.
Now we can see why attractors are so crucial. An attracting state has a low surprise and high evidence. Complex systems therefore fall into familiar, reliable cycles because these processes are necessarily engaged in validating the principle that underpins their own existence. Attractors push systems to fall into predictable states and thereby reinforce the model that the system has generated of its world. A failure of this surprise minimising, self-evidencing, inferential behaviour means the system will decay into surprising, unfamiliar states – until it no longer exists in any meaningful way. Attractors are the product of processes engaging in inference to summon themselves into being. In other words, attractors are the foundation of what it means to be alive.
To the extent that you accept the above formulation, you now have the ultimate deflationary account of every kind of complex system, living things included. Any process (like you or me) that repeatedly occupies certain states must, by virtue of its very existence, be performing inference.
But does this make sense? You’d hardly consider the process of evolution or natural selection in terms of inference – or would you? In fact, that’s exactly the interpretation currently found in theoretical neurobiology. It turns out, for example, that the way nature ‘selects’ organisms for their capacity to survive and reproduce is based on inference. Take a population of crabs as the system in question, and the aggregated features or phenotypes of its individuals as its ‘state’. These crabs can have claws of different sizes, shells that are harder or softer, and eyes that see better above water or below. Such diverse phenotypes amount to multiple hypotheses about what might ‘work’; each individual is a hypothesis or model of what should occupy this ecological niche, and must compete for selection under pressure from the environment.
Since evolution is a complex system, it must also be self-evidencing – that is, it will always ‘choose’ organisms that are more and more likely to occupy their ecological niche. Large claws might persist because they’re better at catching prey; hard shells might help to resist predators; marine eyesight might make it easier to spot food where the food is most plentiful. Adaptive fitness, then, is nothing more or less than the marginal likelihood of finding a phenotype in its environment. In other words, its survival is nothing more or less than the evidence that it is a good model for its niche.
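A toy version of the crab example may help. The sketch assumes three hypothetical phenotypes with invented fitness values, and a simple selection rule in which each phenotype’s share of the next generation is its current share multiplied by its fitness, then renormalised. That rule has the same form as Bayes’ rule, with population frequencies playing the part of prior beliefs and fitness playing the part of likelihood; the normalising constant, the mean fitness, plays the part of evidence. None of the numbers come from real data.

```python
def select(frequencies, fitness):
    """One generation of selection: reweight each phenotype by fitness, then renormalise."""
    weighted = {name: frequencies[name] * fitness[name] for name in frequencies}
    mean_fitness = sum(weighted.values())  # plays the role of model evidence
    return {name: w / mean_fitness for name, w in weighted.items()}


# Hypothetical relative fitness of each phenotype in its niche.
fitness = {"large_claws": 1.2, "hard_shell": 1.0, "marine_eyes": 0.9}

# Start with every 'hypothesis' equally represented.
population = {name: 1.0 / len(fitness) for name in fitness}

for _ in range(20):
    population = select(population, fitness)

dominant = max(population, key=population.get)
```

After twenty generations the population is concentrated on the phenotype with the highest fitness, without anything in the loop ‘knowing’ why that phenotype works.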
Applying the same thinking to consciousness suggests that consciousness must also be a process of inference. Conscious processing is about inferring the causes of sensory states, and thereby navigating the world to elude surprises. While natural selection performs inference by selecting among different creatures, consciousness performs inference by selecting among different states of the same creature (in particular, its brain). There is a vast amount of anatomical and physiological evidence in support of this notion. If one regards the brain as a self-evidencing organ of inference, almost every one of its anatomical and physiological aspects seems geared to minimise surprise. For example, our brains represent where something is and what something is in different areas. This makes sense, because knowing what something is does not generally tell you where it is and vice versa. This sort of internalisation of the causal structure of the world ‘out there’ reflects the fact that to predict one’s own states you must have an internal model of how such sensations are generated.
But if consciousness is inference, does that mean all complex inferential processes are conscious, from evolution to economies to atoms? Probably not. A virus possesses all the self-organising dynamics to qualify as a process of inference; but clearly a virus doesn’t have the same qualities as a vegetarian. So what’s the difference?
What distinguishes conscious and non-conscious creatures is the way they make inferences about action and time. This part of my argument rests upon the reciprocal relationship between the system and the world. The world acts on the system to provide the sensory impressions that form the basis of inference. Meanwhile, the system acts upon the world to change the flow of sensations to fit with the model of the world it has discerned. This is just another description of the cycle of action and perception; for example, we look, we see, and we infer where to look next.
If action depends upon inference, then systems must be able to make inferences about the consequences of their actions. You can’t pick what to do unless you can make a guess about the probable outcome. However, there’s an important twist here. A creature cannot infer the consequences of its actions unless it possesses a model of its future. It needs to know what to expect if it does this as opposed to that. For example, I need to know (or subconsciously model) how my sensations will change if I look to the left, to the right or, indeed, close my eyes. But the sensory evidence for the consequences of an action is not available until it is executed, thanks to the relentless forward movement of time.
As a result of the arrow of time, systems that can grasp the impact of their future actions must necessarily have a temporal thickness. They must have internal models of themselves and the world that allow them to make predictions about things that have not and might not actually happen. Such models can be thicker and thinner, deeper or shallower, depending on how far forward they predict, as well as how far back they postdict, that is, whether they can capture how things might have ended up if they had acted differently. Systems with deeper temporal structures will be better at inferring the counterfactual consequences of their actions. The neuroscientist Anil Seth calls this counterfactual depth.
So if a system has a thick temporal model, what actions will it infer or select? The answer is simple: it will minimise the expected surprise following an action. The proof follows by reductio ad absurdum from what we already know: existence itself entails minimising surprise and self-evidencing. How do systems minimise expected surprises, in practice? First, they act in order to reduce uncertainties, that is, to avoid possible surprises in the future (such as being cold, hungry or dead). Nearly all our behaviour can be understood in terms of such uncertainty-minimising drives – from the reflexive withdrawal from noxious stimuli (such as dropping a hot plate) to epistemic foraging for salient visual information when watching television or driving. Second, the actions of such systems upon the world appear to be endowed with a purpose, which is the purpose of minimising not-yet-actual, but possible, surprises.
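As a rough illustration of selecting actions by expected surprise, the sketch below assumes an agent with two candidate actions, a hypothetical model of the outcomes each action is likely to produce, and a hypothetical distribution describing which outcomes the agent expects to find itself in. Expected surprise is then the surprise of each outcome, averaged under the predictions for that action. All names and probabilities are invented for the example, and this covers only the surprise-avoiding part of the story, not the uncertainty-reducing part.

```python
import math

# Hypothetical: how probable each outcome is for a creature that goes on existing.
expected_outcomes = {"fed": 0.9, "hungry": 0.1}

# Hypothetical model of what each action is likely to lead to.
predicted = {
    "do_nothing": {"fed": 0.1, "hungry": 0.9},
    "prepare_meal": {"fed": 0.95, "hungry": 0.05},
}


def expected_surprise(action):
    """Average the surprise of each outcome under the agent's prediction for this action."""
    return sum(
        probability * -math.log(expected_outcomes[outcome])
        for outcome, probability in predicted[action].items()
    )


best_action = min(predicted, key=expected_surprise)
```

The action is chosen before any outcome has been observed; the agent is scoring futures that have not happened and might never happen.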
We might call this kind of system an agent or a self: something that engages in proactive, purposeful inference about its own future, based on a thick model of time. The distinction between thick and thin models of time, then, suggests that viruses are not conscious; even if they respond inferentially to changes in their external milieu, they do not embody a deep understanding of their past or a long-run view of their future, which would enable them to minimise that hasn’t-yet-happened surprise. Vegetarians, on the other hand, are surprise-minimising and self-evidencing in a prospective and purposeful way, where the future prospects of the agent become an inherent part of action selection. For example, if we were operating at the level of the virus, we could reflexively counter low blood sugar by mobilising our glucose stores. However, our vegetarian might take a much longer-term view of herself and start preparing a meal. In a similar vein, we sidestep the problems of calling evolution conscious. The process of natural selection minimises surprise (that is, it maximises adaptive fitness) but not uncertainty or expected surprise of the whole system (that is, adaptive fitness expected under alternative, non-Darwinian evolutionary operations).
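The contrast between thin and thick models can be caricatured in a few lines. The sketch assumes each action comes with a hypothetical forecast of the surprise it will incur at each of the next three moments: mobilising glucose is cheap now but costly once stores run down, while preparing a meal is effortful now but pays off later. The only thing that differs between the ‘virus’ and the ‘vegetarian’ here is the horizon over which surprise is added up. The numbers are invented purely to show the effect.

```python
# Hypothetical forecasts of surprise at successive future moments, per action.
predicted_surprise = {
    "mobilise_glucose": [0.2, 1.0, 2.0],  # relief now, stores depleted later
    "prepare_meal": [1.0, 0.2, 0.2],      # effort now, fed later
}


def select_action(horizon):
    """Pick the action with the least surprise summed over the next `horizon` moments."""
    def cumulative(action):
        return sum(predicted_surprise[action][:horizon])
    return min(predicted_surprise, key=cumulative)


thin_choice = select_action(horizon=1)   # here-and-now reflex
thick_choice = select_action(horizon=3)  # temporally deep selection
```

With a horizon of one moment the reflexive option wins; with a horizon of three the meal does. Nothing else about the two agents differs.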
The key difference between consciousness and more universal self-organising processes, then, appears to be the imperatives for selection. In non-conscious processes, this selection is realised in the here and now; for example, with selection among competing systems (such as phenotypes in evolution) or the evocation of reflexes (such as chemotaxis in simple organisms, in which they move towards or away from a higher concentration of a chemical). Conversely, the sort of selection we have associated with consciousness operates in parallel but within the same system – a system that can simulate multiple futures, under different circumstances, and select the action with the least surprising outcome. The conscious self is simply a way of capturing these counterfactual futures, in a way that facilitates active inference.
Does consciousness as active inference make any sense practically? I’d contend that it does. From a psychiatric perspective, altered states of consciousness come in two flavours. There can be a change in the level of consciousness; for example, during sleep, anaesthesia and coma. Alternatively, there can be altered conscious states of the sort associated with psychiatric syndromes and psychotropic or psychedelic drugs. Different levels of consciousness are entangled with their impact on action. Put simply, the hallmark of reduced levels of consciousness is an absence of responsiveness. Try to imagine someone who is not conscious but acts in response to stimulation. The only responses one can elicit are reflexes that reflect minimisation of surprise in the here and now. By contrast, once this person is awake, she can fire up her predictive machinery about the past and future. In our daily lives, this suggests that temporal thickness or depth waxes and wanes with the sleep-wake cycle – that there’s a mapping between the level of consciousness and the thickness of the inference we’re engaged in. On this view, a loss of consciousness occurs whenever our models lose their ‘thickness’ and become as ‘thin’ as a virus’s.
As a psychiatrist, I’m drawn to the notion of altered conscious states as altered inference for several reasons. Key among these is the ability to understand psychiatric disorder as false inference. For example, in statistics, there are two types of false inference: false positives and false negatives. False positives correspond to inferring something is there when it is not, such as hallucinations and delusions. Conversely, false negatives are when one fails to infer something when it is there, such as a failure to recognise something or to entertain impossible ambiguities (for example, common questions posed by patients: Who am I? or Am I the right way up?). This translates clinically into disorientation and the various forms of agnosia that characterise dementias and other organic syndromes of the mind. From a practical point of view, this is a useful perspective because the neuronal machinery behind active inference is becoming increasingly well-understood.
We’ve gone fairly rapidly through the arguments. First, if we want to talk about complex systems, including living ones, we have to identify the necessary behaviours that their processes exhibit. This is fairly easy to do by noting that living entails existing in a set of attracting states that are frequented time and time again. This implies the existence of a Lyapunov function that is identical to (negative) self-evidence or surprise in information theory. This means that all biological processes can be construed as performing some form of inference, from evolution right through to conscious processing.
If this is the case, then at what point do we invoke consciousness? The proposal on offer here is that the mind comes into being when self-evidencing has a temporal thickness or counterfactual depth, which grounds the inferences it can make about the consequences of future actions. There’s no real reason for minds to exist; they appear to do so simply because existence itself is the end-point of a process of reasoning. Consciousness, I’d contend, is nothing grander than inference about my future.