Friday, 29 August 2014

Philosophy of Mind and Psychology Reading Group -- The Predictive Mind chapter 7

Kengo Miyazono
Welcome to the Philosophy of Mind and Psychology Reading Group hosted by the Philosophy@Birmingham blog. This month, Kengo Miyazono, post-doctoral research fellow at the University of Birmingham, introduces chapter 7 of Jakob Hohwy's The Predictive Mind (OUP, 2013).

Chapter 7 - Precarious Prediction
Presented by Kengo Miyazono

In chapter seven, Hohwy describes the ways in which perceptual inference is tuned in order to represent the world correctly and the ways in which it goes wrong.

The chapter discusses many different issues, but the central idea is that maintaining the "balance between trusting the sensory input and consequently keeping on sampling the world, versus distrusting the sensory input and instead relying more on one's prior beliefs" (146) is crucial to the successful operation of perceptual inference. Maintaining this balance is not a trivial task, and its failure might be responsible for pathological conditions such as delusions or autism.

In principle, prediction errors that are expected to be precise are allowed to drive revisions of prior beliefs higher up the hierarchy, while prediction errors that are expected to be imprecise are not allowed to do so. Ideally, we expect high precision in prediction errors and, hence, allow them to drive belief revisions when they are trustworthy, and expect low precision in prediction errors and, hence, do not allow them to drive belief revisions when they are not.
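This precision-weighting principle can be given a rough numerical gloss. The sketch below is mine, not Hohwy's; the function name and numbers are purely illustrative. A Gaussian belief is revised by a prediction error in proportion to the precision (inverse variance) expected of the sensory signal:

```python
# Hedged sketch (mine, not from the book) of precision-weighted updating of a
# Gaussian belief: the prediction error moves the prior in proportion to the
# precision expected of the sensory signal.

def update_belief(prior_mean, prior_precision, observation, sensory_precision):
    prediction_error = observation - prior_mean
    # Gain: how much the error is trusted, relative to the prior.
    gain = sensory_precision / (sensory_precision + prior_precision)
    posterior_mean = prior_mean + gain * prediction_error
    posterior_precision = prior_precision + sensory_precision
    return posterior_mean, posterior_precision

# An error expected to be precise drives a large revision of the belief...
trusted_mean, _ = update_belief(0.0, 1.0, 10.0, 9.0)     # gain = 0.9
# ...while one expected to be imprecise barely moves it.
distrusted_mean, _ = update_belief(0.0, 1.0, 10.0, 0.1)  # gain ~ 0.09
```

On this toy scheme, mis-setting the expected sensory precision in either direction yields the two pathological profiles the chapter goes on to discuss.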

But the process can go wrong. One possibility is that some people expect sensory prediction errors to be much less precise than they actually are. Hohwy suggests that this is what is happening in people with delusions: "a persistent, exaggerated expectation for noise and uncertainty would lead to imprecise sensory prediction errors mediated by low gain on sensory input and heightened reliance on top-down priors. 

An individual who constantly expects the sensory signal to be noisier than it really is will then tend to be caught in his or her own idiosyncratic interpretations of the input and will find it hard to rectify these interpretations. This gives rise to the idea that psychosis and delusions stem from such faulty uncertainty expectations." (159)

Another possibility is that some people expect sensory prediction errors to be much more precise than they actually are. Hohwy argues that this is what is happening in autism: "A perceiver who expects very precise prediction errors will increase the gain on bottom-up prediction error and will rely relatively more on the senses than on prior beliefs, as compared to a perceiver who expects the world to be somewhat more noisy. 

This will lead the perceiver to sample the world more and for longer, to appeal less to higher-level parameters in an attempt to generalise over particular perceptual inferences, and so will rely less on contextual influences in attempts to resolve uncertainties in the input. Described like this, the perceiver begins to fit the profile of sensory perception characteristic of the autism spectrum." (162)

Now, I believe that the idea of providing a unifying account of delusions and autism on the basis of a simple principle of prediction-error minimization is fascinating. But, there are some worries and questions.

Autism: Hohwy argues that his account explains the fact that children with autism have difficulty in mentalizing. "The simple idea is then that in autism there is a bias towards particularist perceptual inference as a result of failed optimisation of expected precisions. That is, in autism much of the world is represented correctly but is harnessed by hypotheses that are causally shallow, that miss out on longer-term regularities and that cannot well predict deeply hidden causes." (164) This explains the difficulty in mentalizing because mentalizing is a kind of causal inference about deeply hidden causes. 

But doesn't this account predict that children with autism have difficulty not only with mentalizing but also with causal cognition and learning in general? It is far from obvious that this prediction is true. Baron-Cohen famously argues that folk psychology is compromised in children with autism spectrum disorders, while folk physics is intact and even enhanced, and that autism might be genetically connected to strengths in engineering or physics (Baron-Cohen 1997, Baron-Cohen et al. 2001).

Delusions: Prediction-error accounts of delusions have become popular. What is really interesting about Hohwy's account is that it says something quite different from other notable proposals. Hohwy argues that (1) people with delusions expect sensory inputs to be very imprecise and (2) they rely more on top-down priors than on bottom-up sensory inputs. Other prediction-error theorists, on the other hand, tend to make the opposite claims. They tend to argue that (1) people with delusions experience excessive sensory prediction error signals, or that they expect them to be very precise:

"the experience of mismatch when there is none drives an individual to invent bizarre causal structures to explain away their experiences, these are manifest clinically as delusions." (Corlett et al. 2007: 238)

"delusional systems may be elaborated as a consequence of imbuing sensory evidence with too much precision." (Adams et al. 2013: 2)

and (2) they rely more on bottom-up prediction errors than on top-down priors:

"the problem that leads to the positive symptoms of schizophrenia starts with false prediction errors being propagated upwards through the hierarchy. These errors require higher levels of the hierarchy to adjust their models of the world. However, as the errors are false, these adjustments can never fully resolve the problem. As a result, prediction errors will be propagated even further up the system to ever-higher levels of abstraction." (Fletcher & Frith 2009: 55)

I wonder what Hohwy thinks about the relation between his account and the others. Does he think that his is correct and the others are wrong? But then, what are the problems with the other accounts? Or does he think that his account is compatible with the others? But then, how can they be compatible with each other despite the stark contrast?

My own impression is that other accounts and Hohwy's correspond to different stages of delusional thought: formation and maintenance. In the formation stage, it looks as though people with delusions rely more on sensory and affective inputs than on prior beliefs. For example, a person with Capgras delusion might have a certain abnormal perceptual/affective experience of the face of his wife and rely more on the experience than on prior beliefs about his wife. This is the so-called "bias towards observational adequacy" (Stone & Young 1997). 

On the other hand, in the maintenance stage, after adopting delusional hypotheses, they seem to rely more on prior (i.e. delusional) beliefs than on new and contradictory observations. For example, a person with Capgras delusion might adopt the hypothesis that the woman in front of him is not his wife, and then no new observation can revise his conviction about the matter. The so-called "bias against disconfirmatory evidence" (Moritz & Woodward 2006) might be responsible for this.

Presumably, other accounts and Hohwy's correspond to the formation stage and the maintenance stage respectively. The formation stage would be nicely explained by the hypothesis that people with delusions experience excessive sensory prediction errors or expect them to be very precise. (Indeed, McKay (2012) argues that excessive sensory prediction errors would be responsible for the bias toward observational adequacy.) On the other hand, the maintenance stage would be nicely explained by the hypothesis that people with delusions expect bottom-up prediction errors to be very imprecise.

So maybe other accounts and Hohwy's might jointly tell the whole story of delusional thought. Still, some problems remain. In this combined story, a person with delusions expects bottom-up prediction errors to be very precise up to a certain moment and then begins to expect them to be very imprecise after that. But how could that be the case? And, even if it is possible, what causes the shift?


Baron‐Cohen, S. (1997). Are children with autism superior at folk physics?. New Directions for Child and Adolescent Development, 1997(75), 45-54.

Baron-Cohen, S., Wheelwright, S., Spong, A., Scahill, V., & Lawson, J. (2001). Are intuitive physics and intuitive psychology independent? A test with children with Asperger Syndrome. Journal of Developmental and Learning Disorders, 5(1), 47-78.

Corlett, P., Honey, G., & Fletcher, P. C. (2007). From prediction error to psychosis: ketamine as a pharmacological model of delusions. Journal of Psychopharmacology, 21(3), 238-252.

Fletcher, P. C., & Frith, C. D. (2009). Perceiving is believing: a Bayesian approach to explaining the positive symptoms of schizophrenia. Nature Reviews Neuroscience, 10(1), 48-58.

McKay, R. (2012). Delusional inference. Mind & Language, 27(3), 330-355.

Moritz, S., & Woodward, T. S. (2006). A generalized bias against disconfirmatory evidence in schizophrenia. Psychiatry Research, 142(2), 157-165.

Stone, T., & Young, A. W. (1997). Delusions and brain injury: The philosophy and psychology of belief. Mind & Language, 12(3-4), 327-364.


  1. Very nice comments, Kengo. I think you point to some really important aspects of this chapter. I’ll make one preliminary point about the notion of precision that plays a large role in this chapter. You say: “Ideally, we expect high precision in prediction errors and, hence, allow them to drive belief revisions when they are trustworthy and expect low precision in prediction errors and, hence, do not allow them to drive belief revisions when they are not”. You’re absolutely correct here, and it is crucial to see that the ideal is not to have high precision all the time. The ideal is to learn what the actual precisions are and weight prediction errors accordingly.

    When it comes to delusion and autism, you raise two important points. The first point is about autism and you ask whether it follows from my account that individuals with autism should have the same amount of difficulty with causally deep nonsocial inference as with social inference. There are four things to say in response. First, in our work on autism (including some of the experiments with the rubber hand illusion), we argue that the context dependence of hierarchical inference is crucial to understanding how PEM might apply to autism. There can be much overlap in performance between groups with and without autism as long as the levels of uncertainty do not change too much. Differences occur when levels of uncertainty change, and the system therefore needs to recruit different, higher levels on the hierarchy. We see this as an important part of the story because the overall picture of autism is very heterogeneous and any account should be able to explain this. Second, even though you're right that individuals with high-functioning autism have very intact representation of deep causal structure, I think we should not overlook that at the bad end of the spectrum the level of disability is really severe and seems to extend into non-social cognition. Third, in other work, with Colin Palmer, we have argued that there is something special about social cognition, namely that much social cognition involves common knowledge (i.e., solving coordination problems by iterative representation of each other's mental states). This may be an area that is especially challenging given the kinds of issues with precision optimisation that we speculate about. Fourth, yes, the prediction would be that for hidden non-social causes at the same depth and with similar levels of uncertainty and context-dependence as social causes, individuals with autism should show similar deficits (we are testing this in my lab at the moment).

  2. The second point you raise is about the relation between the account that I give of delusion formation and the very nice accounts given by Phil Corlett and Adams et al, and Frith and Fletcher. I've been wondering about these differences as well. It may be as you say that the differences concern the formation and maintenance of delusions, though I am not sure these aspects of inference should be dealt with entirely separately (for PEM inference and learning are intimately connected so it is hard to pin-point pre and post formation in my view). Another important distinction is between states and traits in schizophrenia, which may require different kinds of treatments in terms of prediction error minimisation. It is also possible that we need to tell different stories about how the same kinds of biases with respect to precisions might manifest at different levels of the cortical hierarchy.

    There is one particular idea I’d like to air, which seems to me to deal with formation and maintenance in one go. On my account, the idea is that expected low precision attenuates the incoming signal more than it should. This means that priors are weighted more. Kengo asks what happens with the other side of the story, the increased weighting to aberrant prediction error, which the other accounts focus on. I think my account deals nicely with this. If the system in the individual with schizophrenia is already suppressing prediction error in a fairly global manner, then it seems likely that aberrant prediction error might occur and that these prediction errors will be salient. The reason is that against a background of generally dampened down prediction error, sensory input that by chance happens to be stronger than the rest will stand out, and will demand to be explained away by the already overactive priors.

    Here is a toy example. Imagine you get a series of prediction errors coming in like this: 16, 9, 2, 27, 21, 13, 687, 6, 9, 11, 25. Say your threshold for weighting a prediction error is 3, so anything under 3 is not trusted. Then the sample ‘2’ won’t make the posterior change. It seems likely that when the sample ‘687’ comes in you will disregard it as an outlier, since you know the samples tend to be much lower (you’ve calculated a mean of 17.2 at that stage). So ‘687’ is not weighted in your inference. But if you have dampened down the gain on prediction error, in the way I think people with schizophrenia have, then your threshold might be 30, so you don’t trust and don’t weight anything below 30. Now ‘687’ will seem mightily important: it will be an aberrant, overly precise prediction error. I am saying that the schizophrenia case is somewhat like the latter case, and that this explains formation and maintenance in one go. In particular, if you already have a wrong prior of 600, then ‘687’ will be more Bayesian grist to the delusion. (There are many ways of playing with this toy example, for example, such that gating is not an all or nothing affair).
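    A small simulation makes the toy example concrete. This is my own sketch, not part of the original comment: the trust threshold and the simple outlier rule are illustrative stand-ins for gain and learned second-order statistics.

```python
# Sketch (mine, not Hohwy's code) of the toy example: prediction errors below
# a trust threshold are ignored, and samples wildly above the running mean of
# trusted samples are rejected as outliers.

def weighted_samples(samples, threshold, outlier_factor=10):
    """Return the samples that would actually drive inference."""
    kept = []
    for s in samples:
        if s < threshold:
            continue  # too imprecise to trust
        mean_so_far = sum(kept) / len(kept) if kept else s
        if kept and s > outlier_factor * mean_so_far:
            continue  # rejected as an outlier against the running mean
        kept.append(s)
    return kept

series = [16, 9, 2, 27, 21, 13, 687, 6, 9, 11, 25]
healthy = weighted_samples(series, threshold=3)    # '687' rejected as outlier
dampened = weighted_samples(series, threshold=30)  # only '687' survives
```

    With the healthy threshold, the running mean when ‘687’ arrives is 17.2, matching the figure in the comment, and ‘687’ is discarded; with the raised threshold, ‘687’ is the only sample that gets through, so it dominates inference.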

    1. Thank you for the detailed responses! They are very helpful as well as insightful.

      About autism, my question was about the asymmetry between physical and social causal cognition in autism, which poses a prima facie problem for your account. But, you seem to have some resources to meet the challenge. It sounds very nice.

      About delusions, I really like the story. I agree that it is a coherent story about both adoption and maintenance stages. Still, I have some worries about the details.

      There are two crucial explanatory factors in the story. One is the global high threshold for the weighting of prediction errors (e.g. the threshold at "30"). The other is specific prediction errors with very high precision (e.g. "687"). One might think, however, that the global high threshold is redundant after all. Once you posit specific prediction errors with very high precision, they explain at least the adoption stage by themselves. People with delusions adopt delusional hypotheses in response to specific prediction errors with very high precision, in a bottom-up manner. This is, in my understanding, the core idea of other PE accounts. It looks as though there is no need for the global high threshold hypothesis as far as the adoption stage is concerned.

      Still, the global high threshold hypothesis is attractive when it comes to the maintenance stage. But there is a puzzle about your account of the maintenance stage. If you think that prediction errors with very high precision occur at the formation stage ("687"), you probably do not have reasons to rule out the possibility that similar prediction errors occur again at the maintenance stage ("693", "546", "971"). The puzzle is, then, why do the new prediction errors at the maintenance stage fail to cause the revision of prior DELUSIONAL beliefs in a bottom-up manner? "687" in your story causes the revision of prior HEALTHY beliefs at the formation stage because "687" is extremely important relative to the threshold. But then, probably, "693" is extremely important relative to the threshold as well. Why does "693", then, fail to revise the prior DELUSIONAL beliefs at the maintenance stage?

    2. Hi Kengo, thanks for these comments. It is interesting trying to work out the details of this account, which is still work in progress for me. I don't think that positing prediction errors with very high precision will be able to explain formation by itself. The idea is that a healthy system will reject these high precision prediction errors as outliers. So my story is trying to find a way in which the system will treat high precision prediction error in a fallacious way. The idea is that poor prior learning of second order statistics can accomplish this. The explanatory problem for the other PE accounts is that they can’t really explain why the high precision prediction errors aren't just rejected as outliers (something we are quite good at generally).

      My initial thoughts about the maintenance stage are rather simple. Globally low weighting on prediction errors would make it hard to revise any kind of prior. We have to be careful how we wield this simple toy example - it probably has its limits. But if gain on prediction error is low then it will take a long time to revise the delusional belief (e.g., that the mean is about 700) on the basis of a series of samples that are much, much lower and which are given low gain. This problem is compounded if outliers do occur from time to time. That is, the sluggish efforts at reality testing are undermined by the new incoming outliers. In a healthy system, these outliers are treated as outliers and therefore they don't undermine learning.
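      The sluggishness point can be shown with the same toy numbers. This is my own illustration with arbitrary gain values, not part of the original comment: under globally suppressed gain, a delusional prior near 700 is revised only very slowly by a stream of much lower samples.

```python
# Sketch (my illustration, arbitrary numbers) of sluggish belief revision
# under globally suppressed gain on prediction error.

def revise(prior, samples, gain):
    belief = prior
    for s in samples:
        belief += gain * (s - belief)  # precision-weighted delta rule
    return belief

low_samples = [15.0] * 20
suppressed = revise(700.0, low_samples, gain=0.01)  # barely moves from 700
normal = revise(700.0, low_samples, gain=0.5)       # converges towards 15
```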

      I am sure, however, that there is much more to be said in this debate.

    3. Thank you, Jakob! I now understand your view and its relation to other accounts much better.

      I am not sure, though, about the outlier objection to other PE accounts. The core of the objection seems to be the idea that a healthy system will reject prediction errors with extremely high precision as outliers. But, I guess, this depends on the details. Maybe the relevant prediction errors are not as precise as your example suggests (e.g. 100 rather than 687). Alternatively, the healthy threshold might be higher than you think (e.g. 15 rather than 3). I don't know of evidence suggesting that the relevant prediction errors in people with delusions are so precise (relative to the threshold) that a normal system would reject them as outliers.

      For example, the Corlett et al. (2007) study suggests a prediction error signalling abnormality in people with delusions. What is abnormal about those people is not that their brains (right prefrontal cortex) show some extreme prediction error related activity. The problem is rather that their brains respond to surprising and unsurprising events in very similar ways. The result, of course, can be interpreted in several different ways, but maybe what this suggests is not that people with delusions are experiencing extremely precise prediction errors, but rather that they experience reasonably precise prediction errors in inappropriate contexts, such as contexts where it is obvious that there is no real prediction error.

      Still, I agree that we need to know more facts before coming to a conclusion about this issue.


  4. How different are Predictive Coding accounts from the two-factor account? I mean, delusional beliefs are not dismissed by delusional subjects as non-veridical, while illusions or hallucinations are. It seems to me that there's something missing vis-à-vis Coltheart's model, which would also answer questions about delusion maintenance vs. formation.

    1. Hi Marcin,

      Thanks for the nice question. There is a nice conversation between Coltheart and Corlett at the Imperfect Cognition blog about the relation between their accounts.

      Can you say more about why something is missing in two-factor account? It sounds very interesting.

  5. Ah, I meant something was missing in the predictive account, not the two-factor account. Basically, Bayesian (or generative) models of hallucination, illusion and delusion all seem to focus on generation rather than on subsequent evaluation, maintenance or durability of such representational episodes. That's probably because they have very limited resources for describing cognitive architectures and functionally distinct brain areas.

    1. Oh, I see. I share the impression that prediction error accounts of delusions tend to discuss the formation stage rather than the maintenance stage. But, probably, they would not accept that they are unable to explain the maintenance stage because of limited theoretical resources. One might think, just like Hohwy, that delusion maintenance is the result of strong top-down inferences due to low expected sensory precision. Also, recent work by Corlett and colleagues provides a nice account of maintenance processes on the basis of an analogy between belief and memory. Even if you don’t buy their account, you might think that their empirical evidence is remarkable (e.g. Corlett et al. 2007) and that even two-factor theorists can’t ignore it. Indeed, Coltheart (2010) and McKay (2012) incorporated the evidence into their two-factor accounts.

      * Coltheart, M. (2010). The neuropsychology of delusions. Annals of the New York Academy of Sciences, 1191(1), 16-26.
      * Corlett, P., Murray, G., Honey, G., Aitken, M., Shanks, D., Robbins, T., et al. (2007). Disrupted prediction-error signal in psychosis: evidence for an associative account of delusions. Brain, 130(9), 2387-2400.
      * McKay, R. (2012). Delusional inference. Mind & Language, 27(3), 330-355.

    2. Hi Marcin - I agree there is a real issue about how these accounts combine. The Corlett-Coltheart chronicles that Kengo refers to are very interesting in this regard. I foresee some overall combination of all this at some stage. Corlett is in fact suggesting that there is more in common between them than meets the eye, and I tend to agree. I see the need for a second factor, which Coltheart points to, but I also see the need to avoid a second factor since a domain general belief deficit would predict widespread delusions. Coltheart et al. has an answer to that, but there is also a 1 factor answer to the problem cases where someone has the missing response to known faces but fails to develop the delusion (some of this plays out in the blog posts mentioned, some in Coltheart's papers, and some in my recent papers). It then becomes an empirical question which is right, rather than an issue that we have strong reasons for deciding on right now - and the evidence is not in yet.

      As to illusions and hallucinations, I think that it is not always the case that we dismiss illusions and hallucinations as non-veridical. Sometimes we do, just as people with delusions are sometimes able to take on board evidence against some of their beliefs. But we presumably have all sorts of illusions all the time (e.g., when you listen to others talk your brain constantly fills in the gaps, moves locations around, and blocks out irrelevant input). You might assert that you believe these perceptual inferences are non-veridical, when the errors are pointed out to you, but that doesn't mean your perception changes at all. I think there is a lot more in common between delusions, hallucinations and illusions than we normally think (as discussed in my recent papers in Neuroethics and in Mind & Language).

    3. In general, I think the problem with 1 and 2 factor accounts, and combination accounts for that matter, is that they don't fully incorporate optimization of expectations for precision of prediction error and the hierarchical processing that comes with that. As we know, there is no prediction error minimization without precision weighting, so it is natural to look to precisions too. In a sense this is what Frith and Fletcher suggested, and what I have been arguing too. So, I think it may in fact be more productive to go back to square one and reconsider all the evidence and reasoning in the traditional 1 vs. 2 factor debate, and then reformulate the whole story.

  6. Loving the discussion this month, thanks all. I'm interested in the PEM account of autism in particular, and though it seems like ongoing empirical investigation will clarify the prospects for that account, I wanted to add one more a priori response to the worry Kengo raises (that PEM predicts difficulty with representation of deep causes generally, whereas autism impairs only representation of mental or social causes and may even enhance that of physical causes).

    The response is that the ability to represent the mental states of others in any practically useful way (e.g. in social situations) presupposes at least adequate representation of relevant physical states (in particular states of the body of the person to whom one attributes mental states), but not vice-versa. Thus, attributing mental states to others involves deeper causal representation than the kind of modeling that goes on in folk physics, because it takes representations of the latter kind as input.

    I can think of many examples to illustrate the point, but here's just one: suppose that a person standing to your left suddenly bumps into your shoulder. If this can be fully "explained away" in terms of physical causes (a sudden gust of wind; you are both standing on a rocking boat), positing a mental cause is unnecessary. But absent any obvious physical cause, one might infer aggression or some other mental state. Generally, it's in the absence of an adequate explanation of a given physical occurrence in terms of the physical forces manifestly at play in the current environment that mental causes need to be represented. Of course this isn't meant to have metaphysical implications: all the causes represented could be thought of as physical, with mental states being complex internal physical states, if that's what you're into.

  7. This is a tempting thought, Alex. But perhaps it is tricky to generalize entirely. In some cases, the default model may be positing mental causes over external physical causes, and then inference to external physical causes becomes the intuitively deeper one. Perhaps you’re in a mosh pit and fully expect to be bumped by the other people there, because you expect them to be excited about the music and therefore throwing themselves around. But if the bumping suddenly gets more violent you might not be able to explain it in terms of mental states and instead need to invoke a deep folk physics cause, such as that the floor is collapsing.

    The underlying point is that what matters for inference, and for probing causal depth, is non-linearity in the input, which calls for positing extra causes that modulate more simple, shallow, causal relations. Probably deeper causes can be both folk physical and folk psychological.

    In general, I suspect we are such social creatures that the default is often to explain events in mental terms rather than situational folk physics terms. It does seem we have a tendency to ascribe deep, character-trait-like causes rather than more situational, shallow (but still somewhat mental) causes, cf. the fundamental attribution error.

  8. This is all at a rather high level of generality. I think matters would be advanced if one thought about how to explain specific delusions, and the best one to start with is Capgras delusion, because that is the one which has been researched in most detail and reported more often than any other delusion. So, Jakob, how would you explain why some people come to possess the delusional belief that people emotionally close to them - their spouses, for example - have been replaced by similar-looking or identical-looking impostors?

  9. I’m joining the party rather late but thought I would like to add something to this discussion thread. I have read and greatly admired Jakob’s book and have followed, with interest and enjoyment, Kengo’s percipient comments and questions.
    Just wanted to draw out a point that is perhaps implicit in the discussion of expectancies about prediction error, and the degree to which a consideration of these expectancies may relate to the maintenance as well as the formation of beliefs. In a nice paper (Annals NY Acad Sci 2007) Preuschoff and Bossaerts highlight two important contributions to a variable that governs the degree of updating that occurs following a given level of prediction error: first, the covariance between (best possible) predictions and past prediction errors; second, the variance of the prediction errors (referred to as the “prediction risk”). To give an example of the former, they imagine a coin toss in which the best prediction for value (let us say heads is a “win”) is actually zero since, over time, one wins and loses an equal number of times. However, this prediction will not correlate with the prediction errors. Under such circumstances (low or absent correlation between best prediction and past prediction errors) the learning rate should be low (or zero). Or, put more simply, as I read it at least: in an environment where shifting one’s predictions doesn’t do much to change prediction errors, it’s best not to update. The latter (prediction risk) they take to be important because it is the basis for scaling prediction errors: (as with the discussion earlier in the thread) a history of highly variable prediction errors means that the errors will be scaled down, necessitating a higher-magnitude one to have an effect (the example they give here is of choosing one’s options in a stock market – when fluctuations are expected to be large (high variance), one is less inclined to buy or sell on the basis of a single deviation, even a high one).
    The long and the short of this is that the prediction risk scales prediction errors and the impact of scaled prediction errors on updating is thereafter determined by covariance between predictions and errors.
    Perhaps the transition from the elastic, hyper-associative, explanation-hungry state of early psychosis to the fixed and more evidentially impermeable state of established delusions might to some extent reflect this alteration in the learning rate parameter - governed by an initially high level of prediction error (with mis-specified precision), via an increased estimation of prediction risk (due to lots of high prediction errors) leading to a downscaling of prediction error effects (as described by Jakob) and an increased experience of poor covariance between predictions and errors, the latter two effects conspiring to reduce the learning rate (that is, to reduce the chance that updating of beliefs will occur).
    Or, to put it a little more succinctly, perhaps the person moving into psychosis is experiencing and forming beliefs as a response to the violations of expectancies. Simultaneously, tracking these violated expectancies causes an increasing mistrust of the degree to which the world is providing reliable information, the end result being a reluctance to update beliefs even when the world sends strong signals that the beliefs are wrong.

    Best wishes
    Paul Fletcher

  10. Hi Paul – I like these ideas a lot! The Preuschoff and Bossaerts 2007 paper is very nice; also Payzan-LeNestour & Bossaerts, PLoS Comput Biol 2011. The discussions there are similar to that in Mathys et al from Frontiers 2011, which is phrased in terms of prediction error minimization. Some of the things I have discussed above (and back in posts on Ch 1-2-3) use parts of Mathys et al (though I wouldn’t say I master every equation in these papers!).

    Covariance and prediction risk are going to be useful tools because they speak to hierarchical, longer term learning of uncertainties. Adding unexpected uncertainty will also matter (i.e., changes in the covariance: the coin changes to being biased, or the market changes due to yet another war). Bayesian inference in the real world must have all these elements, so simpler Bayesian schemes will almost certainly miss the mark as accounts of delusions. Having these three elements of uncertainty assessment gives richer possibility for accounting for both delusion formation and maintenance within a purely Bayesian framework.

    Paul’s suggestion begins with the important early phase of psychosis, which is marked by a strange openness to stimuli, and a strong drive to explain it away (Freedman recounts patients who say “my eyes became markedly oversensitive to light. Ordinary colors appeared to be much too bright, and sunlight appeared dazzling in intensity”; patients “felt they suddenly opened up to a wealth of perceptual stimuli of which they had not been aware previously”.) This sounds like a sudden increase in overly precise prediction error, which craves to be explained away. It seems likely that now inference is beginning to be undermined, since these prediction errors are spurious. So prediction risk increases, and in the longer run one finds that the covariance decreases. This should then force a decrease in the learning rate (“I can’t trust the predictions, and when I do try to update I find that it doesn’t help me”). This gives us both formation and maintenance, not as separate factors but as natural (rational) parts of the same one story.

    The question arises, why is there the initial surge in overly precise prediction error? Also, why does psychosis often floridly fluctuate and get embellished, rather than retain the same content and unwillingness to alter belief?

    The basis of it all could be poor processing of changes in unexpected uncertainty (that is, poor assessment of volatility). We all know that sometimes we need to update our model. We might have hit on predictors with good covariance, but at discrete points in time the predictors begin to perform poorly, and covariance goes down. This is hugely context-dependent, and can take very long to learn (it might be easy for coin tosses, since we rarely come across biased coins and conmen; but the stock market is so volatile that no-one ever has learnt to prevent their market predictors from going bad).

    Perhaps the prodromal sudden surge in prediction error is based in a faulty volatility inference that a new context has been entered where prediction error is precise and trustworthy. Similarly, later fluctuations may be faulty inference that the context has changed again, even if it hasn’t. This could perhaps all boil down to a high hyperprior for volatility. This could have the cascading effects on prediction risk and covariance that together give the total package: aberrant prediction error and poor ability to revise the learning rate and the model. To me it makes sense that the hardest level of uncertainty would be centrally involved in psychosis. Incidentally, and almost too neatly, Bryan, Colin and I have argued that autism has a low hyperprior for volatility.
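One way to make the volatility-hyperprior idea concrete is a minimal Kalman-style update, in which a higher assumed volatility inflates the prior variance and hence the gain on prediction error. This is a rough sketch under stated assumptions; the parameterization is illustrative, not anything from the book.

```python
def kalman_step(mean, var, obs, obs_noise=1.0, volatility=0.0):
    """One Kalman-style belief update; `volatility` stands in for the
    volatility hyperprior: higher assumed volatility diffuses the prior."""
    var = var + volatility              # prior widens under high assumed volatility
    gain = var / (var + obs_noise)      # precision-weighted learning rate
    mean = mean + gain * (obs - mean)   # update driven by the prediction error
    var = (1.0 - gain) * var
    return mean, var
```

With these toy numbers, the very same observation moves the belief further when assumed volatility is high: the input is treated as precise, trustworthy news.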

    Plug: Peter Bossaerts is keynote speaker at our Australasian Society for Cognitive Science Conference in Melbourne Dec. 8-10. Come along!

  11. Max asks how these kinds of ideas might apply more concretely to a case such as Capgras’. Thanks for the challenge, Max! As we have discussed above, there is much difference between these monothematic delusions and the florid polythematic ones in psychosis. It is more difficult to apply the trifecta of risk, covariance, and volatility to cases such as Capgras’. But let me try out some ideas.

    Assume, with Maher and many others, that Capgras begins with an unusual sensory input: the predictor that usually ensures little prediction error begins to fail. In Capgras’, this predictor is for some autonomic response to seeing the spouse. The unusual sensory input is that this autonomic response is different – this is then a prediction error.

    The prediction error could be in the mean autonomic response (toy example: rather than a 10, it is now unexpectedly a 1). Or it could be in the precision (the mean response is still a 10 but there is now unexpectedly much more variability around that mean). Or both: unexpectedly much variability and a surprising mean around a 6 perhaps.

    The typically organic damage that leads to Capgras’ could, as far as I can tell, produce any of these prediction errors. Different types of damage could perhaps cause different errors. Some damage means the autonomic response is preserved but is processed noisily, while other damage knocks it out altogether.

    It seems likely that different prediction errors will lead to different inferences. The individual may infer that the spouse is a stranger, if a very strong expectation of autonomic response is clearly violated. Here we are back with volatility again: an unexpectedly strong prediction error normally suggests that a new modulating hidden cause has arisen, such as an impostor. Or the individual may report a more inchoate sense of uncertainty when the mean autonomic response has shifted somewhat and become noisier.

    In both cases it seems likely that noisy measures like skin conductance and heart rate cannot distinguish between the cases as seen from the outside. This would obviate the need to look for a second factor that would explain why some don’t develop the delusion.

    This still leaves the two classic questions: 1. Why do they prioritise the impostor hypothesis and not the brain-lesion hypothesis as an explanation of the unusual input? I am tempted to appeal to priors here: we are exposed to a huge number of strangers who do not excite our autonomic system, and we have brain damage extremely rarely. It is then rational to infer that the spouse is an impostor. We know that frequencies determine empirical priors from work on visual illusions such as the Müller-Lyer (e.g., Purves’ work), so it would be unsurprising if this is the case for the delusional case too. There is much more in common between illusions and delusions than many think (or so I’ve argued).
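The base-rate reasoning here can be made explicit with a toy Bayesian calculation. Every number below is invented purely for illustration; the point is only that when the likelihoods are comparable, the posterior is dominated by the priors.

```python
def posterior_impostor(prior_impostor=0.999, lik_impostor=0.9, lik_lesion=0.9):
    """P(impostor | no autonomic response) for two hypotheses:
    H1 = 'the person is a stranger/impostor', H2 = 'my brain is damaged'.
    Priors and likelihoods are made-up numbers, reflecting only the point
    that strangers are encountered vastly more often than brain lesions."""
    prior_lesion = 1.0 - prior_impostor
    p1 = prior_impostor * lik_impostor
    p2 = prior_lesion * lik_lesion
    return p1 / (p1 + p2)
```

Since both hypotheses predict the absent autonomic response about equally well, the frequency-based prior carries the inference to the impostor conclusion.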

    2. Why is the delusion maintained? We could rephrase: why is the impostor hypothesis repeatedly prioritized? Perhaps part of the answer is simple: the prior that strangers cause no autonomic response is very strong and is constantly “topped up” as we encounter strangers (just as the Müller-Lyer is topped up all the time). In addition, for autonomic responses it seems there is little scope for active inference (i.e., reality testing), which means the individual cannot rid themselves of the delusional belief. The individual doesn’t know what to do to test the hypothesis that the spouse is not an impostor, apart from trying to restore the autonomic response. With the Müller-Lyer, I might be able to put the lines right next to each other, which would convince me they are in fact the same length, but if my hands were tied down and I could only ever look at the lines from a distance, then I would be justified in not trusting anyone else telling me the lines are the same length (Raben and I suggested this in 2005). In other words, it would be highly surprising if someone who doesn’t elicit an autonomic response is not a stranger, so witnesses who tell me otherwise are not reliable (à la Bovens and Hartmann).

    1. This is very interesting. Unusual autonomic response creates a kind of prediction error, and delusional beliefs are formed in response to it. The crucial issue here is that prediction errors might not be sufficient for the development of delusions, since people with damage to ventromedial regions of the prefrontal cortex would share similar prediction errors but do not develop delusions. You suggested that this might be explained by different kinds of prediction errors and different ways in which people respond to them.

      It seems to me that a very simple PE explanation of the Capgras/ventromedial asymmetry is that, in Capgras cases, the relevant prediction errors are regarded as very precise for some reason and people make bottom-up inferences from them. In ventromedial cases, on the other hand, the prediction errors are not regarded as very precise and people do not make bottom-up inferences. So, the difference between Capgras patients and ventromedial patients is not about prediction errors, but rather about their expected precision. Is this a part of your story?

      Maybe this simple PE account is consistent with the two-factor account, since we can regard the prediction errors (shared by both Capgras and ventromedial patients) as the first factor, and the high precision of the prediction errors (only in Capgras patients) as the second factor. Indeed, it is natural to expect that high precision of the prediction errors explains what is often called the bias towards observational adequacy (Stone & Young 1997), which is regarded as the second factor by some two-factor theorists (but not Max). McKay (2012) wrote about a very similar idea of combining two-factor and PE accounts, although he didn't talk about precision.

    2. Dear Jakob,

      I think here you have not considered what I take to be a crucial point. We agree re Capgras that the unusual sensory input is that this autonomic response is different – this is then a prediction error. More precisely, what's different is that non-Capgras people show much greater autonomic responding to familiar than unfamiliar faces, whereas Capgras people show equal responsivity to these two classes of face.

      But patients with damage to ventromedial frontal cortex also show equal autonomic responsivity to familiar and unfamiliar faces: yet these patients are not delusional. Why not? Our answer: it is because a second factor is needed for delusion to occur, on top of the first factor (which is absence of autonomic response to familiar faces). Do you agree that this line of reasoning indicates that no one-factor theory can explain Capgras delusion?

      And btw I think it is not correct to draw a distinction, as Kengo does, between the PE theory of delusion and the two-factor theory of delusion. Since its inception in 2001 the two-factor theory has always been a prediction-error theory. For all the monothematic delusions, factor 1 generates the initial delusional idea precisely because factor 1 causes a prediction error to occur, and prediction errors are what trigger the belief generation and evaluation system.



    3. Why might people with vmPFC lesions not come up with a delusional interpretation? One possibility is that they do not have a pervasive deficit (in the sense that there is a prediction error signal limited to a portion of the inferencing hierarchy but this does not extend throughout that hierarchy). They quickly learn that the absent autonomic response is to be predicted and can be easily encompassed within an existing world view that does not require the possibility of impostors.
      I am really puzzled that you call the two factor theory (at least in part) a prediction error theory. As I understand it there is (must be) a prediction error (caused by abnormal experience) but this isn’t abnormal. I don’t understand how it can be a prediction error theory. Or perhaps you are saying that it’s a prediction error theory in the sense that a normal prediction error is part of the chain of events that ultimately results in delusions. If so, it’s a prediction error theory in a very different sense. Or perhaps I am misunderstanding.

  12. I share Paul's sentiments here - would love some clarification on how 2-factor theory has always been about prediction error.

    Furthermore, I think that the growing literature on causal model theory and model spaces - learning to learn - might be useful as we consider the hierarchy of inference.

    In associative learning theory, the degree of learning and belief updating, and the formation of new beliefs is proportional to the prediction error [see Tony Dickinson's comprehensive Bartlett Memorial Lecture piece (2001)]. To summarize, in the face of constant aberrant prediction error (either in terms of magnitude, timing or precision) new belief formation is necessary.

    Which beliefs are formed and why is less clear from a simple associative account.

    However, theorizing and data on causal learning, inference, and belief have advanced somewhat since 2001. In Karl Friston and others' most recent paper on habits in the Bayesian framework, they appeal to prediction errors and precision in examining the space of possible beliefs/explanations.

    They suggest that prediction errors and their precision necessitate learning about a particular model (or belief).

    However, after a point - the errors (surprises/odd experiences) necessitate a new belief - I (and Michael Waldmann, Josh Tenenbaum et al) call this a prediction error 'over beliefs' - the associative structure (e.g. from causal factors to effects) is inadequate to explain away the prediction error and so a new model is necessary. A new associative structure is necessary. Note though - this is the same error signal, the same single factor. That single factor (aberrant prediction error) leads to belief evaluation (their Factor 2 - in our account intimately related/inseparable from the initial prediction error) and because of its magnitude, timing and/or precision, it leads to an aberrant belief - in the Capgras case - an imposter belief.

    In our model, aberrant prediction error drives both the experience and the subsequent belief - we don't need a second factor.

    All the best,


    1. Every time I have seen my wife's face in the past, this has been followed by an autonomic response. Hence I have learned to predict that such a response will always occur after I have seen her face. Last night I had a stroke which produced the form of brain damage that is the neural substrate of Capgras delusion. This morning I see her face and, as usual, predict that an autonomic response will occur. This prediction is erroneous, i.e. the expected autonomic response does not occur. That's the form of prediction error that, on our 2-factor theory, is the basis of Capgras delusion. We have analogous accounts, all involving some type of prediction error, for various other monothematic delusions.

      Is it now clear in what sense our theory involves prediction error?

  13. One other brief thought - I've really enjoyed this thread and wanted to add that Jakob, Kengo and others have highlighted a key dimension in which prediction errors can be aberrant.

    I see precision as an error over errors, or an expected (or unexpected) uncertainty.

    In formal learning theory this is described as the way in which prediction errors from prior learning episodes are remembered and used to allocate attention to the cues that engendered them, in the future.

    This is one form of attentional salience or, more formally, associability. Events that are unpredictable garner more attention and are more likely to enter into associative relationships in the future (pace my error over models).

    Clearly there are a lot of empirical data to be gathered. Ultimately, that ought to be the test of the models - rather than their intuitive explanatory adequacy - that’s the beauty of predictions, prediction errors and science.


  14. I agree with Phil's idea that a prediction-error theory can deny the second factor and, hence, fail to be a two-factor theory. As Paul suggested, a prediction-error theorist might simply explain the Capgras/vmPFC damage asymmetry by the idea of limited deficit.

    But, I am sympathetic to Max ("I think it is not correct to draw a distinction, as Kengo does, between PE theory of delusion and two-factor theory of delusion") to some extent because I do think that there could be hybrid theories. For example, a two-factor theory can be a prediction error theory by either identifying the first factor with a kind of prediction error (e.g. identifying the first factor of Capgras delusion with the lack of expected affective response to familiar faces (Ellis & Young)), or explaining the second factor by prediction errors (e.g. explaining the bias towards observational adequacy (Stone & Young) by aberrant prediction errors or precision). Maybe hybrid theories are attractive because they might have the virtues of both accounts...

  15. Hi Kengo,

    We don't think it is meaningful to say that the two factor theory is a prediction error theory because it is expressed at a different level of description (psychological vs. cognitive neuroscientific).

    In as much as two factor theory does relate to prediction errors, it treats them as being normal.

    PE theory, as it has been formulated, could indeed be invoked to explain both abnormal perceptual
    experiences and abnormal beliefs.

    But this does not, in our view, make it a two factor theory even though we may choose to use descriptive terms that differentiate stages of delusions, like formation and maintenance.

    All the best,

    Phil Corlett & Paul Fletcher

    1. Are Phil and Paul suggesting here that the idea that new beliefs are triggered by prediction error somehow belongs to cognitive neuroscience? It predates the rise of cognitive neuroscience, e.g. “conscious awareness and ideation tend to arise primarily at moments of conflicting sign-gestalts, conflicting practical differentiations and prediction” (Tolman, 1932); “recognition is too easy to arouse vivid consciousness. There is not enough resistance between new and old to secure consciousness of the experience that is had . . . ‘Consciousness’ is the more acute and intense in the degree of the readjustments that are demanded, approaching the nil as the contact is frictionless and interaction fluid” (Dewey, 1934); “At any moment this system is compared with the stimulus actually in operation . . . The nervous system thus elaborates a forecast of future stimuli . . . and compares these forecasts with the stimuli actually in operation . . . If the stimulus and model are not identical, impulses signifying discrepancy develop and an orienting reaction occurs” (Sokolov, 1958; despite the reference to “nervous system” here, Sokolov's theorizing is at the cognitive level). So this idea is a very old one; I'd bet that one can find it somewhere in William James.

      I think the first person to offer a prediction-error theory that did specifically refer to particular brain structures (i.e. was essentially cognitive-neuroscientific in nature) was Jeffrey Gray (see e.g. his 1995 BBS paper "The contents of consciousness").

  16. Hi Phil and Paul,

    This is really helpful! I agree with most of what you have said.

    It might be a slight oversimplification, though, to say that two-factor and prediction-error theories are at completely different levels. Two-factor theorists sometimes talk about the neurophysiological basis of the first and second factors. Prediction-error theorists make some claims about what is going on at the psychological level when they talk about, for instance, abnormal salience.

    Still, I do agree that the two theories are primarily about two different levels of explanation. But my thought is just that there could be hybrid theories. In other words, nothing rules out the possibility that someone coherently endorses some important ideas from two-factor theories and some from prediction-error theories. (This is a bit weaker than Max's claim that there is no clear distinction between two-factor and prediction-error theories. But, I guess, this might be what he really wanted to say.) The fact that they are primarily at different levels opens up, rather than closes off, such a possibility. After all, two different theories about the same phenomena at exactly the same level are, in many cases, incompatible with each other.

    Best wishes,


  17. Just to be clear – when we say that the two factor theory and the PE theory are expressed at two different levels, this is not to exclude the possibility that each may be expressed in terms of the other. We certainly wouldn’t argue that the two factor theory takes no heed of underlying neurophysiological processes (after all, the observation of the single dissociation that Max describes above is the basis for asserting that two different forms of underlying process must aggregate in the formation of Capgras Syndrome). It is both natural and essential that these two layers of explanation should be brought together. Nor can one level of explanation ignore observations made at another level – the prediction error model cannot ignore this dissociation found at the higher level of description. But it’s important to remember that one can imagine the two factor theory surviving quite happily without abnormal prediction error needing to be invoked.

    This is what we meant in underlining the fact that, ultimately, they are expressed at different levels. Not that consilience cannot be attained but that we need to be careful about sliding from one level to another. Otherwise we are in danger of getting into potentially fruitless arguments about which theory/model is best. It seems to me entirely plausible that one could express an abnormality in terms of prediction error signal and that, if one moved up a level or two, one could make prophecies about an individual with such a disrupted signal having both altered perception and disturbances in the ways that beliefs are formed and held. I wouldn’t see this as a hybrid theory, rather I would see it as a pleasing consilience across the two models. And of course, the predictions can and should go in the opposite direction – from higher to lower levels of description.

    (Phil may wish to comment here - the last addition arose from a conversation we had (hence came from both of us) but this comment comes without consultation )

  18. Dear Max,

    We agree, of course, that the idea of prediction error as a driver to updating beliefs long predates the emergence of cognitive neuroscience and we hope too that the referencing in our papers is sufficiently wide to exculpate us from any suggestion that we would harbor such a belief.

    Indeed, we would not, without having read Coltheart et al on abductive inference and delusions, have been moved to read Peirce and to note that he was clear that abductive inference is driven by a novel or surprising experience.

    One might argue (rightly) that prediction error theorists have been getting along perfectly happily without the need to worry too much about the brain. And, of course, the associative learning field has made the key advances in theorizing about prediction error as the basis for learning largely on the basis of behavioral studies. Again, we hope that the debt we owe to the associative learning theorists has been adequately acknowledged in our referencing.

    However, when it comes to going beyond the simple perspective of “I didn’t expect that, therefore I will change my belief about the world” we feel that exploring the different aspects of prediction error at
    different levels in the system will be critical since many of them will be signals that are not available to conscious report even though they ultimately shape how we model the world and how we evaluate our models of the world.

    This exploration needs deeper measures, including those at the level of the brain. As an example, the two factor theory of delusions relates an abnormal perceptual experience to a surprise which causes belief updating. Since the evaluation of beliefs is faulty the new world model is delusional.

    The observation, as you have clearly demonstrated, that it is possible to have the perceptual aberration without the faulty belief evaluation suggests that the two processes may
    be different.

    But immediate questions arise - for example, how does one evaluate a belief?

    It’s hard to imagine this process without some further comparison process (does the nature of this belief violate prior experiences?).

    So how does prediction error relate to belief evaluation?

    We appreciate that you have never speculated on the nature of prediction error in the papers adumbrating two factor theory so hope you don't mind us introducing it as a possible signal important in the belief evaluation process itself.

    We find it hard to imagine pursuing this line of enquiry without having recourse to measures of the brain and that is why we emphasize the cognitive neuroscience of prediction error signal. In this field we believe that real progress is being made in providing a theoretical framework that enables us to think about this signal in terms of its characteristics (magnitude, risk, reliability, precision) that offer new ways of thinking about perception and belief.

    Critically, although these ways of thinking go all the way back to Peirce’s assertion that we engage in abduction when we have a surprising experience, they offer a much more complex and satisfying
    perspective than simply saying that an unusual conscious experience causes a new association to be formed.

    All the best – Phil Corlett & Paul Fletcher

  19. Dear Phil & Paul,

    We agree that Peirce’s assertion that we engage in abduction when we have a surprising experience is a key insight, and in various two-factor theory papers we have applied this insight to the explanation of a number of different delusions: Capgras, Cotard, Fregoli, Mirrored-self misidentification, Somatoparaphrenia and alien control delusion, for example. In our 2011 Annual Review of Psychology paper (and in other papers), we have proposed what the specific surprising experience is for each of these delusions - it is different for each delusion, of course, since each delusional belief has a different content.

    When you consider what we say about each delusion, would you want to say that the particular surprising experience we mention is sufficient for the delusion to be present? I think that is what a one-factor theory has to say, isn't it?

    But we painstakingly combed through the literature and were able to show, for every one of the different surprising experiences, that there exist reports of patients whose neuropsychological impairments would generate these experiences but who were not delusional. I have been making this point about Capgras but it is a general point about all of the delusions we have discussed. If someone believes that a one-factor account is viable for every one of these delusions, it behoves them to explain, for every one of these delusions, why it is that some people with the particular experience relevant to the delusion are delusional and some are not. Until a one-factor theorist succeeds in doing that, the only answer to this question that is on the table is this: in all cases, a second factor needs to be present if a delusion is to be present.

    The two-factor theory claims that this second factor must be the same in all of these different delusions. We propose that this common second factor is impaired belief evaluation, and that an impairment of right dorsolateral prefrontal cortex is associated with this second factor. There is evidence of damage to this particular region in a variety of the kinds of delusions we consider - not just in one of these e.g. not just in Capgras.


  20. Some belated thoughts on the discussion, which has been really interesting.

    Several comments up, Max asks what would account for the difference between people with monothematic delusions and people with the same experiences but without the delusion. For example, those with Capgras vs. the ventromedial patients.

    Max’s own answer is that the difference must be impaired belief evaluation – the same impairment across all the delusions, possibly related to right DLPFC damage. It seems that for each monothematic delusion, there are patients with different lesions, who have the experience in question but not the delusion.

    In contrast, the PEM-based answer would have to be that Max’s question makes an unwarranted assumption, namely that the experiences are the same. For example, it seems possible that the Capgras patient and the ventromedial patient have similar but not identical experiences. If that’s true, then the difference between them might not be in a second impairment but in the character of the experience. In particular, the experience in the ventromedial case might be associated with less precision, so may not be so surprising and therefore not require delusional update of the belief system.

    This is the type of story Phil, Paul and I suggest in earlier comments. It is very nicely consistent with risk, volatility and the other aspects of precision optimization we have mentioned. I don’t think Max’s account can rule this out.

    How likely is it that there are similar experiences associated with different precisions? It seems to me very likely. We know that perceptual inference is highly sensitive to precisions of prediction errors. This is clear from studies of multisensory integration such as my favorites, the ventriloquist and the rubber hand illusions. The precision of the sensory input, which is always evaluated relative to the expected precisions, can determine whether a person experiences the ventriloquist illusion or not. Different people can have very different experiences of the rubber hand even if the input is the same: one person experiences it in an almost delusional manner and the next person reports more inchoate experiences of fluid, uncertain touch. Again, in binocular rivalry, a main determinant in perceptual inference is the precision of the prediction error (cf. Levelt’s propositions).

    Even if it is granted that the offending experiences are different rather than identical, does that mean the experience in the delusional case is sufficient for the delusion? Max might argue that even if there is more precision in the prediction error in the delusional case, this does not explain why the delusional belief is not revised in the light of new evidence (and hence the need arises for a second impairment, in belief evaluation).

    Again PEM is resourceful. If the precise prediction error is chronic, then perceptual inference would be solidified and much harder to undermine with other evidence. In addition, as Phil also suggests, delusional belief likely happens when a new model of reality is selected. A toy example: I may have a coin toss model and expect heads and tails (or 1s and 2s). As per empirical Bayes, I adjust my expectations for the frequency of heads and tails according to the evidence of 1s and 2s that comes in. If my evidence begins to include 3s, 4s, 5s and 6s, then prediction error will accumulate and eventually I will reject the coin toss model and adopt a die throw model instead. It takes more evidence to overturn a model (die rather than coin) than to revise its parameters.
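    The coin-vs-die example can be run through as a small model-comparison calculation. This is my own illustrative sketch, not from Hohwy's book: the "coin" model expects only 1s and 2s, the "die" model expects 1 through 6, and a small assumed lapse rate keeps the coin model from assigning zero probability to impossible outcomes, so the evidence against it accumulates gradually rather than all at once.

```python
import math

# Toy model comparison for the coin-vs-die example (illustrative sketch).
EPS = 0.01  # assumed lapse rate, purely for illustration

def log_evidence(data, outcomes):
    """Log-likelihood of the data under a uniform model over `outcomes`."""
    n = len(outcomes)
    return sum(math.log((1 - EPS) / n if x in outcomes else EPS / max(6 - n, 1))
               for x in data)

coin = {1, 2}
die = {1, 2, 3, 4, 5, 6}

# While the evidence is only 1s and 2s, the coin model wins
# (it spreads its probability over fewer outcomes):
early = [1, 2, 2, 1, 2]
assert log_evidence(early, coin) > log_evidence(early, die)

# Once 3s-6s arrive, prediction error accumulates and the die model
# overtakes it: a change of model, not merely of parameters.
late = early + [3, 5, 6, 4, 3, 2, 6]
assert log_evidence(late, die) > log_evidence(late, coin)
```

Note how the crossover needs a run of anomalous outcomes, which mirrors the point in the text: overturning a model takes more evidence than revising its parameters.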

    That is, the high precision prediction error in the Capgras case forces a change of models, which automatically makes the new belief more impervious to other evidence. The difference between the Capgras patient and the ventromedial patient is then that the former adopts a new model in response to the precise prediction error and the latter merely updates the parameters of the old model (in terms of amplitude of the autonomic response, for example) in response to the less precise prediction error.

    1. Why are patients with ventromedial frontal damage and autonomic insensitivity to familiar faces not delusional when Capgras patients with autonomic insensitivity to familiar faces are delusional? To avoid having to postulate a second factor to answer this question, Jakob, Phil and Paul suggest that "it seems possible that the Capgras patient and the ventromedial patient have similar but not identical experiences". In the most recent study of Capgras delusion (Frontiers in Human Neuroscience), it was reported that the patient had medial frontal-lobe atrophy as well as the delusion. Assuming that this patient also had autonomic insensitivity to familiar faces (since that has been observed in all cases of Capgras delusion in which it has been investigated), then this Capgras patient and the Tranel patients had both medial frontal damage and autonomic insensitivity to familiar faces. Why then should their experiences be different? It is hard to see why this should be, and there is no evidence at all for such a difference. So this seems a difficult path to follow as a way of avoiding the postulation of a second factor in cases of Capgras delusion.

  21. This story is just another way of appealing to the hierarchy in empirical Bayes, where higher levels are posited to account for non-linearities in the input (e.g., the sudden shift from 1s and 2s to 1s-6s is explained by the coin tosser being substituted by a die thrower). Importantly, it seems irrational to abandon the die model if the evidence remains 1s-6s, even if someone says “but it really is a coin”.

    So an unusual experience is sufficient to produce a delusion. People who don’t have the delusion have different experiences (and, I believe, different lesions too), which deals with the cases Max reports.

    The benefit of this PEM story is that we don’t have to posit a domain-general impairment of belief evaluation, which would predict widespread delusions, inconsistent with the monothematic element. To avoid this prediction one would (as I think Max prefers) have to posit a difference in the salience of the delusion-inducing prediction error, making it a special case. (In response to Phil, Max says of the factor 2 impairment: “The system will fail to reject a belief only when there is persistent and strong evidence favouring this particular belief. In monothematic delusion there is only one belief of this kind (the belief that Factor 1 prompts)”.) But with that proposal, it seems to me we are back in familiar PEM territory, which parsimoniously needs just such differences to explain everything, without a Factor 2.

    So I don’t think there is convincing reason to abandon the one-factor, PEM style account. This is not to say that no pair of impairments could possibly explain some delusions in the way Max suggests. But as I see it, the evidence does not especially favour the two-factor story. The rDLPFC evidence might push in favour of Max here, but to my eye there is still some way to go in establishing this lesion across all cases, and in establishing that rDLPFC has this role. (For what it is worth, I am impressed by studies of right hemisphere lesions from Danckert’s lab suggesting involvement in model update.) I am also very open to there being individual differences in hierarchical inference, which can tip some but not all people into delusional belief (this is the approach we are taking in our studies of autism).

    As Phil rightly insists, there are diminishing returns on ruminating about all these conceptual accounts. Clearly new, direct evidence is needed. The holy grail in cognitive terms would be inducing delusions in healthy people by exposing them to prediction error and varying precisions. This could be done with the coin vs. die type case I explained above.

    In more neural terms, one might expect the damage in the Capgras and ventromedial cases to involve backwards connections mediating expected precisions. For example, when I see a loved one, I expect to experience high precision arousal, so the gain on arousal is turned up. If this mechanism is broken, then the arousal will be less than expected, giving rise to a strong prediction error. If the mechanism is merely damaged, then the prediction error may be noisier. It ought one day to be possible to distinguish such cases, perhaps with DCM, looking at the modulation of SCR and the involved areas of the brain in Capgras and ventromedial patients.

  22. I am not sure what the "abnormal" means in "abnormal prediction error". When a Capgras patient predicts that an autonomic response will occur and it does not, that prediction was erroneous, so a prediction error occurred. But was this an abnormal prediction error, and if so what is an example of a normal prediction error?

    The two-factor theory cannot survive without prediction error being invoked, since the reason the first factor works is always because it causes prediction errors. In our papers we have worked through a number of monothematic delusions (Capgras, Cotard, Fregoli, somatoparaphrenia, alien control, mirrored-self misidentification and others). For each one we have proposed a specific first factor, which is always a neuropsychological impairment that causes something to happen that the patient doesn't expect, i.e., doesn't predict: that is, the first factor causes a prediction error. The new (delusional) belief is an attempt to update the patient's web of belief so that it no longer generates the incorrect prediction.

    Then we attempt to show for all these examples that there are patients who have the relevant neuropsychological impairment but not the delusion. That leads us to argue that an additional impairment - the second factor - is needed if the patient is to adopt the delusional belief.

    Kengo suggests that I claim that there is no clear distinction between two-factor and prediction-error theories, but I don't claim that. What is the prediction-error theory for each of the delusions I list above, in the sense of a theory that works while claiming that there's only one factor involved, viz., prediction error? I don't think there are any such one-factor theories for any of these delusions.