- Nov 24, 2025
Simulating Decoded Neurofeedback to "Fix" It
- Brendan Parsons, Ph.D., BCN
- Neurofeedback, Neuroscience, Artificial intelligence
The article by Olza, Santana and Soto introduces DecNefLab, a modular simulation framework designed to study decoded fMRI neurofeedback (DecNef) as a machine learning problem rather than just a clever neuroimaging trick. The framework offers fresh insight into how we design, test and interpret neurofeedback protocols, especially those that rely on complex decoders trained on high‑dimensional data.
In classic biofeedback, people see or hear a real‑time representation of their own physiology (for example EEG rhythms, heart rate variability or skin conductance) and learn, through practice and reinforcement, to shift it in a desired direction. Biofeedback is the broader term for using physiological signals from the body; neurofeedback is simply biofeedback that focuses on the brain. Decoded neurofeedback, or DecNef, takes this a step further: instead of feeding back a simple band power or regional activation, the system uses a machine‑learning decoder trained on fMRI data to detect a specific brain pattern and then covertly rewards the person when their brain activity matches it – often without telling them what the target actually is.
This is powerful, but also fragile. The paper emphasises long‑standing issues in DecNef: domain shift between the training and induction phases; the risk of reinforcing the “wrong” brain pattern because classifiers are imperfect; and the uncomfortable fact that many participants are labelled as non‑responders. At a deeper level, the authors challenge the so‑called “decoder’s dictum” – the idea that if we can decode something from brain activity, the brain must be representing it in a functionally meaningful way.
DecNefLab offers a way to explore all these issues safely in silico, by replacing the human participant with a generative model that has its own latent “cognitive” space and observable “brain‑like” signals.
Methods
DecNefLab is built around a simple but elegant idea: instead of scanning a real human, use a latent variable generative model as an artificial “participant”. Internally, this model lives in a low‑dimensional latent space that stands in for cognitive states; externally, it generates observable data (for example images or synthetic fMRI‑like patterns) that a classifier can read, just as an fMRI scanner would in real life.
The core components are:
- A generator G with an encoder and decoder. The latent space Z (the “cognitive” space) is mapped to a data space X (the “brain‑like” space) via the decoder. Here, the authors used a variational autoencoder (VAE) trained on the FASHION‑MNIST image dataset, but the same approach could be applied to other modalities.
- A probabilistic classifier D, trained on labelled data from a target class and an alternative class (for example, “T‑shirt/top” versus “trouser” or “dress”). This mirrors the typical DecNef pipeline where an fMRI decoder is trained to distinguish a target brain pattern from some comparison condition.
- A learning rule L that describes how the artificial participant updates its latent state from moment to moment based on feedback.
The VAE learns a two‑dimensional latent space in which similar images lie close to each other. The authors define “latent prototypes” for each class by averaging the encoded representations of labelled examples. These prototypes let them initialise simulations from different parts of the cognitive space and systematically probe how starting conditions affect learning.
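As a rough sketch of how that step might look in code, assuming a trained encoder function encode that returns the VAE's latent mean for an image (the function names and shapes here are illustrative, not taken from the authors' implementation):

```python
import numpy as np

def class_prototypes(images, labels, encode):
    """Per-class latent prototypes: the mean encoded representation of each labelled class."""
    z = np.stack([encode(img) for img in images])   # shape (n_samples, latent_dim), e.g. latent_dim = 2
    labels = np.asarray(labels)
    return {c: z[labels == c].mean(axis=0) for c in np.unique(labels)}
```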
For the classifier, they train a convolutional neural network on binary problems: target (T‑shirt/top) versus alternative (either trouser or dress). The classifier outputs p(y = target | x), which is then turned into a feedback signal.
The learning rule L is intentionally simple but neuro‑plausible in spirit. The agent’s next latent state z_{t+1} is a mixture of its current state and a random exploratory move whose variance shrinks when reward is high and grows when reward is low. If feedback drops sharply, the agent “regrets” that move and jumps back toward the previous state before trying again. Parameters representing trust in feedback and reactivity (impulsivity) control how strongly these adjustments are applied.
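A minimal sketch of one such update, assuming a 2‑D latent state and a scalar feedback r in [0, 1] taken from the classifier's target probability; the parameter names (trust, impulsivity, base_sigma, regret_drop) and the exact arithmetic are illustrative stand‑ins rather than the paper's precise formulation:

```python
import numpy as np

def learning_step(z, z_prev, r, r_prev, rng,
                  trust=0.7, impulsivity=0.5, base_sigma=0.5, regret_drop=0.2):
    """One update of the artificial participant's latent state.

    Exploration noise shrinks when the current feedback r is high and grows when
    it is low; a sharp drop relative to the previous feedback r_prev triggers a
    "regret" jump back toward the previous state z_prev.
    """
    if r_prev - r > regret_drop:
        return z + impulsivity * (z_prev - z)          # regret: retreat toward the last latent state
    sigma = base_sigma * (1.0 - trust * r)             # high reward -> smaller exploratory moves
    return z + impulsivity * rng.normal(0.0, sigma, size=z.shape)
```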
The authors then simulate many DecNef sessions by:
- Picking different initial latent states (z0) sampled around each class prototype.
- Running 10 independent trajectories from each starting point to capture randomness.
- Comparing true DecNef feedback (from the classifier) with a control condition where feedback is random and unrelated to the agent’s state.
This setup allows them to ask: how do classifier design, initial state, and stochastic exploration shape both feedback trajectories and the hidden latent trajectories over time?
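Putting the pieces together, a toy end‑to‑end version of this loop might look like the sketch below. It reuses the learning_step function above and replaces the decoder‑plus‑classifier pipeline with a simple Gaussian "reward bump" in latent space; the bump location, step counts and starting point are assumptions made purely for illustration:

```python
import numpy as np

def target_prob(z, target=np.array([1.0, 1.0]), width=1.0):
    """Stand-in for decoder + classifier: probability of the target class at latent state z."""
    return float(np.exp(-np.sum((z - target) ** 2) / (2 * width ** 2)))

def run_trajectory(z0, n_steps=100, random_feedback=False, seed=0):
    """Simulate one session; return the true target probability along the latent trajectory."""
    rng = np.random.default_rng(seed)
    z = z_prev = np.array(z0, dtype=float)
    r_prev, true_probs = 0.0, []
    for _ in range(n_steps):
        p = target_prob(z)                            # what the decoder would "see" at this state
        r = rng.uniform() if random_feedback else p   # what the agent is actually rewarded with
        z_prev, z, r_prev = z, learning_step(z, z_prev, r, r_prev, rng), r
        true_probs.append(p)
    return np.array(true_probs)

# Ten independent runs from the same low-reward start, with true feedback and with the random control.
start = np.array([-1.0, -1.0])
learned = [run_trajectory(start, seed=i)[-10:].mean() for i in range(10)]
control = [run_trajectory(start, seed=i, random_feedback=True)[-10:].mean() for i in range(10)]
print("late-session target probability, true vs. random feedback:",
      round(float(np.mean(learned)), 3), round(float(np.mean(control)), 3))
```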
Results
The simulations reveal several subtle but clinically relevant points about decoded neurofeedback.
First, the choice of alternative class turns out to be crucial. When the classifier is trained to distinguish “T‑shirt/top” (target) from “trouser” (alternative), the resulting decision landscape in latent space is quite forgiving: many regions of the generator manifold are assigned high target probability, even when the generated image is clearly not a T‑shirt. In contrast, when the alternative class is “dress”, the classifier becomes more conservative: high target probabilities are confined to a much smaller region, and many images (including shoes and bags) are assigned low T‑shirt probability. Yet in both cases, the agent only sees a scalar reward, not the underlying landscape.
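One way to see this in practice is to sweep a grid over the 2‑D latent space, decode each point, and record the classifier's target probability; re‑running the sweep with a classifier trained against a different alternative class shows how much the landscape changes. The decode and p_target functions below are hypothetical stand‑ins for a trained generator and classifier:

```python
import numpy as np

def decision_landscape(decode, p_target, lo=-3.0, hi=3.0, n=50):
    """Map the classifier's target-class probability over a grid of 2-D latent states.

    decode: latent vector -> generated image; p_target: image -> P(target class).
    A "forgiving" classifier assigns high probability over a broad region; a
    "conservative" one confines high probability to a small patch.
    """
    grid = np.linspace(lo, hi, n)
    return np.array([[p_target(decode(np.array([x, y]))) for x in grid] for y in grid])
```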
This leads to the second key result: high feedback does not necessarily mean the agent is evoking the intended target pattern. Because the classifier operates outside its training distribution and is shaped by the chosen alternative class, the agent can receive strong reinforcement for cognitive states that are far from the “true” target region. In other words, DecNef can encourage maladaptive learning in which the system and the participant are both “happy”, but the underlying representation is wrong.
Third, initial conditions and randomness matter a lot for whether an agent appears to be a “responder”. Simulations launched from low‑reward starting points tend to explore widely and eventually climb toward higher feedback regions, while those starting already in high‑reward zones explore very little. Some trajectories show steadily increasing feedback; others stagnate or even decline, despite identical parameters. The very same artificial participant – same learning rule, same generator – can look like a successful learner or a non‑responder depending purely on where it started and how random fluctuations unfolded.
The visualisation of latent trajectories makes this stark: in some regions, reward‑seeking trajectories fan out and drift away from the actual target prototype, even as feedback climbs. Controls with random feedback show probabilities drifting back to chance, confirming that the observed “learning” in the main simulations truly depends on the link between classifier and generator, not just noise.
The overarching message is that DecNef can be overly optimistic (reinforcing the wrong thing) or overly pessimistic (making a capable learner look like a non‑responder) based purely on methodological choices.
Discussion
Stepping back, DecNefLab offers more than just a clever toy model. It provides a conceptual and practical bridge between machine learning, cognitive neuroscience and neurofeedback practice. By making the internal cognitive trajectory visible, it exposes a core problem: in real DecNef, we never actually see the latent state. We only see the proxy – fMRI activation patterns, decoded by a classifier that is itself limited and biased.
The simulations demonstrate how three factors interact:
- Classifier design, especially the alternative class, shapes the reward landscape.
- Initial cognitive state biases how much exploration occurs and in which directions.
- Random fluctuations in exploratory moves can tip the learner toward or away from regions that yield high feedback.
In a clinical or research setting, all three are typically hidden inside the label “responded well to neurofeedback” or “didn’t learn”. DecNefLab suggests a gentler interpretation: many so‑called non‑responders may simply have been placed in an unfriendly landscape, starting from an unlucky state, with a decoder that reinforces odd corners of representational space.
For people considering or receiving neurofeedback, this matters because it highlights that difficulty learning is not a sign of weakness or failure. The system–person interaction is complex, and protocol design can subtly handicap some individuals. Reframing “non‑response” as partly a design problem rather than a personal limitation can be deeply relieving – and practically, it nudges clinicians to adjust parameters, not just expectations.
For professionals who refer patients for neurofeedback, the work underscores how different closed‑loop designs can be. DecNef uses implicit reinforcement and high‑dimensional decoders; EEG‑based neurofeedback more often uses explicit instructions, simpler metrics (like sensorimotor rhythm or alpha power), and long‑term repetition to consolidate change. Yet the same underlying questions apply: What exactly is being reinforced? How sensitive is the metric to noise, artefact and context? And does increasing the metric truly reflect a healthier functional state, or just a clever route to more points on the screen?
For neurofeedback practitioners, the DecNefLab results map nicely onto everyday clinical dilemmas. Consider trying to up‑train sensorimotor rhythm (SMR) around 12–15 Hz at sites such as Cz or C4 to support behavioural inhibition and stabilise attention. If the reward threshold is too easy, clients may quickly learn behaviours (or produce artefacts) that raise SMR without meaningful self‑regulation. If thresholds are too strict, sessions feel punishing and exploration collapses. Similarly, in protocols that aim to enhance posterior alpha (for example 8–12 Hz at Pz or O1/O2) for anxiety reduction and relaxation, some clients start in a chronically low‑alpha state where exploratory “wiggling” of their brain state is both necessary and noisy. Others walk in already in a fairly high‑alpha, dissociative pattern where the system happily rewards staying stuck.
The interpretive contribution of the paper is to challenge the “decoder’s dictum”: the assumption that if something can be decoded from brain activity, then it is functionally represented and causally relevant. In practice, this means we need to be humble about what feedback signals actually mean. A beautifully tuned classifier with excellent cross‑validation scores may still be reinforcing surrogate features – the neural equivalent of a child learning that smiling at a teacher produces praise, regardless of actual understanding.
Methodologically, DecNefLab invites a more cautious, iterative workflow: design a protocol, simulate it with a variety of artificial participants, and examine whether reward maximisation corresponds to movement toward the intended latent state. If not, change something before ever putting a human in the scanner or the EEG cap. This mindset is particularly attractive in expensive and time‑consuming settings such as fMRI DecNef, but it also resonates with EEG neurofeedback, where protocol tinkering is often done informally on the fly.
The broader neuroscience message is that causal understanding demands more than decodability. To claim that a feedback signal taps into a meaningful representation, we should be able to show that pushing that signal up or down reliably shifts behaviour, experience or downstream network dynamics in theoretically coherent ways. Simulation frameworks like DecNefLab help by letting us peek behind the curtain and ask: if we wired up the system this way, would we actually be teaching the brain what we think we are?
Brendan’s perspective
Reading this paper, I kept thinking of the clients who sit down in front of a neurofeedback screen, try their hardest, and still leave feeling like they have somehow “failed” the training. DecNefLab offers a very compassionate counter‑story: sometimes the landscape itself is sabotaging them.
Even though this work is centred on decoded fMRI neurofeedback, there are direct lessons for everyday EEG‑based practice.
First, the idea that classifier design shapes the reward landscape maps cleanly onto how we set targets using EEG. In DecNef, the critical choice is the alternative class used to train the decoder; in EEG neurofeedback, the analogue is how we define the “not‑target”: which frequencies we inhibit, where we place electrodes, how we treat artefact, how we set thresholds. If we decide to train SMR (12–15 Hz) at Cz for a child with ADHD, but the child also has high frontocentral muscle tension, the system may quietly learn “tense your jaw and neck” as the quickest path to reward. On paper, SMR goes up; in practice, we’ve rewarded the wrong latent state. (To be clear, the training screens I designed and use control for this, but some don't!)
A practical takeaway from DecNefLab is to think explicitly in terms of the underlying cognitive or functional state we want to reinforce, not just the EEG signature. For SMR training aimed at behavioural inhibition and motor stillness, that might mean combining 12–15 Hz reward with inhibition of high‑frequency EMG artefact (for example 25–40 Hz) at the same or adjacent sites, and monitoring video or accelerometry to keep an eye on artefacts. For alpha up‑training (8–12 Hz) to support relaxation in anxious adults, it might mean placing sensors at Pz or O1/O2, but also watching that we are not driving the person into a hypoaroused, dissociative state – here, coupling alpha reward with minimum beta engagement, or with heart rate variability biofeedback, can nudge the system toward a calm‑but‑present state.
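As a very rough illustration of that "reward one band, inhibit another" logic (not any particular clinical system; the sampling rate, epoch length, thresholds and band edges here are all assumptions for the example):

```python
import numpy as np
from scipy.signal import welch

def band_power(epoch, fs, lo, hi):
    """Mean power spectral density of a single-channel EEG epoch within a frequency band."""
    freqs, psd = welch(epoch, fs=fs, nperseg=min(len(epoch), fs))
    band = (freqs >= lo) & (freqs <= hi)
    return float(psd[band].mean())

def smr_reward(epoch, fs=250, smr_threshold=1.0, emg_threshold=0.5):
    """Reward only when SMR (12-15 Hz) exceeds its threshold AND 25-40 Hz (EMG-range) power stays low."""
    smr = band_power(epoch, fs, 12, 15)
    emg = band_power(epoch, fs, 25, 40)
    return smr > smr_threshold and emg < emg_threshold
```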
Second, the simulations around initial conditions and exploration remind us that starting state matters. In the paper, agents beginning in low‑reward zones had to explore widely to find better regions; some made it, some did not. In clinic, someone arriving with extreme hyperarousal, high beta and low alpha, or with chronic pain and constant interoceptive alarm, is effectively starting in a hostile latent region. Expecting smooth learning curves from session one is unrealistic.
In practical terms, this argues for gentle on‑ramps. Before diving into demanding protocols, it can be useful to stabilise physiology with simpler, more forgiving feedback: for example, heart rate variability biofeedback to establish basic autonomic flexibility, or broad‑band amplitude training with generous thresholds that reward any reduction in background noise. Once the system has had a chance to “climb out” of the worst part of the landscape, more finely tuned EEG protocols can be introduced.
Third, randomness in DecNefLab corresponds, in the real world, to the messy fluctuations of attention, mood, sleep and life. The same protocol can look brilliant on Monday and flat on Friday. The temptation is to over‑interpret each short run: “this patient can’t learn”, “this protocol doesn’t work”. The simulation framework nudges us toward a more statistical mindset: look at patterns across many short trajectories, not just one. In practice, that means evaluating change across multiple sessions and combining different outcome markers – subjective reports, behavioural measures, and neurophysiological trends – rather than just eyeballing whether the on‑screen bar went up.
Another theme I take from this work is humility about the meaning of our feedback signals. DecNefLab directly challenges the decoder’s dictum; EEG neurofeedback has its own version: if theta/beta ratio moves toward “normal”, we assume the underlying regulation has improved. But just as the simulated agent can game the fMRI decoder, a clever nervous system can game our EEG thresholds. This is one reason I am a fan of protocol individualisation: using quantitative EEG (qEEG), but not as a rigid recipe; adjusting frequency bands and electrode sites in response to the person’s lived experience; and being willing to move away from canonical placements when the data and the client’s story suggest a different target.
From a broader practice perspective, tools like DecNefLab encourage us to think in layers. At the surface: the display, the numbers, the thresholds. Underneath: the patterns of brain activity (EEG, fMRI, etc.). Underneath that: the latent cognitive and emotional states that those patterns imperfectly express. And further down still: the person’s history, expectations and relational context. Good neurofeedback tries to align all those layers so that reward is consistently paired with states that are meaningful, helpful and sustainable in the client’s life.
Finally, I love the idea of simulation as a safety net. Before committing someone to an intensive, costly intervention – whether high‑field fMRI DecNef or a long course of EEG training – we could prototype the protocol in a model, deliberately stress‑testing it with “difficult” learners and noisy data. While current simulations will never capture the full richness of human experience, they can at least flag protocols that are especially prone to reinforcing odd corners of the landscape or producing many apparent non‑responders.
In short, this paper reinforces a core clinical intuition: when neurofeedback “doesn’t work”, our first question should not be “what is wrong with this client?”, but “what in this landscape – the signal, the thresholds, the context – might be working against them?”
Conclusion
DecNefLab offers a fresh way to think about decoded neurofeedback and, by extension, about any closed‑loop brain‑ or body‑based training. By formalising DecNef as a machine learning problem with a visible latent space, the authors show how easily feedback can be decoupled from the intended target, and how design choices around classifiers, starting states and learning rules can create both illusory successes and illusory failures.
For clinicians and researchers alike, the key message is that feedback quality is not just about signal‑to‑noise ratio or classification accuracy; it is about whether reward reliably pulls the system toward genuinely helpful states. Simulation frameworks like DecNefLab give us a way to test that question before we recruit participants, adjust protocols in light of hidden dynamics, and develop more compassionate interpretations of “non‑response”.
For everyday neurofeedback practice, especially with EEG, the implications are both sobering and hopeful. Sobering, because they reveal how easy it is to reinforce the wrong thing; hopeful, because they suggest concrete ways to improve – better target definitions, more nuanced thresholds, multimodal support, and an openness to revising our assumptions when the landscape proves unfriendly.
Ultimately, aligning reinforcement with meaningful brain and body states is the heart of effective neurofeedback, and tools like DecNefLab help us take that alignment more seriously.
Reference
Olza, A., Santana, R., & Soto, D. (2025). DecNefLab: A modular and interpretable simulation framework for decoded neurofeedback. arXiv. https://arxiv.org/abs/2511.14555