Researchers have utilized Stable Diffusion, an AI image generator, in conjunction with fMRI data to reconstruct images from human brain activity, marking a significant advancement in decoding the visual cortex.
Abstract
The article discusses a groundbreaking approach where AI tools like Stable Diffusion are employed to interpret brain activity and reconstruct visual experiences. By leveraging fMRI scans, which capture blood flow changes in the brain corresponding to neural activity, scientists can now visualize what people are seeing or imagining. The process involves training AI models to generate visual and semantic representations from fMRI data, which are then refined by Stable Diffusion to produce high-resolution, semantically accurate images. This method, detailed in a paper by Shinji Nishimoto et al., represents a leap forward in the field of neuroimaging and has implications for understanding consciousness and developing technologies that can interface with the human mind.
Opinions
The author suggests that the integration of Stable Diffusion with fMRI data is a novel and exciting development in the realm of AI and neuroscience.
There is an acknowledgment that while Stable Diffusion does not directly decode brain activity, it significantly enhances the resolution and semantic accuracy of reconstructed images when used in conjunction with other AI models trained on fMRI data.
The article implies that the potential applications of this technology could extend far beyond current capabilities, hinting at future advancements that may allow AI to interpret human thoughts more literally.
The author expresses enthusiasm about the rapid progress in AI and neurotechnology, speculating that mind-reading capabilities may not be as
Can AI tools like Stable Diffusion Read Our Minds By Decoding Brain Activity?
To decipher our brains, we obviously need to record what’s going on in there. Fortunately, there is fMRI (short for functional magnetic resonance imaging), a tool that measures brain activity indirectly by measuring and visualizing blood flow in the brain: When the cells in a region are more active than usual, they need more energy, i.e. oxygen, and therefore the blood flow to that region changes.
Our brains constantly produce our conscious experience by interpreting a myriad of sensory information and aggregating layers of memories and subconscious needs into emotions, constructing the subjective experience of “being in a world”.
A daunting task indeed to look at this experience as a whole and try to make sense of all the brain activity that comes with it.
A much more motivating approach is to look at a single aspect of this conscious experience and find out how to decode the specific brain activity that corresponds to it.
For example, we know that brain activity is different depending on what we are doing: Different emotions cause our brains’ cells to become active in different ways, and different regions of the brain show increased activity depending on whether we talk, listen to music, meditate, etc.
One very well-studied part of the brain is the one that is particularly active when we are processing visual information.
Say hello to your visual cortex:
Left: visual cortex. Right: information streams observed to originate in the visual cortex (occurring during object recognition tasks and object location tasks). Source: Wikipedia (https://en.wikipedia.org/wiki/Visual_cortex)
Although it was already possible to reconstruct images from brain activity ten years ago, the results were low in resolution and often lacked semantic accuracy. Here’s an example of how AI used brain activity to reconstruct movie scenes that people were watching:
Now, what about the claim that Stable Diffusion reconstructed images from brain activity?
The corresponding paper by Nishimoto et al. was already published in December 2022 but recently attracted new attention when it was accepted at this year’s Conference on Computer Vision and Pattern Recognition (CVPR). It is a premiere nonetheless: For the first time in scientific research on the visual cortex, the open-source AI imaging tool Stable Diffusion was used to help decode fMRI scans.
How does Stable Diffusion decode the visual cortex?
Short answer: It doesn’t.
Long answer: Stable Diffusion is an AI model trained on image-text pairs drawn from LAION-5B, a dataset of roughly 5 billion such pairs. This training allows the model to capture relationships between text descriptions and the corresponding images. It was not trained on the relationship between visualizations of brain activity and the corresponding images presented to the subjects.
Let’s recap what Stable Diffusion does when it is generating images: it all starts with a random noise pattern. This noise pattern is then cleaned up by the AI model step-by-step until it looks like the thing an initial text prompt asked for.
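To make the "start from noise, clean it up step by step" idea concrete, here is a toy sketch of that loop. It is not Stable Diffusion: the function `toy_denoise` and its "denoiser" are hypothetical stand-ins that already know the target image, whereas the real model uses a trained neural network guided by the text prompt to estimate the noise.

```python
import numpy as np

def toy_denoise(target, steps=50, seed=0):
    """Start from random noise and remove a share of the remaining noise each step.

    Assumption: the 'denoiser' below cheats by looking at the target directly.
    A real diffusion model has to predict the noise with a learned network.
    """
    rng = np.random.default_rng(seed)
    x = rng.normal(size=target.shape)  # the initial random noise pattern
    errors = []
    for t in range(steps):
        predicted_noise = x - target             # stand-in for the learned noise estimate
        x = x - predicted_noise / (steps - t)    # remove a fraction of the remaining noise
        errors.append(float(np.abs(x - target).mean()))
    return x, errors

# A tiny 8x8 "image": a bright square on a dark background.
target = np.zeros((8, 8))
target[2:6, 2:6] = 1.0

image, errors = toy_denoise(target)
print(f"error after first step: {errors[0]:.3f}, after last step: {errors[-1]:.3f}")
```

The only point of the sketch is the shape of the loop: each pass removes a bit more noise, so the sample drifts from pure randomness toward a clean picture.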
Interestingly, Stable Diffusion has also been fine-tuned to generate music instead of images by teaching the model the relationship between visual representations of sounds (spectrograms) and their corresponding text descriptions.
Reading “High-resolution image reconstruction with latent diffusion models from human brain activity,” which is the title of the paper published by Nishimoto’s team, one could easily get the idea that the scientists taught Stable Diffusion how to interpret brain activity in much the same way the model has been taught to generate music.
But that is not what happened.
It sounds pretty cool though.
Nishimoto et al. took a dataset that included both the images that had been shown to people and the brain activity (fMRI scans) of those people while they were looking at those images. This data was then used to train two AI models that generate visual and semantic representations from the fMRI data. In a final step, they fed those preliminary representations to Stable Diffusion and found that it dramatically improved the reconstructions, producing images with higher resolution and better semantic fidelity.
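The two-decoder idea can be sketched roughly as follows. This is a minimal illustration on synthetic data, not the authors' code: the article does not say what kind of models were trained, so ridge regression is an assumption here, and every name and size (`fit_ridge`, `decode`, the voxel and feature counts) is hypothetical. The Stable Diffusion step is only marked by a comment.

```python
import numpy as np

def fit_ridge(X, Y, alpha=1.0):
    """Closed-form ridge regression: W = (X^T X + alpha * I)^-1 X^T Y."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ Y)

rng = np.random.default_rng(0)
n_train, n_voxels, d_visual, d_semantic = 200, 50, 16, 8  # made-up sizes

# Synthetic stand-ins for the real dataset: fMRI scans plus the visual and
# semantic representations of the images the subjects were looking at.
true_visual = rng.normal(size=(n_voxels, d_visual))
true_semantic = rng.normal(size=(n_voxels, d_semantic))
fmri_train = rng.normal(size=(n_train, n_voxels))
visual_train = fmri_train @ true_visual + 0.1 * rng.normal(size=(n_train, d_visual))
semantic_train = fmri_train @ true_semantic + 0.1 * rng.normal(size=(n_train, d_semantic))

# Two separate decoders, one per kind of representation.
W_visual = fit_ridge(fmri_train, visual_train)
W_semantic = fit_ridge(fmri_train, semantic_train)

def decode(fmri_scan):
    """Map an fMRI scan to its predicted (visual, semantic) representations."""
    return fmri_scan @ W_visual, fmri_scan @ W_semantic

new_scan = rng.normal(size=(1, n_voxels))
visual_rep, semantic_rep = decode(new_scan)
# In the real pipeline, these two representations would now be handed to
# Stable Diffusion, which turns them into the final image.
print(visual_rep.shape, semantic_rep.shape)
```

Note where the learning happens: only the two small decoders ever see brain data, while the image generator is used as it is.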