For a class that I took over the summer, we had to write a secondary research paper on a topic of our choice. I decided to write about the brain's predictive capabilities and how the predictions that our visual system makes affect how we appreciate art.
In 2018, Robert Pepperell of Cardiff Metropolitan University created a 72 cm by 72 cm acrylic painting on a round panel that he titled “The Orange Problem”.
The painting is a plain circle with its title in simple block letters in the center. The panel was painted using pigments that appear almost fluorescent, reflecting light waves in the 590 to 635 nanometer range of the visible spectrum, which we see as two different shades of orange. However, neither the light it reflects nor the paint used is actually orange. The painting itself as a physical object is also completely colorless (Pepperell, 2019). Using his painting, Pepperell wanted to explore the gap between the sensations that our eyes receive and what our nervous system and brain interpret. What is different between what we sense and what we perceive? How does our brain actually understand visual cues, like those in art and painting? Does it make these perceptual decisions in real time, or does it predict what it will see ahead of time based on the situation? Do all people perceive the same thing?
One of the first to explore these ideas was neurobiologist Johannes Müller. In the early 19th century, he proposed in his theory of “specific energies” that the characteristics of pure sensation, including colors, smells, sounds, and flavors, are at the simplest level only electrical impulses travelling through the nervous system (Pepperell, 2019). Yet as Hermann von Helmholtz, a physicist and physician from Germany, expressed in 1878, even when we know that our brains are essentially creating our own personal world inside our heads, the illusion does not disappear, because this is just “the primary and fundamental truth” of our world (Pepperell, 2019).
Early psychological models of art experience assumed that the human brain and perceptual system follow bottom-up processing almost exclusively (Kesner, 2014). In bottom-up processing, the brain takes in sensory information in pieces and builds a general perception as the information comes in, starting with the details before zooming out. This explanation completely neglects the opposite, top-down processing, in which the brain uses general ideas and the context of the situation in order to understand the details. The human perceptual system also tries to differentiate between objects by enhancing contours and edges. Humans are innately highly sensitive to contrasts and changes in sensation, but virtually indifferent to continuity or homogeneity. The visual system is so attuned to boundaries that humans can see them when they are not there, as in the Kanizsa triangle optical illusion, where the brain perceives a triangle even though only the three corners of one are shown (Pepperell, 2019). The study of how humans understand boundaries dates back to the 1400s with Leonardo da Vinci. He called a boundary the “line of invisible thickness” because he felt that boundaries were logically contradictory-- perceiving an object with a boundary is like perceiving what the object is and what it isn’t at the same time (Pepperell, 2019).
People often believe that their visual perception is veridical, meaning a completely accurate representation of the facts of reality. But as numerous optical illusions show, people don’t always accurately understand what is in front of them. The beauty of art, as Robert Pepperell describes, is that “at the core of each picture, then, lies a contradiction or dichotomy that can never be resolved if we are to perceive it as a picture” (Pepperell, 2019). Art philosopher Richard Wollheim described this phenomenon, the difference between reality and what is depicted, as the twofoldness of pictures. This contradiction is what prevents visual perception from being truly veridical. But if it’s not veridical, are humans creating what they see out of thin air? How much of what people see is based in fact? Many consider visual perception to be a “controlled hallucination” because it is not entirely non-veridical either-- understanding the visual world is based partially on the object world, partially on personal experiences and prior knowledge, and partially on what the person does not already know (the informative).
The visual system is optimized to recognize individual objects rapidly and effectively-- immediate classification where people are sure about what they see. Objects that resist this classification are considered semantically indeterminate. This is often true for abstract art, where the audience does not immediately know what the painting is depicting. When there is no immediate interpretation, people often search the painting for specific details that fit the cues. This is just a slower and more conscious version of what the visual system normally does in the retina and in the cortex, even with easily identifiable objects. The purpose of this process is to extract the primitive elements of each object, including its corners and edges, and to emphasize contrasts in brightness and color.
Although the brain has historically been considered to passively interpret sensory information as it enters the visual cortex, newer research suggests that the brain actively predicts potential sensory inputs, and that these predictions are constantly updated and compared to incoming sensations using both top-down and bottom-up processing. This is known as the constructivist formulation. The brain’s predictive power, and how it affects what the brain focuses on, is believed to be explained by predictive coding and prediction errors, the Bayesian approach, and the cancellation theories. Each of these ideas about how the brain perceives the visual world has one common goal: optimize the brain’s function given the limited capacity of the visual sensory system. Building on what the predicted perception of a sensory input is, scientists have tried to understand why certain pieces of art are more appealing than others.
The whole goal of predictive coding is to actively anticipate sensory inputs by making a series of comparisons. Instead of directly representing the input, this method represents the difference or the ratio between the prediction and the actual sensory input, known as the prediction error. Put another way, prediction error is the difference between bottom-up input and the top-down predictions of that input. The resulting prediction error is then represented in the neuronal response. The predictions of what visual input the brain expects to get are based on the context of the situation and the individual’s prior experience. The comparison is made between the “signals ascending from the sensory cortex (or subcortical structure such as the thalamus)” (Picard and Friston, 2014) and “descending predictions to form errors” (Picard and Friston, 2014). According to the predictive coding model, humans try to infer or hypothesize the causes of their sensations ahead of time through “multi-level generative models of the world” (Kesner, 2014). Using the values of the prediction error, the brain constantly generates and edits a model of the environment. These predictions even have hierarchically decomposed representations for each of the features that are needed for perception, including form, color, and movement.
Specifically, the “prediction errors ascend to higher cortical regions to update posterior expectations and improve top-down predictions” (Picard and Friston, 2014), which were previously not considered important in understanding visual information. In addition, these ascending prediction errors provide a continuous stream of feedback that confirms or rejects the perceptual hypotheses: a system of constant comparisons. For instance, the spatio-temporal receptive fields of retinal ganglion cells in the eyes use the past and the surroundings in order to predict the light intensity currently in the center. Then, the same cells transmit the prediction error, which in this case is the difference between the measured light intensity and the prediction, to the visual cortex (Aitchison and Lengyel, 2017).
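To make the retinal example more concrete, here is a minimal sketch in Python. It predicts each point's light intensity from the average of its two neighbors and passes along only the difference. The one-dimensional "image", the two-neighbor surround, and the function name `center_surround_error` are my own assumptions for illustration; this is not the model described by Aitchison and Lengyel (2017).

```python
def center_surround_error(intensities):
    """Return the prediction error at each interior position.

    Assumption: the prediction for a position is the mean of its two
    neighbors (a stand-in for the "surround"). The error is the measured
    intensity minus that prediction.
    """
    errors = []
    for i in range(1, len(intensities) - 1):
        predicted = (intensities[i - 1] + intensities[i + 1]) / 2
        errors.append(intensities[i] - predicted)
    return errors


# A uniform dark patch followed by a uniform bright patch (a step edge).
signal = [1.0, 1.0, 1.0, 1.0, 5.0, 5.0, 5.0, 5.0]
errors = center_surround_error(signal)
print(errors)  # [0.0, 0.0, -2.0, 2.0, 0.0, 0.0]
```

In this toy case the uniform regions produce no error at all, and only the two positions on either side of the edge send a signal, which fits the earlier point that the visual system is sensitive to contrast and nearly indifferent to homogeneity.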
Focusing on the mechanisms behind predictive coding in the visual system, superficial pyramidal cells in the upper layers of the cortex compare, at each hierarchical level, the sensory input or expectations arriving from the lower level with the top-down predictions from the deep pyramidal cells of the higher level. The resulting prediction error is returned to the deep pyramidal cells in order to change the predictions. Descending connections within the brain typically convey predictions. Ascending driving connections convey prediction errors. Each descending, or top-down, prediction is matched or reciprocated by an ascending, or bottom-up, prediction error in order to constrain the prediction based on the actual sensations present (Picard and Friston, 2014). Essentially, the brain comes up with its own ideas of what to expect, but it is constantly checking these ideas against the real sensations surrounding the subject.
There is also neuromodulatory gating, or gain control, mediated by NMDA receptors (glutamate-gated ion channels) on superficial pyramidal cells, which determines their relative influence on the deep pyramidal cells that encode the expectations. This helps to weight the prediction errors by their expected precision. This precision weighting is also believed to be implemented through top-down processing. If the precision weights are mismatched, errors could be amplified, especially those that are weighted more heavily, since “ascending precision-weighted prediction errors report what is newsworthy in the incoming signal-- not the sensory signal per se” (Picard and Friston, 2014). The brain wants to pay attention to the most important or “newsworthy” signals-- so it places special weight on them. If this weighting system doesn’t hold up,
Since the brain is constantly trying to minimize the prediction error, it does so at each level of the cortical and subcortical hierarchy, and even at the level of peripheral reflexes. The prediction error is directly resolved by the inhibitory interneurons that are activated by descending predictions. As this occurs, the response to sensory inputs begins to decline. When the internal processes that generate the predictions within the brain align with the external processes that generate sensations, the brain is minimizing prediction error because the internal processes have begun to emulate the external ones (Picard and Friston, 2014).
This sort of visual predictive coding is believed to have evolved because of its adaptive value. It allows animals and humans to prepare for events or threats that might come in the future, so that they can anticipate and compensate. This is all with the goal of maintaining homeostasis. If an outside stimulus aligns with the prediction, then there is no need to react, and the neural resources that the brain allocates to that stimulus are minimized (Van de Cruys and Wagemans, 2011). In short, predictive coding allows the brain to use external stimuli, together with the subject’s past experiences with similar stimuli, to predict what it expects to see or perceive. The brain optimizes its function by constantly comparing the prediction to the actual sensations in order to find patterns that minimize the difference, or error, between the two.
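The loop of predicting, measuring the error, and updating can be sketched in a few lines of Python. This is only a toy version under my own assumptions: a single number stands in for the sensory input, and `precision` is a hypothetical fixed gain between 0 and 1 that stands in for the precision weighting of prediction errors. It is not the cortical model from Picard and Friston (2014).

```python
def run_predictive_loop(inputs, precision=0.5, initial_prediction=0.0):
    """Update a single prediction from a stream of inputs.

    precision is a hypothetical gain (0 to 1) that scales how strongly
    each prediction error changes the prediction.
    Returns the final prediction and the absolute error at each step.
    """
    prediction = initial_prediction
    abs_errors = []
    for sensed in inputs:
        error = sensed - prediction        # ascending prediction error
        abs_errors.append(abs(error))
        prediction += precision * error    # revise the descending prediction
    return prediction, abs_errors


# A constant stimulus: the error shrinks as the prediction catches up.
final_prediction, abs_errors = run_predictive_loop([10.0] * 6, precision=0.5)
print(abs_errors)        # [10.0, 5.0, 2.5, 1.25, 0.625, 0.3125]
print(final_prediction)  # 9.84375
```

With a constant stimulus the error halves at every step, which mirrors the idea that the response to a sensory input declines once the internal prediction starts to emulate the external signal. A higher `precision` makes the prediction follow each error more closely.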
In addition, there is also a predictive processing idea known as the Bayesian approach. The Bayesian approach tries to capture the probability of each of the predictions the brain makes. It continually updates these probabilities based on new sensory information that comes into the visual cortex. Bayesian inference uses the current input data to compute the “posterior probability of each latent cause by multiplying the prior probability of each potential setting for the latents, with the likelihood, the probability of receiving the current sensory input under that setting of the latents” (Aitchison and Lengyel, 2017). These predictions bias perception toward what humans expect in order to “generate veridical perceptual experiences rapidly in an inherently ambiguous sensory world” (Press et al., 2019).
In general, the difference between the two is that predictive coding makes comparisons between internal and external factors to minimize error, whereas the Bayesian approach assigns probabilities to the possible predictions based on external factors.
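The multiplication described in that quote is Bayes's rule, and a small Python sketch shows the arithmetic. The two latent causes ("convex" and "concave", for an ambiguous image of a face) and every number below are made up for illustration; the function name `posterior` is also my own.

```python
def posterior(prior, likelihood):
    """Multiply prior by likelihood for each latent cause, then normalize."""
    unnormalized = {cause: prior[cause] * likelihood[cause] for cause in prior}
    total = sum(unnormalized.values())
    return {cause: value / total for cause, value in unnormalized.items()}


# Hypothetical numbers: past experience strongly favors convex faces,
# while the current image is slightly better explained by a concave one.
prior = {"convex": 0.9, "concave": 0.1}
likelihood = {"convex": 0.4, "concave": 0.6}
belief = posterior(prior, likelihood)
print(belief)
```

Even though the sensory evidence leans toward "concave" in this made-up case, the strong prior keeps "convex" as the most probable interpretation (roughly 0.86 versus 0.14), which is the sense in which expectation can dominate an ambiguous input.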
The simplest way to represent probabilities directly is through the firing rates of neurons. In this case, the firing rate of each neuron would represent the posterior probability from Bayes’s Theorem (the probability that an event is going to occur given all of the background information) of one possible value of the latent variable, which is inferred from the observed variables. To compute the firing rate, the neurons would need to multiply their input, which represents the likelihood, by the prediction, which represents the prior. However, since multiplication is considered more difficult for neurons to implement than addition, the firing rates could instead represent log-probabilities. In either case, whether probability or log-probability, the parameters of the posterior probability rely on the neuronal responses.
Another way to implement the Bayesian model is through direct variable coding. In this case, the neural activity, or the rate at which the neurons fire, represents the latent variables themselves. Latent variables often correspond to the intensity of different objects within an image, so in direct variable coding, the neural responses would directly code for the intensity. Much like rate coding in psychobiology, where the more intense the stimulus, the more rapidly the neurons fire, a large or small response would mean that the intensity of the feature is high or low, respectively. If there is no response, the feature that the neuron represents is absent (Aitchison and Lengyel, 2017). Through weighting, much as in predictive coding, the brain focuses on the lower-probability inputs that require more attention in order to optimize its function as much as possible. Prediction helps determine what the brain needs to focus on.
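The point about log-probabilities rests on a simple identity: the logarithm of a product is the sum of the logarithms, so the multiplication of prior and likelihood becomes an addition. Here is a minimal Python sketch, with the same kind of made-up numbers as any toy example (the cause names and values are assumptions, not data from the sources).

```python
import math

# Hypothetical prior and likelihood for two latent causes.
prior = {"convex": 0.9, "concave": 0.1}
likelihood = {"convex": 0.4, "concave": 0.6}


def log_posterior(prior, likelihood):
    """Add log-prior and log-likelihood, then normalize in log space."""
    scores = {c: math.log(prior[c]) + math.log(likelihood[c]) for c in prior}
    # Normalization still needs a sum of exponentials; only the
    # prior-times-likelihood step has been turned into an addition.
    log_total = math.log(sum(math.exp(s) for s in scores.values()))
    return {c: s - log_total for c, s in scores.items()}


log_belief = log_posterior(prior, likelihood)
belief = {c: math.exp(v) for c, v in log_belief.items()}
print(belief)
```

Converting the result back with `math.exp` gives the same posterior as multiplying the probabilities directly, so nothing is lost by working with sums instead of products.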
In addition, Bayesian theories hold that perception is biased towards what people expect, which is thereby more likely to be true, and that this bias increases the gain on expected units relative to unexpected ones. The higher contrast between the two implies stronger sensory representations of what is expected, and it helps explain the illusions that arise when typical regularities are disrupted. One example is perceiving concave faces as convex.
Regardless of what the neural firing represents, the Bayesian approach in general suggests that humans’ perception of the world is dominated mainly by what they expect given the environment, and that the brain uses those expectations to create probabilities of possible outcomes. However, this approach still aligns with the idea that vision is somewhat veridical, because it relies heavily on understanding the true state of the world solely through what the person already knows (Press et al., 2019).
On the other hand, the cancellation theories look more at informative perception, exploring how the brain focuses on what it does not already know. Cancellation theories, which are also known as dampening theories, propose that the brain copes with the limitations of its sensory systems by focusing on the information that is the most informative. This way the brain does not dwell on what it already knows, nor does it spend time developing predictions from what it knows, as predictive coding and the Bayesian approach do; rather, it updates and determines information about unexpected sensory inputs. This is done by suppressing any processing of expected inputs, not facilitating it as predictive coding and the Bayesian approach do (Press et al., 2019). For instance, when drinking a glass of water, the brain will ideally reduce its processing of the predicted sensation of holding the glass. This frees the subject to detect unexpected changes, like the glass breaking, which the person does not know will happen but is a possibility. Cancellation theories still hold that the brain is making predictions of what might happen, but based on information that is not necessarily there. This allows the brain to focus solely on the unexpected rather than the expected (Press et al., 2019).
For instance, a strong unexpected signal is highly surprising, and such an input triggers processes associated with an increase in sensory gain. Under the cancellation theories specifically, a strong signal like this is upweighted in processing to aid in model updating. Therefore, unexpected events that update the model are perceptually highlighted, while unexpected events that are still consistent with sensory noise, and are therefore less informative, are not. Again, this is all in order to optimize brain function and focus.
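The dampening idea can be written as a toy rule in Python. Everything in this sketch is an assumption made for illustration: the function name `perceived_intensity`, the `noise_level` threshold, and the `dampening` and `gain` factors are hypothetical and do not come from Press et al. (2019).

```python
def perceived_intensity(sensed, expected, noise_level=1.0, dampening=0.5, gain=2.0):
    """Toy cancellation rule (all parameters are hypothetical).

    Inputs that match the expectation, or deviate from it by no more than
    noise_level, are dampened. Larger deviations are treated as
    informative surprises and amplified by gain.
    """
    surprise = abs(sensed - expected)
    if surprise <= noise_level:
        return dampening * sensed   # expected or noise-level input: suppressed
    return gain * sensed            # informative surprise: highlighted


# Holding a glass: the pressure on the hand is expected to stay near 5.0.
expected_pressure = 5.0
holding = perceived_intensity(5.2, expected_pressure)    # small, noise-like change
breaking = perceived_intensity(9.0, expected_pressure)   # large, surprising change
print(holding, breaking)
```

In this sketch the familiar sensation of holding the glass is turned down, the sudden change is turned up, and a small wobble that stays within the noise level is not highlighted even though it was not exactly predicted.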
The model focuses less on comparing inputs and more on updating the brain’s understanding of the situation from sensory information as quickly as possible, trying to come up with potential courses of action. Limited resources are devoted to the unexpected signals that force people to update their beliefs or perform corrective actions.
The cancellation theories are readily supported by studies that report that “predictable tactile, auditory, and visual inputs lower sensory cortical activation and are perceived less intensely than unexpected inputs” (Press et al., 2019). Under the cancellation theories, the brain focuses more on the unknown by suppressing known and often repetitive sensations, an idea that is largely supported by the action control literature. This model is more of a “prediction of the unpredictable” approach to how the visual system understands the physical world and how it uses that understanding to create a mental model of the world around it.
However, looking at all three theories, it appears as though they cannot all hold at once, even though each seems likely. Predictive coding and the Bayesian approach are the most closely aligned, as they both address similar types of data, both use background information for prediction, and they “agree upon the importance of combining external inputs with internal signals (predictions or priors)” (Aitchison and Lengyel, 2017). However, predictive coding relies on representing prediction errors, without focusing on how predictions are determined in the first place or on how the prediction errors will be used. In contrast, Bayesian inference does try to examine how these predictions are made. In addition, predictive coding describes neural responses, whereas Bayesian inference looks at behavior. Because they are so similar in their goals, it seems theoretically easy to combine the two theories: let the latent variables from Bayes’s theorem serve as the predictions of the sensory input that predictive coding requires. Then, neurons can subtract this prediction from the input, resulting in a prediction error, as in predictive coding. The prediction error can serve as an input to neural circuits that implement Bayesian inference in order to encode latent variable values that better represent the sensory inputs (Aitchison and Lengyel, 2017). Preliminary work suggests that this combined model results in lower prediction errors and improved higher-level predictions.
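One way to picture the combination is a belief about a single latent value that is updated by Bayes's rule, but with the update written in terms of a prediction error. The Python sketch below uses the standard update for a Gaussian belief with Gaussian noise; choosing Gaussians, the function name `gaussian_update`, and all the numbers are my own assumptions, and this is not the specific circuit proposed by Aitchison and Lengyel (2017).

```python
def gaussian_update(prior_mean, prior_var, observation, noise_var):
    """One Bayesian update of a Gaussian belief, written via prediction error."""
    prediction = prior_mean                     # the latent estimate acts as the prediction
    error = observation - prediction            # prediction error
    gain = prior_var / (prior_var + noise_var)  # how much to trust the error
    posterior_mean = prior_mean + gain * error
    posterior_var = (1 - gain) * prior_var
    return posterior_mean, posterior_var, error


# Hypothetical starting belief (mean 0, variance 4) and three identical observations.
mean, var = 0.0, 4.0
for observation in [2.0, 2.0, 2.0]:
    mean, var, error = gaussian_update(mean, var, observation, noise_var=4.0)
    print(round(mean, 3), round(var, 3), round(error, 3))
```

In this toy run the prediction error shrinks with each observation (2.0, then 1.0, then about 0.667) while the belief moves toward the input and becomes more certain, so the error signal and the Bayesian update are two views of the same computation. It does not test the claim about the preliminary work mentioned above.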
In addition, the Bayesian approach and the cancellation theories cannot both be true at the same time. This is because “when it comes to the contents of perception, monolithic Bayesian theories that suggest perception is dominated by what we expect conflict with monolithic Cancellation theories making the opposite suggestion” (Press et al., 2019). The one workaround that scientists have proposed is that both the Bayesian and the cancellation mechanisms are present but in different capacities: the Bayesian mechanisms optimize veridical perception of the environment based on sensory cortex cues, and the cancellation mechanisms optimize information-based perception during action using sensorimotor predictions. They call this the opposing process theory, in which perception is initially biased towards the expected in order to rapidly generate veridical experiences that are based on the real sensory inputs. When the input is closely enough aligned with expectation, no other process follows; a second process begins only when the input is different enough from the prediction to generate surprise (Press et al., 2019). However, this does not seem entirely feasible. This remains an important area for future study because both veridical and informative perception seem to be important in constructing the visual world.
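As a very rough illustration of the opposing process idea, the Python sketch below pulls the percept toward the expectation by default and lets the input through unchanged only when it is surprising enough. The function name `opposing_process_percept`, the `bias` weight, and the `surprise_threshold` are hypothetical choices of mine, not quantities from Press et al. (2019).

```python
def opposing_process_percept(sensed, expected, bias=0.7, surprise_threshold=3.0):
    """Toy two-stage rule (all parameters are hypothetical).

    Returns the percept and a flag saying whether the surprise stage fired.
    """
    surprise = abs(sensed - expected)
    if surprise < surprise_threshold:
        # Close enough to expectation: the percept is biased toward the expected value.
        return bias * expected + (1 - bias) * sensed, False
    # Surprising input: a later process highlights the input itself.
    return sensed, True


near_percept, near_surprised = opposing_process_percept(5.5, expected=5.0)
far_percept, far_surprised = opposing_process_percept(10.0, expected=5.0)
print(near_percept, near_surprised)
print(far_percept, far_surprised)
```

An input close to the expectation ends up perceived as even closer to it, while an input far from the expectation is perceived as it is and flagged as a surprise.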
What does the brain’s predictive capacity mean for how people interpret paintings and visual art? Historically, people have often believed that the most satisfying paintings are the ones that are the most natural, the most harmonious, and the most fluid. In art, these are often the ones that follow the Gestalt principles, which group objects in our minds immediately based on symmetry, shape, connectedness, and more. Koffka, a Gestalt psychologist, once said that if the laws of Gestalt were violated it would “hurt our sense of beauty” (Van de Cruys and Wagemans, 2011). However, some of the greatest pieces of art, such as the works of Van Gogh and Picasso, completely disregard the fluidity and predictability that Gestalt psychology valued.
Taking a more modernist approach to art, many “argue that artists often destroy predictions that they have first carefully built up in their viewers, and thus highlight the importance of negative affect in aesthetic experience” (Van de Cruys and Wagemans, 2011). However, this does not stop the viewer from eventually succeeding in finding their own pattern in the art, before they are once more surprised when a painting does not fit. This cycle of surprise and recovery often creates a thrill in the viewer that makes those paintings all the more enticing. Looking at prediction errors within predictive coding can help to explain why the paintings that “hurt our sense of beauty” are so appealing.
The reason why unpredictable art is more attractive to people is based on the so-called conflict theories of emotion that claim that emotions arise “from interruptions or discrepancies between expected and actual situations” (Van de Cruys and Wagemans, 2011).
One of the advocates of this view is Donald O. Hebb, a Canadian psychologist. In his work studying the continuous intrinsic activity of the brain, he realized that the brain is proactively involved in stimulus processing. He proposed that thought consists of phase sequences, which are sequential activations of neural structures, or cell assemblies, that are built up through one’s previous experience and learning. Each of these assemblies can be activated or aroused by a sensory event, a previous assembly, or both. Negative emotions then arise from the obstruction of the established phase sequence; in other words, they arise from deviating from the predicted norm. This view is consistent with the ideas of implicit prediction formation and the response to the confirmation or obstruction of those predictions (Van de Cruys and Wagemans, 2011). However, if deviating from the predicted creates negative emotions, why is unpredictable art more appealing even though it is unsatisfying?
This is explained through the work of George Mandler, an American psychologist. His theory holds that deviations from what the predictive coding model generates based on a person’s previous experiences create a sense of conflict, a failure of the expectations to match the actual circumstances. Although at first this may be experienced as a negative emotion, these feelings actually “create arousal because they signal important changes in the environment that must be acted upon” (Van de Cruys and Wagemans, 2011). Researchers describe the new feeling of wanting to understand and focus on the painting despite an initial feeling of confusion or negativity as “reappraisal” (Van de Cruys and Wagemans, 2011). It creates a sense of change that forces the brain to place extra weight on the particular stimulus, as though it is demanding the person’s attention, much as the Bayesian approach explains how the brain prioritizes and focuses on images. Paintings that demand one’s attention in order to dissect what is happening, because they are so different, are also more readily processed, because their high prediction error means they claim a majority of the visual system’s resources.
Personally, I think this also shows the importance of the cancellation theories, because resolving that sense of conflict in order to create the cycle of surprise and recovery can only happen if the brain begins to predict the unpredictable or the unexpected. That sense of recovery is just as important because of the exposure effect, the idea that improved processing or understanding of a stimulus due to its repeated presentation leads to an increased preference for that stimulus. Both a temporary state of unpredictability, which is the prediction error, and the emergence of perceptual pleasure through predictability are important for a person’s emotional state-- which is what drives a sense of reward.
In addition, I feel that because this is such an emotionally motivated process, it is still unique to each individual’s experiences and how they shape what a painting or piece of art makes them feel. Even though each person might be sensing the same things, a painting that does not fit what is expected allows each person’s emotional experience to shine through in how they understand what is happening in the painting. It becomes a personal and more human experience.
For me, when I see a picture that is different or not what I expected, it allows for a sense of mystery and it allows for a sense of control. It makes me feel as though I am interacting with the painting rather than just passively viewing it. The more a piece of art sticks out in my mind, the more I want to think about it. To me, that is what creativity is, and that ability to deviate from the norm and the expected is what makes humans imaginative and makes people more than just machines. It is that balance between imagination and standard generation that computers have struggled to reach, which has made creating true art with a computer hard to do.
For instance, CANs, or Creative Adversarial Networks, try to mimic human creativity with neural networks: a generator network produces an image, tries to convince a second network that the image is genuine art, and at the same time aims for work that doesn’t fit neatly into any specific genre of art that the networks have learned. Even though this is similar to a human imaginative train of thought, and even though it can be considered creativity because it moves away from the originals-- to me, it is still only putting pieces of different ideas together, rather than making something new and imaginative. To me, that’s not real creativity, because computers rely on patterns in order to create an output. Even though it might be hard to tell the difference between computer-generated art and human-made art, the journey is as important as the end result. That said, if algorithms were developed to mimic predictive coding, the Bayesian approach, or the cancellation theories, it would be easier for computers to bridge the gap between them and us, as it would give them the ability to reevaluate their art creations as though they were human artists, updating where they see fit or where they believe surprise will be an effective emotion.
Yet at the same time, regardless of whether computers can mimic humans’ predictive qualities, what makes art so appealing is knowing the thought and feeling that was put into it-- a computer is not able to feel, and for it to be an active participant in the art-creating process simply is not possible. Even though you do not know everything about the journey when you are looking at the art, just knowing that something was made by a computer detracts from the experience a little. It’s easier to appreciate a piece if you can try to understand the artist’s motives and come up with your own ideas about what they were trying to convey, which returns to the idea of how to explain surprise or unpredictability. Either way, I believe that our perception of art and the art process depends largely not on how satisfying it is, but on how thought-provoking or emotional it is, which can be achieved by defying our predictions about what we will perceive in a painting. Therefore, unpredictability should be synonymous with originality-- the ability to deviate from the pattern is what makes art individualized.
Works Cited
Aitchison, Laurence, and Máté Lengyel. “With or without You: Predictive Coding and Bayesian Inference in the Brain.” Current Opinion in Neurobiology, vol. 46, 2017, pp. 219–227, doi:10.1016/j.conb.2017.08.010.
Kesner, Ladislav. “The Predictive Mind and the Experience of Visual Art Work.” Frontiers in Psychology, vol. 5, no. 1417, 16 Dec. 2014, doi:10.3389/fpsyg.2014.01417.
Pepperell, Robert. “Problems and Paradoxes of Painting and Perception.” Art and Perception, vol. 7, no. 2-3, 29 Nov. 2019, pp. 109–122, doi:10.1163/22134913-20191142.
Picard, Fabienne, and Karl Friston. “Predictions, Perception, and a Sense of Self.” Neurology, vol. 83, no. 12, 16 Sept. 2014, pp. 1112–18, doi:10.1212/WNL.0000000000000798.
Press, Clare, et al. “The Perceptual Prediction Paradox.” ResearchGate, Aug. 2019, www.researchgate.net/publication/335181322_The_perceptual_prediction_paradox.
Van de Cruys, Sander, and Johan Wagemans. “Putting Reward in Art: A Tentative Prediction Error Account of Visual Art.” i-Perception, vol. 2, no. 9, 2011, pp. 1035–1062, doi:10.1068/i0466aap.