The meeting about neocortex was held on June 13, 2007.
The neocortex learns invariant representations that make the immense stream of sensory data usable for controlling the behaviour of the animal. The invariances are formed by discarding useless information and maintaining only what is necessary. Attention is needed both in learning these invariances and in representing the current state of the world: the cortex cannot represent everything simultaneously. These are bottleneck constraints, but attention would be useful even without them. In the motor modality, attention can select the actions of the animal. Learning and attention operate similarly across the cortex, and its structure and microconnectivity are homogeneous too. This suggests that a generic algorithm underlies all cortical functions.
Learning abstraction hierarchies for representing the structure of the world is specific to the neocortex. At the high levels there are neurons that do not represent any specific sensory stimulus but respond invariantly to categories of similar sensory patterns. The cortex learns which kinds of categories are behaviourally useful and tunes some neurons for those invariances. For example, in the high levels of the ventral visual stream there are neurons that recognise an object independently of the viewing angle. This categorisation reduces the amount of distracting information in the sensory data. Hierarchies are learnt not only for the sensory modalities but at least for the motor modality as well.
The cortical structure and circuitry are, to some extent, homogeneous across the cortex. Therefore it has been suggested that the cortex applies variations of a generic algorithm to everything it does (Creutzfeldt, 1977). The task of a specific cortical area would not be determined by the structure and mechanisms of that area, but by the information in its inputs. For example, if an area is given motor input, it starts to represent that information and to find regularities in it.
When trying to understand the purpose and functionality of the neocortex, this similarity is a useful starting point. However, there are many differences between cortical areas as well: in the number of cells within one minicolumn, in the distribution of fast and slow synapses, in reactivity to global neuromodulators and in the presence of different layers, to name a few. A generic algorithm could probably handle all the functions the neocortex performs, but tuning the algorithm locally for specific types of data and functions may make the cortex more efficient.
It is a common view that cortex does unsupervised, or self-supervised, learning in order to represent the complex information of the world in a simple way (Doya, 1999; Rao and Ballard, 1997; Friston, 2005). It also learns to predict the future, and relate different things together, by forming associations between the different representations it has learnt.
This generic algorithm can be used for various tasks, and during evolution the cortex has taken on more and more responsibilities. While a cat can continue living when its cortex is removed (Bjursten et al., 1976), a human cannot: in humans, the motor cortex carries too much of the responsibility for controlling the muscles.
While the superior colliculus takes care of much of the orienting behaviour in evolutionarily older animals, in humans the cortex plays a considerable part in it as well, through selective attention. Similarly, alongside the hippocampus the cortex holds long-term memories, and in addition to the basal ganglia it learns about the rewarding features of the world.
The neocortex probably developed from the reptilian dorsal cortex. The latter is laminated into three layers and receives inputs from the thalamus, specifically from the LGN (visual information) and from limbic parts of the thalamus (neither sensory nor motor) (Striedter, 2004). The inputs from the LGN connect to cells that resemble the pyramidal cells of the neocortex, and the reptilian dorsal cortex also has cells that resemble neocortical interneurons. When developing into the neocortex, it received new inputs from the other senses, from the muscles and from evolutionarily older brain structures.
This section describes the anatomy and the main connectivity within minicolumns and between the thalamus and minicolumns. References for the connections are given in the text.
Columns and layers
The cortex can be roughly divided into six layers, each with its own characteristic cell types and connection patterns. Layer 1 consists mostly of horizontally running axons. Layers 2-6 form minicolumns, each usually containing 80-250 vertically interconnected neurons (Mountcastle, 1997).
The main excitatory cells in layers 2-3 and 5-6 are pyramidal cells. They receive inputs onto local basal dendrites and onto an apical dendrite, which can extend, for example, up to layer 1. The excitatory cells in layer 4 are mostly spiny stellate cells.
There are many types of inhibitory interneurons in the cortex. They are usually large and synapse onto many other neurons (Douglas and Martin, 2004).
In this section, L1 is an abbreviation of Layer 1 and so on.
L4 can be considered the main input layer of the cortex, and L5 and L6 the output layers. L5 and L6 project to the thalamus and to other subcortical modules, which in turn often project to the thalamus. L2 and L3 can be considered output layers as well, but they do not project outside the cortex; instead, they feed L4 of other cortical areas. Besides this cortical input, the L4 neurons receive inputs from the thalamus. These comprise only about 10% of the inputs to L4, but this excitatory input is strong and fast: even single spikes in the thalamus can elicit spikes in L4 (Hill and Tononi, 2002).
The strongest feedforward circuitry in the thalamocortical hierarchy travels the following path: relay cells in a thalamic nucleus, the granular layer (L4), the supragranular layers (L2 and L3), the infragranular layers (L5 and L6), back to relay cells in a higher-order thalamic nucleus, and again into a higher-order cortical area (Douglas and Martin, 2004; Hill and Tononi, 2002). There is also an excitatory loop within a minicolumn: L6 feeds L4. Otherwise the feedback connections within one minicolumn tend to be inhibitory (Thomson and Bannister, 2003).
Recurrent and contextual connections
In addition to the feedforward connections, there are horizontal and feedback connections, and long-range diffuse contextual connections. At least the former two types seem to be modulatory rather than driving (Friston, 2005; Angelucci et al., 2002; Somers et al., 1998): these connections cannot activate the target neurons on their own, but can only modulate activations caused by other inputs.
Horizontal connections within one hierarchical level can be found at least in layers 2 and 3 (Gilbert et al., 1996). Feedback connections from the next hierarchical level seem to originate in all layers but layer 4, and to target these same layers (Angelucci et al., 2002). The cortex gives feedback to the thalamus as well: layer 6 sends feedback to the same thalamic area from which the excitation to that minicolumn originated (Hill and Tononi, 2002).
The main feedforward pathway in the thalamocortical hierarchy goes through relay cells in the thalamus. Another cell type in the thalamus, matrix cells, projects diffusely to the cortex, crossing the boundaries of cortical areas (Hill and Tononi, 2002). They target mainly the superficial layers. These connections could provide general, non-specific contextual information to local cortical patches.
Feature representations in the cortex
The primary sensory areas start with representations of rather specific and concrete targets, such as the edges represented by simple cells of V1. Complex cells of V1 then respond to edges invariantly with respect to phase, that is, the precise location of the edge. The invariance increases gradually towards the high levels of the hierarchy. In the ventral visual stream, objects can be recognised independently of their specific appearance; in the dorsal stream, the identity of objects is represented less than their locations and rough shapes.
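The phase invariance of complex cells can be illustrated with the classical energy model, which pools a quadrature pair of simple-cell-like filters. The sketch below is only an illustration of that pooling idea, not a claim about the exact cortical computation; the filter and grating parameters are arbitrary choices.

```python
import numpy as np

def gabor(x, phase):
    """1-D Gabor: a simple-cell-like edge filter with a given phase."""
    return np.exp(-x**2 / 2.0) * np.cos(2 * np.pi * 0.5 * x + phase)

x = np.linspace(-3, 3, 64)
# Quadrature pair of simple cells (phases 90 degrees apart).
s_even = gabor(x, 0.0)
s_odd = gabor(x, np.pi / 2)

def complex_cell(stimulus):
    """Energy model: summing squared quadrature responses removes phase."""
    return (stimulus @ s_even) ** 2 + (stimulus @ s_odd) ** 2

# A grating presented at several phases: each simple-cell response varies
# strongly with phase, while the complex-cell response stays nearly constant.
responses = [complex_cell(np.cos(2 * np.pi * 0.5 * x + p))
             for p in np.linspace(0, np.pi, 5)]
print(np.std(responses) / np.mean(responses))  # small relative variation
```

Shifting the grating changes which simple cell responds, but the pooled energy is almost unchanged, which is exactly the phase invariance described above.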
The goal of invariant representations is to extract the useful information explicitly from the complex sensory data; the distracting information should be thrown away. For instance, if the cerebellum is to execute a grasping movement, it needs information about the location and shape of the target. The task is feasible if this information is given to it in a simple form: if there are neurons representing the needed information explicitly, they can send it to the cerebellum, which can then execute the grasp. There are many neural populations representing different invariances; the difference between the ventral and dorsal streams is one example. When the appearance of an object changes, some neurons change their activations, but others may reflect a corresponding invariance and be unaffected.
Invariances are formed by grouping different sensory inputs into one category. The number of possible groupings is immense, and learning useful invariances is not trivial. The neurons have to know how the information they represent is going to be used; then they can form an invariance which preserves the information the receiver needs.
The cortical areas are flexible enough to learn to represent different types of information, depending on their inputs. If a kitten grows up in an environment lacking horizontal lines, it does not learn to perceive them (Rauschecker and Singer, 1981). Similarly, if visual inputs are rewired to a part of the cortex other than V1, representations similar to those normally found in V1 develop there (Sur and Leamey, 2001).
Learning in the cortex continues throughout life. Even the lowest sensory areas keep tuning their neurons for the most important features (Gilbert, 1994).
Context affects perception
The interpretation of sensory inputs depends on the context in which they appear. One example is the extra-classical receptive fields (RFs) in the visual cortex. A stimulus in the RF of a neuron can activate the neuron. Around that RF lies an extra-classical RF; stimuli there can increase or decrease the responses to stimuli in the classical RF.
Within sensory modalities, the cortex tends to perceive regular, continuous shapes. Among mutually contradictory interpretations, it tends to select those that are coherent with the context. The Gestalt principles of perception describe this: in the visual domain, for instance, a group of dots is often perceived as forming lines and other shapes. The cortex prefers contextually coherent shapes.
In addition, context affects perception between modalities, as the McGurk effect shows.
Context as a guide for learning invariances
Context does not only help in making momentarily coherent interpretations of the sensory data; it can also guide the neurons to learn relevant features, especially invariances. Invariances are formed by cutting distracting information out of the representations. If only the information which has something in common with the context is represented, the invariance is likely to be useful. If a receiver of this information gives context to the neuron, the neuron forms a representation that relates to the task of the receiver. If, on the other hand, the context is general, coming from the surrounding columns and cortical areas, the representations develop to be coherent with that context.
Körding and König (2001) made a computational model of a biologically plausible pyramidal cell that receives contextual information at its apical dendrite. The cell learnt a contextually coherent invariant representation of its basal dendrite inputs.
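The core idea of two-site integration can be sketched with a context-gated Hebbian rule. This toy is a drastic simplification invented for illustration, not the actual Körding-König model: the weight update for the "basal" inputs is multiplied by the "apical" contextual signal, so the neuron ends up representing whichever input component correlates with its context.

```python
import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(size=2) * 0.1  # basal weights of the model neuron

for _ in range(2000):
    relevant = rng.normal()                 # feature shared with the context
    distractor = rng.normal()               # feature unrelated to the context
    basal = np.array([relevant, distractor])
    apical = relevant + 0.5 * rng.normal()  # noisy contextual signal

    # Hebbian update gated by the apical (contextual) input: the weight
    # change is driven by the correlation between basal input and context.
    w += 0.01 * apical * basal
    w /= np.linalg.norm(w)                  # keep the weights bounded

print(w)  # the weight on the context-correlated input dominates
```

After training, the weight vector points almost entirely at the context-correlated input, i.e. the neuron has learnt to ignore the distractor, which is the invariance-by-context effect described in the text.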
In this view, the cortex is not doing purely unsupervised learning. Instead, the local cortical areas receive context which tells them what kind of features to search for in their inputs; the context thus functions as a teaching signal. The context to one part of the cortex can come from other cortical areas or from outside the cortex, the latter through the thalamic matrix cells.
As information is integrated between sensory cortices, it is conceivably also integrated across all the other information the cortex receives. Getting inputs from nearly every brain region and from all the senses, the cortex could build a model of the world in which everything relates to the evolutionary goals of the animal. The motor cortex in particular can inform the rest of the cortex about the goals of the animal: it perceives the actions the animal makes, including those originating from reflexes and fixed action patterns, and those learnt by other brain modules such as the cerebellum and basal ganglia.
Computational models of learning invariances
The most popular way to use context in learning invariances is to look for slowly changing features (Földiák, 1991; Kohonen, 1995; Wiskott and Sejnowski, 2002). The context is then the activations of the same neurons at previous instants. In sensory data, objects do not tend to appear and disappear abruptly, but they can change their appearance; for example, moving objects change their location but not their identity. Finding slow features can therefore yield representations that are invariant with respect to different appearances of the same object.
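The slowness objective can be demonstrated with a minimal linear version of slow feature analysis in the spirit of Wiskott and Sejnowski (2002). The data here are invented for illustration: a slow sinusoidal latent is mixed with a fast noise "distractor" into two input channels, and the slowest unit-variance linear feature of the whitened inputs recovers the slow latent.

```python
import numpy as np

rng = np.random.default_rng(1)
t = np.arange(2000)
slow = np.sin(2 * np.pi * t / 400)      # slowly varying latent
fast = rng.normal(size=t.size)          # quickly varying distractor
X = np.column_stack([slow + 0.5 * fast, slow - 0.5 * fast])

# Whiten the inputs (zero mean, unit covariance).
X = X - X.mean(axis=0)
d, E = np.linalg.eigh(np.cov(X.T))
X = X @ E / np.sqrt(d)

# Slowness objective: minimise the variance of the temporal difference.
# For whitened data, the slowest feature is the eigenvector of cov(dX)
# with the smallest eigenvalue (np.linalg.eigh sorts them ascending).
dX = np.diff(X, axis=0)
_, V = np.linalg.eigh(np.cov(dX.T))
slow_feature = X @ V[:, 0]

# The extracted feature recovers the slow latent (up to sign and scale).
corr = np.corrcoef(slow_feature, slow)[0, 1]
print(abs(corr))
```

Neither input channel alone is slow, but the slowest linear combination is almost perfectly correlated with the underlying slow signal: the representation has become invariant to the fast distractor.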
Spatial context, too, has been used in learning invariances. Becker and Hinton (1992) made a model where two networks received inputs from two "eyes" and tried to learn to produce outputs resembling each other. This resulted in representing the depth of the targets invariantly with respect to their identities.
Another model which used spatial context was made by Valpola (2004). It learnt complex-cell like features from images when the neurons were biased with context from the surrounding neurons.
The cortex does not represent all the targets in the sensory fields at once. Instead, it focuses its attention on a limited set of features or targets. This kind of inner attention is called covert attention, in contrast to overt attention, which means physically orienting toward a target.
Even though the cortex can represent an immense number of different objects from a sensory scene, it cannot represent them simultaneously. This is due to bottlenecks in both perception and learning, described in the following.
The cortex uses distributed representations, and these get mixed up if many of them are superposed on each other. The sum of the representations of objects A and B may resemble the representation of some third object C more than those of A or B. This is the same phenomenon that causes the binding problem: when distinct neurons represent different features, only by representing one object at a time can the system know which features belong to which object. Binding thus causes a bottleneck in perception.
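A toy numeric example makes the superposition problem concrete (the feature code is invented for illustration): with one unit per feature, the scenes "red circle + green square" and "red square + green circle" produce identical activity patterns, so the feature-to-object bindings are lost.

```python
import numpy as np

# Feature units: [red, green, circle, square] (illustrative binary code).
red_circle = np.array([1, 0, 1, 0])
green_square = np.array([0, 1, 0, 1])
red_square = np.array([1, 0, 0, 1])
green_circle = np.array([0, 1, 1, 0])

# Superposing two objects loses the feature-to-object bindings: both
# scenes activate all four feature units equally.
scene_a = red_circle + green_square
scene_b = red_square + green_circle
print(np.array_equal(scene_a, scene_b))  # True: the two scenes are confused
```

Representing one object at a time, as selective attention does, keeps the codes apart: each single-object pattern is distinct.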
Perception also becomes easier with selective attention. If the sensory data were preprocessed so that individual objects were segmented apart from each other, the task of recognising them would become crucially easier. As no such preprocessing exists, selective attention does something very similar.
This is also true for learning. If attention focuses only on targets that relate to each other, learning associations between things becomes tractable. Learning invariances would also be more difficult if attention did not filter distracting features away.
Invariant representations already filter unwanted information from the inputs. Selective attention throws even more information away. While invariant representations maintain only that information which is generally useful, selective attention maintains only that information which is useful right at the moment.
The focus of attention depends on both bottom-up and top-down components: the bottom-up saliency of targets attracts attention, and the high-level intentions of the animal can affect it as well. Bottom-up saliency can be, for example, high contrast or a regular shape. The strongest attractor of attention is possibly surprise, that is, unexpected stimuli (Itti and Baldi, 2006).
What is the mechanism that controls attention? Some theories claim that, in addition to the actual representation networks, there are separate networks that control attention explicitly (Raz and Buhle, 2006). However, it is not clear how these could self-organise along with the representation network which learns to recognise objects. The prominent feature integration theory (FIT), in turn, suggests that there are different kinds of covert attention operating by different mechanisms (Treisman and Gelade, 1980). When the sensory scene is complicated, the top-down intentions of the animal would produce a serial search for the desired object: attention would travel sequentially from one object to another. When the sensory scene is simpler, attention could choose the right object instantaneously, resulting in parallel search.
Biased-competition model of attention
Duncan and Humphreys (1989) made a series of behavioural experiments which suggest that all covert attention results from one mechanism; the serial and parallel modes would be the two ends of a single continuum.
Desimone and Duncan (1995) describe a biased-competition model of this mechanism. According to it, attention emerges from local competition in the perceptual network itself. Adjacent neurons represent different features or objects, but still roughly the same parts of the sensory fields, and local competition selects only some of those features. The competition can be biased in favour of some features. The bias can originate, for example, from working memory, which leads to the emergence of search; or it can originate from nearby neurons. In either case, the biases allow coherent attention to develop in the cortex.
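The core of biased competition can be sketched as a small dynamical system. This is an illustrative toy with arbitrary constants, not a model of real cortical circuitry: units driven by bottom-up input inhibit each other, and a small top-down bias decides which unit wins.

```python
import numpy as np

def compete(bottom_up, bias, steps=200):
    """Mutual inhibition between units; a small top-down bias can decide
    which unit wins the competition (a sketch, not a cortical model)."""
    r = np.zeros_like(bottom_up, dtype=float)
    for _ in range(steps):
        inhibition = r.sum() - r                # inhibition from the other units
        drive = bottom_up + bias - 1.5 * inhibition
        r += 0.1 * (np.maximum(drive, 0) - r)   # leaky rectified dynamics
    return r

stim = np.array([1.0, 1.0])                     # two equally salient objects
r_even = compete(stim, bias=np.array([0.0, 0.0]))  # no bias: shared activity
r_bias = compete(stim, bias=np.array([0.2, 0.0]))  # small bias: unit 0 wins
print(r_even, r_bias)
```

With no bias the two units settle at the same suppressed level (mutual inhibition with several objects present); a bias of only 0.2 tips the competition so that one unit takes over and silences the other, mirroring the attentional selection described above.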
Not only would serial and parallel modes of attention be explained with the same mechanism, but bottom-up and top-down control of attention as well.
A common paradigm, at least in computational models, is to analyse the sensory scene in full and only then make a decision about actions. The biased-competition model does not follow this paradigm; instead, it uses distributed decision making. All the cortical areas make decisions about what to represent, based on both bottom-up and contextual information. In the motor cortex, these decisions can be direct action selection; in other cortices, the decisions still affect action selection. For instance, when the visual cortex decides to neglect some objects, the motor cortex cannot perform actions that would need information about them. The co-operation of different cortical areas stays coherent because information about the decisions in one area percolates throughout the cortex.
There is plenty of neurophysiological evidence for the biased-competition model (Reynolds et al., 1999; Reynolds and Desimone, 2003; Maunsell and Treue, 2006). Single-unit recordings with monkeys have shown the following:
- When the RF of a high-level visual neuron contains only one object, the corresponding neuron shows high activity.
- Additional objects cause other neurons to become active. The neurons inhibit each other, and their activities are lower than in the case of one object.
- When the animal focuses its attention on some object, the corresponding neuron increases its activity and suppresses the other neurons.
The hypothesis of a generic cortical algorithm predicts that attention operates similarly whether it selects between visual locations, colours or other features, auditory frequencies, or different future plans. There is some evidence that selection works similarly independently of the feature type: Maunsell and Treue (2006) describe how competition between visual locations works similarly to competition between other features.
Attention and learning
When an animal attends to some features of its sensations, the neural representations even at the lowest cortical levels adapt mostly to those features and not to others (Ahissar and Hochstein, 1993). Thus attention, like context, guides learning.
This is reasonable, as it allocates the representational capacity to the most important things in the world. Attention tries to focus on the targets that are most important for the momentary behaviour, and usually these will be important in the future as well.
Selective attention also facilitates the learning of associations and invariances. Learning associations becomes easier because coherent attention selects related features from different modalities. Learning invariances becomes easier for a similar reason: it is about learning what information to throw away, and attention already does exactly that.
Computational models of attention
In the visual modality, attention has often been modelled with saliency maps. These maps estimate where objects are likely to be. They are computed with predefined filters, for example ones that find colour discontinuities or locations of high contrast, and attention is then focused on the places of high saliency. The problem with this approach is that attention cannot adapt to the world; it suits only situations where much is known about the data before seeing it.
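A minimal sketch of such a predefined filter follows; the function name and the particular contrast measure are illustrative, not taken from any specific saliency model. Saliency is scored as the absolute difference between each pixel and its local neighbourhood mean, and attention is directed to the maximum.

```python
import numpy as np

def contrast_saliency(image, k=1):
    """Predefined centre-surround filter: saliency as the absolute
    difference between a pixel and its local neighbourhood mean."""
    padded = np.pad(image, k, mode='edge')
    sal = np.zeros_like(image, dtype=float)
    h, w = image.shape
    for i in range(h):
        for j in range(w):
            patch = padded[i:i + 2 * k + 1, j:j + 2 * k + 1]
            sal[i, j] = abs(image[i, j] - patch.mean())
    return sal

# A uniform scene with one odd pixel: the filter makes it salient, but
# only because high contrast was declared important in advance.
scene = np.zeros((5, 5))
scene[2, 3] = 1.0
focus = np.unravel_index(np.argmax(contrast_saliency(scene)), scene.shape)
print(focus)  # attention is drawn to the high-contrast pixel
```

The filter is fixed by hand, which is exactly the limitation noted above: nothing in it can learn what the animal should actually find salient.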
Deco and Rolls (2004) have shown how coherent attention can emerge in a neural network implementing the biased-competition model. The network has separate ventral and dorsal visual streams. When a bias is introduced into the ventral stream, it leads to a search for a certain object. Correspondingly, when the bias is introduced into the dorsal stream, the object at the corresponding location grabs attention and is recognised.
However, these implementations of the biased-competition model do not use attention during the learning phase. Learning is therefore inefficient, and the representational capacity is not allocated primarily to the important targets.
Attention and learning combined
In the Computational Neuroscience research group at LCE, we have built a model where attention and learning support each other.
Questions and further topics on neocortex
- Thalamocortical oscillations
- Motor cortices, representing the actions and executing them
- Computational functions of the six layers
Ahissar, M. and Hochstein, S. (1993). Attentional control of early perceptual learning. Proceedings of the National Academy of Sciences, 90:5718-5722.
Angelucci, A., Levitt, J. B., Walton, E. J. S., Hupé, J.-M., Bullier, J., and Lund, J. S. (2002). Circuits for local and global signal integration in primary visual cortex. The Journal of Neuroscience, 22:8633-8646.
Becker, S. and Hinton, G. E. (1992). Self-organizing neural network that discovers surfaces in random-dot stereograms. Nature, 355:161-163.
Bjursten, L. M., Norrsell, K., and Norrsell, U. (1976). Behavioural repertory of cats without cerebral cortex from infancy. Experimental Brain Research, 25:115-130.
Creutzfeldt, O. (1977). Generality of the functional structure of the neocortex. Naturwissenschaften, 10:507-517.
Deco, G. and Rolls, E. T. (2004). A neurodynamical cortical model of visual attention and invariant object recognition. Vision Research, 44:621-642.
Desimone, R. and Duncan, J. (1995). Neural mechanisms of selective visual attention. Annual Review of Neuroscience, 18:193-222.
Douglas, R. J. and Martin, K. A. C. (2004). Neuronal circuits of the neocortex. Annual review of neuroscience, 27:419-451.
Doya, K. (1999). What are the computations of the cerebellum, the basal ganglia and the cerebral cortex? Neural Networks, 12:961-974.
Duncan, J. and Humphreys, G. W. (1989). Visual search and stimulus similarity. Psychological Review, 96:433-458.
Friston, K. (2005). A theory of cortical responses. Philosophical Transactions of the Royal Society B: Biological Sciences, 360:815-836.
Földiák, P. (1991). Learning invariance from transformation sequences. Neural Computation, 3:194-200.
Gilbert, C. D. (1994). Early perceptual learning. Proceedings of the National Academy of Sciences, 91:1195-1197.
Gilbert, C. D., Das, A., Ito, M., Kapadia, M., and Westheimer, G. (1996). Spatial integration and cortical dynamics. Proceedings of the National Academy of Sciences, 93:615-622.
Itti, L. and Baldi, P. (2006). Bayesian surprise attracts human attention. In Advances in Neural Information Processing Systems 19, pages 1-8. MIT Press.
Kohonen, T. (1995). Self-Organizing Maps. Springer.
Körding, K. P. and König, P. (2001). Neurons with two sites of synaptic integration learn invariant representations. Neural Computation, 13:2823-2849.
Maunsell, J. H. and Treue, S. (2006). Feature-based attention in visual cortex. Trends in Neurosciences, 29.
Mountcastle, V. (1997). The columnar organization of the neocortex. Brain, 120:701-722.
Hill, S. and Tononi, G. (2002). Thalamus. In Arbib, M. A., editor, The Handbook of Brain Theory and Neural Networks. MIT Press, 2nd edition.
Reynolds, J. and Desimone, R. (2003). Interacting roles of attention and visual salience in V4. Neuron, 37:853-863.
Reynolds, J. H., Chelazzi, L., and Desimone, R. (1999). Competitive mechanisms subserve attention in macaque areas V2 and V4. The Journal of Neuroscience, 19:1736-1753.
Rao, R. P. and Ballard, D. H. (1997). Dynamic model of visual recognition predicts neural response properties in the visual cortex. Neural Computation, 9:721-763.
Rauschecker, J. P. and Singer, W. (1981). The effects of early visual experience on the cat's visual cortex and their possible explanation by Hebb synapses. The Journal of Physiology, 310:215-239.
Raz, A. and Buhle, J. (2006). Typologies of attentional networks. Nature Reviews Neuroscience, 7:367-379.
Somers, D. C., Todorov, E. V., Siapas, A. G., Toth, L. J., Kim, D., and Sur, M. (1998). A local circuit approach to understanding integration of long-range inputs in primary visual cortex. Cerebral Cortex, 8:204-217.
Striedter, G. F. (2004). Principles of Brain Evolution. Sinauer Associates.
Sur, M. and Leamey, C. A. (2001). Development and plasticity of cortical areas and networks. Nature Reviews Neuroscience, 2:251-262.
Treisman, A. and Gelade, G. (1980). A feature integration theory of attention. Cognitive Psychology, 12:97-136.
Thomson, A. M. and Bannister, A. P. (2003). Interlaminar connections in the neocortex. Cerebral Cortex, 13:5-14.
Valpola, H. (2004). Behaviourally meaningful representations from normalisation and context-guided denoising. Technical Report, Artificial Intelligence Laboratory, University of Zurich.
Wiskott, L. and Sejnowski, T. J. (2002). Slow feature analysis: Unsupervised learning of invariances. Neural Computation, 14:715-770.