Workshop on
Natural Environments Tasks and Intelligence



Matteo Carandini (UCL)

Regimes of operation in visual cortex

The perception of visual stimuli is widely held to be supported by the activity of populations of neurons in visual cortex. Work in our laboratory seeks to record this population activity and to characterize its evolution in time. Our methods rely on optical imaging of voltage-sensitive dyes and on electrical imaging via multielectrode arrays. The results indicate that the visual cortex operates in a regime that depends on the strength of the visual stimulus. For large, high-contrast stimuli, the cortex operates in a manner that emphasizes local computations, whereas for smaller or lower-contrast stimuli the effect of lateral connections becomes predominant. In this interconnected regime, the population responses exhibit rich dynamics, with waves of activity that travel over 2-6 millimeters of cortex to influence distal locations. In the complete absence of a stimulus, these waves dominate, and are sufficient to explain the apparently erratic activity of local populations. These results indicate that two apparently contradictory views of visual cortex, one postulating computations that are entirely local and the other postulating strong lateral connectivity, are both correct. The cortex can operate in both regimes, and it chooses between them adaptively, based on the stimulus conditions.

EJ Chichilnisky (UCSD)

The retinal ganglion cell receptive field at the elementary resolution of single cones

To fully understand a neural circuit requires knowing the pattern of connectivity between its inputs and outputs. For example, the role of the retina in color vision depends on the pattern of connectivity between the lattice of cone photoreceptors and multiple types of retinal ganglion cells sampling the entire visual field. In the vertebrate nervous system, such detailed functional circuitry information has generally been out of reach. Here we report the first measurements of the assembled functional connectivity between input and output layers of the primate retina at single-cell resolution, and use this information to probe the neural circuitry for color vision. We employed a unique multi-electrode technology to record simultaneously from complete populations of the ganglion cell types that collectively mediate high-resolution vision in primates (midget, parasol, small bistratified). We then used fine-grained visual stimulation to separately identify the location and spectral type -- (L)ong, (M)iddle or (S)hort-wavelength sensitive -- of each cone photoreceptor providing input to each ganglion cell. The populations of ON and OFF midget and parasol cells each sampled essentially the complete population of L and M cones. However, only OFF midget cells received substantial input from S cones, an unexpected specificity. Statistical analysis revealed a non-random pattern of inputs from L and M cones to the receptive field centers of midget cells, while inputs to the receptive field surround were random. The specificity of cone inputs to the receptive field center could not be explained by clumping in the cone mosaic, implying that developmental or adaptive mechanisms in the retinal circuitry enhance red-green opponent color signals transmitted to the brain.

Bruce Cumming (NIH)

The nature of decision related activity in sensory neurons

The activity of sensory neurons in many cortical areas is correlated with the perceptual reports of animals performing appropriate psychophysical tasks. These "Choice Probabilities" are often thought to reflect the causal effect of neuronal noise upon the animal's choice. We recently combined Choice Probability measurement with white noise analysis, and found several features that are very hard to explain in this framework. This led us to suggest that Choice Probabilities may be partially top-down in origin. In order to examine this possibility more directly, we exploited structure-from-motion stimuli, which are ambiguous but perceptually relatively stable. We recorded the activity of neurons in MT while monkeys reported the perceived rotation direction of cylinders. We employed a series of manipulations to induce a particular perceived direction, in ways that should not directly affect the activity of MT neurons. These manipulations all changed the activity of MT neurons in the direction predicted by our proposed top-down mechanism. This demonstrates top-down changes in firing rate associated with choices in perceptual decision tasks.
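
For reference, choice probability is conventionally computed as the area under the ROC curve comparing a neuron's firing-rate distributions sorted by the animal's choice. A minimal sketch of that standard computation follows (the function name and the simulated data are illustrative, not the analysis code used in this work):

    import numpy as np

    def choice_probability(rates_pref_choice, rates_null_choice):
        """ROC area: the probability that a randomly drawn trial preceding
        the 'preferred' choice carries a higher firing rate than one
        preceding the 'null' choice; 0.5 means no choice-related signal."""
        r1 = np.asarray(rates_pref_choice, dtype=float)
        r2 = np.asarray(rates_null_choice, dtype=float)
        greater = (r1[:, None] > r2[None, :]).sum()   # Mann-Whitney count
        ties = (r1[:, None] == r2[None, :]).sum()     # ties count as half
        return (greater + 0.5 * ties) / (r1.size * r2.size)

    # Example: a neuron firing slightly more before its preferred choice
    rng = np.random.default_rng(0)
    print(choice_probability(rng.poisson(22, 200), rng.poisson(20, 200)))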

Jim DiCarlo (MIT)

Untangling object recognition: The convergence of systems neuroscience and computer vision

Visual object recognition is a fundamental building block of memory and cognition, and is a central unsolved problem in both systems neuroscience and computer vision. In this talk, I will outline our ongoing efforts to synergize elements of these two research fields to attack the central challenge of object recognition. The computational crux of the object recognition problem is that the recognition system (biological or artificial) must somehow tolerate tremendous image variation produced by different views of each object (the "invariance" problem). I will outline a framework that provides intuition on how this problem might be solved (a stepwise "untangling" of object identity manifolds). But that intuition is not enough -- the space of hypothetical models is impossibly large, so that finding a solution (biologically correct or not) without further constraints is like finding a needle in a haystack.

In the first part of my talk, I will highlight ways in which systems neuroscience data have greatly reduced the size of the hypothesis space, including the recent discovery that the primate visual system uses naturally occurring temporal contiguity cues in the visual environment to "learn" how to untangle object identity manifolds. In the second part of my talk, I will outline how new advances in computer vision and computing technology are beginning to appropriately navigate this neuroscience-reduced hypothesis space. I will conclude with a discussion of the definition of success in solving "real-world" object recognition, the rate of progress thus far, and speculation on what the future holds.
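
One common formalization of learning from temporal contiguity, in the spirit of slow feature analysis (an assumption for illustration; not necessarily the formulation used in this work), seeks features f that change slowly across consecutive frames of natural input while remaining informative:

    \min_f \sum_t \| f(x_{t+1}) - f(x_t) \|^2 \quad \text{subject to} \quad \operatorname{Var}[f(x_t)] = 1

Because different views of the same object tend to occur in temporal succession, such an objective pushes those views toward nearby points in feature space, which is one route to untangling identity manifolds.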

David Field (Cornell U)

Efficient coding and biological plausibility: Is the belief in a "blank slate" implied by our models?

The last two decades have seen a great deal of success in applying efficient coding techniques to sensory systems. Algorithms like sparse coding, slow feature analysis, etc., have demonstrated that the early stages of the visual pathway can be partially explained by applying simple learning rules to natural scenes. The application of Bayesian statistics has demonstrated that many aspects of sensory coding can be understood as efficient representations. But what do these algorithms actually teach us about sensory systems? What are the limits of this approach? Do they tell us how development proceeds? In this talk, I will discuss some of the failures (or at least some difficulties) with the general approach and ask a few questions about the deeper assumptions.
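
For concreteness, the canonical sparse coding objective (one common formulation; the algorithms named above differ in their details) finds basis functions \phi_i and coefficients a_i that reconstruct a natural image patch x while penalizing non-sparse activity:

    E = \| x - \textstyle\sum_i a_i \phi_i \|^2 + \lambda \sum_i S(a_i)

where S is a sparseness cost such as the absolute value and \lambda trades reconstruction error against sparseness. The questions raised in this talk concern what such objectives do, and do not, tell us about the biology.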

Pascal Fries (Nijmegen U)

Routing and computing with neuronal (gamma-band) synchronization

Rhythmic neuronal synchronization can be found in many systems and under many conditions, and it likely subserves brain functioning. We try to understand the mechanisms behind (gamma-band) synchronization and the mechanisms through which it subserves brain processing. I will present recent results showing how gamma-band synchronization in early visual cortex influences the elementary coding of stimulus orientation. During each gamma cycle, orientation coding is built up and decays again. Furthermore, I will present data from 252-channel subdural grid electrode recordings covering large parts of a hemisphere in two monkeys performing a selective visual attention task. Attention strongly modulates the precision of brain-wide synchronization in several networks with distinct spatio-spectral signatures. These data taken together support the notion that rhythmic neuronal synchronization is an important factor for cognitive functioning.

Bill Geisler (UT Austin)

Scene statistics and ideal observers can help us understand neural encoding and decoding

In the past two decades, advances in physical measurement technology, computational power, and statistical modeling have made it possible to begin exploring in detail the relationship between the properties of the environment and the mechanisms of neural encoding and decoding that underlie performance in natural tasks. This talk will briefly summarize recent efforts to determine which stimulus features are optimal for performance in specific visual tasks, how encoded features should be decoded to perform optimally in specific visual tasks, and how real performance compares with optimal performance. It will be argued that quantitative analysis of the natural scene statistics and the neural signals that support natural tasks can provide novel quantitative predictions for perceptual performance and deep insight into the design of perceptual systems.
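
The ideal-observer benchmark invoked here has a standard form: given the task-relevant scene statistics, the optimal decoder selects the interpretation with the highest posterior probability,

    \hat{\omega} = \arg\max_{\omega} p(\omega \mid x) = \arg\max_{\omega} p(x \mid \omega)\, p(\omega)

and real performance can then be expressed as an efficiency relative to this bound (the notation is generic, not specific to the studies summarized in the talk).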

Paul Glimcher (NYU)

Cortical normalization models of choice cortex: Area LIP

There is now widespread agreement that a region in the posterior parietal cortex encodes the subjective values of individual movements. Increases and decreases in the reward associated with a particular movement lead to increases and decreases in firing rate amongst a population of neurons encoding that movement. We have recently demonstrated that cortical normalization also operates across movement values in this area. Increasing the value of movements outside of the response field decreases responding for a fixed reward. Further, this decrease is non-linear, suggesting curvature of the kind first described by Heeger for neurons in V1. Perhaps most interestingly, as a tool for describing firing rates, a Heeger-style normalized representation of value vastly outperforms strategies, such as the standard drift-diffusion model, that employ value differences as decision variables.
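
A minimal form of the Heeger-style normalization alluded to above, carried over from contrast coding in V1 to value coding (the symbols are illustrative), is

    R_i = \frac{V_i}{\sigma + \sum_j V_j}

where R_i is the firing rate of the population encoding movement i, V_i is that movement's value, the sum runs over all available movements, and \sigma sets the curvature. Raising the value of movements outside the response field inflates the denominator and suppresses R_i, consistent with the findings described.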

Daeyeol Lee (Yale U)

Prefrontal cortex and decision making

During naturalistic decision making, the outcomes expected from alternative actions are often evaluated along multiple dimensions, such as the magnitude and delay of reward. We found that during inter-temporal choice, in which the animals choose between a small immediate reward and a larger but more delayed reward, neurons in the lateral prefrontal cortex and basal ganglia often encode the subjective value of a particular reward discounted by its delay. The lateral prefrontal cortex also plays an important role in updating the animal's preference for alternative actions based on its previous experience. Using a computer-simulated rock-paper-scissors task, we found that the activity of individual prefrontal cortex neurons tends to reflect not only the actual outcomes of the animal's chosen actions, but also the hypothetical outcomes that could have been obtained from the unchosen actions. These results suggest that the prefrontal cortex provides the learning signals necessary to maximize the efficiency of reinforcement learning, in addition to the value signals used during decision making.
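
A standard parameterization of such delay discounting is the hyperbolic form (the talk may use a different functional form), which writes the subjective value of a reward of magnitude A delivered after delay D as

    SV = \frac{A}{1 + kD}

where k is the animal's discount rate; neurons encoding SV should trade reward magnitude against delay in exactly this way.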

Michael Lewicki (CMU/Case Western)

Learning structures in natural sounds

What information processing does the auditory system perform to carry out our everyday auditory tasks? Consider the basic coding problem of transforming the vibrations at the eardrum to the neural code at the auditory nerve. Out of the infinite range of possible codes, why do biological systems use the codes they do? Are there theories that can explain auditory coding in terms of fundamental principles? One hypothesis is that biological auditory systems are optimally adapted to code their natural acoustic environments. I will show that by learning efficient codes of natural sounds, it is possible to predict properties of the auditory code. These results also suggest that the acoustic structure of speech is adapted to the coding properties of the mammalian auditory system. In the second part of my talk, I will focus on the problem of how we generalize from specific instances of sounds to their general classes. For example, what acoustic cues tell us that the individual waveforms of clinking glasses or of footsteps in the hall each belong to the same sound class? I will show that it is possible to use statistical learning methods to characterize natural impact sounds with a small number of intrinsic dimensions. From this, it is possible to perform accurate sound categorization and to synthesize realistic impact sounds from a small number of intrinsic values. Finally, I will discuss the issues faced by biological auditory systems in the analysis of natural auditory scenes.

This is joint work with Vivienne Ming, Sofia Cavaco, Bruno Olshausen, Annemarie Surlykke, and Cindy Moss.
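
As a minimal sketch of the efficient-coding approach, one can learn a linear code for short waveform segments with ICA (the published work uses related but more elaborate methods, and the random signal below is only a stand-in for recorded natural sounds):

    import numpy as np
    from sklearn.decomposition import FastICA

    rng = np.random.default_rng(0)
    sound = rng.standard_normal(100_000)      # stand-in for a natural sound
    seg_len = 128
    starts = rng.integers(0, sound.size - seg_len, size=5000)
    X = np.stack([sound[s:s + seg_len] for s in starts])

    ica = FastICA(n_components=32, max_iter=500, random_state=0)
    ica.fit(X)
    filters = ica.components_   # with real natural sounds, these rows come
                                # to resemble gammatone-like auditory filters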

Robert Liu (Emory U)

Auditory cortical coding in a natural communication context

Acoustic communication involves many different functions that must be performed on naturally variable sound signals. For example, in human speech, we must often simultaneously detect, discriminate and categorize strings of phonemes in order to make sense of a message. Unfortunately, though, auditory processing is usually studied in laboratory contexts using only single tasks and training with specific exemplars. To begin understanding how the auditory system may function in more realistic communication, we have been studying the auditory cortex in an animal model in which species-specific vocalizations gain communicative significance through a natural process. Specifically, we use the ultrasound communication between mouse pups and their mothers. We have examined how the auditory cortex reflects the behavioral relevance of these vocalizations by comparing electrophysiological recordings between animals that recognize the significance of these calls (mothers) and those that do not (virgins). The plasticity observed in both anesthetized and awake recording preparations suggests multiple strategies to naturally enhance the cortical representation of behaviorally relevant sounds.

George Pollak (UT Austin)

Dissecting the auditory system with in vivo whole cell recordings

Aside from humans, the animals with perhaps the richest and most sophisticated repertoire of social communication calls are bats. In this talk I illustrate some of the ways in which bats use sound for social communication and then explore the ways that neurons in their auditory midbrain derive their selective properties for the direction of frequency modulation (FM) sweeps. Mechanisms that generate FM directionality are of interest because FMs are integral components of both their echolocation and communication calls and FM directionality is directly analogous to directionality for movement across the sensory surface in other sensory systems. We use in-vivo whole cell recordings from the auditory midbrain of awake bats to evaluate how inhibition interacts with excitation to shape directional selectivity. We show that contrary to conventional thinking, the timing of inhibitory and excitatory inputs plays a far less prominent role than their magnitudes. We then show that directional selectivity of discharges is far more pronounced than the directional selectivity of inputs, as reflected in EPSP amplitudes, due to the non-linear influence of spike threshold. Finally, the results suggest that highly selective discharge properties can be formed from only minor adjustments in synaptic strength.
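
The final point, that spike threshold turns a small synaptic asymmetry into strong discharge selectivity, can be illustrated with toy numbers (all values below are invented for illustration):

    # EPSP amplitudes (mV) evoked by upward vs. downward FM sweeps
    epsp_up, epsp_down = 10.0, 8.0
    threshold = 7.5                             # spike threshold (mV)
    out_up = max(epsp_up - threshold, 0.0)      # supra-threshold drive
    out_down = max(epsp_down - threshold, 0.0)

    di = lambda a, b: (a - b) / (a + b)         # directionality index
    print(di(epsp_up, epsp_down))               # ~0.11 at the input
    print(di(out_up, out_down))                 # ~0.67 at the output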

Paul Schrater (U Minn)

Planning for uncertainty in action tasks - exploration and compensation

How does the brain handle uncertainty in action tasks? One possibility is to reduce uncertainty by exploring and acquiring more perceptual information along the way. Another possibility is to execute action strategies that are robust to uncertainty. In this talk I will discuss rational approaches for deciding how much exploration is warranted, using an optimal control framework. Somewhat surprisingly, the conditions that make active information gathering valuable are limited, and it is easy to find domains where it is better to construct action plans robust to uncertainty than to try to reduce uncertainty. I will demonstrate the applicability of these ideas with behavioral results on grasping under position uncertainty, on scheduling time for perceptual information gathering, and on optimal exploration.
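
The core comparison reduces to a value-of-information calculation: explore only when the expected improvement in the eventual action outcome exceeds the cost of gathering the information. A toy instance of the optimal-control analysis (all quantities are illustrative):

    def worth_exploring(expected_loss_act_now, expected_loss_after_info,
                        info_cost):
        """Gather more perceptual information only when the loss reduction
        it is expected to buy exceeds its cost (time, effort, reward)."""
        value_of_information = expected_loss_act_now - expected_loss_after_info
        return value_of_information > info_cost

    # Example: exploring buys little but costs much -> act robustly instead
    print(worth_exploring(1.0, 0.8, 0.5))   # False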

Eyal Seidemann (UT Austin)

Neural population coding in the primate visual cortex

What are the principles that govern the encoding and decoding of visual stimuli by populations of neurons in the primate visual cortex? To begin to address this general question, our laboratory uses a combination of voltage-sensitive dye imaging and electrophysiology to measure neural population responses from the primary visual cortex (V1) of monkeys while they perform well-controlled threshold visual tasks. To study encoding, we develop and test simple models of the early visual system that can account for the observed properties of V1 population responses, and can be used as a framework for testing specific hypotheses regarding the underlying neural mechanisms. To study decoding, we develop and test possible readout models that perform the same task as the monkey using only the measured neural responses from the monkey's brain. In this talk, I will present some of our recent findings from these two lines of research and discuss their possible implications for population coding in the cortex.
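
One simple member of the readout family described here is a linear pooling rule applied directly to the imaged population response (a sketch; the function names and the weight map are assumptions, not the lab's fitted models):

    import numpy as np

    def linear_readout(response_map, weight_map, criterion):
        """Report 'target present' when the weighted sum of the population
        response exceeds a criterion, mimicking the monkey's decision."""
        return float(np.sum(weight_map * response_map)) > criterion

    # Illustrative use: weights concentrated at the target's retinotopic locus
    rng = np.random.default_rng(0)
    response = rng.normal(0.0, 1.0, size=(64, 64))   # stand-in dye image
    weights = np.zeros((64, 64))
    weights[30:34, 30:34] = 1.0
    print(linear_readout(response, weights, criterion=2.0))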

Shihab Shamma (U Maryland)

Cortical mechanisms to navigate complex auditory scenes

Spectrotemporal modulations are critical in understanding the perception of complex sounds, especially in conveying intelligibility in speech and quality in music. Neurons in the primary auditory cortex extract and explicitly represent these modulations in their responses. In this talk, I shall review experimental methods to study the representation of these modulations in the cortex and the way they reveal the acoustic features of complex sounds. I shall also discuss various abstractions of this analysis to develop algorithms for the analysis of complex auditory scenes. One example is the question of how spectrotemporal modulation content of speech can be used to assess its intelligibility, to discriminate it from non-speech signals, and to enhance it by performing Wiener-like filtering in the space of the cortical representation. In music, these modulations underlie the perception of sound quality, and hence can be effectively employed to construct a perceptually meaningful metric of musical timbre. Finally, I shall review recent extensions of these studies that explore the role of feedback and top-down attentional effects on the representation of modulations, measurements that are carried out while animals are engaged in a variety of auditory tasks. The resulting understanding of these rapid adaptive mechanisms has been formulated as algorithms for the segregation of complex sound mixtures to resolve difficult examples of the "cocktail-party problem".
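
A common computational abstraction of this modulation analysis (a simplification of the full cortical model discussed in the talk) treats the spectrotemporal modulation content of a sound as the two-dimensional Fourier spectrum of its log spectrogram:

    import numpy as np
    from scipy.signal import spectrogram

    def modulation_spectrum(x, fs):
        f, t, S = spectrogram(x, fs=fs, nperseg=256, noverlap=192)
        logS = np.log(S + 1e-10)     # compressive nonlinearity
        logS -= logS.mean()          # remove DC before transforming
        M = np.abs(np.fft.fftshift(np.fft.fft2(logS))) ** 2
        return M  # rows: spectral modulation; columns: temporal modulation (Hz)

    # Example: modulation spectrum of one second of noise at 16 kHz
    rng = np.random.default_rng(0)
    M = modulation_spectrum(rng.standard_normal(16000), fs=16000)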

Mandyam Srinivasan (U Queensland)

Visual information processing in honeybee navigation

Insects, in general, and honeybees, in particular, perform remarkably well at seeing and perceiving the world and navigating effectively in it, despite possessing a brain that weighs less than a milligram and carries fewer than 0.01% as many neurons as ours does. Although most insects lack stereo vision, they use a number of ingenious strategies for perceiving their world in three dimensions and navigating successfully in it.

The talk will describe a series of experiments designed to understand how flying insects perceive the world in three dimensions, and navigate safely in it. To a large extent, moment-to-moment navigational cues are derived from the patterns of image motion that are created by the environment in the eyes of the flying insect. For example, distances to objects are gauged in terms of the apparent speeds of motion of the objects' images. Objects are distinguished from backgrounds by sensing the apparent relative motion at the boundary. Narrow gaps are negotiated safely by balancing the apparent speeds of the images in the two eyes. The speed of flight is regulated by holding constant the average image velocity as seen by both eyes. This ensures that flight speed is automatically lowered in cluttered environments, and that thrust is appropriately adjusted to compensate for headwinds and tailwinds. Visual cues based on motion are also used to compensate for crosswinds, and to avoid collisions with other flying insects. Bees landing on a horizontal surface hold constant the image velocity of the surface as they approach it, thus automatically ensuring that flight speed is close to zero at touchdown. Bees approaching a vertical surface hold the rate of expansion of the image of the surface constant during the approach, again ensuring smooth docking. Foraging bees gauge distance flown by integrating optic flow: they possess a visually driven ‘odometer’ that is robust to variations in wind, body weight, energy expenditure, and the properties of the visual environment. Path integration during long-range navigation is accomplished by combining directional information from the bee’s ‘celestial compass’ with the odometric information generated by the optic flow.
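
A minimal sketch of such a visually driven odometer (the function and sampling scheme are assumptions for illustration, not a model of the bee's circuitry):

    import numpy as np

    def odometer(image_angular_velocity, dt):
        """Distance proxy: total image travel (radians) accumulated en route.

        For a lateral viewing direction, image angular velocity is roughly
        ground_speed / distance_to_surface, so the accumulated total grows
        with distance flown largely independently of wind or airspeed --
        the robustness noted above."""
        return float(np.sum(image_angular_velocity) * dt)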

We have been using some of the insect-based strategies described above to design, implement and test biologically-inspired algorithms for the guidance of autonomous terrestrial and aerial vehicles. Application to manoeuvres such as visually stabilized hover, gorge navigation, attitude stabilization, and terrain following will be described, if time permits.

Bill Warren (Brown U)

Behavioral dynamics of visually-guided locomotion

How do people generate paths of locomotion through a complex changing environment? Behavioral dynamics studies how stable patterns of behavior emerge from the interaction between an agent and its environment, which is typically non-stationary and unfolds over time. In this talk, I will describe our effort to model the dynamics of visually-guided steering, obstacle avoidance, interception, pursuit-evasion, and shadowing, based on experiments in an ambulatory virtual environment. By combining these components, we seek to predict paths of locomotion in more complex situations, and ultimately to model the collective behavior of crowds. The results demonstrate that locomotor paths can be understood as emerging on-line from the agent-environment interaction, making internal models and explicit path planning unnecessary.
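
The lab's published steering-dynamics models have roughly the following shape: heading is a damped variable attracted toward the goal direction and repelled from obstacle directions, with distance-dependent gains. The sketch below is illustrative; the parameter values and exact functional form are assumptions, not the fitted model:

    import numpy as np

    def heading_accel(phi, dphi, psi_goal, d_goal, obstacles,
                      b=3.25, kg=7.5, c1=0.4, c2=0.4,
                      ko=198.0, c3=6.5, c4=0.8):
        """Angular acceleration of heading phi (radians): damping, attraction
        toward the goal direction psi_goal (stronger for nearer goals), and
        repulsion from each obstacle direction psi_o at distance d_o."""
        acc = -b * dphi - kg * (phi - psi_goal) * (np.exp(-c1 * d_goal) + c2)
        for psi_o, d_o in obstacles:
            acc += (ko * (phi - psi_o) * np.exp(-c3 * abs(phi - psi_o))
                    * np.exp(-c4 * d_o))
        return acc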

Daniel Wolpert (Cambridge U)

Structures and statistics in sensorimotor control

The effortless ease with which we move our arms, our eyes, even our lips when we speak masks the true complexity of the control processes involved. This is evident when we try to build machines to perform human control tasks. While computers can now beat grandmasters at chess, no computer can yet control a robot to manipulate a chess piece with the dexterity of a six-year-old child. I will review our recent work on how humans learn to make skilled movements, covering probabilistic models of learning, including Bayesian and structural learning, as well as decision making and the revision of decisions in the face of uncertainty.
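
In the simplest Gaussian setting, the Bayesian learning referred to here reduces to reliability-weighted averaging of current sensory evidence with prior experience (generic notation; the talk covers richer structural-learning settings):

    \hat{x} = \frac{\sigma_{\text{sense}}^{-2}\, x_{\text{sense}} + \sigma_{\text{prior}}^{-2}\, \mu_{\text{prior}}}{\sigma_{\text{sense}}^{-2} + \sigma_{\text{prior}}^{-2}}

The noisier the senses, the more the estimate is drawn toward the prior, which is the signature of Bayesian sensorimotor integration.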