9/28/2006

Cultivating Cognition: The "Brain Fitness" Movement

One topic of increasing popular interest is "brain fitness," or how to maintain and further develop cognitive abilities. One need not look farther than the current sudoku craze to see that people everywhere are eager to keep themselves mentally engaged. But to what extent can the outlets for this urge be informed by the cognitive and neural sciences?

At the forefront of this field is Sharp Brains, a company whose advisory board includes several neuroscientists. For example, one of the board members is Dr. Elkhonon Goldberg, professor of Neurology at NYU and author of "The Wisdom Paradox." As reviewed in the popular science press, Goldberg argues that old age need not be viewed solely as a period of cognitive decline, but that instead certain cognitive abilities actually continue to develop beyond adulthood. The book also investigates whether it may be possible to ward off cognitive decline with regular exercise - both mental and physical.

Dr. Juliana Baldo - another member of the Sharp Brains advisory board - has written extensively on the role of language in problem solving and executive function. Baldo has published work that begins to dissociate response interference from "conceptual" interference in the Stroop task, and has also developed an interesting new test of executive function. Much of Baldo's work has involved patients with frontal damage and aphasia.

Sharp Brains has partnered with other companies bringing science-based brain fitness products to market. For example, it works with CogMed, a Swedish startup offering computer-based working memory training for kids with ADHD. The results of such training, published in JAACAP last February, showed that training improved performance on digit span, Stroop, and Raven's Progressive Matrices, and that these gains were stable at follow-up. Notably, the effects of CogMed's training seem to transfer, or generalize, beyond the specifics of the training paradigm.

Perhaps cognitive science's first major commercial application was in the field of human-computer interaction (HCI), where software interfaces are made more usable by incorporating the limits of human cognition as a software design constraint. Clearly, there are many parallels between HCI and the emerging "brain fitness" market, where cognitive science informs the design of software to extend the limits of human cognition. Particularly interesting are the ways in which these two fields might interact.

As in the HCI movement of the '90s and, before that, the Human Factors movement of the '60s and '70s, brain fitness is a field where basic research is being put directly into real-world use. It's important both for the users of these new products and for the field as a whole that these products be grounded in rigorous science.

Interview With Torkel Klingberg

Just wanted to point out that Sharp Brains has an interview with Torkel Klingberg, lead of the Developmental Cognitive Neuroscience Lab at the Stockholm Brain Institute. Fascinating stuff!

9/25/2006

Developmental Toys

"You are worried about seeing him spend his early years in doing nothing. What! ... Nothing to skip, play, and run around all day long? Never in his life will he be so busy again."
- Jean-Jacques Rousseau, Emile, 1762


In Rousseau's day, very little was known about the science of cognitive development. But even then, it was clear that "child's play" is not mere distraction or lazy pastime - in fact, it is a critical process by which children discover the world and eventually become capable of thinking like adults.

The past 50 years have seen an explosion of research on the mechanisms underlying brain and cognitive development, but only recently has this work begun to inform the kinds of early-life experiences we give our children. Specifically, "enriched play" is a relatively new concept, gaining traction across the research landscape: as detailed in this post, this is an active area of research at the MIT Media Lab, the Human-Computer Interaction Lab of the University of Maryland, and in the Craft Technologies Lab at the University of Colorado, Boulder.

To learn more about this "enriched play" movement in the consumer world, I sent a series of questions to Tiny Love - a company specializing in developmental toys for the consumer market.
DI: Who started Tiny Love, and what is its history?

TL: Tiny Love was founded 15 years ago, in 1991, by Shoshi and Isaac Oren, who owned a chain of retail stores for children's products and decided to expand into the developmental toys field.

DI: What is the toy-development process and creative process like at Tiny Love?

TL: Much like everybody else, we start with an idea. The idea can come either from inside - from the management, the art director, or an account manager who receives requests and input from his clients - or from outside: students who participate in a Tiny Love sponsored course, for example. We make a point of fostering relationships with design schools, design students, and young designers, and they reward us with fresh ideas.

There are two factors that come into play in the toy development process. First, it's important to us to be innovative and pioneering in terms of redefining the boundaries of existing toy categories, and even in terms of creating new categories altogether, so we try to invest our creative energies in these directions. Second, maybe the most important aspect of our toys is the 7 elements system we developed, which makes sure the toy provides the baby with the appropriate developmental values. The 7 elements system, in short, provides the parent with a point of reference as to the different stages of development the baby should go through, and the ways to encourage her in achieving them. Our website offers an extensive explanation of the 7 elements system, including cross-sections by age, if you would like to read more about it.

This is where our in-house developmental psychologist gets into the picture. She reviews the toy thoroughly to make sure it is up to par vis-à-vis the developmental values it supplies and the age group it is intended for. We also consult other experts, such as a physiotherapist and an occupational therapist, who contribute their professional knowledge to the features and general usability of the toy.

DI: What recent research work have you found inspiring?

TL: There are more than a few examples where our products relate directly to recent studies in child development. One example would be our DVD MAGIQ product, which responds to the increasing need to mitigate young children's exposure to television. In recent years a growing number of studies have pointed to the "couch potato problem": children are watching too much television and becoming more and more passive, a phenomenon that has a host of harmful physical and emotional consequences. In the face of this reality, new research has established the distinction between passive and active viewing, as well as the latter's potential benefits for children - for example, "Growing up with Television - The Small Screen in the Lives of Children and Adolescents" (Open University Press, 2002), by Professor Dafna Lamish of Tel Aviv University's Media Studies Dept. Our DVD MAGIQ product is designed to encourage active viewing in children and babies by creating a real interaction between the child and a toy animal that is synchronized to respond to the DVD program. This technique can potentially change the entire paradigm of watching television by transforming it into a more dialogical activity.

Another example is our new Activitot series, which encourages babies to spend time playing on their tummies, in response to research finding that babies who did not spend enough time on their tummies did not develop as well as those who did.

DI: What advice do you have for people who would like to make developmentally-informed toys?

TL: Working at Tiny Love, one realizes that people come up with good ideas all the time. The difficult part is getting from that good idea to the point where you actually have a toy to sell. You need to go through a complicated development and production process, and you need to market and sell your product. So, our advice for making developmentally-informed toys is: come to us with your brilliant ideas, and we'll put them to work in our tried and true system.
Thanks to Shiri Percinger-Cohen for this opportunity to take a peek inside Tiny Love Toys.

Related Posts:
Profile: Mark Tilden (of Wowwee Toys)
Intelligent Adaptive Toys

9/22/2006

Attentional Control and Active Maintenance

Yesterday I described how some attention control mechanisms might interact with working memory. For example, momentary failures of goal maintenance could result from stochastic dopamine fluctuations, and thus produce errors or extremely long reaction times on tasks involving interference. On the other hand, the primary (unskewed) distribution of reaction times on such tasks can be explained by a time-consuming process of "competition resolution," which might also differ in efficiency between individuals. In summary, measures of working memory capacity may reflect not only 'capacity' but also individual differences in the ability (a) to maintain representations (of goals or specific stimuli) in a highly active state, and (b) to suppress representations that might interfere with the to-be-remembered items.

In a 2005 Memory and Cognition paper, Hester and Garavan approach working memory in a different way: they find that WM loads impair both switching and inhibitory ability in a material-specific way.

In their first experiment, subjects studied a list of 2, 5, or 8 letters, and were then presented with a series of trials in which they had to judge whether a given letter was a member of the previously studied list. Some of the letters presented for this judgment were colored, indicating that subjects had to perform a secondary task (make a vowel/consonant judgment or judge whether the letter's color was red or green).

Not surprisingly, all trial types (memory judgments, vowel/consonant judgments, and red/green judgments) were harder when subjects were remembering 8 letters as opposed to 2 (as reflected in both reaction times and, to a lesser extent, accuracy). Paradoxically, it took subjects longer to switch from the primary to the secondary task if the colored letter was part of the studied list, and this difference increased with the number of items that had been studied. Likewise, it took subjects longer to switch back to the primary task if the next stimulus was a member of the studied list, and this difference also increased with memory load. The authors suggest that subjects may have suppressed these memorized items in order to perform the secondary task.

In a second experiment, Hester & Garavan had subjects study lists of 1, 3 or 5 items, and then presented them with a series of letters which may or may not have been members of that studied set. Subjects were required to give a response for each stimulus that was not a member of the studied list (which was the case for over 90% of the trials); however, for presented items that had been studied, subjects were required to withhold a response. As in the previous experiment, accuracy decreased as memory load increased. However, the reaction time on incorrect trials (where subjects should have withheld response) was much faster than that on correct trials.

A third experiment differed from the second only insofar as subjects were also required to withhold a response to any item that had also been presented on the previous trial. As in previous experiments, accuracy decreased as memory load increased - but this time, only for items that had been maintained in memory. In other words, performance on the "repeat" items was unaffected by memory load.

In summary, Hester & Garavan found that it takes longer to switch away from, and to switch back to, items that are maintained in memory for another task, and that this cost grows with the number of items maintained. Likewise, they showed that the same trend applies to inhibitory control: it is generally more difficult to inhibit responding to items maintained in memory than to other stimuli, and more so under high memory loads. This work suggests that working memory and executive control interact, in the sense that it becomes more difficult to control competing responses under WM load - particularly responses related to the items maintained in WM.

9/21/2006

Interactions of Memory and Attention: Goal Maintenance Failure and Biased Competition

What differentiates short-term memory and working memory? According to one perspective, working memory tasks require the dynamic online manipulation of information, whereas in short-term memory tasks, information must merely be kept active for later recall. For example, digit span might be considered a short-term memory task, in which the proper string of numbers can merely be repeated moments later. In contrast, operation span would be considered a working memory task, in which some to-be-remembered information must be kept active while other, distracting information is used to perform simple arithmetic operations.

What functions are used to manipulate information in working memory (WM) tasks, but not in short-term memory (STM) tasks? In their 2003 JEP:G article, Kane & Engle argue that the difference is controlled attention, but we might as well call this executive function: processes like goal maintenance, updating, inhibition, set-shifting, and selection might be involved in working memory, but not short-term memory tasks.

As Kane & Engle point out, the idea that "WM = STM + controlled attention" is supported both by latent factor analyses and by the high correlation between working memory tasks and nonverbal tests of general fluid intelligence - a correlation that is unaffected by partialling out the variance that WM tasks share with STM tasks. Recent work with the operation span (OSPAN) working memory measure likewise suggests that WM's correlation with several high-level measures of cognition, including nonverbal tests of general fluid intelligence, does not depend on whatever controlled attention is shared with simpler short-term memory tasks. Other work shows that low-span individuals are more likely to succumb to the "cocktail party effect," in which a participant notices his or her own name in an unattended stream of speech. High-span individuals are also more able to succeed at the anti-saccade task, in which subjects must quickly look away from a sudden-onset visual stimulus. As a whole, this evidence suggests that domain-general attentional control processes contribute to WM, but not STM, span measures.

At a high-level, one can interpret these attention control processes as helping to resolve interference between bottom-up perceptual information and the intended action or goal. This might happen through enhanced activation of the intended goal, or through inhibition of irrelevant stimuli (in fact, these are incredibly hard to distinguish empirically and may ultimately be two sides of the same coin).

How can we measure such goal maintenance? One pillar of cognitive psychology research is the Stroop task, in which subjects must repeatedly suppress the urge to read a word (such as "red") and must instead name the ink color in which the word is printed ("yellow"). Failures of goal maintenance are particularly likely when a congruent stimulus ("red") precedes an incongruent stimulus ("yellow"). In that case, subjects may momentarily lapse in their maintenance of the goal ("name the ink color"), since the bottom-up perceptual input leads to the same response as the intended goal. Re-strengthening the goal may then take longer (due to competition), and so the subject may be slowed (or less accurate) in responding to the subsequent incongruent stimulus.

How might this goal maintenance ability interact with measures of WM span (which require attentional control)? This is the question investigated by Kane & Engle, in a series of experiments with the Stroop task. Critically, they view Stroop performance as driven by two processes, attention AND memory: in their own words, "resolution of the response competition between color and word dimensions in the Stroop task, an attentional process, will only be engaged when the goal to do so is sufficiently maintained in active memory."

Here are the results of their experiments:
  1. In a first experiment, high-OSPAN subjects were more accurate than low-OSPAN subjects in a Stroop task with 75% congruent trials, suggesting that low-OSPAN subjects were more likely to lapse in their active goal maintenance. Additionally, low-OSPAN subjects showed a greater benefit for congruent stimuli over neutral stimuli (i.e., responses to "red" in red ink were faster than to "xxx") than did high-OSPAN subjects, again suggesting that they were not as successful at limiting their attention to ink color alone.
  2. In a second experiment, Kane & Engle had each subject first complete the 75% congruent condition followed by a 0% congruent condition, with feedback following every trial. The results replicated those of the first experiment, but low-spans made slightly fewer errors than they had previously (suggesting that the error feedback helped them correctly maintain the task goal).
  3. The third experiment differed from the second only in that the 0% congruent condition was presented first, followed by the 75% congruent condition. However, the results were quite different: now low-OSPANs were less accurate than high-OSPANs in the 0% condition, and were far slower than high-OSPANs in both the 0% and 75% congruent conditions, but were not less accurate in the 75% condition. The authors interpret this to suggest that goal maintenance was made much more likely throughout the experiment by requiring subjects to undergo a condition where the goal needed to be maintained on every trial, and so instead of accuracy differences, only reaction time showed OSPAN differences (reflecting the attentional resolving of competition between representations).
  4. By combining the data from Experiments 2 & 3, Kane & Engle showed that low-OSPANs' reaction time difference between incongruent and neutral trials (i.e., "red" vs. "xxx") is larger than that of high-OSPANs regardless of task order for the 0% condition. However, task order does make a difference for the 75% condition: high-OSPANs make fewer errors than low-OSPANs on the incongruent/neutral comparison when the harder condition is presented first (75% first, experiment 2), whereas high-OSPANs merely show smaller reaction time differences between incongruent and neutral trials than low-OSPANs when the easier condition is presented first (experiment 3).
  5. In a fourth experiment, Kane & Engle replicated the first task-order effect described above with different congruency percentages: similar to the 0% condition, a 20% congruent condition revealed a smaller RT difference between incongruent/congruent trials among high-OSPANs than among low-OSPANs, regardless of when this condition was completed. However, an 80% congruent condition showed span differences between incongruent/congruent in terms of both errors and reaction times regardless of task order, whereas previously the 75% condition had shown such differences in errors only when presented first, and in reaction times only when presented second. These discrepancies might be explained by the fact that previous interference calculations subtracted neutral from incongruent trials, while those in experiment 4 subtracted congruent from incongruent trials. It's also possible that the 20% congruent condition did not reinforce the need for goal maintenance as strongly as the 0% condition had previously, and therefore low-OSPANs were more likely to fail to properly maintain the goal if next presented with the harder condition.
The authors interpret these results in a framework where Stroop performance is determined by two factors: attention and memory. Specifically, errors are thought to result from memory failure - a failure to actively maintain the goal in mind. Reaction time slowing, on the other hand, is thought to result from attentional processes - a failure to quickly bias competition toward the correct representation and away from the incorrect one. Kane & Engle interpret low-OSPAN subjects as having a consistent problem with resolving competition between representations, along with a tendency, in some circumstances, to fail to maintain the task-relevant goals.

Of course, this dual-component interpretation rests on the idea that goal maintenance cannot directly bias other competing representations, but relies on an additional attentional process to resolve this competition. In contrast, many computational models instantiate these as "one and the same." So, what other data support the distinction made by Kane & Engle?
  • Across all subjects, the amount of RT facilitation (i.e., how much faster congruent trials are than neutral trials) correlates with error interference (i.e., how much more accurate neutral trials are than incongruent trials), suggesting that goal maintenance failure is behind both of these phenomena. In contrast, there is no correlation between RT facilitation and RT interference (nor between error and latency interference), as there would be if goal maintenance failure gave rise to all of these measures.
  • On high-congruency Stroop tasks, schizophrenics show increased errors on incongruent relative to congruent trials, and increased facilitation on congruent relative to neutral trials.
  • The distribution of response times on Stroop tasks indicates that incongruent RTs are not only shifted positively by a specific amount (reflecting the increased competition between representations), but are also positively skewed, due to a few trials that take much longer than the others. These slow trials are thought to reflect momentary failures of goal maintenance.
  • This positive skew is exaggerated among older adults, and in young adults when trials are presented very slowly (providing more opportunity for mind-wandering).
  • ERP studies of Stroop tasks have identified a wave that may originate from anterior cingulate (ACC) and appears to correspond to response selection and competition processes; in contrast, the activity of a different wave up to 800 ms before stimulus presentation predicts correct performance on the next stimulus (and appears to originate from polar or dorsolateral frontal cortex [dlPFC]).
  • Event-related fMRI shows a strong negative correlation between delay-period dlPFC activity and Stroop interference, whereas ACC activity is tied to the presentation of incongruent stimuli.
Although it may seem more parsimonious to suggest that a single mechanism - active maintenance of goal-relevant information - is responsible for Stroop performance, Kane & Engle have presented an abundance of evidence suggesting that two components of active maintenance may be dissociable: momentary failures to maintain the goal, and the time-consuming process of resolving competition between representations through biasing. Many computational models provide a clear way of visualizing biased competition, but only the most recent (see, for example, this one) include a possible mechanism for stochastic goal maintenance failure.
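To make the dual-component account concrete, it can be sketched as a toy simulation (all parameter values below are hypothetical, chosen only for illustration): every incongruent trial pays a fixed competition-resolution cost, while occasional goal-maintenance lapses end either in an error or in a very slow correct response - producing both error interference and a positively skewed RT distribution.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_incongruent_trials(n=20_000, p_lapse=0.10, base=0.50,
                                competition_cost=0.10, recovery_time=0.40,
                                noise_sd=0.05):
    """Toy dual-component model of incongruent Stroop trials.

    Every trial pays a fixed 'competition resolution' cost, which simply
    shifts the RT distribution. On lapse trials the goal is momentarily
    lost: half end in an error (word reading wins), half recover after
    extra time (producing the positively skewed tail). All parameters
    are made up for illustration.
    """
    rt = base + competition_cost + rng.normal(0.0, noise_sd, n)
    lapse = rng.random(n) < p_lapse
    recovered = lapse & (rng.random(n) < 0.5)   # slow-but-correct lapses
    error = lapse & ~recovered                  # failed lapses -> errors
    rt[recovered] += recovery_time * rng.random(recovered.sum())
    return rt[~error], error.mean()

correct_rt, error_rate = simulate_incongruent_trials()
z = (correct_rt - correct_rt.mean()) / correct_rt.std()
skew = (z ** 3).mean()   # sample skewness of correct-trial RTs
```

With these made-up parameters, mean correct RT shifts only modestly while the correct-trial distribution comes out strongly right-skewed - the signature pattern attributed to occasional lapses superimposed on a uniform competition-resolution cost.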

9/20/2006

Reminder: Encephalon Vol 7

Don't forget to submit your best posts to Encephalon this week!

Send your submissions to: encephalon.host@gmail.com

You can find the guidelines for submissions here.

9/18/2006

Sensory Gating by Prefrontal Cortex

In a 2005 JOCN paper, Brad Postle argues that the dorsolateral prefrontal cortex is not responsible for storing information in short-term memory, but rather that it carries out control operations, including the "gating" or modulation of activity in more posterior regions. According to the argument in this paper, short-term storage of information is achieved by sustained firing in modality-specific posterior regions, and DLPFC's role is to make sure that distracting or irrelevant stimuli do not interfere with the maintenance of that information.

To support this argument, Postle scanned 16 young adults with fMRI, presenting them first with a target face, followed by a 7-second delay, followed by a probe face. On "memory present" trials, subjects had to identify whether the probe face was the same as the target face, disregarding any other faces that might have been presented during the delay period. On "memory absent" trials, subjects had to identify whether intervening distractor faces had been present during the delay period. The author estimated the hemodynamic response function for each participant, and identified two regions of interest on each subject's anatomical scans: the dorsolateral prefrontal cortex (DLPFC) and the inferior occipito-temporal cortex (IOTC).

The results indicate that DLPFC massively increases its activity during the delay period when a memory judgment is required, but interestingly, IOTC shows the opposite pattern, in that activity is actually slightly higher when a memory judgment is not required. Postle interprets this to show that DLPFC is suppressing or gating activity in posterior cortex to help it resist interference from distractors.

However, several caveats should be mentioned:
  • For trials where memory judgments are not required, subjects were instead required to indicate whether distractor faces were present. For these trials, subjects may have actively recruited IOTC regions during the delay period, so as to increase their sensitivity to distractors; therefore, the difference between memory-absent and memory-present IOTC activity may reflect active recruitment rather than suppression of activity.
  • Decreased activation in IOTC could also reflect "streamlined processing" of the maintained target face. In other words, DLPFC may be maintaining a "skeleton representation" of the target face during the delay, against which representations of the distractor faces would have to compete. In this way, DLPFC is not actively suppressing distracting information, but is doing so indirectly, by powerfully maintaining a specific representation with which distractors must compete.
  • While DLPFC is clearly differentially engaged by memory absent and memory present trials, the case for differences in IOTC activity is not so clear. For example, DLPFC increased its activity by 139% while IOTC decreased its activity by only 7.5% on memory-present relative to memory-absent trials.
  • Another alternative account is that the increase in PFC activity on memory-present trials actually reflects dual-tasking: subjects may be maintaining information about the target face while accidentally processing some aspects of the distractor faces. Although DLPFC activity was numerically stronger on distractor-absent trials, this difference did not reach significance, so it could be due to chance.
  • The gating explanation is also somewhat less parsimonious than the idea that DLPFC itself stores the information, because it requires believing that IOTC is actually maintaining the target face information despite decreases in activity relative to trials where no memory is required, and then assuming that this decrease exists because distractor faces are being suppressed. In contrast, one could explain the apparent decrease in IOTC activity as resulting from comparison against a task whose explicit goal was to identify distractor faces.
Quibbles aside, this is a nice demonstration of how working memory might get done - through maintenance (or operations controlling that maintenance) as related to task demands. However, this study does not offer conclusive evidence about where information is actually stored, and how it is made robust to potential interference.

9/15/2006

High Gamma Modulation in Cortex

In today's issue of Science, a new paper by Canolty et al. demonstrates a different kind of frequency multiplexing than has been described previously. Using electrodes implanted directly on the cortices of 5 epileptic patients, they were able to show "cross-frequency coupling," in which the amplitude of neuronal oscillations between 80 and 150 Hz was strongly modulated by the phase and amplitude of theta rhythms.

Readers should note that brain oscillations between 80 and 150 Hz are called "high gamma," and have only recently been explored scientifically. These oscillations are too fast to be reliably detected from the scalp using traditional EEG techniques. Furthermore, because invasive EEG recording is dangerous, only patients with an existing need for these implants (such as epileptics) can be used as subjects.

Previous evidence has indicated that theta-band power may be related to executive function, insofar as theta is sensitive to task demands. Canolty et al describe how, across all their tasks and subjects, over 85% of electrodes showed strong theta/high gamma coupling at significant levels (p < .001). High gamma activity was most prevalent at the trough of the theta wave, showing that gamma activity is sensitive to theta phase information. Additionally, high gamma activity was strongest when theta amplitude was highest, showing that gamma is also sensitive to theta amplitude information.

The authors also showed that the locations of electrodes exhibiting theta/high gamma coupling were sensitive to changes in task, such that the same task always evoked a similar "topography" of coupling, while different tasks evoked a variety of patterns. The authors conclude that these data further support the idea that cross-frequency coupling (or, as John Lisman has termed it, multiplexed oscillation) is an organizing principle of neural computation.
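The coupling analysis itself is easy to demonstrate on synthetic data. The sketch below is a simplified stand-in for the paper's actual method (which used band-pass filtering plus a permutation-tested modulation index): it builds a signal whose 100 Hz "high gamma" amplitude peaks at the trough of a 6 Hz theta rhythm, then recovers that relationship by comparing the gamma envelope near the theta trough and peak.

```python
import numpy as np

def analytic(x):
    """Analytic signal via FFT (the same construction Hilbert-transform
    routines use), giving instantaneous phase and amplitude."""
    n = len(x)
    X = np.fft.fft(x)
    h = np.zeros(n)
    h[0] = 1.0
    if n % 2 == 0:
        h[n // 2] = 1.0
        h[1:n // 2] = 2.0
    else:
        h[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(X * h)

def bandpass(x, fs, lo, hi):
    """Crude FFT-mask band-pass filter - fine for a synthetic demo."""
    X = np.fft.fft(x)
    f = np.abs(np.fft.fftfreq(len(x), 1.0 / fs))
    X[(f < lo) | (f > hi)] = 0.0
    return np.fft.ifft(X).real

# Synthetic 10 s "recording": 6 Hz theta plus a 100 Hz high-gamma
# oscillation whose amplitude is largest at the theta trough, plus noise.
fs = 1000.0
t = np.arange(0.0, 10.0, 1.0 / fs)
rng = np.random.default_rng(0)
theta = np.cos(2 * np.pi * 6 * t)                    # trough at phase +/- pi
gamma_env = 0.5 * (1.0 - np.cos(2 * np.pi * 6 * t))  # peaks at the trough
signal = (theta + gamma_env * np.cos(2 * np.pi * 100 * t)
          + 0.1 * rng.standard_normal(len(t)))

phase = np.angle(analytic(bandpass(signal, fs, 4, 8)))    # theta phase
amp = np.abs(analytic(bandpass(signal, fs, 80, 150)))     # gamma envelope

# Mean high-gamma amplitude near the theta trough vs. near the peak
trough_amp = amp[np.abs(np.abs(phase) - np.pi) < np.pi / 4].mean()
peak_amp = amp[np.abs(phase) < np.pi / 4].mean()
```

On this synthetic signal, the mean high-gamma envelope near the theta trough comes out several times larger than near the theta peak, mirroring the trough-locked gamma that Canolty et al observed.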

Related Posts:
The Argument for Multiplexed Synchrony

9/14/2006

Two Connectionist Models of Reading

Many languages have more regular letter-to-sound mappings than English. How does this affect young language learners?

One test, known as nonword reading, measures the ability of language learners to produce pronunciations of novel nonwords "by analogy" with the words they have previously learned. For example, some researchers have compared nonword reading among learners of different languages by creating nonwords from high-frequency number words. Some studies show English language learners at nearly half the performance of their German peers at 7 years of age, and still somewhat behind at 12.

It is important to understand why this occurs, and what can be done about it. Such mechanistic questions can be well addressed using computational models. In their 2004 Cognition paper, Hutzler et al review two connectionist models (by Plaut et al, 1996, and Zorzi et al, 1998) of this and related phenomena.

As reviewed by Hutzler et al, the Plaut model consists of three layers - an orthographic input layer (105 units), a hidden layer (100 units), and a phonological output layer (61 units) - and is trained with backprop on 3000 words for 300 epochs. The model successfully simulates skilled reading of novel nonwords, and shows the "frequency by consistency" interaction in English - in other words, words are read faster if their pronunciation is more consistent with spelling rules, but only for low-frequency words. Hutzler et al implemented this model, training one network on English word-to-sound mappings and another on German word-to-sound mappings.
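At toy scale, the essential machinery of such a model - a hidden layer trained with backprop between orthographic input and phonological output - can be sketched in a few lines. The network below is emphatically not the 105-100-61 model (its sizes and task are invented for illustration); it simply shows that a hidden layer lets backprop master a context-dependent, "inconsistent" mapping that no direct input-to-output mapping could learn.

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

# Toy stand-in for a spelling-to-sound problem: the correct "sound" for
# each input depends on its context (an XOR-like, inconsistent mapping).
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([[0.0], [1.0], [1.0], [0.0]])

# Tiny three-layer network (input -> hidden -> output), trained with backprop.
n_hidden = 16
W1 = rng.normal(0, 1.0, (2, n_hidden)); b1 = np.zeros(n_hidden)
W2 = rng.normal(0, 1.0, (n_hidden, 1)); b2 = np.zeros(1)
lr = 1.0

for _ in range(10_000):
    H = sigmoid(X @ W1 + b1)          # hidden-layer activations
    Y = sigmoid(H @ W2 + b2)          # "phonological" output
    dY = Y - T                        # output error signal
    dH = (dY @ W2.T) * H * (1 - H)    # error backpropagated to hidden layer
    W2 -= lr * H.T @ dY / len(X); b2 -= lr * dY.mean(axis=0)
    W1 -= lr * X.T @ dH / len(X); b1 -= lr * dH.mean(axis=0)

Y = sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2)   # trained outputs
```

After training, all four outputs fall on the correct side of 0.5 - the hidden layer re-represents the inputs so that the context-dependent mapping becomes learnable.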

Both the German and the English networks were tested on 80 monosyllabic nonwords (e.g., fot, lank, plock). Pronunciations were considered correct if they were merely plausible - they did not have to correspond to the dominant letter-to-sound correspondences in either language. Unfortunately, these models do not capture the correct qualitative pattern of results: instead of showing large differences in nonword pronunciation that decrease over time, they show initially small differences that increase over time. Hutzler et al suggest that this failure might be due to the multiple layers used here, which could delay the advantage of spelling-sound consistency (although I think it might also result from using backprop without more bottom-up Hebbian learning rules).

Hutzler et al also implemented Zorzi's two-layer associative model, which learns orthography-to-phonology mappings with a delta learning rule - in other words, a rule that changes connection weights based on the difference between produced output and target output. There are 208 orthographic input units and 44 phonological output units (each replicated across 7 slot positions, yielding a total of 308 outputs); these layers are fully interconnected, and unit activation is determined by a standard sigmoidal function of the weighted sum of input activations.
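A minimal sketch of such a two-layer delta-rule network follows. The unit counts match the description above; the random patterns, learning rate, and epoch count are assumptions standing in for the real corpus and training schedule:

```python
import numpy as np

rng = np.random.default_rng(1)

# 208 orthographic inputs; 44 phoneme units in each of 7 slots = 308 outputs
N_ORTH, N_PHON = 208, 308

# Toy random patterns standing in for the real orthography->phonology corpus
X = rng.integers(0, 2, size=(40, N_ORTH)).astype(float)
Y = rng.integers(0, 2, size=(40, N_PHON)).astype(float)

W = np.zeros((N_ORTH, N_PHON))    # fully interconnected, no hidden layer

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 0.1
errors = []
for epoch in range(200):
    out = sigmoid(X @ W)          # sigmoid of the weighted input sum
    delta = Y - out               # delta rule: target minus produced output
    W += lr * X.T @ delta / len(X)
    errors.append(np.mean(delta ** 2))
```

Because there is no hidden layer, this is a single linear decision stage squashed through a sigmoid; as the article notes, such a network can only approximate an inconsistent mapping, so training would be stopped when the error curve bottoms out rather than when all words are mastered.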

As Hutzler et al note, this network has no hidden layer and thus can learn only linear functions. As a result, it never learns to correctly pronounce all words - training is instead stopped when errors reach a global minimum. As before, one version of the network was trained on English and another on German. After training, the German network shows a consistent advantage in nonword pronunciation over the English network, which remains (and perhaps even widens) throughout training. However, this still does not perfectly match the human results, which show a wider advantage at the beginning of training.

Hutzler et al then ask whether this difference might be explained by pedagogical differences, and address this question by "teaching" each network differently. To understand the logic here, consider that direct training of letter-to-sound correspondences is more difficult when each letter-sound relationship depends more on surrounding letters. This is the definition of an inconsistent spelling-to-sound mapping, such as is frequently found in English. Thus, English pronunciation requires training on entire words. In contrast, German can be effectively taught on the basis of single phonemes or syllables, without appeal to entire words. The increased effectiveness of simpler training methods may provide an early learning advantage to German readers.

To test this hypothesis, the authors used training events reflecting the spelling-to-sound rules taught in common English and German phonics programs. The Zorzi two-layer model was pre-trained with these phonics programs, then trained on the previous corpus, and tested on nonword pronunciation throughout training. This time, the results fit the human data much better - the German-trained network shows an initial advantage of around 35%, which decreases to 10% by the end of training.

Could English learners benefit more from a different kind of phonics program? Probably, but it's hard to know what sort of training is best, given that Hutzler et al do not attempt to search the "training space" to find an ideal set of training material. In fact, this kind of analysis is rarely done, probably because input representations are sometimes somewhat arbitrary and are thought to reflect a "weak point" in many connectionist models.

But there are also reasons to think that such an analysis would be premature. For one, these models use only error-driven learning, which is generally good for learning specific tasks but is not particularly good for picking up statistical regularities in the environment. For this type of learning, Hebbian rules are ideal. It is difficult to predict how these networks might differ if Hebbian learning rules were incorporated into the training paradigms.

9/13/2006

Backward Inhibition: Evidence and Possible Mechanisms

Some psychologists believe the human ability to flexibly switch between tasks relies critically on the ability to activate a new task-set, or high-level representation of the new task. Yet others believe that flexible switching must also involve the active inhibition of an old task-set. What kind of evidence suggests that such inhibition is a critical component of task-switching, and that active maintenance of a new task-set is not enough?

Some evidence for the idea that task sets must be inhibited comes from Mayr & Keele's 2000 JEP:G article, in which they conduct a series of experiments where subjects select the object "that doesn't belong" among a set of 4 objects. Three objects are the same color, while one is a different color. Another object has a different orientation than the other three. And a third object is moving, while the others remain still. How, then, does a subject decide which one doesn't belong?

Mayr and Keele present a "task cue" a certain interval (the "CSI") before they display these objects - this task cue indicates whether the oddball judgment should be made on the basis of color, orientation, or motion. After responding, the subjects are not presented with another cue for a certain interval (the response-cue interval, or RCI). By varying the CSI and the RCI as follows, certain patterns might emerge:

1) Short RCI & Short CSI: in this situation, subjects should have little time to prepare a new task-set representation, and the previous task-set should have had little time to decay. Therefore, subjects should have more difficulty switching back to a task-set that had been used two trials ago - in other words, one that had recently been used, but then abandoned - than switching to a task set that had been used more than two trials ago. (This difference is referred to as "backward inhibition," or BI.)

2) Long RCI & Short CSI: here, the old task-set will have decayed, but subjects should have had little time to instantiate a new task set. In this case, subjects should show reduced backward inhibition (compared to the situation in #1) when switching to a cue that had been recently used but then subsequently abandoned.

3) Short RCI & Long CSI: in this case, subjects have a long time to instantiate a new task set; they might therefore be more able to activate the previously-used-but-more-recently-abandoned task set, and thus show less backward inhibition than in 1 (and possibly also less than in 2).

Although 2 & 3 did show less BI than 1, there was no difference between the backward inhibition found in 2) and 3) above. This indicates that switching to a previously abandoned task set is not made easier by having longer to prepare - suggesting that "task set inertia" appears to be unrelated to how strongly one is able to activate the new task. Mayr and Keele interpret this as evidence that switching involves an inhibitory process acting upon previous task sets.
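The BI measure itself is straightforward to compute from a trial sequence: compare RTs on "ABA" switches (returning to a task abandoned two trials ago) with RTs on "CBA" switches (all three tasks differ). The classification below follows that logic; the task labels and reaction times are hypothetical:

```python
def backward_inhibition(tasks, rts):
    """BI effect = mean RT on ABA switch trials minus mean RT on CBA trials."""
    aba, cba = [], []
    for i in range(2, len(tasks)):
        # only genuine switch trials that also follow a switch
        if tasks[i] == tasks[i - 1] or tasks[i - 1] == tasks[i - 2]:
            continue
        if tasks[i] == tasks[i - 2]:    # ABA: lag-2 task repetition
            aba.append(rts[i])
        else:                           # CBA: all three tasks differ
            cba.append(rts[i])
    return sum(aba) / len(aba) - sum(cba) / len(cba)

# Hypothetical data in which the single ABA trial (trial 3) carries a slower RT
tasks = ["color", "motion", "color", "orient", "motion", "color"]
rts = [500, 550, 700, 600, 610, 590]
bi = backward_inhibition(tasks, rts)   # positive value = backward inhibition
```

A positive score indicates that returning to a just-abandoned task set is costlier than returning to a less recent one, which is the signature Mayr and Keele report.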

These results suggest that "backward inhibition" is not actually a side-effect of not having fully activated the now-relevant task set, which would be an explanation easily backed-up by many computational models of task-switching. Instead, it suggests that something more complicated is going on - perhaps old task-sets are actually inhibited. But does this happen as a result of lateral inhibition among competing task-sets, or is it an active top-down process of executive control?

In a third experiment, the authors attempt to dissociate these two possible forms of inhibition ("bottom-up" versus "top-down" executive control) by changing the stimulus display such that one object differed from other three in terms of size, while another object differed from the others in terms of another dimension (color, motion or orientation). However, subjects were told only that size was always irrelevant, and that they should pick the "oddball" item on the basis of its features. In another condition, size was still irrelevant, but subjects were told which dimension would differentiate the oddball from the other items (e.g., they might be told "color").

If backward inhibition is actually a top-down phenomenon, one would expect to see BI effects only in the case where subjects were told which dimension would be relevant, because they would have deliberately suppressed the older task set. In contrast, those in the "bottom-up" condition (where they were not told the relevant dimension) should not show BI if inhibition is actually a control mechanism engaged top-down.

Essentially, this is what the authors found. In further experiments, they demonstrate that backward inhibition may make up the bulk of what is called "residual shift cost" - the amount of time taken by participants to switch to a task despite having been given ample time to prepare for the new task.

What mechanisms might subserve this process? It seems hard to imagine that the brain is actively suppressing previously used but currently irrelevant items. A mechanism that mimics these effects through active maintenance and lateral inhibition alone would be far more parsimonious.

One possibility is that recently used task sets are never completely abandoned, but instead enjoy gradually diminishing activation as new task sets are activated. If one assumes that task sets are affected by processing both at the executive level and at the stimulus level, which seems likely (and is assumed by Mayr & Keele), then an attractor network can be viewed as a candidate model.

In this case, switching to a new task set through top-down control would involve biasing the attractor state of the previous task towards the current task. Traces of the previous task representation may then become incorporated into the new stable attractor state, which represents the current task. However, after a second task-switch, it may take longer for the attractor to "revert" to a previous state. This might occur because the attractor basin - the diversity of representations that lead to the same response - had been widened to accommodate traces of the previous task. In contrast, switching to a less recently performed task might be paradoxically easier, since it involves moving to a substantially different attractor state, far outside the basins of attraction of the two most recently performed tasks.
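The attractor intuition can be made concrete with a toy Hopfield-style network, which settles into stored patterns using only Hebbian weights and no directed inhibition anywhere. This is purely illustrative (random patterns standing in for task sets), not a model of task-switching:

```python
import numpy as np

rng = np.random.default_rng(2)

N = 64
# Three random +/-1 patterns standing in for three task-set representations
patterns = np.where(rng.random((3, N)) < 0.5, -1.0, 1.0)

# Hebbian outer-product storage; note: no directed inhibition anywhere
W = sum(np.outer(p, p) for p in patterns) / N
np.fill_diagonal(W, 0)

def settle(state, sweeps=10):
    """Asynchronous updates, each of which never increases network energy."""
    state = state.copy()
    for _ in range(sweeps):
        for i in range(N):
            state[i] = 1.0 if W[i] @ state >= 0 else -1.0
    return state

# A degraded cue (a few flipped units) biased toward task set 0
cue = patterns[0].copy()
cue[rng.choice(N, size=6, replace=False)] *= -1
recalled = settle(cue)
overlap = float((recalled == patterns[0]).mean())
```

The network pulls the noisy cue back into the stored pattern's basin of attraction, which is the sense in which "reverting" to a recent task state could be graded and history-dependent without any explicit suppression signal.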

Of course, these possible mechanistic explanations are completely speculative; this account would obviously be strengthened by an actual implementation of the backward inhibition phenomenon within an attractor network that lacks directed inhibition. Yet other directed inhibition phenomena would be difficult to explain in this framework, such as evidence reviewed yesterday.

On the other hand, if directed inhibition is truly at work in the brain, one needs to explain the relative lack of long-range inhibitory connections in cortex. Thus, we arrive at a conundrum: behavioral research appears to show directed inhibition at work, but neurobiology does not provide an easy way of explaining the mechanism. It seems safe to claim that biologically-plausible computational models will be important in resolving this apparent discrepancy.

9/12/2006

Inhibition and rTMS

Imagine your phone is ringing and just as you reach for it, it stops ringing - the caller hung up. Chances are, you were able to inhibit your movement and didn't pick up the phone. What brain regions implement this kind of "stop signal"?

Neuroimaging and neuropsychological investigations have revealed that the right inferior frontal gyrus (IFG), the middle frontal gyrus, and the right inferior parietal cortex appear to be important for such inhibition. However, as Chambers et al point out in a recent issue of the Journal of Cognitive Neuroscience, each of these techniques has its drawbacks. Imaging can only tell you which regions are associated with a given function, not which are causally responsible. And while neuropsychological investigations of brain-damaged patients do provide information about causality, the power of these studies may be limited by the tendency of brain-damaged patients to show some functional reorganization in response to brain injury.

Enter repetitive transcranial magnetic stimulation (rTMS). rTMS involves directing strong magnetic fields towards a specific area of the brain. This has the effect of temporarily disrupting normal activity in that region. Using this technology, we should be able to identify which of these regions is causally responsible for inhibition, and which are merely associates.

This is exactly what Chambers et al did in two sets of experiments. First, they calibrated the stop signal reaction time (SSRT) for each of 16 subjects - SSRT is the amount of time it takes you to cancel an intended motor action. This is calculated by telling subjects to press a button every time an "X" or an "O" appears, and then telling them to "stop" on 25% of trials. Critically, the amount of delay between presentation of the stimulus and the signal to "stop" is varied. The amount of time required for subjects to successfully inhibit their response 50% of the time is called the SSRT.
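One common way to find the delay at which stopping succeeds 50% of the time is a one-up/one-down staircase on the stop-signal delay (SSD). The sketch below uses a hypothetical simulated subject with a true SSRT of 250 ms and a 450 ms go response; it illustrates the logic, not the authors' exact procedure:

```python
import random

random.seed(0)

# Hypothetical race-model subject: stopping succeeds when the stop process
# (SSD + a true SSRT of 250 ms, plus noise) finishes before the 450 ms go response
def subject_inhibits(ssd, true_ssrt=250, go_rt=450):
    return ssd + true_ssrt + random.gauss(0, 30) < go_rt

# One-up/one-down staircase: make stopping harder after each success,
# easier after each failure -> SSD converges on the 50%-inhibition point
ssd, step = 100, 25
history = []
for trial in range(300):
    history.append(ssd)
    ssd += step if subject_inhibits(ssd) else -step
    ssd = max(0, ssd)

ssd_50 = sum(history[-100:]) / 100      # SSD at ~50% successful stopping
ssrt_estimate = 450 - ssd_50            # SSRT = mean go RT minus that SSD
```

Under these assumptions the staircase hovers around SSD = 200 ms, so the recovered SSRT estimate lands near the true 250 ms, which is why varying the stop-signal delay is the critical manipulation in the task.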

Next, they put each of these subjects into an MRI scanner to identify the location of the inferior frontal, middle frontal, and angular gyri in each subject's brain. They then performed rTMS on each of these regions, accompanied by "sham TMS" conditions in which the magnet was on but directed away from their heads, in order to identify which would most significantly increase the required stop signal reaction time in each participant.

The results were striking. Disruption of activity within the right IFG, but not in either of the other regions, led to a selective increase in the SSRT relative to the sham condition - an increase of almost 30 milliseconds.

What if rTMS of the IFG results merely in suppression of arousal? In other words, wouldn't we expect a similar pattern if the right IFG were responsible for keeping subjects awake? The authors considered this possibility, but show that rTMS affected neither pupil diameter change (a putative measure of arousal) nor reaction time or accuracy on "go" trials (trials in which the stop signal was not provided).

Interestingly, rTMS of the right IFG had no effect after about 30 minutes, suggesting that this region is capable of extremely rapid functional reorganization.

9/09/2006

Blogging on the Brain: 9/3 to 9/9

Highlights from the week in brain blogging:

The picture is from a post on Andrew Carnie - a "time-based" visual artist with an interest in brain-related subject matter.

BrainTechSci reveals that "open access journals" (like PLOS) actually charge authors over $2,000 to publish a single article. Is this policy likely to result in the publication of good science?

Brainethics (and several others) all cover new findings on facial imitation in neonatal primates.

The Neurophilosopher discusses similarities in spatial memory strategies across the primates, yet differences in the developmental trajectory of these strategies.

Omnibrain covers the brain bar - an EEG-based bartender that will mix your drinks based on a custom cognitive diagnosis! More pictures at the Neurocritic.

New algorithms for adaptive classification of EEG-based brain computer interface signals, and fascinating work showing single-trial fMRI is no longer a thing of science fiction.

Neural operations that give rise to a sense of self.

AI toys only 20 years away?

9/07/2006

A Presentation on Self-Organizing Learning of Semantics

In this presentation (PPT and PDF) I summarize the two self-organizing approaches to semantic learning covered on Tuesday and Wednesday. In the presentation, I pose the "problem of learning language," and discuss the constraints that have been used to make the language problem tractable. In particular, I differentiate between constraints (or assumptions) that are theoretically grounded, and those that are merely a matter of implementation.

In the process, I explain singular value decomposition by way of an analogy to eigenfaces, and analyze the similarities between Hebbian learning mechanisms and the math underlying both latent semantic analysis and the "grounded bootstrapping" experiments of Steels & Kaplan.

9/06/2006

Watching A Language Evolve Among Robotic Agents

In yesterday's post, I described a solution to the "gavagai problem" (which states, according to one interpretation, that word learning is an intractable task because any word can in principle have an infinite number of referents, even when learned in the context of direct instruction) by Landauer & Dumais, who used a machine learning technique (called latent semantic analysis, or LSA) to extract the meaning of words by plotting a similarity relationship between each word and every other word ever encountered by the program, as a function of the contexts in which they appear.

Although showing impressive results, this approach seems to miss a fundamental and intuitive aspect of language learning: words seem to be defined primarily as relating to objects in the real world, and only secondarily as relating to other words. In other words, LSA word meanings are not "grounded" in real-world experience ... so to what extent can we think of it as truly understanding the meanings of words, human-competitive data notwithstanding?

A more intuitively fulfilling solution to the gavagai problem might explore the way in which speakers come to understand the real-world objects to which a given term refers, rather than developing a solely recursive understanding of word meaning. In their chapter in Linguistic evolution through language acquisition: formal and computational models, authors Steels and Kaplan describe experiments with two robotic agents whose job is to communicate about objects in their environment by developing their own language.

Steels & Kaplan implement this task as follows: two robots face an environment populated only by 2-D shapes of different colors on a whiteboard. Each robot consists of a camera, a categorization system, and a verbalization system, which in combination are capable of segmenting a visual image into objects, with each object defined by its spatial position and RGB color values.

The robots take turns playing the roles of speaker and hearer in what the authors call "the guessing game", in which a speaker first picks an object from the environment, and then communicates to the hearer a series of syllables that best categorize that object uniquely among all the other objects in the environment (where "best categorize" is defined in terms of weight, as discussed below). For example, the speaker may use the phrase wovota to indicate that the object of interest is in the upper left corner. The hearer will then use its database of terms and their associated meanings to pick the object it believes the speaker is referring to. The speaker then verifies whether the picked object is actually the one to which it was referring; if the hearer was correct, both the speaker and hearer increase the "weight" of the relationship between that term (wovota) and the internal representation (upper left) while decreasing the weight of all competing associations. If the hearer picked the wrong referent, the relationship between wovota and the internal representation is decreased in weight.

Such games are played thousands of times in a row, with the agents playing speaker and hearer rotated so that all agents in the environment have played the game with the others. By the conclusion of training, the agents arrive at a set of consistent - or mostly consistent - terms for objects in their environment.
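The weight-update scheme at the heart of the game can be sketched as follows. The initial weight of 0.5 and step size of 0.1 are my assumptions for illustration, not Steels & Kaplan's actual parameters:

```python
def update(lexicon, term, meaning, success, delta=0.1):
    """Adjust term-meaning association weights after one guessing game.

    lexicon maps (term, meaning) pairs to weights in [0, 1].
    """
    key = (term, meaning)
    lexicon.setdefault(key, 0.5)        # assumed starting weight for new pairs
    if success:
        lexicon[key] = min(1.0, lexicon[key] + delta)
        # competing meanings of the same term are weakened
        for t, m in list(lexicon):
            if t == term and m != meaning:
                lexicon[(t, m)] = max(0.0, lexicon[(t, m)] - delta)
    else:
        lexicon[key] = max(0.0, lexicon[key] - delta)

lex = {}
update(lex, "wovota", "upper-left", True)    # successful game strengthens the pair
update(lex, "wovota", "red", False)          # failed interpretation is weakened
```

Because every success simultaneously punishes competitors, meanings compete for predictiveness, which is exactly the dynamic that lets misinterpretations (like the "red" vs. "upper left" confusion below) get weeded out over repeated games.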

However, consider what would happen if the speaker had used the term wovota to indicate "very red," but this very red object was located in the upper left corner of the environment - in this case, the hearer would have been correct, but for the wrong reason: the term wovota was actually intended to reflect redness, but was interpreted as reflecting spatial position. In this case, future uses of the term wovota by the speaker would be necessary in order for the hearer to correctly reinterpret the phrase.

Consequently, yet another type of misunderstanding is possible: if the speaker uses the term wovota to indicate the very red object in an environment containing both a red and a blue object in the upper left hand corner, the hearer will not be able to successfully identify the referent of wovota. In this case, the hearer will not be able to pick an object, and so the speaker will identify the referent of wovota. Now a new meaning for wovota will be stored (one involving redness, most likely), and will begin to compete with the previous interpretation of "upper left."

These types of miscommunications are prototypical examples of the "gavagai problem." And yet, the communicative performance of these robotic agents is in many ways very language-like and actually quite impressive (even more so, when you consider that they begin with no shared terminology at all!). For example:
  • Synonyms tend to emerge during training, such that one agent prefers one term for something, while the other agent may prefer another. As the authors note, this situation arises when one agent invents a new term after having incorrectly interpreted a term that is already in existence.
  • Likewise, homonyms also emerge during training, in which one word can have multiple meanings
  • Some meanings of words fall out of usage among the agent population, just as in human language
  • When new objects were added to the environment, thus making them more complex, success in the guessing game dipped sharply, but quickly rebounded as agents derived new words or differentiated the possible meanings of old words.
Above and beyond these surface similarities to human language, what is most impressive is that semantics can spontaneously emerge from a population of simple agents, whose behavior is guided only by a few simple rules. These rules involve uniquely identifying objects in the environment, and maintaining representations of word meaning that compete with each other for predictiveness. One cannot quite call this a "language" since it lacks so many other aspects of linguistic experience - phonology and syntax to name but a few. Nonetheless, Steels and Kaplan seem to have solved the gavagai problem in a completely different way than the approach embodied by LSA. The extent to which these approaches can be integrated, and extended to account for other data, is yet to be determined.

9/05/2006

Machine and Human Learning of Word Meanings

How do we ever come to know the meanings of words? Consider the following (likely apocryphal) story:
Walking along one day on the newly-discovered coast of Australia, Captain Cook saw an extraordinary animal leaping through the bush. "What's that?" he asked one of the aborigines accompanying him.

"Uh - gangurru." he replied - or something like that. Captain Cook duly noted down the name of the peculiar beast as 'Kangaroo'. Some time later, Cook had the opportunity to compare notes with Captain King, and mentioned the kangaroo.

"No, no, Cook", said King, "the word for that animal is 'meenuah' - I've checked it carefully.

"So what does 'kangaroo' mean?"

"Well, I think," said King "it probably means something like 'I don't know'..."
This story may have inspired Quine in proposing the "gavagai problem," which can be interpreted as suggesting that word learning is a nearly impossible task because any given word has in principle an infinite number of referents. To make this more clear, consider that gangurru (or meenuah, or "gavagai") could actually mean "marsupial," "good to eat," or "kangaroo jumping through a harvested field prior to 11:45 am." So, how do we recover word meaning from any real-world linguistic experience, when even straightforward and direct instruction of word meaning is actually so ambiguous?

One might assume that powerful constraints exist on the kinds of conclusions we draw from linguistic experience. Yet a fascinating machine learning technique, known as Latent Semantic Analysis, is capable of acquiring word meanings in ways that eerily resemble human learning - all without ever having undergone direct instruction, and with no word learning biases built-in.

As presented in this paper (from which this post is conceptually derived), Landauer & Dumais describe how Latent Semantic Analysis (LSA) can acquire word meaning simply by reading encyclopedia text. Essentially, the LSA algorithms derive a rating of similarity between every word and every other word by cataloguing the "number of times that a particular word type, say 'model,' appears in a particular paragraph, say this one." These values undergo a log transform, and then division by the entropy of that word with respect to that paragraph (entropy is a measure of how representative a word is of the paragraph in which it is found, with high values being less representative; basically, dividing by this term allows LSA to scale-down the importance of "contextually ill-defined" words). Ultimately, this processing results in an enormous matrix of similarity relationships.
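The log-entropy weighting described above can be illustrated with a toy count matrix (the numbers are invented for illustration):

```python
import numpy as np

# Toy word-by-paragraph count matrix (rows: words, columns: paragraphs)
counts = np.array([[2., 0., 1.],
                   [1., 3., 1.],
                   [1., 1., 1.]])

logged = np.log(counts + 1)                       # log transform of raw counts

# Entropy of each word over paragraphs: a word spread evenly everywhere
# ("contextually ill-defined") gets the largest entropy
p = counts / counts.sum(axis=1, keepdims=True)
p_safe = np.where(p > 0, p, 1.0)                  # log(1) = 0, so zero cells contribute nothing
entropy = -(p_safe * np.log(p_safe)).sum(axis=1)

weighted = logged / entropy[:, None]              # scale down uninformative words
```

The third word, appearing equally in all three paragraphs, has the maximum possible entropy (log 3) and is down-weighted the most, which captures the intuition that indiscriminately distributed words carry little semantic information.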

That's "where the miracle occurs." Landauer & Dumais describe the use of "singular value decomposition" (SVD) to compress these similarity relationships into a more manageable data structure - one that represents the most "core" set of relationships that define when and how a certain word will be used. In slightly more technical language, SVD reduces the dimensionality of a data set by identifying the principle components or eigenvectors that compose it (the technique is similar to factor analysis, principle component analysis, and multidimensional scaling).

To use the example in the paper, Landauer & Dumais were able to compress semantic relationship data from over 30,000 encyclopedia articles into around 300 dimensions. In other words, the usage of each and every term out of the 4,000,000 words encountered in the encyclopedia is optimally describable if characterized along each of roughly 300 different continuums. One might claim that LSA has learned the set of "core semantic characteristics" that make up the meanings of English words. (Not only that, but this is just one of many possible dimensionality-reduction techniques, as the authors note!)

And the proof is in the pudding:
  • after training, LSA performed at 64.4% correct on a multiple choice test of synonymity taken from TOEFL (in contrast, humans score around 64.5% on average on this test, which is frequently used as a college entrance examination of English proficiency in non-native speakers. By this metric, LSA would be admitted to many major universities!)
  • calculations of the rate of word learning by 7th graders suggests that they acquire .15 words per 70-word text sample; analogous calculations of LSA's rate of acquisition show that LSA acquires .1500 words per text sample read
  • the comprehension by college students of several versions of a text sample about heart function is precisely replicated by LSA, when comprehension is measured as the degree of semantic overlap between subsequent sentences;
  • Humans initially show facilitated processing of all meanings of a previously-presented word, but after 300 ms show priming only of context-appropriate meanings; LSA shows similar effects insofar as similarity is higher between a homograph and two words related to different meanings of the homograph than between a homograph and unrelated words, and in that LSA considers words related to the context-appropriate definition of a homograph as more related than words related to the context-inappropriate definition of the homograph;
  • Human reaction times in judgments of numerical magnitude suggest that the single digit numerals are represented along a "logarithmic mental number line;" LSA was able to replicate this effect in its ratings of similarity among the single digit numerals, which also conform to a logarithmic function
The authors note that novel variants on the SVD procedure may be capable of fitting even more data, and indicate a particular need for elaborations that specify an adaptive procedure for identifying a data set's optimal dimensionality.

In conclusion, LSA well-characterizes many aspects of word learning. How SVD-like computations may be situated in the brain, and the necessity of some of the simplifying assumptions of this model (such as perfect memory, and an empirical determination of optimal dimensionality) are as yet unanswered.

Nonetheless, latent semantic analysis seems like a promising and powerful approach for understanding human semantic learning at a cognitive level. Perhaps most importantly, it shows that Quine's "gavagai problem" might not be so intractable after all - a word's meaning can be understood as a function of the contexts in which that word appears, the contexts in which it doesn't appear, and the same information for every other word to which it is related. This recursive relationship of word to context provides a nearly infinite amount of linguistic data from which word meaning might be derived.

9/04/2006

Dolphins: Stupid or Smart? A Summary and Synthesis

Since the publication of this article on the intelligence of dolphins, over 200 blogs and news outlets have covered the paper's highly controversial thesis: dolphins may not be as smart as we think they are. Before offering my opinion on the article, I thought I'd summarize the article itself and what's already been said about it in the cognitive blogosphere.

Briefly, Paul Manger's (quite technically impressive) Biological Reviews article states that:

1) Behavioral experiments that purport to demonstrate dolphin intelligence are often poorly controlled, difficult to replicate, and disputed by other scholars; therefore, this evidence is not considered in any detail;

2) Some of the most computationally important anatomical characteristics of the primate brain (such as the hippocampal formation) are extremely underdeveloped, while yet others (such as Layer IV in neocortex, and prefrontal cortex in toto) seem to be completely missing in dolphin brains;

3) Naive indices of dolphin brain size/complexity (such as encephalization indices) suggest that dolphin brains are as complex as some primate brains, but more sophisticated analyses (such as corticalisation indices) suggest that dolphin brain size is primarily an adaptation to living in cold water;

4) Viewing dolphin brain function as an adaptation to living in cold water is powerfully explanatory and explains many confusing differences between dolphin and primate brains (such as the altered glial:neuron ratio, consistently higher noradrenaline levels, vastly different sleep patterns, relatively undifferentiated neuronal morphology, and the homogenous scaling of body:brain mass among cetaceans in contrast with the heterogenous scaling seen among primates);

5) Evolutionary increases in dolphin brain size were extremely abrupt, and coincide well with an estimated dip in ocean water temperature roughly 20 million years ago. This is taken as evidence that brain size change was a result of selection pressure for increased thermogenesis.

It appears the discussion in the blogosphere was ignited by this technically inaccurate and otherwise poor article at the Sun-Times and subsequent coverage on Slashdot. Now, for what the cognitive blogosphere has to say about the idea that dolphins are "dumber than goldfish," as the popular press coverage often claimed:

What, Doesn't Intelligent Behavior Count? (Majikthise, August 23)

Philosopher Lindsay Beyerstein suggests that this study should be taken with a grain of salt, given that a multitude of studies show dolphins can behave intelligently (such as maintaining "complex and dynamic" social relations, being trained for mine-sweeping, self-recognition, and spontaneous imaginative play).

Commenter Michael points out that dolphins have also been known to gossip, which to him makes them "at least as intelligent as bloggers."

Maybe Dolphins Aren't Smart After All. (Cognitive Daily, August 21)

Dave Munger admits that while the argument against dolphin intelligence may be somewhat simplistic, the evidence for dolphin intelligence is ambiguous at best. He concludes that while "dolphins may be cute, and they can be trained to do impressive tricks, [...] it's doubtful that they possess humanlike intelligence."

Commenter Dwight Brown suggests that all our definitions of intelligence are anthropocentric, by necessity: from an evolutionary point of view, intelligence is reflected in species-specific adaptations to that species' environment. He suggests that we "check back in a couple of million years and see who is the most intelligent at that point!"

Commenter RixiM points out that neuroanatomical differences in dolphin brains could reflect differences in perceptual abilities rather than in intelligence, and reminds us how much of the human brain is devoted to perceptual processing.

Irrational and Anthropocentric Hogwash! Dolphins are smart. (A Blog Around the Clock, August 28)

Coturnix insists intelligence is so obvious in dolphins' behavior that we've known they were intelligent for millennia. Coturnix ridicules Manger's unfortunate comment in a press interview describing serotonin as a "happy drug," and notes that thermogenesis is not an established function of glial cells. Given the number of publications that have established "intelligent" behavior in dolphins, why would Manger write this paper? Coturnix argues the answer is our anthropocentric definition of intelligence, and that a better definition would involve "behavioral flexibility."

Careful How You (Re)Define Intelligence... (Cognitive Daily, August 28)

Dave Munger agrees on the difficulty in defining intelligence but insists that an anthropocentric definition is most productive, assuming that "the point is to identify animals that are similar to ourselves."

Faster, Harder and Stronger ... With Regard to What? (addendum to previous Blog Around the Clock post, August 28)

Coturnix asks what criteria should be used for identifying animals similar to ourselves, and suggests that if intelligence is "provisionally defined as fast learning, high processing power and flexibility of behavior, then we can compare species without looking at specific items that are learned, specific information that is processed and specific behaviors that are flexible."

Neural Data is Better than Behavioral Data (Abstract Nonsense, August 29)

Alon Levy argues that "brain data" is "harder to misinterpret than behavioral data," and existing behavioral data may overestimate the intelligence of dolphins on the basis of inappropriate anthropocentrism: "Dolphins communicate more-or-less verbally; hence they’re intelligent. Dolphins use tools; hence they’re intelligent. Dolphins are smart hunters; hence they’re intelligent."

Glia are Information Processors, So Try Again (Adam Bastard, September 1)

Adam points to the emerging view that glial cells are important for information processing, and correctly identifies this as a fatal flaw in Manger's argument: "With this information, it seems ridiculous to conclude that intelligence is negatively related to the volume of glia. That's a major blow to the theory of dolphin stupidity."

A Biased Synthesis: My Conclusions

1) Identifying the neuroanatomical correlates of cognition is still a highly theoretical discipline, and has been carried out largely in humans. The idea that we can confidently extrapolate in the opposite direction - from neuroanatomy to cognition - is almost completely untenable.

2) The above is even more true in cases where the neuroanatomy differs so profoundly from that on which the majority of cognitive neuroscience is based, which is a point underscored by Manger's detailed comparisons of cetacean and human neuroanatomy.

3) The brain is the most metabolically expensive organ in the body, so a bigger brain seems a maladaptive solution to the problem of body temperature regulation (particularly given the prevalence of convergent evolution in mammalian phylogeny, and the fact that nearly all other cold-climate mammals have adopted blubber instead);

4) Until it is shown that human neuroanatomy is the only architecture capable of supporting human-level intelligence, differences in dolphin neuroanatomy are irrelevant to the question;

5) Even if dolphin brain size increased in response to selection pressure for thermogenesis, this increase in size might become an exaptation and nonetheless translate into improved information processing (a possibility highlighted by neuroscience's increasing appreciation for the computational importance of glia).

In conclusion, there are countless reasons to doubt that dolphins are "dumber than goldfish," or indeed that popular impressions of dolphin intelligence have been wildly off the mark. Of course, as Cognitive Daily points out, it is clear that dolphins don't have human-level intelligence - whatever that may mean. On the other hand, Manger has developed a new theory about the evolution of the dolphin brain; unfortunately, any extrapolation from neuroanatomy to cognition is still highly theoretical, particularly in the case of dolphins, whose brains are so drastically different from our own. Therefore, given the state of neuroscience, judgments of dolphin intellectual powers must weigh behavioral work (however flawed) more heavily than arguments from evolutionary data and cellular neuroscience such as Manger's.

9/02/2006

Blogging on the Brain: 8/22 - 9/2

Highlights from the last few weeks in brain blogging:

The Neurogenesis debate continues as MindBlog covers two recent studies that failed to find evidence of adult neurogenesis - and yet a third that did.

Speaking of neurogenesis, Frontal Cortex reports on a new project by Elizabeth Gould, a big player in the adult neurogenesis debate.

Thomas at Brain Ethics talks about the computational functions common to regions of the temporal lobe and how this is revising our understanding of modularity in the brain.

Neuromarketing blog covers a new type of single unit recording - using nanowires!

Neurodudes reports on spontaneous synaptic remodeling in "cortical microcircuitry" on the timescale of hours.

Finally, Boulder-refugee Coffee Mug (now at Gene Expression) covers the mechanisms of LTP and memory consolidation in far more detail than my post earlier this week.

Have a nice weekend!

9/01/2006

Selection and Updating Efficiency in the Attentional Blink

In the attentional blink paradigm, two targets are serially displayed in rapid succession; astonishingly, participants show a brief temporal window in which they cannot identify the second target, while on either side of that window recognition proceeds normally. It is as though the proverbial mind's eye must "blink" in order to attend to two temporally distinct meaningful items. Imaging research shows that the second incoming item is still processed by higher visual areas, even if subjects are unable to report seeing that item - so, what happens during this time, when participants are functionally blind?
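For concreteness, a typical attentional blink trial is a rapid serial visual presentation (RSVP) stream of distractors with two embedded targets separated by a variable lag. Here is a toy trial generator; the stream length, target positions, the digits-vs-letters convention, and the ~100 ms per item timing are illustrative conventions, not parameters from any particular study:

```python
import random

def make_ab_trial(lag, stream_len=20, t1_pos=5, seed=None):
    """Build one RSVP stream: digit distractors with two letter targets.

    At ~100 ms per item, a lag of 2-3 items (200-300 ms between targets)
    falls inside the typical attentional blink window; lags of 6+ items
    usually do not. All parameters here are illustrative.
    """
    rng = random.Random(seed)
    stream = [str(rng.randint(2, 9)) for _ in range(stream_len)]
    stream[t1_pos] = rng.choice("BCDFGHJK")        # first target (T1)
    stream[t1_pos + lag] = rng.choice("LMNPQRST")  # second target (T2)
    return stream

trial = make_ab_trial(lag=3, seed=0)
print(" ".join(trial))  # two letters embedded in a stream of digits
```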

No one really knows. One explanation is that this "attentional blink" is actually the switch cost associated with task switching between looking for target 1 and looking for target 2 (also known as switching the "attentional set"). Accordingly, several studies implicate the areas responsible for working memory: right posterior parietal, cingulate, and left temporal/frontal regions. Further, when the second target is detected, subjects tend to show a large area of phase coherence in the gamma range (30-80 Hz) throughout the task, suggesting that this synchronized activity might reflect differences in working memory updating or attentional focus, which subsequently translate into improved target detection.
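Gamma-range phase coherence of the sort mentioned above is often quantified with a phase-locking value (PLV) computed from band-pass filtered signals. The following is a minimal sketch of that computation (my own illustration, not the analysis used in the studies cited):

```python
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

def gamma_plv(x, y, fs, band=(30.0, 80.0)):
    """Phase-locking value between two signals in the gamma band.

    PLV = |mean(exp(i * (phase_x - phase_y)))|; 1.0 means a perfectly
    consistent phase relation across time, values near 0 mean none.
    """
    nyq = fs / 2.0
    b, a = butter(4, [band[0] / nyq, band[1] / nyq], btype="band")
    phase_x = np.angle(hilbert(filtfilt(b, a, x)))
    phase_y = np.angle(hilbert(filtfilt(b, a, y)))
    return float(np.abs(np.mean(np.exp(1j * (phase_x - phase_y)))))

# Demo: a shared 40 Hz component produces high PLV; independent noise does not.
fs = 500.0
t = np.arange(0, 2.0, 1.0 / fs)
rng = np.random.default_rng(0)
shared = np.sin(2 * np.pi * 40.0 * t)
x = shared + 0.5 * rng.standard_normal(t.size)
y = shared + 0.5 * rng.standard_normal(t.size)
plv_shared = gamma_plv(x, y, fs)
plv_noise = gamma_plv(x, rng.standard_normal(t.size), fs)
print(f"shared: {plv_shared:.2f}, independent: {plv_noise:.2f}")
```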

What other kinds of individual differences can be observed in this paradigm? As reviewed in the Martens et al. paper in the current issue of the Journal of Cognitive Neuroscience, patients with neurological damage to inferior parietal cortex, the superior temporal gyrus, or frontopolar cortex show a prolonged attentional blink, as do those with dyslexia or ADHD. And yet there is also a minority within normal populations who show no attentional blink whatsoever. In an fMRI study comparing "blinkers" to "nonblinkers," nonblinkers showed more anterior cingulate activity (presumably engaged by some form of conflict when both targets appear close together), more medial prefrontal cortex activity (which Martens et al. interpret as reflecting self-directed attentional focus), and more frontopolar activity (engaged by the updating of working memory and sometimes also thought to reflect subject preparedness).

Unfortunately, these results could simply reflect the fact that nonblinkers detect the second target, rather than any inherent cognitive differences. Therefore, Martens et al. fine-tuned the comparison of blinkers and nonblinkers by measuring ERPs from 11 subjects of each type, each performing 600 attentional blink trials. Their analysis time-locked the ERPs to the onset of the stream of visual stimuli, controlled for any baseline differences in EEG mean amplitude and amplitude within standard power bands (interestingly, there were no differences in either), downsampled the data to 250 Hz, low-pass filtered at 20 Hz, corrected for eye movements, and "DC detrended" the data, with all unusually large amplitude changes removed as artifacts.
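The preprocessing steps Martens et al describe (downsampling to 250 Hz, 20 Hz low-pass, DC detrending, amplitude-based artifact rejection) form a fairly standard ERP pipeline. Here is a rough sketch; the 100 µV rejection threshold and the filter order are my own conventional assumptions, not the paper's values:

```python
import numpy as np
from scipy.signal import butter, decimate, detrend, filtfilt

def preprocess_erp(trials_uv, fs, target_fs=250.0, lowpass_hz=20.0, reject_uv=100.0):
    """Toy ERP preprocessing for a (n_trials, n_samples) array in microvolts.

    Steps mirror the description: downsample to 250 Hz, low-pass at 20 Hz,
    linear ("DC") detrend, then drop trials containing extreme amplitudes.
    The 100 uV rejection threshold and filter order are assumptions.
    """
    q = int(round(fs / target_fs))
    x = decimate(trials_uv, q, axis=-1) if q > 1 else np.asarray(trials_uv, float)
    new_fs = fs / q
    b, a = butter(4, lowpass_hz / (new_fs / 2.0), btype="low")
    x = filtfilt(b, a, x, axis=-1)
    x = detrend(x, axis=-1)                         # remove linear DC drift
    keep = np.max(np.abs(x), axis=-1) < reject_uv   # artifact rejection
    return x[keep], new_fs

# Demo: 20 trials at 1000 Hz; one trial carries a blink-sized artifact.
rng = np.random.default_rng(1)
trials = 5.0 * rng.standard_normal((20, 1000))
trials[3, 400:500] += 500.0
clean, fs_out = preprocess_erp(trials, fs=1000.0)
print(clean.shape, fs_out)  # the contaminated trial is dropped
```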

As in previous work with ERPs of the attentional blink, targets that were not detected showed no parietally-centered positive deflection at 300 msec after target presentation, further confirming that this "P300" wave reflects updating, or (to use Martens et al.'s term) "target consolidation." However, this P300 wave was slower on average for blinkers than for nonblinkers for both targets (and even more so for the second target on trials in which blinkers happened to detect it correctly), suggesting that nonblinkers may not blink simply because their WM updating processes are faster than those of blinkers. Furthermore, nonblinkers showed a much stronger bilateral ERP wave known as frontal selective positivity (FSP), whereas blinkers tended to show a weaker FSP restricted to the left hemisphere.

Martens et al. also carried out a second experiment to test the hypothesis that the attentional blink is caused only by the amount of time available to process stimuli. According to this hypothesis, shortening the interval between target 1 and target 2 by an amount equivalent to the difference between the P3 waves of blinkers and nonblinkers should make many nonblinkers experience the attentional blink. No such effect was found, which led the authors to propose that the attentional blink reflects the efficiency of neural processing rather than simply the speed with which stimuli are processed.

The authors conclude that "selection efficiency" is a primary determinant of the attentional blink, as reflected in the fact that blinkers also tended to show more prefrontal activity in response to distractors. Martens et al. also suggest that their results are incompatible with some theoretical models of the attentional blink, one of which (the "two-stage model") states that bandwidth-limited or serial attentional processes must select from the results of a first bandwidth-unlimited (or parallel) processing stage that transforms visual information into conceptual representations. According to the authors, although the two-stage model correctly predicts slower processing of the second target at lag 3, it provides no reason to think that detection of the first target would also be slower, which was indeed found behaviorally. Therefore Martens et al. endorse what they call an "interference model," which posits that the targets compete for representation with distractors.

Although these results are certainly interesting, I don't think they clearly rule out two-stage models (in fact, since no one would actually specify a two-stage connectionist model of the AB that excludes competition for representation, connectionist implementations of the interference and two-stage models are not really distinct from one another). These data do, however, offer additional constraints on models of the attentional blink, particularly with regard to processing time constraints, which have often featured prominently in such models: the results suggest that processing time is only indirectly important, and that processing or selection efficiency may be the more critical variable determining whether an attentional blink takes place.

Related Posts:
The Mind's Eye: Models of the Attentional Blink
Selection Efficiency in Updating Working Memory
Selection Efficiency and Inhibition
Anticipation and Synchronization
Attention: The Selection Problem