Rooting for the machines A Blog by Chris Baldassano

Death by puppies - tenure-track year one

The end of this month will mark the end of my first year as a tenure-track assistant professor. I don't know if I have much helpful advice to give, since I'm still new enough to the job that it's hard for me to know what I've been doing right or wrong, and I'm very grateful to my collaborators and colleagues in my department for bearing with me as I've bumbled my way through my new responsibilities this year. But I do think I can shed some light on what junior faculty life is like, especially for grad students or postdocs who are, like I was, spending a lot of time trying to figure out whether academia is a career path worth attempting. (Though I admire those who were able to easily make up their minds - in grad school I asked a fellow student whether he wanted to pursue a tenure-track job, and he responded with a string of profanity that I will just translate here as "No.")

I think this video sums up pretty well how this year has gone:

Almost all of the responsibilities in this job are things I love doing - there are just a lot of them, usually way too many to possibly be completed by one person. Being buried by tons of great things is quite a nice place to be, as long as you don't mind the chaos:

[Image: puppies]

Mentoring trainees

I devote the biggest chunk of my time to working with the trainees in lab, which basically means that I get to brainstorm experiments, debug analyses, and talk about science with super-smart people for most of the day. Recruiting lab members seemed like a huge gamble when I started last year (can I know if I want to work closely with someone for 5+ years after meeting them for one day?), but I'm amazed at how much my mentees have accomplished: two were accepted into top PhD programs, we developed multiple new experimental paradigms from scratch, submitted abstracts, drafted a review article, and set up half a dozen pieces of new equipment.


Groaning about (and trying to get out of) teaching is a favorite pastime for research-oriented faculty, but honestly I've loved teaching more than I expected. I've been able to design two of my own seminars and modify an existing course, so most of the time I get to talk about things that I care about, and I've been able to convince students to care about them too! Also, positive teaching evaluation comments are some of the most meaningful and validating bits of praise I've ever gotten in my career.

Faculty/trainee recruiting

As a well-known research institution, we get bombarded with applications from prospective PhD students, postdocs, and faculty, and I've spent a great deal of time interviewing candidates and attending job talks. Speaking with all these enthusiastic current and future scientists is bittersweet, since more extremely-well-qualified people apply for positions than we could ever accommodate. This is especially true for faculty searches - multiple times this year I've been interviewing candidates who are objectively more accomplished researchers than I am, and most didn't end up with an offer.


Writing grants was always presented to me as the major downside of a faculty position, and there is certainly a lot of stress involved - suddenly I am responsible for running a small business on which multiple people depend for their salaries, and the process by which grants are evaluated is unpredictable at best. But I've found putting together grants to actually be quite useful for thinking more long-term about my research goals, and for providing opportunities to build new collaborations and connect with other faculty members.


I'm still trying to carve out some time each week to do at least a little of my own research, writing some analysis code or testing out ideas that I could pass on to trainees if they seem promising. Looking at senior faculty it seems like this will probably get squeezed out of my schedule at some point, but right now I still look forward to spending some time with my headphones on fiddling with Python code.

Talks and writing

At the end of the day, the primary way I'll be evaluated is based on how productively I get my lab's research out into the world in talks and papers. Effective speaking and writing is a hard, time-consuming process, and even as I've become much better at it over the years I still don't know how to do it quickly. It is some consolation to me that even professional authors haven't found any other way to communicate ideas aside from repeatedly writing the wrong thing and crossing it out, until finding something that works.

Administration and departmental service

Not having a boss in the traditional sense is great in many, many ways, but the downside is that it means that a lot of paperwork tends to flow in my direction. There are also a bunch of advisory and committee jobs in the department that need to get done, but which no one particularly wants to do - luckily my department has been pretty good about insulating junior faculty from these, so I haven't had much of this dumped on my plate so far.

Back to the dog pile!

Comments? Complaints? Contact me @ChrisBaldassano

Building the present from the past

All airports, he had long ago decided, look very much the same. It doesn’t actually matter where you are, you are in an airport: tiles and walkways and restrooms, gates and newsstands and fluorescent lights. This airport looked like an airport.

-Neil Gaiman, American Gods

Each scene of a movie (or paragraph of a story) generates a pattern of activity in the viewer's brain, and I showed in my last paper that changes in these activity patterns correspond to new events happening in the story. But it is still mysterious exactly what information the brain is keeping track of in these activity patterns. We know that movie and audiobook versions of the same story generate similar patterns in many brain regions, which means that the information must be pretty abstract. Can we push this even farther? Can we find pattern similarities between different narratives that describe events from the same "template"?

In my new paper we studied a kind of template called an event script, which describes a typical sequence of events that occurs in the world. For example, when you walk into a restaurant you have detailed expectations about what is going to happen next - you are going to be seated at a table, then given menus, then order your food, and then the food will come. Our hypothesis was that some brain regions would track this script information, and that their activity patterns should look similar for any story about restaurants, regardless of the specific characters and storyline of each particular narrative.
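To make the hypothesis a bit more concrete, here is a minimal sketch of the kind of comparison it implies: if a region tracks the script, activity patterns from two stories with the same script should be more correlated than patterns from stories with different scripts. This is an illustration only, not the analysis from the paper; the function name `script_similarity`, the input layout (one pattern per story), and the use of plain Pearson correlation are all assumptions.

```python
import numpy as np

def script_similarity(patterns, script_labels):
    """Compare pattern similarity within vs. between scripts.

    patterns: array of shape (n_stories, n_voxels), one (hypothetical)
              activity pattern per story.
    script_labels: one label per story, e.g. "restaurant" or "airport".
    Returns (mean within-script correlation, mean between-script correlation).
    """
    corr = np.corrcoef(patterns)                      # story-by-story correlations
    labels = np.asarray(script_labels)
    same_script = labels[:, None] == labels[None, :]
    not_self = ~np.eye(len(labels), dtype=bool)       # ignore each story vs. itself
    within = corr[same_script & not_self].mean()
    between = corr[~same_script].mean()
    return within, between
```

A region that tracks the script should give a within-script value clearly above the between-script value; a region that only cares about the specific characters or storyline should not.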

We showed subjects movies and audiobooks of stories taking place in restaurants and stories taking place in airports. Here are examples of two of the restaurant movies (all of the stories are publicly available here):

We then looked for brain regions that seemed to be tracking the restaurant or airport script during all of the stories, and found a whole network of regions:


An especially important region here is the medial prefrontal cortex, which is the part of your brain right behind the middle of your forehead. We found that this region only tracked the script when it made logical sense - if we scrambled the order of the script (e.g. showed people getting food before they ordered) then it no longer bothered to track what was going on.


I also used some of the analysis tools I previously developed for matching up brain data during perception and recall of the same story to find correspondences between different stories with the same script template. For example, I can use brain activity from the superior frontal gyrus to figure out which parts of the "Up in the Air" story and the "How I Met Your Mother" story take place in the same places in the airport. Here are snippets of the stories that show similar patterns of brain activity:

| Script event | Up in the Air | How I Met Your Mother |
| --- | --- | --- |
| Enter airport | Ryan looked up from his boarding pass and sighed. There was his new partner Natalie, awkwardly climbing out of a taxi at the curb of the airport. | Barney and Quinn walked into the airport pulling their matching pink tiger-stripe suitcases. Barney leaned over and kissed her on the cheek. |
| Airport security | Ryan shrugged. "Look at the other lines. I never get behind people traveling with infants - I've never seen a stroller collapse in less than twenty minutes. Old people are worse - their bodies are littered with hidden metal and they never seem to appreciate how little time they have left on earth." | The guard motioned to several others for backup. "Sir, you need to open this box." "Oh, I can't do that. Magician's code. A magician never reveals his tricks. The only person I could possibly reveal the trick to is another magician." |
| Boarding gate | He stopped halfway to their gate and pointed at a luggage store. "If you're going to be flying with me, you need to get a carry-on bag. You know how much time you lose by checking in?" | She tried to interrogate him as they sat in front of the gate, but he refused to spill the beans. "I told you, magician's code." |
| On plane | "I like my own stuff. Don't you like feeling connected to home?" Ryan laughed. "This is where I live. All the things you probably hate about traveling - the recycled air, the artificial lighting - are warm reminders that I am home." | In its center was a diamond ring, which Barney plucked from the flower and held out to Quinn. "Quinn, will you marry me?" |

I'm currently analyzing some additional data from these subjects, collected while they tried to retell all 16 stories from memory. Our hypothesis is that these script templates should also be useful when trying to remember events, since they give us clues about what kinds of events to search for in our memories. I'm also fascinated by how these scripts get learned, and am hoping to study this learning process both in adults (who are learning new scripts in the lab) and in children (who are learning real scripts over the course of years).


The three ingredients of reproducible research

Much of the conversation about research methods in science has focused on the "replication crisis" - the fact that many classic studies (especially in psychology) are often not showing the same results when performed carefully by independent research groups. Although there are some debates about exactly how bad the problem is, a consensus is emerging about how to improve the ways we conduct experiments and analyses: pre-registering study hypotheses before seeing the data, using larger sample sizes, being more willing to publish (informative) null results, and maybe being more conservative about what evidence counts as "proof."

But there is actually an even simpler problem we haven't fully tackled, which is not "replicability" (being able to get the same result in new data from a new experiment) but "reproducibility" - the ability to demonstrate how we got the result from the original data in the first place. Being able to trace and record the exact path from data to results is important for documenting precisely how the analysis works, and allows other researchers to examine the details for themselves if they are skeptical. It also makes it much easier for future work (either by the same authors or others) to keep analyses comparable across different experiments.

Describing how data was analyzed is of course supposed to be one of the main points of a published paper, but in practice it is almost impossible to recreate the exact processing pipeline of a study just from reading the paper. Here are some real examples that I have experienced firsthand in my research:

  • Trying to replicate results from papers that used a randomization procedure called phase scrambling, I realized that there are actually at least two ways of doing this scrambling and papers usually don't specify which one they use
  • Confusion over exactly what probability measure was calculated in a published study set off a minor panic when the study authors started to think their code was wrong, before realizing that their analysis was actually working as intended
  • Putting the same brain data into different versions of AFNI (a neuroimaging software package) can produce different statistical maps, due to a change in the way the False Discovery Rate is calculated
  • A collaborator was failing to reproduce one of my results even with my code - turned out that the code worked in MATLAB versions 2015b and 2017b but not 2017a (for reasons that are still unclear)
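To give a flavor of the first example, here is a minimal sketch of phase scrambling with one switch that a methods section could easily leave unstated: whether each signal gets its own random phases, or all signals share the same random phases. Whether these are the two variants that actually differ between papers is my assumption for illustration; the function name `phase_scramble` and the `shared_phases` flag are hypothetical.

```python
import numpy as np

def phase_scramble(data, rng, shared_phases=False):
    """Randomize Fourier phases while keeping each signal's amplitude spectrum.

    data: real array of shape (n_timepoints, n_signals).
    rng: a numpy random Generator.
    shared_phases: if True, every signal gets the same random phase shifts
        (which keeps the correlations between signals); if False, each
        signal is scrambled independently (which destroys them).
    """
    n_timepoints, n_signals = data.shape
    spectrum = np.fft.rfft(data, axis=0)
    n_freqs = spectrum.shape[0]
    n_cols = 1 if shared_phases else n_signals
    phases = rng.uniform(0, 2 * np.pi, size=(n_freqs, n_cols))
    rotation = np.exp(1j * phases)
    rotation[0] = 1                      # leave the mean (DC term) untouched
    if n_timepoints % 2 == 0:
        rotation[-1] = 1                 # Nyquist term must stay real
    return np.fft.irfft(spectrum * rotation, n=n_timepoints, axis=0)
```

Both settings give surrogate data with the same power spectrum as the original, so both can honestly be described as "phase scrambling", but they build different null distributions for anything that depends on relationships between signals.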

These issues show that reproducible research actually requires three pieces:

  1. Publicly available data
  2. Open-source code
  3. A well-defined computing environment

The first two things we know basically how to do, at least in theory - data can be uploaded to a number of services that are typically free to researchers (and standards are starting to emerge for complex data formats like neuroimaging data), and code can be shared (and version-controlled) through platforms like GitHub. But the last piece has been mostly overlooked - how can we take a "snapshot" of all the behind-the-scenes infrastructure, like the programming language version and all the libraries the code depends on? This is honestly often the biggest barrier to reproducing results - downloading data and code is easy, but actually getting the code to run (and run exactly as it did for the original analysis) can be a descent into madness, especially on a highly-configurable Linux machine.
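As a very lightweight version of such a snapshot, here is a sketch that records the Python version, the platform, and the version of every installed package, using only the standard library. It is a partial measure under stated assumptions: it captures nothing about system libraries or non-Python tools, and the function name `snapshot_environment` is just a label I made up.

```python
import json
import platform
from importlib import metadata

def snapshot_environment():
    """Collect interpreter, platform, and installed-package versions."""
    packages = {}
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name:                          # skip distributions with broken metadata
            packages[name] = dist.version
    return {
        "python_version": platform.python_version(),
        "implementation": platform.python_implementation(),
        "platform": platform.platform(),
        "packages": dict(sorted(packages.items())),
    }

if __name__ == "__main__":
    # Save the snapshot alongside the analysis results
    with open("environment_snapshot.json", "w") as f:
        json.dump(snapshot_environment(), f, indent=2)
```

Saving a file like this next to every set of results at least tells a future reader which versions produced them, even if it doesn't recreate the environment for them.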

For my recent preprint, I tried out a possible solution to this problem: an online service called CodeOcean. This platform allows you to create an isolated "capsule" that contains your data, your code, and a description of the programming environment (set up with a simple GUI). You can then execute your code (on their servers), creating a verified set of results - the whole thing is then labeled with a DOI, and is publicly viewable with just a browser. Interestingly the public capsule is still live, meaning that anyone can edit the code and click Run to see how the results change (any changes they make affect only their own view of the capsule). Note that I wouldn't recommend blindly clicking Run on my capsule since the analysis takes multiple hours, but if you're interested in messing with it you can edit the file to only conduct a manageable subset of the analyses (e.g. only on a single region of interest). CodeOcean is still under development, and there are a number of features I haven't tried yet (including the ability to run live Jupyter Notebooks, and a way to create a simple GUI for exposing parameters in your code).

For now this is set up as a post-publication (or post-preprint) service and isn't intended for actually working on the analyses (the computing power you have access to is limited and has a quota), but as cloud computing continues to become more convenient and affordable I could eventually see entire scientific workflows moving online.


Live-blogging SfN 2017

[I wrote these posts during the Society for Neuroscience 2017 meeting, as one of the Official Annual Meeting Bloggers. These blog posts originally appeared on SfN's Neuronline platform.]

SuperEEG: ECoG data breaks free from electrodes

The "gold standard" for measuring neural activity in human brains is ECoG (electrocorticography), using electrodes implanted directly onto the surface of the brain. Unlike methods that measure blood oxygenation (which have poor temporal resolution) or that measure signals on the scalp (which have poor spatial resolution), ECoG data has both high spatial and temporal precision. Most of the ECoG data that has been collected comes from patients who are being treated for epileptic seizures and have had electrodes implanted in order to determine where the seizures are starting.

The big problem with ECoG data, however, is that each patient typically only has about 150 implanted electrodes, meaning that we can only measure brain activity in 150 spots (compared to about 100,000 spots for functional MRI). It would seem like there is no way around this - if you don’t measure activity from some part of the brain, then you can’t know anything about what is happening there, right?

Actually, you can, or at least you can guess! Lucy Owen, Andrew Heusser, and Jeremy Manning have developed a new analysis tool called SuperEEG, based on the idea that measuring from one region of the brain can actually tell you a lot about another unmeasured region, if the two regions are highly correlated (or anti-correlated). By using many ECoG subjects to learn the correlation structure of the brain, we can extrapolate from measurements in a small set of electrodes to estimate neural activity across the whole brain.
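The core intuition can be sketched in a few lines. Assuming we already have a correlation matrix between brain locations (learned from many other subjects) and z-scored activity at the measured locations, the standard Gaussian conditional-mean formula gives a best linear guess for the unmeasured ones. This is a toy illustration of the idea, not the SuperEEG implementation; the function name, the small `ridge` term, and the input layout are all assumptions.

```python
import numpy as np

def estimate_unobserved(corr, observed_idx, unobserved_idx,
                        observed_activity, ridge=1e-6):
    """Estimate activity at unmeasured locations from measured ones.

    corr: (n_locations, n_locations) correlation matrix between locations.
    observed_idx / unobserved_idx: lists of location indices.
    observed_activity: (n_timepoints, n_observed) z-scored activity.
    ridge: small constant added to the diagonal for numerical stability.
    Returns an array of shape (n_timepoints, n_unobserved).
    """
    c_oo = corr[np.ix_(observed_idx, observed_idx)]
    c_uo = corr[np.ix_(unobserved_idx, observed_idx)]
    c_oo = c_oo + ridge * np.eye(len(observed_idx))
    # Conditional mean of a Gaussian: x_u = C_uo @ inv(C_oo) @ x_o
    return observed_activity @ np.linalg.solve(c_oo, c_uo.T)
```

If an unmeasured location is strongly correlated (or anti-correlated) with a measured one, the estimate follows (or mirrors) the measured signal; if it is uncorrelated with everything measured, the estimate collapses toward zero, which is one way to see why some regions can't be estimated well.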

[Figure: SuperEEG, from their SfN poster]

This breaks ECoG data free from little islands of electrodes and allows us to carry out analyses across the brain. Not all brain regions can be well-estimated using this method (due to the typical placement locations of the electrodes and the correlation structure of brain activity), but it works surprisingly well for most of the cortex:

[Figure: SuperEEG]

This could also help with the original medical purpose of implanting these electrodes, by allowing doctors to track seizure activity in 3D as it spreads through the brain. It could even be used to help surgeons choose the locations where electrodes should be placed in new patients, to make sure that seizures can be tracked as broadly and accurately as possible.

Hippocampal subregions growing old together

To understand and remember our experiences, we need to think both big and small. We need to keep track of our spatial location at broad levels ("what town am I in?") all the way down to precise levels ("what part of the room am I in?"). We need to keep track of time on scales from years to fractions of a second. We need to access our memories at both a coarse grain ("what do I usually bring to the beach?") and a fine grain ("remember that time I forgot the sunscreen?").

Data from both rodents and humans has suggested that different parts of the hippocampus keep track of different levels of granularity, with posterior hippocampus focusing on the fine details and anterior hippocampus seeing the bigger picture. Iva Brunec and her co-authors recently posted a preprint showing that temporal and spatial correlations change along the long axis of the hippocampus - in anterior hippocampus all the voxels are similar to each other and change slowly over time, while in posterior hippocampus the voxels are more distinct from each other and change more quickly over time.
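The two properties described here are easy to state in code. Below is a minimal sketch, assuming a matrix of voxel timecourses from one subregion: the average correlation between pairs of voxels (how similar the voxels are to each other) and the average lag-1 autocorrelation (how slowly each voxel changes). These particular measures and the name `granularity_measures` are my own simplification, not necessarily what the preprint computes.

```python
import numpy as np

def granularity_measures(timecourses):
    """Summarize spatial and temporal similarity within a region.

    timecourses: array of shape (n_timepoints, n_voxels).
    Returns (mean inter-voxel correlation, mean lag-1 autocorrelation).
    """
    voxel_corr = np.corrcoef(timecourses.T)
    n_voxels = voxel_corr.shape[0]
    upper = np.triu_indices(n_voxels, k=1)           # each voxel pair once
    inter_voxel = voxel_corr[upper].mean()
    # Correlation of each voxel's timecourse with itself shifted by one step
    lag1 = [np.corrcoef(v[:-1], v[1:])[0, 1] for v in timecourses.T]
    return float(inter_voxel), float(np.mean(lag1))
```

Under the account above, both numbers should be higher in anterior hippocampus than in posterior hippocampus.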

In their latest work, they look at how these functional properties of the hippocampus change over the course of our lives. Surprisingly, this anterior-posterior distinction actually increases with age, becoming the most dramatic in the oldest subjects in their sample.

The interaction between the two halves of the hippocampus also changes - while in young adults activity timecourses in the posterior and anterior hippocampus are uncorrelated, they start to become anti-correlated in older adults, perhaps suggesting that the complementary relationship between the two regions has started to break down. Also, their functional connectivity with the rest of the brain shifts over time, with posterior hippocampus decoupling from posterior medial regions and anterior hippocampus increasing its coupling to medial prefrontal regions.

These results raise a number of intriguing questions about the cause of these shifts, and their impacts on cognition and memory throughout the lifespan. Is this shift toward greater coupling with regions that represent coarse-grained schematic information compensating for degeneration in regions that represent details? What is the “best” balance between coarse- and fine-timescale information for processing complex stimuli like movies and narratives, and at what age is it achieved? How do these regions mature before age 18, and how do their developmental trajectories vary across people? By following the analysis approach of Iva and her colleagues on new datasets, we should hopefully be able to answer many of these questions in future studies.

The Science of Scientific Bias

This year’s David Kopf lecture on Neuroethics was given by Dr. Jo Handelsman, entitled “The Fallacy of Fairness: Diversity in Academic Science”. Dr. Handelsman is a microbiologist who recently spent three years as the Associate Director for Science at the White House Office of Science and Technology Policy, and has also led some of the most well-known studies of gender bias in science.


She began her talk by pointing out that increasing diversity in science is not only a moral obligation, but also has major potential benefits for scientific discovery. Diverse groups have been shown to produce more effective, innovative, and well-reasoned solutions to complex problems. I think this is especially true in psychology - if we are trying to create theories of how all humans think and act, we shouldn’t be building teams composed of a thin slice of humanity.

Almost all scientists agree in principle that we should not be discriminating based on race or gender. However, the process of recruiting, mentoring, hiring, and promotion relies heavily on “gut feelings” and subtle social cues, which are highly susceptible to implicit bias. Dr. Handelsman covered a wide array of studies over the past several decades, ranging from observational analyses to randomized controlled trials of scientists making hiring decisions. I’ll just mention two of the studies she described which I found the most interesting:

  • How is it possible that people can be making biased decisions, but still think they were objective when they reflect on those decisions? A fascinating study by Uhlmann & Cohen showed that subjects rationalized biased hiring decisions after the fact by redefining their evaluation criteria. For example, when choosing whether to hire a male candidate or a female candidate, who both had (randomized) positive and negative aspects to their resumes, the subjects would decide that the positive aspects of the male candidate were the most important for the job and that he therefore deserved the position. This is interestingly similar to the way that p-hacking distorts scientific results, and the solution to the problem may be the same. Just as pre-registration forces scientists to define their analyses ahead of time, Uhlmann & Cohen showed that forcing subjects to commit to their importance criteria before seeing the applications eliminated the hiring bias.

  • Even relatively simple training exercises can be effective in making people more aware of implicit bias. Dr. Handelsman and her colleagues created a set of short videos called VIDS (Video Interventions for Diversity in STEM), consisting of narrative films illustrating issues that have been studied in the implicit bias literature, along with expert videos describing the findings of these studies. They then ran multiple experiments showing that these videos were effective at educating viewers, and made them more likely to notice biased behavior. I plan on making these videos required viewing in my lab, and would encourage everyone working in STEM to watch them as well (the narrative videos are only 30 minutes total).


Drawing out visual memories

If you close your eyes and try to remember something you saw earlier today, what exactly do you see? Can you visualize the right things in the right places? Are there certain key objects that stand out the most? Are you misremembering things that weren’t really there?

Visual memory for natural images has typically been studied with recognition experiments, in which subjects have to recognize whether an image is one they have seen before or not. But recognition is quite different from freely recalling a memory (without being shown it again), and can involve different neural mechanisms. How can we study visual recall, testing whether the mental images people are recalling are correct?

One option is to have subjects give verbal descriptions of what they remember, but this might not capture all the details of their mental representation, such as the precise relationships between the objects or whether their imagined viewpoint of the scene is correct. Instead, NIMH researchers Elizabeth Hall, Wilma Bainbridge, and Chris Baker had subjects draw photographs from memory, and then analyzed the contents of those drawings.


This is a creative but challenging approach, since it requires quantitatively characterizing how well the drawings (all 1,728!) match the original photographs. They crowdsourced this task using Amazon Mechanical Turk, getting high-quality ratings that include: how well the original photograph can be identified based on the drawing, which objects were correctly drawn, which objects were falsely remembered as being in the image, and how close the objects were to their correct locations. There were also “control” drawings, made by subjects with full information (who got to look at the image while they drew) or minimal information (just a category label), that were rated for comparison.

The punchline is that subjects can remember many of the images, and produce surprisingly detailed drawings that are quite similar to those drawn by the control group that could look at the pictures. They reproduce the majority of the objects, place them in roughly the correct locations, and draw very few incorrect objects, making it very easy to match the drawings with the original photographs. The only systematic distortion is that the drawings depicted the scenes as being slightly farther away than they actually were, which nicely replicates previous results on boundary extension.

This is a neat task that subjects are remarkably good at (which is not always the case in memory experiments!), and could be a great tool for investigating the neural mechanisms of naturalistic perception and memory. Another intriguing SfN presentation showed that it is possible to have subjects draw while in an fMRI scanner, allowing this paradigm to be used in neuroimaging experiments. I wonder if this approach could also be extended into drawing comic strips of remembered events that unfold over time, or to illustrate mental images based on stories told through audio or text.


Reality, now in extra chunky

Our brains receive a constant stream of information about the world through our senses. Sci-fi depictions of mind-reading or memory implants often portray our experiences and memories as a continuous, unbroken filmstrip.

[Image from The Final Cut, 2004]

But if I ask you to describe what has happened to you today, you will usually think in terms of events - snippets of experience that make sense as a single unit. Maybe you ate breakfast, and then brushed your teeth, and then got a phone call. You divide your life into these separate pieces, like how separate memory orbs get created in the movie Inside Out.

[Image from Inside Out, 2015]

This grouping into events is an example of chunking, a common concept in cognitive psychology. It is much easier to put together parts into wholes and then think about only the wholes (like objects or events), rather than trying to keep track of all the parts separately. The idea that people automatically perform this kind of event chunking has been relatively well studied, but there are lots of things we don't understand about how this happens in the brain. Do we directly create event-level chunks (spanning multiple minutes) or do we build up longer and longer chunks in different brain regions? Does this chunking happen within our perceptual systems, or are events constructed afterwards by some separate process? Are the chunks created during perception the same chunks that get stored into long-term memory?

I have a new paper out today that takes a first stab at these questions, thanks to the help of an all-star team of collaborators: Janice Chen, Asieh Zadbood (who also has a very cool and related preprint), Jonathan Pillow, Uri Hasson, and Ken Norman.

The basic idea is simple: if a brain region represents event chunks, then its activity should go through periods of stability (within events) punctuated by sudden shifts (at boundaries between events). I developed an analysis tool that is able to find this kind of structure in fMRI data, determining how many of these shifts happen and when they happen.
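To illustrate the "stable, then sudden shift" idea, here is a deliberately crude sketch that is not the tool from the paper: it correlates the activity pattern at each timepoint with the pattern at the previous timepoint, and marks the biggest drops as boundaries. Unlike the real analysis, it needs to be told how many events to look for, and the function name `find_event_boundaries` is hypothetical.

```python
import numpy as np

def find_event_boundaries(data, n_events):
    """Place boundaries where adjacent activity patterns are least similar.

    data: array of shape (n_timepoints, n_voxels); every timepoint is
          assumed to have a non-constant pattern across voxels.
    n_events: number of events to split the timeline into (assumed known).
    Returns the sorted timepoint indices at which a new event starts.
    """
    # Normalize each timepoint's pattern so a dot product is a correlation
    z = data - data.mean(axis=1, keepdims=True)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)
    adjacent_corr = np.sum(z[:-1] * z[1:], axis=1)   # corr(t, t+1) for each t
    # The n_events - 1 weakest links are taken as the boundaries
    weakest = np.argsort(adjacent_corr)[: n_events - 1]
    return np.sort(weakest + 1)
```

This only looks at neighboring timepoints, so it would be thrown off by noisy data in a way that a model fitting whole events at once would not be; it's meant to show what the structure looks like, not how to find it robustly.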

The first main result is that we see event chunking in lots of brain regions, and the length of the events seems to build up from short events (seconds or less) in early sensory regions to long events (minutes) in higher-level regions. This suggests that events are an intrinsic part of how we experience the world, and that events are constructed through multiple stages of a hierarchy.


The second main result is that right at the end of these high-level events, we see lots of activity in brain regions that store long-term memories, like the hippocampus. Based on some additional analyses, we argue that these activity spikes are related to storing these chunks so that we can remember them later. If this is true, then our memory system is less like a DVR that constantly records our life, and more like a library of individually-wrapped events.

There are many (many) other analyses in the paper, which explains why it took us about two years to put together in its entirety. One fun result at the end of the paper is that people who already know a story actually start their events a little earlier than people hearing a story for the first time. This means that if I read you a story in the scanner, I can actually make a guess about whether or not you've heard this story before by looking at your brain activity. This guessing will not be very accurate for an individual person, so I'm not ready to go into business with No Lie MRI just yet, but maybe in the near future we could have a scientific way to detect Netflix cheaters.
