Thursday, February 23, 2017

Can We Predict a Molecule’s Smell from Its Physical Characteristics?


An extract of Yuanfang Guan’s winning code for odor prediction

A paper in this week’s edition of Science claims that computer models can predict the smell of a molecule. The paper describes the organization and outcome of an IBM Dream Challenge in which multiple laboratories competed to see whose model best predicts sensory characteristics from chemical parameters.

This crowd-sourced effort began with an olfactory dataset collected and published in 2016 by Andreas Keller and Leslie Vosshall. (Full disclosure: I previously collaborated with Keller and Vosshall on a different smell study.) They had 49 test subjects sniff and rate 476 “structurally and perceptually diverse molecules” using 19 semantic descriptors plus ratings of odor intensity and pleasantness.

In setting up the Dream Challenge, the organizers also “supplied 4884 physicochemical features of each of the molecules smelled by the subjects, including atom types, functional groups, and topological and geometrical properties that were computed using Dragon chemoinformatic software.”

There are several positive aspects to the challenge design. First, instead of recycling the decades-old Dravnieks dataset like so many other attempts at chemometric-based odor prediction, the sponsors supplied a fresh psychophysical dataset. Second, the study included a boatload of odorants, not the handful of smells found in most sensory studies. Third, the odor ratings were gathered from a relatively large number of sensory panelists. Forty-nine is not a super-robust sample size but it’s enough to encompass a lot of the person-to-person variability found in odor perception.

Here’s how the competition worked. Each team was given the molecular and sensory data for 338 molecules. They used these data to build computer models that predicted the sensory ratings from the chemical data. Sixty-nine molecules (absent the sensory data) were used by the organizers to construct a “leaderboard” to rank each team’s performance during the competition. The leaderboard sensory data were revealed to contestants late in the game to let them fine-tune their models. Finally, another 69 molecules were reserved by the organizers and used to evaluate performance of the finalized models.
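For readers who like to see the bookkeeping, here is a minimal sketch of that three-way partition. It only illustrates the sizes described above (338 + 69 + 69 = 476). How the organizers actually assigned molecules to each set is not stated here, so the random shuffle, the seed, and the function name `split_molecules` are all assumptions made for illustration.

```python
import random

# Set sizes taken from the description above.
N_MOLECULES = 476
N_TRAIN, N_LEADERBOARD, N_TEST = 338, 69, 69


def split_molecules(seed=0):
    """Partition molecule indices into training, leaderboard, and held-out sets.

    Assumption: a simple random shuffle. The organizers' real assignment
    procedure is not described in the post.
    """
    ids = list(range(N_MOLECULES))
    random.Random(seed).shuffle(ids)
    train = ids[:N_TRAIN]
    leaderboard = ids[N_TRAIN:N_TRAIN + N_LEADERBOARD]
    test = ids[N_TRAIN + N_LEADERBOARD:]
    return train, leaderboard, test


train, leaderboard, test = split_molecules()
print(len(train), len(leaderboard), len(test))
```

The point of the final held-out set is that no team sees its sensory ratings before the models are frozen, so a good score there cannot come from tuning against the answers.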

The models were judged on how well their predictions matched the actual sensory data using a bunch of wonky statistical procedures that look reasonable on my cursory inspection. (About the algorithmic structure of the competing models I have nothing useful to say, as “random-forest models” and the like are beyond my ken.) For the sake of argument I will assume that the statistical scorekeeping was appropriate to the task. My concern here is with the sensory methodology, the underlying assumptions, and the claims made for the predictability of odor perception.

Let’s begin with semantic descriptors. The widely used U.C. Davis Wine Aroma Wheel uses 86 terms to describe wine. The World Coffee Research Sensory Lexicon uses 85 terms to describe coffee. The Science paper uses 19 terms to describe a large set of “perceptually diverse” odorants, which strikes me as a relatively paltry number. (The descriptors were: garlic, sweet, fruit, spices, bakery, grass, flower, sour, fish, musky, wood, warm, cold, acid, decayed, urinous, sweaty, burnt, and chemical.) Well, you might ask, can’t they just add more descriptors to include qualities like “minty” and “fecal” and “skunky”? It’s not that easy, as I discuss below.

The internal logic of the descriptors presents another issue. Some are quite specific (garlic), others very broad (spices), and still others are ambiguous (chemical). What are we to make of “bakery” as a smell? Is it yeasty like baking bread? Is it the smell of fresh cinnamon buns? (Or would that be “sweet”? Or “spices”?) The problem here is that words that are useful in an olfactory lexicon occur at different levels of cognitive categorization. This is reflected in the wine and coffee examples.

The Wine Aroma Wheel has twelve categories, each with one to six subcategories. For example, the Fruity category includes Citrus which consists of Lemon and Grapefruit. The higher level categories provide overall conceptual structure and are themselves useful as descriptors (e.g. a scent might be citrus-like while not smelling exactly of lemon or grapefruit).

Sensory specialists (including tea tasters, beer brewers, and perfumers) spend a lot of effort setting up lexicons that are concise and hierarchical, and which cover the relevant odor perception space. How were the 19 terms in the Science study arrived at? We do not know. How well do they cover the relevant perception space? We do not know. In fact, the authors state that “the size and dimensionality of olfactory perceptual space is unknown.”

These 19 terms are the basis on which the competing computer models were ranked. Thus a model’s success at prediction is locked in to this specific set of terms (plus intensity and pleasantness). In other words, this is not a general solution to smell prediction: it is specific to these odors and these adjectives. The authors openly admit this:
While the current models can only be used to predict the 21 attributes, the same approach could be applied to a psychophysical dataset that measured any desired sensory attribute (e.g. “rose”, “sandalwood”, or “citrus”).
So if one wants to predict what molecules might smell of sandalwood or citrus, one would have to retest all 476 molecules on another 49 sensory panelists using the new list of descriptors, then re-run the computer models on the new dataset. Easy peasy, right? Alternatively one could assemble a sensory panel and have the members sniff the molecules of interest and rate them on the new attributes of interest. Every fragrance and flavor house has such a panel. That’s how they currently evaluate the aroma of new molecules: they sniff them.

Thus the Dream challenge seems to be tilting at a windmill that the fragrance and flavor industry doesn’t see. The search for new molecules is not done by searching random molecular permutations. It is driven by specific market needs, say for a less expensive sandalwood smell or for a strong-smelling but environmentally safe musk. The parameters are cost, safety, and patentability, along with stability, compatibility in formulations, and (for perfumers) novelty.

Who knows, the smell prediction algorithms of the Dream challenge may turn out to be the first step in automating the exploration of chemosensory space. However, I’d be surprised if this approach turns out to be generalizable and amazed if it proves useful in applied settings.

Don’t get me wrong. I like the idea of using Big Data to understand olfaction—have a look at my papers based on the National Geographic Smell Survey. I urged Keller and Vosshall to go big in terms of odorants and the number of sensory panelists for what became our co-authored paper in BMC Neuroscience. At the same time I respect the complexity of odor perception and the effort required to map its natural history. And I think the perceptual side of the equation got short shrift in this study.


The studies discussed here are “Predicting human olfactory perception from chemical features of odor molecules,” by Andreas Keller, et al., published online February 20, 2017 in Science, and “Olfactory perception of chemically diverse molecules,” by Andreas Keller and Leslie B. Vosshall, BMC Neuroscience 17:55, 2016.

Tuesday, January 3, 2017

Rate of Decay: The Case of Jonah Lehrer’s Twitter Account

Anyone active on Twitter experiences follower churn—the constant arrival of new followers and departure of existing ones. Some arrivals are follow-whores who will leave in short order if you fail to follow them back. Some are fake accounts attempting to build a legit patina. (Fake accounts are easy to spot and I delight in kicking them off my feed.) Then there are real-life porn actors and jihadists seeking to expand their reach. (Blocked and blocked.) Others follow you based on the odd single tweet and depart when they find your regular material is not to their taste. (de gustibus).

In general, one must tweet frequently to gain new followers. If you have a truly loyal set of followers they may stick around even if you tweet rarely.

But what happens at the limit, when an account ceases to tweet at all? In the absence of new material it is unlikely to attract new followers. Existing followers may eventually unfollow, or close their accounts, or be banned by Twitter. Thus we can expect an inactive account to shed followers gradually. But at what rate?

I have harvested data on a weekly basis from several Twitter accounts. One is that of Jonah Lehrer who enjoyed a brief vogue as a literary explainer of neuroscience. (I found him to be a superficial thinker and a lazy scholar; see the Proust chapter in What the Nose Knows.) After it became clear that Lehrer had recycled his own material and plagiarized the work of others he withdrew from the science journo-biz and, among other things, ceased tweeting.


The last regular tweet on @jonahlehrer was dated June 17, 2012. On February 13, 2013 he posted a link to the text of a speech he gave to the Knight Foundation in which he apologized for his behavior (and for which he was paid $20,000). After that, nada.

So how did Lehrer’s Twitter followers react after he went silent? Well, here’s the answer, based on weekly tallies from October 14, 2012 through December 31, 2016.


Over that period Lehrer lost 6,258 followers. Their number declined to 40,620 from 46,878. The steady decline was interrupted by three increases: a spike of 2,005 followers the week of October 28, 2012; a blip of 369 followers around May 2013; and another spike of 1,998 in the week of August 24, 2013. (Cynical readers might note that Twitter followers can be bought by the thousand online. Whether something like that happened here, I cannot say. The spikes remain a mystery.)

Aside from the anomalous spikes, the decline in followers shows a remarkably steady linear trend. I analyzed the 173 weeks following the second spike, during which the follower count dropped to 40,620 from 47,800 for a loss of 7,180. Over that interval, Lehrer lost on average 0.0935% of his followers each week. Based on this rate of decay, the half-life of his following is 741 weeks or about 14 years. In other words, he should be down to 20,000 followers in 2031. We can expect him to dip under 100 followers in the year 2140.
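For anyone who wants to check the arithmetic, here is a minimal sketch of the half-life calculation. It assumes a constant fractional loss each week (simple exponential decay) and treats the December 31, 2016 tally as the start of 2017. The variable and function names are my own for illustration, and the endpoint-based rate at the bottom is just a cross-check, not necessarily how the weekly average quoted above was computed.

```python
import math

WEEKLY_LOSS = 0.000935        # 0.0935% per week, the figure quoted above
START_COUNT = 40_620          # followers as of December 31, 2016
START_YEAR = 2017.0           # assumption: end of 2016 treated as start of 2017
WEEKS_PER_YEAR = 365.25 / 7   # about 52.18


def weeks_until(target, start=START_COUNT, rate=WEEKLY_LOSS):
    """Weeks for `start` to fall to `target` at a constant fractional weekly loss."""
    return math.log(target / start) / math.log(1.0 - rate)


half_life_weeks = weeks_until(START_COUNT / 2)
half_life_years = half_life_weeks / WEEKS_PER_YEAR
year_at_half = START_YEAR + half_life_years
year_under_100 = START_YEAR + weeks_until(100) / WEEKS_PER_YEAR

# Cross-check: the weekly rate implied by the two endpoints of the 173-week
# window (47,800 down to 40,620), assuming compounding week over week.
endpoint_rate = 1.0 - (40_620 / 47_800) ** (1 / 173)

print(f"half-life: {half_life_weeks:.0f} weeks ({half_life_years:.1f} years)")
print(f"half the followers gone during: {int(year_at_half)}")
print(f"under 100 followers during: {int(year_under_100)}")
print(f"endpoint-implied weekly loss: {endpoint_rate:.4%}")
```

The endpoint-implied rate comes out near 0.094% per week, close to the 0.0935% average used above, so the half-life of roughly 741 weeks is not sensitive to which of the two you plug in.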

That’s one long, shallow glide path.

Is Lehrer’s case typical? Who knows. Maybe his followers are fanatically devoted and waiting, year after year, for him to return to Twitter. Or maybe they never noticed that he left in the first place. Having once clicked “follow” they remain fixed to his account like so many barnacles on the bottom of a boat.

Monday, November 28, 2016



It’s Cyber Monday and I’ve put the Kindle version of What the Nose Knows on sale at a deep discount: only $2.99. Why not grab a copy?

Do you have friends or relatives who are into scent? You can buy them a Kindle copy! Just hit the “Give as Gift” button on Amazon. You can have it delivered to them instantly or you can schedule it to arrive on their device at any time and date you choose. Great idea for Christmas or Hanukkah.

Plus: plenty of time to order a paperback copy for old-school types. It’s already priced as low as it can go!

Sunday, September 11, 2016

Fifteen Years After 9/11



Fifteen years later and the emotions remain raw.

These are notes I made on the fifth anniversary of 9/11 when I still lived in Montclair, New Jersey.

September 11, 2006

Over morning coffee I read some blog essays remembering the attacks and where we stand five years later. S and I talked about our frustration with friends who don’t see the same threat we do; how we want to shake them by the shoulders and wake them up.

I couldn’t concentrate on my writing so I set off to do some mindless photocopying in town. Walking to the car I noticed the sky was a perfectly clear blue. Strange how the weather on the anniversary is almost always the same as the original.

I drove past K’s house [the 9/11 widow up the block] and noticed several pickup trucks parked outside. Her fireman friends keeping her company, as they do every year on this day.

I saw [my neighbor] D jogging down the street and pulled over to say hello.

“This is such a sad day,” he said.

“I know. It still gets to me too,” I said. (I had to look away.)

Then he blurted out, “It’s the same sky.” Now he was sobbing. “I’ve been crying almost my whole run.”

He walked off and I drove to the copy shop where I had to keep my shades on.

Thought I was back on an even keel. Then at four in the afternoon a Montclair Fire Department ladder truck came up the street with lights flashing, flying the Stars and Stripes. They paused a while in front of K’s house, blew their horn, and drove on.

I was a mess.

It’s dark out now. They’ve lit the two blue lights at Ground Zero. I still think it’s the best memorial.

Wednesday, June 15, 2016

Avery and the Cannabis Factory



I recently had the opportunity to tour a licensed, commercial cannabis grow operation in Denver. It was a mind-blowing experience, but not for the obvious reason—I haven’t tried marijuana since an unfortunate episode in graduate school. [A story, like that of the giant rat of Sumatra, for which the world is not yet prepared.—Ed.]

The operation is located in a nondescript, unlabeled building in the semi-industrial outskirts south of town. One step inside and there is no doubt where you are—the scent of cannabis bud hits you full in the face. That’s because the drying room is next to the entrance. Newly harvested and trimmed buds are spread on wire mesh frames, which are slotted into wall racks. The drying room is kept dark and air conditioned, while fans circulate the chilly air. This struck me as a counterintuitive way to dry the product, but Gary, the slim, thirty-something Willy Wonka who runs the site, tells me the aim is to suppress mold growth.*

As I sign in and get my visitor’s badge, I notice the wall of the entryway—it’s covered with a dozen or more framed certificates from the army of agencies that regulate every aspect of commercial cannabis in the state of Colorado: health, revenue, environmental, etc.

A couple of doors down is a small break room. With a refrigerator and cute sign reminding people to clean up after themselves, it could be the lunch area of a muffler shop or book store—except for the three HD screens on one wall. Each screen displays multiple views from security cameras mounted everywhere in the building, inside and out. Security is a major concern and private patrols visit the facility throughout the night.

The actual process of cultivation begins with small, two- or three-branch cuttings from mature plants. Like straws in a soda cup lid, they are stuck into fittings atop a plastic tub. Inside the tub an “aquaponics” system periodically sprays the bottom of the clippings; the water contains carefully calibrated amounts of nutrients to encourage root growth. Once the sproutlings have developed enough of a root system, they are transferred to quart-size plastic bags of potting soil. Then they are given a unique, bar-coded, plastic ID tag purchased from a state-authorized provider. Individual plants are tracked this way until harvest, part of the minute oversight of inventory required by Colorado authorities.

In their quart bags, the new plants resemble the immature tomato plants for sale at Safeway. Arranged in neat rows on white platforms, they begin to bask under a glaring array of artificial lights that keeps the room as warm as the greenhouse it is. The room air is humid but odorless.

After a week or two, the now established plants are transferred to three- or five-gallon bags and moved to a “vegetative room” where they get 18 hours of light daily from high output T5 fluorescent bulbs. They are watered via an automated drip tube system. The room air is kept moving by wall-mounted oscillating fans. Here, again, the room is relatively nonodorous.

Once they are mature, the plants are transferred to a “bloom room.” Individual stalks are woven through a loose rope grid that separates the tops and ensures that each newly forming flower bud gets maximal exposure to light. Besides the usual automatic lighting and watering system (now delivering a nutrient mix optimized to encourage bud growth), the bloom room is supplied with extra carbon dioxide. It is delivered (during “daytime” only) via an automated system that monitors the ambient concentration and keeps it at a specific parts-per-million level. On the bloom room door is posted a sign: “You are entering a possibly oxygen deficient atmosphere.” Should the CO2 level exceed the safety threshold, a warning light flashes and an alarm sounds. It’s an expensive system, but it’s required by the state. “Seems reasonable,” I say to Gary, who shrugs his shoulders. He operates under an OSHA rule that specifies one permissible level for CO2 and under a Colorado marijuana-specific rule that specifies a different, lower level.

In the bloom room, I notice for the first time a distinct background odor—it smells like ripe, even overripe, cantaloupe melon. Nothing at all like the funky herbal smell of dry bud. Yet in order to minimize the risk of complaints from neighbors, Gary has installed large, cylindrical carbon filter units in each bloom room. Harvest day, he says, produces the greatest amount of smell, presumably from the volatile and odorous terpenes that are released as the plants are cut.

The previous tenant of this building was a light-industrial operation. When the marijuana company took over the building it completely renovated the interior. The array of active technologies—lights, fans, watering systems, CO2 delivery, air filtration, HVAC—requires an enormous amount of power. Multiple electrical conduits and air conditioning ducts course along the ceiling and curve down to individual grow rooms—it’s like being inside a gigantic pipe organ. Gary is on personal terms with his Carrier air conditioner rep who makes weekly visits to keep the system tuned up.

Gary keeps 70 different cannabis strains in rotation. From a manufacturing and logistics point of view this is an insane number. But the dispensaries he supplies—and their retail customers—demand it. If he delivers only 10 varieties they complain. What drives this demand? Besides sheer THC potency there are real and/or perceived psychotropic differences between sativa and indica strains, as well as differences in bud aroma and smoke flavor. And thus separate calls for OG Kush, Girl Scout Cookies, Strawberry Cough, Alaskan Thunder Fuck, and all the rest.

Having lived for a year now in Colorado, I’m beginning to see the connection with another artisanal, sensory- and botanical-based industry: beer. There are a ton of microbreweries in the state, and each one produces a range from stouts to IPAs. A microbrewery that produced only two or three beers would soon fail. People revel in sensory choice. And if they are accustomed to getting it in one area, they expect it in others.

This leads me to propose Gilbert’s First Conjecture of Sensory Marketing:
For a given zip code, the number of microbrews on tap is positively correlated with the number of cannabis strains available in dispensaries.
Besides the usual burdens of running a business—hiring, scheduling, payroll, accounting—Gary has the responsibilities of a vintner. He decides when the plants are ready to harvest: too soon and they lack potency, too late and they become stale with oxidized THC. He has to plan the rotations of each strain, and stagger their growing cycles so he can harvest one of his eight bloom rooms each week. Then there is the matter of drying and curing. Gary does this the old school way, in two steps. From the racks in the drying room the buds go into large glass jars, equipped with humidity sensors, where they stay for another week or so, until they reach their optimum in terms of aroma and smokability. I suspect cigar makers and aficionados would agree. I sniff from several jars to get an idea of the range of strain-specific aromas. Some are not very smelly; in others I detect mint, a conifer note, and the vegetal scent of cooked artichoke. Elsewhere I’ve smelled buds that reeked of orange peel. There’s a lot of opportunity here for systematic olfactory evaluation.



With demand far outstripping supply, commercial cannabis growers are under enormous pressure to deliver mass quantities. To speed things up, some growers claim to use techniques that telescope drying and curing into a single step. Not Gary. However, he does make some concessions to efficiency. Instead of having employees trim buds by hand, he puts the buds into a tumbler-like device, an automated Edward Scissorhands. The broken leaf residue or “trim” doesn’t go to waste. (It can’t—the state requires that he account for every gram of it.) Gary’s operation, like others, sells the trim to extractors who transform it into oil, shatter, and the multitude of derivative products that are sold directly to consumers or that go into products such as edibles.

I have toured essential oil extraction operations in Grasse, fragrance compounding factories in New Jersey, a Budweiser brewery in Colorado, and numerous wineries and champagne makers in California. They all manage to bring industrial scale to what is, at its core, a process of subjective, sensory-based, aesthetic design. The Denver grow house, with its blend of technology, agronomy, and logistics, tells me that cannabis is on the road to similar scale.

*If Gary is the Willy Wonka, the Oompa-Loompas of the factory bear an uncanny resemblance (minus tattoos and piercings) to certain of my Berkeley Coop housemates of the 1970s.