Monday, July 22, 2019

Perfume and Big Data: Garbage In, Garbage Out

 A new “big data” study of perfume has been making a splash the past couple of weeks. Titled “Social success of perfumes,” it was published in PLoS ONE on July 4. It’s another in the recent genre of throwing abstruse mathematics at the sense of smell.

The authors, Vaiva Vasiliauskaite and Tim S. Evans, are in the physics department at Imperial College London, more specifically the Theoretical Physics Group and Centre for Complexity Science. [Awesome nameage!—Ed.] Vasiliauskaite is a graduate student and apparently a talented nerd, having graduated from the University of Glasgow with a degree in Theoretical Physics. Evans is a Senior Lecturer.

Here’s the abstract of their paper:
We study data on perfumes and their odour descriptors—notes—to understand how note compositions, called accords, influence successful fragrance formulas. We obtain accords which tend to be present in perfumes that receive significantly more customer ratings. Our findings show that the most popular notes and the most over-represented accords are different to those that have the strongest effect to the perfume ratings. We also used network centrality to understand which notes have the highest potential to enhance note compositions. We find that large degree notes, such as musk and vanilla as well as generically-named notes, e.g. floral notes, are amongst the notes that enhance accords the most. This work presents a framework which would be a timely tool for perfumers to explore a multidimensional space of scent compositions.
Leaving aside the technical terms “network centrality” and “large degree notes,” the claims in the abstract seem clear: the authors have identified odor descriptors that drive the commercial success of specific perfume formulations. Sounds interesting and useful. One dives eagerly into the paper to find the details.

That’s where the problems start.

No actual perfumes were smelled in the making of this study. Nor were any actual perfume formulations examined. Instead, the authors apparently scraped perfume description and ranking data from a website, cleaned it up a bit, and then proceeded to slice, dice, and theorize.

I say “apparently” because nowhere in the paper do they describe where or how they obtained their data. I imagine my amateur readers stammering, “B-b-b-but, don’t scientists have to describe their data?” Yeah, well, uh, no, I guess not.

From clues in the text (e.g., perfumes rated on a five-point scale), Vasiliauskaite and Evans may have tapped a particular fragrance community site as their source. In which case, they might at least have been courteous enough to give the site's proprietors a shout-out. Or, better yet, have obtained permission to use the site's data.

Of greater ethical concern is the absence of the source data in a supplemental file or online scientific archive. Making the data available is a requirement of publication in PLoS ONE.
PLOS journals require authors to make all data underlying the findings described in their manuscript fully available without restriction at the time of publication.
I don’t understand how this basic requirement could have been overlooked by the reviewers or the paper’s editor, Yongli Li of the Harbin Institute of Technology in China.

So, let’s look again at the abstract. By “successful fragrance formulas” the authors mean perfume brands with a large number of user reviews and high ratings on an undisclosed fragrance website. By “notes” and “odor descriptors” they mean descriptors provided by anonymous reviewers of undetermined skill level and/or marketing verbiage lifted from advertisements and promotional copy. On such a site, gauzy words like “honey,” “amber” and “musk”—which refer to no one specific perfumery material—are given equal weight with clary sage, tonka bean, and oakmoss (see, for example, Paco Rabanne Pour Homme). By “accords” the authors appear to mean co-occurring “notes” on a webpage.
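To see how thin that definition of “accord” is, here is a minimal sketch of what such an analysis reduces to. The perfume names and note lists below are made up for illustration—they are not from the paper's (unpublished) dataset:

```python
from itertools import combinations
from collections import Counter

# Hypothetical scraped data: each perfume maps to the notes listed
# in its webpage description (illustrative names only).
perfumes = {
    "Perfume A": ["rose", "vanilla", "musk"],
    "Perfume B": ["bergamot", "rose", "oakmoss"],
    "Perfume C": ["vanilla", "musk", "amber"],
}

# An "accord" in this framing is nothing more than notes that appear
# together in a description: count every unordered pair of co-listed notes.
pair_counts = Counter()
for notes in perfumes.values():
    for pair in combinations(sorted(set(notes)), 2):
        pair_counts[pair] += 1

print(pair_counts[("musk", "vanilla")])  # → 2: co-listed in A and C
```

Note what this counting cannot capture: concentrations, interactions between materials, or whether a listed note is even perceptible in the finished perfume.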

Other passages make one wonder if the authors know what they’re talking about. Take this paragraph, for example:
Information of the precise amounts of each ingredient in the formulation of a perfume is confidential, to prevent duplications of the formula. However, the list of ingredients, the list of notes, is often advertised in order to describe the scent of a perfume. Thus a perfume which smells of rose, vanilla and musk, is described using such notes. In this study we have analysed the notes which make up over ten thousand perfumes without knowing anything about their specific amounts in each perfume. We assume that a note is included in the perfume description as its presence enriches the composition and its smell is detectable.
This paragraph is confused to the point of idiocy. A perfume’s list of ingredients is never revealed, much less advertised. What is publicly promoted by the brand is a short list of note names meant to imply romance, exotic origins, and high quality as much as what the perfume smells like. Even then, it’s unclear whether the “notes” used in this study were provided by the brand or plucked from the website’s crowd-sourced reviews. The authors “have analysed the notes which make up over ten thousand perfumes”? Fairer to say they’ve analyzed notes attributed to perfumes (by persons unknown) that may or may not include all the salient notes in a given perfume. The stated assumption that each note in a description is individually “detectable” is ridiculous. Do the authors believe that the grapefruit and Calabrian bergamot in Coco Noir are individually detectable? One wonders how much smelling they’ve ever done.

At this point it’s clear that no matter how much “network” and “non-network” statistical analysis they apply to this slop bucket of data, the results will lack specificity and insight. For all of its nodes, edges, weighted network representations, permutation tests, d-scores, and one-mode projections, the study doesn’t make much contact with the commercial or sensory realities of perfumery. It could have been interesting as an analysis of brand attributes seen through social media. But even there it fails.
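For readers unfamiliar with the jargon in that paragraph: a “one-mode projection” takes the bipartite perfume–note network and links two notes whenever some perfume lists both, and a note's “degree” is its number of such links. A toy sketch (data invented for illustration, not the paper's) shows why generic hub notes like musk and vanilla inevitably come out looking important:

```python
from collections import defaultdict

# Toy bipartite edges: perfumes on one side, notes on the other
# (illustrative only).
edges = [
    ("Perfume A", "rose"), ("Perfume A", "vanilla"),
    ("Perfume B", "rose"), ("Perfume B", "musk"),
    ("Perfume C", "vanilla"), ("Perfume C", "musk"),
]

# Group the notes listed under each perfume.
notes_by_perfume = defaultdict(set)
for perfume, note in edges:
    notes_by_perfume[perfume].add(note)

# One-mode projection onto notes: link two notes if any perfume lists both.
projection = defaultdict(set)
for notes in notes_by_perfume.values():
    for a in notes:
        for b in notes:
            if a != b:
                projection[a].add(b)

# A note's degree counts its distinct co-listed partners; the paper's
# "large degree notes" are simply the hubs of this projection.
degree = {note: len(neighbours) for note, neighbours in projection.items()}
print(degree["rose"])  # → 2 (linked to vanilla via A and musk via B)
```

A ubiquitous, vaguely named note accumulates links with everything, so its high centrality reflects how often it is written on webpages, not how much it contributes to any formula.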

This paper isn’t worth the time it takes to download.

UPDATE September 19, 2019
Called it! The paper has now been retracted by PLoS ONE.

The study discussed here is “Social success of perfumes,” by Vaiva Vasiliauskaite and Tim S. Evans, PLoS ONE 14(7): e0218664.


Peter Apps said...

Throwing dodgy data into elaborate and sophisticated statistical modelling, stirring vigorously, and seeing what floats to the top is becoming a disturbingly common substitute for experimental design and critical thinking.

Avery Gilbert said...

Peter Apps,

Agree. I'd add that not all naturally occurring datasets are dodgy. One needs to judge their reliability, and their appropriateness to the question at hand. That didn't happen here.

Peter Apps said...

No, indeed, and selecting which pre-existing data are sound is where the critical thinking comes in. All else being equal, there seems to be an inverse relationship between quantity and quality.
