During the pandemic, social media sleuths, epidemiologists and health nerds noticed an interesting trend in the Yankee Candle review section on Amazon.
Whenever there was an influx of negative reviews citing no smell, there was usually a spike in COVID cases to go along with it.
new wave of bad reviews for yankee candles pic.twitter.com/1mlandB78I
— drawtoothpaste (@drawtoothpaste) December 21, 2021
Loss of the sense of smell is one of the most recognized symptoms of a COVID infection. After noticing this trend, people started wondering: Could the reviews themselves be a reliable indicator of a resurgence of the virus?
This theory has been put under the microscope and has taken on new relevance amid concerns over a lack of official data on infections across the United States as another winter approaches.
How a review became a red flag
Nick Beauchamp, an associate professor of political science at Northeastern University, first caught wind of the Yankee Candle theory late last year.
He decided it wouldn’t be too hard to find out whether there was actually a connection. Having worked on previous projects that attempted to predict COVID cases using social media data, he set out to build a model to test it.
“I just thought, well, that’s pretty easy to do. Maybe I’ll just try scraping a few reviews from Amazon and see what the real trends are, instead of just cutting and pasting a few reviews that mention a lack of smell,” Beauchamp said.
To his surprise, the relationship was clear: COVID cases followed a pattern very similar to the frequency of the “no smell” reviews.
Here’s a chart of “no smell” complaints for the three best Yankee candles on Amazon. pic.twitter.com/EFUsGil5k4
—Nick Beauchamp (@nick_beauchamp) December 22, 2021
Beauchamp’s first tweets about the findings in December 2021 also went viral, and he was quick to add more data to arrive at a more definitive answer. In mid-January, he wrote a paper and submitted it to a journal, and in June of this year it was published.
“It’s a very small journal, but I think it’s got a lot of people interested, especially because it tries to do something a little more carefully about something that a lot of people have qualitatively noticed on Twitter,” he said.
Ultimately, the paper’s findings showed that COVID cases were predictive of reviews, which means that if there was a recorded increase in COVID cases, there would likely be an increase in negative reviews. But could it work the other way around?
“The other thing I was trying to come up with was, ‘Can we predict COVID cases using reviews?’ And what we found is that at least until December 2021, not really. Using past COVID cases to predict future COVID cases is pretty good, and you can’t really do better by using reviews,” he said.
But then something changed. After adding more months of data to his model in June of this year, he found that the relationship between reviews and COVID rates had reversed: Reviews were now predictive of COVID rates.
In other words, the rise in negative reviews might actually be an earlier warning sign than the official COVID data.
“It’s either due to a lack of COVID metrics, or worse COVID metrics, or maybe something else changing. I guess the reviews themselves weren’t changing much,” Beauchamp said.
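For readers curious about what "predictive" means here, the following is a minimal sketch and not Beauchamp's actual model. It swaps in a much simpler technique, lagged correlation, to show how one weekly series can lead another. The function names and every number in it are hypothetical, invented only for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / (spread_x * spread_y)


def lagged_correlation(leader, follower, lag):
    """Correlate leader[t] with follower[t + lag] for a lag >= 0 (in weeks)."""
    if lag < 0 or lag >= len(leader):
        raise ValueError("lag must be between 0 and len(leader) - 1")
    if lag == 0:
        return pearson(leader, follower)
    return pearson(leader[:-lag], follower[lag:])


# Made-up weekly counts, NOT real data: after the first week, each value in
# `cases` is ten times the previous week's value in `reviews`.
reviews = [1, 2, 5, 9, 6, 3, 2, 4, 8, 5, 2, 1]
cases = [10, 10, 20, 50, 90, 60, 30, 20, 40, 80, 50, 20]

# Find which lag (0 to 3 weeks) lines the two series up best.
best_lag = max(range(4), key=lambda k: lagged_correlation(reviews, cases, k))
print(best_lag)  # 1, because the toy data was built with a one-week delay
```

In this toy data, complaints line up best with cases one week later, which is the sense in which reviews could act as an early warning. A real analysis would also need to show that reviews improve on forecasts built from past case counts alone, which is the harder test described above.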
One interesting development Beauchamp has seen is that the tweets and the study itself have become their own kind of metadata, regaining popularity whenever users notice an increase in COVID cases.
Some researchers refer to these trends as “digital breadcrumbs” because online activities, such as searches, interacting with old Twitter feeds or, in this case, posting a review, can provide unique insight into a person’s actual circumstances.
As for Beauchamp, he maintains a healthy level of skepticism about the study, even with all of its controls.
Why some think official data is a ‘big mess’
These days, the quality of COVID tracking has become a cause for concern for Beauchamp and other experts working with public health data, especially since President Joe Biden has declared the pandemic “over.”
“Traditional data sources are getting worse. The CDC is kind of reducing their measurements. Everybody’s testing themselves less frequently. They’re reporting these things to government agencies less frequently,” Beauchamp said.
He also cited the reduction in sewage monitoring and said the frequent attention to Yankee Candle reviews was an example of how many people were still invested in tracking COVID numbers.
“Those of us who still care and worry about the pandemic, and don’t think it’s over, are looking for other sources of data that can be used to track new waves and that sort of thing,” he added.
Abraar Karan, an infectious disease physician and researcher at Stanford University, said the evolving nature of the virus has made it difficult to find and sustain the most effective means of collecting and analyzing COVID data, particularly three years after the start of the pandemic.
“If we go back to the start of the outbreak, every case that we were documenting meant a lot. And we were trying to figure out what to do with that data,” Karan said.
Over time, new issues have arisen, such as reinfections and how to document them. Karan also cited reduced testing and its decentralization as other obstacles. Many people have stopped getting tested frequently, if at all, and those who choose to test at home often do not report their results to public health services.
But at this point in the pandemic, Karan said tracking some key sources, though less robust than in previous years, had proven to be an effective strategy, given the breadth of data available from previous years.
He said observing trends in reported cases was the clearest method, as long as there was no recent change in the amount of testing available.
“The most relevant question I get asked, as a doctor or an epidemiologist, is, ‘What is the risk of me doing activity X to get SARS-CoV-2?’ And frankly, at this point you can really only respond broadly, on the basis of the [trend] activity, and less about what’s going on around you, because the data is a big mess,” he said.
Karan also noted that sewage data could be extremely useful, although not very accurate in measuring the number of cases.
Ultimately, Karan said a combination of data sources could help experts and everyday people make the best decisions for themselves when it comes to their COVID safety.
“People are constantly evaluating these risks and benefits based on limited data, but data nonetheless. So you can triangulate a lot of things, like all of the things that we just talked about, to get some sort of assessment of where we are with new variants,” he said.
And when it comes to including Yankee Candle data in the mix?
“Those kinds of things are used more in public health for research. But at this stage of COVID, I don’t think candle reviews are going to change our public health strategy,” he said.
Instead, it could be an indication that there is more untapped data online that could be useful for the greater good. And if there is, Beauchamp is all for it.
“It’s better to unite in some kind of movement here, if we can,” he said. “So I’m happy to be a small part of that.”