On Tuesday, a Twitter user named Zach Goldberg posted some intriguing graphs built from data he gathered during a weekend spent with LexisNexis. If you’re not familiar with it, LexisNexis is a private company that maintains a searchable database of legal information along with an extensive database of news publications. In this case, Goldberg used the system to search for specific words and see how often they appeared in news publications over time. For instance:
(#7) If you're looking for a job in a booming industry…. pic.twitter.com/nERSV7N2eH
— Zach Goldberg (@ZachG932) May 28, 2019
#16 Our future is intersectional pic.twitter.com/YNWk8XZ8g5
— Zach Goldberg (@ZachG932) May 28, 2019
Goldberg realized that you can’t tell from these graphs whether the terms are being mentioned positively or negatively. In other words, some of the people writing about “unconscious bias” are probably quoting others who use the term and then arguing that it’s nonsense. Still, you get a sense that many of these terms have spiked in just the past few years.
But there’s another wrinkle, and it’s potentially significant: the rise of online media. These graphs probably don’t account for the overall increase in publishing volume now that outlets no longer need to print everything on paper. That transition has been taking place over the last decade or so. To get a better sense of what was going on, Goldberg limited his searches to a single source, the NY Times. The results were similar:
(#10) pic.twitter.com/nVDNOR0eDY
— Zach Goldberg (@ZachG932) May 28, 2019
#12 pic.twitter.com/jL390W7jwh
— Zach Goldberg (@ZachG932) May 28, 2019
At some point, he realized a few of the data sets he’d collected were undercounts because he used quotes around the word in question:
The 'privilege' NYT graph was also an undercount. Obviously, the word 'privilege' need not refer to social/racial privilege. I nevertheless find it interesting that its use has grown tremendously over the past few years. It could be just a coincidence, but I'm skeptical. pic.twitter.com/AilEduyvAZ
— Zach Goldberg (@ZachG932) May 29, 2019
It’s still possible, even likely, that the NY Times is simply posting a lot more stories in the past couple of years than it was previously. So the raw word counts don’t prove ‘racism’ is used in a higher percentage of articles. Maybe there are just a lot more articles. To find out, you’d need to run the same search on some non-partisan words as a control, or normalize the graph by the total number of articles published each year. Bottom line, it’s not clear what’s happening from this limited data.
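For anyone who wants to try that kind of normalization, here is a rough sketch in Python of what it could look like. It is not Goldberg’s method, and every number in it is made up purely for illustration; real yearly totals would have to come from LexisNexis or the publication itself. The function name `mention_rate` and both dictionaries are hypothetical.

```python
from typing import Dict


def mention_rate(term_counts: Dict[int, int],
                 total_articles: Dict[int, int]) -> Dict[int, float]:
    """Return, per year, the share of all articles that mention the term.

    Years with no known (or zero) article total are skipped, since a
    rate cannot be computed for them.
    """
    rates = {}
    for year, count in sorted(term_counts.items()):
        total = total_articles.get(year, 0)
        if total > 0:
            rates[year] = count / total
    return rates


# Hypothetical, made-up figures: articles mentioning a term, and all
# articles published, for three sample years.
term_counts = {2012: 100, 2014: 150, 2016: 300}
total_articles = {2012: 50_000, 2014: 75_000, 2016: 100_000}

rates = mention_rate(term_counts, total_articles)
for year, rate in rates.items():
    print(f"{year}: {rate:.2%}")
```

In this invented example the raw count rises 50 percent between 2012 and 2014, but the rate stays flat because total output rose by the same proportion. Only the jump in 2016 would reflect a real change in how often the term is used.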
That said, it certainly seems plausible that there has been a big upswing in terms like “privilege” in the past few years. In many of these graphs, the race-related social justice terms start climbing sharply in 2014. What was happening around then? Well, George Zimmerman was acquitted of murdering Trayvon Martin in the summer of 2013. Eric Garner died in July 2014. Mike Brown was shot in August 2014. Again, you can’t draw firm conclusions from the limited data, but it seems plausible that a lot of these terms became far more common in the wake of Black Lives Matter. If nothing else, it’s an intriguing area for further investigation that could either confirm or disprove what these graphs suggest is happening.