Bye Karen

Using data from the Social Security Administration (SSA), I set out to answer a basic question: has the name Karen been irreparably destroyed?


After the most recent escapade in the chronicles of Karen, I got to thinking: have the actions of this small cadre of women permanently tarnished the integrity of the name? Since I'd been looking for an opportunity to practice with Jupyter Notebook and Pandas anyway, a weekend project offering that practice and the chance to strike back at the cult of Karen seemed like a double bonus. With that, let's take a look at some of my findings (you can check out the full notebook on GitHub).


Reframed at the most basic level, the question I wanted to answer was: did the 'Karen' movement, which began in 2018 and started gaining significant traction in 2019-2020, have a meaningful effect on the popularity of the name for newborns? As a first step toward answering this, I needed data. Luckily, the SSA exposes a portal for viewing the popularity of baby names going back to the late 1800s. Using a simple scraper, I collected the following:

  1. A static list of the top 20 baby names from 1970 (I chose this date due to the popular correlation of Karen figures to the baby boomer segment). Karen was the 12th most popular female name that year with close to 1% adoption across female babies, immediately following Julie and preceding Laura.
  2. For each name in this static list, I collected its popularity by year as a rank from 1 to 1000 (names ranked below 1000 aren't surfaced in queries and are simply dropped from that year's results). This data is read into a DataFrame with the columns name, year, and rank.
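
As a rough illustration of the shape of that data, here is a minimal sketch of building such a DataFrame. It is not the notebook's actual code: the `records` list, the variable names `ranks` and `wide`, and every rank value are placeholders rather than real SSA figures.

```python
import pandas as pd

# Hypothetical scraped records as (name, year, rank) tuples.
# These values are placeholders for illustration, not real SSA data.
records = [
    ("Karen", 2018, 600),
    ("Karen", 2019, 650),
    ("Karen", 2020, 800),
    ("Laura", 2018, 350),
    ("Laura", 2019, 360),
    # No 2020 row for Laura: this mimics a name missing from a year's results.
]

# Long format: one row per (name, year) pair.
ranks = pd.DataFrame(records, columns=["name", "year", "rank"])
ranks = ranks.sort_values(["name", "year"]).reset_index(drop=True)

# Wide format: one column per name. A year with no row for a name
# (for example, because it fell outside the top 1000) shows up as NaN.
wide = ranks.pivot(index="year", columns="name", values="rank")
print(wide)
```

The long format is convenient for grouping by name, while the wide format makes gaps in a name's history easy to spot.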


After reading in and parsing the data, the next step was doing some basic data exploration. Side note: I wish I had more Pandas experience at this point!

uh-oh Karen

Interesting! While the names closest to Karen in popularity in 1970 saw relatively minor changes in rank from 2018 to 2020 (the average change in rank being 27 and 11 for Julie and Laura, respectively), Karen dropped a whopping 97 spots in the rankings! Further, I would posit that with the Karen meme really hitting in 2019, the effect on baby naming should surface with a 1-9 month lag, which would explain why the cataclysmic descent didn't hit until 2020.
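
A comparison like this could be computed roughly as follows. This is a minimal sketch under stated assumptions, not the notebook's actual code: the `rank_change` column and the `summary` table are names introduced here, and the rank values are placeholders rather than real SSA figures.

```python
import pandas as pd

# Placeholder ranks for illustration only (not real SSA figures).
ranks = pd.DataFrame({
    "name": ["Karen"] * 3 + ["Julie"] * 3,
    "year": [2018, 2019, 2020] * 2,
    "rank": [600, 650, 700, 500, 520, 530],
})

# Year-over-year change in rank within each name. A positive value means
# the rank number grew, i.e. the name became less popular.
ranks = ranks.sort_values(["name", "year"])
ranks["rank_change"] = ranks.groupby("name")["rank"].diff()

# Summarize the 2018-2020 window: mean yearly change and total change.
recent = ranks[ranks["year"].between(2018, 2020)]
summary = recent.groupby("name")["rank_change"].agg(["mean", "sum"])
print(summary)
```

The first year of each name has no previous year to compare against, so its change is NaN and is ignored by both aggregates.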

But is it possible this effect is somewhat cyclical? Perhaps looking back in the history of the name we'd see a similar pattern having occurred previously.

in short, Karen took a nosedive

While the name seems to have been on something of a... downturn for a while, it certainly took a turn for the worse in 2020.

Next, I looked at some of the other names from the list to see how they tended to move over time. There were certainly a few surprises:

Dawn with a brief spike in popularity
Elizabeth, truly timeless

However, none had the breathtaking drop in rank Karen enjoyed.

To get a better sense of how the change in rank Karen saw in 2020 differed from the normal fluctuations seen for the names in the list, I created a new DataFrame tracking, for each name, the average change in rank, the max change in rank seen through its history, and the delta between the two.
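
One way such a summary table could be put together is sketched below. It assumes that "change in rank" means the absolute year-over-year difference, which may not match the notebook's definition; the column names `avg_change`, `max_change`, and `delta` are introduced here for illustration, and the rank values are placeholders rather than real SSA figures.

```python
import pandas as pd

# Placeholder ranks for illustration only (not real SSA figures).
ranks = pd.DataFrame({
    "name": ["Karen"] * 4 + ["Laura"] * 4,
    "year": [2017, 2018, 2019, 2020] * 2,
    "rank": [600, 610, 630, 700, 300, 310, 315, 325],
})

# Assumption: "change in rank" is the absolute year-over-year difference.
ranks = ranks.sort_values(["name", "year"])
ranks["rank_change"] = ranks.groupby("name")["rank"].diff().abs()

# Per-name average change, largest single-year change, and the gap between them.
stats = ranks.groupby("name")["rank_change"].agg(
    avg_change="mean",
    max_change="max",
)
stats["delta"] = stats["max_change"] - stats["avg_change"]

# Names whose biggest move stands out most from their usual movement come first.
stats = stats.sort_values("delta", ascending=False)
print(stats)
```

A large delta flags a name whose most extreme year looks very different from its typical year, which is the pattern the comparison is meant to surface.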

Karen easily with most drastic drop

Based on this, we can see that the delta between Karen's max rank drop and its average rank change is more than 30% greater than the next largest delta in the list!

Karen's max rank change vs. the average for the name

Takeaways and Opportunities for Future Exploration

In short, it seems safe to say that the Karen pandemic of 2018-2020 has had a dire effect on the standing of the name. Of course, without more rigorous statistical analysis we can't say with any degree of certainty that the observations aren't part of a natural pattern, but at first glance it seems pretty damning.

This being a weekend project, it's definitely not the sort of thing I'd show off in my portfolio. Not that I'd necessarily want to show off a thinly veiled blasting of Karen in a portfolio anyway. Still, some ways we could improve the analysis include:

  1. Pulling additional names into the experiment. Especially interesting would be analyzing a name surrounded by similar levels of venom (perhaps, oh I don't know, something like Jeffrey) and comparing its movement in rank against Karen's.
  2. Applying a rigorous statistical analysis to better understand the probability of seeing the sort of rank movement observed without the presence of outside influence.
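
As one possible starting point for the second idea, the sketch below scores the latest yearly rank change against a name's own history using a simple z-score. This is an assumption about how such a check might begin, not a rigorous test: rank changes are unlikely to be normally distributed, and every value in `changes` is a placeholder rather than real SSA data.

```python
import pandas as pd

# Placeholder yearly rank changes for one name, indexed by year.
# Illustrative values only; the last entry stands in for an unusually large drop.
changes = pd.Series(
    [5, -3, 8, 2, -4, 6, 3, 97],
    index=range(2013, 2021),
    name="rank_change",
)

# Compare the most recent change with the distribution of earlier changes.
history = changes.iloc[:-1]
latest = changes.iloc[-1]

# z-score: how many sample standard deviations the latest change sits
# from the historical mean.
z_score = (latest - history.mean()) / history.std(ddof=1)
print(f"z-score of latest change: {z_score:.1f}")
```

A large z-score only says the latest move is unusual relative to that name's past; a permutation test or a comparison against many other names would be a more defensible next step.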

That's all for now, folks. Stay safe out there: you've got Karen and a virus to watch out for.