How many couples of the same name are there?

TL;DR: probably around 10 k couples in the US share the same name, and roughly 60 % of them are gay men.

Welcome to the first lab output of Twinkdestroyer Analytics. While we were touching grass in upstate New York, my buddy Anita said, “I wonder how many couples have the same names.” As the Bible said, when the woman asks, the man shall answer. We can absolutely estimate this.

First, I downloaded the 1996–2017 NYC Marriage Data from nycmarriageindex.com. The data lists the first, middle, and last names of each bride and groom, and it includes all same-sex marriages after June 2011.

Across the whole dataset, about 0.04 % of couples share the same name: 728 couples out of 1.6 million marriages. A lot of these names are ethnic (e.g. Chinese transliterations), so that’s not a good estimate at all.
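For the curious, the same-name check can be sketched in a few lines of Python. This is a minimal sketch, not the exact script I ran: it assumes "same name" means identical first names after trimming whitespace and ignoring case, and the sample rows are made up for illustration.

```python
def same_first_name(bride_first: str, groom_first: str) -> bool:
    """True when both first names match after trimming and case-folding.

    Assumption: "same name" means identical first names. Empty names never match.
    """
    a = bride_first.strip().casefold()
    b = groom_first.strip().casefold()
    return bool(a) and a == b


# Hypothetical rows standing in for the marriage index:
# (bride first name, groom first name)
sample = [("Sharon", "sharon "), ("Lee", "Lee"), ("Anita", "Bob"), ("", "")]

matches = sum(same_first_name(b, g) for b, g in sample)
share = matches / len(sample)
print(f"{matches} of {len(sample)} sample couples share a name ({share:.0%})")
```

Exact matching like this will miss spelling variants of the same name, which is one more reason to treat the raw count as a rough figure.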

I felt it was reasonable to assume that gay couples are more likely to have the same names, so I only looked at the data starting in 2011 (which includes gay marriages).

I wanted to do some mid-level NLP nonsense to make ethnically informed guesses about the gender of bride and groom, but I soon realized that was overkill (and that I have no compute). Instead, I installed a good-enough package called gender_guesser and called it a day. I looked at the first and middle name of bride and groom separately, and wrote a small program to label marriages as gay, straight, lesbian, or unknown.
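The labelling logic looks roughly like the sketch below. To keep it runnable without installing anything, the name guesser is passed in as a function and a tiny made-up lookup table stands in for gender_guesser. As far as I know, gender_guesser's `Detector().get_gender` returns strings like 'male', 'mostly_male', 'female', 'mostly_female', 'andy', and 'unknown', so it should slot in as the `guess` argument. Treating the "mostly" answers as definite, and falling back to the middle name when the first name is inconclusive, are assumptions of this sketch.

```python
from typing import Callable, Optional, Tuple

MALE = {"male", "mostly_male"}        # assumption: "mostly" counts as definite
FEMALE = {"female", "mostly_female"}


def coarse_gender(first: str, middle: str, guess: Callable[[str], str]) -> Optional[str]:
    """Return 'm', 'f', or None, trying the first name and then the middle name."""
    for name in (first, middle):
        if not name:
            continue
        answer = guess(name.strip().title())
        if answer in MALE:
            return "m"
        if answer in FEMALE:
            return "f"
    return None


def label_marriage(bride: Tuple[str, str], groom: Tuple[str, str],
                   guess: Callable[[str], str]) -> str:
    """Label a marriage 'gay', 'lesbian', 'straight', or 'unknown'.

    bride and groom are (first name, middle name) tuples.
    """
    a = coarse_gender(bride[0], bride[1], guess)
    b = coarse_gender(groom[0], groom[1], guess)
    if a is None or b is None:
        return "unknown"
    if a != b:
        return "straight"
    return "gay" if a == "m" else "lesbian"


# Hypothetical stand-in for a real name-to-gender guesser.
TOY_NAMES = {"Michael": "male", "David": "male", "Sharon": "female", "Maria": "female"}


def toy_guess(name: str) -> str:
    return TOY_NAMES.get(name, "unknown")


print(label_marriage(("Michael", ""), ("Michael", "David"), toy_guess))  # gay
print(label_marriage(("Sharon", ""), ("David", ""), toy_guess))          # straight
print(label_marriage(("Xiu", "Maria"), ("Sharon", ""), toy_guess))       # lesbian
print(label_marriage(("Xiu", ""), ("David", ""), toy_guess))             # unknown
```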

Although I couldn’t classify 30 % of my data (again, ethnic names make it hard), I found that 51 565 out of 590 097 marriages between 2011 and 2017 are gay/lesbian. That’s 8.74 %, which is so much higher than the nationwide stat that 1.2 % of all marriages are same-sex. But just when I thought my good-enough regex classification scheme wasn’t good enough, I found this article showing 9 % of all marriages in NYC are gay. Lol.

After removing couples I couldn’t classify as gay or straight, I found that 0 straight couples have the same names. This is obviously not true (there’s probably a Sharon-&-Sharon or Lee-&-Lee marriage out there), but for our purposes it’s good enough to say 0 % of straight couples have the same names.

% of gay/lesbian marriages: 51565 / 590097 = 8.74%
% of gay marriages w/ same names: 332 / 31203 = 1.06%
% of lesbian marriages w/ same names: 130 / 20362 = 0.64%

According to the Williams Institute there are 1.3 million same-sex couples in the US, of which 46.13 % are gay and 53.87 % are lesbian. Now we can calculate:

(332 / 31203) * 0.4613 * 1 300 000 + (130 / 20362) * 0.5387 * 1 300 000 ≈ 10 852
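Here is the same arithmetic as a small Python sketch, so the numbers can be rerun. It uses only the counts and Williams Institute shares quoted above. The variable names are mine, and the rates are kept unrounded (332 / 31203 and 130 / 20362) rather than rounded to 1.06 % and 0.64 % first, because rounding early shifts the total by a dozen or so couples.

```python
# Counts from the 2011-2017 NYC data quoted above
GAY_SAME_NAME, GAY_TOTAL = 332, 31_203
LESBIAN_SAME_NAME, LESBIAN_TOTAL = 130, 20_362

# Williams Institute figures quoted above
US_SAME_SEX_COUPLES = 1_300_000
GAY_SHARE, LESBIAN_SHARE = 0.4613, 0.5387

# Same-name rates, left unrounded
gay_rate = GAY_SAME_NAME / GAY_TOTAL
lesbian_rate = LESBIAN_SAME_NAME / LESBIAN_TOTAL

# Scale the NYC rates up to the national number of same-sex couples
gay_estimate = gay_rate * GAY_SHARE * US_SAME_SEX_COUPLES
lesbian_estimate = lesbian_rate * LESBIAN_SHARE * US_SAME_SEX_COUPLES
total_estimate = gay_estimate + lesbian_estimate

print(round(total_estimate))  # 10852
print(f"share that are gay men: {gay_estimate / total_estimate:.1%}")
```

The big assumption baked in is that NYC same-name rates carry over to the whole country, which is exactly as shaky as it sounds.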

So there are probably ~10 852 couples in the US with the same name, and about 59 % of them are gay men.

To close off, here are the top 20 names for same-name couples in my dataset:

[Image: samenamecouples (top 20 names for same-name couples)]

Thank you for reading my in-house data analysis that could have been done by an 11-year-old from Shandong. Stay tuned for more Twinkdestroyer Analytics.