I am currently looking at first names in order to improve the 'natural language geocoder'. Curiously, it takes 4200 girls' first names to cover 90% of the population, but only 1200 boys' names to reach the same 90% coverage. The reason for this huge difference lies mainly in the top positions: the ten most popular male names cover 23%, whereas the ten most popular female names cover a comparatively meager 10%.
The question is: why are parents looking for variety in a name for a girl and far less for a boy? Any ideas?
Source: US Census
PS: just in case you want to know the most popular names:
name        freq(%)  cum.freq(%)  rank
MARY        2.629     2.629        1
PATRICIA    1.073     3.702        2
LINDA       1.035     4.736        3
BARBARA     0.980     5.716        4
ELIZABETH   0.937     6.653        5
JENNIFER    0.932     7.586        6
MARIA       0.828     8.414        7
SUSAN       0.794     9.209        8
MARGARET    0.768     9.976        9
DOROTHY     0.727    10.703       10

name        freq(%)  cum.freq(%)  rank
JAMES       3.318     3.318        1
JOHN        3.271     6.589        2
ROBERT      3.143     9.732        3
MICHAEL     2.629    12.361        4
WILLIAM     2.451    14.812        5
DAVID       2.363    17.176        6
RICHARD     1.703    18.878        7
CHARLES     1.523    20.401        8
JOSEPH      1.404    21.805        9
THOMAS      1.380    23.185       10
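For anyone who wants to reproduce the coverage figures, here is a minimal Python sketch of the calculation, assuming the frequencies are percentages sorted by rank as in the lists above. The function names (cumulative_coverage, names_needed) are made up for illustration and are not part of the geocoder. Only the top ten names are included, so the 90% thresholds cannot be checked with this data alone.

```python
from itertools import accumulate

# Top-ten frequencies (in percent) copied from the lists above, ordered by rank.
FEMALE = [2.629, 1.073, 1.035, 0.980, 0.937, 0.932, 0.828, 0.794, 0.768, 0.727]
MALE = [3.318, 3.271, 3.143, 2.629, 2.451, 2.363, 1.703, 1.523, 1.404, 1.380]


def cumulative_coverage(freqs):
    """Return the running total of the frequencies (the cum.freq column)."""
    return list(accumulate(freqs))


def names_needed(freqs, target):
    """Return how many top-ranked names are needed to reach `target` percent.

    Returns None when the supplied names never reach the target.
    """
    for count, covered in enumerate(cumulative_coverage(freqs), start=1):
        if covered >= target:
            return count
    return None


if __name__ == "__main__":
    for label, freqs in (("female", FEMALE), ("male", MALE)):
        top_ten = cumulative_coverage(freqs)[-1]
        print(f"{label}: top ten cover {top_ten:.3f}%, "
              f"names needed for 10%: {names_needed(freqs, 10.0)}")
```

With the full census list loaded in place of the top-ten samples, names_needed(freqs, 90.0) would give the number of names required for 90% coverage.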
I was going to argue that it's not so much that parents seek variety in naming girls but rather that there is a pervasive cultural source for boys' names in the Bible. Both the New and Old Testaments are mostly about males, and a lot of parents will either be religious or at the very least exposed to it through cultural osmosis. However, the fact that only 4 of the top 10 boys' names are biblical undermines this.
This is a good point, Aaron. I agree with you that it may have a religious or cultural reason. It would be interesting to compare these numbers with other countries and cultures.
I found some numbers for France, which show a similar pattern, with more names for girls:
http://www.insee.fr/fr/insee_regions/Limousin/publi/e_prenoms-2005.html
The exact numbers, however, are too expensive: http://www.webcommerce.insee.fr/Catalogue/descriptif_produit.asp?Code%5FProduit=PRENVIV03&TABLE=&ID=&Choix=#