(Sweet Jesus this post needed some editing!)
One of the missions of my field work in Dominica is to develop a comprehensive genealogy for this rural community so I can measure the relationship between kinship and sharing between migrants and non-migrants.
It's common knowledge that there are a few very common surnames in this village, but the conventional wisdom doesn't cover the rest of the much less common surnames. I tabulated the frequency of different surnames and ordered them by rank to learn more.
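For anyone who wants to try this on their own data, here is a minimal sketch of that kind of tabulation in Python. The surnames in the list are hypothetical toy data, not my field data, and the function name `rank_surnames` is just something I made up for illustration. It also assumes rank 1 is the most common surname and that tied frequencies share a rank, which is only one of several reasonable ways to handle ties.

```python
from collections import Counter

# Hypothetical toy data for illustration only (not the field data).
surnames = [
    "Joseph", "Joseph", "Joseph", "Joseph",
    "Charles", "Charles",
    "Laurent", "Laurent",
    "Peters",
]

def rank_surnames(names):
    """Return (surname, frequency, rank) rows; tied frequencies share a rank."""
    counts = Counter(names)
    # Distinct frequencies, largest first, so rank 1 is the most common surname.
    distinct = sorted(set(counts.values()), reverse=True)
    rank_of = {freq: i + 1 for i, freq in enumerate(distinct)}
    rows = [(name, freq, rank_of[freq]) for name, freq in counts.items()]
    # Sort by rank, then alphabetically so tied surnames come out in a stable order.
    rows.sort(key=lambda row: (row[2], row[0]))
    return rows

table = rank_surnames(surnames)
for name, freq, rank in table:
    print(f"{rank:>4} {freq:>5}  {name}")
```

With real data the only change would be loading the list of surnames from wherever the census or genealogy records live.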
Below is the discrete distribution of surname frequency by surname rank (higher rank means higher frequency; there were lots of ties, especially at lower ranks). Because there is good reason to believe that surname frequency by rank should follow a power law, I also fit a power law and calculated the R-squared value.
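A power law, by the way, simply means that the frequency of something that interests you (in this case, a surname) can be calculated from some other quantity (in this case, the surname's rank) using a formula like this:
frequency = a × rank^k, where "a" and "k" are constants estimated from the data.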
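Here is a rough sketch of how a fit like this can be done, assuming the power law has the form frequency = a * rank**k with the "a" and "k" parameters. Taking logs of both sides gives a straight line, log(frequency) = log(a) + k * log(rank), so ordinary least squares on the logged values yields estimates of both parameters plus an R-squared. This is one common approach and not necessarily how every stats package does it. The R-squared here is computed in log-log space, and the numbers in `ranks` and `freqs` are hypothetical, chosen only so the example runs.

```python
import math

def fit_power_law(ranks, freqs):
    """Least-squares fit of log(freq) = log(a) + k * log(rank).

    Returns (a, k, r_squared), with R-squared computed in log-log space.
    """
    xs = [math.log(r) for r in ranks]
    ys = [math.log(f) for f in freqs]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    k = sxy / sxx                   # slope of the line = power law exponent
    log_a = mean_y - k * mean_x     # intercept of the line = log of the scale
    ss_res = sum((y - (log_a + k * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return math.exp(log_a), k, 1 - ss_res / ss_tot

# Hypothetical rank/frequency pairs (rank 1 = most common), not the field data.
ranks = [1, 2, 3, 4, 5, 6]
freqs = [120, 58, 41, 29, 25, 19]

a, k, r2 = fit_power_law(ranks, freqs)
print(f"a = {a:.1f}, k = {k:.2f}, R^2 = {r2:.3f}")
```

With rank 1 as the most common surname, k comes out negative, since frequency falls as rank increases. A log-log regression is quick and easy to read off a plot, but it is a fairly crude way to judge fit, so treat the R-squared as descriptive.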
Because we'd expect surname distributions to follow a power law, you might think this analysis isn't particularly surprising. And you'd be right. While the ubiquity of the power law in nature and the reasons for it are fascinating, it is a pretty well-studied phenomenon.
What I don't think is well understood is how universal the parameters ("a" and "k") of the power law distribution are in a given context (i.e., in the distribution of surname rank across different populations), and what mechanisms influence variance in those parameters. Similarly, the fit of the power law distribution to actual surname rank data seems to vary somewhat across populations. Why? The answers to both of these questions may have something to do with the cultural, social, and demographic characteristics of a population.
In this community, for example, high rates of out-migration (a net loss of about a fifth of the population between 2001 and 2011) likely affect the current distribution of surnames. Members of the same family tend to migrate together, the psychology underlying the motivations to migrate may run in the family, and some families tend to migrate to destinations where their family members already reside because it eases the transition.
How does this population process influence the power law (and fit thereof) of surname rank relative to, for example, a growing population with rapid IN-migration?
I don't know and I don't have much time right now to investigate. But it's an interesting question.