<![CDATA[My n of 3 - Blog]]>
Wed, 19 Jun 2013 06:20:42 -0800
Weebly
<![CDATA[More on war, peace, human nature, bad statistics, and straw men.]]>
Wed, 19 Jun 2013 05:37:13 GMT
http://www.mynof3.com/1/post/2013/06/more-onwar-peace-human-nature-bad-statistics-and-straw-men.html
Someone named Roger Gathman responded to my recent post about the debate over the origins of human violence. I reproduce the insightful comment below:


Oddly, you don't include what to my mind is Pinker's most scandalous mistake - that in two cases, he evidently thinks that two names for one site means that there are two sites! One is the Brittany site which is counted twice, once as a Brittany site, once as a French site. The other is Boggebaken Denmark, also counted twice due to the fact that it is also named Vedmack, Denmark. That is a pretty high error rate. Ferguson, as he himself says, is concerned with the way Pinker's list skews the reader to think that these are proofs of a universally high violence prehistoric scene. I don't think he has to recalculate Pinker - the point is that the list doesn't take into account other sites (and thus gives us really zero information about the prevelance of prehistoric violence) and that even the list itself is full of errors. For instance, the SaraiNahar
Rai site in India.is falsely claimed to exhibit a 30 percent violent death rate. Ferguson pretty convincingly shows that this is a gross distortion of the evidence, which shows 1 killing. 
Your statistical point is a rather different point than Ferguson's, who is only trying to show how a small selection of evidence, a sample, has been manipulated - or, I would imagine, simply transferred from one secondary source to the other, as I can't imagine Pinker spent any time actually reading the literature about these sites, taking their descriptions on faith from his sources.
And now I will address each of Gathman's points.

Double-counted sites


Oddly, you don't include what to my mind is Pinker's most scandalous mistake - that in two cases, he evidently thinks that two names for one site means that there are two sites! One is the Brittany site which is counted twice, once as a Brittany site, once as a French site. The other is Boggebaken Denmark, also counted twice due to the fact that it is also named Vedmack, Denmark. That is a pretty high error rate.
Gathman uses the word "scandalous" to imply that Pinker purposefully misrepresented the evidence. It seems more likely that Keeley's and Bowles' descriptions conflict enough that Pinker honestly mistook them for two different sites. The cited dates are 2000 years apart! Anyway, let's go with Ferguson and discard one of these counts. For the Brittany/Ile Teviec site, let's "keep" the smaller figure, which is an 8% death-from-war rate (Bowles). As for Bogebakken, let's again go with the smaller figure, which is 12%. Let's not "discard" any other sites yet for the reasons I describe in the previous post.

Recall that the average death-from-war rate from Pinker's original list was 15%. The contribution of the double counts to this average is only (13.6% + 12%)/21 = 1.22%. So what's mainly happening from discarding these double counts is a slight widening of the confidence intervals, and a slight lowering of the average. So what's the big deal?
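That bit of arithmetic is easy to check. Here is a minimal Python sketch, assuming (as above) a list of 21 cases and the two quoted figures of 13.6% and 12%. The function name is mine, not anything from Pinker or Ferguson.

```python
def contribution_to_mean(values, n_total):
    """Part of an n_total-item average that comes from the given values."""
    return sum(values) / n_total


double_counts = [13.6, 12.0]  # percentages quoted in the post
n_cases = 21                  # number of cases in Pinker's list

share = contribution_to_mean(double_counts, n_cases)
print(f"{share:.2f} percentage points of the 15% average")
```

The point of writing it out is only that the double counts account for a bit over one percentage point of the average, which is small next to the 15% figure itself.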

The big deal is that Pinker's list might be skewed toward sites with high death-by-war rates. Okay. So give us a better sample and use proper statistical methods to estimate the posterior distribution of the casualty rates. The rest is hand waving.

"Zero information"?


...the point is that the list doesn't take into account other sites (and thus gives us really zero information about the prevelance [sic] of prehistoric violence)...
No, Pinker's list doesn't give us "zero information" about the prevalence of prehistoric violence. In fact, it gives us more and better information than Ferguson's chapter, which includes not one graph or table to summarize the evidence that Pinker's list is biased. Ferguson's argument is almost entirely verbal and citation-based. Time to chase down those edited volumes and check if they were cited properly!

Listen, I am open to the possibility that hunter-gatherers were not that violent, perhaps even "peaceful". My concern is with the science in Ferguson's paper, which is not very good. I'm trying to counteract any hype that could arise from this paper that over-reaches the weight of its evidence. The message of the paper should be "we might need a better sample", not "we can effectively ignore Pinker's list." This subject deserves better science, and that's why I'm asking archaeologists to assemble a more comprehensive assemblage of prehistoric remains, and do a proper statistical analysis of the casualty rates. The "assemblage" could be a damned literature review where the articles are properly vetted! Yes, we can do better than Pinker's list, but Pinker's list does better than Ferguson's narrative. Ferguson's narrative rightly raises questions. It provides little competing quantitative evidence.

What about Sarai Nahar Rai?


For instance, the SaraiNahar Rai site in India.is [sic] falsely claimed to exhibit a 30 percent violent death rate. Ferguson pretty convincingly shows that this is a gross distortion of the evidence, which shows 1 killing.
Okay. There were eleven well-preserved skeletons. Pinker's figure is based on the assumption that three of them were homicides. Suppose just one of them is a homicide. So that doesn't matter? We should just discard all samples that have only one homicide? Ferguson discards this sample for this reason, and I've already argued that this reason is silly, not to mention guaranteed to underestimate the homicide rate. As for the new homicide rate for this site accounting for the supposed mistake in homicide classification, we go from almost 30% to about 10%. The contribution to the average rate is what, a little more than 1%? Again, so what? The qualitative result remains the same, but there is marginally less certainty in the comparison.
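To put rough numbers on that, here is a quick Python sketch. It assumes the eleven skeletons and the 21-case list mentioned above; the helper name `site_summary` is just for illustration.

```python
def site_summary(homicides, skeletons, n_cases):
    """Return (site rate in %, that site's contribution to the n_cases average in %)."""
    rate_pct = 100.0 * homicides / skeletons
    return rate_pct, rate_pct / n_cases


SKELETONS = 11  # well-preserved skeletons at the site
N_CASES = 21    # cases in Pinker's list

for homicides in (3, 1):
    rate, contribution = site_summary(homicides, SKELETONS, N_CASES)
    print(f"{homicides} homicide(s): site rate {rate:.1f}%, "
          f"contribution to the average {contribution:.2f} points")
```

Whichever count you accept, this one site moves the 21-case average by roughly a percentage point or less.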

The fault isn't all Ferguson's. I finally got a hold of Better Angels. Indeed, Pinker calls these "war deaths". How the $#*& do we think we can reliably classify a homicide that occurred thousands of years ago and that is part of a small sample as a war death? Both sides are being silly on this matter. But recall that Pinker's argument isn't just about warfare. It's about all violence.


Your statistical point is a rather different point than Ferguson's, who is only trying to show how a small selection of evidence, a sample, has been manipulated - or, I would imagine, simply transferred from one secondary source to the other, as I can't imagine Pinker spent any time actually reading the literature about these sites, taking their descriptions on faith from his sources.
My statistical point is the point that matters. We are trying to estimate the homicide rate throughout human existence. We need to do that using a proper statistical model. And if you want to argue that someone's sample is possibly biased, great. But manipulated? That's a sharp accusation. In academia, accusations like that are investigated thoroughly. Know how they are investigated? By using proper statistical methods! No, I think Ferguson is just saying that Pinker is so convinced he's right that he and others have subconsciously cherry-picked the data. This is way different from cooking data to predetermine the result. Even so, if you want to argue that a sample is biased, the best way is to collect a better sample and compare the results.
]]>
<![CDATA[Of war, peace, human nature, bad statistics, and straw men.]]>
Wed, 12 Jun 2013 21:05:04 GMT
http://www.mynof3.com/1/post/2013/06/of-war-peace-human-nature-bad-statistics-and-straw-men.html
Yesterday on my bus commute home, I had a great conversation with anthropologist Jason Anstrosio of Living Anthropologically and Andrew Badenoch of Evolvify. (I apologize to Badenoch for not biking home, or to the southern coast of the Arctic Ocean, for that matter). We talked about the long-running debate about the origins of human violence, popularized most recently by Steven Pinker's book The Better Angels of Our Nature, Jared Diamond's The World until Yesterday, Napoleon Chagnon's Noble Savages, and a conversation among a bunch of old men hosted by John Brockman's Edge.

To summarize the debate, there are effectively two sides. One side is convinced that human violence has decreased relative to population size throughout our existence, with a downward trend running from the days when we were all hunter-gatherers until now, when almost no one is a hunter-gatherer. In other words, this side thinks Hobbes was right about the nastiness, brutality, and brevity of human life before we domesticated ourselves. Another side believes that either we don't know how much violence there was among ancient hunter-gatherers, or that there was less violence before plant and animal domestication, political centralization, and whatnot. Members of both sides are overconfident in their ability to discern warfare from other causes of homicide in the archaeological record.

A new book that Anstrosio has reviewed came up during our conversation. It's called War, Peace, & Human Nature. It's not about war, peace, and human nature so much as it is about how much war and peace factor into human nature, and if indeed there is a singular human nature. Unfortunately, the book costs $85, and I think that is a ludicrous price. Also unfortunate is that it's checked out of my university library until the 7th of July. Thankfully, Anstrosio put up PDFs of two of the book's chapters, both written by Brian Ferguson, a critic of the Hobbesian point of view. I'll focus on one of these chapters, called "Pinker's list: exaggerating prehistoric war mortality," the title of which is pretty self-explanatory.

Okay, maybe not. First, you need to know what Pinker's list is. In Better Angels, Steven Pinker put together two of the largest archaeological datasets on prehistoric homicide. The data spans multiple temporal, geographic, and cultural settings. One of the two datasets was assembled in Lawrence Keeley's War Before Civilization. Sam Bowles assembled the other for an article that Science magazine published. From this data, Pinker calculated the average death-from-war rate as 15%. (Ferguson didn't report the confidence intervals and I don't know if Pinker did, either, because all three copies of Better Angels are checked out of my university's library). Ferguson's goal in this chapter is to show that, of the 21 cases in Pinker's list, six can be thrown out, and the rest are biased samples.

The greatest use of this book chapter is Ferguson's summary of the data sources for each of the 21 cases. And he makes a good verbal case that the data might be biased. But the chapter offers neither graph, table, nor parameter estimate to show how biased it might be, nor how certain we are in that bias. Ferguson also uses some weird logic to argue that some of the 21 cases should be thrown out. Here, let's just quote from the chapter, and comment on those quotes:


So let us look over Pinker's list. Of the original 21, Gobero, Niger is out because it has no war deaths.
First, okay, throw out data that will potentially support your argument that hunter-gatherer "war" deaths were lower than what Pinker calculated. Second, no, don't do that! When doing meta-analysis, you are not allowed to throw out data just because the sample size is small. Instead, you use a hierarchical statistical model so that the data coming from small samples borrows information from the variation across samples. In this case, I'm pretty sure that the borrowed information would result in a non-zero point estimate for the homicide rate.
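Here is a toy Python sketch of what "borrowing information" does to a site with zero observed deaths. It is a simplification, not a full hierarchical model: the Beta prior is fixed by hand, whereas a real analysis would estimate it from the variation across all sites. The site counts and the prior strength are made up for illustration; only the 15% figure comes from the discussion above.

```python
def shrunk_rate(deaths, n, prior_mean, prior_strength):
    """Posterior mean of a site's death rate under a Beta prior.

    prior_mean and prior_strength stand in for what a hierarchical model
    would learn from the other sites (assumed here, not estimated).
    """
    alpha = prior_mean * prior_strength
    beta = (1.0 - prior_mean) * prior_strength
    return (deaths + alpha) / (n + alpha + beta)


# Hypothetical site: 0 violent deaths among 35 skeletons (made-up count).
# Assumed prior: a cross-site mean of 15%, worth about 10 pseudo-observations.
raw_rate = 0 / 35
estimate = shrunk_rate(deaths=0, n=35, prior_mean=0.15, prior_strength=10)
print(f"raw rate {raw_rate:.3f}, partially pooled estimate {estimate:.3f}")
```

The pooled estimate sits between the site's raw rate of zero and the cross-site mean, and it moves toward the raw rate as the site's sample grows. That is the sense in which a zero-death site still carries information instead of being thrown out.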


Three cases...are all eliminated because they only have one instance of violent death.
Here, Ferguson neglects to mention that Pinker's book's premise is that all forms of violence, including but not limited to war, have decreased over time. You can't fault Ferguson for focusing on war so much that he throws out data that shows evidence of homicide. After all, the book has "war" in its title, and the editor says from the first chapter that the central question is about how and when war became a part of human existence. Except that is a far narrower question than the one Pinker's book addresses. A more basic question is, "What is the time series of any form of violence throughout human existence, and how much uncertainty do we have in the overall shape and scale of that trend?" Throwing out cases that are fairly good evidence of homicide is guaranteed to bias your sample with regard to this much better research question.

The rest of Ferguson's conclusions make claims that the sample is biased. But again, why not plot or tabulate the rest of the data for comparison, or do some statistics?

Or better yet, why doesn't a big research group get together all prehistoric skeletal samples, tabulate the violent deaths, and design a proper model for statistical inference? Then plot the estimated homicide rate and its uncertainty across all human existence to look for a trend. "Oh, that's not feasible." Shut up! It was done for population growth rate decades ago. And this is perhaps one of the most important questions about human behavior in its broadest sense. Surely the Harry Frank Guggenheim Foundation might be able to provide some seed funding?

My guess is that, if we can ever herd enough cats to do this definitive analysis, the result will be nice, close estimates of homicide rate for recent times that funnel back into a lot of ancient uncertainty. And if the uncertainty is so great that we can't even discern a shape to the trend before a certain date? At least we can be sure of one thing: we need more data.
]]>
<![CDATA[One time I asked Julian Barbour, "If time doesn't exist, then what is evolution?"]]>
Thu, 23 May 2013 23:30:18 GMT
http://www.mynof3.com/1/post/2013/05/one-time-i-asked-julian-barbour-if-time-doesnt-exist-then-what-is-evolution.html
Last year, when I was living in the Commonwealth of Dominica, and experiencing some severe culture shock, I emailed independent scientist and theoretical physicist Julian Barbour.
       


Dear Dr. Barbour:

You envision a universe in which time is an illusion. I am not sure at what level you are familiar with modern evolutionary theory. But you are likely aware that the most basic definition of evolution is a change in allele frequency over -time- via selection, drift, gene flow, and mutation. So I have three questions for you:

(1) What would an evolutionary theory look like without time?
(2) Or do you believe that time is a useful tool that helps us understand certain things such as natural selection and genetic drift, even though time doesn't exist?
(3) Do you think it possible that the perception of time may be under selection? 
(4) If so, what adaptive benefit do you think a linear perception of time has?
(5) Can you envision an organism that need not perceive time as linear?

Thank you for your time.
Get it? "Thank you for your time"? I just thanked a man who argues that time doesn't exist for his time. How cool is that?

Anyway, Julian Barbour never answered me, probably because he is very busy doing Russian translations, answering emails from more important people, and avoiding people like me who might sound like quacks and say they have three questions when they actually have five. Okay, I might actually be a quack.

But dude! Those questions are deep! I mean, seriously, if time doesn't actually exist, what is the underlying biological explanation for why we experience one damn thing happening after another instead of experiencing time backwards like Merlin? That is, why do we sense the increase of entropy instead of its decrease?

Let me take a few steps backward. Okay. So there's this thing that physicists call the "arrow of time". The arrow of time refers to the fact that there is only one quantity in the physical sciences...seriously, only one...that requires time to have a direction. That quantity is entropy. 

Entropy is a measure of "disorder". Think of a bag of potato chips. That got your attention. Imagine walking around with that bag of potato chips in your backpack, and occasionally reaching in since you can't eat just one. At the beginning of the day, the potato chips are very orderly in space, with this well-defined, potato chip shape, separated in the bag by equally orderly pockets of air. By the end of the day, however, the potato chips have been ground into a chaotic crapstorm of salty, greasy crumbs, evenly distributed at the bottom of the bag below a not-very-interesting pocket of air. 

That's kind of like how the universe works. In the beginning, everything was orderly and hot as a mother fucker. Billions of years down the road, all matter will be evenly distributed about the universe, uninteresting and cold until...well, I don't know, I'm not a physicist.

This whole analogy requires an arrow of time from order to chaos, from hot to cold. Or does it? I haven't studied up on his hypothesis much, and my brother is the astrophysicist, but I think that's the question Julian Barbour's asking. And since he asked it, I think that in the back of their minds, all evolutionary biologists should also ask it. After all! Biology is just applied chemistry. And chemistry? Just applied physics.

Although really, if time is just an illusion and we've been making such a big stink about it only because we're organisms who happen to experience entropy in this funky, linear way, then maybe, at least in this case, physics is just applied evolutionary biology!
]]>
<![CDATA[How closely must we measure climate to understand its effects on human behavior?]]>
Wed, 22 May 2013 16:10:46 GMT
http://www.mynof3.com/1/post/2013/05/how-closely-must-we-measure-climate-to-understand-its-effects-on-human-behavior.html
Guess what. The climate's changing quickly and we're to blame for it. Well, one good turn deserves another. We're influencing the environment's behavior. It will influence ours, as it always has. But how? Some researchers are battling over the effect of climate on the frequency of warfare and civil unrest. The research group I'm working with - led by Sara Curran and Matt Dunbar at the Center for Studies in Demography and Ecology - just wants to know how the climate will affect human migration patterns, especially between rural and urban places. And I'm betting that climate effects on migration will be, at least in the beginning, and at least if we don't let things get too out of hand, more important than the effects on warfare. And if you've read Peter Turchin's work on the factors leading to civil unrest and civil war, immigration is one of them, anyway.

Farmers' livelihoods depend on the weather. For example, the farmers in Nang Rong, Thailand, where we've focused our research, depend on the annual monsoon seasons to water their crops, especially rice. If those monsoons become more intense, or more irregular, or if the climate starts alternating between extremely dry and extremely wet conditions, crops might not do as well, and some farmers will look for other ways to get money. In fact, we've shown that's what's happened for decades since Thailand started its rapid urbanization and development. At least...we think that's what we've shown.

We're interested in "slow onset" climate change, as opposed to extreme events, like hurricanes and stuff. Slow onset change is like sea level rise, and changing rainfall patterns. These are trends that last for decades, but that you won't see on CNN's dramatic news breaks. A good way to predict future behavior in response to slow onset change is to look at migration responses to past climatic conditions. Do more or fewer people migrate during years with more or less rainfall? Okay. Now that we know that, let's look at what's likely to happen to rainfall in the future and forecast migration patterns. It's like that.

Believe it or not, climate is a difficult thing to measure mainly because there are so many ways you can do it. You can look at global climatic processes, like the El Niño Southern Oscillation (ENSO). But this doesn't tell you much about local processes. For local processes, you can look directly at temperature. But temperature is erratic at small time scales. You could look at rainfall, but in places like Nang Rong, you'll only have one rain station, yet widely varying village microclimates.

Another way to measure the local environment is to look at satellite imagery, and that's the approach we've taken. One thing the satellites give you is the amount of light at different wavelengths that gets reflected back into space. Two important wavelengths are the near infrared and red spectral bands. When plants are healthy and growing and dense, more infrared light gets reflected compared to red. When plants are not healthy and sparse, less infrared light gets reflected compared to red. A useful measure of near infrared compared to red reflectance is called the Normalized Difference Vegetation Index (NDVI).
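For the curious, NDVI is conventionally computed as (NIR - Red) / (NIR + Red), which keeps it between -1 and 1. A minimal Python sketch follows; the reflectance values are illustrative only, not measurements from Nang Rong.

```python
def ndvi(nir, red):
    """Normalized difference of near-infrared and red reflectance."""
    if nir + red == 0:
        raise ValueError("NIR + red reflectance must be non-zero")
    return (nir - red) / (nir + red)


# Illustrative reflectances (made up): dense healthy vegetation reflects
# much more near infrared than red; sparse cover reflects similar amounts.
dense_canopy = ndvi(nir=0.50, red=0.08)
sparse_cover = ndvi(nir=0.25, red=0.20)
print(f"dense: {dense_canopy:.2f}, sparse: {sparse_cover:.2f}")
```

Higher values mean greener, denser vegetation, which is why the index is a handy stand-in for how the crops are doing.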

The trouble with satellite data is that there is a lot of it. We have 24 years of NDVI, with two images every month. What's more, we've got that much data for each of 49 parcels of Nang Rong land, each 8 x 8 km. That's 28,224 data points. How do we compress all that data into a simple, intuitive measure that predicts individual migration? Should we compare the annual average NDVI for a given year with a long-term average? What about a five-year average NDVI comparison? Or does our measure need to take into account the pattern of NDVI within a year, such as how quickly plants "green up", and how long they stay healthy? And how do we reliably measure "green up" anyway? For that matter, shouldn't we use satellite imagery to check what is actually on the landscape to make sure we're tying NDVI to the growth cycles of plants that actually matter to farmers?
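Here is a small Python sketch of the data volume and of the simplest candidate measure mentioned above, the annual average NDVI compared with a long-term average. The NDVI series is random synthetic data for a single parcel, and the function name is hypothetical; this is not our group's actual pipeline.

```python
import random
from statistics import mean

YEARS, IMAGES_PER_YEAR, PARCELS = 24, 2 * 12, 49
total_points = YEARS * IMAGES_PER_YEAR * PARCELS


def annual_anomalies(series, images_per_year):
    """Annual mean minus the long-term mean, one value per year."""
    long_term = mean(series)
    n_years = len(series) // images_per_year
    return [
        mean(series[y * images_per_year:(y + 1) * images_per_year]) - long_term
        for y in range(n_years)
    ]


# Synthetic NDVI series for one parcel (random numbers, not real data).
rng = random.Random(0)
parcel_series = [rng.uniform(0.2, 0.8) for _ in range(YEARS * IMAGES_PER_YEAR)]
anomalies = annual_anomalies(parcel_series, IMAGES_PER_YEAR)

print(total_points, len(anomalies))  # prints: 28224 24
```

This collapses 576 images per parcel into 24 numbers. The more complex measures, like green-up timing, would need the within-year shape that this average throws away.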

So in addition to the global vs. local dimension of environmental measures, there's this simple vs. complex dimension. Global measures, such as ENSO, are simpler than local measures based on NDVI because there's no spatial dimension; ENSO is a global process! Yet within ENSO-based or NDVI-based measures, you can make simpler or more complex decisions about how you measure the process. Simpler variables are cheaper to produce because they take less time, and we all know that time = $. Complex variables are more expensive, but might tell us more about how the climate affects decisions in a local context. If you're trying to predict migration in several regions at once, local variables are more expensive because there is more data to crunch, no matter how simple the measure is. By comparison, a single ENSO dataset serves every region.

If we'd like to quickly and cheaply measure the effect of climate on migration, we should prefer simpler, global measures. So the questions that our research group is turning to are: 

  • How complex do our environmental measures need to be to predict migration behavior well? 
  • And how much do we gain in predictive power for more costly measurement methods?


Stay tuned for answers.]]>
<![CDATA[Rise, Microryza! New science crowdfunding engine looks cool.]]>
Wed, 22 May 2013 05:02:20 GMT
http://www.mynof3.com/1/post/2013/05/rise-microryza-new-science-crowdfunding-engine-looks-cool.html
Today, while looking up independent scientist Ethan Perlstein (recently profiled in Science magazine), I came across something called Microryza, a crowdfunding engine that lets you "follow & fund" science. As everyone else on the Internet is saying, it's kind of like the Kickstarter of science funding. What I like about Microryza is that the focus is the science. Unlike Kickstarter or Rockethub or whatever, you aren't required to provide tangible rewards to your backers. The science is the reward, and the scientist gets to focus on producing and communicating it through Microryza's beautiful online interface.

And here's my favorite part of their FAQ:



Do I have to be a student or professor at a university?
No, we love to host projects from people outside of research institutions. 
So here's my postdoctoral scientrepreneurship plan so far. Here are the things I could potentially focus on and seek funding for.

  1. Sound Cheks, my political fact checking research institute project.
  2. My research on hawkish cooperation.
  3. My collaborative research with Sara Curran, Matt Dunbar, and Jacqueline Meijer-Irons on the effect of climate on migration.
  4. Developing a personal finance education program in the Commonwealth of Dominica.
  5. Continuing to publish my work on inferring social dominance structure in collaboration with Zack Almquist (soon to be at U of Minnesota).


Obviously, I can't do all of this at the same time. The way I see it, item 5 is a given. After my dissertation, we'll have two more papers we could still publish, and the ball will have been put in Zack's court. Item 4 is something I could incubate over a few years and then develop over a summer and (hopefully) make self-sustaining through local Dominicans once I set it up. As for items 1, 2, and 3, I'm going to apply for funding for all of them simultaneously, see what I get, and allocate my time accordingly. Here's the funding plan.

  1. Upstart campaign (Sound Cheks ostensibly, but really it will help me do all of this if I have patron investors to help me free up time from working for somebody else).
  2. Microryza campaign (Sound Cheks measurement models and soundness checking personnel).
  3. Microryza campaign (hawkish cooperation).
  4. Microryza campaign (migration and climate research).
  5. National Science Foundation Interdisciplinary Behavioral and Social Science Research Postdoctoral Research Fellowship (migration and climate research).
  6. Harry Frank Guggenheim Foundation research grant (hawkish cooperation).
  7. Rockethub (Sound Cheks UX, webdev, hardware, and alpha testing).


]]>
<![CDATA[When cooperators start recognizing cheaters, cheaters start deceiving cooperators.]]>
Wed, 15 May 2013 18:13:56 GMT
http://www.mynof3.com/1/post/2013/05/when-cooperators-start-recognizing-cheaters-cheaters-start-deceiving-cooperators.html
Proceedings B just published an interesting article by theoretical biologists McNally and Jackson called "Cooperation creates selection for tactical deception". The authors analyzed a simple mathematical model and reviewed comparative data on cooperation and deception across the order Primates to argue...well...exactly what their title says. Irrelevant side note: the Jackson author's first name is Andrew. That is, he shares a name with one of the most badass, cantankerous, and dare I say murderous of American Presidents.

Anyway, the mathematical analysis reveals that, if cooperation evolutionarily prevails over "honest" cheating (where cheaters don't try to hide their cheating), then a new strategy can invade that tactically deceives cooperators (by hiding or misrepresenting their behavior). But this can only happen if cooperators aren't good at recognizing rare cheaters. In that case, you'd expect a mixed population of cooperators and deceptive cheaters. The equilibrium ratio of cooperators to deceptive cheaters depends on how difficult it is to deceive relative to how good cooperators are at recognizing both cheaters and deception. The more difficult deception is and the easier recognizing it is, the greater the ratio of cooperators to deceptive cheaters.


What's really interesting about the mathematical result is that if cooperators are terrible at catching cheaters, then "honest" cheaters can invade the mixed population of cooperators and deceivers because they don't pay the cost of deception that deceivers do, but they still reap all the benefits of cheating. In that case, cooperation prevails. Hurray. But if cooperators are good at catching cheaters, it pays to deceive ... at least for rare deceptive cheaters. I'm pretty sure that these mathematical results make sense and you might guess at them without doing any calculus. That's not a mark against the models. Instead, it's helpful when a mathematical model with explicit, formal assumptions confirms our intuition, which derives from implicit assumptions and informal logic.


The authors argue that their mathematical model implies a positive correlation between the number of cooperative strategies and the number of deceptive strategies in a species. Actually, their model implies that, under some very specific circumstances, we'd expect a positive correlation between the frequency of cooperators and the frequency of defectors. That said, it's not too much of a logical leap.

To examine this prediction, the authors did a comparative analysis of species in the order Primates (to which we belong). The data compiled the presence or absence of different types of cooperative and deceptive behaviors. They used a method called independent contrasts to examine the relationship between cooperativeness and deception, controlling for the phylogenetic relationships among species and for the research effort into a particular species (because more research yields more observations of different types of behaviors). Here are scatterplots of the independent contrasts with best-fitting lines through the points.
The left plot includes only primates in the wild (because behavior in the wild is more relevant than behavior in captivity). The right plot includes both free-ranging and captive individuals. In both cases, the positive correlation between cooperativeness and deception rate is statistically significant, if a bit weak in the case of the full data set. 

What's fascinating about the empirical results is that there is no statistically significant relationship between neocortex size (a measure of cognitive capacity) and deception rate when controlling for cooperativeness. This goes against the grain of the Machiavellian intelligence hypothesis, which argues that there should be a positive correlation between deception rate and neocortex size. 

But is the non-significance of neocortex size simply due to a collinearity problem? A collinearity problem happens when you fit a regression in which two of the predictor variables are highly correlated. The effect of collinearity is that it inflates the standard errors, and therefore widens the confidence intervals, of your regression coefficients (which measure the relationship between the outcome variable and the predictors). Wider confidence intervals mean larger p-values and lower statistical significance. The number of cooperative behaviors in a species and its neocortex size might be correlated. Indeed, R.I.M. Dunbar's classic study found that group size is correlated with neocortex size in primates, and group size is a problematic but still useful proxy for social complexity.
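One common collinearity diagnostic is the variance inflation factor (VIF). With exactly two predictors it is 1 / (1 - r^2), where r is the correlation between them, and it tells you how much the variance of each coefficient estimate gets inflated. The Python sketch below uses simulated numbers, not the primate data from the paper; the variable names are stand-ins I chose to echo the discussion above.

```python
import random
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def vif_two_predictors(x1, x2):
    """Variance inflation factor for a regression with exactly two predictors."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r ** 2)


rng = random.Random(0)
n = 500
# Simulated stand-ins: 'cooperation' is built to track 'neocortex' closely,
# while 'unrelated' is independent noise.
neocortex = [rng.gauss(0, 1) for _ in range(n)]
cooperation = [0.9 * v + rng.gauss(0, 0.3) for v in neocortex]
unrelated = [rng.gauss(0, 1) for _ in range(n)]

vif_correlated = vif_two_predictors(neocortex, cooperation)
vif_unrelated = vif_two_predictors(neocortex, unrelated)
print(f"VIF, correlated pair: {vif_correlated:.1f}")
print(f"VIF, unrelated pair:  {vif_unrelated:.2f}")
```

A VIF near 1 means collinearity isn't hurting you. A large VIF means a non-significant coefficient may just reflect the two predictors being hard to tell apart, which is exactly the worry here.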

And this is why journals need to allow more room for the methods section: because we should never penalize scientists for doing collinearity diagnostics.
]]>
<![CDATA[Are people less willing to say which friend would win a row than who is stronger or more irascible?]]>
Tue, 14 May 2013 22:19:22 GMT
http://www.mynof3.com/1/post/2013/05/are-people-less-willing-to-say-which-friend-would-win-a-row-than-who-is-stronger-or-more-irascible.html
Last week, I gave people advice on how to collect data from multiple informants about social dominance relations within households. I'm still working with that data, so let's keep talking about it. Today, I present a very preliminary finding that's entirely tangential to my dissertation, but potentially interesting. Basically, it looks like people are more willing to say which of their friends is physically stronger or more irascible than who is more likely to win a serious disagreement...at least in one rural village in the Commonwealth of Dominica.

Let's review the data collected. I went to 92 households. I tried to ask every single household member 13 or older some questions about the relationship between every possible pair of fellow household members who are also 13 or older. The three questions that interest us today are (translated into Dominican English or Dominican French Creole):

1. "If these two people got into a serious disagreement, which of them would more likely get what they want?" (more likely to win a row)

2. "Which of these two people can lift a heavier load?" (physically stronger)

3. "Which of these two people has a more fiery temper?" (more irascible)

For none of these questions did I force respondents to make a choice. That is, a respondent could say that the two members of the household pair are equally likely to win a serious disagreement, equally strong, or equally irascible.

One of the challenges I faced with these sorts of questions is that it may be taboo to acknowledge that one person is more [insert adjective] than someone else. Indeed, I allowed people to avoid making a decision on who is more likely to win a serious disagreement because my research assistants told me I might be considered rude if I didn't. By comparison, it isn't rude to say that one person is physically stronger than another. Moreover, I've heard people describe others jokingly as one who "ke fache pli vit" (literally, "will get mad faster"). That said, you might think it would be considered more rude to describe someone else as temperamental than to describe them as strong compared to someone else.

A crummy, half-assed way to explore these questions is just to calculate for each of the three questions the proportion of reports on household pairs that are considered roughly equivalent, or "half and half" as Dominicans would put it.

So here are the results of this half-assed study, which completely ignores sampling error, bias, and missing data problems (so take it with a teaspoon of salt):

1. About 11% of reports considered the household pair to be equally likely to win a disagreement.

2. Compare that to about 3% of reports that considered a household pair to be equally strong.

3. Compare also to about 6% of reports that considered a household pair to be equally irascible.

These figures appear to agree with my experience-based assumptions about the sort of comparisons that rural Dominicans are more willing to make.

I also asked people how certain they felt about their choice for who would more likely win a disagreement. Many respondents might consider this a second chance to adhere to the taboo about comparing people's ability to win a disagreement (if such a taboo exists). Specifically, I asked respondents if they were "not at all", "a little bit", or "almost completely" certain in their choice. If we take "not at all" answers as equivalent to "half and half" answers, then the proportion of "half and half" reports bumps up to about 19%. I don't have a similar question for the "stronger" or "more irascible" questions, but I wish I did!
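
The tally described above takes only a few lines. Here is a minimal Python sketch that uses ten invented reports (not the Dominica data); the labels and the `share_equal` helper are hypothetical names chosen for illustration. It computes the share of "half and half" reports, then recomputes it with "not at all" certain answers counted as "half and half".

```python
EQUAL = "half and half"

# Each invented report: (answer to the "win a row" question, stated certainty).
# Certainty is None when the informant already answered "half and half".
reports = [
    ("A", "almost completely"),
    ("B", "a little bit"),
    (EQUAL, None),
    ("A", "not at all"),
    ("B", "almost completely"),
    ("A", "a little bit"),
    ("A", "almost completely"),
    ("B", "almost completely"),
    ("A", "a little bit"),
    ("B", "a little bit"),
]


def share_equal(reports, count_uncertain=False):
    """Proportion of reports that treat the pair as equivalent.

    With count_uncertain=True, answers given with "not at all" certainty
    are also counted as equivalent.
    """
    equal = 0
    for answer, certainty in reports:
        if answer == EQUAL or (count_uncertain and certainty == "not at all"):
            equal += 1
    return equal / len(reports)


print(share_equal(reports))
print(share_equal(reports, count_uncertain=True))
```

Like the figures above, this ignores sampling error, bias, and missing data; it only shows how the two proportions are counted.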

And don't worry. The statistical methods I'll employ in my dissertation are a lot more sophisticated than this. I've got a paper in the works with my collaborator Zack Almquist in which we will estimate not only the rate at which informants consider household pairs to be equal, but also the rate at which informants incorrectly label a pair to be equal, and the rate at which different informants disagree (that word again!) on a comparison between two household members.]]>
<![CDATA[Some advice to people collecting social network data on unordered pairs from multiple informants]]>Thu, 09 May 2013 23:17:35 GMThttp://www.mynof3.com/1/post/2013/05/some-advice-to-people-collecting-social-network-data-on-unordered-pairs-from-multiple-informants.htmlLast year, I lived in Dominica, where I collected data on the social dominance structure within rural households. For each pair of household members over age 12, I wanted to ask each household member over age 12 who was more likely to win a serious disagreement, and how certain the informant was in their response. I also asked informants who in a household pair they thought was physically stronger, and who they thought had a more fiery temper. This is some pretty complicated data. I've learned the hard way how to collect it to minimize data entry errors. So if you're a field anthropologist or something similar and you're collecting data on unordered pairs of individuals from multiple informants...listen close.

1. This stuff is complicated. Respect that!

Suppose you visit a number of households, indexed by h. Each household h has n(h) members, plus up to five other "home people", the colloquial term for villagers who share meals and chores with this household on a daily basis, but don't necessarily sleep in it. In total, there are m(h) home people, including the household members (thus m(h) - n(h) extra-household home people). For each household, there are p(h) = m(h)(m(h) - 1)/2 unordered pairs of home people. If you ask each household member about each of the unordered pairs, you will end up with n(h)p(h) informant reports for household h. Simple, right?

No. You need at least four linked tables to do this right. First, you need a table for individual villagers that tells you the household they live in, if known (note that you will get the names of extra-household home people who do not necessarily live in the set of households you visit!). Second, you need a table to store the affiliations of home people to households (because a villager could be a home person to multiple households in the village!). Third, you need a table of the unordered pairs of home people. Finally, you need a table that stores the informant reports on each of the unordered pairs of home people affiliated with the household in which they reside. If you do any of this incorrectly, you are fucked. Thankfully, I designed my database correctly. 
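
Here is one way the four linked tables might be laid out, sketched with Python's built-in `sqlite3` module rather than MS Access. The table and column names are made up for illustration, and the constraints are one reading of the description above; this is a sketch, not the database used in the field.

```python
import sqlite3

SCHEMA = """
-- 1. Individual villagers; household_id is NULL when residence is unknown.
CREATE TABLE villager (
    villager_id  INTEGER PRIMARY KEY,
    name         TEXT NOT NULL,
    household_id INTEGER
);
-- 2. Affiliations of home people to households (one villager, many households).
CREATE TABLE affiliation (
    household_id INTEGER NOT NULL,
    villager_id  INTEGER NOT NULL REFERENCES villager(villager_id),
    PRIMARY KEY (household_id, villager_id)
);
-- 3. Unordered pairs of home people; storing the smaller ID first gives
--    each pair exactly one representation per household.
CREATE TABLE dyad (
    dyad_id      INTEGER PRIMARY KEY,
    household_id INTEGER NOT NULL,
    member_a     INTEGER NOT NULL,
    member_b     INTEGER NOT NULL,
    CHECK (member_a < member_b),
    UNIQUE (household_id, member_a, member_b),
    FOREIGN KEY (household_id, member_a)
        REFERENCES affiliation(household_id, villager_id),
    FOREIGN KEY (household_id, member_b)
        REFERENCES affiliation(household_id, villager_id)
);
-- 4. Informant reports on dyads; a NULL winner means "half and half".
CREATE TABLE report (
    report_id          INTEGER PRIMARY KEY,
    dyad_id            INTEGER NOT NULL REFERENCES dyad(dyad_id),
    informant_id       INTEGER NOT NULL REFERENCES villager(villager_id),
    more_likely_to_win INTEGER REFERENCES villager(villager_id),
    UNIQUE (dyad_id, informant_id)
);
"""


def create_db(path=":memory:"):
    """Create the four hypothetical tables and return the open connection."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves this off by default
    conn.executescript(SCHEMA)
    return conn
```

With this layout, entering the same pair twice in reverse order, or pairing someone who is not a home person of that household, fails at insert time instead of surfacing months later during analysis.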

2. Enter the data directly into a computer.


You have four linked tables. The number of unordered pairs increases with the square of the number of household members. You don't have time or cognitive capacity to fiddle around with long lists of home people dyads, villagers, and the links between them. Don't try it. Create a graphical user interface that links to your database (MS Access is a good way to do this; just make a bunch of forms). You will make fewer data entry errors because you won't be copying things a bunch of times, and you will be able to query information quickly. Thankfully, I did this right, too. But then I went wrong.

3. Create your dyad records and your records for informant reports on dyads before collecting the data on those dyads!


After I got to the field, I had to do a major overhaul on my survey questionnaire, which meant I had to completely redo my graphical user interface forms. I was in the field, lonely, and missing my family. So I was not in my right mind. I also was running short on time. So I decided I would just enter dyads by hand as I collected the data rather than create a SQL query that would automatically create the dyads after I entered in new individuals. Granted, that is actually a difficult thing to code into an MS Access form (which is what I was using). Still, by entering the dyads by hand, I increased the probability of errors. Turns out that I have 72 missing dyads (sounds like a whole lot, but there are thousands of dyads in my dataset). That's bad because none of those dyads were represented in any of my informant reports on social dominance networks for the households that include the missing dyads. That means I will have to impute that data. While I know some pretty cool data imputation methods, this of course won't totally solve the problem.
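
Generating the dyads automatically does not have to be elaborate. As a rough sketch in plain Python (rather than a SQL query inside an MS Access form; the function names are hypothetical), the full set of unordered pairs for a household can be built from its list of home people, and any dyads missing from hand-entered data can be flagged by comparison:

```python
from itertools import combinations


def expected_dyads(home_people):
    """All unordered pairs of home people, smaller ID first in each pair."""
    ids = sorted(set(home_people))
    return list(combinations(ids, 2))


def missing_dyads(home_people, entered_dyads):
    """Expected dyads that never made it into the hand-entered records."""
    entered = {tuple(sorted(pair)) for pair in entered_dyads}
    return [pair for pair in expected_dyads(home_people) if pair not in entered]


# Invented example: four home people, only two dyads entered by hand.
people = [4, 2, 1, 3]
by_hand = [(2, 1), (3, 4)]

m = len(people)
assert len(expected_dyads(people)) == m * (m - 1) // 2  # m(m - 1)/2 pairs
print(missing_dyads(people, by_hand))
```

Running a check like this after each household visit, while the informants are still nearby, would catch gaps before they turn into an imputation problem.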

Similarly, I decided I would just enter new informant report records by hand, manually entering in the individual identifiers of the individuals in the dyad the informant was responding to. I also manually entered the identifier into the field storing which dyad member was more likely to win an argument instead of having the field force me to choose one of the dyad members. Again, this would take more time to code, but so does reconciling errors. I haven't counted the errors in my informant report data, but I know there are a few. Many of them are likely reconcilable because the mistakes will be obvious. But not all.
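
The kind of check that a forced-choice field would have performed can be sketched in a few lines of Python. This is an illustration under assumed names (`validate_report` and its arguments are hypothetical), not the form logic from my MS Access database:

```python
def validate_report(dyad, informant, winner, household_members):
    """Return a list of problems found in one hand-entered report.

    dyad              -- pair of villager IDs the informant was asked about
    informant         -- villager ID of the person answering
    winner            -- villager ID named as more likely to win, or None
                         for a "half and half" answer
    household_members -- set of villager IDs living in the household
    """
    problems = []
    first, second = dyad
    if first == second:
        problems.append("dyad pairs a person with themselves")
    if winner is not None and winner not in (first, second):
        problems.append("winner is not a member of the dyad")
    if informant not in household_members:
        problems.append("informant is not a household member")
    return problems


# Invented IDs: 1, 2 and 3 live in the household; 7 is an outside home person.
members = {1, 2, 3}
print(validate_report((1, 7), informant=2, winner=7, household_members=members))
print(validate_report((1, 7), informant=2, winner=3, household_members=members))
```

The first call returns an empty list; the second flags a winner who is not in the dyad, which is exactly the sort of slip that manual identifier entry lets through.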

I will relate these and other tips to viewers of my upcoming presentation at the American Anthropological Association Meetings (in November). I'm presenting in a session, led by Stanford's Jamie Jones, on Bayesian inference of social dominance networks from multiple informant reports.]]>
<![CDATA[Great discussion about reproductive ecology and life history theory]]>Fri, 11 Jan 2013 23:49:50 GMThttp://www.mynof3.com/1/post/2013/01/great-discussion-about-reproductive-ecology-and-life-history-theory.htmlMy latest adventure as Teaching Assistant for Dr. Kathy O'Connor's Reproductive Endocrinology Lab class was a discussion about reproductive ecology, part of which I led. (Reproductive ecology is the study of the factors that influence the modulation of reproductive function, in particular the environmentally specific availability of energy and nutrients.) Class started with Kathy's review of the aims and history of reproductive ecology (from an anthropologist's perspective). She segued into a discussion of some key controversies in the study of female reproductive ecology, and how this ties in with a new interest among reproductive ecologists in the male reproductive axis. 

One of the most fascinating questions here is how responsive the female reproductive axis is to environmental cues (such as energy and nutrient intake), and why. A related question is whether the male reproductive axis is more or less robust to environmental change than the female axis and, again, why. To what extent and in what ways is the sex-specific robustness of the reproductive axis adaptive? Our students, who range from undergraduates in Anthropology to a Communications graduate student, all gave convincing, controversial (in a good way), and (most importantly) testable answers to these questions. It was a great primer on how to think from the perspective of a reproductive ecologist, which will help the students understand why anthropologists are teaching a class on laboratory methods in endocrinology.

So why do anthropologists do endocrinology? The first day of class, Kathy discussed the contrast between the anthropological and biomedical perspectives. Anthropology is holistic, evolutionary, and cross-cultural. The biomedical perspective is more in tune with basic research on physiology, but it determines what is normal based on studies of W.E.I.R.D. (Western, Educated, Industrialized, Rich, and Democratic) populations. But what is normal? Specifically, what are normal estrogen levels? Testosterone levels? A physician might give you a range of female sex hormone values within which a woman can conceive. But an anthropological endocrinologist would show you a graph of Bangladeshi women's sex hormone profiles that are well below that range, and yet they're conceiving and giving birth!

So what is the solution? Kathy teaches that we need to quantify and explain variation in endocrinological function across and within individuals and populations. The sticky part is understanding when to focus on one level of variation and why. It's that challenge I focused on in the section of the discussion that I led.

Near the end of class, we discussed an article by Bribiescas on the evolutionary tradeoffs that the human male reproductive axis faces between survival and reproduction. I showed several of the charts and graphs that Bribiescas used, and asked the students to identify the levels of variation he was focusing on or masking in each plot, and why. I asked them what the consequences of masking certain levels of variation might be for the conclusions that Bribiescas was making. We also discussed how he was employing a cross-cultural perspective to outline key features of male reproductive ecology.

Throughout, I hinted to the students at two "mystery" levels of variation that all of Bribiescas's plots masked and which are extremely important in practical endocrinology (and also to the highfalutin hypotheses that we test using hormone assays). To my surprise, one of the students solved one of the mysteries by noting that some of Bribiescas's graphs showed urinary hormone profiles, others salivary profiles, and still others serum profiles. I seriously jumped for joy that she figured it out. I know, I'm a nerd. Anyway, different matrices (urine, saliva, serum) can tell us different things about reproductive function. Sometimes, people forget that.

The students haven't yet figured out what the other important source of variation is. And I'm not going to give it away! Stay tuned for the answer once they get it. They're pretty smart, so I don't doubt they will. I'll probably jump for joy again, nerd-like, when they do.

]]>
<![CDATA[The first day of a long quarter. Well, kind of.]]>Tue, 18 Dec 2012 05:35:01 GMThttp://www.mynof3.com/1/post/2012/12/the-first-day-of-a-long-quarter-well-kind-of.htmlNext quarter I'll be the teaching assistant for my department's reproductive endocrinology course. In that course, we use biochemistry to measure the amount of hormones in urine, saliva, and other body fluids. In the process, some interesting hypotheses are tested...and lots of sweat is spilled over whether or not you're performing lab procedures properly.

Being the teaching assistant for this course is a daunting task even if you're a graduate student who happens to be an endocrinologist. I'm not that. So this will be a big challenge for me. Thankfully, I like challenges and often overcome them.

I've taken this course before, but I need to review the lab procedures and concepts to get myself back up to speed. Today was the first day of my lab work crash review. I have to say, picking up a pipette again after a few years is kind of like riding a bike. Except you only use your thumb, and it's a lot more precise.

Tomorrow, I prepare my samples (of urine and saliva...ew....but still, cool), standards (help you figure out the limits of detection for your assay, among other things), and controls (help you figure out if your assay sucks or not by remeasuring a sample that has a known hormone concentration). The day or two after that, we'll run the assays. We'll be measuring cortisol in matched urine and saliva specimens, and progesterone in urine. It is only three plates, which is far fewer than I ran back when I did my class project. 

Many thanks to the folks at the lab for their assistance in my retraining. I'll maintain this log of my lab course teaching assistance experience.]]>