Friday, March 4th 2011

Mate magnet madness: When the range of possible explanations exceeds your own hypothesis

Figure 1. My apologies to Baby Jaguar for not finding a picture that included him.

My daughter will be three in just a few weeks. She loves telling stories. These stories have the same, uncomplicated arc every time: she and her friends Dora, Diego, Boots and Baby Jaguar go on an adventure to rescue Mommy from the giant condor. Or sometimes Mommy and Dora and Diego and Boots and Baby Jaguar are rescuing her. Or sometimes Daddy does the rescuing.

There is almost always a net, then a pair of Rescue Scissors needed to cut the captive free. The variation in these stories is very small, and the framework is borrowed heavily from one of the few mythologies known to my little girl: Dora the Explorer.

Evolutionary psychology (EP) is often a kind of story-telling too, but instead of borrowing from a preschool cartoon it borrows from the concept of anisogamy. Anisogamy is sexual reproduction through the union of unequal gametes: in our lineage, a big egg made by females and little sperm made by males. This provides the foundation for differential reproductive investment, where females often put in the time and effort of gestation, lactation and care. From here, proponents of EP see essential differences between what men and women want in relationships, and in the kinds of relationships that are optimal. A model this broad makes it possible to shoehorn any behavior into its adaptive framework.

Figure 2. The actual image that accompanied Tierney’s column.

Enter John Tierney, my (not) favorite journalist for the New York Times. This is the man who thinks that sexism is a radical act (I am referring to his charming articles on gender disparities in science). So I suppose I shouldn’t have been surprised when he outed himself as an EP fanboi in his most recent piece, “The Threatening Scent of Fertile Women.”

Tierney covered the work of Jon Maner and others who have studied relationship maintenance – the suite of behaviors that keeps a couple together. In particular, Tierney focuses on the problem of the wandering eye, or rather, the possible mechanisms that prevent it in a monogamous couple. The idea here is that relationship maintenance is evolutionarily adaptive, because when a couple stays together it is easier to raise offspring and increase reproductive success.

The range of explanations

The study that frames Tierney’s column is Miller and Maner (2010). Thirty-eight undergraduate men rated the attractiveness of a woman with whom they interacted at several points over her menstrual cycle. The authors found NO relationship between where the woman was in her cycle and how attractive single men found her, but a negative relationship between the chance that she was fertile and how attractive partnered men found her.

What do Miller and Maner (2010) discuss, and what is the idea Tierney is so enamored with?

“It’s possible that some of the men in Florida were just trying to look virtuous by downgrading the woman’s attractiveness, the way a husband will instantly dismiss any woman pointed out by his wife. (That Victoria’s Secret model? Ugh! A skeleton with silicone.) But Jon Maner, a co-author of the study, says that’s unlikely because the men filled out their answers in private and didn’t expect the ratings to be seen by anyone except the researchers.

“It seems the men were truly trying to ward off any temptation they felt toward the ovulating woman,” said Dr. Maner, who did the work with Saul Miller, a fellow psychologist at Florida State. “They were trying to convince themselves that she was undesirable. I suspect some men really came to believe what they said. Others might still have felt the undercurrent of their forbidden desire, but I bet just voicing their lack of attraction helped them suppress it.””

This conjecture is unconnected to the study’s methodology and results. Nowhere in that study did they assess the participants’ state of mind or ask them how they felt about this. How do we know they were trying to convince themselves of anything? This finding, while interesting, does not test their hypothesis for an evolutionary framework for relationship maintenance that includes adaptively suppressing attraction to others.

Maner et al (2009) studied the attention people pay to images of attractive people of the opposite sex after first being exposed to sexual words like “lust” and “kiss.” They recruited 120 straight undergraduates, thirty-six of whom were in committed relationships. Individuals in committed relationships paid far less attention to the attractive images than those not in relationships. Tierney titters,

“The subliminal priming with words related to sex apparently activated some unconscious protective mechanism: Tempt me not! I see nothing! I see nothing!”

I’ve done my own share of human subjects research, and subjects will often tell you or do what they think you want, or they will just not be honest if they don’t want you to know the truth. What if, as originally posed by Tierney himself, the respondents weren’t warding off temptation but wanted to look virtuous? What if, now bear with me because this might seem crazy, the people in these studies were in love with their partners and genuinely uninterested in anyone else? Too often EP wants to provide a single explanation for a behavior, when the range of possible explanations far exceeds their hypothesis.

An anthropological perspective

Jamie Jones, Associate Professor at Stanford and blogger at Monkey’s Uncle, describes anthropology like this:

“…[A]nthropology is the science charged with explaining the origin and maintenance of human diversity in all its forms. To achieve this end, anthropology must be unapologetically grand in its scope. How can we explain human diversity without documenting its full extent, through both time and space, and across cultures? … Where does the tapestry of human diversity come from and how is it that we continually manage to resist powerful homogenizing forces and hang on to our diversity? What commonalities transcend local difference to unite all humanity? How is it that civilizations rise and fall? And what is the fate of humanity?”

Jamie beautifully depicts the importance of documenting and understanding diversity even in the face of efforts to simplify human nature. Thus, to me, an anthropological perspective is often at odds with EP explanations for behavior.

An anthropological perspective asks, what happens if you take these basic observations and, instead of deciding on a favorite explanation and applying it to everyone, put them into a model in which you can vary context (age/sex specific mortality rates, distribution of resources, what have you) and see what range of strategies actually give fitness benefits? That is, when you actually throw some variation into the equation, is this still the best strategy for the partnered men with whom Tierney feels simpatico?

Right now we don’t know. Much psychological empiricism rests on undergraduates who participate in studies for course credit. When one wants to make connections to evolutionary adaptedness, undergraduates may be a place to start, but not a place to end.

I have a real problem with continuing to use this population to make statements of universality for all humans. Undergraduates usually are trying to avoid pregnancy and build their financial and social capital, so relationship maintenance for the sake of reproductive success rarely exists in this group. Until we can show that relationship maintenance, and the particular behaviors within it that Miller, Maner and others study, occur across many populations, and particularly among reproductively-aged folks, their argument for adaptation fails.

Figure 3. Celebrations of marriage.

Another problem is that most work on relationships in EP tends to be heteronormative, meaning that researchers think nothing of assuming either that everyone is straight, or that the universally best behavioral strategy is to be straight. They also tend to assume that the best strategy is to be monogamous, with occasional sneaky infidelity permitted if one can get better genes or more offspring that way (keep in mind that there is a difference between what might be biologically advantageous in a certain context, and what is culturally appropriate – the argument here is not against the culture of monogamy).

But heterosexual monogamy is only one reproductive strategy of many that humans employ. Depending on how you measure it, monogamy and polygyny (single male, multi female marriage) vie for the most frequent strategy – in fact, polygyny occurs in about 80% of modern human societies (Murdock and White 1969). There are even a few rare populations that practice polyandry, which is the marriage of a single female and multiple males. And, even in those populations where monogamy is practiced, serial monogamy is far more frequent than lifetime monogamy: this means that individuals have a series of monogamous relationships rather than find one mate for life (so no, divorce is not a modern human invention).

When taking an even broader, comparative perspective, monogamy isn’t practiced by our closest relatives at all. Chimpanzees and bonobos, both equally related to us, are promiscuous. This is a scientific term for a reproductive strategy in which females and males mate with many individuals during each fertile period. Bonobos also use heterosexual and homosexual sex to reduce stress and aggression, and to form bonds with one another. Gorillas, our next closest relative, are polygynous. Orangutans are very solitary, but essentially promiscuous. It’s only once you delve into the lesser apes, the gibbons, that you see any monogamy, and they are far less monogamous than we first thought (Brockelman et al 1998).

Maintaining a heterosexual, monogamous relationship is certainly advantageous at certain times, in certain contexts. But it is not universally adaptive, even within humans. Without anyone studying these behaviors in populations that use different reproductive strategies, and in the absence of comparative data to support these assertions, we are at an impasse.

Conclusion

In the words of a friend, EP is plugged into evolutionary theory with little more than a ratty old extension cord. EP takes some very basic, ancestral conditions, like differential costs of reproduction, and uses them in a sufficiently vague way that any behavior can be related to females generally being the ones to put in all the time and effort of making babies. Yet EP often ignores the three conditions necessary for natural selection, a key mechanism of evolution. For natural selection to act on a trait, the trait must be variable, heritable, and produce differential reproductive success. Rarely does EP document variation in a trait, rarely does it examine whether said trait has a genetic component, and rarely does it test whether the trait confers a reproductive advantage.

Are fertile women a threat to partnered harmony, their scents providing a temptation that noble men must suppress? I can’t rule it out, but I also think it is one of the least likely of many possible explanations.

Unfortunately for readers of the New York Times, Tierney loved this idea more than he loved interrogating it.

Acknowledgements

I’d like to thank Charles Roseman, friend, faculty curmudgeon and Bastard Colleague from Hell, for taking a look at an early draft of this post and providing commentary crucial to its improvement. Any rhetorical or scientific errors are my own.

References

Brockelman, W., Reichard, U., Treesucon, U., & Raemaekers, J. (1998). Dispersal, pair formation and social structure in gibbons (Hylobates lar). Behavioral Ecology and Sociobiology, 42 (5), 329-339. DOI: 10.1007/s002650050445

Maner, J., Gailliot, M., & Miller, S. (2009). The implicit cognition of relationship maintenance: Inattention to attractive alternatives. Journal of Experimental Social Psychology, 45 (1), 174-179. DOI: 10.1016/j.jesp.2008.08.002

Miller, S., & Maner, J. (2010). Evolution and relationship maintenance: Fertility cues lead committed men to devalue relationship alternatives. Journal of Experimental Social Psychology, 46, 1081-1084.

Murdock, G., & White, D. (1969). Standard Cross-Cultural Sample. Ethnology, 8 (4). DOI: 10.2307/3772907

Image sources

Dora picture: http://www.doratheexplorertvshow.com/dora/dora-explora-pics.htm
Lady magnet: http://www.nytimes.com/2011/02/22/science/22tier.html?_r=2&ref=johntierney
Same-sex marriage: http://markusisthedrug.onsugar.com/date/2009/05/07

Thursday, February 24th 2011

ResearchBlogging Editor’s Selections: PMDD

Just a quick note to let you all know that my PMDD post was chosen by both Krystal D’Costa and Jason Goldman for their Editor’s Selections this week over at ResearchBlogging.org.


Monday, February 21st 2011

Tag-teaming research blogging: Me and Sci do it up, PMDD-style

When I was in college, my favorite hangout was the basement of the Harvard Book Store, where they had the used books and cheap remainders (it was also across the street from my freshman dorm, Wigglesworth, and yes, that is a most excellent name). I worked my way through several sci-fi and fantasy series, and got nearly all my Women’s Studies books, because of that one lovely room.

One night in my freshman year I was browsing the philosophy section with a new boyfriend, a person around whom I often felt inferior and less educated. I saw an author name on the spine of an old hardcover and, hoping to impress the boyfriend, pointed it out. “Hobbes Machiavelli, I’ve read stuff by him,” I said. I arched my eyebrows with what I hoped was an air of intelligence.

The boyfriend, and a nearby witness, both turned towards me. “Hobbes and Machiavelli are two different people,” he said slowly.

As a blush crept up my face, I realized several things: the excerpt of “The Prince” I had barely skimmed in high school was by Niccolo Machiavelli, Hobbes was a totally different dude, and my boyfriend thought I was a posturing idiot.

It’s a good idea to know what you’re talking about before opening your mouth.

* * *
These days, if I don’t know the answer to something, I don’t try to fake it. Recently, a Twitter follower suggested I write on this New Scientist story, and the empirical article on which it was reporting, about brain activity, hormones and Premenstrual Dysphoric Disorder. As I am not an expert on issues of the brain, I didn’t try to be one: I enlisted brilliant neuroscientist Scicurious to do tag-team blog posts where we could each cover the material where we had expertise. I had a few thoughts about the way the New Scientist article author framed the study, and about the hormone analyses. So I’ll talk about that, and Sci will cover BRAINZ in this post.

What is this study about?

Rapkin et al (2011) seek to understand why a minority of women experience Premenstrual Dysphoric Disorder (PMDD), a suite of premenstrual behaviors that include severe and debilitating irritability, depression and anxiety. They used PET scans to look at brain stuff (cue Scicurious) and also looked at hormone concentrations to see if the reproductive hormones that decline in the premenstrual phase had anything to do with it. They found no difference in hormone concentrations between control and PMDD women, but did find variation in cerebellar activity by menstrual phase. You need to read Scicurious’s take on this, because she provides important background and context to the study of the cerebellum for mood.

The New Scientist piece makes a lot of the potential effect of progesterone on GABA receptors in the brain, but as far as I can tell the article itself does not measure GABA receptors. Progesterone, allopregnanolone and GABA are all interrelated and important chemicals when it comes to mood (Concas et al 1998), but like I said, since the study didn’t actually look at GABA, I’m not going there. Sci has also made some important points about this issue, and about how what the study authors found (which is admittedly cool) squares with what they discuss around GABA (which might be a wee bit of a stretch).

Nits to pick with New Scientist

Zukerman, the author of the New Scientist piece, begins her piece, entitled “Why women get anxious at ‘that time of the month’” with this:

“Is it that time of the month? These are the words no man should ever utter. How about this for a diplomatic alternative: “Are your GABA receptors playing up?”

You may be spot on. It seems that these brain cells are to blame for some women’s monthly mood swings.

Many women feel a little irritable before menstruating, but up to 8 per cent suffer extreme symptoms, including anxiety, depression and fatigue.”

There are a few things that trouble me about this. First, without citing any actual incidence of this symptom, the author claims that many women suffer from irritability before their period. This perpetuates the idea that irritability is a common premenstrual trait, when in fact the premenstrual phase is an incredibly variable period, and at most only eight percent of women get these symptoms to the point that they are debilitating (the two studies the study authors cite give a 5% and 8% incidence, so 8% may be high).

From a public health or science research perspective, eight percent of reproductively aged women is a pretty significant quantity. I absolutely want more research to be done on PMDD and, full disclosure, I’m running some pilot studies to work on it in the future myself. However, these results don’t necessarily translate to women who may just get a little irritable or experience other mild behavioral symptoms before their period.

And that is why both the title and the “Is it that time of the month” joke at the start of the story were misleading. Besides its obvious sexism, where any female behavior that deviates from the pleasing and passive risks eliciting that question, the link here in the mind of a popular reader is that women’s behavior is governed by hormone and brain interactions more generally than the paper actually implies.

So, to reiterate: PMDD impacts maybe eight percent of reproductively aged women (notice that I keep specifically referencing “reproductively-aged women,” which further shrinks the pool of women down to those between menarche and menopause). This is nothing to sneeze at. But this isn’t everyone.

Hormones

In order to see if there were differences in hormone concentrations between normal and PMDD women, Rapkin et al (2011) took blood on the days of the PET scans: this translated into one follicular phase (first half of the cycle, between menses and ovulation) and one late luteal phase collection (the week or so before the next menses). They found no difference in the mean concentrations of estradiol and progesterone between the two groups, at either time period.

Table 1 from Rapkin et al (2011). None of these differences between groups are significant according to the authors, but they didn’t report p-values anywhere I could find.

There are several problems with this. First, the sample size is tiny. I have certainly been known to run analyses with fewer subjects, but the way I and other folks who do hormone work get around this is to sample each individual many more times. When collecting hormone information on reproductively-aged women, for instance, you want to collect a minimum of one menstrual cycle’s worth of data… every single day.

More power!

My advisor raised me right, and so I did a power analysis of the data the study authors provided. A power analysis is a way to determine the statistical power of a test, meaning the probability that it detects a difference of a given size when that difference really exists. You can do it beforehand to determine an appropriate sample size for your experiment, or afterwards if you didn’t find something statistically significant and don’t know if your analysis was effective. When there are small but important differences between two groups, but the sample size is also small, your statistical test can come out non-significant and thus miss that important difference.

Let’s take the hormone and time period that should be the most meaningful: progesterone in the late luteal phase. PMDD women had 5.50 ± 5.27 ng/mL, and control women had 6.76 ± 7.53 ng/mL. If we say that the smallest difference between these two groups that would be interesting is around 6 ng/mL (just splitting the difference between the two standard deviations, but this is pretty generous), then according to my calculations this test only has a power of about 60%. Therefore, 40% of the time a test with a sample size this small wouldn’t catch a potentially important difference between the groups. To put that in perspective, the conventional standard is to have a power of at least 80%.
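For anyone who wants to poke at a number like this, here is a minimal sketch of one way such a calculation could be run. It is not necessarily the calculation behind the figure above. It assumes the twenty-four subjects were split evenly into two groups of 12 (the split is an assumption), pools the two standard deviations, and uses a normal approximation to a two-sided, two-sample test at alpha = 0.05 rather than an exact t-based calculation. The function names are made up for illustration.

```python
from math import ceil, sqrt
from statistics import NormalDist

# Assumption: 24 subjects split evenly between PMDD and control groups.
N_PER_GROUP = 12

def pooled_sd(sd_a, sd_b):
    """Pooled standard deviation for two groups of equal size."""
    return sqrt((sd_a ** 2 + sd_b ** 2) / 2)

def approx_power(min_difference, sd, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample comparison of means,
    using the normal approximation (slightly optimistic for small samples)."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Standardized difference scaled by the per-group sample size
    shift = (min_difference / sd) * sqrt(n_per_group / 2)
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)

def n_for_power(min_difference, sd, power=0.80, alpha=0.05):
    """Approximate per-group sample size needed to reach the target power."""
    z = NormalDist()
    z_total = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    return ceil(2 * (z_total * sd / min_difference) ** 2)

# Late luteal progesterone SDs reported in the post (ng/mL)
sd = pooled_sd(5.27, 7.53)
power = approx_power(6.0, sd, N_PER_GROUP)
needed = n_for_power(6.0, sd)

print(f"pooled SD: {sd:.2f} ng/mL")
print(f"approximate power with {N_PER_GROUP} per group: {power:.2f}")
print(f"approximate n per group for 80% power: {needed}")
```

Under these assumptions the power lands near 0.6, and the per-group sample size needed for 80% power is noticeably larger than 12. An exact noncentral-t calculation would give a slightly lower power than the normal approximation.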

What’s blood got to do with it?

On Fertile Ground: A Natural History of Human Reproduction, by Peter T. Ellison. Go to the Amazon page to embiggen the image and you’ll see the plastic tube one of the women is holding to collect spit.


Most people with a clinical background or doing a more clinical collaboration seem to be needle-happy. That is, when they want to measure hormones, they take it out of your arm rather than from the many other places you can get it: blood spots (using a little lancet on your finger), saliva, urine, and feces. This will some day be a blog post in its own right.

Here is the short answer: saliva is very often better than venous blood. Hormones are secreted from their organs in a pulsatile way, meaning they are released in short bursts, which leads to measurements going up and down quickly. Since they circulate in the blood, serum measurements of hormones are likely to pick up this noise. This is yet another reason why only two samples for each of the twenty-four subjects is troubling. There are other reasons, related to what version of the hormone you are measuring when getting it from blood, spit or elsewhere, the higher compliance and greater frequency of sampling you can do with saliva, and the fact that you don’t have to stick your subjects or increase their risk of infection.

The only studies looking at variation in hormones across the cycle in menstrually-related mood disorders use blood (Bloch et al 1998, Rubinow et al 1988). Bloch et al (1998) measured 10 women with PMS and 10 controls using serum every day for a cycle (hooray, every day!) but they measured testosterone, cortisol, and other hormones not comparable to this study. Plus, they were looking at women with PMS, a much more broadly defined syndrome than PMDD. It would be harder to find a difference between those two groups than between controls and women with PMDD.

Rubinow et al (1988) is old enough that I can’t get it online. The abstract says nothing about how frequently the hormones were measured or the number of women in the study, and I don’t know how strictly they defined menstrual disorders (again, as opposed to the rather strictly defined PMDD).

Variation is the spice of life

My last issue with the hormones is with the two windows during which they measured them. Women were measured in their follicular phase anywhere from 8-12 days into their cycle for the first measurement; then the late luteal phase measurement was 10-14 days after a measured LH surge (which occurs around midcycle).

Here is the kind of variation I see when I measure women’s hormone concentrations every day. What you’re looking at is salivary estradiol (pmol/L) measured daily in over twenty Polish women, aligned by midcycle drop date. The first graph is all the women together, the second is the average and standard deviation.

Individual Polish women’s estradiol concentrations.

Average Polish women’s estradiol concentrations.

Here is salivary progesterone from the same population, aligned by the end of the cycle. Again, the first graph is everyone individually, the second is average and standard deviation.

Individual Polish women’s progesterone concentrations.

Average Polish women’s progesterone concentrations.

A few important things to note: this isn’t the same way the study authors aligned their data (though the way I have shown it here is more physiologically meaningful) and the units are different. However, if you look at about the times when the study authors were taking their measurements – mid to late follicular phase and late luteal phase – you see a TON of variation between those days, both within and between women. This is why a single measurement in that general window is, in essence, of no use. You have way too much noise in a single measurement to be able to begin to say anything about differences between groups.
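A rough sketch of why the number of samples per woman matters so much. The numbers here are made up for illustration and are not from the study or from the Polish data. It also assumes each measurement is the woman's "true" level plus independent, normally distributed noise, which is a big simplification: real hormone levels change systematically across the cycle and neighboring days are correlated. The function name is hypothetical.

```python
import random
import statistics

def spread_of_estimate(true_level, within_sd, samples_per_woman,
                       n_trials=5000, seed=1):
    """Simulate how much an estimate of one woman's hormone level varies
    when it is based on `samples_per_woman` noisy measurements."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    estimates = []
    for _ in range(n_trials):
        draws = [rng.gauss(true_level, within_sd)
                 for _ in range(samples_per_woman)]
        estimates.append(statistics.fmean(draws))
    # Standard deviation of the estimates across repeated "studies"
    return statistics.stdev(estimates)

# Hypothetical values, chosen only for illustration
TRUE_LEVEL = 100.0
WITHIN_SD = 40.0

single = spread_of_estimate(TRUE_LEVEL, WITHIN_SD, samples_per_woman=1)
week = spread_of_estimate(TRUE_LEVEL, WITHIN_SD, samples_per_woman=7)

print(f"spread of estimate from 1 sample:  {single:.1f}")
print(f"spread of estimate from 7 samples: {week:.1f}")
```

With independent noise, the spread of the estimate shrinks with the square root of the number of samples, so a week of daily samples gives a much steadier picture of one woman than a single draw does.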

The punchline

PMDD is very likely related to hormone concentrations – if not in their average values between groups, then in how those hormones differentially impact brain functioning (the brain sensitivity stuff Sci discusses so well). But we won’t know these potential differences if we don’t gather the hormone data correctly. Just because brain scans are cool (and really, they are, and I applaud the study authors for doing stuff that I simply cannot do and finding interesting results) doesn’t mean you can give the hormones short shrift.

References

Bloch M, Schmidt PJ, Su TP, Tobin MB, & Rubinow DR (1998). Pituitary-adrenal hormones and testosterone across the menstrual cycle in women with premenstrual syndrome and controls. Biological psychiatry, 43 (12), 897-903 PMID: 9627744

Concas A, Mostallino MC, Porcu P, Follesa P, Barbaccia ML, Trabucchi M, Purdy RH, Grisenti P, & Biggio G (1998). Role of brain allopregnanolone in the plasticity of gamma-aminobutyric acid type A receptor in rat brain during pregnancy and after delivery. Proceedings of the National Academy of Sciences of the United States of America, 95 (22), 13284-9 PMID: 9789080

Rapkin AJ, Berman SM, Mandelkern MA, Silverman DH, Morgan M, & London ED (2011). Neuroimaging evidence of cerebellar involvement in premenstrual dysphoric disorder. Biological psychiatry, 69 (4), 374-80 PMID: 21092938

Rubinow DR, Hoban MC, Grover GN, Galloway DS, Roy-Byrne P, Andersen R, & Merriam GR (1988). Changes in plasma hormones across the menstrual cycle in patients with menstrually related mood disorder and in control subjects. American journal of obstetrics and gynecology, 158 (1), 5-11 PMID: 2962499