A substantial part of the news aggregator and blog world is discussing the gender interest analysis paper that recently hit the streets, and having taken the time to read the actual study, I figured I’d throw my two cents in. I’ve got a few issues and questions about the study that I will discuss below. Note that I am not a social scientist, so it’s entirely possible that some of these things are considered normal in the field. I’m not saying that the conclusion is absolutely wrong, merely that to an outsider there appear to be some issues.
The official version of the study is not yet available online, so I am basing my responses on this preprint:
http://www.psychologicalscience.org/journals/ps/19_4_inpress/Farris.pdf

For those who don’t want to wade through all of my rambling on the subject, here is my description of what I think the study was actually on:
Is there a gender difference in assigning emotions to a photo viewed for a short period of time compared to a long period of time, as measured with a non-random and unbalanced participant sample?
------
1. The study participants took part for class credit. I assume this means they were all members of the same class, or at least classes in the same department (I’m guessing, perhaps Psychology). I have a problem saying that a group of young, white students at the same college, in the same field, represents "men" and "women." It’s entirely possible, for example, that the women drawn to Psychology as undergraduates are more perceptive than the men. Or that one gender has a greater percentage of foreign students.
Also as a potentially huge factor, what was the dating status of the subjects and did that affect their interpretation of the photos?
2. The 178:102 ratio of males to females in the study. Why? If you are trying to compare two genders, doesn’t it make sense to recruit equal numbers of each?
3. The study says that participants were randomly selected to either see the images for 500ms or 3000ms. Define random. (Anyone who's ever played a game with me that involves dice knows that randomness does not necessarily reflect perfect entropy... particularly within a relatively small number of subjects.) Also, the study does not break out the results based on the length of time participants had to see the image. It could be that men are terrible at judging the pictures in a split second but good with the longer viewing period. Not knowing the distribution of the short/long viewers makes it hard to tell. The paper says that it didn’t have a significant impact, but show me the numbers.
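To make the randomness point concrete, here is a minimal sketch (Python; the 178/102 headcounts and the 500ms/3000ms conditions are the numbers above, while the coin-flip assignment and everything else are my own assumptions) of how far a genuinely random split can drift from 50/50 within each gender:

    # A minimal sketch, assuming 178 men and 102 women are each assigned to the
    # 500ms or 3000ms condition by a fair coin flip. The point is just to show how
    # lopsided "random" can get with groups this small.
    import random

    def worst_drift(n_men=178, n_women=102, trials=10_000, seed=1):
        random.seed(seed)
        worst = 0.0
        for _ in range(trials):
            men_short = sum(random.random() < 0.5 for _ in range(n_men))
            women_short = sum(random.random() < 0.5 for _ in range(n_women))
            # How far does the short-exposure share drift from 50% within each gender?
            drift = max(abs(men_short / n_men - 0.5), abs(women_short / n_women - 0.5))
            worst = max(worst, drift)
        return worst

    print(f"Worst within-gender drift from a 50/50 split over 10,000 draws: {worst_drift():.1%}")

Even perfectly honest randomness will produce some noticeably lopsided draws at this sample size, which is exactly why I’d rather see the actual short/long breakdown than a reassurance that it didn’t matter.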
4. The pilot testing (from what I can tell reading the paper) used college-age models, again perhaps drawn from the student body of the Psychology department. They were photographed (again, from what I can tell) while not actually in a social situation, and then assigned an emotion by a two-panel review.
I suspect most people reading this have some background in theatre, and are aware that projecting emotions is not necessarily easy for actors let alone random students.
So at this point I’m seeing the study as asking whether the study population does better at assigning the same emotion as the panel given a short period of examination than a comparison group does given a long period of examination.
5. Back to that sample size... 67 males thought that the "friendly" photos were sexually interested while 32 females did, representing 37.8% and 31.9% of their genders, respectively. However, there were ~1.75 males for every 1 female in the study. If we assume that the percentage stays the same, that means roughly 56 females would have incorrectly labeled the images if you balance the genders. Is a difference of ten or so people significant, or could it be accounted for by random chance or other factors (see issue 1)?
Similarly, the paper talks at some length about the fact that men rated the interested photos as friendly, but when you do the math, 21 males and 9 females (roughly 16 if you adjust for gender balance) mistook the intentions. Is a difference of five or six people significant?
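Since both of those boil down to "is this gap bigger than chance?", here is a back-of-envelope sketch (Python with scipy; the counts are the ones I quoted above, and a chi-square test on the 2x2 table is my choice of check, not necessarily the analysis the authors ran):

    # Back-of-envelope check, assuming the counts I pulled from the paper are right.
    from scipy.stats import chi2_contingency

    N_MEN, N_WOMEN = 178, 102

    def check(label, men_err, women_err):
        # 2x2 table: rows = gender, columns = made the error / did not.
        table = [[men_err, N_MEN - men_err],
                 [women_err, N_WOMEN - women_err]]
        chi2, p, _, _ = chi2_contingency(table)
        # Scale the female count up to a 178-person group for the "balanced" comparison.
        adjusted = women_err * N_MEN / N_WOMEN
        print(f"{label}: men {men_err}/{N_MEN} vs women {women_err}/{N_WOMEN}"
              f" (~{adjusted:.0f} if balanced), p = {p:.2f}")

    check("friendly photos read as interested", 67, 32)
    check("interested photos read as friendly", 21, 9)

If those p-values come out large, chance alone could plausibly explain gaps of this size, which is really all I’m asking the paper to rule out in plain numbers.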
Also on the subject of whether men misinterpret sexual interest as friendliness, the paper rather arrogantly says that even though their female subjects did not self-report that phenomenon, the authors were confident that their study was a better metric on the subject. I think we'll all grant that the plural of anecdote is not data, but I find the statement rather troubling.
First, unlike in the study, the men being reported on by the test subjects were responding to real interest, not an emotion assigned by a panel of people looking at staged photos. Second, attempting to compare the cues given by staged photos to the cues given in real life is a ridiculous comparison - there are differences in duration, other types of contact, other senses... It's apples to oranges.
There are situations where men misread sexual interest as friendliness, but I don’t think the effect is anywhere near as significant as the authors would like it to be.