I don't know if it's rotten, but it smells funny

Apr 27, 2007 01:10

To me, psychology is interesting and frustrating.



Here's a quick rundown (in retrospect, it's not as quick as I thought it would be...) of things as I see them. You have these things that you are interested in knowing about (elements of how people think, feel, perceive, remember, learn, etc.) because presumably they have some predictive power; we call these things "constructs". The relationships between constructs are called "theories".

Now, people want to judge whether theories are consistent with what they see (or might end up seeing in the future) in the real world. To do this, they devise procedures ("operational definitions") that produce observable variables; the measurements we make of these observable variables are supposed to relate to the constructs in some fairly direct way. So, say we want to know about people's attitudes towards peanut butter. Since we can't penetrate their mental essence and come to know directly how they feel about peanut butter, we must instead devise a procedure- like giving them a questionnaire about peanut butter- that gives us something we can observe, and that is hopefully an accurate measure of how they feel about peanut butter.

Some of these procedures are just there to measure things. Other procedures are there to affect things. This allows experiments to mirror the structure of the theory that they're trying to test. That is, the theory says that constructs A and B cause changes of kind C in constructs D and E. We use our affecting procedures to change the values of observable variables LikeA and LikeB. We use our measuring procedures to find out what happens to observable variables LikeD and LikeE. The hypothesis is that LikeA and LikeB will follow pattern C with respect to LikeD and LikeE. Normally, we think of hypotheses as being "derived from" theories because of that common pattern C underlying them both (the hypothesis and the theory), and because LikeA is "just like" A, LikeB is "just like" B, and so on.
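If it helps to see that mapping laid out all at once, here is a minimal sketch in Python. Everything in it- the names, the numbers, and the particular "pattern C"- is an invented illustration of the construct-to-observable correspondence, not anything from a real study.

    # A toy sketch (all names and numbers invented): the theory relates
    # constructs, while the experiment relates observable stand-ins.

    # Theory (over constructs): A and B produce pattern C in D and E.
    # Experiment (over observables): manipulate LikeA and LikeB,
    # then measure LikeD and LikeE.
    operationalizations = {
        "A": "LikeA",  # e.g. how tasty a sandwich we serve
        "B": "LikeB",  # e.g. how charming the person serving it is
        "D": "LikeD",  # e.g. questionnaire score afterwards
        "E": "LikeE",  # e.g. eagerness to grab a free sandwich
    }
    for construct, observable in operationalizations.items():
        print(f"construct {construct} is operationalized as {observable}")

    def pattern_C(a, b, d, e, tolerance=1.0):
        """A made-up stand-in for "pattern C": D and E should rise roughly
        in step with A + B. The real pattern is whatever the theory claims."""
        predicted = a + b
        return abs(d - predicted) < tolerance and abs(e - predicted) < tolerance

    # The hypothesis is the theory's pattern restated over the observables.
    observed = {"LikeA": 2.0, "LikeB": 1.0, "LikeD": 3.2, "LikeE": 2.9}  # fake data
    held = pattern_C(observed["LikeA"], observed["LikeB"],
                     observed["LikeD"], observed["LikeE"])
    print("Did pattern C hold in what we saw?", held)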

After we affected some variables and measured others (i.e., after we performed an experiment), did pattern C hold in what we saw? If not, then this experiment served as evidence against the theory in question.

If you haven't been exposed to this before, let it sink in. According to some psychological studies, being able to call to mind very quickly the meanings of terms in a discussion makes the discussion much easier to remember and understand. So go back and reread if you need to.

A problem in psychology research, as I see it, is that theories are too hard to kill. Let us say that I believe in the theory that people's attitudes towards peanut butter do not change in the short term, but only slowly worsen as they age.

My arch-nemesis devises the following diabolical experiment- he will give people questionnaires regarding their attitudes towards peanut butter. Then, he will give people the tastiest peanut butter sandwiches he can, served by the most attractive graduate student/models he can find. Then, he will give people another peanut butter questionnaire. My arch-nemesis thinks that my theory would predict that the questionnaire scores should remain the same, no matter how tasty the sandwiches and no matter how attractive the grad student/models. He also thinks that my theory is wrong, that the questionnaire scores will improve, and that once I see the improving questionnaire scores, I will admit that my theory is wrong.

The results of the experiment come back. People put very positive evaluations of peanut butter on their questionnaires after they'd had those tasty, tasty sandwiches- far more positive than their questionnaires before the sandwiches. This seems to run directly counter to my theory about people's attitudes towards peanut butter slowly declining over time. My arch-nemesis twirls his mustache in delighted anticipation of the death of my theory. He writes a paper and gets it published in the journal Science.

Then, I point out that people's attitudes towards peanut butter are not properly being measured by the questionnaire. What about closeted peanut butter addicts who are ashamed of their addiction? They would never put down answers that really reflected their attitudes! Since the questionnaires weren't measuring people's attitudes, the observed variables were not "just like" the constructs they were supposed to correspond to. That means that to me, his experiment does not serve as evidence against my theory. I have my own style of measuring the construct of peanut butter attitudes- I measure how quickly people pick up and eat a peanut butter sandwich that I have placed under an inviting sign right next to them.

I do my own experiment, where I place a public sandwich, measure how long it takes for the subject to pick it up and eat it, have my hottest grad student administer a super-tasty sandwich to the subject, and place another public sandwich, measuring how long it takes for the subject to pick that last sandwich up and eat it. This time, the subjects take longer and longer to pick up and eat the sandwiches presented to them. HA! In your face! Since time-to-pick-up-and-eat-a-sandwich is an accurate measure of people's attitudes towards peanut butter, and people took longer and longer to pick up the sandwiches, people's attitudes did indeed exhibit the slow decline that I predicted! My reply to my arch-nemesis's experiment is received enthusiastically by the journal Nature.

The point is that there is a weak spot in the chain of reasoning required to kill a theory. People can argue until they are blue in the face about whether a given experiment really supports or refutes a theory, because the experiment may or may not have measured variables that are "just like" the constructs that the theory is talking about. So the people who believe in a theory can, with their twisting explanations, make their theory "squirm out of the way" of experimental results that should've been damning.

There is a solution to this. Experimenters can, for every construct their experiment involves, follow multiple independent procedures to measure that construct. So a better experiment than either of the above would be one where we measured people's attitudes towards peanut butter in several different ways, checking to see how many of these different ways followed the patterns we were expecting. I might measure peanut-butter-related attitudes by showing subliminal pictures of peanut butter and measuring how many squiggles the polygraph they're hooked up to makes, or with a questionnaire, or by asking their friends, or by asking how much they would be willing to pay for different quantities of peanut butter, etc. It's too bad that this is expensive and time-consuming, and that sometimes it produces patterns of results that are hard to interpret (because some variables exhibit the expected changes, and others supposedly related to the same construct do not).
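To make "checking to see how many of these different ways followed the patterns we were expecting" concrete, here is another small sketch. The measures and every number in it are made up, and the last measure is deliberately written so that it disagrees, since that's exactly the hard-to-interpret situation I mean.

    # Toy convergent-measures check (all measures and numbers invented).
    # Each entry: (name, score before the tasty sandwich, score after,
    #              direction the theory predicts: +1 = should go up,
    #              -1 = should go down).
    measures = [
        ("questionnaire score",                         3.1,  4.6, +1),
        ("willingness to pay for peanut butter ($)",    2.0,  2.4, +1),
        ("friend's rating of the subject's enthusiasm", 3.4,  3.6, +1),
        ("seconds before grabbing a free sandwich",    40.0, 55.0, -1),
    ]

    agree = 0
    for name, before, after, expected_sign in measures:
        change = after - before
        as_expected = (change * expected_sign) > 0
        agree += as_expected
        print(f"{name}: {before} -> {after} "
              f"({'agrees' if as_expected else 'disagrees'})")

    print(f"{agree} of {len(measures)} measures followed the expected pattern")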

But there's an even worse problem that ends up slowing down the "progress" of psychology research (or at least making it much more irritating).

The constructs that we make theories about don't just come out of nowhere. We have to come up with them. Notice that implicit in my theory about peanut butter attitudes is the idea that there are such things in people's mental processes that we will call "peanut butter attitudes", which itself assumes that people have attitudes. These are perfectly common-sense attitudes to have :) but that doesn't mean that this supposed element of people's mental life is a productive concept to think about.

Before we even make up our theories (kind of), we have to slice up people's mental life (or whatever other domain we are interested in) into categories, and subcategories, and we give these categories names as constructs so our theories can refer to them. If we end up categorizing things in stupid ways, we're going to have a hard time explaining things with those categories. In fact, this could cripple us for as long as we adhere to those categories.

Our act of categorizing the various parts of the domain we're interested in (into various constructs we're interested in) can be guided by our analogies, but if we're going to do this we must ensure that our analogies have some depth to them, or we won't be guided very well for very long. I've read papers that variously describe the changing position of that part of our visual field that we are paying attention to as "like the motion of a spotlight" or "like the motion of a frog's tongue". Those comparisons are dumber than the dumbest comparison you could otherwise come up with (i.e., this one).

We cannot use experiments to destroy our systems of categorization and our analogies. Experiments are, by necessity, framed in terms of the existing categories. The categories are on another level of explanation, irritatingly immune to experiment. Because of their influence and persistence, they must be chosen with the utmost care- and, if I had my way, some way of making them explicit would be present in every paper (or, at least, every journal). Alternative systems of categorization should not be treated as mutually exclusive any more than different languages are. Some systems of categorization will be more concise for the same level of accuracy in prediction, but to make good determinations of which systems of categorization are most concise, we must tolerate and deal in multiple systems of categorization simultaneously.

There is one last barrier I see. This time (unlike in previous cases) it's cultural.

In the 1830s, Charles Babbage designed a wonderful device called the Analytical Engine. It was the first programmable computer ever designed. Of course, it wasn't the first one built- other programmable computers got built first, in the twentieth century. In fact, a complete Analytical Engine has never been constructed; the closest we've come, Babbage's Difference Engine No. 2, was only built in the 1990s, long after its design. So what was the hold-up? The problem was that the machine required very heavy slabs of metal to be cut very precisely. Anything that, in Babbage's time, could've been cut precisely enough wouldn't have been strong enough. Anything that was strong enough couldn't be cut precisely enough. And so construction had to wait until materials science and precision machining had improved enough.

Around 1900, physics was at a turning point. Physicists were faced with a class of apparently unsolvable problems (which most of them just willfully ignored). They were held back by math. Math just wasn't good enough to let physics get beyond its rut. Then, after math had matured, physicists had the mental tools to make sense of their strange, till-then unsolvable problems. They then proceeded to apply the new mathematical tools to their problems to form "quantum mechanics", and physics took a big leap forward.

And now, I think psychology is faced with its own turning point. This time, there are many great mental tools out there- math, stats, the works- that psychology researchers (I'm overgeneralizing here) don't have access to. Unlike in previous eras, the tools exist. But there's too much to learn, and people don't have enough time to learn it all. For example, to properly build my personal set of categories, I would like to steep myself in the fields of stochastic processes (how things act when they are nudged randomly a little bit at a time), every statistical procedure I can learn, neural anatomy and dynamics, developmental psychology, and educational psychology. That's going to take forever- probably half of my life.

"Okay, okay, not too bad", you might be thinking. "We can just collaborate with people from other disciplines that do know all that other non-psychology stuff, and we're in the clear." This is a nice thought, and it will probably do us all some good. But the problem is that other people must be able to read and understand your work for it to be reviewed favorably for publications in journals. If I were to come up with some sweet idea and try to get it published- but (for example) it had a statistical technique that other psychology researchers were unfamiliar with- I'd have a harder time getting it published. I would be told by reviewers to use "more standard" techniques, even if those techniques were not appropriate for the experimental circumstances or for my underlying idea. Our collective, interdisciplinary group might be able, as a system, to understand the new and revolutionary concepts we want to get across. But the individuals that review your paper for publication will not be part of an analogous interdisciplinary group, and they will not like your departures from convention and what they already understand.

I'm not sure what the remedy is for this. It might be necessary to start new journals founded on the principle that interdisciplinary groups of reviewers should collaborate on the review process. While we're at it, we should probably reduce the price of subscription to something that individuals (not just institutions) can afford, because if it's being reproduced electronically, it's not like we're investing a huge amount of effort and expense into typesetting it anymore.

So... Sorry for the long post. That ended up being kinda big.