SOC: "How `awful' is `awful,' and is my `awful' the same as yours?"

From: Chris Rasch (crasch@openknowledge.org)
Date: Mon Apr 23 2001 - 12:50:15 MDT


N.Y. Times
January 2, 2001

Researcher Challenges a Host of Psychological Studies

By ERICA GOODE

Here is the problem, as Dr. Linda Bartoshuk sees it: Say that two men,
call them Richard and John, are both suffering from depression, and a
researcher wants to find out if a particular medication will offer them
relief.

Asked to rate the intensity of his depression on a scale of 1 to 10,
Richard selects a 6. John, given the same rating scale, also picks a 6.
But does John feel the same degree of depression as Richard?

Many researchers, said Dr. Bartoshuk, a psychologist at the Yale
University School of Medicine and an expert on taste perception, assume
the answer is yes, that, in effect, a 6 is a 6 is a 6.

But in fact, Dr. Bartoshuk said, nobody really knows, since depression,
like many internal experiences, is subjective.

And therein lies an error that compromises many psychological studies,
she believes, perhaps calling their findings into question.

Dr. Bartoshuk has described the error, which pops up in studies of
everything from depression and taste perception to pain, prejudice,
eating disorders and self-esteem, at several professional meetings. And
she has written several papers detailing how the mistake is made.

But not all of her colleagues have reacted with gratitude.

"There seems to be a lot of venom," Dr. Bartoshuk said. "I've gotten
nasty letters that basically said I was a moron."

As Dr. Bartoshuk describes it, the problem is simple but easily
overlooked. It creeps in, she says, when researchers use a common
measurement technique, the rating scale, to compare different people's
subjective experiences. Undetected, it can produce distorted or even
backward results.

The problem can be corrected if studies are properly designed to take
account of the relative nature of the subject's responses, Dr. Bartoshuk
said.

But in a review of eight volumes of the journal Physiology and Behavior,
Dr. Bartoshuk said, she found the error in 17 of 68 human studies that
used rating scales. In some cases, the mistake completely invalidated
the studies' findings, she said.

Mapping the inner world is the goal of much psychological research. And
for many researchers, rating scales offer a convenient method to explore
what their research subjects are feeling and thinking.

In a typical study, subjects are asked to indicate, by locating
themselves on a 9- or 10-point adjective scale, the intensity of a
feeling or sensation.

A scale measuring hunger, for example, might ask the subjects to rate
their appetite levels from 1 ("not at all hungry") to 10 ("very
hungry"). An experiment on taste perception might invite ratings of the
taste intensity of a particular substance from 1 ("very weak") to 9
("very strong").

Rating scales are highly effective when researchers want to find out how
the same subjects' feelings, attitudes or sensations change over time or
in different situations. A psychologist, for example, might ask a group
of subjects to rate how depressed they felt, give them six weeks of
psychotherapy, and then have them repeat their ratings to see how
helpful the therapy was.

The scales are also useful when researchers randomly assign subjects to
different groups (giving each group a different drug, for example) and
compare the results.

But many investigators also use rating scales for another purpose: they
want to know how two groups of subjects, who differ in some way,
experience things differently. A researcher might ask, for instance,
whether men were more jealous than women, whether cocaine addicts craved
cocaine more than smokers craved nicotine, or whether people with
different psychiatric diagnoses suffered from similar or different
levels of depression.

And this, said Dr. Bartoshuk, is where the mistake begins.

Such studies, she said, rest on the notion that the adjectives that mark
the points on the rating scale mean the same thing to everyone.

But the reality is just the opposite: because internal experiences are
subjective, the same adjective may mean something very different to
different people.

A famine victim's "very hungry," for example, probably indicates a level
of hunger far higher than the "very hungry" of a Fortune 500 C.E.O.
questioned an hour before dinner. Similarly, "very depressed" likely
suggests an emotion far more extreme to a person suffering from manic
depression than to someone whose emotional life is generally on an even
keel.

"Everybody knows that adjectives are entirely relative," Dr. Bartoshuk
said, "yet scientists pick up these adjectives and use them on scales
like they are absolute." Or, as she posed the question in a recent
article, "How `awful' is `awful,' and is my `awful' the same as yours?"

In some studies, the fact that two subjects may interpret a scale
differently makes little difference, because what the scale measures is
not central to the goal of the study.

But in other cases, the assumption that people who choose the same
adjective mean the same thing by it can wreak havoc.

Some of the most striking examples of this, Dr. Bartoshuk has found, are
in taste research, where the confusion has obscured important
differences in perception that have an effect on diet, and thus upon
health.

Her own studies, for example, focus on genetic differences in taste
perception. In particular, Dr. Bartoshuk has identified three different
types of tasters: supertasters, medium tasters and nontasters.

Supertasters not only live in a much more intense taste world than
medium tasters or nontasters, she has found. They also have a higher
density of taste buds on their tongues, providing a way for researchers
to easily recognize them.

But a funny thing happens if an experimenter tries to compare the taste
experiences of supertasters and nontasters using a traditional rating
scale.

A "strong" taste for a supertaster is much more intense than a "strong"
taste for a nontaster. In fact, supertasters are operating on a much
larger taste scale altogether, with a higher intensity ceiling and a
greater distance between points.

But when the ratings of supertasters are recorded on the same scale as
those of nontasters, these differences in perceived intensity are
obscured. Large differences between the two groups look smaller. And
small differences disappear altogether, or even appear to go in the
opposite direction.
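
The compression and reversal described above can be sketched with a toy
model. This is not Dr. Bartoshuk's analysis: the function name, the
assumption that each rater stretches the scale to fit his or her own
strongest sensation, and every number are hypothetical, chosen only to
make the arithmetic visible.

```python
# Illustrative sketch only: every number below is made up, not study data.

def label_rating(intensity, personal_max, points=9):
    """Place a perceived intensity on a 1..points adjective scale.

    Assumes the rater stretches the scale so that their own strongest
    imaginable sensation (personal_max) lands on the top point.
    """
    fraction = intensity / personal_max
    return round(1 + fraction * (points - 1))

# Hypothetical perceived intensities in shared units that no real
# experimenter can observe directly.
supertaster = {"personal_max": 100.0, "PROP": 80.0, "salt": 40.0}
nontaster = {"personal_max": 40.0, "PROP": 10.0, "salt": 22.0}

for substance in ("PROP", "salt"):
    s = label_rating(supertaster[substance], supertaster["personal_max"])
    n = label_rating(nontaster[substance], nontaster["personal_max"])
    print(substance, "supertaster:", s, "nontaster:", n)
```

With these assumed values, the eightfold gap in PROP intensity shows up
as 7 versus 3, and the salt comparison flips: the supertaster, who
perceives salt more intensely, reports 4 while the nontaster reports 5.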

As a result, some researchers, using traditional rating techniques, have
failed to find large differences between supertasters and nontasters,
even for substances, like the chemical known as PROP, to which
supertasters are known to be especially sensitive.

Dr. Bartoshuk and other investigators, by using methods that get around
the subjectivity problem (like asking subjects to anchor their responses
by referring to an entirely independent scale), have been able to
demonstrate large differences.
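
One way to picture anchoring to an independent scale is the sketch
below. The article does not say which standard was used or how the
ratings were processed; the tone standard, the function name and all
the numbers are assumptions for illustration, and the approach only
works if both groups really do perceive the standard alike.

```python
# Illustrative sketch only: the ratings below are invented for the example.

def anchor_to_standard(raw_ratings, standard="tone"):
    """Re-express each rating as a multiple of the subject's own rating
    of a standard stimulus from a different sense.

    Assumes (hypothetically) that both groups perceive the standard
    equally, so it can serve as a shared yardstick.
    """
    reference = raw_ratings[standard]
    return {name: value / reference
            for name, value in raw_ratings.items() if name != standard}

# Each subject picks numbers freely, so raw values are in personal units.
supertaster_raw = {"tone": 20.0, "PROP": 80.0}
nontaster_raw = {"tone": 50.0, "PROP": 25.0}

print(anchor_to_standard(supertaster_raw))  # PROP relative to the tone
print(anchor_to_standard(nontaster_raw))
```

In this made-up case the raw PROP numbers (80 and 25) differ by about a
factor of three, but once each is divided by the same subject's tone
rating the gap is 4.0 versus 0.5, a factor of eight.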

While Dr. Bartoshuk suspects that the rating scale mistake may be most
common in studies of sensory perception, two social psychologists, Dr.
Monica Biernat of the University of Kansas and Dr. Melvin Manis of the
University of Michigan, have uncovered the same problem in their own
field, in slightly different form.

In a series of studies, for example, Dr. Biernat and Dr. Manis have
shown that researchers cannot assume that subjects always mean the same
thing when they apply descriptions like "tall," "aggressive" or
"financially successful" to others.

Rather, the meaning of such words shifts, depending on whether the
person being described is male or female, white or African-American, or
belongs to another group that is often stereotyped.

In one study, for example, the subjects were shown photographs of men
and women at a farmer's market and asked to rate how financially
successful they believed the person in the photo to be.

Men and women in the photographs were rated as equally successful by the
subjects, Dr. Manis said. Given the same photos and asked how much money
the person in the photograph earned, however, the subjects judged the
women to have lower salaries than the men.

Similarly, a man and a woman may both be rated as "very tall." But when
people are asked to guess the height of the man and the woman in feet
and inches, they will judge the man to be taller.

"It's a little bit of a paradox," Dr. Manis said. "In much of our
language, these adjectives are adjusted to fit the category you're
talking about."

"You can say, `I know about a large chihuahua and, amazingly, you can
fit it into a small car,' " he added. "It's because the meaning of the
words `large' and `small' depend on whether you are talking about cars
or chihuahuas."

Dr. S. S. Stevens, a renowned psychophysicist at Harvard, made the same
point 40 years ago, writing that, "Mice may be called large or small,
and so may elephants, and it is quite understandable when someone says
it was a large mouse that ran up the trunk of the small elephant."

Like other early developers of scaling techniques, Dr. Stevens, who
pioneered a method called "magnitude estimation," did not use his scale
to compare the responses of different groups of subjects: if scientists
do not know whether they are dealing with elephants or mice, Dr. Stevens
realized, it becomes anyone's guess what small and large really mean.

The difficulties of such comparisons also became evident early on to
cross-cultural researchers, who found they had to take into account the
different shades of subjective meaning that colored their subjects'
views of the world.

Dr. Douglas Wedell, an associate professor at the University of South
Carolina and an expert in rating scale methodology, recalled a study in
the 1940's that asked people from many different countries to locate
themselves on a "ladder of happiness."

The researchers, he said, were startled to discover that the desperately
poor residents of famine-torn regions of India rated their happiness at
close to the same level as American suburbanites whose homes were
stocked with modern conveniences and who always knew where their next
meal was coming from.

"Across the world, just about everybody was happy," Dr. Wedell said,
though what exactly happiness meant obviously differed greatly from
culture to culture.

The scaling mistake that Dr. Bartoshuk has drawn attention to, he
agreed, "is a major error that you'll find in major places."

Yet not everyone agrees. When a piece about Dr. Bartoshuk's concerns
appeared in The Monitor, a publication of the American Psychological
Association, for example, one letter writer sneered, "It seems likely
that the response to the problem is small because the problem is, too."

Other colleagues, Dr. Bartoshuk said, dismiss the error as rare.

Still, when she explains the situation at meetings, she said, people
often look at her illustrations and gasp.

"It's one of those `aha!' experiences," Dr. Bartoshuk said. "This is all
terribly obvious when you think about it. The question is, how did such
mistakes get into the literature?"

--
Use e-gold?  Send me two cents:
http://2cw.org/257121&twocents@openknowledge.org
Read the _Wall Street Performer Protocol_:
http://www.openknowledge.org/writing/open-source/scb/


