Introduction To Sampling Arguments
For the purposes of this course, the word "argument" will be used to refer
to any attempt to persuade another person that some claim is
or is not true. This chapter will teach the basic analysis of a special kind
of argument I call a "sampling argument." Sampling arguments are commonly
used to support general statements. Generalizations are
statements that cover the whole of some population, such as
Americans, wombats, the water in the oceans, left-handed Armenian
mole-diggers, Scotsmen with Irish names, tea-drinkers, trees, people who do
horrible things to turnips... well, you get the idea. A sampling
argument is an argument that starts with a "sample,"
a small group taken by some method from a larger population,
and then attempts to persuade us that a feature clearly
seen in the sample must therefore also be a feature of the population.
Main Topic: Sampling
The essence of a sampling argument is the "sample." Usually, populations
are so large that we cannot reasonably test the state of
every member of that population. For instance, if we wanted to know what
proportion of Scotsmen get tipsy (slightly drunk) on Hogmanay,
we could not possibly hire enough observers to follow every Scotsman
around on the evening of December 31st. (Especially if we count female
Scots as "Scotsmen." Oh, let's just call them "Scots.") So we're scre... I
mean, so we have to fall back on looking at a much smaller
number of Scots and extrapolating the results to all
haggis-eatin', kilt-wearin' caber-tossers. (This is perhaps an unfair
characterization of the Scots. Very few of them actually toss cabers.) So
let's just hire people to follow around a randomly selected group of one
million Scots next Hogmanay and to report on whether or not they get
tipsy. Say that 75% of these randomly selected Scots get tipsy on
Hogmanay. We could then make the following argument.
Exactly 75% of our sample got tipsy this Hogmanay, therefore 75% of all
haggis-eaters got tipsy this Hogmanay.
Here's how the terminology of generalization matches up with this
argument.
Facts
Population: All Scots (several million of them).
Sample: One million randomly selected Scots.
Feature being tested: Tipsiness.
State of the sample: 75% tipsy at this Hogmanay.
Conclusion drawn from those facts
State of the population: 75% tipsy at this Hogmanay.
This is how a generalization works, if it works at all. A sample is
taken, and it is argued that the state of the sample must be
the same as the state of the population. If the state of the sample cannot
reasonably be explained without assuming that the population has the same
state, the argument is good. If we can reasonably explain the
state of the sample without assuming that the population has the
same state, the argument is no good, lousy, bogus, wack, heinous.... I'll
stop now.
For another example, imagine that two people, call them "Jeeves" and
"Wooster," are trying to figure out the overall composition of the
following population. Imagine also that neither of them can see the
population the way you can. (You can see that this
population is extremely well mixed. In fact, there are only two deviations
from perfect mixing. They appear in the top left and bottom right corners
of the field. By some strange coincidence, that's where Jeeves and Wooster
take their samples from.) They know that it's composed of 2,600 colored
dots, but that's about it. Neither of them has any idea of how the dots
are distributed, or anything else besides the fact that it's made up of
dots. And of course, neither of them knows that the population is made up
of 650 red dots (25%), 650 blue dots (25%) and 1,300 green dots
(50%). Now Jeeves takes a sample from the top left corner of the
population (red line) while Wooster takes a sample from the bottom right
corner (blue line). Each of them then makes a claim about the composition
of the population based on their samples.

Jeeves's sample is 50% green, 25% red and 25% blue. So he claims that the
population is 50% green, 25% red and 25% blue.
Wooster's sample is 25% green, 25% red and 50% blue. So he claims that
the population is 25% green, 25% red and 50% blue.
That's quite a big difference. Who's closer to being right and why?
The reason Jeeves's argument is better than Wooster's argument is that
Jeeves's sample is big enough to swallow its imperfection in the
mixing of the population (which means that his sample is representative
of the population) while Wooster's sample is so small that its
imperfection crosses the sample border, distorting the result (which means
that his sample isn't representative of the population).
Are these samples too small? Well, that depends on what we know about the structure
of the population.
Sample Size
We saw above that it's possible to have a sample that's way too
small to accurately represent the population it's taken from. However,
it is sometimes the case that a population is structured
in such a way that even a small sample can be perfectly representative, if
it's taken the right way. A population is not always arranged as a chaotic
mixture of individuals. Some populations are arranged in such a manner
that we can take a very small sample with absolute confidence that the
result will perfectly represent the composition of the population. For
instance, consider the population of dots shown below. Imagine that we
know that the population is structured in the way shown, but we don't know
the colors of any of the rows. Now imagine we take the very, very, very,
very small sample of exactly four dots comprising the first dot in each of
the first four rows, as shown in the top left corner of the image. That's
a sample of four out of four thousand. That's one per thousand, which
means one tenth of one percent, or 0.001. Is that too small?

Our sample comes out 50 percent red, 25 percent blue and 25 percent
green. Given that we know the structure of the population,
what are the chances that the population is 50 percent red, 25 percent
blue and 25 percent green?
They are about as good as chances get. Therefore, the following argument
is very bad. (Technically, it commits what we call a red herring fallacy.)
It hasn't been
proved that the dots in the picture above are 50% red, 25% blue and 25%
green because the sample upon which that generalization is based is only
0.001 of the population, which is waaaaaaaay too small a sample.
The key fact here - the thing that makes this argument
bad - is that the population is completely structured in
alternating homogeneous rows of red, green, red and blue dots. It is this
highly organized structure that allows a minuscule sample of just four
dots to perfectly represent the composition of the whole population.
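If you'd like to check this for yourself, here is a minimal sketch. It assumes, purely for illustration, that the 4,000 dots are laid out as 40 rows of 100 dots each (the row count and row length are my guesses, not something stated above), with the row colors cycling red, green, red, blue.

```python
from collections import Counter

ROW_COLORS = ["red", "green", "red", "blue"]  # row pattern described in the text
NUM_ROWS, ROW_LENGTH = 40, 100  # assumed layout: 40 x 100 = 4,000 dots


def composition(dots):
    """Return each color's share of the given dots as a fraction."""
    counts = Counter(dots)
    return {color: counts[color] / len(dots) for color in counts}


# Build the population row by row; every dot in a row shares the row's color.
population = [
    [ROW_COLORS[r % len(ROW_COLORS)]] * ROW_LENGTH for r in range(NUM_ROWS)
]

# The sample: the first dot in each of the first four rows.
sample = [population[r][0] for r in range(4)]
all_dots = [dot for row in population for dot in row]

print(composition(sample))
print(composition(all_dots))
print(len(sample) / len(all_dots))
```

Under those assumptions the two compositions come out the same, even though the sample is only 0.001 of the population. Change the layout so the number of rows isn't a multiple of four and the match stops being exact, which is the whole point: it's the structure, not the size, that does the work.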
As a matter of fact, there is no limit to how proportionally small a sample
can be. To see this, imagine a population of infinitely many dots,
part of which is shown below. (The rest of the dots extend off your screen
to the right.) This population is structured as you see here, in four rows
of dots, each row being composed of dots of exactly the same color.

How big a sample do you need to tell the composition of this population?
Will four dots do? It will if each of those four is the first dot
in its respective row. Now, that is an infinitesimal sample,
which means it's equal to one divided by infinity, but that doesn't matter,
because the population structure makes that proportionally infinitesimal
sample perfectly representative of the whole.
There are two lessons here. The first is that even an infinitely small
sample can be representative if it's properly taken from a population that
has the right structure. The second is that it's possible to fail to convey
the most important facts about this situation. To see this, think about three
things. First, think about a particular population that is very similar to
the one pictured above. Second, think about two opposing arguments about
that population. And third, think about two different critiques of
the weaker of those two arguments. It is those two critiques that
I want you to focus on here.
Key Fact: The population is arranged in
equally-sized rows of identically-colored dominoes.
Perfect Mixing

Another way to get an accurate result with a very small sample is if a
population is perfectly mixed. Imagine another infinitely large
population in which individuals are so perfectly mixed that every part of
the population looks like the following picture. (Notice that this is NOT
a random distribution of dot colors. It's a carefully structured
distribution. A random distribution would be unevenly mixed, not
smoothly mixed like this one.)
Try to find a four-square group, or a contiguous line of four dots that isn't
a representative sample for this population.
Now imagine blindly picking dots from random places scattered
through the population. How many would you have to pick to guarantee a
representative sample? Not many!
Now imagine you work for a petroleum company. You check the composition of
oil products so the company can decide how each tanker load will be
processed. Your company's tankers contain a pumping system that circulates
the oil between all the tanker's oil-carrying compartments. All the oil is
moved, and turbulence from the pumping process mixes the oil products so
thoroughly that every centiliter in that tanker is absolutely identical to
every other centiliter in that tanker. Given that a liter is one hundred
centiliters, would one liter be a big enough sample to test the composition
of the oil mixture in a tanker holding a billion liters
of oil products?
The point here is that small sample size may make the sample
untrustworthy, but there may be special circumstances that make
this particular sample an accurate representative of the population, even
though it is way smaller than we would normally accept as a good sample.
Estimating Sample Size: Variables and Values.
If 1% can be an adequate sample, 50% can be inadequate. Imagine that Noah
was an educational administrator who had to rely on state grants for his
funding. God issues a grant that will allow Noah to collect two of every
animal, but Noah's immediate superiors insist he spend half of God's
money on computers. Thinking outside the box, Noah adapts to the situation
by only including one of every animal. If aliens later came
across Noah's Ark bobbing on the flood waters, how big a sample would they
need to accurately represent the animal passenger list? Say they picked
50% of the animals at random. Would that give an accurate picture? Would
90% be enough to give a picture that was accurate to within 1%?
When we're worrying about sample size for a perfectly random sampling
method, it is sometimes useful to talk about variables and their values.
Consider Noah's Ark, but this time without any middle-management between
God and Noah. The animals march on board two-by-two, one couple of each kind.
In this situation, sex and species can both be considered variables, each
with its own characteristic range of values. Sex is a variable with only
two values, and thus a sampling argument concerning the sexes of the
animals would only need a fairly small random sample. Species, on the
other hand, is a variable with thousands of possible values. Given that
Noah's Ark contains only two of each kind of animal, a sampling argument
concerning the distribution of species on the Ark would need a sample size
of considerably more than fifty percent, if it was based on a truly random
sample. (A non-random sample could do it accurately at only fifty percent,
if the sample was chosen in the right way.)
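Here is a rough calculation of why. The sketch below assumes a hypothetical Ark carrying 1,000 species (I only said "thousands" above, so that number is made up for illustration) and a simple random sample drawn without replacement. It works out the expected share of species that show up at least once in the sample.

```python
from math import comb


def expected_species_seen(num_species, per_species, sample_size):
    """Expected fraction of species with at least one member in a simple
    random sample drawn without replacement."""
    total = num_species * per_species
    # Chance that every member of one given species is left out of the sample.
    p_missed = comb(total - per_species, sample_size) / comb(total, sample_size)
    return 1 - p_missed


NUM_SPECIES = 1000  # hypothetical figure, for illustration only

for fraction in (0.5, 0.9):
    sample_size = round(NUM_SPECIES * 2 * fraction)
    seen = expected_species_seen(NUM_SPECIES, 2, sample_size)
    print(f"{fraction:.0%} sample: about {seen:.1%} of species seen")
```

With two animals per species, a random 50% sample can be expected to miss about a quarter of the species entirely. Calling `expected_species_seen(NUM_SPECIES, 1, 500)` covers the one-of-every-animal Ark, where a 50% sample sees exactly half the species.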
How big is big enough? Firstly, the issue of whether a particular sample
is big enough doesn't depend on what percentage of
the population is included. If the above well-mixed population was only
four individuals large, only a 100% sample would be big enough! Anything
less than four would leave out at least one color! If the above population
was 8 individuals, a 50% sample would do. If it was 16, a 25% sample would
work. If it was 32...
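The arithmetic behind those figures is just the number of colors divided by the size of the population. A minimal sketch, assuming a perfectly mixed population with four colors, where the smallest workable sample holds one individual of each color (the function name is mine, made up for illustration):

```python
NUM_COLORS = 4  # the four colors in the well-mixed population


def minimum_sample_fraction(population_size, num_colors=NUM_COLORS):
    """Smallest fraction of the population that can still hold one
    individual of every color."""
    if population_size < num_colors:
        raise ValueError("population is too small to contain every color")
    return num_colors / population_size


for size in (4, 8, 16, 32):
    print(size, f"{minimum_sample_fraction(size):.1%}")
```

The required number of individuals stays at four while the required percentage keeps shrinking, which is why percentage alone tells you nothing about adequacy.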
The minimum necessary sample size depends on the number
of different relevant properties individuals can have, and on the degree
of mixing in the population. In the well-mixed population above, the
number of different relevant properties is four, because there are four
colors, and the population is perfectly mixed. If the number of
different properties was larger, or if the population was less well
mixed, minimum necessary sample size would be larger.
So, there are two different things to think about
when you consider the size of a sample relative to its population:
- How large is the sample compared to the
number of different relevant properties individuals in the
population can have?
- How is the population structured? Is
it evenly or unevenly mixed? Are its members arranged in such a way
that the given sample is sure to be representative of the
given population?
Finally, I have noticed that some students have a
tendency to say that a sample size is too small when they cannot think of
anything else to find wrong with the argument, or where the sample size is
ten percent or less. Don't do this. If you cannot see
any reason why this particular sample is too small for this
particular population, given this particular sampling method
and the structure of this particular population, then the sample
size is not too small. Arguments are only bad when there are specific
reasons to find them bad. Saying that the sample size is too small when
you can't think of anything else is never a good idea.
Sample Age
Imagine you are an atmospheric scientist studying
inertium monoxide levels in the atmosphere at various points in history.
Like the rest of Earth's air, inertium monoxide does not react with any
other gases in Earth's atmosphere. Say that because of the way inertium
monoxide is produced and distributed, the level of inertium monoxide in
Earth's atmosphere at any point on Earth is never more than ten percent
more or less than the global average at that time. (If today's global
average is 1%, there's nowhere on Earth where the inertium monoxide levels
are lower than 0.9% or higher than 1.1%.) You recover an air sample that's
been held absolutely isolated for three thousand years at the base of a
really old glacier. (They can actually do this for samples that are
several hundred years old.) The sample contains 10% inertium monoxide, so
you can conclude that three thousand years ago, Earth's atmosphere held a
global average of between about 9 and 11 percent inertium monoxide.
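The bounds come from running the ten percent rule backwards: if a local reading is never below 0.9 times or above 1.1 times the global average, then the global average must lie between the reading divided by 1.1 and the reading divided by 0.9. A minimal sketch of that arithmetic (the function name is mine, made up for illustration):

```python
def global_average_bounds(sample_level, max_deviation=0.10):
    """Range of global averages consistent with one local reading, given
    that local levels never stray more than max_deviation (as a fraction
    of the global average) from that average."""
    lowest = sample_level / (1 + max_deviation)
    highest = sample_level / (1 - max_deviation)
    return lowest, highest


low, high = global_average_bounds(10.0)
print(f"{low:.2f}% to {high:.2f}%")
```

Notice that the age of the sample appears nowhere in this calculation. Age only matters if it gives us a reason to think the sample itself has changed.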
Is the sample too old? Not if the sample was absolutely isolated!
Remember, there's nothing in Earth's air that inertium monoxide can react with, so
the sample can't change over time. Isolation prevents sample gases from
escaping and gases from later atmospheres from getting in, so it can't
change that way either. So, in this case, a three-thousand-year-old sample
is good enough for a sound generalization, provided that all the other
factors are taken care of. Notice however that we can't use this sample to
generalize about today's atmosphere. Atmospheres can change quite
spectacularly over time. Imagine trying to generalize about the air in Los
Angeles today based on an air sample taken in 1902! This is why I use the
term obsolete, which means that we know that conditions have changed,
so that the sample is no good anymore. A sample can be very old
without being obsolete, and a sample can be obsolete without being very
old at all.
Here's a real-life example. More than about 4 billion years ago, the solar
system was nothing but a widely spread out mass of gas and dust particles
which was slowly but surely organizing itself into bigger and bigger
clumps, many of which banged into each other to make larger clumps. Our
Earth was one of those lumps. While the Earth was first forming, it was
hot and mostly molten, so the heavier materials gravitated to the center
of the lump and the lighter materials were forced up to the surface. The
heaviest materials became the Earth's core. Just before the Earth finished
forming, a really big lump smashed into it hard enough to kick some of
that core material up to the surface on the other side of the Earth. 4
billion years later, scientists found some of that material, figured out
what it was, and used it to figure out the exact chemical composition of
the Earth's core. Think about it. Not only is the few pounds of material
they used a tiny, tiny sample relative to the total size of the Earth's
core, that sample is 4 billion years old. However, the Earth's core has
been subjected to enormous heat, pressure and mixing by convection, so
it's extremely well mixed. Furthermore, there's no known substance that
could turn into nickel-iron over any timescale, so we have good reason to
think that the composition of that core has not changed in 4 billion
years, and that the composition of the pieces of core material that the
scientists used hadn't changed either. So in this case, a sample that's
about as old as a sample can get on this planet turns out not to be too old!
Of course, saying "That generalization's no good because the sample's
4,000,000,000 years old!" is a red herring fallacy.
Key Fact: There isn't any substance out there that will turn
into iron and nickel under these conditions, not even if you give
it billions of years to do so.
And, conversely, having a very, very recent sample does not guarantee a
logically compelling argument. Some populations change very rapidly. Think
about trying to do a generalization about present computer use, or present
cell phone use, or home recording equipment, based on data from 1950.
There are two things to think about when you
consider the age of a sample:
- How quickly and in what ways do the things in
this population and sample change over time? What forces bring about
these changes, and how quickly or slowly do they operate?
- Given the known rate of change in this kind of
thing, was the sample taken recently enough that we have a reasonable
guarantee that the sample is still representative of the population?
We always need to think
about the age of a sample, but we can't dismiss a generalization based merely
on age. If we have good reason to think that the sample hasn't
changed since it was taken, and that either the population hasn't changed,
or the generalization is about what the population was like at the
time the sample was taken, then the generalization is fine even if the
sample is old. Some samples become obsolete very quickly, some stay good
for a very long time indeed.
Randomness
People sometimes say that all samples have to be taken randomly, or
they're no good. This isn't exactly true. There are circumstances where
the population structure will make it possible for a small non-random
sample to be much more representative than an equally sized random
sample. Consider another set of dominoes, arranged in ten equally
sized rows, each row a single color and each a different color
from every other row. Selecting one domino from each row will give an
exactly correct picture of the overall population. Selecting ten dominoes
at random from the overall population has only a very, very, very
small chance of getting an accurate picture because of the very, very high
chance of getting two or more dominoes from the same row.
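Just how small is that chance? The sketch below assumes, hypothetically, that each of the ten rows holds 100 dominoes (I haven't given a row length, so that number is made up), and that the ten random dominoes are drawn without replacement. A random sample of ten is exactly representative only if it happens to contain one domino from every row.

```python
from math import comb


def chance_one_per_row(num_rows, row_length):
    """Probability that a simple random sample of num_rows dominoes,
    drawn without replacement, contains exactly one domino from each row."""
    favorable = row_length ** num_rows  # one free choice within each row
    possible = comb(num_rows * row_length, num_rows)  # all samples of that size
    return favorable / possible


print(chance_one_per_row(10, 100))
```

Under those assumptions the chance comes out well under one in a thousand, while the deliberate one-per-row method gets the right answer every time.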
Sampling Method
A generalization can only work if it uses a sampling method that is
completely independent of the feature being tested. If
the sampling method is at all sensitive to that feature then it will tend
to either seek out or avoid members of that population that have that
feature. Either way, the result will be skewed.
Some people call this sensitivity "bias." I don't like that terminology.
For one thing, "bias" has more than one meaning, and not all of its meanings
have anything to do with the accuracy of generalizations. And a sampling
method can be very biased while still giving a very accurate
result. The thing to remember about "bias" is that it only counts if
the bias is relevant to the feature being tested. If your sample
is biased, but you can show that the bias has nothing to do with the
feature being tested, then that bias gives you no reason to throw out the
study.
Let's go back to assessing the tipsiness of Hogmanaying Scots. Say we
happen to know the names and addresses of three significant groups of
Scots. We know the names and addresses of all Scottish accountants, all
Scottish teetotalers, and all Scots who have been convicted of drunk
driving at least three times. Say we examine every member of each group to
see whether he or she got tipsy last Hogmanay. And say we got the
following results.
1. 67% of all Scottish accountants got tipsy last Hogmanay.
2. 0% of all Scottish teetotalers got tipsy last Hogmanay.
3. 99% of all Scots who have been convicted of drunk driving at least
three times got tipsy last Hogmanay.
These results can't all be representative of the whole Scottish
population. At best, only one is right. So which of these figures is most
reliable? The answer is, whichever one is least sensitive to the feature
being tested. What do accountants have to do with tipsiness? Nothing that
I can think of! But teetotalers are people who habitually abstain from
alcoholic beverages. (Strange, but true.) So of course none of them got
tipsy on Hogmanay. Are all Scots teetotalers? I don't think so! So that
sample is definitely dependent on the feature being tested. On the other
hand, habitual drunk drivers can be expected to drink more than regular
Scots, so that sample is dependent too. (Notice that one of them is negatively
dependent, in that it avoids the feature being tested, and the
other is positively dependent, in that it seeks out
the feature being tested.) So, since we can't find an obvious link
between accountancy and tipsiness, sample number one is the only independent
sample.
Key Facts
Key Fact 1. Being an accountant has nothing to do with getting tipsy.
(Makes the sample good.)
Key Fact 2. Teetotalers don't drink, and this is a question of drinking
behavior. (Makes the sample bad.)
Key Fact 3. Drunk drivers can be expected to be heavy drinkers, and this
is a question of drinking behavior. (Makes the sample bad.)
It can be very difficult to tell whether or not a sampling method is
dependent. The trick to telling whether or not a sample is
dependent is to look at the way the sample was obtained. If it was
obtained in a way that has nothing to do with any of the possible outcomes
of the study, then it is not dependent. If, however, the method by which
the sample was chosen is logically connected to the properties the sample
is supposed to test for, then that's a dependency, and the argument is no
good. Consider the following sampling methods.
1. Testing American reactions to the war in Iraq by mailing questionnaires
to the membership of the American Pacifists Association.
2. Testing the distribution of blood types across the United States by
taking blood samples from members of the Mayflower Society, a group which
restricts its membership to people who have at least one direct ancestor
that came over to America on the Mayflower.
3. Assessing the bodily proportions of 18th-century Americans by measuring
antique clothes preserved by historical societies.
Obviously, the first sampling method is no good
because (key fact) we would be taking our sample from a group that is
already self-selected to be against any war. The second sampling method
is also dependent because (key fact) the Mayflower passengers came from
a very small region in Europe whereas the vast majority of other
immigrants to the United States came from other regions, and continents,
and (other key fact) blood type is very highly correlated with ancestry.
Finally, there is the (key) fact that until recently, good quality
clothing (the kind that is likely to be preserved) tended to be reused
as long as it could be made to fit new people. Larger clothing was
easier to alter than smaller sized clothing, so it tended to be reused
until it wore out. Smaller sized clothes tended to be put away in the
hope that someone would come along who could use them, so smaller sized
clothing is much more likely to have been preserved than larger sized
clothing. Therefore, the third sampling method is also dependent.
There are two different things to think about when
you consider the sampling method used by a particular sampling argument:
- What particular fraction of the
population was picked out by this method? What are all the
characteristics of this fraction?
- How are the characteristics of this fraction,
the fraction that was picked out to be the sample, related to the feature
of the sampling argument? Does the chosen fraction have something in
common with the feature cited in the conclusion, so that the
sampling method is fatally dependent on that feature? Or
does that fraction have nothing in common with the feature,
so that the sample is safely independent of the feature?
The concepts of randomness and independence are very difficult for some
people. There are those who think that nonrandomness is a kind of magic
bullet that automatically kills an argument. Seeing that some sample was
not taken randomly, they stop thinking and write things like "the sample
was not taken randomly, so the argument is no good." Don't do this. Don't
think that nonrandomness magically kills arguments. When you see
that a sample is nonrandom, start thinking about the relationship between
the sampling method and the feature cited in the conclusion. If they have
nothing in common, the method is independent of the feature, and the
nonrandomness does not matter.
Fallacy Row
Fallacies are specific things that can go wrong with arguments. I like
to think of them as bad arguments that some people commonly mistake for
logically compelling arguments. Here I will talk about those fallacies I
think most relevant to sampling arguments. Some of them will also be
important in other contexts, while others will only be important when we
specifically discuss sampling. I will discuss six specific fallacies
that I think are relevant here. They are Inadequate Sample, Obsolete
Sample, Dependent Sample, Anecdotal Evidence, Begging the Question and Red
Herring. The first four are fallacies that are specific to sampling
arguments, the last two are general fallacies that we will see again and
again as we go on.
Inadequate sample occurs when the population
clearly has not
been shown to be so evenly mixed that a sample of this size can
be reasonably assumed to properly represent the population. (Remember that
1% can sometimes be big enough while 90% (or more) can sometimes be too
small.)
The International
University on the Moon has over 20,000 students from all of Earth's 140 or
so countries. I've taken an absolutely random sample of 10 students out of
those 20,000, and 2 of those students were from Armenia, so we know that
20% of the students on the Moon are from Armenia.
Imagine that 143 countries are represented on the Moon. In that case, a
ten-student sample will miss at least 133 of those countries. This means
that a sample needs to be at least 143 students to have a hope of being
adequate, and we would probably want about 300 to have anything like a
reasonable sample. (Key fact: There are about 140 different countries.)
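The counting here is simple enough to sketch. Each sampled student comes from exactly one country, so a sample can never cover more countries than it has students. The second function below adds a related point of my own: a ten-student sample can only ever report a country's share in steps of ten percent. (Both function names are made up for illustration.)

```python
def countries_missed_at_least(num_countries, sample_size):
    """Lower bound on how many countries cannot appear in the sample,
    since each sampled student comes from exactly one country."""
    return max(0, num_countries - sample_size)


def possible_estimates(sample_size):
    """Every percentage a sample of this size can report for one country."""
    return [100 * k / sample_size for k in range(sample_size + 1)]


print(countries_missed_at_least(143, 10))
print(possible_estimates(10))
```

So the 20% figure for Armenia is simply the third-smallest answer this sample was capable of giving, and at least 133 countries are guaranteed an estimate of zero.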
Obsolete sample. The population clearly has not been shown
or clearly cannot be assumed to be unchanged since the sample was taken, so
it's clearly possible that the population has changed, making the
generalization out of date. (Remember that 15 billion years isn't
necessarily too old while an hour isn't necessarily recent enough.)
In 1843, 35% of
all American families owned at least one buggy whip, that means that
there's a 35% chance that there's a buggy whip in your house.
Considering that Americans almost completely stopped driving horse-drawn
buggies once automobiles became widely available, information from when
buggies were widely used is not going to represent present
transportation-related realities. (Key facts: Buggy whips are only needed by people who
drive buggies, which are drawn by horses, and almost nobody uses horse-drawn
transport nowadays.)
Dependent sample. The sampling method clearly has not been
shown or clearly cannot be assumed to be random with respect to the feature
being tested, so that it's clearly possible that the sample fails to
accurately represent the population. (Remember that a "bias" that is not
relevant to the feature being tested cannot be a problem.)
Did you know that
they recently held a school assembly where they publicly interviewed 20
randomly chosen graduates of the school's Substance Control and Abuse
Rejection Enterprise program, and 100% of those SCARE graduates reported
that they've never tried drugs!
Considering that drugs are illegal, and that a student who publicly admits
to having tried drugs is going to be in a lot of trouble, it wouldn't be
surprising if some or all of those students were lying. (Key fact: people
tend to give answers that please the questioners, especially if the
questioners have power over them.)
This too counts as a counter argument. If my analysis turns out to be bad,
then it's a bad counter argument. But it's still a counter argument, whether
it's good or not.
Anecdotal Evidence. Here the arguer fixes on a particular
story and tries to use it to support a generalization. The problem is that
the anecdote could easily have been picked precisely because it supports the
point the arguer wants to make, and might be screamingly atypical of the
population he wants to generalize about.
Handgun Control,
Inc. faked statistics on gun violence. That proves all gun-control
activists are liars.
Key fact: That's just one incident, which could easily be the only
such incident. No general survey is cited here.
That Mensa member
tried to murder the people next door with thallium, and wrote snotty
articles about it in the Mensa newsletter. That proves that all smart
people are evil.
Key fact: That's just one incident, chosen because it's about a
"smart" person attempting murder. There are thousands of other Mensa members
and millions of other smart people.
Of course America
was as deeply involved in witch burning as Europe was. Didn't you hear
about the Salem Witch Trials?
Key fact: That's just one small series of
incidents that might easily have been America's only literal
witchhunt.
Keagan. Okay, I'll
admit that some cops are racist. But you'll have to give me some
pretty convincing evidence before I'll believe that all cops are
racist.
Aylin. But didn't you see the
Rodney King videotape? That videotape showed five white LAPD officers
repeatedly beating a prone, unresisting black motorist. They just kept
whaling on him, hitting him over and over again. It was a savage, stupid
beating that King would not have gotten if he had been white. That
proves all cops are racist thugs.
It's true that Aylin gives a very salient example. That
is, he gives an example that sticks out, or otherwise makes a deep
impression on the listener. But salience isn't significance, and an example
can be very salient without being at all representative.
Key Fact: This is exactly one incident that was chosen precisely
because it supports the arguer's opinion, which makes it perfectly possible
that this was one of only a very few racist attacks by police officers.
By the way, did you notice that Keagan's argument relied on a claimed lack
of evidence, and that Aylin's argument claimed to supply that evidence? This
made Aylin's argument one of those rare cases of a direct argument that's
also a counter argument. (Remember, this can only happen against an
argument based on the claim that there's no evidence for something.)
Begging the Question, or the Fallacy of Validation by Examples
I was once caught in a dispute with a person who obviously considered
himself intelligent, educated and reasonable. At one point, this person made
a sweeping and controversial generalization based on no evidence whatsoever.
I asked him if he could back this up. His response was "let me validate with
examples," by which he meant he was going to give me a series of cases each
of which would be claimed to be an example of his thesis, and this
was supposed to be logical evidence that he was right. Here is a (completely
hypothetical) example of "validation by examples:"
Samantha. You
know of course that the media has a conservative bias.
Lauren. Um, do you have some kind of survey or university
study of randomly chosen news stories in which actual reporting was found
to put a more positive spin on similar sets of facts when a conservative
was involved than when a liberal was involved? Because that's what you'd
need to prove actual bias.
Samantha. Rubbish, you don't need that. I can validate
this with examples. You remember when the media reported that liberal
Democrat senator Gropemeister tended to get touchy with women who visited
his office?
Lauren. Yeah, that was pretty creepy!
Samantha. Well the only reason they made such a big deal
of it was because he was a liberal! If he'd been a conservative they'd
have held back on front page stories describing the nasty details, the way
they did when conservative Republican Mayor Longsuffering of Buffalo was
caught doing the same thing.
Lauren. But didn't Longsuffering turn out to have been
falsely accused by a mentally disturbed ex-employee? And he's only a
medium city mayor, not a US senator, so...
Samantha. And I've got dozens of other examples. What
about Representative Random and the . . .
Lauren. Oops! Look at the time. Gotta go.
If you look carefully at Samantha's arguments, you can see two distinct
logical errors. The first is, of course, that she is "supporting" her claim
with a set of anecdotes that she chooses herself rather than with properly
collected and interpreted independent evidence. The second is that she's not
just giving anecdotes, she's also adding in her own assumptions about
what they mean. When she finds negative coverage of a liberal,
she assumes that the negativity must be the result of
bias, and likewise less negative coverage of a conservative must
be a result of bias. Thus she builds her assumption of bias into her
interpretations of her data, and thereby assumes the very thing
she's supposed to prove.
The fallacy of begging the question consists of assuming
something the arguer really needs to prove. Notice that each of
Samantha's "examples" consists of her 1. alluding to a known incident (the
Gropemeister incident) and 2. claiming, without a shred of
evidence, that the way it was covered in the media (they made a
big deal about it) was because of conservative bias. Thus Samantha
does give some facts (the media made a big deal about Gropemeister) but the
only thing that connects those facts to her conclusion
is her assumption that the media has a conservative bias.
Red Herring
Apart from the fact that Red Herring is a very common fallacy,
I mention it here because people often attack sampling arguments on the
basis of sample age, sample size or sampling bias when these issues are
completely irrelevant to the strength of the argument. Therefore, an
arguer commits red herring if:
1. His criticism of a generalization is based on sample age when we have
no reason to think that either the population or the sample has
changed since the sample was taken.
2. His criticism of a generalization is based on sample size when we
have no reason to think that this particular sample is too small
for this particular population.
3. His criticism of a generalization is based on a bias in the sampling
method when we have no reason to think that this
particular bias has anything to do with the feature we're testing for.
Understanding Red Herring depends on understanding the concepts of "salience"
and "relevance." A thing is salient
when it looks important to the issue. A thing is relevant
when it actually is important. Salient things are not always
relevant, and relevant things are not always salient. Something can look
important when it really isn't, and seemingly unimportant details can
sometimes be vital to understanding an issue. Red herring is committed
whenever someone bases a conclusion on something that is salient
(such as sample age, sample size or sampling method) but which in this particular
instance is not relevant to the issue at hand.
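The third kind of red herring, criticizing a sampling bias that has nothing to do with the feature, can also be sketched as a simulation. Everything here is an assumption made for illustration: the 30% feature rate, the two invented traits, and the function name are hypothetical.

```python
import random

def biased_sample_rates(population_size=200_000, feature_rate=0.30, seed=1):
    """Compare nonrandom samples chosen by an unrelated and a related trait."""
    rng = random.Random(seed)
    people = []
    for _ in range(population_size):
        has_feature = rng.random() < feature_rate
        # Assumed trait that is independent of the feature.
        unrelated_trait = rng.random() < 0.5
        # Assumed trait that is more common among members with the feature.
        related_trait = rng.random() < (0.8 if has_feature else 0.2)
        people.append((has_feature, unrelated_trait, related_trait))

    def rate(group):
        # Proportion of the group that has the feature.
        return sum(1 for has_feature, _, _ in group if has_feature) / len(group)

    return {
        "population": rate(people),
        "sampled_by_unrelated_trait": rate([p for p in people if p[1]]),
        "sampled_by_related_trait": rate([p for p in people if p[2]]),
    }

rates = biased_sample_rates()
print(rates)
```

Both samples are nonrandom, but only the one picked out by a trait connected to the feature gives a misleading proportion. Objecting to the other sample's nonrandomness would be salient but not relevant.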
Exercises
1. In your own words, write out the definition of "argument" used
in this chapter.
2. What kind of statements are sampling arguments generally used to support?
3. In your own words, what are "generalizations?"
4. What do sampling arguments always start with?
Suppose that someone argues that 78% of all Irish people are Catholics
because he has surveyed Irish hurling players, who make up 9% of the Irish
people, and found out that 78% of Irish hurling players are Catholics.
5. What is the premise in this argument?
6. What is the conclusion in this argument?
7. What is the population in this example?
8. What is the sample in this example?
9. What is the feature in this example?
10. What is the fact (or evidence) in this example?
11. If we can't reasonably explain the fact without
assuming the conclusion, what does that mean for this argument?
12. If we can reasonably explain the fact without assuming the
conclusion, what does that mean for this argument?
13. Does the fact that a sample is very old necessarily mean
that the argument is bad?
14. Does the fact that a sample is very small necessarily mean
that the argument is bad?
15. Does the fact that a sample is not random necessarily mean
that the argument is bad?
16. In your own words, what is "population structure?"
17. In your own words, what is the first thing you think about when you
consider sample size?
18. In your own words, what is the second thing you think about when you
consider sample size?
19. In your own words, what is the first thing you think about when you
consider sample age?
20. In your own words, what is the second thing you think about when you
consider sample age?
21. In your own words, what is the first thing to think about when you
consider the randomness or nonrandomness of a sample?
22. In your own words, what is the second thing to think about when you
consider the randomness or nonrandomness of a sample?
23. In your own words, what is a "key fact?"
24. In your own words, define "inadequate sample."
25. In your own words, define "obsolete sample."
26. In your own words, define "dependent sample."
27. In your own words, define "anecdotal evidence."
28. In your own words, define "red herring."
29. What is "salience?"
30. What is "relevance?"
31. In your own words, define "begging the question."
32. In your own words, define "validation by examples."
33. How is "validation by examples" related to the other fallacies in this
chapter?
34. What are the three different ways an arguer can commit red herring in
criticizing sampling arguments?
Answers
1. An "argument" is an attempt to persuade someone to believe something.
2. Sampling arguments are generally used to support generalizations.
3. "Generalizations" are general statements that make claims about all
of some certain kind of thing.
4. Sampling arguments always start with a sample taken from some larger
population.
5. The premise is "78% of Irish hurling players are Catholics."
6. The conclusion is "78% of all Irish people are Catholics."
7. The population is Irish people.
8. The sample is Irish hurling players.
9. The feature is the proportion of Catholics.
10. The fact is "78% of Irish hurling players are Catholics." (This time
it's the same as the premise.)
11. If we can't reasonably explain the fact without
assuming the conclusion, that means the argument is good.
12. If we can reasonably explain the fact without assuming the
conclusion, that means the argument is not good.
13. A sample being very old does not necessarily mean
that the argument is bad.
14. A sample being very small does not
necessarily mean that the argument is bad.
15. A sample not being random does not
necessarily mean that the argument is bad.
16. "Population structure" is the way different parts of a population are
arranged relative to each other.
17. How large is the sample compared to the
number of different relevant properties individuals in the
population can have?
18. How is the population structured?
19. How quickly and in what ways do the things in
this population and sample change over time?
20. Given this rate of change, do
we have a reasonable guarantee that the sample is still representative of
the population?
21. What are all the characteristics of the particular fraction of the population
picked out by this method?
22. How are the characteristics of this fraction
related to the feature of the sampling argument?
23. A "key fact" is something about an argument that makes it a
good or bad argument.
24. "Inadequate sample" is when a population is too unevenly structured
to be represented by the given sample size.
25. "Obsolete sample" is when a population changes too rapidly to be
represented by a sample of the given age.
26. "Dependent sample" is when the sampling method is somehow related to
the feature.
27. "Anecdotal evidence" is when people talk about a few selected cases
rather than giving a properly conducted sample.
28. "Red herring" is when someone talks about something that is salient but
not relevant.
29. "Salience" is when something sticks out and looks important,
even though it might not mean anything.
30. "Relevance" is when something actually matters, even if it doesn't
immediately seem important.
31. "Begging the question" is when someone assumes something they
actually need to prove.
32. "Validation by examples" is when someone gives a set of anecdotes and
just assumes that they support his point.
33. "Validation by examples" is a combination of begging the question and
anecdotal evidence.
34. An arguer can commit red herring by saying a sample is too old when it
isn't, saying a sample is too small when it isn't, or saying a sample is too
nonrandom when that doesn't matter.
Copyright © 2013 by Martin C. Young