Probability

What does it mean to say that a statement is probably true?

How does one determine the probability of an event?

The probability that a tossed coin will land heads is .5.

The probability that it will rain tomorrow is .75.

The probability that ABBA will have another hit record is .05.

The probability of a 25-year-old woman living one more year is .97.

Three theories about probabilities:

1. Classical theory (Logical or A priori theory)

2. Relative frequency theory

3. Subjectivist theory

Classical theory:

Divide the number of favored outcomes by the total number of equipossible outcomes.

Computations are independent of any sensory information or facts about the process that generates the outcome. How, when and where the coin is tossed is irrelevant.
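The classical computation can be sketched in a few lines of Python. This is only an illustration: the function name classical_probability and the outcome lists are assumptions made for the example, and the sketch takes for granted that the listed outcomes really are equipossible.

```python
from fractions import Fraction

def classical_probability(outcomes, is_favored):
    """Favored outcomes divided by all (assumed equipossible) outcomes."""
    outcomes = list(outcomes)
    favored = [o for o in outcomes if is_favored(o)]
    return Fraction(len(favored), len(outcomes))

# Hypothetical outcome spaces for a coin and a six-sided die.
coin = ["H", "T"]
die = [1, 2, 3, 4, 5, 6]

p_heads = classical_probability(coin, lambda o: o == "H")   # 1/2
p_even = classical_probability(die, lambda o: o % 2 == 0)   # 1/2
p_five = classical_probability(die, lambda o: o == 5)       # 1/6
```

Note that nothing in the computation refers to an actual toss or roll; only the list of outcomes matters.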

Relative frequency theory:

Divide the number of observed favored outcomes by the number of observed outcomes. The probability of an event is considered only for the long run (the limit of the relative frequency) and does not apply to individual events but to classes of events.
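A small simulation may help show what "the long run" means here. The sketch below assumes a simulated fair coin (the 0.5 threshold is an assumption of the example, not something observed), and the helper name relative_frequency is made up for illustration.

```python
import random

def relative_frequency(trials, seed=0):
    """Observed heads divided by observed tosses for a simulated coin."""
    rng = random.Random(seed)  # fixed seed so a run can be repeated
    heads = sum(1 for _ in range(trials) if rng.random() < 0.5)
    return heads / trials

# The relative frequency tends to settle near .5 only as the number
# of observed tosses grows; short runs can be far from it.
for n in (10, 100, 10_000):
    print(n, relative_frequency(n))
```

The figure for ten tosses says little about any single toss; on this theory the probability belongs to the class of tosses, not to an individual event.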

Subjectivist theory:

What are the odds an individual would accept on a particular event? Probability is a measure of an individual's confidence. This is a subjective interpretation of probability that permits different assignments among different evaluators.

The Probability Calculus:

We use the probability calculus to determine compound probabilities. Several dimensions of the compound events are important. Are the events conjoined or disjoined? Are they independent of one another? Are they mutually exclusive?

Independent vs. dependent events: events are dependent if the occurrence of one affects the probability of the other. Otherwise they are independent.

For example: drawing a card from a deck without replacing it affects the next draw.

Mutually exclusive events: Two events are mutually exclusive when they cannot both occur together; one's occurring excludes the other's occurring.

For example: Drawing an ace or a king on a single draw. One can't do both on one draw.

1. Restricted Conjunction Rule

RULE: P(A and B) = P(A) X P(B)

RESTRICTIONS: A and B must be independent (e.g., rolling a die, or sampling with replacement)

EXAMPLES:

a. What is the probability of getting two heads on a single throw of two coins?

Pr(H1 and H2) = 1/2 x 1/2 = 1/4

b. What is the probability of getting two fives on two rolls of a die?

Pr(E1 and E2) = 1/6 x 1/6 = 1/36

c. What is the probability of getting two Kings on two draws from a deck (with replacement)?

Pr(K1 and K2) = 4/52 x 4/52 = 1/169

d. If there's a 40% chance that Ricky and Lucy will be married next month and a 30% chance that Fred and Ethel will split up next month, what is the probability that both events will occur?

.40 x .30 = .12 = 12%

e. If there is a 10% chance of an error on any particular page of a 3 page document, what is the probability of an error occurring on every page?

.10 x .10 x .10 = .001 = .1%

CAUTION: Be sure to consider all favorable combinations.

What is the probability of drawing a King and Queen on two draws from a deck with replacement? There are two ways to be successful here: (K1 and Q2) or (Q1 and K2). This problem involves a disjunction of outcomes and can be handled below.
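As a rough check on examples a through d, the restricted conjunction rule can be written as a product over independent events. The helper name p_all_independent is hypothetical, and the sketch simply assumes the events passed to it are independent; it does not test for that.

```python
from fractions import Fraction
from math import prod

def p_all_independent(*probs):
    """P(A and B and ...) = P(A) x P(B) x ..., valid only for independent events."""
    return prod(probs, start=Fraction(1))

two_heads = p_all_independent(Fraction(1, 2), Fraction(1, 2))      # 1/4
two_fives = p_all_independent(Fraction(1, 6), Fraction(1, 6))      # 1/36
two_kings = p_all_independent(Fraction(4, 52), Fraction(4, 52))    # 1/169
both_couples = p_all_independent(Fraction(40, 100), Fraction(30, 100))  # 3/25, i.e. 12%
```

Exact fractions are used so the results can be compared directly with the hand calculations.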

2. General Conjunction Rule

RULE: P(A and B) = P(A) X P(B given A)

RESTRICTIONS: Best used when events A and B are dependent (e.g., more than one sampling without replacement)

EXAMPLES:

a. What is the probability of getting two Aces from a deck on two draws without replacement?

Pr(A1 and (A2 given A1)) = 4/52 x 3/51 = 1/221

b. Given an urn of five red balls, three green balls and two yellow balls, what is the probability of drawing first a red ball and then a green ball on two draws without replacement?

P(R1 and (G2 given R1)) = 5/10 x 3/9 = 1/6

c. A concert hall has ten rows of ten seats. In a random drawing for seats you are the first to draw two tickets. What is the probability of drawing two front row seats?

Pr(F1 and (F2 given F1)) = 10/100 x 9/99 = 1/110
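The three examples can be checked with a short sketch. The function name p_both is an assumption for illustration, and the last lines confirm example a by counting ordered pairs of cards in a hypothetical deck where only the number of aces matters.

```python
from fractions import Fraction
from itertools import permutations

def p_both(p_a, p_b_given_a):
    """General conjunction rule: P(A and B) = P(A) x P(B given A)."""
    return p_a * p_b_given_a

two_aces = p_both(Fraction(4, 52), Fraction(3, 51))          # 1/221
red_then_green = p_both(Fraction(5, 10), Fraction(3, 9))     # 1/6
two_front_row = p_both(Fraction(10, 100), Fraction(9, 99))   # 1/110

# Brute-force check of example a: count ordered draws of two distinct cards.
deck = ["A"] * 4 + ["other"] * 48
draws = list(permutations(range(52), 2))
ace_pairs = [d for d in draws if deck[d[0]] == "A" and deck[d[1]] == "A"]
two_aces_counted = Fraction(len(ace_pairs), len(draws))      # 1/221
```

The second factor in each call already reflects the changed deck, urn, or ticket pool after the first draw, which is what "given A" expresses.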

3. Restricted Disjunction Rule

RULE: P(A or B) = P(A) + P(B)

RESTRICTIONS: A and B must be mutually exclusive

EXAMPLES:

a. What is the probability of getting either a three or an even number when rolling a die?

Pr(3 or E) = 1/6 + 3/6 = 4/6

b. Given an urn of five red balls, three green balls and two yellow balls, what is the probability of drawing a red ball or a green ball on one draw?

Pr(R or G) = 5/10 + 3/10 = 8/10

c. What is the probability of drawing either a nine or a King from a deck on a single draw?

Pr(9 or K) = 4/52 + 4/52 = 8/52
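One way to see why the restriction matters is to represent events as sets of outcomes and refuse to add probabilities when the sets overlap. This is a sketch under the assumption of equipossible outcomes; the names p and p_either_exclusive are invented for the example.

```python
from fractions import Fraction

die = {1, 2, 3, 4, 5, 6}

def p(event, space):
    """Classical probability of an event given as a set of outcomes."""
    return Fraction(len(event & space), len(space))

def p_either_exclusive(a, b, space):
    """Restricted disjunction rule: P(A or B) = P(A) + P(B)."""
    if a & b:
        # Shared outcomes would be counted twice by simple addition.
        raise ValueError("events overlap; the restricted rule does not apply")
    return p(a, space) + p(b, space)

three = {3}
even = {2, 4, 6}
three_or_even = p_either_exclusive(three, even, die)   # 4/6, reduced to 2/3
```

Asking for "a two or an even number" with the same function would raise an error, since a two is itself an even number.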

4. General Disjunction Rule

RULE: P(A or B) = P(A) + P(B) - P(A and B)

= P(A) + P(B) - [P(A) X P(B)] (when A and B are independent)

RESTRICTIONS: Best used when events A and B are independent and not mutually exclusive

JUSTIFICATION: We subtract the probability of both events occurring because we don't want that combination to figure into the calculation twice. If the events are not exclusive then there are two ways to get A (with and without B) and two ways to get B (with and without A).

Two favorable combinations for A:

A and B

A and not B

Two favorable combinations for B:

B and A

B and not A

But only three favorable combinations for A or B:

A and not B

B and not A

Both A and B

So we need to subtract the second count of A and B occurring together.

EXAMPLES:

a. What is the probability of getting at least one spade from a deck of cards on two draws (with replacement)?

Pr(at least one spade) = Pr(S1) + Pr(S2) - [Pr(S1) x Pr(S2)]

= 13/52 + 13/52 - [13/52 x 13/52] = 7/16

b. What is the probability of getting at least one six when rolling a pair of dice?

Pr(at least one 6) = Pr(6 on die 1) + Pr(6 on die 2) - [Pr(6 on die 1) x Pr(6 on die 2)]

= 1/6 + 1/6 - [1/6 x 1/6] = 11/36
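The arithmetic for both examples can be carried through in Python, and the dice case can be confirmed by listing all 36 rolls. The helper name p_either is an assumption for illustration; the third argument is P(A and B), which for these independent events is just the product.

```python
from fractions import Fraction
from itertools import product

def p_either(p_a, p_b, p_a_and_b):
    """General disjunction rule: P(A or B) = P(A) + P(B) - P(A and B)."""
    return p_a + p_b - p_a_and_b

spade = Fraction(13, 52)
at_least_one_spade = p_either(spade, spade, spade * spade)   # 7/16

six = Fraction(1, 6)
at_least_one_six = p_either(six, six, six * six)             # 11/36

# Check by enumeration: 11 of the 36 equipossible rolls contain a six.
rolls = list(product(range(1, 7), repeat=2))
with_six = [r for r in rolls if 6 in r]
at_least_one_six_counted = Fraction(len(with_six), len(rolls))
```

The enumeration makes the double counting visible: the roll (6, 6) appears once in the list, although it is favorable for both dice.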

5. The Negation Rule

RULE: P(A) = 1 - P(not-A)

RESTRICTIONS: Best used when the probability of an event not happening is either known or easily computed. Consider also when disjunctive events are dependent.

EXAMPLES:

a. What is the probability of getting at least one head on two tosses of a coin? (independent events)

1 - Pr(no heads) = 1 - Pr(no heads on 1 and no heads on 2) = 1 - (1/2 x 1/2) = 3/4

b. Given an urn of five red balls, three green balls and two yellow balls, what is the probability of getting either a red ball or a green ball on two draws without replacement? (dependent events)

1 - Pr(neither R1 nor G1 and neither R2 nor G2) =

1 - Pr(Y1 and Y2) =

1 - [Pr(Y1) x Pr(Y2 given Y1)] =

1 - (2/10 x 1/9) = 44/45

c. What is the probability of getting at least one green ball on two draws without replacement?

1 - [Pr(no green on either draw)]

1 - [Pr(no G1) x Pr(no G2 given no G1)]

1 - [7/10 x 6/9] = 8/15
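The negation rule examples reduce to computing the probability of the unwanted outcome and subtracting it from 1. A minimal sketch, with the helper name p_not chosen only for illustration:

```python
from fractions import Fraction

def p_not(p_complement):
    """Negation rule: P(A) = 1 - P(not-A)."""
    return 1 - p_complement

# a. At least one head on two tosses: the complement is "no heads at all".
at_least_one_head = p_not(Fraction(1, 2) * Fraction(1, 2))      # 3/4

# b. A red or green ball on two draws without replacement:
#    the complement is "yellow, then yellow again".
red_or_green = p_not(Fraction(2, 10) * Fraction(1, 9))          # 44/45

# c. At least one green ball on two draws without replacement:
#    the complement is "no green, then no green again".
at_least_one_green = p_not(Fraction(7, 10) * Fraction(6, 9))    # 8/15
```

In b and c the complement itself is found with the general conjunction rule, because the draws are dependent.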

Events are:	Mutually Exclusive	Not Mutually Exclusive
Independent	Conjunction: Pr = 0. Disjunction: Restricted Rule, P(A) + P(B)	Conjunction: Restricted Rule, P(A) x P(B). Disjunction: General Rule, P(A) + P(B) - [P(A) x P(B)]
Dependent	Conjunction: Pr = 0. Disjunction: Negation Rule, 1 - P(not-A)	Conjunction: General Rule, P(A) x P(B given A). Disjunction: Negation Rule, 1 - P(not-A)

Using combinations of rules

a. What is the probability of getting either a two or a six on rolling two dice? For each roll you are considering mutually exclusive events (either a 2 or a 6 but not both) and so you would use the restricted disjunction rule. But considering the two rolls, the events in question are independent and not mutually exclusive, so you want to also use the general disjunction rule.

Pr[(2 or 6) on die 1 or (2 or 6) on die 2]

= Pr(2 or 6 on die 1) + Pr(2 or 6 on die 2) - [Pr(2 or 6 on die 1) x Pr(2 or 6 on die 2)]

= 2/6 + 2/6 - [2/6 x 2/6] = 5/9

b. What is the probability of drawing a King and Queen on two draws from a deck with replacement? There are two ways to be successful here. Use the restricted disjunction rule (since the two favorable outcomes are mutually exclusive) and the restricted conjunction rule (since there is replacement).

Pr[(K1 and Q2) or (Q1 and K2)]

= (4/52 x 4/52) + (4/52 x 4/52) = 2/169

c. Given an urn of five red balls, three green balls and two yellow balls and a second urn of three red balls, two green balls and five yellow balls, what is the probability of drawing first a red ball and second a green ball with two draws per urn (no replacement)? (Use the general disjunction rule and the general conjunction rule.)

Pr[(R1 and G2 given R1) in urn 1 or (R1 and G2 given R1) in urn 2]

= (5/10 x 3/9) + (3/10 x 2/9) - [(5/10 x 3/9) x (3/10 x 2/9)] = 2/9
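The three combined problems can be worked through step by step in Python. This is a sketch, with variable names chosen for the example; problem c is read here as "red then green in at least one of the two urns", which is an interpretation of the wording rather than something the problem states outright.

```python
from fractions import Fraction

# a. A two or a six on at least one of two dice.
two_or_six = Fraction(1, 6) + Fraction(1, 6)            # restricted disjunction, one die
a = two_or_six + two_or_six - two_or_six * two_or_six   # general disjunction, two dice

# b. A King and a Queen in either order, with replacement.
king = queen = Fraction(4, 52)
b = king * queen + queen * king   # restricted conjunction, then restricted disjunction

# c. Red then green (no replacement) in urn 1 or in urn 2.
urn1 = Fraction(5, 10) * Fraction(3, 9)   # general conjunction, urn 1
urn2 = Fraction(3, 10) * Fraction(2, 9)   # general conjunction, urn 2
c = urn1 + urn2 - urn1 * urn2             # general disjunction; urns are independent

print(a, b, c)   # 5/9 2/169 2/9
```

In each case the inner rule is applied first and its result is then treated as a single event for the outer rule.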

Statistical Reasoning

Statistical hypotheses

In a statistical hypothesis or statement a percentage of a population is said to have a certain property. You should be able to identify the percentage, the population and the property.

Examples:

51% of all American men are smokers.

34% of all American women are smokers.

One out of ten people living in Los Angeles suffers from alcoholism.

Nearly half of all adult women are employed.

Most freshmen take philosophy.

Few logicians are rich.

a. The statement is about the population as a whole and not about any particular individual in that population. It is important when assessing the merits of a statistical statement to consider exactly what population is mentioned. Who or what might be included or excluded from the population as described? The population is also sometimes called the reference class.

b. The percentage indicates some quantity relative to the population. Fractions and decimal numbers are often used. But we will also include "quantifiers" such as "many," "most," "few," "nearly half," and so on.

c. The property indicated can also be understood in terms of a value of a variable. As with the population, when assessing the merits of a statistical statement it is important to consider exactly what property is mentioned. What is or is not included in the description of that property?

Arguments from samples: statistical generalizations

A statistical generalization draws a statistical conclusion about an entire population based on the characteristics of a sample. For example,

43% of voting Americans polled said they voted for Bill.
So, 43% of all voting Americans voted for Bill.

a. The conclusion is a statistical hypothesis and therefore about a population, not an individual.

b. The percentage indicated in the conclusion is best understood as falling within a range (allowing a margin for error) at a certain level of confidence. For example, according to statistical theory, a representative sample of 500 would yield 95% certainty that results aren't off by more than about 4.4%. Larger, more representative samples serve to increase the level of confidence and decrease the margin for error.

c. Obviously the size and nature of the sample are vital to the strength of the argument. Below are some tips for selecting an adequate sample from which to generalize.

1. Pick a random sample. American students are said to test lower than Japanese students on math.

2. Pick a large sample. You must consider the appropriate level of confidence and the margin of error. A representative sample of 500 would yield 95% certainty that results aren't off by more than about 4.4%.

3. Pick a diverse sample. When polling Ohio voters, for example, you want to represent all relevant voting groups. The Literary Digest poll of 1936 predicted that Landon would win over Roosevelt after 2.5 million questionnaires sent to its readers were returned. The names were randomly drawn from phone books and subscription lists, but this sample is biased in favor of the well-to-do (those who owned phones).

4. Pick a stratified sample. Voluntary tests of 25,000 drivers throughout the United States showed that 25% of them use some drug and that 75% use no drugs at all while driving. The conclusion was that 25% of U.S. drivers do use drugs, a remarkable conclusion. The tests were taken at random times of the day at randomly selected freeway restaurants.

But there are fewer drivers on the road at 5 am, so drivers at that hour were favored, hence a bias in the sampling. The sample was also biased in favor of freeway drivers, those who stop to eat, and volunteers.

Sample size; sample fairness:

When sampling, we try to learn about a characteristic of a parent population. Carefully identify the parent population and the characteristic in question, determine the desired amount of certainty, select a sample, and examine it for the characteristic. Then infer the percentage of the parent population that has the characteristic.

The larger the sample, the more likely it is that the properties of the sample will reflect those of the parent population, according to the law of large numbers.

The larger the (fair) sample, the higher the confidence and the smaller the margin of error.
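The relation between sample size and margin of error can be sketched with the usual normal-approximation formula for a proportion. The formula, the worst-case proportion p = 0.5, and the 95% multiplier z = 1.96 are assumptions of this example; these notes do not say which method they have in mind, and the sketch presumes a simple random sample.

```python
from math import sqrt

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate margin of error for a sample proportion.

    n: sample size; p: assumed true proportion (0.5 is the worst case);
    z: multiplier for the confidence level (1.96 is roughly 95%).
    """
    return z * sqrt(p * (1 - p) / n)

# Quadrupling the sample size roughly halves the margin of error.
for n in (100, 400, 1600):
    print(n, round(100 * margin_of_error(n), 2), "percentage points")
```

The gain from a larger sample shrinks quickly, which is why very large polls are rarely worth their cost. No sample size corrects for a biased selection procedure.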

"Average"

The Mean value of a set of data is the arithmetical average.

The Median of a set of data is the middle point when the data are arranged in ascending order. An equal number of data points lie above and below the median.

The Mode is the value that occurs with the greatest frequency.
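The three kinds of "average" can differ widely on the same data. The sketch below uses Python's standard statistics module on a small made-up data set; the numbers are hypothetical and chosen only so that one extreme value pulls the mean away from the median and mode.

```python
from statistics import mean, median, mode

# Hypothetical data: seven values, one of them an outlier.
data = [2, 3, 3, 5, 7, 10, 40]

data_mean = mean(data)       # 70 / 7 = 10
data_median = median(data)   # middle value of the sorted data: 5
data_mode = mode(data)       # most frequent value: 3

print(data_mean, data_median, data_mode)
```

When a statistical statement reports an "average", it is worth asking which of the three is meant.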

Distribution: In a coin toss, heads can be expected 50% of the time, tails 50% of the time. With one toss there are two possible outcomes (H, T). With two tosses there are four possible outcomes (HH, TT, HT, TH), and the probability that only one H will occur is 50% (two combinations out of four). With three tosses there are eight combinations, and the probability of exactly 1 H is .375, as is the probability of exactly 2 H.

Each cell gives the number of combinations / the probability; x marks an outcome that cannot occur.

	0 H	1 H	2 H	3 H	4 H	5 H
1 TOSS	1/.50	1/.50	x	x	x	x
2 TOSSES	1/.25	2/.50	1/.25	x	x	x
3 TOSSES	1/.125	3/.375	3/.375	1/.125	x	x
4 TOSSES	1/.062	4/.25	6/.375	4/.25	1/.062	x
5 TOSSES	1/.031	5/.156	10/.312	10/.312	5/.156	1/.031

The probability of a sequence with exactly 1/2 H occurrences goes down as the number of tosses increases, but it always remains the most likely occurrence.
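The distribution for any number of tosses can be generated by counting combinations. This sketch assumes a fair coin; the function names heads_distribution and p_exactly_half are made up for the example.

```python
from math import comb

def heads_distribution(tosses):
    """For k = 0..tosses, return (number of combinations, probability) of exactly k heads."""
    total = 2 ** tosses   # number of equipossible sequences
    return [(comb(tosses, k), comb(tosses, k) / total) for k in range(tosses + 1)]

def p_exactly_half(tosses):
    """Probability that exactly half of an even number of tosses are heads."""
    return comb(tosses, tosses // 2) / 2 ** tosses

print(heads_distribution(3))   # [(1, 0.125), (3, 0.375), (3, 0.375), (1, 0.125)]

# The chance of exactly half heads shrinks as the number of tosses grows.
for n in (2, 4, 6, 10, 100):
    print(n, round(p_exactly_half(n), 3))
```

Even so, for any even number of tosses no other single head count is more probable than exactly half.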

Statistical syllogisms

A statistical syllogism concludes that an individual is likely to have a certain property based on a statistical hypothesis that a significant percentage of a population has the property and a premise stating that the individual in question belongs to that population. For example,

Most people who have their gall bladder removed recover without serious complications.
Fred had his gall bladder removed.
So, it is likely that Fred will recover without serious complications.

There are several ways by which one might challenge such an argument.

1. Call one or more of the premises into question. For example, one might question whether the statistical hypothesis about gall bladder surgery is based on adequate sampling techniques.

2. Question the degree of support. Notice how little support these premises give the conclusion.

(1) Many air traffic controllers are under great stress.

(2) Many people under stress are heavy drinkers.

(3) Many heavy drinkers lose their driver's licenses.

(4) Many people who have lost their licenses are bad insurance risks.

(5) Many people who are bad insurance risks live in New York City.

(6) Lucy is an air traffic controller.

So it is likely that Lucy lives in New York City.

3. Supply a counterargument. In our example above the population (reference class) is "People who have their gall bladder removed." While it may be true of that population that most recover, it may not be true of every subset of that population. Consider the following argument in which we refine the reference class to which Fred belongs.

Most 103-year-old persons who have major surgery suffer serious complications.

Fred is a 103-year-old who is having major surgery for gall bladder removal.

So, it's likely that Fred will suffer serious complications.

We should use all available relevant evidence. The more specific the reference class, the better, though remember that this may limit your sample size in some cases.

Exercises: Which are acceptable? If not, what counterargument could be provided?

Most auto fatalities are the result of the drinking driver.
Fred was an auto fatality at 9:30 am on Sunday morning.
So, Fred's death was the result of the drinking driver.

Most sexually active women who take birth control pills according to the directions do not conceive.
Lucy is a sexually active woman who takes birth control pills according to the directions.
So, Lucy will not conceive.

Most areas with low unemployment rates have higher wages.
American cities with a strong service economy have low unemployment.
So, American cities with a strong service economy have high wages.

Most incumbents are re-elected in the US if they decide to run.
Mayor Mertz is an incumbent who is running and has stood for increasing expenditures on social programs.
So, Mayor Mertz will be re-elected.

Most students will benefit materially from their college education.
Ricky is a college student studying Greek and Latin.
So, Ricky will benefit materially from his college education.
