Questions and Answers
On this page there are previous questions and even some example questions. Browse by topic, or use your browser's search capability to search for key words, like "binomial", "p-value", "null hypothesis", or "z-score".
Don't forget to check the Statistics Topics pages for additional help on these types of problems.
Send a question to tutor@freestathelp.com
Scroll down to browse, or click to jump to specific topics:
- Probability
- Counts and Proportions/Chi-square
- Correlation/Linear Regression
- Hypothesis Testing/Z-scores/Sampling Distribution
- Other/Miscellaneous
PROBABILITY
Question:
Type: Conditional Probability
In NY State, 48% of all teenagers own a skateboard & 39%
of all teenagers own a skateboard & roller blades. What is
the probability that a teenager owns roller blades given that
the teenager owns a skateboard?
Answer:
I'm betting that somewhere in your notes you have something on "conditional probability" that looks like this:
P(A | B) = P(A and B) / P(B), right? In words, that's "the probability of A given B equals the prob of A and B divided by the prob of B". The question asks for the "probability of owning roller blades given they own a skateboard". From that, can you identify which event is event "A" (roller blades or a skateboard) and which is "B"? What should help is that you are given two probabilities, so one of them has to be the one for P(A and B) and one of them has to be for P(B). Once you figure that out, you just divide them to get P(A|B). Does that make sense?
Question:
Type: Permutations and Combinations
How many different tests can be made from a test bank of 14 questions if the test consists of 9 questions?
14c9 = 14!/((14-9)!9!) = 2002. I do not know where this number came from.
Find the probability of selecting 2 science books and 3 math books from 9 science books and 10 math books. The books are selected at random.
2 science books can be selected as 9 C 2 and 3 math books can be selected as 10 C 3. Hence, 9 c 2 x 10 c 3, right? Why?
Answer:
If you have 14 questions and want to choose 9, you are doing the common "combinations" method. Your formula is correct, it will be 14! in the numerator and (14-9)!9! in the denominator.
14! = 14 x 13 x 12 x 11 x ...
so yes, 14c9 = 2002
in terms of where the numbers come from, think of it this way:
you're going to randomly order the questions, and then take the first 9. So, there are 14! ways to order the questions, but several of those orders would result in you selecting the same 9 questions (since it doesn't matter in what order you select them - if they are in the first 9 slots you'll select them), so you've sort of "over-counted" the number of ways you can select 9 because several of the ways are actually the same. That's why you divide by (14-9)! and 9!. What you are doing is saying that those 9 questions could be ordered 9! ways, all in the first 9 slots, and the remaining questions could be ordered (14-9)! ways in the remaining slots.
Does that make sense? If it does, then your second answer should make sense, because you're doing the same thing separately for the science and math books, and then multiplying them together, because for each way of selecting 2 science books there are 10c3 ways of choosing 3 math books, and vice versa.
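If you want to check these counts numerically, here is a minimal Python sketch using the standard library's math.comb. The variable names are just for illustration, and the last step assumes that exactly 5 books are drawn at random from all 19, which the question implies but does not state.

```python
import math

# Number of 9-question tests from a bank of 14: 14! / ((14-9)! * 9!)
tests = math.comb(14, 9)
by_formula = math.factorial(14) // (math.factorial(14 - 9) * math.factorial(9))

# Ways to pick 2 of 9 science books AND 3 of 10 math books (multiply the counts)
favorable = math.comb(9, 2) * math.comb(10, 3)

# Assumption: 5 books in total are drawn at random from all 19 books
total = math.comb(19, 5)
probability = favorable / total

print(tests, by_formula)       # both are 2002
print(favorable, total)        # 4320 and 11628
print(round(probability, 4))
```

The division at the end is the usual "favorable outcomes over all equally likely outcomes" step.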
Question:
Type: Permutations and Combinations
In how many ways can 9 people line up to get tickets at a ticket booth?
Answer:
The easiest way is to think of the 9 places as "slots" to be filled:
__ __ __ __ __ __ __ __ __
How many different people could you put in the first slot? Answer: 9 How many possibilities does that leave for the second slot? Answer: 8. And so on. So for each slot sequentially, the possible number of people to put in each slot is:
9, 8, 7, 6, 5, 4, 3, 2, 1
And what do you do to those? You multiply them (because for each of the 9 possible people for the first slot, there are 8 possible people EACH for the second slot):
9x8x7x6x5x4x3x2x1 = 9! (or "nine factorial") = 362,880
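The slot-by-slot multiplication can be sketched in a few lines of Python. This is only an illustration: the loop mirrors the "9, then 8, then 7..." reasoning, and the four-person line labelled A to D is a made-up smaller case that can be counted by brute force.

```python
import math
from itertools import permutations

# Slot-by-slot product: 9 choices for the first slot, then 8, then 7, ...
ways = 1
for choices in range(9, 0, -1):
    ways *= choices

print(ways, math.factorial(9))  # both are 362880

# Brute-force check on a smaller (hypothetical) line of 4 people: 4! orderings
small_line = list(permutations(["A", "B", "C", "D"]))
print(len(small_line))  # 24
```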
COUNTS AND PROPORTIONS/CHI-SQUARE
Question:
Type: Hypothesis testing, Chi-squared value, proportions
Test the null hypothesis at the .01 level of significance that
the distribution of blood types for college students complies
with the proportions described in the blood bank bulletin,
namely, .44 for O, .41 for A, .10 for B and .05 for AB. Now
however, assume that the results are available for a random
sample of only 60 students. The results are as follows: 27
for O, 26 for A, 4 for B and 3 for AB.
Note: The expected frequency for AB, (.05)(60) = 3, is less
than 5, the smallest permissible frequency. Create a sufficiently
large expected frequency by combining B and AB.
I understand the level of significance would be x2 at a critical value
of 9.21 with df=2, and I get the Ho: Po=.44, Pa=.41 and
Pb=B+AB=.15
So, x2 would look like (27-?/?)2+ (26-?/?)2+(7-?/?)2=
I also know that each set of ? would have different values, but
not sure of how to get those values.
Answer:
You are actually very close. To get the "expected" counts in each case, you have to ask yourself: "If the sample of 60 had exactly the distribution the problem gives (i.e. the 'hypothesized distribution'), how many people would have each type of blood?" Well, 44% of the 60 would have O, or 60 x .44 = 26.4 (it's ok that it's not a whole number), 41% would have A, or 60 x .41 = 24.6, and 15% would have B or AB, or 60 x .15 = 9. Basically, the expected values are those that would make the sample have the hypothesized proportions. The expected values are 26.4 for O, 24.6 for A, and 9 for B or AB.
By the way, you may wonder why those are the "expected" values. We're not saying that those are what we'd "expect" to happen each time we take a sample. Instead, just like how E[x] or "expected value" is equivalent to the mean of x, what we're saying is that if we took repeated samples of size 60 from a population that actually did have the probabilities stated in Ho, on average 26.4 people would have O, 24.6 would have A, and the rest would either have B or AB. Does that make sense? What we're doing with the formula you wrote down is quantifying how far away from the mean, or "expected value", our observed values are to see if what we got could come from such a distribution.
One thing about the formula you wrote down: you only square the numerator of each term, not the denominator, so it's really:
(observed - expected)^2 / expected
in each case.
When I apply that formula, I get something that is not even close to the critical value (so do we "reject" or "fail to reject" Ho?).
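Here is a minimal Python sketch of the calculation, in case it helps to see the pieces side by side. The dictionary names are my own, the observed counts use B and AB combined (4 + 3 = 7), and the critical value of 9.21 is the one quoted in the question rather than something the code computes.

```python
# Observed counts, with B and AB combined into one category
observed = {"O": 27, "A": 26, "B or AB": 7}
# Hypothesized proportions from Ho (.10 + .05 = .15 for the combined category)
hypothesized = {"O": 0.44, "A": 0.41, "B or AB": 0.15}

n = sum(observed.values())  # sample size, 60

# Expected count = sample size x hypothesized proportion
expected = {blood: n * p for blood, p in hypothesized.items()}

# Chi-square statistic: sum of (observed - expected)^2 / expected
chi_square = sum(
    (observed[blood] - expected[blood]) ** 2 / expected[blood]
    for blood in observed
)

critical_value = 9.21  # taken from the question (alpha = .01, df = 2)
print(expected)
print(round(chi_square, 3), critical_value)
```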
Question:
Type: Binomial
The probability of an individual person contracting the H1N1 virus is .12. If we select four people at random, what is:
a) the probability that 2 of them contract it?
b) the probability that at least one of them contracts it?
Answer:
This is "binomial". The way I know this is that each person has only one of two outcomes (contracts, does not contract), and we are interested in a count of the total number of people who contract it. So, we will solve things using the binomial formula: P(X = r) = (n choose r)*p^r*(1-p)^(n-r)
a) P(X = 2) = (4 choose 2)*.12^2*.88^2 = 0.067
b) Keep in mind this is a BUNCH of events together: P(X >= 1) = P(X = 1) + P(X = 2) + P(X = 3) + P(X = 4), so we could find each one of those individually and then add them together, or we could realize that this is also equal to:
1 - P(X = 0) (the probability that at least one gets it is equal to 1 minus the probability that no one does)
So 1 - P(X = 0) = 1 - [(4 choose 0)*.12^0*.88^4] = 1 - 0.5997 = .4003
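Both parts can be checked with a short Python sketch of the binomial formula. The function name binomial_pmf is just a label I picked; the formula inside it is the one written above.

```python
import math

def binomial_pmf(r, n, p):
    """P(X = r) = (n choose r) * p^r * (1 - p)^(n - r)."""
    return math.comb(n, r) * p**r * (1 - p) ** (n - r)

n, p = 4, 0.12

# a) exactly 2 of the 4 people contract it
p_exactly_two = binomial_pmf(2, n, p)

# b) at least one: the shortcut 1 - P(X = 0) ...
p_at_least_one = 1 - binomial_pmf(0, n, p)
# ... and the long way, P(X = 1) + P(X = 2) + P(X = 3) + P(X = 4)
p_at_least_one_long = sum(binomial_pmf(r, n, p) for r in range(1, n + 1))

print(round(p_exactly_two, 3))
print(round(p_at_least_one, 4), round(p_at_least_one_long, 4))
```

The two versions of part b should agree, which is a handy check that the complement shortcut is doing what we think it is.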
CORRELATION/LINEAR REGRESSION
Question:
Type: Correlation, Association between two variables
I have to use the values of the linear correlation
coefficient to calculate the coefficient of determination
and explain the variation of the data about the
regression line and the unexplained variation
Can you tell me what formula is used to determine
the values of a linear correlation coefficient? How
do I calculate the coefficient of determination?
r= 0.350
r= -0.275
r= -0.891
r= 0.964
Answer:
The coefficient of determination is also called "R-squared" and is the square of the correlation coefficient. R-squared will be a number between 0 and 1, and represents the percent of the (linear) variation in Y that is explained by changes in X when you regress Y on X. (So, if R-squared = .34 then 34% of the variation in Y is explained by changes in X).
For the four values you gave, you will square each one (making each one positive). The fact that some of the correlations are negative tells you something about the relationship between X and Y as well. If "r" is positive, then as X goes up, Y goes up. If the "r" is negative, then as X goes up, what do you suppose happens to Y?
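The squaring step is easy to sketch in Python. The list below holds the four r values from the question; the "unexplained" share is simply 1 minus R-squared, and the variable names are my own.

```python
correlations = [0.350, -0.275, -0.891, 0.964]

# Coefficient of determination: square each correlation coefficient
r_squared_values = [r**2 for r in correlations]

for r, r_squared in zip(correlations, r_squared_values):
    explained = r_squared          # share of variation explained by the line
    unexplained = 1 - r_squared    # share left unexplained
    print(f"r = {r:+.3f}  R-squared = {r_squared:.4f}  "
          f"explained = {explained:.1%}  unexplained = {unexplained:.1%}")
```

Notice that the sign of r disappears once you square it, so the direction of the relationship has to be read from r itself.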
Question:
Type: Scatterplot, Regression
Listed below are measures of pain intensity before
and after using the proprietary drug Duragesic. The
data are listed in order by row, and corresponding
measures are from the same subject before and
after the treatment. For example, the first subject
had a measure of 1.2 before the treatment and a
measure of 0.4 after the treatment. Each pair of
measurements is from the one subject, and the
intensity of pain was measured using the standard
visual analog score.
Pain Intensity Before Duragesic Treatment
1.2 1.3 1.5 1.6 8.0 3.4 3.5 2.8 2.6 2.2
3.0 7.1 2.3 2.1 3.4 6.4 5.0 4.2 2.8 3.9
5.2 6.9 6.9 5.0 5.5 6.0 5.5 8.6 9.4 10.0
7.6
Pain Intensity After Duragesic Treatment
0.4 1.4 1.8 2.9 6.0 1.4 0.7 3.9 0.9 1.8
0.9 9.3 8.0 6.8 2.3 0.4 0.7 1.2 4.5 2.0
1.6 2.0 2.0 6.8 6.6 4.1 4.6 2.9 5.4 4.8
4.1
Analyzing the Results
1) Use the given data to construct a scatter plot, then
use the methods of section 10-2 to test for linear
correlation between the pain intensity before the
treatment and after the treatment. If there is a
significant linear correlation, does it follow that the
drug treatment is effective?
2) Use the given data to find the equation of the
regression line. Let the response (Y) variable be the
pain intensity after the treatment. What would be the
equation of the regression line for a treatment having
absolutely no effect?
3) The methods of section 9-3 can be used to test the
claim that two populations have the same mean. Identify
the specific claim that the treatment is effective, then use
the methods of section 9-3 to test the claim. The methods
of section 9-3 are based on the requirement that these
samples are independent. Are they independent in this case?
4) The methods of section 9-4 can be used to test a claim
about matched data. Identify the specific claim that the
treatment is effective, then use the methods of section
9-4 to test the claim.
5) Which of the preceding results is best for determining
whether the drug treatment is effective in reducing pain?
Which of the preceding results is least effective in determining
whether the drug treatment is effective in reducing pain? Based
on the preceding results, does the drug appear to be effective?
Answer:
Keep in mind that I try not to do your homework for you, but just
give you a push in the right direction!
1) when constructing the scatterplot, I assume you know that each
point is actually a pair of data: the before measure and the after
measure. Draw the "before" on the horizontal axis and the "after"
on the vertical. Now, draw a 45-degree line. What is true about
any point on that line? Isn't it true that to be on that line, the "before"
and the "after" are the SAME? Given that, what can you say about a
point that lies above or below that line, in terms of whether pain has
increased or decreased? That is, if pain is less "after" than it was
"before", where will that point be in relation to the line? To answer
the specific question asked about linear correlations and improvement,
let me ask you two things:
a) can you draw a bunch of points that show that the drug is effective
without those points having any linear correlation?
b) can you draw a bunch of points with a strong linear correlation that
show that the drug is NOT effective?
2) I think we answered this one above. Think about what I said about
points where the pain is the same before and after.
3) I don't know what book you're using, but from the question I can
deduce that section 9-3 tests differences between 2 means. I'll leave
that to you to do. The question about independence...what do you think?
Are the before and after populations independent? Keep in mind that the
same people are in each sample. Are multiple readings from the same
person COMPLETELY independent?
4) Again, I don't have your book in front of me...but typically with matched
data you subtract the two points. Let's say you subtract the after from the
before. Then if someone's pain went down that difference would be (positive
or negative?). So, to be effective, that's the claim that the average difference
is (greater than zero or less than zero?)
5) I'll leave this for you. You should be able to speak to this after doing all
of that analysis.
HYPOTHESIS TESTING/Z-SCORES/SAMPLING DISTRIBUTION
Question:
Type: Normal Distribution, Mean and Sample Mean
Assume IQ has a Normal Distribution with a mean of 130 and a standard deviation of 20. Find:
a) The probability of being above 135
b) The IQ which represents the 90th percentile
c) The probability that the average of 25 people will be below 128
Answer:
For "a", this is straightforward: P(X > 135) = P(Z > [135 - 130]/20) = P(Z > 0.25) = 1 - 0.599 = 0.401
For "b", we work backwards. We are given the probability and want the IQ, so, we want to find the value "?" such that P(X < ?) = .9, or
P(Z < [? - 130]/20) = .9. From the normal table we know that a z-score of 1.28 gives that probability (that is P(Z < 1.28) = .9), so we
want the quantity [? - 130]/20 to equal 1.28. Set equal and solve: ? = 1.28*20 + 130 = 155.6
For "c", this is now a new random variable: Xbar. It has a Normal distribution with a mean of 130 and a std error of 20/sqrt{25}, or 20/5 = 4.
So P(Xbar < 128) = P(Z < [128-130]/4) = P(Z < -.5) = 0.31
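All three parts can be sketched with Python's statistics.NormalDist, which stands in for the Normal table here. The variable names are my own, and part b will differ slightly from a hand calculation because the code does not round the z-score to 1.28.

```python
import math
from statistics import NormalDist

iq = NormalDist(mu=130, sigma=20)

# a) P(X > 135) is 1 minus the area to the left of 135
p_above_135 = 1 - iq.cdf(135)

# b) 90th percentile: work backwards from a probability to an IQ
percentile_90 = iq.inv_cdf(0.90)

# c) The mean of 25 people has the same mean but std error 20 / sqrt(25) = 4
sample_mean = NormalDist(mu=130, sigma=20 / math.sqrt(25))
p_mean_below_128 = sample_mean.cdf(128)

print(round(p_above_135, 3))
print(round(percentile_90, 1))
print(round(p_mean_below_128, 2))
```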
Question:
Type: Normal Distribution, Mean and Sample Mean
Amounts of Coke: Assume that cans of coke are filled so that the
actual amounts have a mean of 12.00 oz and a standard deviation of 0.11
oz
a. Find the probability that a sample of 36 cans will have a mean amount of at least 12.19 oz, as in Data Set in Appendix B
b. Based on the result from part(a), is it reasonable to believe
that the cans are actually filled with a mean of 12.00 oz? If the mean
is not 12.00 oz, are the consumers being cheated?
Answer:
The amount in a SINGLE coke can has a mean of 12 and a std dev of 0.11. The AVERAGE amount in a SAMPLE of coke cans has a normal distribution (why?) with a mean of 12 and a std dev of: 0.11/6 (let me know if you don't know why - this part is key to the problem). So the question is find the probability that the average fill amount is 12.19 or higher, or:
P(x-bar >= 12.19). But x-bar is the sample mean, so it has the distribution we stated above (Normal, mean=12, std dev = 0.11/6), so we solve this the way we always do for random variables with a Normal distribution:
P(x-bar >= 12.19) = P(Z >= [12.19 - 12.00] / [0.11/6] ) = P(Z >= .19/.0183) = P(Z >= 10.36).
Normally we would then look that number up in a table. However, we know that more than 99.9% of the probability in a standard Normal distribution is between -3.5 and 3.5, so 10.36 is VERY unlikely, so the probability is going to be VERY small.
For part b, what they are trying to get you to see is that 10.36 is VERY unlikely, so the fact that there's a dataset that produces that result means that the true mean is probably not 12.00, because if it were, you would NOT get a sample mean of 12.19. It's just too unlikely. However, consumers are probably not being cheated because it's likely the mean is higher than 12.00 - that's the only way a sample mean of 12.19 could happen.
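To see just how small that probability is, here is a minimal Python sketch. It uses math.erfc for the upper tail instead of 1 minus the cumulative probability, because at a z-score this large the subtraction would round to exactly zero; that choice, and the variable names, are mine.

```python
import math

mu, sigma, n = 12.00, 0.11, 36
observed_mean = 12.19

# Standard deviation of the sample mean: sigma / sqrt(n) = 0.11 / 6
standard_error = sigma / math.sqrt(n)

# Standardize the observed sample mean
z = (observed_mean - mu) / standard_error

# Upper-tail probability P(Z >= z) for a standard Normal, via erfc
p_value = 0.5 * math.erfc(z / math.sqrt(2))

print(round(standard_error, 4), round(z, 2))
print(p_value)
```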
Question:
Type: Hypothesis Test, Sample Mean, Type II Error
The competing hypotheses for the distribution of time are:
Ho: The population of values is represented by a rectangle with base 3 to 7 and height 1/4
H1: The population of values is represented by a rectangle with base 0 to 4 and height 1/4
Reject Ho if service time is 3.2 or more extreme. Shade the region that corresponds to the significance level and clearly label the region.
I know how to draw the picture but I do not know what way to shade for Ho.
Answer:
A type 2 error is one where you fail to reject a false null hypothesis, right? That is, the population is actually the one stated in H1, but you get a value that doesn't make you reject H0. Since you reject H0 if you get a value of 3.2 or less, you DON'T reject H0 if you get a value GREATER than 3.2, right?
HOWEVER, and this is really important: how far to the right of 3.2 do you shade? Do you shade all
the way to 7? NO!!! If H1 is really true, then the distribution is really the one stated in H1, and the only possible values are ones in that distribution. So you shade from 3.2 up to...what?
Question:
Type: Hypothesis Test, 1-tailed vs. two-tailed
Highway safety engineers test new road signs, hoping that increased reflectivity will make them more visible to drivers. Volunteers drive through a test course with several of the new and old style signs and rate which kind shows up best.
a) Is this a one-tailed or two-tailed test? Why?
b) In this context, what would a Type I error be?
c) In this context, what would a Type II error be?
d) In this context, describe what is meant by the power of the test
e) If the hypothesis is tested at the 1% level of significance instead of 5%, how will this affect the power of the test?
Answer:
a) We use a two-tailed test when we don't have any pre-existing ideas of what the outcome will be. That's not the case here, is it? We're assuming that the new signs won't be any worse than the old ones - they'll either be the same or they'll be better, right?
b and c) A Type II error is failing to reject a false null hypothesis, right? So that would be: the new signs are truly better, but the test doesn't demonstrate it. Can you answer the one about a Type I error now?
d) Power is 1 minus the probability of a Type II error. So it's the probability of rejecting a false null hypothesis, so it's the probability of what in terms of the old signs and new signs?
e) This is a good question, but difficult. If we move the level of significance from 5% to 1%, we are lowering the probability of a type I error. That makes it more difficult to reject the Null Hypothesis, even if it really is false, therefore INCREASING our probability of making a type II error. If the probability of a type II error goes UP, then the power goes DOWN.
Question:
Type: Normal Distribution, Percentile
The lifetimes of ZZZ batteries are normally distributed with a mean of 265 hours and a standard deviation
of 10 hours. Find the number of hours that represents the 40th percentile.
Answer:
The 40th percentile is the point at which there is .40 probability below it (or to the left, on a graph). So, in a regular question they'll ask "how much probability is to the left of X" while here they are saying "What is X so that the probability is .40?"
You want to find X so that when you standardize it into a z-score it corresponds to the point where P(z < (x - 265)/10 ) = .40
So find the z-score that corresponds to that probability, say it was -1.5 (it's not, but say that it was), then you'd want your z-score to be -1.5, so you'd solve:
(x - 265)/10 = -1.5 => x = -15 + 265 = 250.
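Here is a minimal Python sketch of the same two steps with the real z-score instead of the pretend -1.5. It uses statistics.NormalDist.inv_cdf in place of reading the Normal table backwards; the variable names are my own.

```python
from statistics import NormalDist

mean_hours, sd_hours = 265, 10

# Step 1: the z-score with .40 probability to its left
z_40 = NormalDist().inv_cdf(0.40)

# Step 2: un-standardize, solving (x - 265) / 10 = z for x
x_40 = mean_hours + z_40 * sd_hours

# Same thing in one step, asking the battery distribution directly
x_40_direct = NormalDist(mu=mean_hours, sigma=sd_hours).inv_cdf(0.40)

print(round(z_40, 3))
print(round(x_40, 2), round(x_40_direct, 2))
```

The z-score comes out negative, which makes sense: the 40th percentile has to sit below the mean.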
Question:
Type: Confidence Interval for a Sample Mean
A sample of the math test scores of 35 fourth-graders has a mean of 82 with a standard deviation of 15.
a. Find the 95% confidence interval of the mean math test scores of all fourth-graders.
b. Find the 99% confidence interval of the mean math test scores of all fourth-graders.
c. Which interval is larger? Explain why.
Answer:
The whole idea of confidence intervals is to give some precision to the estimate you're making. So, someone says, "what's the average score of all fourth-graders" and you say "well, when I looked at only 35 of them I got a mean of 82", that doesn't mean much without some knowledge of how spread out they are. Maybe someone got a 40 or a 50, and others got 100. Or maybe everyone got between 75 and 85. So the idea is to give some level of precision. Keep that in mind when we answer part c.
I'm not sure where you are in your course, so I don't know if you've only talked about the "Standard Normal" distribution, or whether you've covered the "t-distribution" yet. Since the problem doesn't say anything about the population's distribution, typically we'd use the t-distribution (let me know if that doesn't make sense). So in general, the formula should look something like:
x-bar ± t-score * (s/sqrt{n}) where the "t-score" is something from a table of the t-distribution, right?
x-bar is always your sample mean: 82. "s" is the sample standard deviation: 15. And "n" is the size of your sample: 35. The t-score is the point where there is 95% of the probability in between, and only 5% outside of it. In a Standard Normal distribution, that number is 1.96. That is, 95% of the probability is between -1.96 and +1.96. In this problem, however, you have to use a t-distribution with (n-1) degrees of freedom. Let me know if you have trouble doing that.
Now you have all of the things you need (by the way, "sqrt{n}" means "the square root of n"). For the left-hand side of the interval, you'll have:
82 - t-score*(15/sqrt{35}), and for the right-hand side you'll have 82 + t-score*(15/sqrt{35})
The only difference between parts a and b will be the t-score you use. In part b you want the one where 99% of the probability is between (so will that be larger or smaller than the one for 95%? And what will that do to the endpoints of your interval? Make them further out or closer in?).
If you can answer those last questions, you have your answer to part c. If not, ask yourself this: say for example I told you a 95% confidence interval was (75, 85) (that is, "75 to 85"). If you wanted to be MORE sure the interval included the real mean, would it have to be wider or narrower? That is, would you need to include MORE possibilities or fewer?
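The interval formula can be sketched in Python as follows. Python's standard library has no t-table, so the two t-scores below are approximate table values for 34 degrees of freedom that I am supplying as an assumption; check them against your own table. The helper name confidence_interval is also just for illustration.

```python
import math

def confidence_interval(x_bar, s, n, critical):
    """x-bar +/- critical * (s / sqrt(n)), returned as (left, right)."""
    margin = critical * s / math.sqrt(n)
    return (x_bar - margin, x_bar + margin)

x_bar, s, n = 82, 15, 35

# Assumption: approximate t-table values for n - 1 = 34 degrees of freedom
t_95 = 2.032
t_99 = 2.728

ci_95 = confidence_interval(x_bar, s, n, t_95)
ci_99 = confidence_interval(x_bar, s, n, t_99)

width_95 = ci_95[1] - ci_95[0]
width_99 = ci_99[1] - ci_99[0]

print(ci_95, round(width_95, 2))
print(ci_99, round(width_99, 2))
```

The only thing that changes between the two calls is the t-score, so comparing the two widths speaks directly to part c.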
Question:
Type: Hypothesis Test, Two Means
The department at a university is concerned that dual degree students may be receiving
lower grades than the regular MBA students. Two independent random samples have been
selected. 100 observations from population 1 (dual degree students) and 100 from
population 2 (MBA students). The sample means obtained are X1(bar)=85 and X2(bar)=88.
It is known from previous studies that the population variances are 4.0 and 5.0 respectively.
Using a level of significance of .10, is there evidence that the dual degree students are
receiving lower grades? Fully explain your answer.
Answer:
The first thing is to set up the hypothesis you are testing. For the Null Hypothesis (Ho) we always assume that nothing is going on, or no association, or no difference, or whatever. So in this case, you assume that the mean grades for the two groups of students are equal:
Ho: MEAN of dual degree students = MEAN of mba students
For the alternative hypothesis, in this case it's only one-sided, and that's that the dual degree students' grades are lower:
Ha: MEAN dual degree < MEAN mba
Notice that for Ho, if the means were equal, their difference would be zero, and if the dual degree students mean was lower, MEAN dual degree minus MEAN mba would be negative. So you are testing the difference between two sample means. You should have a formula for that in your notes or book, it will look something like:
(xbar1 - xbar2) / sqrt{ some big mess }
and that big mess is a combination of population variances and sample sizes, right? sigma1^2 / n1 + sigma2^2 / n2, does that look familiar? You have all of the information: you have xbar1 and xbar2 (the sample means), sigma1^2 and sigma2^2 (the population variances) and n1 and n2 (the sizes of each sample).
When you plug those in you'll get a (negative) z-score, and then it's a matter of comparing that to the critical value (which will also be negative), and will be the point of the Standard Normal distribution where .10 is to the left of it. (Why?) Notice that because the population variances are known, this is a z-test rather than a t-test, so there are no degrees of freedom to calculate.
If you "reject" or "fail to reject" Ho, be sure to explain it in terms of the problem. That is "given my results, there is evidence (or there is not evidence) that dual degree students have a lower mean grade than MBA students".
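Here is a minimal Python sketch of the test statistic and the cut-off. Since the population variances are given as known, the sketch treats this as a z-test and takes the critical value from the Standard Normal via statistics.NormalDist; the variable names are my own.

```python
import math
from statistics import NormalDist

x_bar1, x_bar2 = 85, 88      # sample means: dual degree, MBA
var1, var2 = 4.0, 5.0        # known population variances
n1, n2 = 100, 100            # sample sizes
alpha = 0.10

# The "big mess" under the square root: var1/n1 + var2/n2
standard_error = math.sqrt(var1 / n1 + var2 / n2)

# Test statistic for Ho: means equal vs. Ha: dual degree mean is lower
z = (x_bar1 - x_bar2) / standard_error

# One-sided critical value: the point with alpha to its LEFT
critical = NormalDist().inv_cdf(alpha)

print(round(standard_error, 2), round(z, 2), round(critical, 3))
print("z is below the critical value:", z < critical)
```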
Question:
Type: Normal Distribution, Empirical Rule
In a set of 90 ACT scores, where the mean is 25 and the standard deviation is 4.69,
(a) how many scores are expected to be lower than 20.31 (one standard deviation below the mean)?
(b) How many of the 90 scores are expected to be below 34.38 (two standard deviations above the mean)?
Answer:
One thing that it doesn't say is what type of distribution the scores have. Can I assume they have a Normal distribution? I'll start as if I can, but at the bottom I'll say something about if I can't assume that.
For all of these, you want to know "how many standard deviations above or below the mean is the number they are giving?" That's because in a Standard Normal (mean=0 and std dev=1) you know what proportion of the data lies above or below any point. And notice that since the mean = 0 and the std dev = 1, the point 1.34, for example, is EXACTLY 1.34 standard deviations above the mean.
It turns out that in this problem they actually do this for you. So to find the answer to "a", you first have to find the PROPORTION of scores more than 1 standard deviation below the mean. For a Normal distribution that's the area to the left of -1.0. Find that proportion, and then multiply it by the total number of scores you have (90) to get the NUMBER of scores (and be sure to round that to a whole number). Part "b" is the same, except that 34.38 is ABOVE the mean, so it's the area to the left of +2.0.
IF you can't assume the distribution is Normal, then you should have some general rules from your notes or the book like: "in any distribution, about X% is within 1 standard deviation of the mean, and about Y% is within 2 standard deviations of the mean", right? Let's say that it's actually 67% of the distribution is within 2 standard deviations of the mean. That means how much is OUTSIDE of 2 std devs? 100% - 67%, or 33%. And if the distribution is symmetric, that means half of that is more than 2 std devs ABOVE the mean, and the other half is more than 2 std devs BELOW the mean. So what proportion is more than 2 std devs above the mean? Half of 33%, or 16.5%. Everything else is below that point, so you'd expect 100% - 16.5% = 83.5% of the 90 scores to be below 34.38.
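Under the Normal assumption, the two counts can be sketched in Python like this. It uses statistics.NormalDist in place of the table, and it takes the area to the left of +2.0 for part b because 34.38 is two standard deviations above the mean; the variable names are my own.

```python
from statistics import NormalDist

n_scores = 90
standard_normal = NormalDist()  # mean 0, std dev 1

# a) 20.31 is 1 std dev BELOW the mean: area to the left of -1.0
prop_below_minus_1 = standard_normal.cdf(-1.0)
count_a = round(n_scores * prop_below_minus_1)

# b) 34.38 is 2 std devs ABOVE the mean: area to the left of +2.0
prop_below_plus_2 = standard_normal.cdf(2.0)
count_b = round(n_scores * prop_below_plus_2)

print(round(prop_below_minus_1, 4), count_a)
print(round(prop_below_plus_2, 4), count_b)
```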
Question:
Type: Normal Distribution, Mean and Sample Mean, Hypothesis Test
Scores for men on the verbal portion of the SAT-I test are normally distributed with a mean of 509 and a standard deviation of 112. Randomly selected men are given the Columbia Review course before taking the SAT test. Assume that the course has no effect.
A) If 1 of the men is randomly selected, find the probability that his score is at least 590.
B) If 16 men are randomly selected, find the probability that their mean score is 590.
C) In finding the probability for part B, why can the central limit theorem be used even though the sample is less than 30?
D) If the random sample of 16 men does result in a mean score of 590, is there strong evidence to support the claim that the course is actually effective? Why/why not?
Answer:
So, the population mean (mu) is 509 and sigma is 112. Men are given a review course, but we're to assume initially that the course has no effect. Again, that's our typical null hypothesis: assume there is no association, no effect, or in general the status quo.
A) P(X > 590) (notice this isn't x-bar because it's not a sample mean. It's just one person from the overall population.) Just standardize this and calculate a p-value: P(X > 590) = P(Z > (590-509)/112) = ...
B) I assume the question is actually "...that their mean score is AT LEAST 590" right? That's similar to part A, except that now we are dealing with a sample mean. So instead of X it's X-bar, so when we standardize to a z-score, we don't divide by 112, we divide by...what? There's a square root of "n" in there, right? Where does that go?
C) The CLT says that the distribution for a sample mean is Normal if one of two things is true: if either the sample is large enough, or...what? Something about the population's distribution...
D) This will come from your p-value in part B. Your previous question showed that you understand p-values, so it's just a matter of making sure you have the right hypothesis test down. Remember that our Null Hypothesis is that there is no effect, or Ho: mean = 509. The alternative is that there is an effect, or that Ha: mean > 509.
I tried to not completely do the problem for you, but instead give you some help to get you there.
Question:
Type: Hypothesis Test, Sample Mean
106 body temperatures with a mean of 98.20F; assume the standard deviation is known to be 0.62F. Consider the claim that the mean body temperature of the population is less than 98.6F. The significance level is .05.
A) what is the test statistic
B) whats the critical value
C) what is the p value
D) what is the conclusion
Answer:
First of all, keep in mind that for Ho we always assume the status quo, or that there is no difference or no effect. Therefore they would look like:
Ho: mean = 98.6 (that is, that the sample of 106 people came from the population who has a mean body temp of 98.6)
Ha: mean < 98.6
And you're given that the population standard deviation is 0.62, and that in a sample of 106 people you got a sample mean of 98.2, right?
A) The test statistic will look like:
{x-bar - mu} / {std dev/sq root(n)} where "x-bar" is the sample mean (which is what?), "mu" is the hypothesized population mean (which is what?), and "std dev" is the known standard deviation (and "n" is the sample size). When I put in the numbers you provide, I get a test statistic of -6.64, which is a huge number...but that's what I get with the info you've provided.
B) 1.96 would actually be the critical value at alpha = .05 if this were a two-sided alternative. It's not. It's one sided, so you need to find a different critical value. Secondly, because the alternative is "less than", the critical value is actually going to be negative. Does that make sense?
C) Assuming the test statistic I got in part A is correct (again, I'm not convinced) the p-value would be so small that it won't be on your Normal Table. the p-value would be < .0001.
D) Instead of just saying "reject" or "fail to reject", you should also state what the conclusion is in terms of the problem. Do you have evidence that this sample did (or did not?) come from a population who has a mean body temp of 98.6?
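If you want to check parts A through C numerically, here is a minimal Python sketch using statistics.NormalDist in place of the Normal table. It assumes the hypothesized mean of 98.6 from Ho above; the variable names are my own.

```python
import math
from statistics import NormalDist

x_bar, mu_0 = 98.20, 98.6    # sample mean and hypothesized mean from Ho
sigma, n = 0.62, 106         # known std dev and sample size
alpha = 0.05

# A) test statistic: (x-bar - mu) / (std dev / sqrt(n))
z = (x_bar - mu_0) / (sigma / math.sqrt(n))

# B) one-sided "less than" critical value: alpha in the LEFT tail
critical = NormalDist().inv_cdf(alpha)

# C) p-value: probability of a z-score this low or lower if Ho is true
p_value = NormalDist().cdf(z)

print(round(z, 2), round(critical, 3))
print(p_value)
```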
Question:
Type: Hypothesis Testing
ABC Casino claims to have the loosest slots around. According to ABC their $1 slots have an average payout of 96 cents with a standard deviation of $15. On my last visit to ABC, I played the $1 slot 10,000 times and lost $3500. Do I have sufficient evidence (at ALPHA= .05) to file a complaint with the Gaming Commission?
Answer:
This is a neat question, and requires some thought.
First of all, what would be sufficient evidence to file a complaint? If there is evidence that their "claim" is false, right? So, what you have to determine is whether your success rate is reasonable given their claim, or unreasonable (or unlikely). This is a classic hypothesis test.
So, what is the hypothesis test? Ho: mean = 0.96 vs. H1: mean < 0.96 (it's one-sided because you would only have a complaint if the average payout is lower than they claim - if it's higher than they claim you'd be ok with that!)
The key here is to realize what info you have. You have the hypothesized mean of THE POPULATION, and you also have a SAMPLE MEAN (that is, the mean from one sample). Those are two different things, right? "mu" and "x-bar". You need to think about what's true between a population (and its mean and standard deviation) and the distribution of possible SAMPLE MEANS.
If Ho were true, what is the likelihood you'd get the sample mean you did? (Which is what? you put in 10,000 and lost 3500, which means you got paid 6500 out of 10,000, or 6500/10000 = .65, right?).
The POPULATION standard deviation is 15, but what's the standard deviation of the sample mean? It's 15/sqrt(10000) = 0.15. You now have the hypothesized mean and standard deviation of the distribution of the sample mean. How do you determine if something is likely to occur in that distribution? You standardize it to a Standard Normal, sometimes it's called creating a z-score.
Remember that in general, a z-score = (x - mu)/std dev . What are "x" (or x-bar in this case), "mu", and "std dev" in this case? When you get that z-score, how do you determine if it's likely or unlikely? You compare it to the cut-off for an unlikely z-score. In this case, since alpha = 0.05 and the test is one-sided, you compare it to a z-score of -1.645 (why is it negative?). You are going to determine that your result is unlikely if your z-score is (greater or less than?) that value. And based on that you'll say whether or not you have sufficient evidence to file a complaint.
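The pieces of this test can be laid out in a short Python sketch. The one-sided cut-off comes from statistics.NormalDist rather than a table, and the variable names are my own; the comparison at the end is left for you to read.

```python
import math
from statistics import NormalDist

plays = 10_000
amount_lost = 3_500
claimed_mean = 0.96     # hypothesized payout per $1 play (Ho)
population_sd = 15      # claimed std dev of a single play
alpha = 0.05

# Sample mean payout: got back 6500 out of 10000 played
x_bar = (plays - amount_lost) / plays

# Standard deviation of the sample mean: 15 / sqrt(10000)
standard_error = population_sd / math.sqrt(plays)

# z-score of the observed sample mean under Ho
z = (x_bar - claimed_mean) / standard_error

# One-sided cut-off with alpha in the left tail
critical = NormalDist().inv_cdf(alpha)

print(x_bar, standard_error)
print(round(z, 3), round(critical, 3))
```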
Question:
Type: Confidence Interval
Suppose you take a single random sample of size 100 from a bell-curved population with a mean of 77 and a std dev of 5, and you get a mean test score in that sample of 76. Is this something you would have expected? Use the standardized score in your answer.
Which would be wider, a 90% confidence interval or a 95% confidence interval? (Assume both of them were calculated using the same sample data.) Explain your answer.
Answer:
The first one deals with "sample means" which have a Normal Distribution if the population they are drawn from has a Normal Distribution (which is the case here). Basically: if you took all of the possible samples of 100 and plotted the means of all of those samples, they would have a Normal Distribution centered at 77, with a standard deviation of...what? Not 5, but 5/sqrt(100), or 5/10 = 1/2. So, is 76 a sample mean that is possible from such a distribution? Well, the "standardized score" is (x-bar - mu)/(sigma/sqrt(n)) or (76 - 77)/(1/2), or -2.0. Is this possible? I'll leave that for you to answer.
For the second one, let me ask you this: if you wanted to be MORE confident that an interval included the right value, would that interval have to be bigger or smaller? That is, what if someone asked you when the next bus was, and you said "um, I think between 3:00 and 3:10" and they replied, "I don't want to miss it - give me an interval you're REALLY sure that if I'm here during that time I won't miss the bus" will that interval be wider or narrower?
OTHER/MISCELLANEOUS