Hypothesis Testing and Inference
Hypothesis testing is a statistical procedure for deciding whether sample data provide evidence against a given claim (the null hypothesis). The Statistics package provides 11 commonly used statistical tests: 7 standard parametric tests and 4 non-parametric tests.
All tests generate a report of all major calculations to userinfo at level 1 (hence, even if output is suppressed, the reports are still generated). To see the reports, set the information level for Statistics to 1 using the following command.
> infolevel[Statistics] := 1;
1 Tests for Population Mean
Two standard parametric tests are available to test for a population mean given a sample from that population. The OneSampleZTest should be used whenever the standard deviation of the population is known. If the standard deviation is unknown, the OneSampleTTest should be applied instead.
Generate a sample of 100 points from a random variable that represents the sum of two Rayleigh-distributed random variables.
The mean and standard deviation of the population are then known exactly. The standard deviation is approximately 5.28188.
Assuming that we do not know the population mean but we know the standard deviation of the population, test the hypothesis that this sample was drawn from a distribution with mean equal to 12.
Standard Z-Test on One Sample
-----------------------------
Null Hypothesis:
Sample drawn from population with mean 12 and known standard deviation 5.28188
Alt. Hypothesis:
Sample drawn from population with mean not equal to 12 and known standard deviation 5.28188
Sample size: 100
Sample mean: 13.7517
Distribution: Normal(0,1)
Computed statistic: 3.31636
Computed pvalue: 0.000911977
Confidence interval: 12.71643272 .. 14.78689098
(population mean)
Result: [Rejected]
There exists statistical evidence against the null hypothesis
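The figures in the report above can be checked by hand. The following is a minimal sketch in Python (not Maple) that uses only the standard library; the function name one_sample_z_test is our own and is not part of the Statistics package. It recomputes the statistic, p-value, and 95% confidence interval from the rounded summary values printed in the report, so agreement is limited to roughly four significant digits.

```python
from math import sqrt
from statistics import NormalDist

def one_sample_z_test(sample_mean, n, mu0, sigma, confidence=0.95):
    """Two-sided z-test for a population mean with known standard deviation."""
    std_err = sigma / sqrt(n)                      # standard error of the mean
    z = (sample_mean - mu0) / std_err              # test statistic
    std_normal = NormalDist()                      # Normal(0, 1)
    p_value = 2 * (1 - std_normal.cdf(abs(z)))     # two-sided p-value
    half_width = std_normal.inv_cdf(0.5 + confidence / 2) * std_err
    return z, p_value, (sample_mean - half_width, sample_mean + half_width)

# Summary values copied from the report (rounded as printed there).
z, p_value, interval = one_sample_z_test(13.7517, 100, 12, 5.28188)
print(z, p_value, interval)
```

Because the p-value is far below 0.05, the sketch reaches the same conclusion as the report: the null hypothesis of mean 12 is rejected.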
Similarly, if we assume that the standard deviation is unknown, we can apply the one sample t-test on the same hypothesis - this time with a 90% confidence interval.
Standard T-Test on One Sample
-----------------------------
Null Hypothesis:
Sample drawn from population with mean 12
Alt. Hypothesis:
Sample drawn from population with mean not equal to 12
Sample size: 100
Sample mean: 13.7517
Sample standard dev.: 5.14945
Distribution: StudentT(99)
Computed statistic: 3.40165
Computed pvalue: 0.000967459
Confidence interval: 12.89665167 .. 14.60667203
(population mean)
Result: [Rejected]
There exists statistical evidence against the null hypothesis
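The t-test differs from the z-test in two ways: the sample standard deviation replaces the population value, and the statistic is compared against StudentT(n-1) rather than Normal(0,1). The following is a rough Python sketch (standard library only, not Maple). The helper names are our own, and the p-value is obtained by numerically integrating the t density with Simpson's rule, which is an illustrative choice and not necessarily how the Statistics package computes it.

```python
from math import exp, lgamma, log, log1p, pi, sqrt

def student_t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_norm - (df + 1) / 2 * log1p(x * x / df))

def two_sided_t_p_value(t, df, steps=2000):
    """P(|T| > |t|) via composite Simpson's rule on [0, |t|] (steps must be even)."""
    upper = abs(t)
    h = upper / steps
    total = student_t_pdf(0.0, df) + student_t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * student_t_pdf(i * h, df)
    return 1 - 2 * (total * h / 3)

# Summary values copied from the report (rounded as printed there).
n = 100
t_statistic = (13.7517 - 12) / (5.14945 / sqrt(n))
p_value = two_sided_t_p_value(t_statistic, n - 1)
print(t_statistic, p_value)
```

The statistic is slightly larger than the z statistic because the sample standard deviation (5.14945) is smaller than the population value (5.28188), while the heavier tails of the t distribution make the p-value slightly larger.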
2 Tests for the Difference of Two Population Means
Three standard parametric tests are available for testing the difference between two population means when examining two samples. The TwoSampleZTest should be applied when the standard deviation of both populations is known. If the standard deviations are unknown then the TwoSampleTTest is available for unrelated data and the TwoSamplePairedTTest is available for paired data.
Consider three data sets, X, Y, and Z, each containing 10 observations.
Calculate some known quantities for these samples. The reports below give the sample means of X, Y, and Z as 7.6, 7.2, and 8.4, respectively.
Assuming that we do not know the means of the populations from which X and Y were drawn, but we know the standard deviation of each to be 4 and 3 respectively, test the hypothesis that the difference between the means is 3.
Standard Z-Test on Two Samples
------------------------------
Null Hypothesis:
Sample drawn from populations with difference of means equal to 3
Alt. Hypothesis:
Sample drawn from population with difference of means not equal to 3
Sample sizes: 10, 10
Sample means: 7.6, 7.2
Difference in means: 0.4
Distribution: Normal(0,1)
Computed statistic: -1.64438
Computed pvalue: 0.100097
Confidence interval: -2.698975162 .. 3.498975162
(difference of population means)
Result: [Accepted]
There is no statistical evidence against the null hypothesis
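As a cross-check of the report above, here is a minimal Python sketch (standard library only, not Maple) of the two-sample z-test computed from the summary values. The function name two_sample_z_test is our own, and the 95% confidence level is inferred from the interval printed in the report.

```python
from math import sqrt
from statistics import NormalDist

def two_sample_z_test(mean_x, mean_y, n_x, n_y, sigma_x, sigma_y,
                      diff0, confidence=0.95):
    """Two-sided z-test for a difference of means with known standard deviations."""
    std_err = sqrt(sigma_x**2 / n_x + sigma_y**2 / n_y)
    diff = mean_x - mean_y
    z = (diff - diff0) / std_err
    std_normal = NormalDist()
    p_value = 2 * (1 - std_normal.cdf(abs(z)))
    half_width = std_normal.inv_cdf(0.5 + confidence / 2) * std_err
    return z, p_value, (diff - half_width, diff + half_width)

# Sample means and sizes from the report; standard deviations 4 and 3 as assumed in the text.
z, p_value, interval = two_sample_z_test(7.6, 7.2, 10, 10, 4, 3, diff0=3)
print(z, p_value, interval)
```

The hypothesized difference of 3 lies inside the confidence interval, which is consistent with the p-value exceeding 0.05.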
If we now compare samples X and Z under the hypothesis that the difference in means (Mean(X)-Mean(Z)) is 1, and assume we do not know the standard deviation of either population, we can apply the two sample t-test.
Standard T-Test on Two Samples (Unequal Variances)
--------------------------------------------------
Null Hypothesis:
Sample drawn from populations with difference of means equal to 1
Alt. Hypothesis:
Sample drawn from population with difference of means not equal to 1
Sample sizes: 10, 10
Sample means: 7.6, 8.4
Sample standard devs.: 4.24788, 3.97772
Difference in means: -0.8
Distribution: StudentT(17.92283210)
Computed statistic: -0.978107
Computed pvalue: 0.34104
Confidence interval: -4.667499017 .. 3.067499017
(difference of population means)
Result: [Accepted]
There is no statistical evidence against the null hypothesis
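The non-integer degrees of freedom in the report come from the unequal-variances (Welch) form of the test. The following Python sketch (standard library only, not Maple; the function name is our own) recomputes the statistic and the Welch-Satterthwaite degrees of freedom from the rounded summary values in the report.

```python
from math import sqrt

def welch_t_statistic(mean_x, mean_z, sd_x, sd_z, n_x, n_z, diff0):
    """Welch t statistic and approximate degrees of freedom for unequal variances."""
    var_x = sd_x**2 / n_x            # squared standard error contributed by X
    var_z = sd_z**2 / n_z            # squared standard error contributed by Z
    t = (mean_x - mean_z - diff0) / sqrt(var_x + var_z)
    df = (var_x + var_z)**2 / (var_x**2 / (n_x - 1) + var_z**2 / (n_z - 1))
    return t, df

# Summary values copied from the report (rounded as printed there).
t_statistic, df = welch_t_statistic(7.6, 8.4, 4.24788, 3.97772, 10, 10, diff0=1)
print(t_statistic, df)
```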
If the data for X and Z were instead obtained by paired sampling, we can apply the two sample t-test for paired data.
Standard T-Test with Paired Samples
-----------------------------------
Null Hypothesis:
Sample drawn from populations with difference of means equal to 1
Alt. Hypothesis:
Sample drawn from population with difference of means not equal to 1
Sample size: 10
Difference in means: -0.8
Difference std. dev.: 1.31656
Distribution: StudentT(9)
Computed statistic: -4.32346
Computed pvalue: 0.00192341
Confidence interval: -1.741810891 .. .1418108907
(difference of population means)
Result: [Rejected]
There exists statistical evidence against the null hypothesis
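The paired test reaches a different conclusion from the unpaired test even though the difference in means is the same (-0.8), because it uses the standard deviation of the pairwise differences, which is much smaller here. A minimal Python sketch (standard library only, not Maple; the function name is our own) of the statistic, using the values in the report:

```python
from math import sqrt

def paired_t_statistic(mean_diff, sd_diff, n, diff0):
    """t statistic for paired data, compared against StudentT(n - 1)."""
    std_err = sd_diff / sqrt(n)      # standard error of the mean difference
    return (mean_diff - diff0) / std_err

# Values copied from the report: mean and standard deviation of the differences.
t_statistic = paired_t_statistic(mean_diff=-0.8, sd_diff=1.31656, n=10, diff0=1)
print(t_statistic)
```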
3 Tests for Population Variance / Standard Deviation
Two standard parametric tests are available for examining hypotheses regarding the population variance and standard deviation using the variance ratio. The OneSampleChiSquareTest function should be applied when comparing a sample standard deviation against an assumed population standard deviation. When comparing the variances of two independent samples for a specific ratio, the TwoSampleFTest function should be used instead.
Generate a sample from a Maxwell distribution and an Exponential distribution.
The variance of each population is then known exactly.
Consider the hypothesis that S is drawn from a population with a standard deviation of 2 (that is, a variance of 4) and apply the OneSampleChiSquareTest.
Chi-Square Test on One Sample
-----------------------------
Null Hypothesis:
Sample drawn from population with standard deviation equal to 2
Alt. Hypothesis:
Sample drawn from population with standard deviation not equal to 2
Sample size: 100
Sample standard dev.: 1.83342
Distribution: ChiSquare(99)
Computed statistic: 83.1952
Computed pvalue: 0.253798
Confidence interval: 1.609754032 .. 2.129836954
(population standard deviation)
Result: [Accepted]
There is no statistical evidence against the null hypothesis
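The computed statistic in the report is the scaled ratio of the sample variance to the hypothesized variance. A minimal Python sketch (not Maple; the function name is our own) that reproduces it from the report's summary values, with the hypothesized standard deviation of 2 taken from the null hypothesis printed in the report:

```python
def chi_square_variance_statistic(sample_sd, n, sigma0):
    """Statistic (n - 1) * s^2 / sigma0^2, compared against ChiSquare(n - 1)."""
    return (n - 1) * sample_sd**2 / sigma0**2

# Summary values copied from the report (rounded as printed there).
statistic = chi_square_variance_statistic(sample_sd=1.83342, n=100, sigma0=2)
print(statistic)
```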
Now consider the hypothesis that samples S and T were drawn from populations that had a variance ratio of 2. The TwoSampleFTest compares the ratio of the sample variances of S and T against an assumed variance ratio of the populations. Thus, if we were instead to test that the populations had the same variance, we would use an assumed ratio of 1.
F-Ratio Test on Two Samples
---------------------------
Null Hypothesis:
Sample drawn from populations with ratio of variances equal to 2
Alt. Hypothesis:
Sample drawn from population with ratio of variances not equal to 2
Sample sizes: 100, 100
Sample variances: 3.36142, 4.08274
Ratio of variances: 0.823326
Distribution: FRatio(99,99)
Computed statistic: 0.411663
Computed pvalue: 1.45561e-05
Confidence interval: .5539687377 .. 1.223654982
(ratio of population variances)
Result: [Rejected]
There exists statistical evidence against the null hypothesis
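The relationship between the observed ratio, the assumed ratio, and the computed statistic can be sketched in a few lines of Python (not Maple; the function name is our own). The statistic is the observed variance ratio divided by the assumed ratio, so an assumed ratio of 1 leaves the observed ratio unchanged.

```python
def f_ratio_statistic(var_s, var_t, ratio0):
    """Observed variance ratio scaled by the assumed population ratio."""
    return (var_s / var_t) / ratio0

# Sample variances copied from the report (rounded as printed there).
statistic_ratio_2 = f_ratio_statistic(3.36142, 4.08274, ratio0=2)  # hypothesis tested above
statistic_ratio_1 = f_ratio_statistic(3.36142, 4.08274, ratio0=1)  # equal-variance hypothesis
print(statistic_ratio_2, statistic_ratio_1)
```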
4 Tests for Normality
The Statistics package provides an implementation of Shapiro and Wilk's W-test for normality. This test is used to determine if a provided sample could be considered to be drawn from a normal distribution.
Generate a sample of twenty points from a normal distribution and another from a uniform distribution.
Consider the hypothesis that S is drawn from a normal distribution and apply Shapiro and Wilk's W-test.
Shapiro and Wilk's W-Test for Normality
---------------------------------------
Null Hypothesis:
Sample drawn from a population that follows a normal distribution
Alt. Hypothesis:
Sample drawn from population that does not follow a normal distribution
Sample size: 20
Computed statistic: 0.972002
Computed pvalue: 0.784909
Result: [Accepted]
There is no statistical evidence against the null hypothesis
Apply the same hypothesis with regards to the data drawn from the uniform distribution.
Shapiro and Wilk's W-Test for Normality
---------------------------------------
Null Hypothesis:
Sample drawn from a population that follows a normal distribution
Alt. Hypothesis:
Sample drawn from population that does not follow a normal distribution
Sample size: 20
Computed statistic: 0.889151
Computed pvalue: 0.0259262
Result: [Rejected]
There exists statistical evidence against the null hypothesis
5 Tests for Goodness-of-Fit
The Statistics package provides two methods of testing goodness-of-fit. The ChiSquareGoodnessOfFitTest function should be used to determine if an observed or empirical data set fits expected values for that data set. Similarly, the ChiSquareSuitableModelTest is available for testing how well a given probability distribution approximates a data sample.
Consider the number of sales made on each day of the week at a jewelry store, tallied over one sales week (Monday to Saturday).
We wish to test the hypothesis that sales are uniformly distributed throughout the week. The expected number of sales per day is then given by the number of sales averaged over the week.
We now test the hypothesis (using ChiSquareGoodnessOfFitTest) that the observed number of sales per day is consistent with a uniformly distributed number of sales each day.
Chi-Square Test for Goodness-of-Fit
-----------------------------------
Null Hypothesis:
Observed sample does not differ from expected sample
Alt. Hypothesis:
Observed sample differs from expected sample
Categories: 6
Distribution: ChiSquare(5)
Computed statistic: 5
Computed pvalue: 0.41588
Critical value: 11.07049741
Result: [Accepted]
There is no statistical evidence against the null hypothesis
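The mechanics of the goodness-of-fit test can be sketched in Python (standard library only, not Maple). The daily sales counts below are hypothetical: they were chosen only so that the statistic equals 5 across 6 categories, as in the report, and are not the worksheet's data. The helper names are our own, and the p-value uses a closed-form expression for the chi-square upper tail that holds for odd degrees of freedom.

```python
from math import erfc, exp, pi, sqrt

def chi_square_sf_odd_df(x, df):
    """Upper-tail probability of ChiSquare(df) for odd df (closed-form series)."""
    if df < 1 or df % 2 != 1:
        raise ValueError("df must be a positive odd integer")
    tail = erfc(sqrt(x / 2))                   # exact tail for df = 1
    term = sqrt(2 * x / pi) * exp(-x / 2)      # increment from df = 1 to df = 3
    for j in range(1, (df - 1) // 2 + 1):
        tail += term
        term *= x / (2 * j + 1)                # next increment (df -> df + 2)
    return tail

def goodness_of_fit_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# HYPOTHETICAL counts for Monday to Saturday (not the worksheet's data).
observed = [25, 15, 25, 15, 20, 20]
expected = [sum(observed) / len(observed)] * len(observed)   # uniform sales

statistic = goodness_of_fit_statistic(observed, expected)
p_value = chi_square_sf_odd_df(statistic, len(observed) - 1)
print(statistic, p_value)
```

With these hypothetical counts the p-value is well above 0.05, which corresponds to the [Accepted] result in the report.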
Hence we conclude that a uniformly distributed number of sales is a reasonable claim.
Consider a data set of the times during a day at which sales are made. Determine if sales are uniformly distributed during the day (consider an 8-hour working day, with each sale time recorded as the number of hours into the day, between 0.0 and 8.0). The data in this case are continuous and we are testing against a uniform probability distribution.
Apply the chi square suitable model test to determine if a uniform distribution closely matches the provided data.
Chi-Square Test for Suitable Probability Model
----------------------------------------------
Null Hypothesis:
Sample was drawn from specified probability distribution
Alt. Hypothesis:
Sample was not drawn from specified probability distribution
Bins: 4
Distribution: ChiSquare(3)
Computed statistic: 9.5191
Computed pvalue: 0.023129
Critical value: 7.814728288
Result: [Rejected]
There exists statistical evidence against the null hypothesis
Hence we conclude that the sale times are not uniformly distributed throughout the day. Closer examination of the data reveals that most of the sales were made roughly halfway through the day.
6 Tests for Independence in a Two-Way Table
The Statistics package contains the ChiSquareIndependenceTest function, which is used to determine if two attributes are independent of one another.
Consider a sample of 476 patients who are part of a survey to determine if a new drug is effective at fighting a new disease. Patients are randomly given either the new drug or a placebo, and the number of patients who recover in each group is tabulated.
Construct the two-way table for this result.
Finally, apply the chi square test for independence to test the hypothesis that recovery is independent of the treatment given, that is, that the drug has no effect on the recovery rate from the disease.
Chi-Square Test for Independence
--------------------------------
Null Hypothesis:
Two attributes within a population are independent of one another
Alt. Hypothesis:
Two attributes within a population are not independent of one another
Dimensions: 2
Total Elements: 476
Distribution: ChiSquare(1)
Computed statistic: 5.26704
Computed pvalue: 0.0217328
Critical value: 3.84146
Result: [Rejected]
There exists statistical evidence against the null hypothesis
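The test for independence compares each cell of the two-way table with the count expected if the row and column attributes were independent. The following Python sketch (standard library only, not Maple) illustrates this. The 2 x 2 table is hypothetical: it sums to 476 patients but is not the worksheet's data, so its statistic differs from the one in the report. The function names are our own.

```python
from math import erfc, sqrt
from statistics import NormalDist

def chi_square_independence_statistic(table):
    """Chi-square statistic for independence in a two-way table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total  # count under independence
            statistic += (observed - expected) ** 2 / expected
    return statistic

def chi_square_sf_df1(x):
    """Upper-tail probability of ChiSquare(1)."""
    return erfc(sqrt(x / 2))

# HYPOTHETICAL table: rows = drug, placebo; columns = recovered, not recovered.
hypothetical_table = [[150, 88], [126, 112]]
hypothetical_statistic = chi_square_independence_statistic(hypothetical_table)

# The p-value and 5% critical value for ChiSquare(1) can be checked from the
# statistic printed in the report.
reported_p_value = chi_square_sf_df1(5.26704)
critical_value = NormalDist().inv_cdf(0.975) ** 2
print(hypothetical_statistic, reported_p_value, critical_value)
```

For a 2 x 2 table there is (2-1)(2-1) = 1 degree of freedom, which is why the report uses ChiSquare(1). The reported statistic exceeds the 5% critical value of about 3.84, so the null hypothesis is rejected.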
Thus we conclude that there exists statistical evidence in favor of the drug having an effect on recovery rate. Closer examination reveals that the drug improves a patient's chance of recovery from the disease.
7 Output Options
The default output from each test is a report containing expressions of the form name = value for key output from the test. Using the output option, specific values can be returned instead.
Consider the following data set.
Apply the one sample t-test on this data to test for a population mean of 5:
A value of true for the hypothesis indicates that there is no statistical evidence against the null hypothesis; a value of false indicates that such evidence exists. If we are only interested in the confidence interval from this calculation, we can use the option output=confidenceinterval.
A list of valid output options is available on the help page for each test.