Caution in using statistical test results

Tests of statistical significance have great appeal because they provide researchers with an independent, objective basis for assessing their findings. But they tell us only the probability that our results could have been due to the chance variation that occurs in the selection of a random or probability sample. If that probability is low enough, we can reject the null hypothesis and accept the alternative hypothesis formulated at the beginning of the research project. While such tests are useful, we caution against uncritical dependence on levels of significance in the interpretation of results.

First, statistical tests say nothing about the substantive value or the importance of the results. With a large enough N, almost any result will be statistically significant. But a very weak level of association or a meager difference between means can be significant at the .01 level and still have very little theoretical or practical value. For example, with df = 50, a correlation coefficient of .231 is significant at the .05 level in a one-tailed test. Although significant, this correlation indicates only a slight association between two variables. Its coefficient of determination would be only .053, meaning that only about 5% of the variation in the dependent variable could be attributed to the effect of the independent variable. The result, while significant, would have to be interpreted as indicating only a weak association and would not suggest much theoretical or practical importance.

Second, there are times when it may be appropriate to bend the traditional .05 level of significance in assessing results. Remember, the .05 level is an arbitrary criterion established by the research community. Why not a .07 level or a .03 level? Any other level with a low probability for results occurring due to chance could have been used. By tradition, the research community settled on the .05 level.
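
The correlation example under the first caution can be checked with a few lines of arithmetic. The sketch below is illustrative only and rests on assumptions not stated in the text: the test is one-tailed, the critical value 1.676 is the usual tabled t for df = 50 at the .05 level, and the helper name `r_to_t` and the larger df values in the loop are hypothetical.

```python
import math


def r_to_t(r: float, df: int) -> float:
    """Convert a correlation r to the t statistic for H0: no correlation.

    df is the degrees of freedom of the test (N - 2 for a simple correlation).
    """
    return r * math.sqrt(df / (1.0 - r * r))


r, df = 0.231, 50

# Coefficient of determination: the share of variance "explained".
r_squared = r * r

# Test statistic for this correlation.
t_stat = r_to_t(r, df)

# Assumed tabled critical value: one-tailed t at the .05 level, df = 50.
T_CRIT_ONE_TAILED_05_DF50 = 1.676

print(f"r^2 = {r_squared:.3f}")
print(f"t   = {t_stat:.3f}")
print("significant at .05 (one-tailed):", t_stat > T_CRIT_ONE_TAILED_05_DF50)

# The same weak correlation produces an ever larger t as the sample grows,
# even though the strength of association (r^2) does not change at all.
for bigger_df in (50, 500, 5000):
    print(f"df = {bigger_df:5d}  t = {r_to_t(r, bigger_df):.2f}")
```

The correlation barely clears the critical value, yet r squared stays near 5% no matter how large the sample becomes. That is the gap between statistical and substantive significance described above.
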
What do you do when results fall just short of the .05 level for a test? It is entirely appropriate to point out that the results are "approaching significance" and to report and interpret the result accordingly. For instance, a chi square of 11.35 with 6 degrees of freedom lies between the .10 and .05 levels of significance. You could simply report the result as nonsignificant, showing it as p > .05, retain the null hypothesis, and conclude that the two variables are not related. Or you could report your results more precisely by using the notation .05 < p < .10, which says that, if the null hypothesis were true, the probability of obtaining a chi square at least this large is between .05 and .10. In similar fashion, significance levels can be reported as falling between .20 and .30 or between any other pair of levels.

Third, there are situations where using a statistical test is not appropriate. When data are collected from all members of a population, there is no sampling error. Therefore, there is no basis for making statements about the probability of relationships or differences occurring due to chance. For measurements based on enumeration of a population, the results are parameters. This is not to say that whatever is found is absolutely accurate. There still is the possibility of error in measurement. For more on this point, review the discussion of measurement error in Chapter 7. But whatever is found does not involve sampling error.

Strictly speaking, statistical tests are also inappropriate when nonprobability samples are used. Probability theory, the basis of all tests of statistical significance, rests on the use of probability or random samples. When convenience or other nonprobability techniques of sampling are used, there is no basis for drawing on probability theory. For these samples, there is no way to estimate sampling error and, therefore, statistics from nonprobability samples cannot be used to estimate population parameters. Results from nonprobability samples, however, can be and are frequently reported in the literature. Careful researchers note this fact and interpret their results in theoretical terms or in terms of their practical value.
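
The chi-square example under the second caution can also be verified directly instead of being bracketed from a table. The sketch below is a minimal illustration, not a general-purpose routine: the function name `chi2_sf_even_df` is hypothetical, and it relies on a closed-form series for the upper-tail probability that holds only when the degrees of freedom are even (as they are here, with df = 6).

```python
import math


def chi2_sf_even_df(x: float, df: int) -> float:
    """Upper-tail probability P(X >= x) for a chi-square variable.

    Uses the closed form exp(-x/2) * sum_{i < df/2} (x/2)^i / i!,
    which is valid only for positive, even degrees of freedom.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this shortcut only handles positive even df")
    half = x / 2.0
    term = 1.0   # current series term, (x/2)^i / i!
    total = 1.0  # running sum, starting with the i = 0 term
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


# Values from the text: chi square = 11.35 with 6 degrees of freedom.
p_value = chi2_sf_even_df(11.35, 6)

print(f"p = {p_value:.3f}")
print("between .05 and .10:", 0.05 < p_value < 0.10)
```

The computed probability is roughly .08, so it sits between the two tabled levels. Reporting that value itself conveys more than either "nonsignificant" or the bracketing notation.
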