Methods for Social Researchers in Developing Countries




 

Calculating rho

The formula for rho is:

$$\mathrm{rho} = 1 - \frac{6\sum d^2}{N(N^2 - 1)} = 1 - \frac{6(28.50)}{8(64 - 1)} = 1 - \frac{171.0}{504} = 1 - .339 = .661$$

Like r, rho can vary from +1.0 to -1.0. The size of rho depends on the differences between the paired ranks and on the N for the analysis. As differences between ranks increase, so do their squares, and the top of the fraction grows. As N increases, the bottom of the fraction, N(N² - 1), grows, which reduces the result of dividing 6 times the sum of the squared differences by it. As the result of this division decreases, rho increases because a smaller number is subtracted from the 1 at the beginning of the formula.

If ranks for each set of scores match exactly, the differences in ranks will equal 0. Therefore, rho will equal 1, representing a perfect correlation. As differences in ranks increase, rho will decrease.
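The hand calculation can be checked with a few lines of Python. This is a minimal sketch, not part of the original chapter: the function names `spearman_rho` and `sum_squared_rank_differences` are our own, and the only inputs taken from the text are the sum of squared differences (28.50) and the N of 8 pairs.

```python
def sum_squared_rank_differences(ranks_x, ranks_y):
    """Sum of squared differences between paired ranks (hypothetical helper)."""
    if len(ranks_x) != len(ranks_y):
        raise ValueError("each rank in X needs a matching rank in Y")
    return sum((x - y) ** 2 for x, y in zip(ranks_x, ranks_y))


def spearman_rho(sum_d_squared, n):
    """Spearman's rho from the sum of squared rank differences and N pairs."""
    if n < 2:
        raise ValueError("need at least two pairs")
    return 1 - (6 * sum_d_squared) / (n * (n ** 2 - 1))


# Values from the worked example in the text: sum of d squared = 28.50, N = 8.
rho = spearman_rho(28.50, 8)
print(round(rho, 3))  # 0.661

# Identical ranks give differences of 0, so rho is 1.
same = [1, 2, 3, 4, 5, 6, 7, 8]
print(spearman_rho(sum_squared_rank_differences(same, same), 8))  # 1.0
```

Passing in the sum of squared differences keeps the sketch tied to the figures shown in the formula; with your own data you would compute that sum from the two columns of ranks first.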

The critical values for rho are provided in most statistical textbooks. Look for a table of values of the Spearman Rank Order Correlation Coefficient. With rho, critical values are associated with the number of pairs used. In our example, there were 8 pairs of scores. For 8 pairs, the critical value of rho at the .05 level is .643. The rho we found was .661, which exceeds this critical value. Therefore, we can reject the null hypothesis and accept the alternative hypothesis that grades in subject X are associated with those in subject Y.
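The decision rule just described can be written out as a short sketch. The table below holds only the single critical value quoted in the text (.643 for 8 pairs at the .05 level); the names `CRITICAL_RHO_05` and `is_significant` are our own, and values for other numbers of pairs would have to be copied from a published table.

```python
# Critical values of rho at the .05 level, keyed by number of pairs.
# Only the value quoted in the text is included; extend from a published table.
CRITICAL_RHO_05 = {8: 0.643}


def is_significant(rho, n_pairs, table=CRITICAL_RHO_05):
    """Return True if the observed rho is at least the tabled critical value."""
    if n_pairs not in table:
        raise KeyError(f"no critical value stored for {n_pairs} pairs")
    return rho >= table[n_pairs]


print(is_significant(0.661, 8))  # True: reject the null hypothesis
```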

Web-based calculation of rho

Web calculators are also available for calculating rho. We used one titled Spearman Rank Order Correlation Coefficient. You can do the calculation by entering raw scores for X and Y or by entering ranks for each score. We checked our hand calculation with this calculator and got a slightly different result: the calculator gave a rho of 0.6504, compared with the 0.661 we got. This difference occurred because the Web-based calculator carried its calculations to more decimal places than we did, but the important result was the same: rho was significant at the .05 level.

Caution with association

Finding a strong correlation between variables shows only that the variables are related. Such a finding is no basis for claiming that the independent variable caused the changes in the dependent variable. Claims for cause and effect can only be made when the effects of all other independent influences on the dependent variable are eliminated. This is extremely hard to do in social research. Therefore, as we pointed out in the section on "Cause and Effect" in Chapter 3, social scientists are very careful when they interpret correlation results. Finding associations is important. These results add to our knowledge of social relationships. But as every student in statistics learns: Correlation is not the basis for causation.

Chi square

Chi square, written as χ², is frequently used for testing for association between two nominal or two ordinal variables. It tells you how probable it is that an observed covariation between two variables is the result of chance, due to the particular sample you selected, rather than a real relationship between them. It will not tell you the strength of association between the two variables. Other tests, such as the coefficient of contingency, described later, can be used for this purpose. This and other statistical tests are described in most statistical textbooks and on many Web sites, including those cited at the end of this chapter.

Chi square can also be calculated for a bivariate table of any size containing two nominal variables or a nominal and an ordinal variable. Table 19.7 shows a 3x2 table for an ordinal variable, socio-economic status of fathers, expressed in three levels, and a nominal variable, their attitude toward completion of schooling by daughters, which is expressed as yes (to complete secondary school) or no (not to complete secondary school). The upper half of the table gives the observed frequencies for the relationship between these two variables. A quick calculation shows that the percentage of fathers who said "yes" increased directly with the socio-economic status of the families (low, 44%; middle, 67%; high, 75%). But, assuming random samples were used to select the fathers, are these differences too large to be attributed to sampling error? We can find out by conducting a chi square test.
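As a rough illustration of what such a test computes, the sketch below calculates the chi square statistic and degrees of freedom for a table of any size, using the usual expected frequency for each cell (row total times column total, divided by the grand total). The function name `chi_square` is our own, and the counts are hypothetical: they are not the frequencies in Table 19.7 and were chosen only to keep the arithmetic easy to follow.

```python
def chi_square(observed):
    """Chi square statistic and degrees of freedom for a table of counts.

    `observed` is a list of rows, each row a list of cell frequencies.
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            # Frequency expected in this cell if the variables were unrelated.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (obs - expected) ** 2 / expected

    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df


# HYPOTHETICAL counts, not Table 19.7.
# Rows: low, middle, high status; columns: yes, no.
hypothetical = [
    [20, 30],
    [30, 20],
    [40, 10],
]
stat, df = chi_square(hypothetical)
print(round(stat, 2), df)  # 16.67 2
```

The statistic is then compared with the tabled critical value of chi square for the given degrees of freedom, in the same way the observed rho was compared with its critical value.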
