Cautionary Comments

Weighted data, missing data, small sample sizes, complex sample designs, and capitalization on chance in fitting a statistical model are sources of potential problems in data analysis. The Decision Tree does not deal with these complications. If one of these situations exists, use the Decision Tree with caution.

The statistical measures in the terminal screens are descriptive of the particular sample being examined. For some statistical measures, the values obtained will also be a good estimate of the value in the population as a whole, whereas other statistics may underestimate or overestimate the population value. In general, the amount of bias is relatively small and sometimes adjustments can be made for it. These adjustments are discussed in some statistical texts but not in this Decision Tree. If a statistic is a biased estimator of the population value it is so noted in the Decision Tree.
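
The point about biased estimators can be illustrated with a small simulation. The sketch below is not part of the Decision Tree or of MicrOsiris; the population, sample size, and function name are assumptions chosen only for illustration. It compares the variance computed with a divisor of n, which tends to underestimate the population variance, with the variance computed with a divisor of n - 1, which is the usual adjustment.

```python
import random
import statistics


def average_variance_estimates(population, sample_size, trials, seed=0):
    """Average two variance estimates over many random samples.

    Samples are drawn with replacement so that the draws are independent.
    Returns (mean of n-divisor estimates, mean of (n - 1)-divisor estimates).
    """
    rng = random.Random(seed)
    biased_total = 0.0
    adjusted_total = 0.0
    for _ in range(trials):
        sample = [rng.choice(population) for _ in range(sample_size)]
        biased_total += statistics.pvariance(sample)   # divides by n
        adjusted_total += statistics.variance(sample)  # divides by n - 1
    return biased_total / trials, adjusted_total / trials


# Hypothetical population: the scores 1 through 10, each equally likely.
population = [float(x) for x in range(1, 11)]
true_variance = statistics.pvariance(population)

biased_mean, unbiased_mean = average_variance_estimates(
    population, sample_size=5, trials=20000
)

print("population variance:      ", true_variance)
print("average n-divisor value:  ", round(biased_mean, 2))
print("average (n-1)-divisor value:", round(unbiased_mean, 2))
```

With samples of five cases, the n-divisor estimate should average about four fifths of the population variance, while the adjusted estimate should average close to the population variance itself.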

In principle, a confidence interval may be placed around any statistic. It is also possible to test the significance of the difference between values of a statistic calculated for two nonoverlapping groups. These procedures are not indicated in the Decision Tree but are discussed in standard textbooks.
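
As one illustration of placing an interval around an arbitrary statistic, the sketch below uses the percentile bootstrap. This is a general-purpose resampling technique chosen here for illustration, not a procedure taken from the Decision Tree or from MicrOsiris, and textbooks describe other methods that may be preferable for particular statistics. The data values and the function name bootstrap_ci are hypothetical.

```python
import random
import statistics


def bootstrap_ci(data, statistic, n_resamples=2000, level=0.95, seed=0):
    """Percentile bootstrap interval for any statistic of a single sample.

    The sample is resampled with replacement, the statistic is recomputed
    on each resample, and the interval end points are read off the sorted
    resampled values.
    """
    rng = random.Random(seed)
    n = len(data)
    estimates = sorted(
        statistic([rng.choice(data) for _ in range(n)])
        for _ in range(n_resamples)
    )
    alpha = (1.0 - level) / 2.0
    lower = estimates[int(alpha * n_resamples)]
    upper = estimates[min(n_resamples - 1, int((1.0 - alpha) * n_resamples))]
    return lower, upper


# Hypothetical scores for twelve cases.
scores = [12, 15, 9, 20, 14, 17, 11, 16, 13, 18, 10, 19]
low, high = bootstrap_ci(scores, statistics.median)
print("sample median:", statistics.median(scores))
print("approximate 95% interval:", low, "to", high)
```

Because the statistic is passed in as a function, the same sketch could be applied to a mean, a standard deviation, or another measure. With very small samples the percentile bootstrap can be unreliable, which is consistent with the caution about small sample sizes noted earlier.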

The Decision Tree does not explicitly consider possible transformations of the data, such as bracketing, taking logarithms, or ranking. Transformations may be used to simplify analysis or to bring data into line with assumptions. For example, it is often possible to transform scores so that they correspond to a normal distribution, constitute an interval scale, or relate linearly to another variable. Occasionally, it may be wise to eliminate cases with extreme values. MicrOsiris can effect these transformations with the RECODE command. For guidance on selecting appropriate transformations, see Kruskal (1978).
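
The three transformations named above can be sketched in a few lines. The example below is written in Python purely for illustration; it is not MicrOsiris RECODE syntax, and the income values, cut points, and function names are assumptions. Ties in the ranking step receive the average of the ranks they span, which is one common convention among several.

```python
import math
from bisect import bisect_right


def bracket(value, cut_points):
    """Return the index of the bracket a value falls into.

    cut_points must be sorted. A value equal to a cut point is placed in
    the higher bracket.
    """
    return bisect_right(cut_points, value)


def average_ranks(values):
    """Rank values from 1 upward, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2.0 + 1.0
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


# Hypothetical, strongly skewed income values.
incomes = [1000, 15000, 15000, 98000, 2500000]

brackets = [bracket(x, [10000, 100000]) for x in incomes]  # three brackets
log_incomes = [math.log10(x) for x in incomes]              # compresses the tail
income_ranks = average_ranks(incomes)                       # keeps order only

print(brackets)
print([round(v, 2) for v in log_incomes])
print(income_ranks)
```

Each transformation discards a different amount of information: bracketing keeps only the category, ranking keeps only the order, and the logarithm keeps the full ordering and relative spacing while reducing the influence of extreme values.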

Common assumptions for inferences based on techniques using one or more intervally scaled variables (particularly when the intervally scaled variable is a dependent variable) include the following. First, the observations are independent: the selection of one case for inclusion in the sample does not affect the chances of any other case being included, and the value of a variable for one case in no way affects the value of the variable for any other case. Second, the observations are drawn from a population normally distributed on the intervally scaled variables. Third, if more than one variable is involved, the intervally scaled variables have equal variances within categories of the other variables, i.e., there is homogeneity of variance. Bivariate or multivariate normality is also sometimes assumed.
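
A rough descriptive screen for the homogeneity-of-variance assumption is to compute the variance of the intervally scaled variable within each category and compare the largest with the smallest. The sketch below does only that; it is not a formal significance test and is not a MicrOsiris procedure, and the data, group labels, and function names are hypothetical.

```python
import statistics
from collections import defaultdict


def group_variances(values, groups):
    """Sample variance (n - 1 divisor) of values within each group."""
    by_group = defaultdict(list)
    for value, group in zip(values, groups):
        by_group[group].append(value)
    return {g: statistics.variance(vs) for g, vs in by_group.items()}


def variance_ratio(variances):
    """Ratio of the largest to the smallest within-group variance."""
    return max(variances.values()) / min(variances.values())


# Hypothetical scores for two categories of another variable.
scores = [4.0, 5.0, 6.0, 2.0, 5.0, 8.0]
groups = ["a", "a", "a", "b", "b", "b"]

variances = group_variances(scores, groups)
print(variances)
print("largest / smallest variance:", variance_ratio(variances))
```

A ratio near 1 is consistent with equal variances; a large ratio suggests the assumption deserves closer examination, for example with a formal test described in a statistical text. Each group needs at least two cases for the variance to be defined.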