|
Cautionary Comments
-
Weighted data, missing data, small sample sizes, complex sample designs, and
capitalization on chance in fitting a statistical model are sources of
potential problems in data analysis. The Decision Tree does not deal with
these complications. If one of these situations exists, use the Decision Tree
with caution.
-
The statistical measures in the terminal
screens are descriptive of the particular sample being examined. For some
statistical measures, the values obtained will also be a good estimate of the
value in the population as a whole, whereas other statistics may underestimate
or overestimate the population value. In general, the amount of bias is
relatively small, and sometimes adjustments can be made for it. These
adjustments are discussed in some statistical texts but not in this Decision
Tree. If a statistic is a biased estimator of the population value, it is so
noted in the Decision Tree.
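One familiar case of such an adjustment is the variance: dividing the sum of
squared deviations by n describes the sample but tends to underestimate the
population variance, and multiplying by n / (n - 1) corrects for this. The
sketch below is illustrative only; the function name and the data are
assumptions, not part of the Decision Tree or MicrOsiris.

```python
import statistics


def variance_estimates(sample):
    """Return (uncorrected, corrected) variance estimates for a sample."""
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations")
    # Dividing by n describes this particular sample, but as an estimate
    # of the population variance it is biased downward.
    uncorrected = statistics.pvariance(sample)
    # The usual adjustment rescales by n / (n - 1).
    corrected = uncorrected * n / (n - 1)
    return uncorrected, corrected


# Hypothetical scores, used only to show the size of the adjustment.
scores = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
uncorrected, corrected = variance_estimates(scores)
print(uncorrected, corrected)
```

The gap between the two values shrinks as the sample grows, which is why the
bias is usually of practical concern only for small samples.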
-
In principle, a confidence interval may
be placed around any statistic. It is also possible to test the significance
of the difference between values of a statistic calculated for two
non-overlapping groups. These procedures are not indicated in the Decision
Tree but are discussed in standard textbooks.
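As a rough illustration of the first procedure, the sketch below places an
approximate confidence interval around a sample mean. It assumes a simple
random sample and uses the normal approximation; for small samples a
t-based interval would be the more usual choice. The function name and the
data are assumptions made for this example.

```python
import math
import statistics


def mean_confidence_interval(sample, level=0.95):
    """Approximate confidence interval for the mean (normal approximation)."""
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = statistics.fmean(sample)
    # Standard error of the mean, using the n - 1 standard deviation.
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    # Two-sided critical value from the standard normal distribution.
    z = statistics.NormalDist().inv_cdf(0.5 + level / 2)
    return mean - z * standard_error, mean + z * standard_error


# Hypothetical scores, used only for illustration.
scores = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
low, high = mean_confidence_interval(scores)
print(low, high)
```

The same pattern applies to other statistics once a standard error is
available, although the appropriate formula differs from statistic to
statistic.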
-
The Decision Tree does not explicitly consider
possible transformations of the data such as bracketing, using logarithms,
ranking, etc. Transformations may be used to simplify analysis or to bring
data into line with assumptions. For example, it is often possible to
transform scores so that they correspond to a normal distribution, constitute
an interval scale, or relate linearly to another variable. Occasionally, it
may be wise to eliminate cases with extreme values. MicrOsiris can effect
these transformations with the RECODE command. For guidance on selecting
appropriate transformations, see Kruskal (1978).
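The three transformations named above can be sketched in a few lines. This is
a Python illustration of the ideas only, not MicrOsiris RECODE syntax; the
function names and the cutpoint convention are assumptions made for this
example.

```python
import bisect
import math


def log_transform(values):
    """Natural logarithm of each value; all values must be positive."""
    return [math.log(v) for v in values]


def rank_transform(values):
    """Replace each value by its rank (1 = smallest); ties share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of tied values starting at position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def bracket(values, cutpoints):
    """Map each value to a bracket number; cutpoints must be sorted ascending.

    A value equal to a cutpoint falls in the higher bracket.
    """
    return [bisect.bisect_right(cutpoints, v) for v in values]


# Hypothetical incomes, used only for illustration.
incomes = [12000.0, 30000.0, 30000.0, 95000.0]
print(log_transform(incomes))
print(rank_transform(incomes))
print(bracket(incomes, [20000.0, 50000.0]))
```

Ranking and bracketing discard information about the distances between
values, so they are most useful when the original scale cannot be treated as
interval.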
-
Common assumptions for inferences based
on techniques using one or more intervally scaled variables (particularly when
the intervally scaled variable is a dependent variable) include the following:
first, that the observations are independent, i.e., the selection of one case
for inclusion in the sample does not affect the chances of any other
case being included, and the value of a variable for one case in no way
affects the value of the variable for any other case; second, that the
observations are drawn from a population normally distributed on the
intervally scaled variables; and third, if more than one variable is involved,
that the intervally scaled variables have equal variances within categories of
the other variables, i.e., there is homogeneity of variance. Bivariate or
multivariate normality is also sometimes assumed.
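A quick descriptive look at the homogeneity-of-variance assumption is to
compute the variance of the intervally scaled variable within each category
and compare the largest with the smallest. The sketch below does only that;
it is not a formal test, and the function names and data are assumptions
made for this example.

```python
import statistics


def variance_by_group(groups):
    """Sample variance (n - 1 divisor) of the values within each category."""
    return {name: statistics.variance(values) for name, values in groups.items()}


def max_variance_ratio(groups):
    """Ratio of the largest to the smallest within-category variance."""
    variances = variance_by_group(groups)
    return max(variances.values()) / min(variances.values())


# Hypothetical scores for two categories of another variable.
groups = {
    "a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "b": [2.0, 4.0, 6.0, 8.0, 10.0],
}
print(variance_by_group(groups))
print(max_variance_ratio(groups))
```

A ratio near 1 is consistent with equal variances; how large a ratio can be
tolerated depends on the technique and on the group sizes, so standard
textbooks should be consulted before drawing conclusions.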
|