GENERAL DESCRIPTION
MCA examines the relationships between several categorical independent variables and a single dependent variable, and determines the effects of each predictor before and after adjustment for its inter-correlations with other predictors in the analysis. It also provides information about the bivariate and multivariate relationships between the predictors and the dependent variable. See Andrews, et al., Multiple Classification Analysis, for a complete description of the methodology used.
COMMAND FEATURES
Missing Data: Cases with missing data on the independent variables may be eliminated (see DELETE keyword). Cases with missing data on the dependent variable are automatically excluded from the analysis. MCA produces RECODE control statements for computing residuals if requested.
PRINTED OUTPUT
Dependent Variable Statistics: For the dependent variable (Y):
Grand mean
Standard deviation (square root of the unbiased estimator of the population variance)
Sum of Y
Sum of Y-squared
Total sum of squares
Explained sum of squares
Residual sum of squares
Number of cases used in the analysis
The sum of weights
Independent Variable Category Statistics: For each category of an independent variable:
The number of cases (raw, weighted, and percentages)
Mean and standard deviation
Deviation of the category mean (unadjusted and adjusted)
Adjusted class mean
MCA coefficient
Eta and eta squared
Partial beta and beta-squared coefficients
Unadjusted and adjusted sum of squares
Bivariate frequency tables for every pair of predictors (optional)
One-Way Analysis of Variance Summary Statistics: If only one independent variable is specified, the following are printed:
Eta squared
Adjustment factor
Adjusted eta and eta squared
Total sum of squares
Between-mean sum of squares
Within-groups sum of squares
F value (degrees of freedom are printed)
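The one-way statistics listed above can be sketched in a few lines of Python. This is an unweighted illustration only, not the program's own code. The function name one_way_summary is hypothetical, and the adjustment factor (N - 1) / (N - k) is an assumption chosen because it agrees with the adjusted eta-squared values printed in Example 1.

```python
from collections import defaultdict
import math


def one_way_summary(y, groups):
    """Unweighted one-way ANOVA summary for values y and category codes groups."""
    n = len(y)
    grand_mean = sum(y) / n

    # Collect the dependent-variable values for each category.
    by_group = defaultdict(list)
    for value, code in zip(y, groups):
        by_group[code].append(value)
    k = len(by_group)

    total_ss = sum((v - grand_mean) ** 2 for v in y)
    between_ss = sum(
        len(vs) * (sum(vs) / len(vs) - grand_mean) ** 2 for vs in by_group.values()
    )
    within_ss = total_ss - between_ss

    eta_sq = between_ss / total_ss
    adjustment = (n - 1) / (n - k)  # assumed form of the adjustment factor
    adj_eta_sq = 1 - (1 - eta_sq) * adjustment
    f_value = (between_ss / (k - 1)) / (within_ss / (n - k))

    return {
        "eta_sq": eta_sq,
        "adjustment": adjustment,
        "adj_eta_sq": adj_eta_sq,
        "adj_eta": math.sqrt(max(adj_eta_sq, 0.0)),
        "total_ss": total_ss,
        "between_ss": between_ss,
        "within_ss": within_ss,
        "f_value": f_value,
        "df": (k - 1, n - k),
    }


summary = one_way_summary([1, 2, 3, 4, 5, 6], ["a", "a", "a", "b", "b", "b"])
print(summary["between_ss"], summary["within_ss"], summary["f_value"], summary["df"])
```

For the six-case illustration, the between-mean sum of squares is 13.5 and the within-groups sum of squares is 4, giving F = 13.5 on (1, 4) degrees of freedom.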
RESIDUAL RECODE CONTROL STATEMENT OUTPUT
RECODE control statements to compute predicted and residual values based on the MCA regression may be written to the file assigned to RESIDUAL (keyword RESIDUALS). These statements may be used with LISTDATA to list the residuals or with TRANS to create a permanent residuals dataset.
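As an illustration of what the predicted and residual values represent, the sketch below applies the additive MCA model (grand mean plus one coefficient per predictor) in Python. It does not reproduce RECODE syntax. The grand mean and coefficients are a small subset copied from the Example 1 listing; the function names and the sample case, including its income of 25000, are hypothetical.

```python
# Grand mean and a few coefficients copied from the Example 1 listing.
GRAND_MEAN = 10528.32
COEFFICIENTS = {
    "V251": {2: 7577.927, 9: -5901.890},  # occupation
    "V30": {1: 1123.470, 3: -2956.380},   # marital status
    "V32": {2: -2085.182, 8: 7518.277},   # education
}


def predicted(case):
    """Grand mean plus the coefficient of the case's category on each predictor."""
    return GRAND_MEAN + sum(
        coefs[case[var]] for var, coefs in COEFFICIENTS.items()
    )


def residual(case, depv="V268"):
    """Observed value of the dependent variable minus the predicted value."""
    return case[depv] - predicted(case)


# Hypothetical case: occupation 2, marital status 1, education 8.
case = {"V251": 2, "V30": 1, "V32": 8, "V268": 25000.0}
print(round(predicted(case), 3), round(residual(case), 3))
```

Here the predicted income is 10528.32 + 7577.927 + 1123.470 + 7518.277, about 26747.99, so the hypothetical case has a residual of about -1747.99.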
INPUT DATA
The dependent variable must be measured on an interval scale or be a dichotomy. Predictor variables must be categorical, preferably with six or fewer categories. When there is more than one predictor, all predictor codes must be in the range 0 to 31.
RESTRICTIONS
1. Predictor categories must be in the range 0 to 31 when more than one predictor is defined.
2. The total number of predictor codes, obtained by summing the number of codes for each predictor, must be less than or equal to the value assigned to the MAXC parameter. Thus, if there are two predictors, one with codes 0,1,2 and the other with codes 1,2,3,4, set MAXC to 7.
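The two restrictions can be checked before a run with a short Python sketch. The helper name required_maxc and the dictionary layout are hypothetical; the code simply counts the distinct codes of each predictor and applies the 0 to 31 range check when more than one predictor is present.

```python
def required_maxc(predictor_codes):
    """Return the total number of distinct codes across all predictors.

    predictor_codes maps a predictor name to the codes that occur in the data.
    """
    total = 0
    for name, codes in predictor_codes.items():
        distinct = set(codes)
        # Restriction 1 applies only when more than one predictor is defined.
        if len(predictor_codes) > 1:
            bad = sorted(c for c in distinct if not 0 <= c <= 31)
            if bad:
                raise ValueError(f"{name}: codes outside the range 0 to 31: {bad}")
        total += len(distinct)
    return total


# The two predictors from restriction 2: codes 0,1,2 and codes 1,2,3,4.
print(required_maxc({"A": [0, 1, 2], "B": [1, 2, 3, 4]}))  # 7
```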
CONTROL STATEMENTS
Filter Statement (optional)
Title Statement
Parameter Statement
CRITERION=n
Tolerance of the convergence test selected. Range: 0.0 to 1.0.
Default: CRITERION=.005.
DELETE=(MD1,MD2)
MD1 Delete all cases where any independent variable equals its first missing-data code.
MD2 Delete all cases where any independent variable equals its second missing-data code.
DEPV=variable number
The dependent variable.
MAXC=n
The maximum total number of predictor codes for all predictors (see restriction 2).
Default: MAXC=99.
MAXI=n The maximum number of iterations.
Default: 25 iterations.
PRINT=(DICT|CODEBK,TABLES,TRACE)
DICT Print the input dictionary.
CODEBK Print the input dictionary and codebook records.
TABLES Print the pair-wise cross-tabulations of the independent variables.
TRACE Print the coefficients from all iterations.
RECODE=n Use RECODE n, previously entered via the RECODE command.
RESIDUALS Write RECODE control statements for computing predicted and residual values, to the file assigned to RESIDUAL. The predicted value variable number will be R10000 and the residual value variable number will be R10001.
TEST=%MEAN|CUTOFF|%RATIO
The convergence test desired. If not specified, MCA iterates until the maximum number of iterations (MAXI) is exceeded.
%MEAN Test whether the change in all coefficients from one iteration to the next is below a specified fraction of the grand mean (see CRITERION keyword).
CUTOFF Test whether the change in all coefficients from one iteration to the next is less than a specified value (see CRITERION keyword).
%RATIO Test whether the change is less than a specified fraction of the ratio of the standard deviation of the dependent variable to its mean (see CRITERION keyword).
VARS=variable numbers
The list of independent variables. One-way analysis of variance is performed if only one variable is specified.
WTVAR=n Use variable n as a weight variable.
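The interplay of TEST, CRITERION, and MAXI can be illustrated with a small Python sketch. It is not the program's algorithm: the unweighted fitting loop is an assumption (each predictor's coefficients are re-estimated in turn as category means of what the grand mean and the other predictors leave unexplained), and the function names are hypothetical. Only the three stopping rules follow the keyword descriptions above.

```python
import statistics


def converged(max_change, test, criterion, mean, sd):
    """Apply the stopping rule selected by TEST to the largest coefficient change."""
    if test is None:  # no test: run until MAXI iterations are done
        return False
    if test == "%MEAN":
        return max_change < criterion * abs(mean)
    if test == "CUTOFF":
        return max_change < criterion
    if test == "%RATIO":  # undefined when the mean is zero
        return max_change < criterion * abs(sd / mean)
    raise ValueError(f"unknown TEST value: {test}")


def fit_mca(y, predictors, test="%MEAN", criterion=0.005, maxi=25):
    """Assumed iterative fit; predictors maps a name to one code per case."""
    n = len(y)
    mean = sum(y) / n
    sd = statistics.stdev(y)
    coef = {name: {c: 0.0 for c in set(codes)} for name, codes in predictors.items()}
    for iteration in range(1, maxi + 1):
        max_change = 0.0
        for name, codes in predictors.items():
            sums = {c: 0.0 for c in coef[name]}
            counts = {c: 0 for c in coef[name]}
            for i in range(n):
                # Part of the prediction contributed by the other predictors.
                others = sum(
                    coef[o][predictors[o][i]] for o in predictors if o != name
                )
                sums[codes[i]] += y[i] - mean - others
                counts[codes[i]] += 1
            for c in coef[name]:
                new = sums[c] / counts[c]
                max_change = max(max_change, abs(new - coef[name][c]))
                coef[name][c] = new
        if converged(max_change, test, criterion, mean, sd):
            break
    return mean, coef, iteration


# Illustrative balanced design with exactly additive effects.
a = {0: -1.0, 1: 1.0}
b = {0: -2.0, 1: 0.0, 2: 2.0}
cells = [(i, j) for i in a for j in b]
y = [10.0 + a[i] + b[j] for i, j in cells]
predictors = {"A": [i for i, _ in cells], "B": [j for _, j in cells]}
mean, coef, iterations = fit_mca(y, predictors, test="CUTOFF", criterion=0.005)
print(mean, coef, iterations)
```

With TEST omitted (test=None in the sketch) the loop always runs for the full MAXI iterations, which matches the behavior described for the TEST keyword.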
REFERENCES
Andrews, F. M., J. N. Morgan, J. A. Sonquist, and L. Klem. Multiple Classification Analysis. Second edition.
EXAMPLES
Example 1: Predicting income (V268) from V251 (occupation), V30 (marital status), and V32 (education).
File assignments: dictin=scf.dic datain=scf.dat
Filter statement: include v37=1
Page title: PREDICTING INCOME
Parameter statement: print=(dict) depv=v268 vars=v251,v30,v32 delete=(md1,md2) test=%mean
*** MCA -- MULTIPLE CLASSIFICATION ANALYSIS ***
PREDICTING INCOME
Number of variables: 4
Variables containing invalid characters will
be treated as missing data
The data are not weighted
For the independent variables, cases with
MD1 or MD2 values will be deleted
The iteration maximum is 25
The convergence test is %MEAN
The tolerance factor is .00500
INPUT DICTIONARY:
VNUM NAME TYPE LOC WID NDEC MD1 MD2 REFNO
V30 MARITAL STATUS I 9 2 0 9 30
V32 EDUC OF HEAD I 11 2 0 9 32
V37 RACE I 13 2 0 9 37
V251 OCCUPATION B I 25 2 0 251
V268 TOTAL FAMILY INC I 27 4 0 268
0 cases deleted due to missing data on the dependent variable.
0 cases deleted due to missing data on the independent variables.
0 cases deleted due to predictor codes outside the range 0 to 31.
299 cases were used in the analysis.
RESULTS BASED ON ITERATION 6
DEPENDENT VARIABLE (Y) = V268 TOTAL FAMILY INC
MEAN 10528.32
STANDARD DEVIATION 7553.407
SUM OF Y 3147968.
SUM OF
TOTAL SUM OF SQUARES .1700208E+11
EXPLAINED SUM OF SQUARES .8352816E+10
RESIDUAL SUM OF SQUARES .8649263E+10
NUMBER OF CASES 299
PREDICTOR V251 OCCUPATION B
                                             UNADJUSTED
       NO OF   SUM OF             CLASS  DEVIATION FROM
CLASS  CASES  WEIGHTS      %       MEAN      GRAND MEAN  COEFFICIENT
    0     68       68   22.7   4592.206       -5936.115    -4256.094
    1     30       30   10.0   16396.07        5867.746     1165.547
    2     22       22    7.4   19716.09        9187.770     7577.927
    3     14       14    4.7   15615.71        5087.393     3987.124
    4     22       22    7.4   9988.636       -539.6847     547.4017
    5     42       42   14.0   12596.05        2067.727     1663.999
    6     36       36   12.0   10407.06       -121.2655     461.7471
    7     36       36   12.0   7910.333       -2617.988    -1574.841
    8     21       21    7.0   11960.00        1431.679     1774.740
    9      8        8    2.7   4009.000       -6519.321    -5901.890
                       STANDARD
CLASS  ADJUSTED MEAN  DEVIATION
    0       6272.228   4161.586
    1       11693.87   9158.358
    2       18106.25   6896.417
    3       14515.45   11944.88
    4       11075.72   5269.902
    5       12192.32   5372.033
    6       10990.07   4254.318
    7       8953.480   5063.992
    8       12303.06   6163.097
    9       4626.431   2196.427
ETA-SQUARE = .380238 BETA-SQUARE .195452
ETA = .616634 BETA .442099
ETA-SQUARE (ADJ) = .360938
ETA (ADJ) = .600781
UNADJUSTED DEVIATION SS = .646484E+10
ADJUSTED DEVIATION SS = .332309E+10
PREDICTOR V30 MARITAL STATUS
                                             UNADJUSTED
       NO OF   SUM OF             CLASS  DEVIATION FROM
CLASS  CASES  WEIGHTS      %       MEAN      GRAND MEAN  COEFFICIENT
    1    221      221   73.9   12449.90        1921.575     1123.470
    2     17       17    5.7   7115.882       -3412.439    -2828.932
    3     41       41   13.7   3732.463       -6795.858    -2956.380
    4     16       16    5.4   5748.750       -4779.571    -4603.841
    5      4        4    1.3   7640.000       -2888.321    -1330.495
                       STANDARD
CLASS  ADJUSTED MEAN  DEVIATION
    1       11651.79   7563.060
    2       7699.389   4465.809
    3       7571.941   2752.520
    4       5924.480   4340.339
    5       9197.826   8306.206
ETA-SQUARE = .194470 BETA-SQUARE .658475E-01
ETA = .440988 BETA .256608
ETA-SQUARE (ADJ) = .183511
ETA (ADJ) = .428382
UNADJUSTED DEVIATION SS = .330640E+10
ADJUSTED DEVIATION SS = .111955E+10
PREDICTOR SUMMARY STATISTICS
PREDICTOR V32 EDUC OF HEAD
                                             UNADJUSTED
       NO OF   SUM OF             CLASS  DEVIATION FROM
CLASS  CASES  WEIGHTS      %       MEAN      GRAND MEAN  COEFFICIENT
    1     16       16    5.4   5973.375       -4554.946    -564.7311
    2     71       71   23.7   6579.493       -3948.828    -2085.182
    3     44       44   14.7   11013.86        485.5426     397.8526
    4     70       70   23.4   10257.70       -270.6211    -789.0604
    5     37       37   12.4   11210.03        681.7060    -1273.955
    6     30       30   10.0   14161.87        3633.546     2836.744
    7     17       17    5.7   16022.71        5494.385     3034.737
    8     14       14    4.7   19327.71        8799.393     7518.277
                       STANDARD
CLASS  ADJUSTED MEAN  DEVIATION
    1       9963.590   6006.004
    2       8443.139   4868.404
    3       10926.17   8730.284
    4       9739.261   6009.121
    5       9254.365   5760.727
    6       13365.06   7470.542
    7       13563.06   6769.267
    8       18046.60   12470.24
ETA-SQUARE = .203802 BETA-SQUARE .949135E-01
ETA = .451445 BETA .308080
ETA-SQUARE (ADJ) = .184650
ETA (ADJ) = .429709
UNADJUSTED DEVIATION SS = .346507E+10
ADJUSTED DEVIATION SS = .161373E+10
ANALYSIS SUMMARY STATISTICS
DEPENDENT VARIABLE (Y) = V268 TOTAL FAMILY INC
R-SQUARED(UNADJUSTED) = PROP. OF VARIATION EXPLAINED BY FITTED MODEL: .49128
ADJUSTMENT FOR DEGREES OF FREEDOM = 1.07194
*** MULTIPLE R (ADJUSTED) = .67430 MULTIPLE R-SQUARED (ADJUSTED) = .45468
LISTING OF BETAS IN DESCENDING ORDER
RANK VAR. NO. NAME BETA
1 V251 OCCUPATION B .442099
2 V32 EDUC OF HEAD .308080
3 V30 MARITAL STATUS .256608
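The summary statistics in the listing can be cross-checked against the printed sums of squares. The Python sketch below uses only numbers copied from Example 1. The formulas are inferences that happen to reproduce the printed values to the digits shown, not statements taken from the program documentation; in particular, the degrees-of-freedom adjustment is assumed to be (N - 1) / (N - 1 - (C - P)), where C is the total number of classes and P the number of predictors.

```python
import math

# Values copied from the Example 1 listing.
total_ss = 0.1700208e11
explained_ss = 0.8352816e10
n_cases = 299
categories = {"V251": 10, "V30": 5, "V32": 8}  # classes printed per predictor
unadjusted_ss = {"V251": 0.646484e10, "V30": 0.330640e10, "V32": 0.346507e10}
adjusted_ss = {"V251": 0.332309e10, "V30": 0.111955e10, "V32": 0.161373e10}

# Proportion of variation explained by the fitted model (listing: .49128).
r_squared = explained_ss / total_ss

# Assumed adjustment for degrees of freedom (listing: 1.07194).
df_model = sum(categories.values()) - len(categories)
adjustment = (n_cases - 1) / (n_cases - 1 - df_model)

# Adjusted multiple R-squared and multiple R (listing: .45468 and .67430).
adj_r_squared = 1 - (1 - r_squared) * adjustment
multiple_r = math.sqrt(adj_r_squared)

# Eta-square and beta-square as deviation sums of squares over the total.
eta_squared = {v: ss / total_ss for v, ss in unadjusted_ss.items()}
beta_squared = {v: ss / total_ss for v, ss in adjusted_ss.items()}

print(round(r_squared, 5), round(adjustment, 5), round(adj_r_squared, 5))
for var in categories:
    print(var, round(eta_squared[var], 6), round(beta_squared[var], 6))
```

Under the same assumption, the adjusted eta-square for V251 follows from its 10 classes: 1 - (1 - .380238) x 298/289, which is approximately .36094.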