GENERAL DESCRIPTION
MNA performs a multivariate analysis of a nominal-scale dependent variable using a series of parallel dummy-variable regressions, one for each dependent variable code, with each code dichotomized to a 0-1 variable. The program's major use is to give an additive multivariate model showing the relationship between a set of predictors and the dependent variable in terms of a set of coefficients analogous to MCA coefficients.
The advantage MNA has over other techniques applicable to the same data is the simplicity and direct interpretability of the MNA coefficients and of the categorical prediction algorithm. See Andrews and Messenger, Multivariate Nominal Scale Analysis (1973), for a complete description of the MNA technique.
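The dichotomizing step described above can be pictured with a short sketch. This is only an illustration in Python, not part of MNA itself; the helper name dichotomize and the toy data are assumptions made for the example, and the codes 1, 2, 3, 5 are borrowed from the example printout.

```python
# Hypothetical helper (not an MNA function): build one 0-1 dummy variable
# for each code of a nominal dependent variable.
def dichotomize(values):
    codes = sorted(set(values))
    dummies = {code: [1 if v == code else 0 for v in values] for code in codes}
    return codes, dummies

# Toy data (invented): car-size codes for six cases.
car_size = [5, 1, 5, 3, 2, 5]
codes, dummies = dichotomize(car_size)

print(codes)
for code in codes:
    print(code, dummies[code])
```

Each case scores 1 on exactly one dummy and 0 on the others, which is why the parallel regressions can be read together as a set of category probabilities.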
SPECIAL TERMINOLOGY
MNA coefficients: An array of statistics, one for each pair of dependent variable and predictor variable codes, which are transformed dummy variable regression coefficients. They are transformed to include a coefficient for each predictor variable code.
Classification matrix: A matrix indicating the pattern of correct categorical predictions made by MNA. Rows are actual codes and columns are predicted codes.
Forecast: A set of predictions (summing to 1.0), where prediction P(i) is the probability estimate of the occurrence of the i-th dependent variable code, given the predictor variable characteristics present.
Generalized eta-square and R-square: Extensions of the bivariate eta-square and multivariate R-square. These statistics are appropriate for assessing the strength of relationship on the entire set of dummy dependent variable codes. You can view them as a variance-weighted average of the standard statistics over the complete set of dummy dependent variables.
Theta (bivariate and multivariate): Statistics representing the proportion of cases that are correctly classed when using a prediction-to-the-mode strategy in the bivariate sense and prediction-to-the-maximum forecast in the multivariate sense.
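The "variance-weighted average" view of the generalized R-square can be checked against the example printout. The sketch below is an assumption about how that average is formed (each dummy variable weighted by p(1-p), where p is the proportion of cases in that code); it is not taken from the MNA source. The frequencies and code-specific R-squared values are copied from the printout.

```python
# Values from the example printout: frequency and R-squared for each
# dependent variable code (Small, Compact, Mid-Size, Large).
frequencies = [10, 12, 20, 96]
r_squared = [0.2532, 0.3325, 0.3349, 0.3295]

n_cases = sum(frequencies)

# Assumed weighting: the variance of a 0-1 dummy variable is p * (1 - p).
variances = [(f / n_cases) * (1 - f / n_cases) for f in frequencies]

generalized_r2 = sum(v * r for v, r in zip(variances, r_squared)) / sum(variances)
print(round(generalized_r2, 4))
```

Under this assumption the result agrees with the printed GENERALIZED R-SQUARED of .3207 to within rounding of the inputs.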
COMMAND FEATURES
Statistics: MNA computes the univariate distribution of the dependent variable, gives (in effect) a bivariate distribution of the dependent variable with each predictor, and computes and prints the multivariate "MNA coefficients." Bivariate statistics are the bivariate theta and the code-specific and generalized eta-square; they provide two alternative measures of the strength of the simple bivariate relationship between a specific predictor and the dependent variable. For each predictor the program also prints a series of "Beta-squared" statistics, which indicate the relative importance of the predictor when all other independent variables are held constant. Multivariate statistics are the multivariate theta and the code-specific and generalized R-square.
Missing data: Cases with missing data on the dependent variable may be eliminated with the DDELETE=(MD1,MD2) option. Cases with missing data on the independent variables may be eliminated with the DELETE=(MD1,MD2) option.
Residuals: For each case input to MNA there is a set of residual scores--one score for each dependent variable code. Residuals are defined as the difference between the dummy variable score belonging to the particular code and the forecast value for that code. Thus the residual scores are the differences between the MNA-derived probabilities and 0 or 1, depending on whether or not the object actually did fall in the designated category of the dependent variable. Control statements for RECODE to compute predicted and residual values may be written to a file for later use (see RESIDUALS option). You can use these recode statements with LISTDATA to list the residuals or with TRANS to create a permanent residuals dataset.
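As a minimal sketch of the residual definition above, the following Python fragment subtracts a forecast from the 0-1 dummy scores. The forecast values are the ones worked out in the forecast table under Interpreting MNA Output, expressed as proportions; the actual category ("Mid-Size") is an assumption made purely for illustration.

```python
categories = ["Small", "Compact", "Mid-Size", "Large"]

# Forecast from the worked example, as proportions rather than percents.
forecast = [0.2910, 0.0545, 0.3878, 0.2669]

# Assumption for illustration: this case actually fell in "Mid-Size".
actual = "Mid-Size"

# Dummy score is 1 for the actual category and 0 for every other category.
dummy = [1 if c == actual else 0 for c in categories]

# Residual = dummy score minus forecast, one value per dependent variable code.
residuals = [d - f for d, f in zip(dummy, forecast)]

for name, r in zip(categories, residuals):
    print(f"{name:9s}{r:8.4f}")
```

The residual is positive only for the category the case actually occupies and negative for the others.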
PRINTED OUTPUT
See the later section Interpreting MNA Output for help in understanding the following statistics.
Information on the analysis.
Numbers of cases eliminated due to missing data on the dependent variable and range of valid codes
Non-empty predictor codes
Minimum number of significant digits in solution vectors
Dependent Variable Statistics.
Frequency distribution
Weighted frequency distribution
Weighted frequency distribution expressed as a percent
R-squared (for each dependent variable code)
Adjusted R-squared (for each dependent variable code)
Predictor Variable Statistics.
Frequency for each code
Weighted frequency for each code
Weighted frequency expressed as a percent for each code
For each predictor code:
Weighted frequency marginal for each code of the dependent variable (Y) expressed as percents
Adjusted percents (sums of percents and coefficients) for each code of the dependent variable
Coefficients for each code of the dependent variable
Theta
Eta-squared (for each dependent variable code)
Beta-squared (for each dependent variable code)
Generalized eta-squared
Joint and Multivariate Prediction.
Generalized R-squared
Joint theta (proportion of cases correctly classed)
Classification matrix. Rows of the matrix indicate actual codes; columns indicate predicted codes.
INTERPRETING MNA OUTPUT
Consult the example printout at the end of this write-up as noted in the following discussions. See Multivariate Nominal Scale Analysis (Andrews and Messenger, 1973) for a complete description of how to interpret MNA results.
Examination Strategies
When looking at the large number of detailed statistics from MNA, two things are of particular interest: 1) large coefficients, and 2) large differences between the percents and the adjusted percents.
If an independent variable is ordinal scale, a monotonic change across successive coefficients or percentages may also be of interest. In the example, this occurs in the way V46, "Better or worse a year from now," affects the likelihood of the first car being a compact.
Theta Statistic
The multivariate Theta statistic indicates the proportion of cases correctly classified after taking into account each respondent's scores on all independent variables. In the example, Theta is .8043, indicating that about 80% of the cases could be correctly classified in this way. This is a gain of more than 10 percentage points over the mode of the overall percentage distribution (69.6% for "Large" car).
Identifying the mode is important; it shows that even if you know nothing about the respondents, you could predict the first car for everyone to be large and be correct 69.6% of the time. Relationships of the independent variables to the dependent variable act to increase predictability above this 69.6% level.
The bivariate Theta statistic indicates the proportion correctly classified for a single independent variable.
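The multivariate Theta and the modal baseline discussed above can be recomputed from the classification matrix in the example printout. The Python sketch below is illustrative only; the matrix entries are copied from the printout, and the variable names are assumptions.

```python
# Classification matrix from the example printout.
# Rows are actual codes, columns are predicted codes
# (Small, Compact, Mid-Size, Large).
matrix = [
    [2, 1, 1, 6],
    [0, 7, 0, 5],
    [0, 1, 9, 10],
    [0, 1, 2, 93],
]

n_cases = sum(sum(row) for row in matrix)

# Correctly classed cases lie on the diagonal.
correct = sum(matrix[i][i] for i in range(len(matrix)))
theta = correct / n_cases

# Baseline: predict the modal actual category for every case.
mode_share = max(sum(row) for row in matrix) / n_cases

print(round(theta, 4), round(mode_share, 4), round(theta - mode_share, 4))
```

The diagonal holds 111 of the 138 cases, which reproduces the printed MULTIVARIATE THETA of .8043, against a modal baseline of 96/138.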
Forecasts and the Proportion Classed Correctly
For any case a forecast can be derived. The forecast consists of a set of probabilities; it shows the likelihood of that case falling into each category of the dependent variable. You compute the probability for each category by summing the coefficients relevant to that case and adding in the overall percent. Assume we have a person who earns $20,000 a year, is 28 years old, single, has a college degree, expects to be about as well off next year, expects his/her income to be a little bit more next year, and holds a professional position. The forecast is computed as shown in the table below:
Size of First Car                Small   Compact  Mid-Size     Large
Overall Percents                   7.2       8.7      14.5      69.6
Coeff: $20,000/yr                -5.05      8.10      2.30     -5.35
Coeff: 28 Years old              11.41      -.25     13.13    -24.29
Coeff: Single                    10.40     -2.10     -1.11     -7.19
Coeff: College Degree            15.64     -4.17      1.23    -12.69
Coeff: About the Same             6.73     -4.91     -4.25      2.44
Coeff: A Little More Income       2.03      -.97     -2.74      1.68
Coeff: Professional             -19.31      1.05     15.73      2.52
Forecast:                        29.10      5.45     38.78     26.69
The forecast gives a set of predicted scores for each case (expressed here as percents); you predict a case to be in the category of the dependent variable for which the forecast is highest. The person represented in the table above would be assigned to the "Mid-Size" category.
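The forecast arithmetic can be reproduced with a few lines of Python. This is a sketch for checking the table by hand, not MNA code; the coefficients and frequencies are copied from the example printout, and the variable names are assumptions. Note that the printed forecast matches only when the overall percents are taken unrounded (for example 10/138 rather than 7.2), which is why the sketch starts from the frequencies.

```python
categories = ["Small", "Compact", "Mid-Size", "Large"]
frequencies = [10, 12, 20, 96]  # dependent variable frequencies, N = 138

# MNA coefficients for the predictor codes that describe this person.
coefficients = {
    "$20,000/yr":           [-5.05,  8.10,  2.30,  -5.35],
    "28 years old":         [11.41,  -.25, 13.13, -24.29],
    "Single":               [10.40, -2.10, -1.11,  -7.19],
    "College degree":       [15.64, -4.17,  1.23, -12.69],
    "About the same":       [ 6.73, -4.91, -4.25,   2.44],
    "A little more income": [ 2.03,  -.97, -2.74,   1.68],
    "Professional":         [-19.31, 1.05, 15.73,   2.52],
}

n_cases = sum(frequencies)
overall = [100.0 * f / n_cases for f in frequencies]

# Forecast = overall percent plus the sum of the relevant coefficients.
forecast = [
    overall[k] + sum(row[k] for row in coefficients.values())
    for k in range(len(categories))
]

# Assign the case to the category with the highest forecast.
predicted = categories[forecast.index(max(forecast))]

for name, value in zip(categories, forecast):
    print(f"{name:9s}{value:7.2f}")
print("Predicted category:", predicted)
```

Because the printed coefficients are rounded, the four forecast values sum to approximately, rather than exactly, 100.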
INPUT DATA
MNA is designed to analyze a nominally scaled dependent variable with three or more code categories. The MNA coefficients and summary statistics are identical to (or generalizations of) those that would be produced by parallel MCA runs dichotomizing each dependent variable code against the others.
RESTRICTIONS
Code categories must be in the range -32,768 to 32,767.
CONTROL STATEMENTS
Filter (optional)
Job Title (required if using a Runfile)
Options and Parameters
DEPV=n The dependent variable.
MAXD=n The maximum number of dependent variable codes.
Default: MAXD=10
DDELETE=(MD1,MD2)
MD1 Delete all cases where the dependent variable equals its first missing-data code.
MD2 Delete all cases where the dependent variable equals its second missing-data code.
VARS=(variable list) The list of independent variables.
DELETE=(MD1,MD2)
MD1 Delete all cases where any independent variable equals its first missing-data code.
MD2 Delete all cases where any independent variable equals its second missing-data code.
MAXC=n The maximum total number of predictor codes for all predictors, i.e., the sum of the number of predictor codes over all predictors.
Default: MAXC=99
PRINT=(DICT|CODES)
DICT Print the input dictionary.
CODES Print the input dictionary and category labels.
RECODE=n Use RECODE n, previously entered via the RECODE command.
RESIDUALS Write RECODE control statements that compute residual values to file assignment RESIDUAL. The residual value variable numbers will be R10001-R1000n, corresponding to the first, second, etc., dependent variable codes.
WT=n Use variable n as a weight variable.
REFERENCES
Andrews, F. M., J. N. Morgan, J. A. Sonquist and L. Klem. Multiple Classification Analysis. Second edition. Ann Arbor: Institute for Social Research, The University of Michigan, 1973.
Andrews, F. M. and R. C. Messenger. Multivariate Nominal Scale Analysis. Ann Arbor: Institute for Social Research, The University of Michigan, 1973.
EXAMPLES
Example 1: Explaining size of first car for childless families. Predictors are income (bracketed), age of head of household (bracketed), education, and feelings of "well-offness."
Command: RECODE
Recode statements: r1=brac(v268,<1=1,1-4500=2,4501-9500=3,9501-15000=4, -
15001-21000=5,>21000=6)
r2=brac(v20,0-30=1,31-45=2,46-60=3,>60=4)
v189=brac(v189,1=1,2=2,>2=3,else=0)
name r1'bracketed income',r2'bracketed age'
end
Command: MNA
File Assignments: dictin=scf.dic datain=scf.dat
Filter: INCLUDE V26=0 AND V193=1-8
Job Title: explaining size of first car for childless families
Options and Parameters: depv=v193 r=1 v=r1,r2,v30,v32,v46,v49,v251 ddel=(md1,md2)
*** MNA - MULTIVARIATE NOMINAL SCALE ANALYSIS **
EXPLAINING SIZE OF FIRST CAR FOR CHILDLESS FAMILIES
Number of variables: 8
The data are not weighted
Transforming the data by RECODE number 1
For the dependent variable, cases with MD1 or MD2 values will be deleted
Number of cases = 138
0 cases deleted due to missing data on the dependent variable.
0 cases deleted due to missing data on the independent variables.
0 cases deleted due to predictor codes outside range -99 to 999.
PREDICTOR NON-EMPTY CODES
R1 BRACKETED INCOME 2 3 4 5 6
R2 BRACKETED AGE 1 2 3 4
V30 MARITAL STATUS 1 2 3 4 5
V32 EDUC OF HEAD 1 2 3 4 5 6 7 8
V46 B/W YEAR FROM NOW 1 3 5 8
V49 SM/LG INC NEXT YEAR 0 1 3 5 8 9
V251 OCCUPATION B 0 1 2 3 4 5 6 7 8 9
*** THE MINIMUM NUMBER OF SIGNIFICANT DIGITS IN THE SOLUTION VECTORS IS 4
DEPENDENT VARIABLE V193 SIZE OF CAR
Code 1 2 3 5
Small Compact Mid-Size Large Totals
Frequency 10 12 20 96 138
Percent 7.2 8.7 14.5 69.6 100.0
R-squared .2532 .3325 .3349 .3295
Adjusted .0000 .1035 .1066 .0994
R1 BRACKETED INCOME
1 2 3 5
Small Compact Mid-Size Large
ETA-Squared .0164 .0514 .0047 .0407
BETA-Squared .0290 .0330 .0104 .0288
Generalized ETA-Squared = .0299
Bivariate THETA = .6957
Code Y 1 2 3 5
Small Compact Mid-Size Large
2 Percent 2.94 8.82 11.76 76.47
N 34 Adj Pct 3.72 9.84 10.12 76.32
Pct 24.64 Coeff. -3.52 1.14 -4.37 6.75
3 Percent 6.67 2.22 15.56 75.56
N 45 Adj Pct 6.80 4.11 18.74 70.36
Pct 32.61 Coeff. -.45 -4.59 4.25 .79
4 Percent 12.12 15.15 15.15 57.58
N 33 Adj Pct 14.63 12.88 13.40 59.09
Pct 23.91 Coeff. 7.38 4.18 -1.09 -10.47
5 Percent 6.25 18.75 18.75 56.25
N 16 Adj Pct 2.20 16.79 16.79 64.21
Pct 11.59 Coeff. -5.05 8.10 2.30 -5.35
6 Percent 10.00 .00 10.00 80.00
N 10 Adj Pct 4.94 -1.29 10.17 86.18
Pct 7.25 Coeff. -2.30 -9.99 -4.33 16.62
R2 BRACKETED AGE
1 2 3 5
Small Compact Mid-Size Large
ETA-Squared .0210 .0282 .0580 .1140
BETA-Squared .0355 .0004 .0639 .0702
Generalized ETA-Squared = .0725
Bivariate THETA = .6957
Code Y 1 2 3 5
Small Compact Mid-Size Large
1 Percent 14.29 19.05 33.33 33.33
N 21 Adj Pct 18.66 8.45 27.62 45.27
Pct 15.22 Coeff. 11.41 -.25 13.13 -24.29
2 Percent .00 10.00 20.00 70.00
N 10 Adj Pct 2.90 8.15 4.94 84.01
Pct 7.25 Coeff. -4.34 -.54 -9.56 14.44
3 Percent 8.51 8.51 8.51 74.47
N 47 Adj Pct 5.66 9.52 4.29 80.54
Pct 34.06 Coeff. -1.59 .82 -10.20 10.97
4 Percent 5.00 5.00 11.67 78.33
N 60 Adj Pct 5.22 8.23 19.48 67.06
Pct 43.48 Coeff. -2.03 -.47 4.99 -2.50
V30 MARITAL STATUS
1 2 3 5
Small Compact Mid-Size Large
ETA-Squared .0584 .0410 .0645 .0792
BETA-Squared .0945 .0419 .0476 .0682
Generalized ETA-Squared = .0662
Bivariate THETA = .7101
Code Y 1 2 3 5
Small Compact Mid-Size Large
1 Married Percent 6.25 8.33 12.50 72.92
N 96 Adj Pct 5.12 8.97 13.34 72.57
Pct 69.57 Coeff. -2.13 .28 -1.16 3.01
2 Single Percent 13.33 13.33 13.33 60.00
N 15 Adj Pct 17.65 6.59 13.38 62.37
Pct 10.87 Coeff. 10.40 -2.10 -1.11 -7.19
3 Divorced Percent .00 5.88 11.76 82.35
N 17 Adj Pct 2.28 10.93 8.96 77.83
Pct 12.32 Coeff. -4.96 2.23 -5.53 8.27
4 Widowed Percent 12.50 .00 50.00 37.50
N 8 Adj Pct 9.83 -5.00 44.66 50.51
Pct 5.80 Coeff. 2.59 -13.69 30.16 -19.06
5 Separated Percent 50.00 50.00 .00 .00
N 2 Adj Pct 63.06 46.96 4.74 -14.77
Pct 1.45 Coeff. 55.82 38.27 -9.75 -84.34
V32 EDUC OF HEAD
1 2 3 5
Small Compact Mid-Size Large
ETA-Squared .0564 .1053 .0594 .0587
BETA-Squared .1045 .0937 .1028 .0383
Generalized ETA-Squared = .0662
Bivariate THETA = .6957
Code Y 1 2 3 5
Small Compact Mid-Size Large
1 0-8th Grade Percent .00 .00 .00 100.00
N 5 Adj Pct 5.97 -7.75 4.65 97.13
Pct 3.62 Coeff. -1.28 -16.45 -9.84 27.57
2 9th Grade Percent 7.32 7.32 7.32 78.05
N 41 Adj Pct 9.87 9.51 13.22 67.40
Pct 29.71 Coeff. 2.63 .81 -1.27 -2.17
3 10th Grade Percent 4.55 .00 22.73 72.73
N 22 Adj Pct 5.36 3.52 30.40 60.72
Pct 15.94 Coeff. -1.88 -5.17 15.90 -8.84
4 11th Grade Percent 3.70 3.70 22.22 70.37
N 27 Adj Pct .60 -.10 24.98 74.51
Pct 19.57 Coeff. -6.64 -8.79 10.49 4.94
5 Completed HS Percent .00 25.00 12.50 62.50
N 16 Adj Pct -6.43 22.46 1.30 82.68
Pct 11.59 Coeff. -13.68 13.76 -13.20 13.11
6 Some College Percent 17.65 23.53 5.88 52.94
N 17 Adj Pct 15.84 21.93 -3.66 65.89
Pct 12.32 Coeff. 8.59 13.24 -18.15 -3.68
7 College Degree Percent 16.67 .00 33.33 50.00
N 6 Adj Pct 22.89 4.52 15.72 56.87
Pct 4.35 Coeff. 15.64 -4.17 1.23 -12.69
8 Graduate Degree Percent 25.00 .00 25.00 50.00
N 4 Adj Pct 31.84 3.72 9.61 54.82
Pct 2.90 Coeff. 24.60 -4.97 -4.88 -14.74
V46 B/W YEAR FROM NOW
1 2 3 5
Small Compact Mid-Size Large
ETA-Squared .0063 .1015 .0297 .0828
BETA-Squared .0201 .0865 .0421 .0714
Generalized ETA-Squared = .0616
Bivariate THETA = .6957
Code Y 1 2 3 5
Small Compact Mid-Size Large
1 Much Worse Percent 3.85 26.92 26.92 42.31
N 26 Adj Pct 2.17 24.17 29.28 44.37
Pct 18.84 Coeff. -5.07 15.48 14.79 -25.19
3 A Little Worse Percent 7.04 5.63 11.27 76.06
N 71 Adj Pct 6.30 7.71 10.53 75.46
Pct 51.45 Coeff. -.94 -.98 -3.97 5.89
5 About the Same Percent 9.09 4.55 13.64 72.73
N 22 Adj Pct 13.98 3.78 10.24 72.00
Pct 15.94 Coeff. 6.73 -4.91 -4.25 2.44
8 Much Better Percent 10.53 .00 10.53 78.95
N 19 Adj Pct 9.93 -3.12 13.99 79.20
Pct 13.77 Coeff. 2.68 -11.82 -.50 9.64
V49 SM/LG INC NEXT YEAR
1 2 3 5
Small Compact Mid-Size Large
ETA-Squared .0145 .0279 .0709 .0387
BETA-Squared .0275 .0374 .1404 .0770
Generalized ETA-Squared = .0418
Bivariate THETA = .7029
Code Y 1 2 3 5
Small Compact Mid-Size Large
0 Much Smaller Percent 3.03 9.09 9.09 78.79
N 33 Adj Pct 1.72 9.95 3.28 85.05
Pct 23.91 Coeff. -5.53 1.26 -11.21 15.49
1 A Little Smaller Percent 9.52 4.76 19.05 66.67
N 42 Adj Pct 7.19 6.12 19.41 67.27
Pct 30.43 Coeff. -.05 -2.57 4.92 -2.30
3 About the Same Percent 9.52 9.52 9.52 71.43
N 42 Adj Pct 12.17 8.88 13.46 65.49
Pct 30.43 Coeff. 4.92 .18 -1.03 -4.07
5 A Little Larger Percent 6.67 13.33 13.33 66.67
N 15 Adj Pct 9.27 7.72 11.76 71.25
Pct 10.87 Coeff. 2.03 -.97 -2.74 1.68
8 Much Larger Percent .00 .00 66.67 33.33
N 3 Adj Pct -1.00 -.84 93.34 8.50
Pct 2.17 Coeff. -8.24 -9.54 78.85 -61.07
9 Don't Know Percent .00 33.33 33.33 33.33
N 3 Adj Pct -1.97 42.73 18.21 41.03
Pct 2.17 Coeff. -9.22 34.04 3.72 -28.54
V251 OCCUPATION B
1 2 3 5
Small Compact Mid-Size Large
ETA-Squared .0466 .0595 .0524 .0630
BETA-Squared .1160 .0665 .0690 .0211
Generalized ETA-Squared = .0574
Bivariate THETA = .6957
Code Y 1 2 3 5
Small Compact Mid-Size Large
0 Sales Percent 4.55 4.55 9.09 81.82
N 44 Adj Pct 7.72 9.71 9.35 73.22
Pct 31.88 Coeff. .47 1.01 -5.14 3.65
1 Professional Percent 11.11 22.22 22.22 44.44
N 9 Adj Pct -12.06 9.75 30.23 72.09
Pct 6.52 Coeff. -19.31 1.05 15.73 2.52
2 Manager Percent 22.22 .00 11.11 66.67
N 9 Adj Pct 31.11 -6.69 10.67 64.91
Pct 6.52 Coeff. 23.87 -15.38 -3.82 -4.66
3 Clerical Percent .00 20.00 .00 80.00
N 10 Adj Pct 1.85 20.35 -1.28 79.08
Pct 7.25 Coeff. -5.39 11.65 -15.77 9.51
4 Craftsman Percent 8.33 16.67 16.67 58.33
N 12 Adj Pct 7.47 13.43 13.28 65.83
Pct 8.70 Coeff. .22 4.73 -1.22 -3.74
5 Operator Percent 7.69 7.69 23.08 61.54
N 13 Adj Pct 12.99 8.22 17.71 61.08
Pct 9.42 Coeff. 5.74 -.47 3.22 -8.49
6 Service Percent .00 8.33 25.00 66.67
N 12 Adj Pct 1.51 5.83 30.90 61.76
Pct 8.70 Coeff. -5.74 -2.87 16.41 -7.80
7 Laborer Percent 13.33 13.33 13.33 60.00
N 15 Adj Pct 8.60 16.34 14.03 61.04
Pct 10.87 Coeff. 1.35 7.65 -.47 -8.53
8 On Welfare Percent 8.33 .00 16.67 75.00
N 12 Adj Pct 8.12 -.27 13.67 78.48
Pct 8.70 Coeff. .88 -8.97 -.82 8.91
9 Retired Percent .00 .00 50.00 50.00
N 2 Adj Pct -16.33 -18.96 49.18 86.11
Pct 1.45 Coeff. -23.57 -27.66 34.68 16.55
*** MULTIVARIATE STATISTICS ***
GENERALIZED R-SQUARED .3207 MULTIVARIATE THETA .8043
CASES CORRECTLY CLASSED
1 2 3 5
Small Compact Mid-Size Large
N 2.000 7.000 9.000 93.000
PROPORTION .200 .583 .450 .969
ACTUAL(rows) vs. PREDICTED(columns) CLASSIFICATION MATRIX
| 1| 2| 3| 5|
| Small| Compact|Mid-Size| Large| Totals
|--------|--------|--------|--------|
Small 1| 2| 1| 1| 6| 10
ROW %| 20.0| 10.0| 10.0| 60.0| 100.0
|--------|--------|--------|--------|
Compact 2| 0| 7| 0| 5| 12
ROW %| .0| 58.3| .0| 41.7| 100.0
|--------|--------|--------|--------|
Mid-Size 3| 0| 1| 9| 10| 20
ROW %| .0| 5.0| 45.0| 50.0| 100.0
|--------|--------|--------|--------|
Large 5| 0| 1| 2| 93| 96
ROW %| .0| 1.0| 2.1| 96.9| 100.0
|--------|--------|--------|--------|
Totals 2 10 12 114 138
ROW % 1.4 7.2 8.7 82.6 100.0