Note: This write-up contains only an outline of the options and features of the SEARCH command. Before attempting to use SEARCH, you should read about the technique in Searching for Structure (Sonquist, et al., 1974).
(Adapted with minor changes from: http://www.isr.umich.edu/src/search/search_document.html)
SEARCH is a binary segmentation procedure used to develop a predictive model for a dependent variable. It searches among a set of predictor variables for those predictors which most increase the researcher's ability to account for the variance or distribution of a dependent variable. The question "What dichotomous split on which single predictor variable will give us a maximum improvement in our ability to predict values of the dependent variable?", embedded in an iterative scheme, is the basis for the algorithm used in this command.
SEARCH divides the sample, through a series of binary splits, into a series of mutually exclusive subgroups. They are chosen so that, at each step in the procedure, the split into the two new subgroups accounts for more of the variance or distribution (reduces the predictive error more) than a split into any other pair of subgroups. The predictor variables may be ordinally or nominally scaled. The dependent variable may be continuous or categorical. SEARCH is an elaboration of the Osiris III AID and THAID programs.
Research questions are often of the type "What is the effect of X on Y?" But the answer requires answering a larger question "What set of variables and their combinations seems to affect Y?" With SEARCH a variable X that seems to have an overall effect may have its apparent influence disappear after a few splits, with the final groups, while varying greatly as to their levels of Y, showing no effect of X. The implication is that, given other things, X does not really affect Y.
Conversely, while X may seem to have no overall effect on Y, after splitting the sample into groups that take account of other powerful factors, there may be some groups in which X has a substantial effect. Think of economists' notion of the actor at the margin. A motivating factor might affect those not constrained or compelled by other forces. Those who, other things considered, have a 40-60 percent probability of acting, might show substantial response to some motivator. Or a group with very high or very low likelihood of acting might be discouraged or encouraged by some motivator. But if X has no effect on any of the subgroups generated by SEARCH, one has pretty good evidence that it does not matter, even in an interactive way.
The purpose of SEARCH is to allow an evaluation of many competing and probably mis-specified models. It relies on the fact that the explanatory power of any one predictor is rapidly exhausted by a few binary splits using it, so that a sequence of binary splits, allowing competing predictors at each split, can search data for structure without restrictive assumptions of linearity or additivity of effects. The approach is closer to analysis of variance components than to sequential regression.
SEARCH makes a sequence of binary divisions of a dataset in such a way that each split maximally reduces the error variance or increases the information (chi-square or rank correlation). It finds the best split on each predictor, then takes the best of the best.
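The "best of the best" selection can be sketched in Python. This is a minimal illustration for the means criterion with monotonic predictors only, not the program's own code; the function names and the toy data are assumptions made for the example.

```python
from collections import defaultdict

def best_split_for_predictor(codes, y):
    """Best monotonic split on one predictor.

    Returns (gain, left_codes), where gain is the reduction in the error
    sum of squares from using two subgroup means instead of one.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for code, value in zip(codes, y):
        sums[code] += value
        counts[code] += 1
    ordered = sorted(counts)            # keep the predictor codes in order
    n, total = len(y), sum(y)
    best_gain, best_left = 0.0, None
    n1, s1 = 0, 0.0
    for i, code in enumerate(ordered[:-1]):   # k-1 tries for k classes
        n1 += counts[code]
        s1 += sums[code]
        n2, s2 = n - n1, total - s1
        gain = s1 * s1 / n1 + s2 * s2 / n2 - total * total / n
        if gain > best_gain:
            best_gain, best_left = gain, ordered[: i + 1]
    return best_gain, best_left

def best_of_best(predictors, y):
    """Find the best split on each predictor, then take the best of those."""
    results = {name: best_split_for_predictor(codes, y)
               for name, codes in predictors.items()}
    winner = max(results, key=lambda name: results[name][0])
    return winner, results[winner]

# Hypothetical data: predictor A separates the two levels of y, B does not.
y = [1, 1, 1, 9, 9, 9]
predictors = {"A": [1, 1, 1, 2, 2, 2], "B": [1, 2, 1, 2, 1, 2]}
winner, (gain, left_codes) = best_of_best(predictors, y)
print(winner, gain, left_codes)
```

In a full run the same selection would be repeated on each new subgroup until a stopping rule applies.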
The process stops when additional splits are unlikely to improve predictions for a fresh sample or for the population, i.e., when the null probability for the best remaining split rises above some selected level (e.g., .05, .025, .01 or .005). Of course, because several possibilities have been tried for each of several predictors, the null probability is understated. Alternative stopping rules can be used in any combination: minimum group size, maximum number of splits, minimum reduction in error variance relative to the original total, or maximum null probability.
SEARCH provides for four kinds of dependent or criterion variables: means, simple regressions of Y on X, classifications, and ranks. The split criterion uses some measure of reduction in uncertainty or error, not a level of significance. The reasons to use error reduction rather than significance are that 1) particularly at the start with large numbers of cases in the groups being split, almost any split that maximizes error reduction will be highly significant, and 2) a small potential splitoff might be very highly significant because it is very extreme or very homogenous, but splitting it off would do little to improve overall predictions back to the population.
With means the splitting criterion is the reduction in unexplained error variance from using two means rather than the single parent group mean. With regressions the splitting criterion is the reduction in error variance from using two simple regressions rather than the single parent group regression. With classifications the splitting criterion is the likelihood-ratio chi-square, which fits with the variance components approach. With ranks the splitting criterion is the tau-b rank correlation.
For each predictor one can maintain its monotonic order ("monotonic"), try each class against all the others ("select"), or reorder each time according to the criterion variable ("free"). The last should be used rarely, and only with predictors with few classes, for it involves implicitly trying many things, resulting in a bias in favor of that predictor. In addition, the combinations split off are difficult to interpret and probably idiosyncratic, as the parent groups become smaller.
One might want to reassign missing information to some large class, or, better, use a multivariate assignment procedure, for example, SEARCH itself with the chi-square option.
With monotonic predictors one tries the first class against the rest, then the first two classes against the rest, etc., making k-1 tries. With select predictors one tries each class against all the others, making k tries, but since the splitting criterion combines difference between the two new groups with both their sizes, there is an offsetting bias against the select option. The alternatives are not really independent, so the bias in favor of predictors with more classes should be small. And with at least 50 cases, adjusting the degrees of freedom would make little difference.
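The difference between the k-1 monotonic tries and the k select tries can be made concrete with a short Python sketch. The function names are hypothetical and the code only enumerates candidate splits; it does not evaluate them.

```python
from typing import List, Tuple

Split = Tuple[List[int], List[int]]

def monotonic_splits(codes: List[int]) -> List[Split]:
    """First class vs. the rest, first two vs. the rest, etc. (k-1 tries)."""
    ordered = sorted(codes)
    return [(ordered[:i], ordered[i:]) for i in range(1, len(ordered))]

def select_splits(codes: List[int]) -> List[Split]:
    """Each single class against all the others (k tries)."""
    ordered = sorted(codes)
    return [([c], [o for o in ordered if o != c]) for c in ordered]

codes = [1, 2, 3, 4]
mono = monotonic_splits(codes)
sel = select_splits(codes)
for left, right in mono:
    print("monotonic:", left, "vs", right)
for left, right in sel:
    print("select:   ", left, "vs", right)
```

With four classes the monotonic option yields three candidate splits and the select option four.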
Predictors can also be hierarchically ranked as to when they are used. Rank 0 means compute the potential gain but do not split on that predictor. Ranks 1, 2, 3, etc., mean exhaust the rank 1 variables first, then try the rank 2 ones, then the rank 3 ones, etc. Since the program will produce recode statements to generate expected values or residuals, one can also hold aside some later-stage predictors for an analysis of the residuals.
SEARCH also has a significance test for stopping the splitting process. Given the prior searching and the possibility of sample design effects, the test is crude. Purists will object that the more classes in a predictor, the more alternatives tried, so a bias exists toward predictors with many classes. One can think of using up degrees of freedom, or adding alternative null probabilities. But adding probabilities in a Bonferroni-type correction vastly overcorrects, since the k-1 or k alternatives are not really independent. The only serious bias would come from freeing a predictor with 5 or more classes and reordering it at each stage.
The other three stopping rules, with their defaults, are: minimum final group size (default 25), minimum reduction in error variance relative to original total (default .8%), and maximum number of splits (default 25). The error variance reduction rule can be too stringent when the first few splits greatly reduce the remaining error.
The use of weights to adjust for different sampling or response rates affects variances and tests, so the program calculates an estimate of that effect for the whole sample, based on the variance of the weights, and issues a warning. Weights should be used, because if they do not make a difference, nothing is lost, but if they do, the unweighted data are biased.
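This write-up does not give the formula SEARCH uses for the weighting effect. One widely used estimate based on the variance of the weights is Kish's approximate design effect, 1 + CV^2, where CV is the coefficient of variation of the weights. The Python sketch below assumes that formula purely for illustration; the function name is hypothetical.

```python
def kish_design_effect(weights):
    """Kish's approximation: n * sum(w^2) / (sum(w))^2, equal to 1 + CV^2.

    A value of 1.0 means the weights are all equal; larger values mean
    that variances are inflated relative to an unweighted sample.
    """
    n = len(weights)
    total = sum(weights)
    return n * sum(w * w for w in weights) / (total * total)

equal = kish_design_effect([2.0, 2.0, 2.0, 2.0])
unequal = kish_design_effect([1.0, 3.0])
print(equal, unequal)
```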
Functions. SEARCH can perform the following functions:
Maximize differences in group means, group regression lines, or distributions (maximum likelihood chi-square criterion).
Rank the predictors to give them preference in the partitioning.
Sacrifice explanatory power for symmetry.
Start after a specified partial tree structure has been generated.
Missing Data. Cases with missing data in a continuous dependent variable or a covariate are deleted automatically. Cases with missing data in a categorical dependent variable can be excluded by using a filter statement or by specifying valid codes with the DEPV keyword. Cases with missing data in the predictor variables are not automatically excluded. However, the filter statement and the CODES keyword may be used to exclude missing data on predictor variables.
The major components of the printed output are specified below. For details see Searching for Structure.
Trace Printout: (Optional: See keywords PRINT=TRACE and PRINT=FULLTRACE). Can be voluminous.
The candidate groups for splitting
The group selected for splitting
All eligible splits for each predictor (optional)
The best split for each predictor
The split selected
Final Tables Printout:
The analysis of variance or distribution on final groups (except for “analysis=tau”)
The split summary
The final group summary
Summary table of best splits for each predictor for each group (except for “analysis=tau”)
The predictor summary table. You may request the first group (PRINT=FIRST), the final groups (PRINT=FINAL), or all groups (PRINT=TABLE). The tables are printed in reverse group order, i.e., last group first and first group last.
Group Tree Structure
A structure table with entries for each group, numbered in order and indented, so that one can easily see the pedigree of each final group and its detail. With relatively little wordprocessing one has a publishable table. It is also easy to create a branching diagram from the group summary table.
RESIDUAL RECODE CONTROL STATEMENT OUTPUT
RECODE control statements to determine group numbers and residual values from raw data may be written to the file assigned to RESIDUAL (see the keywords GROUP and RESIDUALS). These statements may be used with LISTDATA to list the group numbers and residuals or with TRANS to create a permanent residuals dataset. They also may be used with SEARCH to perform a second stage search for structure. In running a second-stage SEARCH, place the RECODE statements generated by the first-stage SEARCH after any RECODE statements required to perform the first-stage SEARCH.
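What such statements compute can be illustrated in Python for a means analysis. The sketch assumes that the expected value for a case is the mean of its final group and that the residual is the observed value minus that mean. The group means are taken from Example 1; the case value and the function name are hypothetical.

```python
# Means of three final groups from Example 1 (group number -> Mean(Y)).
FINAL_GROUP_MEANS = {3: 9594.30, 7: 17638.2, 8: 4580.97}

def residual(y, group):
    """Observed value minus the expected value (assumed: final group mean)."""
    return y - FINAL_GROUP_MEANS[group]

# Hypothetical case: income 5000 falling in final group 8.
r = residual(5000.0, 8)
print(round(r, 2))
```

Residuals computed this way could serve as the dependent variable of a second-stage analysis.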
The dependent variable may be continuous or categorical. Predictor variables may be ordinal or nominal scales.
1. Maximum number of predictors: 200.
2. Maximum predictor value: 31.
3. Maximum number of categorical variable codes: 400.
4. Maximum number of predefined splits: 49.
5. To perform its analysis, SEARCH must write records to a scratch file with record length based on the number of predictor codes. To make this more efficient, always specify the list of codes if it is smaller than 0-9 (see the DEPV keyword description).
Filter Statement (optional)
Job Title
Parameter Statement
ANALYSIS=MEAN|REGRESSION|CHI|TAU
Analysis type (see Searching for Structure).
MEAN Means analysis.
REGR Regression analysis
CHI Chi analysis
TAU Ranks
Default: ANALYSIS=MEAN. Note: ANALYSIS=CHI with a single dependent variable implies the default list of codes 0-9 within missing-data tests.
COV=variable number
The covariate variable number. Must be specified for REGR analyses.
DEPV=variable number|(variable list)|(Vn/list of codes)
The dependent variable or variables. If a list of variables is given, the analysis is done on the distribution of the variables (see Searching for Structure).
A list of codes or a variable list may be supplied only for ANALYSIS=CHI or ANALYSIS=TAU. If a list of codes is supplied (e.g., DEPV=V7/1,2,4-7), no missing-data tests are made for the dependent variable and only the codes listed are used in the analysis.
Default: none, DEPV must be specified (see note under ANALYSIS keyword).
ESTIMATE=variable number|variable list
Variable(s) for estimates or expected values. For a categorical dependent variable or a distribution set of dependent variables, this is a list of variables representing the expected distribution for the case.
EXPL=x
Minimum percentage increase in explanatory power required for a split.
Default: EXPL=0.8
GROUP=variable number
Variable number for final group number. Required if RESIDUALS is specified; omit otherwise.
IDVAR=variable number
Identification variable to print with each case classified as an outlier.
Default: dependent variable.
MAX=n
Maximum number of partitions.
Default: MAX=25.
MIN=n
Minimum number of cases in one group.
Default: MIN=25.
NULL=n
Maximum probability that there is really no gain from the split.
Default: No significance test.
OUTDISTANCE=n
Number of standard deviations from the parent group mean defining an
outlier. Outliers are reported but not excluded from the analysis.
Outliers could be excluded in subsequent runs by filtering. Only useful if
PRINT=TRACE is also used.
Default: OUTD=5.0
PRINT=(DICT|CODEBK,TRACE,FULLTRACE,TABLE,FIRST, FINAL,TREE)
DICT: Print the input dictionary.
CODEBK: Print the input dictionary and codebook records.
TRACE: Print the trace of splits for each predictor for each split.
FULLTR: Print the full trace of splits for each predictor, including eligible but suboptimal splits.
TABLE: Print all the predictor summary tables.
FIRST: Print the predictor summary tables for the first group.
FINAL: Print the predictor summary tables for the final groups.
TREE: Print the hierarchical tree diagram.
RECODE=n Use RECODE n, previously entered via the RECODE command.
RESIDUALS=variable number|variable list
If you want to generate residuals, specify the residuals variable number or
numbers. For a multiple or categorical dependent variable,
"residuals" consist of a set of variables representing the deviation
of the case from the expected pattern. (Note: A two-stage analysis can be
performed by using the residuals from one analysis as the dependent variable(s)
for a subsequent analysis.)
SYMMETRY=n
The amount of explanatory power one is willing to lose
in order to have symmetry, expressed as a percentage.
Default: SYMMETRY=0.
WTVAR=n Use variable n as a weight variable.
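The OUTDISTANCE rule described above can be illustrated with a short Python sketch. It flags a case whose dependent-variable value lies more than the stated number of standard deviations from the parent group mean. The mean and variance are those of Group 1 in Example 1; the two case values and the function name are hypothetical.

```python
import math

def is_outlier(y, parent_mean, parent_var, outdistance=5.0):
    """True if y lies more than `outdistance` standard deviations
    from the parent group mean (default 5.0, as for OUTDISTANCE)."""
    return abs(y - parent_mean) > outdistance * math.sqrt(parent_var)

# Group 1 of Example 1: Mean(Y)=10451.0, Var(Y)=.564786E+08, run with outd=2.
flag_high = is_outlier(30000.0, 10451.0, 0.564786e8, outdistance=2.0)
flag_mid = is_outlier(20000.0, 10451.0, 0.564786e8, outdistance=2.0)
print(flag_high, flag_mid)
```

As the keyword description says, such cases are only reported; excluding them requires a filter in a later run.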
Predictor Statements
Supply one set of parameters for each group of predictors which may be described with the same parameter values.
VARS=(variable numbers)
Use the variables specified in the list. If you want RECODE R-type variables
you must list them explicitly.
Default: none, VARS must be supplied.
M|F|S The predictor constraint.
M: Predictors are considered to be "monotonic," i.e., the codes of the predictors are to be kept adjacent during the partition scan.
F: Predictor codes are considered to be "free."
S: Predictor codes will be "selected" and separated from the remaining codes in forming trial partitions.
Default: M.
CODES=maxcode|(list of codes)
Either the value of the largest acceptable code or a list of acceptable codes.
Codes may range from 0 to 31. Cases with codes outside the range 0 to 31 are discarded.
Default: CODES=(0-9).
RANK=n Assigned rank. Rank 1 predictors are used before rank 2, rank 2 before rank
3, etc. A zero rank indicates that statistics are to be computed for the
predictors, but they are not to be used in the partitioning.
Default: RANK=1.
Predefined Split Statements
If predefined splits are desired, supply one set of parameters for each predefined split.
GNUM=n
Number of the group to be split. Groups are specified in ascending order, where the entire original sample is group 1. Each set of parameters forms two new groups.
Default: none, GNUM must be supplied.
VAR=variable number
Predictor variable used to make the split.
Default: none, VAR must be supplied.
CODES=(list)
List of the predictor codes defining the first subgroup. All other codes will belong to the second subgroup.
Default: none, CODES must be specified.
Splitting criteria
There can be four splitting criteria, based on the dependent variable type:
Means
Regressions
Classifications
Ranks
The splitting criterion in each case is the reduction in ignorance (error variance, etc.) or increase in information. Terms like classification and regression trees should be replaced by binary segmentation or unrestricted analysis of variance components, or searching for structure. With rich bodies of data, many non-linearities and non-additivities possible, and many competing theories, the usual restrictions and assumptions that one is testing a single model are not appropriate. What does remain, however, is a systematic, pre-stated searching strategy that is reproducible, not a free ransacking.
Means. For means the splitting criterion is the reduction in error variance, that is, the sum of squares around the mean, using two subgroup means instead of one parent group mean.
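As a worked check, the means criterion can be computed in Python from the group sizes and means alone. The sketch below uses the figures printed in Example 1 for the predefined split of Group 1 into Groups 2 and 3; the function name is an assumption made for the example.

```python
def between_ss(n1, mean1, n2, mean2):
    """Reduction in the error sum of squares from using two subgroup
    means instead of the single parent group mean."""
    n = n1 + n2
    parent_mean = (n1 * mean1 + n2 * mean2) / n
    return n1 * (mean1 - parent_mean) ** 2 + n2 * (mean2 - parent_mean) ** 2

# Example 1: Group 2 (N=299, mean 10528.3) and Group 3 (N=27, mean 9594.30).
gain = between_ss(299, 10528.3, 27, 9594.30)
total_variation = 0.183555e11          # Variation of Group 1
pct = 100.0 * gain / total_variation
print(round(pct, 2))
```

The result, about 0.12 percent of the total variation, agrees with the V37 entry for Group 1 in the Example 1 table of best splits.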
Regressions. For regressions (y=a+bx) the splitting criterion is the reduction in error variance from using two regressions rather than one.
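The regression criterion can be sketched the same way in Python: fit one least-squares line to the parent group, fit a line to each subgroup, and take the difference in residual sums of squares. The function names and the toy data are assumptions for illustration only.

```python
def regression_sse(x, y):
    """Residual sum of squares around the least-squares line y = a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    syy = sum((yi - my) ** 2 for yi in y)
    if sxx == 0:                 # no variation in x: fall back to the mean
        return syy
    return syy - sxy * sxy / sxx

def regression_gain(x1, y1, x2, y2):
    """Error reduction from two subgroup regressions instead of one."""
    pooled = regression_sse(x1 + x2, y1 + y2)
    return pooled - regression_sse(x1, y1) - regression_sse(x2, y2)

# Hypothetical subgroups with opposite slopes: one pooled line fits badly,
# while each subgroup line fits exactly.
x1, y1 = [0, 1, 2], [0, 1, 2]
x2, y2 = [0, 1, 2], [2, 1, 0]
reg_gain = regression_gain(x1, y1, x2, y2)
print(reg_gain)
```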
Classifications. For classifications (categorical dependent variable), the splitting criterion is the likelihood-ratio chi-square for dividing the parent group into two subgroups.
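For a two-subgroup split, the likelihood-ratio chi-square has the standard form G2 = 2 * sum(O * ln(O / E)) over the cells of the subgroup-by-category table. The Python sketch below implements that textbook form; it is not taken from the program, and the function name and counts are hypothetical.

```python
import math

def lr_chi_square(counts1, counts2):
    """Likelihood-ratio chi-square for splitting a parent group into two
    subgroups, given each subgroup's counts per dependent-variable code."""
    n1, n2 = sum(counts1), sum(counts2)
    n = n1 + n2
    g2 = 0.0
    for o1, o2 in zip(counts1, counts2):
        column = o1 + o2
        for observed, row in ((o1, n1), (o2, n2)):
            if observed > 0:     # empty cells contribute nothing
                # expected count is row * column / n
                g2 += 2.0 * observed * math.log(observed * n / (row * column))
    return g2

same = lr_chi_square([10, 20], [20, 40])      # identical distributions
separated = lr_chi_square([10, 0], [0, 10])   # completely different
print(same, separated)
```

Identical subgroup distributions give zero; the more the two distributions differ, the larger the statistic.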
Ranks. For rankings (ordered dependent variable), the splitting criterion is the tau-b rank correlation.
Stopping Rules
There are four stopping rules, each with a default option:
Maximum number of splits. Default: 25.
Minimum number in any final group. Default: 25.
Minimum reduction in error, relative to the original total. Default: 0.8 percent.
Maximum null probability. Default: none, no significance test.
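The four rules above can be combined into a single check, sketched here in Python. The defaults mirror those listed; the class and function names, and the order in which the rules are tested, are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class StoppingRules:
    max_splits: int = 25                     # MAX
    min_group_size: int = 25                 # MIN
    min_explained_pct: float = 0.8           # EXPL
    max_null_prob: Optional[float] = None    # NULL (None = no test)

def split_allowed(rules, splits_so_far, n_left, n_right, gain_pct,
                  null_prob=None):
    """True if a candidate split passes every stopping rule."""
    if splits_so_far >= rules.max_splits:
        return False
    if min(n_left, n_right) < rules.min_group_size:
        return False
    if gain_pct < rules.min_explained_pct:
        return False
    if (rules.max_null_prob is not None and null_prob is not None
            and null_prob > rules.max_null_prob):
        return False
    return True

rules = StoppingRules()
print(split_allowed(rules, 0, 100, 100, 5.0))   # passes all rules
print(split_allowed(rules, 0, 100, 20, 5.0))    # one group too small
```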
A combination of the minimum number in any final group and the minimum reduction in error is a primitive significance test, but a more formal test is possible. Assuming that the minimum in any final group is 15 or more, the degrees of freedom for any test will be over 30, large enough to assure reasonable normality, and a Z-ratio (ratio of the gain from a split relative to its standard error) would be 2.33 for a maximum probability that there is nothing there (null hypothesis) of .01. The loss from trying several splits is small if predictor order is maintained, or each class is only tried against all the others (k-1 or k).
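The correspondence between a Z-ratio of 2.33 and a null probability of .01 is that of a one-sided normal test. It can be checked with Python's standard library; the two helper names are hypothetical.

```python
from statistics import NormalDist

def z_threshold(max_null_prob):
    """Z-ratio a split must reach for a given one-sided null probability."""
    return NormalDist().inv_cdf(1.0 - max_null_prob)

def null_probability(z):
    """One-sided null probability for an observed Z-ratio."""
    return 1.0 - NormalDist().cdf(z)

z01 = z_threshold(0.01)
p233 = null_probability(2.33)
print(round(z01, 2), round(p233, 4))
```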
For the tau-b option, we cannot define a minimum reduction in error, relative to the original total, so we use a minimum tau-b value for each split. Even with means a minimum error reduction can cause difficulty if the first few splits account for a large fraction of the variance, and the "significance level," however fraudulent, is perhaps a better stopping rule.
For the means and ranks criteria, the maximum null probability stopping rule is based on Z, the ratio of the gain from a split to its standard error, using the normal distribution for the null probabilities. For the regression option, we use an F-test to get the null probabilities, and for the chi option, we use the chi-squared distribution.
We do not multiply the null probabilities by the number of alternatives tried (the Bonferroni correction), since for monotonic predictors or select predictors with fewer than 10 categories, the alternatives are few enough and not really independent. We suggest not using the "free" option with more than three or four categories.
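For comparison, the correction that SEARCH does not apply is simple to state. The Python sketch below shows how strongly it would inflate a null probability for a monotonic predictor with ten classes (nine tries); the function name and the numbers are hypothetical.

```python
def bonferroni(p_value, tries):
    """Null probability multiplied by the number of alternatives tried,
    capped at 1.0."""
    return min(1.0, p_value * tries)

adjusted = bonferroni(0.01, 9)   # 10 classes, monotonic: k-1 = 9 tries
capped = bonferroni(0.2, 9)
print(adjusted, capped)
```

Because the nine alternatives overlap heavily, the true probability lies well below the adjusted figure, which is why the correction is described as an overcorrection.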
References
Agresti, Alan (1996), Introduction to Categorical Data Analysis.
Chow, G. (1960), "Tests of Equality between Sets of Coefficients in Two Linear Regressions," Econometrica, 28:591-605.
Dunn, Olive Jean, and Virginia A. Clark (1974), Applied Statistics: Analysis of Variance and Regression.
Gibbons, Jean Dickinson (1997), Nonparametric Methods for Quantitative Analysis, 3rd edition.
Hays, William (1988), Statistics, 4th edition.
Klem, Laura (1974), "Formulas and Statistical References," in Osiris III, Volume 5.
Sonquist, J. A., E. L. Baker and J. N. Morgan (1974), Searching for Structure, revised edition.
Example 1: Investigates income (V268) using ANALYSIS=MEANS
File assignments: dictin=scf.dic datain=scf.dat
Page title: SEARCH SAMPLE SETUP for Means Analysis, Predefined split
parameter statement: depv=v268 outd=2 idvar=v3 analysis=means expl=.1 min=25
predictor statements: v=v32 codes=(0-8)
v=v37,v251,v30
predefined split: gnum=1 var=v37 codes=1
end
*** SEARCH - SEARCHING FOR STRUCTURE ***
ANALYSIS TYPE: MEANS
Using input dictionary: D:\PROJECTS\TESTDATA\SCF.DIC
Using input data file: D:\PROJECTS\TESTDATA\SCF.DAT
Number of variables: 6
Variables containing invalid characters will be assigned missing-data code 1
The data are not weighted
Dependent variables: 268
Predictor variables: 32 37 251 30
The number of cases rejected is 1:
1 for code outside range
The number of cases is 326
The partitioning ends with 9 final groups
The variation explained is 38.2 percent
One-way Analysis of Final Groups
Source Variation DF
Explained .701177E+10 8
Error .113438E+11 317
Total .183555E+11 325
Group 1, N=326
Mean(Y)=10451.0, Var(Y)=.564786E+08, Variation=.183555E+11
Into Group 2, Codes 1
And Group 3, Codes 0,2-9
Group 2, N=299
Mean(Y)=10528.3, Var(Y)=.570540E+08, Variation=.170021E+11
Into Group 4, Codes 1
And Group 5, Codes 2-5
Group 4, N=221
Mean(Y)=12449.9, Var(Y)=.571999E+08, Variation=.125840E+11
Into Group 6, Codes 1-5
And Group 7, Codes 6-8
Group 6, N=171
Mean(Y)=10932.9, Var(Y)=.430128E+08, Variation=.731217E+10
Into Group 8, Codes 0
And Group 9, Codes 1-9
Group 9, N=142
Mean(Y)=12230.1, Var(Y)=.402303E+08, Variation=.567247E+10
Into Group 10, Codes 1-3
And Group 11, Codes 4-9
Group 11, N=115
Mean(Y)=11393.4, Var(Y)=.249652E+08, Variation=.284603E+10
Into Group 12, Codes 4-6
And Group 13, Codes 7-9
Group 12, N=69
Mean(Y)=11929.2, Var(Y)=.212965E+08, Variation=.144816E+10
Into Group 14, Codes 1-3
And Group 15, Codes 4,5
Group 5, N=78
Mean(Y)=5083.86, Var(Y)=.167531E+08, Variation=.128999E+10
Into Group 16, Codes 0
And Group 17, Codes 1,2,4-9
Final Group Summary Table
Group 3, N=27
Mean(Y)=9594.30, Var(Y)=.512249E+08, Variation=.133185E+10
Group 7, N=50
Mean(Y)=17638.2, Var(Y)=.720890E+08, Variation=.353236E+10
Group 8, N=29
Mean(Y)=4580.97, Var(Y)=.823915E+07, Variation=.230696E+09
Group 10, N=27
Mean(Y)=15793.6, Var(Y)=.924261E+08, Variation=.240308E+10
Group 13, N=46
Mean(Y)=10589.8, Var(Y)=.299634E+08, Variation=.134835E+10
Group 14, N=28
Mean(Y)=13030.6, Var(Y)=.309307E+08, Variation=.835128E+09
Group 15, N=41
Mean(Y)=11177.0, Var(Y)=.138968E+08, Variation=.555873E+09
Group 16, N=35
Mean(Y)=3383.49, Var(Y)=.515942E+07, Variation=.175420E+09
Group 17, N=43
Mean(Y)=6467.88, Var(Y)=.221668E+08, Variation=.931006E+09
Percent Total Variation Explained by Best Split for Each Group (*=Final Groups)
          1      2     3*      4      5      6     7*     8*      9    10*
V32   12.00  11.90   0.00   9.48   0.86   3.62   0.00   0.00   0.68   0.00
V37    0.12   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00
V251  18.12  16.90   0.00   9.14   1.00   7.68   0.00   0.00   2.31   0.00
V30   17.92  17.04   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00
Percent Total Variation Explained by Best Split for Each Group (*=Final Groups) - continued
         11     12    13*    14*    15*    16*    17*
V32    0.16   0.31   0.00   0.00   0.00   0.00   0.00
V37    0.00   0.00   0.00   0.00   0.00   0.00   0.00
V251   0.27   0.01   0.00   0.00   0.00   0.00   0.00
V30    0.00   0.00   0.00   0.00   0.00   0.00   0.00
Group TREE Structure
Group 1: All Cases
N=326, Mean(Y)=10451.0
  Group 2 V37: RACE, Codes 1
  N=299, Mean(Y)=10528.3
    Group 4 V30: MARITAL STATUS, Codes 1
    N=221, Mean(Y)=12449.9
      Group 6 V32: EDUC OF HEAD, Codes 1-5
      N=171, Mean(Y)=10932.9
        Group 8 V251: OCCUPATION B, Codes 0
        N=29, Mean(Y)=4580.97
        Group 9 V251: OCCUPATION B, Codes 1-9
        N=142, Mean(Y)=12230.1
          Group 10 V251: OCCUPATION B, Codes 1-3
          N=27, Mean(Y)=15793.6
          Group 11 V251: OCCUPATION B, Codes 4-9
          N=115, Mean(Y)=11393.4
            Group 12 V251: OCCUPATION B, Codes 4-6
            N=69, Mean(Y)=11929.2
              Group 14 V32: EDUC OF HEAD, Codes 1-3
              N=28, Mean(Y)=13030.6
              Group 15 V32: EDUC OF HEAD, Codes 4,5
              N=41, Mean(Y)=11177.0
            Group 13 V251: OCCUPATION B, Codes 7-9
            N=46, Mean(Y)=10589.8
      Group 7 V32: EDUC OF HEAD, Codes 6-8
      N=50, Mean(Y)=17638.2
    Group 5 V30: MARITAL STATUS, Codes 2-5
    N=78, Mean(Y)=5083.86
      Group 16 V251: OCCUPATION B, Codes 0
      N=35, Mean(Y)=3383.49
      Group 17 V251: OCCUPATION B, Codes 1,2,4-9
      N=43, Mean(Y)=6467.88
  Group 3 V37: RACE, Codes 0,2-9
  N=27, Mean(Y)=9594.30
Example 2: CHI analysis on variable V46.
File assignments: dictin=scf.dic datain=scf.dat
Page title: SEARCH SAMPLE SETUP, No predefined split
parameter statement: depv=v46 outd=2 idvar=v3 analysis=chi
predictor statements: v=v32 codes=(0-8)
v=v37,v251,v30 f
end
*** SEARCH - SEARCHING FOR STRUCTURE ***
ANALYSIS TYPE: CHI
Using input dictionary: D:\PROJECTS\TESTDATA\SCF.DIC
Using input data file: D:\PROJECTS\TESTDATA\SCF.DAT
Number of variables: 6
Variables containing invalid characters will be assigned missing-data code 1
The data are not weighted
Dependent variables: 46
Predictor variables: 32 37 251 30
The number of cases rejected is 1:
1 for code outside range
The number of cases is 326
The partitioning ends with 2 final groups
The variation explained is 2.4 percent
One-way Analysis of Final Groups
Source Variation DF
Explained 19.1247 3
Error 775.167 320
Total 794.292 323
Group 1, N=326, Variation=794.292
Into Group 2, Codes 1-3
And Group 3, Codes 4-8
Final Group Summary Table
Group 2, N=146, Variation=362.606
Group 3, N=180, Variation=412.561
Percent Total Variation Explained by Best Split for Each Group (*=Final Groups)
          1     2*     3*
V32    2.41   0.18   0.56
V37    0.66   0.00   0.00
Percent Total Variation Explained by Best Split for Each Group (*=Final Groups) - continued
          1     2*     3*
V251   1.53   0.45   0.52
V30    0.56   0.61   0.33
DEPENDENT VARIABLE PERCENT DISTRIBUTION FOR EACH GROUP (* = FINAL GROUPS)
Code      1     2*     3*
0      0.00   0.00   0.00
1     25.46  15.75  33.33
2      0.00   0.00   0.00
3     49.39  50.00  48.89
4      0.00   0.00   0.00
5     12.58  16.44   9.44
6      0.00   0.00   0.00
7      0.00   0.00   0.00
8     12.58  17.81   8.33
9      0.00   0.00   0.00
Group TREE Structure
Group 1: All Cases
N=326, Code(%)= 0(0.00) 1(0.25) 2(0.00) 3(0.49) 4(0.00) 5(0.13) 6(0.00) 7(0.00) 8(0.13) 9(0.00)
  Group 2 V32: EDUC OF HEAD, Codes 1-3
  N=146, Code(%)= 0(0.00) 1(0.16) 2(0.00) 3(0.50) 4(0.00) 5(0.16) 6(0.00) 7(0.00) 8(0.18) 9(0.00)
  Group 3 V32: EDUC OF HEAD, Codes 4-8
  N=180, Code(%)= 0(0.00) 1(0.33) 2(0.00) 3(0.49) 4(0.00) 5(0.09) 6(0.00) 7(0.00) 8(0.08) 9(0.00)