The purpose of our statistical analysis of Measuring Up 2000 is to determine if there are broad patterns across the states in the types of factors that predict aggregate state performance. We are particularly interested in separating the different influences on performance from broad environmental, demographic, and economic factors, in order to see how performance is associated with these measurable "mega-factors." To reach this aim, we gathered state-level data on 23 measures, organized them into three clusters of types of "drivers," and subjected them to a series of regression analyses to learn the relationships between them and the grades in Measuring Up. The three clusters are:
- Economic and demographic influences, reflected in data on the labor force, income, race and ethnicity, and the age distribution of the population;
- Institutional design or structural influences, as these are measured in data on the number and types of postsecondary institutions in the state; and
- Influences from funding levels, reflected in total resource availability and sources of revenue for all of education, from K-12 through postsecondary education.
The overall analytical framework guiding the analysis is that these clusters differ from one another in the type of state policy needed to influence performance within them. Understanding how these influences affect performance can suggest directions for policy, as well as the most effective target of policy: students, institutions, sectors, or the connections between higher education, K-12 schools, and the economy. For instance, performance problems attributable to economic and environmental factors require policy interventions that connect education with other aspects of social and economic policy. Performance that is more directly associated with funding or linked to system design requires different interventions.
The variables we identified and the way we clustered them are shown in Table 2. Some of these were drawn from the contextual data collected for Measuring Up, but we could not use the same measures that were incorporated into the grades in Measuring Up, since the point of the analysis was to test the relationships between the dependent and independent variables. Some of these are fairly standard measures within higher education, but a few were constructed for this analysis. Appendix I describes the data sources that we used for the measures, and the complete listing of state scores on each of the different measures is provided in Appendix II.
It is important to recall that these statistical tests will be an imperfect measure of the relative influences on performance, since the tests measure only the associations (relationships) between variables. It is always tempting to infer that associations mean influences, and that influences imply some degree of causation. From the tests alone, however, we cannot know the direction of influence or the degree of causation between these clusters and the performance measures in the report card. In addition, the data are so highly aggregated that it is best to think of the results as pointing to broad relationships that might be explored further in future research, in which variables may be representative of a cluster of characteristics rather than important in themselves.
Because of the limited number of observations (50), we could use only three to four independent variables in each equation. We therefore ran an exploratory analysis, using bivariate correlation matrices and scattergrams, to narrow down the variables that appeared most closely related to, and most relevant for, each performance indicator. The analysis examined which of the independent variables were highly correlated with each other, both within clusters and between clusters, to avoid entering two highly correlated variables into an equation together. These correlations are reported in Table 3. In particular, note that: (1) most of the variables in the funding category are highly correlated with each other (some are nearly identical in a statistical sense); (2) per capita income is correlated with several of the funding variables; and (3) the percentage attending private, not-for-profit four-year institutions is highly correlated with several of the funding variables, including aid per student, average tuition and fees, and tuition revenue as a percentage of total revenue.
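The screening step described above can be illustrated with a minimal sketch. The variable names, the toy values, and the 0.8 cutoff are assumptions made for illustration only; they are not the state data or a threshold reported in this analysis.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def collinear_pairs(data, cutoff=0.8):
    """Return (var_a, var_b, r) for every pair with |r| at or above the cutoff.

    The cutoff is a hypothetical choice, not one stated in the report.
    """
    flagged = []
    for a, b in combinations(data, 2):
        r = pearson(data[a], data[b])
        if abs(r) >= cutoff:
            flagged.append((a, b, round(r, 3)))
    return flagged


# Toy values for illustration only (not actual state measures).
toy = {
    "avg_tuition": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "aid_per_student": [1.2, 1.9, 3.2, 3.8, 5.1, 6.3],
    "pct_age_18_24": [3.0, 1.0, 4.0, 1.0, 5.0, 2.0],
}
flagged = collinear_pairs(toy)
print(flagged)
```

Pairs returned by `collinear_pairs` would be kept out of the same equation; which member of a flagged pair to retain remains a judgment call informed by theory.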
The exploratory analysis also identified the variables that made sense as predictor variables in theoretical terms and had the strongest bivariate correlations with each performance measure. Table 4 shows the variables in each cluster that have the highest bivariate correlations with each performance measure.
The results of the exploratory analysis were used to choose the independent variables included in the equations for each performance measure. In most cases, two or more candidate variables for a particular performance measure were highly correlated with each other; therefore, several equations were specified using different combinations of the variables. Backward elimination regressions (see note 3) were run on various combinations of three to four of these variables (given the correlations among the independent variables) for each performance measure. Each combination was chosen so that its variables represented at least two of the three clusters.
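One way to picture the selection rule is to enumerate every set of three or four candidates, then discard sets that draw on a single cluster or that pair two highly correlated variables. The sketch below assumes hypothetical variable names, cluster labels, and correlated pairs; it is not the procedure actually used to pick the reported equations.

```python
from itertools import combinations


def candidate_sets(clusters, correlated, sizes=(3, 4)):
    """List variable combinations that span at least two clusters and
    contain no pair marked as highly correlated.

    clusters:   dict mapping variable name -> cluster label
    correlated: set of frozenset({a, b}) pairs to keep apart
    """
    result = []
    for k in sizes:
        for combo in combinations(clusters, k):
            # Require variables from at least two different clusters.
            if len({clusters[v] for v in combo}) < 2:
                continue
            # Skip combinations that include a highly correlated pair.
            if any(frozenset(p) in correlated for p in combinations(combo, 2)):
                continue
            result.append(combo)
    return result


# Hypothetical candidates and cluster assignments, for illustration only.
clusters = {
    "per_capita_income": "economic",
    "pct_age_18_24": "economic",
    "pct_private_four_year": "structural",
    "avg_tuition": "funding",
    "aid_per_student": "funding",
}
correlated = {
    frozenset({"avg_tuition", "aid_per_student"}),
    frozenset({"pct_private_four_year", "avg_tuition"}),
}
sets_to_test = candidate_sets(clusters, correlated)
for combo in sets_to_test:
    print(combo)
```

Each surviving combination would then serve as the starting set for one backward elimination run.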
3. The regression models were reduced in order to eliminate variables that did not add to each model's ability to explain the variation in the dependent variable. In the backward elimination method, all of the independent variables are entered, then at each step the variable whose removal changes R-squared the least is dropped. The procedure stops when removing any of the remaining variables would produce a meaningful drop in R-squared. "Stepped out" variables are those that were removed from the equation through this process.
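The backward elimination procedure described in this note can be sketched as follows. This is a minimal illustration, not the statistical package routine used for the analysis: the 0.02 threshold for a "meaningful" change in R-squared, the variable names, and the toy values are all assumptions.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def r_squared(columns, y):
    """R-squared of an ordinary least squares fit of y on columns plus an intercept."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(X[0])
    XtX = [[sum(X[i][a] * X[i][b] for i in range(n)) for b in range(k)] for a in range(k)]
    Xty = [sum(X[i][a] * y[i] for i in range(n)) for a in range(k)]
    beta = solve(XtX, Xty)
    mean_y = sum(y) / n
    ss_tot = sum((v - mean_y) ** 2 for v in y)
    ss_res = sum((y[i] - sum(beta[j] * X[i][j] for j in range(k))) ** 2 for i in range(n))
    return 1.0 - ss_res / ss_tot


def backward_eliminate(data, y, threshold=0.02):
    """Drop, one at a time, the variable whose removal lowers R-squared the least,
    stopping when every remaining removal would cost at least `threshold`
    (a hypothetical cutoff). Returns (kept, stepped_out)."""
    kept, stepped_out = list(data), []
    while len(kept) > 1:
        full = r_squared([data[v] for v in kept], y)
        losses = {v: full - r_squared([data[u] for u in kept if u != v], y) for v in kept}
        weakest = min(losses, key=losses.get)
        if losses[weakest] >= threshold:
            break
        kept.remove(weakest)
        stepped_out.append(weakest)
    return kept, stepped_out


# Toy values for illustration only: y depends on x1 and x2 but not on x3.
toy = {
    "x1": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
    "x2": [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0],
    "x3": [2.0, 7.0, 1.0, 8.0, 2.0, 8.0, 1.0, 8.0],
}
y = [2.0 * a + b for a, b in zip(toy["x1"], toy["x2"])]
kept, stepped_out = backward_eliminate(toy, y)
print("kept:", kept, "stepped out:", stepped_out)
```

Statistical packages typically base the removal decision on a significance test for each coefficient rather than a fixed R-squared threshold; the fixed threshold here simply mirrors the wording of the note.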