
B. Analytic Issues

1. Estimation of Program Impacts


Some policymakers and program operators may be most interested in learning about the effects of offering Upward Bound services to eligible applicants. Others may be more interested in learning about the effects of actually participating in Upward Bound. The evaluation literature refers to the first impact as the "intent-to-treat" effect (ITT) and the second as the "complier average causal effect" (CACE) (see Angrist, Imbens, and Rubin 1996). Our estimates of the ITT are based on a comparison of students randomly assigned to the treatment group with students randomly assigned to the control group. Our estimates of the CACE are based on a comparison of Upward Bound participants to nonparticipants, using the outcome of random assignment (treatment or control status) as an instrumental variable to predict program participation.

Both sets of effects are estimated from the same analysis sample. The difference between the CACE and ITT effects can be illustrated by thinking about two hypothetical programs that are equally effective for participants but differ in their ability to induce students to participate. Suppose two programs (A and B) provide the same set of program services. Program A is very effective at inducing participation and has a 100 percent participation rate, whereas only 50 percent of students participate in Program B. The CACE for these two programs would be the same, but the ITT effect for Program A would be twice as large as that for Program B, reflecting the fact that Program A has a larger effect on the average applicant because it induces more applicants to actually receive program services.
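To make the arithmetic concrete, here is a minimal sketch assuming a hypothetical CACE of 10 percentage points for both programs (the value is illustrative, not from the study):

```python
# Illustrative arithmetic only; the 10-point CACE is a hypothetical value,
# and we assume no control group members receive services.
cace = 0.10                                # effect of participation, identical for A and B
participation = {"A": 1.00, "B": 0.50}     # participation rates from the example above

for program, rate in participation.items():
    itt = cace * rate                      # ITT = CACE x participation rate
    print(f"Program {program}: participation {rate:.0%}, ITT = {itt:.3f}")
# Program A: ITT = 0.100; Program B: ITT = 0.050, half as large, as described above.
```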



a. Estimating the Impacts of the Opportunity to Participate in Upward Bound

The ITT effect addresses the policy question, “What is the effect of the program on its intended beneficiaries—in this case, eligible applicants to the Upward Bound program?” The ITT effect is the more comprehensive measure of the program’s effect because it captures both the effect of the offer of program services and the effect of the services on those who receive them. Because eligible applicants were randomly chosen to be given the opportunity to participate in Upward Bound, we can obtain an unbiased estimate of this effect by computing the difference between the average outcome for the treatment group (those offered the opportunity to participate in Upward Bound) and the average outcome for the control group (those not offered the opportunity). In practice, we computed program effects by estimating a regression model with the outcome of interest as the dependent variable. Further details on our estimation procedures are presented later in this section.
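As a minimal sketch of this comparison, the snippet below computes the treatment-control difference in means and the equivalent regression coefficient on simulated data. All variable names and the data-generating process are hypothetical; the report's actual specification includes baseline controls and is described in Appendix E.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 2000
treatment = rng.integers(0, 2, n)                    # 1 = offered Upward Bound, 0 = control
outcome = (rng.random(n) < 0.40 + 0.05 * treatment).astype(float)  # binary outcome

# Under random assignment, the simple difference in means is an unbiased ITT estimate.
itt_diff = outcome[treatment == 1].mean() - outcome[treatment == 0].mean()

# Equivalent regression form: the coefficient on the treatment indicator is the ITT.
X = sm.add_constant(treatment.astype(float))
itt_reg = sm.OLS(outcome, X).fit().params[1]

print(round(itt_diff, 4), round(itt_reg, 4))         # the two estimates coincide
```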



b. Estimating the Impacts of Upward Bound Participation

The CACE answers a slightly different policy question: “What is the effect of the program on those most likely to be affected—that is, eligible applicants who actually participated in the program?” Because individuals who choose to participate in Upward Bound are self-selected from those who are given the opportunity and may have different characteristics from those who are given the opportunity and choose not to participate, estimating the CACE requires assumptions about the relationships between student characteristics, participation, and outcomes. Appendix E presents more details on those assumptions and estimation of the CACE. The estimates of the CACE, which capture the effects of actual participation in the program, account for both treatment group no-shows and control group crossovers.5

ITT estimates are generally of greater interest in evaluations of existing programs, while CACE estimates are of greater interest in the context of demonstrations designed to test the effectiveness of a particular bundle of services. We present both types of estimates in the text to facilitate comparison. Overall, the two sets of estimates tell similar stories, with nearly identical patterns of significant effects. Because the participation rate of the treatment group in Upward Bound or Upward Bound Math-Science was 85 percent, and about 14 percent of control group members received services from either program, the CACE estimates are approximately 41 percent higher than the ITT estimates.6
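The scaling between the two sets of estimates follows from the standard Wald (instrumental variables) ratio: the CACE is the ITT divided by the treatment-control difference in participation rates. A sketch using the rates cited above:

```python
# Participation rates cited in the text: 85 percent in the treatment group,
# about 14 percent in the control group (crossovers).
p_treat, p_control = 0.85, 0.14

scale = 1.0 / (p_treat - p_control)   # Wald ratio: CACE = ITT / (p_treat - p_control)

def wald_cace(itt_estimate: float) -> float:
    """Rescale an ITT estimate into a CACE estimate via the Wald ratio."""
    return itt_estimate * scale

print(round(scale - 1.0, 3))          # 0.408, i.e., "approximately 41 percent" higher
```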

c. Regression-Adjusted Estimates

To compute the effects of Upward Bound, we use a regression-adjusted approach in which we estimate a statistical model that predicts the outcome of interest as a function of treatment status and a number of baseline characteristics. We identify a core set of baseline characteristics as control variables based on the criterion that they have predictive power in regression models for key outcomes. The same set of variables is used to estimate impacts for all outcome measures. Including the baseline characteristics as control variables in the regression model allows us to (1) increase the precision with which we estimate the effects and (2) adjust for chance differences between the treatment and control groups on these characteristics. Appendix E provides more details about our estimation procedures.

Table II.2 lists the control variables that are included in the regression specifications, along with the reference groups.7 In addition to the variables listed in the table, our regression model included an intercept and a binary variable indicating whether the sample member was in the treatment group. Because all of the outcome measures presented in this report are binary variables, we estimated logistic regression models. Our method for estimating the standard errors of the impact estimates takes into account the stratification of projects and the clustering of students within projects.
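A sketch of a specification of this kind, on simulated data: the single baseline control, the weights, and the data-generating process are all hypothetical, and the sketch reflects the clustering of students within projects but not the stratification of projects.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 1000
project_id = rng.integers(0, 67, n)              # cluster identifier (67 projects)
treatment = rng.integers(0, 2, n).astype(float)  # treatment group indicator
baseline_gpa = rng.normal(2.5, 0.5, n)           # one illustrative baseline control
X = sm.add_constant(np.column_stack([treatment, baseline_gpa]))
y = (rng.random(n) < 0.40 + 0.05 * treatment).astype(float)  # binary outcome
w = rng.uniform(0.5, 2.0, n)                     # stand-in sample weights

# Weighted logit via a binomial GLM (freq_weights used here as approximate
# survey weights); standard errors clustered at the project level.
model = sm.GLM(y, X, family=sm.families.Binomial(), freq_weights=w)
result = model.fit(cov_type="cluster", cov_kwds={"groups": project_id})
print(result.summary())
```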


2. Subgroup Analysis


To assess whether Upward Bound is more effective for some eligible applicants than for others, we estimated impacts for subgroups based on a set of applicant characteristics.8 These subgroups are listed in Table II.3.9
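Mechanically, the subgroup analysis repeats the impact estimation within each subgroup. The sketch below uses a simple difference in means as a stand-in for the report's regression-adjusted estimator; the subgroup indicator and outcome are simulated.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 1000
treatment = rng.integers(0, 2, n)
expects_ba = rng.integers(0, 2, n)    # hypothetical subgroup indicator (see Table II.3)
enrolled = (rng.random(n) < 0.40 + 0.05 * treatment).astype(float)

def estimate_impact(y, t):
    """Stand-in for the regression-adjusted estimator: simple difference in means."""
    return y[t == 1].mean() - y[t == 0].mean()

# Estimate the impact separately within each subgroup.
for label, mask in [("expects BA or higher", expects_ba == 1),
                    ("lower expectations", expects_ba == 0)]:
    print(label, round(estimate_impact(enrolled[mask], treatment[mask]), 3))
```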

Chapter IV devotes special attention to characteristics of eligible applicants that may be of particular interest in identifying students likely to benefit from Upward Bound because the characteristics are indicative of students' risk of poor academic outcomes. In the main text, we discuss the effects on the subgroups defined by the first four characteristics in Table II.3; the results for the remaining subgroups are presented in Appendix I. In general, subgroups are based on characteristics of applicants at the time they applied for Upward Bound. However, three subgroup characteristics are defined based on ninth-grade transcripts: level of ninth-grade mathematics class, ninth-grade GPA, and the academic performance index. For students who applied to Upward Bound in eighth or ninth grade, our measures of these characteristics could be affected by Upward Bound participation if Upward Bound has an immediate effect on high school course taking and grades.

The grade at which a student applies to Upward Bound may offer some insight into the student's motivation and attitudes; those who consider the program and apply earlier may be more motivated to pursue higher education. To measure educational expectations directly, our baseline survey, administered before random assignment, asked students to indicate the highest level of education they expected to achieve. For the evaluation, we classified sample members as having lower educational expectations if they did not expect to obtain at least a bachelor's degree. Previous reports have found relatively large effects of Upward Bound for eligible applicants with lower educational expectations as defined by this measure (Myers and Schirm 1999; Myers et al. 2004). However, when recruiting for Upward Bound, projects may find it difficult to target students based on this measure of educational expectations. If applicants are aware that their likelihood of admission to Upward Bound is influenced by their response to a question about educational expectations, some may modify their responses to increase their chances of acceptance.

We therefore also consider measures of educational performance that are less likely to be affected by such influences. The first is the level of mathematics class taken in ninth grade, in particular whether the sample member took algebra or above rather than a course below algebra. Approximately two-thirds of sample members took algebra or above, and one-third took a course below algebra. This measure may be interpreted as a more readily available proxy for educational expectations, as students who do not take algebra or above in ninth grade would generally not be "on track" to attend a four-year college or university. The level of mathematics class taken in ninth grade reflects, however, not only expectations but also prior academic performance. Among control group members, 60 percent of those who took algebra or above in ninth grade attended a four-year college or university within four to six years after high school graduation, compared with just 30 percent of those who took a course below algebra. Our second measure of academic performance early in high school is grade point average (GPA) in ninth grade, which separates sample members into higher-GPA (greater than 2.5) and lower-GPA (less than 2.5) groups, with approximately half of the sample members in each group. We sought measures that would be highly correlated with educational expectations and therefore potentially useful for targeting; that said, several of the variables were, a priori, potentially interesting for targeting purposes even if they are not highly correlated with expectations.

In addition to the subgroups based on characteristics of Upward Bound applicants, we also analyze program effects by subgroups based on project-level characteristics, which are listed in Table II.4. Data on several of the project-level characteristics are derived from the Survey of Upward Bound Grantees, conducted in 1993-94 as part of this evaluation (for details, see Fasciano and Jacobson 1997). We present the estimated effects for the subgroups defined above in Appendix I.

3. Use of Weights to Account for Sample Design and Survey Nonresponse


The sample members included in our analyses were weighted using standard weighting procedures to ensure that they represent all eligible applicants to Upward Bound nationwide. The weights account for three probabilities: (1) the probability that the sample member’s Upward Bound project was selected for the study, (2) the probability that the sample member was selected to be given the opportunity to participate in Upward Bound, and (3) the probability that the sample member responded to the fifth follow-up survey. Appendix A describes the weighting procedures in more detail.10
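In sketch form, each student's weight is the inverse of the product of these three probabilities. The values below are purely illustrative stand-ins for the design-based and nonresponse-adjusted probabilities described in Appendix A.

```python
import numpy as np

# Hypothetical probabilities for three students; in the study these come from
# the sample design and a nonresponse adjustment, not from values like these.
p_project_selected = np.array([0.30, 0.30, 0.60])  # project sampled into the evaluation
p_student_selected = np.array([0.50, 0.50, 0.67])  # student selected at random assignment
p_responded = np.array([0.85, 0.90, 0.80])         # responded to the fifth follow-up

weight = 1.0 / (p_project_selected * p_student_selected * p_responded)
print(weight)  # students who were less likely to appear in the data get larger weights
```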

Use of the weights is necessary for four reasons. First, without the weights, the projects selected for the evaluation will not be representative of the universe of Upward Bound projects that existed during the 1992–93 academic year. Due to the highly variable selection probabilities, the unweighted sample of projects does not resemble the full population of projects from which the sample was selected; therefore, unweighted estimates do not measure the impacts of the national Upward Bound program and, more generally, have no readily apparent interpretation.11 Second, the weights are needed for generalizing the findings to the national population of students targeted by the Upward Bound program. Third, due to unequal selection probabilities of projects, we need to use the weights to appropriately account for the imprecision in impact estimates that results from the sampling design. Without the weights, we would obtain not only biased point estimates of impacts but also underestimated standard errors. Fourth, the use of weights reduces the possibility of missing data bias in the estimated impacts.

Given that the evaluation is based on a sample of eligible Upward Bound applicants, the estimated program effects presented in this report contain sampling error. To determine whether the estimated program effects can be attributed to the true effects of Upward Bound rather than to statistical chance, we tested whether the estimates are statistically significant—significantly different from 0—at three levels commonly used in conducting such tests. More specifically, we conducted two-tailed tests of the null hypothesis that “Upward Bound had no effect” against the alternative that “Upward Bound had an effect” at the 0.10 level, the 0.05 level, and the 0.01 level. Throughout the report, the term “significant” is used to denote statistical significance, not substantive significance. In the chapters that follow, we note that Upward Bound had an effect on a particular outcome only if the estimated effect on the outcome measure is statistically significant. If the estimated effect of Upward Bound on an outcome variable is not statistically significant, the true effect of Upward Bound cannot necessarily be assumed to be 0.
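A sketch of how such a two-tailed test could be flagged at the three levels, assuming a hypothetical impact estimate and standard error and a normal reference distribution:

```python
from scipy import stats

def significance_stars(estimate: float, std_error: float) -> str:
    """Two-tailed test of H0 'no effect' at the 0.10, 0.05, and 0.01 levels."""
    z = estimate / std_error
    p_value = 2 * (1 - stats.norm.cdf(abs(z)))
    if p_value < 0.01:
        return "***"   # significant at the 0.01 level
    if p_value < 0.05:
        return "**"    # significant at the 0.05 level
    if p_value < 0.10:
        return "*"     # significant at the 0.10 level
    return ""          # not statistically significant

print(significance_stars(0.05, 0.02))   # z = 2.5, p ~ 0.012 -> "**"
```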

The power analysis conducted for this evaluation in Myers et al. (1993) suggests that for any binary outcome measure, the study sample design allows for the detection of an impact of 0.10 to 0.15 standard deviations, depending on the effective sample size available for the analysis. This means that the sample design allows for the detection of an impact of 4 to 6 percentage points for postsecondary enrollment (assuming a 75 percent enrollment rate) and 5 to 7 percentage points for completion of a postsecondary credential (assuming a 30 percent completion rate). Under the assumptions of these power calculations, the study would identify an impact that is significant at the 5 percent level 80 percent of the time; however, we may find smaller impacts that are significant, as well as larger impacts that are not. The education evaluation literature (see, for example, Cohen 1988; Lipsey and Wilson 1993) suggests that impacts of these sizes can be considered small, and the Department of Education's What Works Clearinghouse requires an effect size larger than 0.25 for an impact to be regarded as substantively important. Therefore, even if different from 0, the true effect is likely to be small if the estimated effect is not statistically significant, because the sample was designed to detect program effects of a small size.
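The percentage-point figures follow from converting standard-deviation units using the standard deviation of a binary outcome, sqrt(p(1 - p)); a sketch reproducing the (rounded) ranges above:

```python
import math

def mde_percentage_points(mde_sd: float, base_rate: float) -> float:
    """Convert an MDE in standard-deviation units to percentage points
    for a binary outcome with the given base rate."""
    return 100 * mde_sd * math.sqrt(base_rate * (1 - base_rate))

for rate, label in [(0.75, "postsecondary enrollment"),
                    (0.30, "credential completion")]:
    low = mde_percentage_points(0.10, rate)
    high = mde_percentage_points(0.15, rate)
    print(f"{label}: {low:.1f} to {high:.1f} percentage points")
# ~4.3 to 6.5 points at a 75 percent base rate and ~4.6 to 6.9 points at a
# 30 percent base rate, consistent with the rounded ranges cited above.
```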


4. Potential Threats to the Study Design


Evaluations based on random assignment face potential threats that can bias the resulting effect estimates. For this study, the bias is likely to be small, as the random assignment protocol was implemented correctly (see Myers et al. 1993). Analysis of baseline data for the treatment and control groups found small differences in some baseline characteristics between the two groups that arose by chance (see Myers and Schirm 1997).

A potential threat to the validity of the study is crossover, or control group contamination. After randomly assigning students to the treatment and control groups, we discovered that project directors at some of the 67 projects conducting random assignment had allowed 29 control group members (2.2 percent) to participate in regular Upward Bound. In the follow-up student surveys, an alternative source of information on Upward Bound participation, 14 additional control group members (1.1 percent) reported that they had participated in Upward Bound projects, including projects not in the evaluation and thus not conducting random assignment. Upward Bound Math-Science participation was reported by 148 control group members, including 11 who had previously reported participating in regular Upward Bound and 137 (10.4 percent) who had not; see Appendix E for details on how we identify participation in Upward Bound and Upward Bound Math-Science programs. All told, 43 control group members (3.3 percent) participated in regular Upward Bound, and 180 (13.6 percent) participated in either Upward Bound or Upward Bound Math-Science. In our ITT analysis, we maintained each control group member's original status in order to preserve the comparability between the two groups created by random assignment; in our CACE analysis, all 180 were treated as noncompliers.12

Control group members may have been affected by the experience of applying for Upward Bound and being turned away in a way that differs under the evaluation from what would occur under normal circumstances. They could have been demoralized by not receiving the opportunity to enroll in the program; on the other hand, they may have been referred to other programs when they were notified that they would not be able to participate in Upward Bound. In either case, the "intervention" (a demoralizing rejection or a helpful referral) is likely to be much less intense than the Upward Bound program itself.13 Therefore, any potential bias in the effect estimates is likely to be small relative to the potential effects of the program.
