CASE STUDY #1 From the Headlines:
On April 27, 2016, the New York Times described the latest results from the 2015 National Assessment of Educational Progress (NAEP) as follows:
"The results, from the National Assessment of Educational Progress, showed a drop in the percentage of students in private and public schools who are considered prepared for college-level work in reading and math. In 2013, the last time the test was given, 39 percent of students were estimated to be ready in math and 38 percent in reading; in 2015, 37 percent were judged prepared in each subject.
“This trend of stagnating scores is worrisome,” said Terry Mazany, the chairman of the governing board for the test...
The math tests are scored from zero to 300, and in 12th grade, the average dropped to 152 in 2015 from 153 in 2013, a statistically significant decline." (Zernike, 2016)
How do you interpret these results? Do they have practical significance? Should we be bemoaning a decline in student performance?
SOLUTION: These results have no practical importance for policy or practice. This is a classic case where a big deal is made of small differences simply because a large sample caused them to be statistically significant. At the same time, one must be vigilant to see if there is additional decline on the next NAEP test.
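The point about sample size can be illustrated with a minimal sketch. It takes the 153 to 152 drop from the article, but the standard deviation of 34 and the group sizes are illustrative assumptions, not reported NAEP figures, and the calculation treats the data as simple random samples, which NAEP's actual design is not. The function name two_group_summary is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_group_summary(mean_a, mean_b, sd, n_per_group):
    """Return (Cohen's d, z statistic, two-sided p-value) for two equal-sized groups.

    Assumes both groups share the same standard deviation and are simple
    random samples, which simplifies how a survey like NAEP is sampled.
    """
    diff = mean_a - mean_b
    d = diff / sd                            # effect size in SD units
    se = sd * sqrt(2 / n_per_group)          # standard error of the difference
    z = diff / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))   # two-sided p-value
    return d, z, p


# The 153 -> 152 drop is from the article; sd=34 and both group sizes
# are illustrative assumptions, not reported NAEP figures.
for n in (100, 10_000):
    d, z, p = two_group_summary(152, 153, sd=34, n_per_group=n)
    print(f"n={n:>6}: d={d:.3f}, z={z:.2f}, p={p:.3f}")
```

Under these assumptions the effect size is identical at both sample sizes (about 0.03 of a standard deviation), yet only the larger sample falls below the conventional .05 threshold. The difference did not get bigger; the sample did.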
CASE STUDY #2 The Effectiveness of Cognitive Tutor
Is the Cognitive Tutor curriculum effective for improving student performance in Algebra I in both middle and high schools? Clearly, passing Algebra is a major hurdle for many students and is a gateway to developing the higher-level math and science skills that are of increasing importance. As a result, increasing the passing rates in Algebra is a national priority and is critical to increasing equity. This curriculum combines traditional types of materials with the latest technology, an individualized intelligent math tutor.
In order to assess the practical significance of this intervention for your school(s), consider the following research:
Pane, J. F., Griffin, B. A., McCaffrey, D. F., and Karam, R. (2014). Effectiveness of cognitive tutor algebra I at scale. Educational Evaluation and Policy Analysis, 36(2), 127-144.
Step 1. Download the article either from your university's library or from the http://www.ncpeapublications.org/index.php/doc-center site.
Step 2. Read the entire article but ignore all technical terms that you do not understand.
Step 3. Critique the study by:
a. Listing the positives of the study’s methodology
b. Identifying the key missing piece(s) of data in the results provided
c. Assessing the results provided and whether you find them sufficiently compelling to consider adopting the program for your schools, and explaining why in terms of the specifics of the results.
A major positive of this study is that it has a high level of transparency and reflectiveness. The researchers clearly present the data in an impartial and open manner. They point out negative results for the intervention. There are also places where they point out that a conclusion of theirs can be interpreted differently. For example, in the first column on page 40, in trying to equate an ES of .2 with nationally expected growth, they note that the two are comparable, but they also note that their ES did not include summer loss, whereas the expected national growth did.
A second positive is that no members of the sample were dropped, even when not all the information needed to match them perfectly was available. In addition, the sample seems to be large and diverse enough to make the study relevant.
A third positive is how they reported the outcome data. Like the study in the previous case study, the researchers adjusted final scores via Analysis of Covariance. They clearly make such adjustments more carefully: they openly discuss the problems associated with adjustment and use advanced techniques and three different methods of adjustment to minimize potential problems. Tables 4 and 5 show that there is no difference in outcomes among the three levels of adjustment (Models 2, 3, and 4).
A fourth positive is that they analyze the results separately for middle and high school. As a result, it is clear that there is no rationale for middle schools to adopt this intervention. Consequently, the rest of the analysis will focus just on the high school results.
A fifth positive is that they conducted the study over two cohorts of students so that teachers were able to gain a year of experience, and indeed the high school results were better for Cohort II.
A sixth positive is that the researchers report both the unweighted results, i.e., the “no covariate” Model 1 in Table 4, and the fully weighted Model 4. They acknowledge on page 139 (Column 2) that the treatment effects for the unweighted results were “substantially lower than for the models with adjustments.” One minor criticism is that it would have been better for the researchers to simply state that “there was no significant treatment effect for the unweighted results.”
As an aside, one interesting thing to note is that the researchers thought that the weighted high school results improved in Cohort II because the teachers reverted to traditional approaches to teaching.
There is one big problem. There is no data on the actual post-test performance of the experimental and comparison groups. Everything is once again a description of the relative differences. We do not know how well the students did in Algebra or on any math test. How many passed? How many moved on to the next level of math? Why are we not told how the students actually did on the Algebra Proficiency Exam in a non-standardized form? Why not tell us how many of the 32 items on the post-test each group got correct?
Instead, they try to make the ES of .2 seem important by:
· First, indicating that it is equivalent to students moving from the 50th to the 58th percentile. But that does not mean that anyone scored at the 58th percentile. Did they move from the 10th to the 18th? Did the lower performing students make any equivalent percentile gains?
· Second, comparing this ES to typical achievement gains nationally in the affected grades. Not only was the ES reported in this study slightly smaller, it also did not include summer loss, so it is even smaller than what is typically achieved nationally.
In any event, this study continues the tradition of trying to make an ES at the smallest end of Cohen’s range seem more important than it actually is.
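The percentile claim can be checked with a short sketch. It assumes normally distributed scores and that every student gains the same 0.2 standard deviations; neither assumption comes from the study itself. The helper name percentile_after_shift is hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, SD 1


def percentile_after_shift(start_percentile, effect_size):
    """Percentile reached when a student at start_percentile gains effect_size SDs.

    Assumes normally distributed scores and the same gain for every student.
    """
    z_start = STD_NORMAL.inv_cdf(start_percentile / 100)
    return 100 * STD_NORMAL.cdf(z_start + effect_size)


# Starting percentiles chosen for illustration only.
for start in (10, 25, 50):
    end = percentile_after_shift(start, 0.2)
    print(f"{start}th percentile -> {end:.1f}")
```

Under these assumptions, an ES of .2 does move a student from the 50th to roughly the 58th percentile, but a student starting at the 10th percentile reaches only about the 14th, not the 18th. The "8 percentile points" framing describes the middle of the distribution and overstates the gain for lower performing students.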
Should you adopt the program?
c. Given that:
· The program is expensive,
· The intervention was not compared to traditional individualized drill and practice software,
· The progress was less than what students typically achieve nationally,
· The low ES does not meet recommended minimums for potential practical significance, and
· There is no absolute data on how the students actually did,
I would not recommend that anyone adopt this program based on this research.