Saturday, July 2, 2016

New Article by Gene Glass on the Limits of Meta-analysis in Education

Gene Glass is one of education's most accomplished and perceptive quantitative methodologists. He is the developer of meta-analysis, the statistical method for consolidating the findings of related studies into a single result.

In a recent article in the journal Educational Researcher*, Dr. Glass notes the following interesting conclusions about meta-analysis.

  • Meta-analysis is unique in that it is possibly the only widely used quantitative methodological tool born in education that has been adopted by medical research. Typically, education research seeks to emulate medical research to gain credibility as a science. In this case, medical research adopted an education methodology and now uses it to an even greater extent than education does.
  • Meta-analysis has not produced incontrovertible findings that can lead to education policy.

While the first conclusion is interesting and a coup for education research, the second is a bombshell. It is rare for the inventor of something to point out its limitations after an extended period of use. The problem is that, unlike medical research, where related studies and drug trials tend to find similar effects, in education the variability of findings within a given meta-analysis is usually much larger than the overall effect. This is yet another problem with relying on small effect sizes in education to drive policy and to conclude that something is effective.
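To make the contrast between the overall effect and the variability concrete, here is a minimal sketch. The ten effect sizes are invented for illustration and do not come from Dr. Glass's article or from any real meta-analysis. The sketch also uses a simple unweighted average, whereas a real meta-analysis would typically weight studies by their precision.

```python
from statistics import mean, stdev

# Hypothetical effect sizes (standardized mean differences) from ten studies
# in a single meta-analysis. These numbers are invented for illustration only.
effect_sizes = [0.85, -0.30, 0.10, 0.55, -0.15, 0.40, 0.05, 0.70, -0.20, 0.00]

overall_effect = mean(effect_sizes)  # simple unweighted average across studies
spread = stdev(effect_sizes)         # sample standard deviation across studies

print(f"Overall effect: {overall_effect:.2f}")
print(f"Spread across studies (SD): {spread:.2f}")
print(f"Spread exceeds the overall effect: {spread > overall_effect}")
```

With these made-up numbers the studies differ from one another by roughly twice the size of the average effect, which is the kind of pattern that makes a single overall result a shaky basis for policy.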

While the existence of large variation within a given meta-analysis is problematic, it also represents an opportunity. One can look inside the accompanying table that lists the characteristics of all the studies in the meta-analysis to identify those studies with the biggest effect sizes and then look at them to see if there are any useful common characteristics. That is the gist of the section "How to Review a Meta-Analysis" in Chapter 3 of the text. Of course, it may be that the studies with the highest effect sizes are at the early elementary grades, or are very short-term interventions using a measure developed by the researcher, which will be of little help if you are trying to raise the scores of a middle or high school on more standardized measures. However, you never know whether the results can help improve your practice until you look into the most promising studies.

* Glass, G. V. (2016). One Hundred Years of Research: Prudent Aspirations. Educational Researcher, 45(2), 69-72.

Wednesday, June 8, 2016

The Importance of Quantitative Data Mining to Improve Practice

One way to use quantitative research to improve practice is to apply simple techniques for critiquing the quantitative research in the top journals in terms of its practical significance, primarily in terms of how the experimental students actually performed. This is the emphasis in the first part of the alternative text Authentic Quantitative Analysis for Leadership Decision-Making and EdD Dissertations. Another, less discussed way of using quantitative data is "Data Mining." While data mining is extensively employed in the business world to improve organizational performance, it also has wide applicability in education. Data mining simply means finding a key metric amidst all the data flowing through and within a school or district, using it to initiate improvement action, and then monitoring changes to that metric.

The importance of data mining comes across in a recent story on National Public Radio by Elissa Nadworny entitled:  What One District's Data Mining Did For Chronic Absence.

This story details how the superintendent asked the question: Do we have a serious chronic truancy problem?

This leads to the questions:

  • How should chronic truancy be defined?
  • How to access that data?

The subsequent analysis showed that there was a huge unaddressed chronic truancy problem: 40% of students fell into that category. The rest of the story deals with the steps taken both initially and subsequently to address the problem.
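As a rough sketch of how little computation such a metric requires, the following example counts chronically absent students from a list of absences. The student records, the 180-day school year, and the 10% threshold are all assumptions made for illustration. They are not taken from the NPR story, and a district would substitute its own definition and data.

```python
# Hypothetical attendance records: student ID -> days absent this year.
# All values are invented for illustration.
days_absent = {
    "S01": 2, "S02": 25, "S03": 19, "S04": 0, "S05": 31,
    "S06": 5, "S07": 40, "S08": 9, "S09": 12, "S10": 3,
}

# ASSUMPTIONS: a 180-day school year, and "chronic" defined as missing
# at least 10% of those days. Districts may define this differently.
DAYS_IN_SCHOOL_YEAR = 180
CHRONIC_THRESHOLD = 0.10


def is_chronically_absent(absences, days_enrolled=DAYS_IN_SCHOOL_YEAR,
                          threshold=CHRONIC_THRESHOLD):
    """Return True if the share of days missed meets the threshold."""
    return absences / days_enrolled >= threshold


chronic_students = [sid for sid, absences in days_absent.items()
                    if is_chronically_absent(absences)]
chronic_rate = len(chronic_students) / len(days_absent)

print(f"Chronically absent students: {chronic_students}")
print(f"Chronic absence rate: {chronic_rate:.0%}")
```

Rerunning the same calculation each term is all that is needed to monitor whether the action plan is moving the metric.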

Throughout the improvement effort those involved continued to monitor progress on the metric. Several years later the number of chronic absentees had been cut in half. This case study has the following implications for EdD programs:

  • Data Mining is an important form of quantitative analysis that those in EdD programs are likely to apply in practice,
  • Such analyses do not require advanced statistics at any stage of the process, and
  • If you establish appropriate practices that are actively monitored and iteratively improved upon, within several years you can produce BIG improvements.

In terms of the last point, there is no need to calculate statistical significance or effect sizes. You have produced a large, highly visible improvement that does not require any statistic to tell you if the improvement is significant. It clearly is--and these are the types of improvements schools need to strive for.

At the same time, data mining by itself does not produce improvement. You need to incorporate research findings and local knowledge to develop an appropriate action plan. However, if leaders do not have a data mining perspective or impulse, problems can easily be overlooked. It also becomes easy to overlook that a plan of action to solve a problem is not having the intended effect. In this example, the strategy used by the district had no effect in the first year, and the district then had to improvise changes to the action plan.

What this means is that quantitative methods courses in EdD programs should spend some time on applied data mining.  Topics could include:

  • Identifying and defining key metrics of improvement,
  • Having students collect data around a metric of their choice for their school(s), 
  • Evaluating the results and developing an initial action plan, and
  • Methods of communicating the quantitative data to the entire community in an easy to understand fashion that can be used to mobilize a broad-scale response to problems.  

The URL for the story is:

Tuesday, May 31, 2016

What is the fundamental Basis of Scientific Thought/Discovery? Implications for the EdD

My work on research methodology talks a lot about the fundamental nature of science and scientific discovery. This is because much of what I read in education about the role of theory and how to interpret the practical implications of research results seems to me to be idealized, narrow, and artificial. Education's conceptions of the role of theory and of the application of research outcomes to practice appear to me to have become inbred in our wonderful profession, with a rhetoric and philosophy that have evolved in a fashion increasingly disjointed from the actual fundamental nature of science. This disjointedness is true for education in general and carries over to how theory, research, and science are presented in EdD programs.

Therefore, in order to understand the real role of theory and quantitative analysis I read a great deal about the fundamental nature of science and scientific discovery as conceived and practiced in the physical and medical sciences—as opposed to relying on prior education research texts and existing discussions about the role of theory in education. As a result, my text on Authentic Quantitative Analysis... has a great deal of original perspective on the role of theory and how to apply quantitative results to leadership practice that differs from existing conventional wisdom in education and in EdD programs.

Right now I am reading one of the most fascinating and unique books on the latest research findings in physics. The book, Seven Brief Lessons on Physics by Carlo Rovelli, is unique in that it is short, poetic, and non-technical, and still manages to explain in an easy-to-understand fashion the most important events and ideas in physics about the nature of the universe we exist in. If you want to understand quantum mechanics, Einstein's thought experiments, etc. in a quick and interesting read, this is the book for you.

The insight from this book that sparked this post is what the author claims is the fundamental essence that drives scientific discovery. Is it theory? Is it evidence? NO! The author indicates that:

...before experiments, measurements, mathematics, and rigorous deductions, science is above all about visions. Science begins with a vision. Scientific thought is fed by a capacity to "see" things differently than they have previously been seen.

This is why Chapter 4 of my text emphasizes the creative and metaphorical nature of scientific discovery and Part II talks about Design-Based Dissertations as an important option in EdD programs. The latter focuses on envisioning and trying new and unique approaches to solving problems regardless of whether they are grounded in existing academic theory or research.

Wednesday, May 18, 2016

A Reassessment of the Role of Theory by a Leading Researcher: Implications for EdD Dissertations

David Berliner is one of the top education researchers and helped define the field of educational psychology. He is a former president of the American Educational Research Association and co-author of, among other books, Educational Psychology, Handbook of Educational Psychology, and The Manufactured Crisis: Myths, Fraud, and the Attack on America's Public Schools. Dr. Berliner typically extols the traditional perspective of education research. This is why I was surprised to see him describe the role of theory in a very non-traditional fashion in a recent autobiography (in Education Review) summarizing what he has learned in his long and distinguished career. Dr. Berliner concludes that:
"Theory may be overrated. The journals and the scholarly community value “theory.” But I have done a lot of research on teachers and teaching without much theory to guide me. We deal with the practical in education, and the practical is filled with complexity, some of which is hard to fit into psychological or any other social science theory...
A good question is a good question, and should be pursued. Working from a Piagetian or Vygotskian theory is nice, and thinking about the world from a Freirean position, or asking what would Derrida say, is also to be lauded. But in my research career, a good question sensibly answered is worth its weight in gold. So I have come to believe that dust-bowl empiricism is too often dismissed as inadequate, and that theory in education research is too often over emphasized. I am more impressed with the quality of the question asked and the attempt to answer it, and less impressed with the quality of the theory."  

Implications for EdD Dissertations

Dr. Berliner's statements point to the need to rethink the traditional perspective on the role of theory in an applied program such as the EdD. The question is how to do that. It is clearly appropriate for EdD programs to expect students to exhibit knowledge of theory and to demonstrate the ability to apply theory to problems in class assignments and exams. However, should students be required to provide a theoretical rationale for the questions they ask in an EdD dissertation or for the approach chosen to study the problem? Dr. Berliner's perspective and Chapters 4 and 5 of the above alternative quantitative methodology text for EdD programs suggest that theory and theoretical justifications should not be required as a basis for EdD dissertation work, though students should have the option of using them as a basis. Rather, the focus should be on asking good questions, i.e., ones that have clear empirical evidence of their importance and/or that have not previously been asked in an adequate fashion. Methodology and approach should be demonstrably unique, unless you are trying to replicate or extend the results from a prior study.

The URL for this autobiography is 

Friday, May 6, 2016

Are the Recommendations of the What Works Clearinghouse Useful/Reliable for Practitioners?

In 2002 the US Congress established the What Works Clearinghouse in the US Department of Education’s Institute of Education Sciences. The What Works Clearinghouse’s role is to set rigorous quantitative standards based on the best science and to provide guidance for practitioners as to what works by assessing the quality of the research evidence supporting a given intervention. The What Works Clearinghouse then issues a rating on whether or not the evidence behind an intervention meets its standard of evidence. The What Works Clearinghouse performs the same function in education that the Food and Drug Administration (FDA) does in medicine.

Can leaders trust the recommendations of the What Works Clearinghouse? Are they valid?  

The above text on quantitative analysis raises questions about the quality of the WWC's reviews (see Chapter 3). Specifically the text criticizes WWC for validating the research behind 'Success for All' when there was lots of published contrary evidence, and for concluding that 'charter schools' are more effective than 'traditional public schools' despite the tiny Effect Sizes (ESs). Are these problems anomalies—or are there deep seated problems with WWC's recommendations?

An amazing recent article suggests that the problems with the recommendations of the WWC are deep seated. Ginsburg and Smith (2016) examined the evidence for all the math programs certified by the What Works Clearinghouse as having evidence of effectiveness provided by the most rigorous “gold standard” research design, Randomized Controlled Trials (RCTs). They reviewed all 18 math programs that had been certified by the WWC, which contained 27 approved RCT studies. They found 12 potential threats to the usefulness of these studies, and concluded that “…none of the RCT’s provides useful information for consumers wishing to make informed judgments about what mathematics curriculum to purchase.”

One of the key problems across the What Works Clearinghouse math studies was that where Ginsburg and Smith (2016) were able to determine the error effects of a threat, the error generated by even one of those threats was at least as great as the Effect Size favoring the treatment group. In other words, the 'gold standard' of research is not so golden.


  • Leaders cannot trust the recommendations of the What Works Clearinghouse (WWC). Leaders need to conduct their own due diligence using the techniques in the book, both on research in general and on the recommendations of the WWC.
  • The fact that the WWC requires the most rigorous research methodology and statistical evidence, and there are still a dozen potential "threats" to the results, suggests that the real world of practice is too complex for the traditional experimental approach and its reliance on relative measures of performance. In other words, it is virtually impossible to establish full internal validity in applied experimental research regardless of how rigorous the research standards are. (Simpler alternatives for assessing the effectiveness of interventions are discussed in Chapter 5 of my methodology book.)

Ginsburg, A., & Smith, M.S., (2016). Do randomized control trials meet the “Gold Standard”? A study of the usefulness of RCTs in the What Works Clearinghouse. The URL to this article is:  

Monday, May 2, 2016

Can Theory Emerge From Practice? Does Scientific Discovery Only Occur at a Young Age?

The book Authentic Quantitative Analysis for Leadership Decision-Making starts with discussions about the nature of science and then uses them as the basis for discussing the relationship between theory and application in science in Chapters 1 and 2. The book's discussions of the relationship between theory and practice are unique and counter-intuitive compared with how the topic is usually discussed in EdD programs. Some of the innovative points in the discussion are:

  • the importance of personal theories of action along with academic theories, and 
  • the fact that applying theory to practice does not necessarily mean that the practice will be more effective than one that is not based on academic theory.

Chapter 4 goes a step further and argues that while the importance of theory is universally discussed as a key to improving practice, this ignores the fact that important academic theory can emerge from a successful practice that was based on an alternative mode of scientific discovery; e.g., accident or metaphor.

This is why I was glad to see a recent article about Ivo Babuška, a 90-year-old mathematician. This article noted that while applications such as cryptography have emerged from highly abstract number theory, "Conversely, many elegant and aesthetically pleasing mathematical theories have emerged from the most utilitarian applications." In other words, even abstract mathematical theory can emerge from the experience of successful practice.

The bottom line is that EdD programs should encourage students to explore innovative approaches to improving practice even if there is no theoretical justification.

And, by the way, one other takeaway from this article for us older faculty and students in EdD programs is that the most important work of this mathematician occurred when he was 70. This contradicts the often cited notion that science and math are a young "Man's" game and that folks are over the hill by 30 or 40. In other words, faculty and students over the age of 40, in any field, can still make important discoveries. Science is also finally acknowledging and honoring the important contributions of women. So let's embrace as a positive the fact that EdD programs tend to work with more experienced, older students than PhD programs, and also view as a plus the highly diverse nature of our students.

Saturday, April 30, 2016

Happenings at the 2016 AERA Conference—Part II

Having a chance to meet and talk with EdD faculty from around the country is a wonderful opportunity to listen and learn. Several mentioned that they taught quantitative methods to both PhD and EdD students at the same time. Given that my book, Authentic Quantitative Analysis for Leadership Decision-Making, is geared specifically for EdD students, I suggested that they differentiate the course somewhat. They could use this book for the EdD students, and use Chapters 1-4 and the chapter on the literature review with the PhD students, while also having a more traditional text specifically for the PhD students that delves deeper into methodology and internal validity. So some of the reading assignments could overlap and others could be differentiated.

What most impressed the folks I met about the methods in the book was the simple beauty of being able to teach EdD students to critique the most sophisticated quantitative experimental research by looking for just three numbers in order to determine the practical significance of the findings.

The most negative comment I encountered came from one person who, after looking at the table of contents, was insulted that the book covered her entire course in just 18 pages. I noted that the book might give her some ideas of additional things she could incorporate into her course. However, she looked at me as though I was an idiot and walked off in a huff. Fortunately, that was the exception. Everyone else was very open to the notion that there was a need to reform how quantitative methods were taught, and that there was now a resource that would provide them with the means and credibility to do so. Furthermore, everyone understood that once students understood how to critique quantitative research, all faculty could integrate more such research into their courses.

A final suggestion is that all faculty should encourage their students who attend AERA or any professional conference to spend some time in the exhibitor area. It is a place where they can see the latest publications from many publishers and meet with editors to discuss publication ideas.

Wednesday, April 27, 2016

Two New Case Studies for Chapter 3

Here are two additional cases that can be incorporated as assignments after covering Chapter 3—with suggested solutions:

CASE STUDY #1 From the Headlines:

On April 27, 2016, the New York Times described the latest results from the 2015 National Assessment of Educational Progress (NAEP) as follows:

"The results, from the National Assessment of Educational Progress, showed a drop in the percentage of students in private and public schools who are considered prepared for college-level work in reading and math. In 2013, the last time the test was given, 39 percent of students were estimated to be ready in math and 38 percent in reading; in 2015, 37 percent were judged prepared in each subject.

“This trend of stagnating scores is worrisome,” said Terry Mazany, the chairman of the governing board for the test... 

The math tests are scored from zero to 300, and in 12th grade, the average dropped to 152 in 2015 from 153 in 2013, a statistically significant decline." (Zernike, 2016)

How do you interpret these results? Do they have practical significance? Should we be bemoaning a decline in student performance?

SOLUTION: These results have no practical importance for policy or practice. This is a classic case where a big deal is made of small differences simply because a large sample caused them to be statistically significant. At the same time, one must be vigilant to see if there is additional decline on the next NAEP test.
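The role of sample size in this solution can be illustrated with a rough calculation. The two averages below come from the quoted article, but the standard deviation and the number of students tested per year are assumptions chosen only for illustration. The sketch also uses a simple two-sample z-test, whereas NAEP's actual significance testing accounts for its complex sampling design.

```python
from math import sqrt
from statistics import NormalDist

# Averages reported in the quoted New York Times article (0-300 scale).
mean_2013 = 153.0
mean_2015 = 152.0

# ASSUMPTIONS for illustration only: neither value comes from the article.
assumed_sd = 34.0            # hypothetical standard deviation of scores
assumed_n_per_year = 13_000  # hypothetical number of students tested each year

difference = mean_2013 - mean_2015

# Effect size: the difference expressed in standard deviation units.
effect_size = difference / assumed_sd

# Simple two-sample z-test for the difference between the two means.
standard_error = assumed_sd * sqrt(2 / assumed_n_per_year)
z = difference / standard_error
p_value = 2 * (1 - NormalDist().cdf(z))

print(f"Effect size: {effect_size:.3f}")
print(f"z = {z:.2f}, two-sided p = {p_value:.3f}")
```

Under these assumptions the one-point drop is statistically significant at the conventional .05 level even though the effect size is only about three hundredths of a standard deviation.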

CASE STUDY #2 The Effectiveness of Cognitive Tutor

Is the Cognitive Tutor curriculum effective for improving student performance in Algebra I in both middle and high schools? Clearly, passing Algebra is a major hurdle for many students and is a gateway to developing the higher-level math and science skills that are of increasing importance. As a result, increasing the passing rates in Algebra is a national priority and is critical to increasing equity. This curriculum combines traditional types of materials with the latest technology, an individualized intelligent math tutor.

In order to assess the practical significance of this intervention for your school(s), consider the following research:

Pane, J. F., Griffin, B. A., McCaffrey, D. F., and Karam, R. (2014). Effectiveness of cognitive tutor algebra I at scale. Educational Evaluation and Policy Analysis, 36(2), 127-144.

Step 1. Download the article either from your university's library or from site.

Step 2. Read the entire article but ignore all technical terms that you do not understand.

Step 3. Critique the study by:

a. Listing the positives of the study’s methodology.
b. Identifying the key missing piece(s) of data in the results provided.
c. Assessing the results provided and whether you found them sufficiently compelling to consider adopting the program for your schools, and explaining why in terms of the specifics of the results.


a. Positives:

A major positive of this study is that it has a high level of transparency and reflectiveness. The researchers clearly present the data in an impartial and open manner. They point out negative results for the intervention. There are also places where they point out that a conclusion of theirs can be interpreted differently. For example, in the first column on page 40, in equating an ES of .2 to national expected growth, they note that the two are comparable, but they also note that their ES did not include summer loss, which the expected national growth did.
A second positive is that no members of the sample were dropped even when there was not all the information needed to perfectly match them. In addition, the sample seems to be large and diverse enough to make the study relevant.
A third positive is how they reported the outcome data. Like the researchers in the previous case study, they adjusted final scores via Analysis of Covariance. However, they clearly make such adjustments more carefully, openly discuss the problems associated with such adjustments, and use advanced techniques and three different methods of adjustment to minimize potential problems. Tables 4 and 5 show that there is no difference in outcomes between the three different levels of adjustment (Models 2, 3, and 4).
A fourth positive is that they analyze the results separately for middle and high school. As a result, it is clear that there is no rationale for middle schools to adopt this intervention. The rest of this analysis will therefore focus just on the high school results.
A fifth positive is that they conducted the study over two cohorts of students so that teachers were able to gain a year of experience, and indeed the high school results were better for Cohort II.
A sixth positive is that the researchers report both the unweighted results, i.e., the “no covariate” Model 1 in Table 4, and the fully weighted Model 4. They acknowledge on page 139 (Column 2) that the treatment effects for the unweighted results were “substantially lower than for the models with adjustments.” One minor criticism is that it would have been better for the researchers to simply state that “there was no significant treatment effect for the unweighted results.”
As an aside, one interesting thing to note is that the researchers thought that the weighted high school results improved in Cohort II because the teachers reverted back to traditional approaches to teaching.
b.  Problem:
There is one big problem. There is no data on the actual post-test performance of the experimental and comparison groups. Everything is once again a description of the relative differences. We do not know how well the students did in Algebra or on any math test. How many passed? How many moved on to the next level of math? Why are we not told how the students actually did on the Algebra Proficiency Exam in a non-standardized form? Why not tell us how many of the 32 items on the post-test each group got correct?
Instead, they try to make the ES of .2 seem important by:
  • First, indicating that it is equivalent to students moving from the 50th to the 58th percentile. But that does not mean that anyone scored at the 58th percentile. Did they move from the 10th to the 18th? Did the lower performing students make any equivalent percentile gains?
  • Second, comparing this ES to typical achievement gains nationally in the affected grades. Not only was the ES reported in this study slightly smaller, it also did not include summer loss, so it is even smaller than what is typically achieved nationally.
In any event, this study continues the tradition of trying to make an ES at the smallest end of Cohen’s range seem more important than it actually is.
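For readers wondering where a figure like "50th to 58th percentile" comes from, the following sketch shows the usual conversion. It assumes that scores are normally distributed and that the intervention shifts every student by the same ES, which are simplifying assumptions of this illustration and not claims made in the study. The function name is hypothetical.

```python
from statistics import NormalDist


def percentile_after_shift(start_percentile, effect_size):
    """Percentile reached by a student who starts at start_percentile
    and gains effect_size standard deviations, assuming normal scores."""
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    start_z = standard_normal.inv_cdf(start_percentile / 100)
    return 100 * standard_normal.cdf(start_z + effect_size)


for start in (10, 25, 50):
    end = percentile_after_shift(start, 0.2)
    print(f"{start}th percentile -> about the {end:.0f}th percentile")
```

Under these assumptions a student at the 50th percentile would move to about the 58th, while a student starting at the 10th percentile would gain fewer percentile points. Either way, the conversion describes a relative shift and reveals nothing about how many items students actually answered correctly.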
c. Should you adopt the program?
Given that:
  • the program is expensive,
  • the intervention was not compared to traditional individualized drill and practice software,
  • the progress was less than what students typically achieve nationally,
  • the low ES does not meet recommended minimums for potential practical significance, and
  • there are no absolute data on how the students actually did,
I would not recommend that anyone adopt this program based on this research.

Sunday, April 17, 2016

Happenings at the 2016 AERA Conference

I just returned from the 2016 AERA (American Educational Research Association) conference in Washington DC. I set up a booth in the exhibitor area to highlight the book (Authentic Quantitative Analysis for Leadership Decision-Making). In a bit of irony, I was across from the Harvard Education Press booth, and they probably had 100 books on display, while my booth had only the one. At the same time, mine was the only booth dedicated to the EdD, and I was pleasantly surprised at the large number of people who stopped by.

I must have had conversations with individuals from 60-70 EdD programs from around the country. These conversations confirmed that many are concerned about the state of how quantitative research is taught. There is a sense that something is wrong as students are increasingly turning to qualitative research. There is nothing wrong with qualitative research if it is being used for the right reason. It is a problem, however, when students choose it because they feel that quantitative methods are too difficult to master, or because they do not see the relevance of the traditional complex quantitative methods for their practice.

There was a tremendous response to the ideas in the book, and many were drawn to moving quantitative methods from being a course on statistics to one that focuses on leadership decision-making. It is also becoming clearer that the forms of statistical analysis used in PhD programs to test theory and those used to inform leadership decision-making are different. It is not that the statistics are different, but the degree of statistical methodological control and the criteria for interpreting the results are different. The big problem for practice is that the forms of statistical analysis typically found in published quantitative research tend to overestimate the importance of the findings for improving practice in the real world. In other words, the methods used in published research on the effectiveness of the practices being tested are overly complex and unintelligible, and in the end the results are misleading.

When I talk to professors who specialize in policy and practice, but who are not methodologists, about these problems, the typical comment is that they do not get involved with, or understand, quantitative research. The result is that we as a profession have abdicated responsibility for making important decisions about what is effective and left that to statisticians. The statisticians/methodologists have developed powerful techniques that enable them to declare small differences as having practical importance. But the reality is that these do not have actual real-world importance (see the earlier post about the problems that small differences are causing in psychology).

That is why the focus of the book is on much simpler and more accurate ways to determine the practical importance of quantitative research evidence. This is important not only for making better decisions about which practices are likely to be effective in your settings, but also for our profession to retake responsibility for interpreting evidence about effective practices. This book is a critical tool for enabling those of us who best understand the dynamics within schools and the needs and abilities of students and teachers to stop being cowed by quantitative evidence and to critically embrace the valuable information contained within quantitative data.