Greetings and welcome to Volume III of our EdHistory 101 Project, our periodic revisiting of key historical events, personalities and perspectives that continue to shape our ideas of school. Knowing our nation’s educational history—the forces, beliefs, and language of other times—presents opportunities for us to reconsider the roles we need school to play for the social and intellectual well-being of our young people. It’s a way for us to assess the prevailing values, to re-frame our work and, with persistence, to re-imagine the systems, culture, the practices that cement the grammar of schooling.

In this edition Wayne Ogden looks back at the teacher evaluation mania of a decade ago, wonders what’s changed, and offers some wisdom we’ve been slow to take advantage of.

I’ve been moved to republish the core of an article I wrote some time ago for two main reasons. The first is that there is now a large, new generation of school administrators and teachers for whom teacher evaluation is a highly challenging craft endeavor. The second is that, as a profession and as policy makers, we have yet to deal with the long-standing dilemmas within this broad set of issues. In a recent survey, network principals in one New England state told us that, on a 1-to-10 scale, the effort, time and mental energy required to comply with teacher evaluation requirements rated between 7 and 8, while its rewards in terms of actually improving practice and building culture were a 4 at best.


So, for that newer generation of folks observing teaching and doing their best to support good practices, and for the policy people who think about these issues, here is an EdHistory Project update of my earlier piece. I hope it’s helpful. I’ve included my ten recommendations to help get it right. Read on.

—Wayne

Where did the furor over teacher evaluation go? In 2012 it became the behemoth that overtook American schools. Now? It’s hardly on the radar screen. What happened?

To understand the trajectory, it’s important to recall that way back in 2001 Congress passed the “No Child Left Behind Act” with great fanfare and out of some desperation. Politicians and policy makers believed that something radical had to be done to improve our schools and rescue our economy from the ravages of a poorly educated workforce. When President Bush signed that act into law in early 2002 it was the most sweeping educational reform initiative since the days of President Lyndon Johnson. That law had bi-partisan support, promising radical changes in student performance and school accountability.

But a decade later, at the mid-point of the first Obama administration, when students’ performance on standardized tests showed no substantial gain, our public schools again appeared resistant to improvement. Policy makers searched for the causes, and teacher quality, understandably, came quickly to light. Studies from the New Teacher Project, among others, revealed that school districts were habitually rating more than 90 percent of their teachers as “satisfactory”. Most districts also routinely failed to factor in performance in any way when making decisions about promotion, professional learning, pay scales, or termination.

In think tanks, and hence in Washington, the party line became, “it must be the teacher’s fault”, and its cousin, “why aren’t principals evaluating out those bad teachers?” The single-minded emphasis on testing as an achievement strategy that began with NCLB was supplemented by a new fixation on teacher evaluation and administrative management. Those of us in the business at that time (teachers, building and district administrators, and trainers) remember the flurry well.

The Obama administration’s economic stimulus bill passed by Congress included roughly $115 billion in education spending, of which $4.35 billion was allotted for competitive state grants through a program called Race to the Top. This program encouraged states and districts to revamp their teacher and principal evaluation policies and to begin to use evaluation results in making personnel decisions. States and participating districts were to begin evaluating teachers and principals using multiple measures, including an emphasis on “student growth”. That term was, and remains, defined to mean a change in student achievement as measured on statewide assessments and other measures that were “comparable across schools and classrooms.” Under Education Secretary Arne Duncan, those requirements and definitions were codified in subsequent grant competitions, beginning in 2011.

State legislators and education policymakers across the nation quickly proposed dramatic changes in areas that had long been viewed as immaterial. An article from Education Next reported that the number of states requiring objective measures of student achievement to be included in teacher evaluations nearly tripled from 2009 to 2015, and the number of states requiring districts to consider teacher evaluations in tenure decisions grew from 0 to 23 over that same period.


Strictly tying teacher evaluation, student performance, and merit pay provided an appealing formula for success. Many of our nation’s most prominent national and regional funders and businesspeople jumped on the bandwagon of a compelling mental model. Substantial private money followed public funds in support of these notions. Huge amounts were spent on the development of “new and improved evaluation programs” and hundreds of millions more on professional development to train educators on how these new teacher accountability systems were going to work.

Virtually all other professional development activities ground to a halt for two years, perhaps more in some areas, as training of administrators and teachers to implement new evaluation instruments and management techniques swamped all other needs and plans. It became the singular focus of states, districts, schools and the associated educational professionals for three years. Finally our public schools were going to produce higher performing students.

So, what did all of this spending and fixation on accountability accomplish? As the studies and evidence have rolled in, to this day, not a great deal. But it’s important to begin by taking a broader look.

School districts have struggled with teacher evaluation for years, showing limited progress in developing systems that actually result in instructional improvement and increased student learning. In a historical pattern, state departments of education regularly developed new models of teacher evaluation only to replace or embellish them every few years. In some states, teacher unions collaborated with their departments of education on new evaluation models only to retreat from them within a few years as they actually went into full-scale implementation.

The way evaluations have played out in far too many schools and districts is that an evaluator, whose real expertise has become the area of administrative management, data gathering and reporting, budget development, family engagement or student affairs, announces a date at which time she/he will come to observe a teacher at work with the class, usually for a good chunk of a class, sometimes a whole “period”. Depending on the contractual/labor agreements, the teacher may or may not have submitted a plan of what would occur in the class. This “formal observation” is usually supplemented by a few more “informal”, i.e. un-announced visits, that are shorter in duration, and which may or may not provide the opportunity for more data gathering on the teacher’s performance.

What it can too often feel like at the school level is inauthentic and inadequate at best, and bad opera at its worst: a pretend panorama of the teacher’s daily routines and practices, mired in forms, dates for discussions about the findings, replies, claims, and counter-claims, within which the essence of teacher performance, and support thereof, diminishes substantially as the days go by. From the teacher’s standpoint, the process too often provides little insight into the real dilemmas of teaching, and little by way of getting at the nuanced formulas for sustaining energetic teaching and learning in demanding settings, made even more stressful by the layers of testing and test prep.


Part of the teacher quality conversation was computer billionaire and amateur education wonk Bill Gates, who was driven to try his hand at influencing the teacher evaluation game. If at the time you had visited the Measures of Effective Teaching (MET) project, the Gates Foundation’s newly established thought center for work in this arena, you would have found that teacher quality was defined in terms of the ability of a teacher to “produce gains in student achievement”. The project, wishing to be viewed as expansive in its thinking about the metrics involved, posed different ways to measure this ability. Along with direct measurement (the scores), it alluded to a host of other indicators: observations, surveys, the occasional portfolio. But the core idea of teacher “effectiveness” remained defined as student achievement, and student achievement was and remains some sort of test performance.

Three years after an expenditure of more than $180 million (in 2012 dollars) in Gates Foundation and other funds, and an estimated $50 million per year to implement and maintain the initiative, The Washington Post headlined, “Another Gates-funded education reform project, starting with mountains of cash and sky-high promises, is crashing to Earth”. That project, in the Hillsborough, Florida, public schools, was one of many throughout our country that focused on teacher evaluation as the path to improved student performance and better schools. (https://www.washingtonpost.com/news/answer-sheet/wp/2015/11/03/bill-gates-spent-a-fortune-to-build-it-now-a-florida-school-system-is-getting-rid-of-it/)

I don’t want to demean philanthropy in education. Hardly. We need all the help we can get, if it’s spent in the right places and in the right ways. I want to highlight that when a government agency, a funder, or other well-intentioned people make what seem to be knee-jerk, simplistic reactions to highly complicated problems, such as addressing the factors that will improve student learning, those initiatives are likely to fail. I’m reminded of a great quote from educator Dean Shareski: “I’m not anti-education policy. I’m anti-simple.” These are incredibly complicated issues with many flavors of history, economics, culture, and politics. As the old but elusive saying goes, complicated problems demand complicated solutions.

At least two of our nation’s largest school systems (Los Angeles and New York) and the media that served those metropolitan areas thought that publishing a ranking of teachers by their composite evaluation scores, politely called teacher data reports in NY, might do the trick (in the form of public humiliation). As scholar and researcher Linda Darling-Hammond noted at the time in a Phi Delta Kappan article, (link here) “a teacher’s effectiveness is determined by numerous school and non-school factors that a ‘value-added’ analysis typically doesn’t or can’t take into account. These might include variables such as the impact of peer culture, students’ prior teachers and schools, summer learning loss, access to tutors, and even the nature of the tests used to measure achievement.”

Those initiatives that were spawned under Race to the Top focused on the development of complex educator evaluation systems, relying on explicitly-written teacher performance rubrics. These rubrics employ intricately-crafted instructional elements describing all possible teaching behaviors arrayed on a rating grid, similar to the old-fashioned teacher checklist evaluations, now on steroids. Whereas older checklists simply referred to broad categories of instructional competence (e.g. classroom management, questioning techniques, use of higher order thinking skills, etc.) the new rubrics were intended to be both comprehensive and explicit in determining the “evidence” that would suggest a particular type of instructional strategy is “exemplary”, “proficient”, “needs improvement” or “unsatisfactory”. Such rubrics are available on the websites of almost every state’s public education department.

That type of educator evaluation system alleges that it draws upon the “science of teaching” to help inform and modernize efforts around instructional improvement. However, of significant note, one often-used rubric model was originally designed (see Charlotte Danielson) to foster the professional development of teachers and instructional improvement in a collegial environment, not to “judge” or “rate” a teacher as “competent” or “incompetent” for the purposes of hiring, firing, retention or promotion. And as my ERC colleague Larry Myatt often says, the effectiveness of what teachers do in the classroom can more easily be understood and examined if we watch the students, not the minutiae of teacher behavior. The idea of teachers as the sole and dominant actors in any classroom setting is part of a century-old model, a paradigm hyper-focused on the culture of teaching. What we still need now, as then, is to understand the creation of a culture of learning and how the teacher’s role must change in a departure from the industrial values of content delivery.

So, back to my initial question: why is getting teacher evaluation right still so difficult and elusive? Why have so many initiatives, tools, and procedures failed to achieve their desired result? I believe that it’s because we begin the conversations with the wrong question. If we start with questions about judging teachers and their teaching, we’re already on an impossible path. We need to re-frame, and to ask ourselves what it is that will lead us on the path to the best teacher growth and improvement. And how does that contribute to the best kinds of student learning? What conditions and practices will result in the instructional excellence that we all desire for our kids?


Here’s that same list of ten things I thought we should give a sustained try, and I still think they deserve a shot, in no particular order:

1: School districts must support and engage administrators to differentiate instructional leadership from operational functions. Principals who spend their time on bus schedules, budgets and bullies will never have adequate time to devote to developing and supporting master teachers around issues of instructional excellence. Districts can and should hire sufficient staff that doesn’t need advanced degrees, licensure, and big salaries to perform routine, non-academic tasks.

2: Beginning teachers should be apprenticed for a full school year to a master teacher in a co-teaching scenario (not as an understudy) for at least half of each school day. This “fellowship” would also include collaborating with other teachers in the school and district, looking at student and teacher work, and discussing how best to address typical challenges and dilemmas of teaching. The other half of the day could be in full teaching service.

3: Fairly compensate master teachers at a somewhat higher level than their colleagues, but do not select them on the basis of seniority. Select them instead through a thoughtful process involving administrative, peer and student feedback.

4: Master teachers should become part of the school’s leadership team, involved in problem-solving, culture building and setting the institution’s course.

5: Schools must also judiciously hire corollary staff to free teachers from operational and “custodial” duties that distract them from their essential function as architects of learning, their students and their own.

6: Summers should include at least two weeks of pertinent, differentiated professional growth activities for teachers and districts. These high-quality collaborative growth activities must help to put to rest the one-size-fits-all teacher “PD” that changes focus and content each year based on the flimsy trends we’ve named above, often emanating from what the “district” thinks all teachers need.

7: A team of school leaders and administrators should be regularly observing teachers in all phases of their work (classrooms, student conferences, team meetings, etc.). No system of professional growth can work when someone is observed and a conversation convened only a few times per year. Administrators should engage with teachers in content areas to determine the kinds of routines, support, critique and provocation that are needed this year in this school.

8: Require and assist teachers in soliciting and using student and parent feedback on a regular basis, as part of a broader conversation about how learning is taking place. The practice of gathering and using such significant qualitative data remains almost wholly absent in most districts.

9: Take a page from the world of art, design and architecture and develop faculty cultures where critique and feedback are positive, frequent and earnestly solicited, not seen as negative and unwelcome. Video recording and analysis of teaching and learning provides the perfect medium to discuss the subtleties of the craft, the instructional leader’s real work.

10: Provide every first and second-year school leader with an external coach. See my coaching new leaders (link here)

Supporting teachers, which includes evaluating and discussing their work, is an essential and powerful element of a great schoolhouse. But it’s going to take an investment in helping schools become alive and responsive again, and policy-makers need to get out of the way if they can’t do better than the past twenty years. Rather than busying principals with getting rid of bad teachers, re-framing the issue as the need to support learning and those responsible for learning is critical, as is involving all parties in a conversation about the structural and organizational changes we need to make in our schools on behalf of children and families.


Wayne Ogden is co-founder of ERC, a former teacher, high school principal, and superintendent. He specializes in coaching school leaders and was a co-author of The Skillful Leader, a handbook for administrators in supporting improved teaching.

We’re hoping you’ll join our EdHistory 101 Project reading group, or form your own. We invite you to copy the link text to read and discuss it with colleagues, and to make time to refresh our role as the intellectual hubs of our communities. Watch out, minds at work!

Prior EdHistory Project 101 Volumes:

http://www.educationresourcesconsortium.org/news/2017/2/18/ellwood-cubberley-1868-1941

http://www.educationresourcesconsortium.org/news/2018/12/8/volume-2-erc-edhistory-101-project