INTRODUCTION

The 2022 National Assessment of Educational Progress reveals alarming results: Since 2019, scores declined substantially for all students, while disparities widened for students already most affected by opportunity gaps.582 As districts and states help students recover in the wake of a global pandemic, supporting teachers and principals by recognizing strong performance, and helping them grow and improve where necessary, is more urgent than ever.

Strong teacher and principal evaluation systems have the potential to help teachers and principals improve their practice, to exit teachers who are perennially ineffective, to retain teachers who are effective and learn from them, and to increase the overall quality of a district's teacher workforce.583

As states respond to widespread concerns (both real and perceived) about teacher shortages and resignations, evaluation systems also have a role to play: Schools need access to fair, valid evaluation systems to help identify and retain highly effective teachers and principals, as well as to support those who are struggling.

As is true of all policies, implementation matters. A recent working paper from the Annenberg Institute generated disappointment and reflection after researchers found that, on the whole, states' changes to evaluation systems have not yielded the student outcomes they had hoped for.584 These findings stood in stark contrast to the well-documented success that systems like those in Dallas, Denver, the District of Columbia, Chicago, and the state of Tennessee have had building strong evaluation systems that directly contributed to improved student learning and higher teacher quality.585

But a closer look at Annenberg's research gives reason for optimism. Researchers found bright spots across the country (obscured by the larger trend) where exemplary evaluation systems made a significant impact on student achievement. These exemplary evaluation systems had a number of evidence-based practices in common, including the use of multiple measures to evaluate teachers' effectiveness (particularly student growth and student surveys), meaningful differentiation between teachers, regular and sustained opportunities for observation and feedback, guaranteed written feedback, and alignment with professional learning.586 These findings add to the evidence that evaluation, done well, can make a difference for students and educators.587

Given the importance of state policies to set the conditions for successful evaluation systems, NCTQ has regularly collected data, starting in 2011, to chart states' progress in adopting evidence-based evaluation practices. In this report, we analyze statewide policies for teacher and principal evaluations in the 50 states and the District of Columbia, using data collected in fall of 2021 and verified by states in early 2022, in order to answer the following questions:

  • What role does the state play in teacher and principal evaluation design?
  • What components are included in a teacher or principal's evaluation?
  • When, where, how, and by whom are evaluations conducted?
  • Are evaluations used for support and improvement?
 
FINDINGS
 
 



 
Since our last analysis in 2019, states have largely retreated or stalled in adopting evidence-based teacher and principal evaluation policies that support student learning. While state evaluation systems did experience disruptions throughout the pandemic, this pattern follows a trend that began as early as 2016. Since then, states have continued to move away from including measures of student academic growth as part of evaluations, and several have dropped the use of student surveys as well.

While several states have made progress in adopting effective practices like annual observations, most still have significant room to improve when it comes to requiring practices that support teachers' growth and development, such as annual feedback for all teachers. Even as some states have lowered standards for entry into the profession, too many still do not require the basic support structures necessary to help new teachers improve, such as additional observation and feedback that begin early in the year.

States have also lost ground or failed to make progress in measuring meaningful outcomes for principals, continuing a trend away from factoring student academic growth and survey results into principal evaluations.
 
 
SECTION 1

Teacher evaluation

 
Only 10 states require a teacher evaluation system that is the same statewide. Fourteen states allow districts to opt in or out of their statewide teacher evaluation system, and in 27 states districts design their own evaluation systems based on criteria laid out by the state. While there are a number of reasons that a state might allow an evaluation system that is not uniform across all districts, without at least some common elements, consistency and comparability are limited.

Figure 1.

What role does the state play in teacher evaluation design?




What components are included in a teacher's evaluation?

 
 
Research has shown that it takes multiple sources of information to provide a fair and accurate understanding of a teacher's performance, and that evaluations based on multiple measures are more likely to be reliable and predictive.588 In 2018, NCTQ studied four large school districts and two states where evaluation had led to meaningful improvements in teacher quality. All six evaluation systems measured different facets of teachers' effectiveness from varied perspectives (e.g., student growth as measured by state assessments, observations by school administrators, and student surveys).589

Common elements of an evaluation using multiple measures might include formal observations; measures of students' academic growth, including on state assessments; and student survey data. Of these elements, we find that states are generally far more reliant on observations, and have significantly decreased the use of any other sources of evidence, particularly those tied to quantitative measures of student learning.

Observations

Observations (particularly when they are based on a clearly defined rubric) provide a rich source of information about multiple aspects of a teacher's skills and impact on students, and are a useful starting point for providing actionable, specific, and relevant feedback.590 Recent survey data suggests that most teachers find observations helpful in improving their instructional practice.591 Seven states (the District of Columbia, Missouri, Montana, Nebraska, New Hampshire, North Dakota, and Vermont) currently do not require observations as part of a teacher's evaluation. Twenty-two states require observations, but do not specify the percentage of a teacher's evaluation made up by observations. Of the 22 states that do specify a percentage, just under one half specify that observations make up the bulk (75% or more) of an educator's evaluation score.

While observations are a critical factor in any feedback cycle, there are well-documented limitations to their usefulness and reliability in understanding teacher performance, including patterns of bias592—all the more reason for states to include multiple measures, carefully weighing a range of evidence to provide feedback and evaluate performance.

Figure 2.

What percentage do observations account for in a teacher's overall evaluation score?

Note: Although Delaware does require observations, the state is currently transitioning key evaluation system policies and has not announced the percentage that they will account for in a teacher's evaluation score.

Measures of student growth

As part of an effective evaluation system, observations should be considered together with measures of student academic growth, which might include measures like student learning objectives (SLOs), district assessments, statewide assessments, or other shared measures.

Helping students to grow academically is core to a teacher's role593 and should be a component of any evaluation. Evaluations are also more likely to be valid measures of a teacher's performance when quantitative measures of student learning are combined with qualitative measures like observations.595

States have continued to lose ground on including measures of student growth in evaluations. Between 2019 and 2022, four states—Indiana, Mississippi,686 North Dakota, and Oregon—dropped requirements for including objective measures of student growth in teachers' evaluations. Of the 30 states that use measures of student growth, 19 specify the percentage of a teacher's evaluation that growth should comprise, ranging from 10% to 50%.

Figure 3.

Are measures of student growth required as part of a teacher's evaluation score?


 
Figure 4.

How much of a teacher's evaluation score comes from measures of student growth?

Note: Delaware is not included in this figure, as the state is currently transitioning key evaluation system policies.

Figure 5.

How many states' teacher evaluation systems require measures of student growth?

Note: These figures include states with explicit requirements in policy for student growth, regardless of the status of implementation.


State assessments to measure student learning

Between 2019 and 2022, the number of states where statewide assessments are required or explicitly allowed in evaluations decreased from 27 to 23. Alabama, New York, and Virginia added state tests as required or explicitly allowed measures, while Arizona, Delaware, Indiana, North Dakota, Oregon, South Carolina, and West Virginia dropped state assessments as a required or explicit measure of student growth. While pandemic disruptions may have prompted or accelerated a move away from use of state assessment data in at least a few states, many had already announced these changes by the early winter of 2020.599

Pandemic Disruptions to State Assessments
All 50 states and the District of Columbia were granted waivers from the U.S. Department of Education to forgo statewide assessments in the spring of 2020 in response to the onset of the pandemic. This had a significant impact on state teacher evaluation systems that include statewide assessments of student learning. As a result, many districts and states paused evaluations altogether or excluded state tests from teachers' evaluations.597 Those changes continued into the 2020-2021 school year, with disruptions still affecting statewide assessments in spring of 2021.598

Beyond these fluctuations, a longer-term trend is also clear: States continue to back away from using measures of student growth and using valid and reliable assessments of student learning in evaluation. Between 2015 and 2022, 14 states dropped requirements or allowances for the use of statewide assessments in evaluation. Without shared quantitative measures, it is more challenging for states to accurately assess the equitable distribution of effective teachers across districts and student populations. Moreover, a lack of shared measures also means that educators statewide are not held to the same expectations for student learning.

Figure 6.

Do states explicitly allow or require data from state standardized tests in teacher evaluations?



Student surveys

Another common component of effective teacher evaluation systems is the student survey, which gives students a chance to provide feedback on their teachers' classroom climate and instructional skills. Research shows that student survey ratings are positively correlated with learning gains, and that they are an accurate and consistent measure of teacher quality.600 Despite this, states have lost some ground in the use of student surveys: Three fewer states require or explicitly allow the use of student surveys than did in 2019.

Figure 7.

What is the role of student surveys in teacher evaluation?



Evaluation rating categories

An evaluation rating system with three or more categories is important to meaningfully distinguish performance, and evidence shows that binary systems tend to rate nearly all teachers as satisfactory.601 While 37 states use a system that includes three or more rating categories in order to differentiate performance (with the majority, 31, using a four-category system), 14 either use a binary system or do not specify rating categories. Three states still use a five-category system: North Carolina, Oklahoma, and Tennessee.
 


When, where, how, and by whom are evaluations conducted?


 
Regular feedback is a critical element in helping teachers grow their skills and promote positive student outcomes. Annual evaluation for all teachers is a key feature of successful evaluation systems that improve teacher effectiveness and increase student learning.602

Evaluation frequency

Twenty-two states require districts to evaluate all teachers every year, and the majority of states (37) only require that probationary603 teachers receive an evaluation once a year. Only eight states (Alabama, Hawaii, Illinois, Maryland, Massachusetts, Ohio, Rhode Island, and Washington) require that teachers with low performance ratings receive additional evaluations. Two additional states, Texas and Wyoming, explicitly make additional evaluations for low-performing teachers an option in their state policies.

Figure 8.

How many states require all teachers to be evaluated annually?



Figure 9.

Are all non-probationary teachers evaluated annually?



Figure 10.

How frequently are probationary teachers required to receive an evaluation?



Observation frequency

It is widely accepted that opportunities for expert observation, feedback, and practice are important for all teachers, but particularly new teachers. Research suggests that more than one observation is necessary to accurately assess teacher performance as part of an evaluation.604 At least one recent study found that teachers who are observed four or more times per year report a more positive view of their evaluation system, compared to those who are observed less often.605

Observations are also more likely to yield reliable information about a teacher's performance when teachers receive more of them, particularly when they are conducted by more than one observer.606 Yet only 14 states require all teachers be observed multiple times each year; an additional 16 states require multiple observations for early career/probationary teachers only. When it comes to ensuring that new teachers are set up for success from the beginning, only 17 states require that new teachers receive observation and feedback early in the school year. Five states (Iowa, Maryland, New Jersey, New York, and South Carolina) require use of multiple observers, while an additional 15 states allow but do not require their use.

Figure 11.

Do states require teachers to be observed multiple times per year?



Observer qualifications

Only 19 states articulate specific certification requirements for observers, while 38 require some training for evaluators. Statewide policies like these are a lever for ensuring that every teacher is observed by someone who is knowledgeable and well calibrated to the observation protocol or rubric, two elements particularly critical to effective evaluations.607
 

Video and recorded observations

Prior to the onset of the pandemic and subsequent shift to remote instruction, four states (Massachusetts, Michigan, New Jersey, and New York) allowed some form of virtual observation for evaluations.608 Since April 2020, that number has more than tripled, as 10 additional states made changes to allow virtual observations in response to the use of remote instruction609 during the pandemic. Of the states with policies to accommodate virtual observation (our analysis included recorded observations, or live observations of teachers in a virtual/hybrid learning environment), some (Oklahoma, for example) specifically articulate that virtual observation is only to be used in a virtual learning environment. Others, like Massachusetts, Michigan, New Jersey, and New York (all of which enacted this flexibility pre-pandemic), allow for self-recording. New Jersey, building on the success of a pilot program for highly effective teachers, provides flexibility for tenured teachers who have received a "highly effective" rating on their most recent summative evaluation to replace one traditional, announced observation with a number of alternative activities, including videotaping a lesson and providing a reflection on that lesson.620

The flexibility of using technology to gather evidence, intended to support continued feedback and growth during an exceptional circumstance, could have long-term benefits, should districts and states choose to expand the use of self-recording to in-person teaching. In a recent study of 400 teachers, researchers found that when teachers videotaped themselves delivering a lesson and then watched the footage and discussed it with an observer later, they reported more positive feelings about the observation and feedback process, and had a higher retention rate than peers not selected for videotaped observations.621 The flexibility to permit video observations may allow districts to adopt a practice that has better buy-in from educators, and has the potential added benefit of helping manage observers' time, which is often a challenge with in-person observations.
 


Are evaluations used for support and improvement?


 
Evaluations should be connected to timely, specific, and actionable feedback, and give teachers opportunities for growth and chances to demonstrate improvement. As schools take on the urgent work of helping students recover from the pandemic, this is especially important. Yet far too many states still do not require that teachers receive any feedback after an observation, or that evaluations be used to provide targeted growth and support. States also miss critical opportunities to use evaluation data at the state level to drive system-level improvement in how teachers deemed effective are distributed across the state.

Observation and evaluation feedback

Too many states still do not explicitly require feedback to be provided to teachers after an observation: 19 states do not have a statewide policy that requires feedback to teachers in any form (whether written, in person, or otherwise), while two states specifically designate observation feedback as optional.

Further, some states still do not explicitly require feedback to be provided to a teacher as part of their evaluation overall: eight states (Alabama, Alaska, the District of Columbia, Iowa, Minnesota, Montana, New Hampshire, and Vermont) do not require teachers to receive feedback, either written or in person, after an evaluation.

Figure 12.

What feedback do states require after observations?



Connection to professional development opportunities and improvement plans

Research suggests that observations are more likely to positively impact teachers' effectiveness when they are connected directly to professional development opportunities.622 Yet 20 states do not explicitly connect evaluation results to professional development, missing a critical opportunity to require aligned support to help teachers to improve. Further, since 2019, at least three states (Delaware, New Mexico, and Oklahoma) have dropped policies connecting teacher evaluations to improvement plans.

Evaluation data

It is also critical that states collect and publish aggregate data on teacher evaluation. This data is key to understanding the distribution of teacher effectiveness across schools and communities—a pattern that has long been inequitable, resulting in students of color and low-income students consistently having lower access to the most effective teachers.623

As of December 2021, only 13 states had published school-level data on teacher effectiveness. Several states provide notable exceptions: Colorado, for example, publishes data on the distribution of effective teachers at the state, district, and school levels, and analyzes patterns in how effective teachers are distributed based on student demographics.624 Similarly, both Arkansas and Kentucky publish school report cards that include information about teacher effectiveness. These 13 states bring a level of transparency about the teacher evaluation data and teacher performance that could help direct resources and support where they are most needed.

Figure 13.

Do states publish school-level data on teacher performance?

Source: State of the States 2021: State Reporting of Teacher Supply and Demand Data, National Council on Teacher Quality

SECTION 2

Principal evaluation


The research is clear: Strong school leaders create strong schools.625 As states continue to look for ways to combat teacher turnover and help students recover academically, principals are key leaders of this work in their schools, and their evaluations should reflect that.

Principals have an important role to play in school quality, particularly in their support for and management of teachers. Evidence has shown a relationship between principal effectiveness and student academic outcomes:626 A recent meta-analysis estimated that the impact of having an effective principal on student learning was nearly as large as that of having an effective teacher.627

Principals also play a role in teacher recruitment and retention,628 retaining effective teachers and exiting consistently low-performing teachers,629 and shaping teachers' experiences of school climate.630 They also influence students' perception of school climate, student attendance, in-school discipline, and parents' perceptions of the school.631 Given their critical role to both students and teachers, principals must receive meaningful feedback and opportunities for support through comprehensive evaluations. As with teachers, these systems can also serve to identify exemplary principals from whom others can learn, to support those who are struggling, and to ultimately exit leaders who do not improve with time and support.
 


What role does the state play in designing principal evaluations?


 
Fourteen states set all criteria for principal evaluations, while 21 states set minimum criteria for what is included, and 16 states play no role in designing principal evaluations. As is the case for teacher evaluations, some flexibility for districts in designing evaluations may be useful; however, setting no shared standards or measures defining the central elements of a principal's job risks differing expectations, inconsistent attention to the core responsibilities of the job, and highly varied evaluation implementation in different communities across the state.

Figure 14.

Does the state set evaluation criteria for principals?




What makes up a principal's evaluation?

 
 

Objective measures of student growth

While research is clear that principals play a central role in student learning outcomes,632 considering different ways to measure this impact is an evolving matter. Recently, a working paper called into question the extent to which growth in student learning measured by current "value-added" models can be attributed to a principal during that same school year, suggesting growth measures for principals might lag more than one school year.633 This suggests that further study is needed to vet potential adjustments to principal value-added models, but it remains critical that states measure student learning, and new research reinforces that it is vital to use multiple measures of effectiveness to understand principal performance.

Twenty-seven states require measures of student growth in principal evaluation, while 24 do not. These numbers have steadily fallen since 2015, when 43 states required measures of student growth. Since 2019, Indiana, Maine, New Mexico, North Dakota, Oregon, and South Dakota have removed requirements to include measures of student growth in principal evaluation. Interestingly, fewer states require measures of student growth in principals' evaluations (27) than in teachers' evaluations (30).

Figure 15.

Do states require measures of student growth in principal evaluations?



State assessments to measure student learning

Research suggests that principals have a major impact, both direct and indirect, on student achievement.634 Only 10 states factor state assessments into a principal's evaluation score, compared to 12 that require these tests to be reflected in teachers' evaluation scores.

Surveys

Principals play a key role in influencing the overall climate of a school, and at least one study has concluded that a principal's biggest influence on student learning is mediated through their ability to create a positive school climate.635 Other studies have validated the importance of principals' leadership and influence on school climate to retaining teachers, too.636 Fostering healthy school climates that re-engage and support students in the wake of several years of widespread trauma and disruption is critical; so too is fostering a school climate that prevents teacher burnout and motivates teachers to stay.

Given this, survey data from students, teachers, and the wider school community can be a valuable tool for providing feedback to principals and measuring a principal's success. Twenty-eight states explicitly allow or require surveys to be included in principal evaluations in some form. (For a breakdown of what kind of surveys are permitted, see Figure 16.) The state of Michigan, for instance, requires that a mix of feedback from students, teachers, and parents be included in a principal's evaluation score. Since 2019, at least one state that had previously required surveys for principal evaluations, Georgia, has dropped this requirement.

Figure 16.

What types of surveys are required or explicitly allowed as part of a principal's evaluation score?

Note: Credit is given to states that require input from students, parents, teachers, and peers, and this feedback may be in the form of a survey.


Figure 17.

What is the role of surveys in principal evaluations?

Note: Surveys may include student, parent, teacher, and/or peer surveys. Credit is given to states that require input from students, parents, teachers, and peers, and this feedback may be in the form of a survey.

Link to instructional leadership

Much of the conversation on principal quality in recent years has centered around enhancing the role of principals as instructional leaders, setting a standard for strong instruction across the school, and helping teachers meet that standard. The urgency of instructional leadership has only heightened in the wake of the pandemic, but evidence suggests that far too few principals feel that they are able to fulfill that aspect of their role, given the many competing demands on their time.637 Despite the importance of clarifying a principal's role in instructional leadership, many states have failed to use evaluation to signal that it is a priority: 18 states still do not explicitly link principal evaluations to their role as instructional leaders by including specific criteria related to this role in their evaluations.
 


How often are principals evaluated?


 
Like teachers, principals need to receive regular, actionable feedback and formal evaluations of their performance. Thirty states require that principals be evaluated each year, while eight set the frequency of evaluation based on principals' years of experience, with evaluations more likely to be required in the early years of a principal's career. One implementation challenge that some states may face is that principal employment contract cycles and evaluation cycles do not line up. For example, in some districts, principals may be on a three-year employment cycle but a two-year evaluation schedule, meaning that feedback cycles and employment decisions are not aligned.

Figure 18.

How frequently are principal evaluations required?




Are principal evaluations used for support and improvement?


 
If evaluation systems are designed to help principals hone their practice and improve student learning, then they must be linked to improvement systems. Yet too few states require that principals with less-than-effective ratings are placed on improvement plans: 22 states require improvement plans, while 29 either do not require improvement plans as remediation for low ratings or do not have a system of improvement plans at all. Since 2019, Georgia, Mississippi, Ohio, and Virginia have added new requirements, while Iowa, Nevada, New Mexico, South Carolina, and Utah removed requirements for improvement plans for less-than-effective principals.638 These policy shifts have resulted in one fewer state overall requiring improvement plans for principals deemed ineffective.

Figure 19.

Do states require improvement plans for principals with less-than-effective ratings?

Note: Four states (Massachusetts, Michigan, Nevada, and Washington) require that high-performing principals are evaluated less frequently.

RECOMMENDATIONS


Evidence supports key policies and practices in evaluation that improve teacher and principal skills and ultimately student outcomes: use of multiple measures (including student surveys and academic growth measures), regular opportunities for feedback, and more. (For a comprehensive list of policy conditions that standout systems have in common, see Figure 20.) States have a central role to play in both setting policy conditions and supporting effective implementation. We recommend high-leverage state policies and practices below.

Figure 20.

Components of a strong evaluation system

   
Multiple measures
Student surveys
Objective measures of student growth
At least three rating categories
Annual observations and evaluations for all teachers
Professional development tied to evaluation
Written feedback after each observation
   
Source: Putman, H., Ross, E., & Walsh, K. (2018).



 

Policy recommendations

   

Focus on student growth

Coming out of pandemic disruptions, states should begin with a renewed commitment to accelerating academic growth, and reflect the importance of this goal when designing teacher and principal evaluations. Student growth should be included as part of a range of evidence-based multiple measures, like surveys and observations. In response to concerns over gaps in available student data, states may consider temporary adjustments to their student growth model, such as expanding the years within the model, rather than eliminating it.
 

Require multiple observations, regular feedback, and annual evaluations

Multiple observations, regular feedback, and required annual evaluations for all teachers are important elements of effective evaluation systems that contribute to increased student learning.639 High-quality evaluation can be part of a comprehensive effort to ensure that all new teachers receive the regular feedback they need. Research suggests that frequent observations that are followed by timely, specific feedback have a discernible impact on teachers' improved practice.640 States play an important role in setting requirements for the timing and content of evaluations.

Support new teachers

At a time when retaining and supporting effective teachers is taking on even more urgency, all teachers (but especially new teachers) deserve supportive, actionable feedback and opportunities for growth and development on a regular basis. Data consistently shows that early career teachers have the highest attrition rate, yet they are too often left alone in their classrooms with little support.641 States can require that novice teachers receive more opportunities for observation and feedback, as they do in Delaware, New Mexico, Ohio, West Virginia, and Wisconsin, all of which require four observations for novice teachers each year.

Consistent evaluations with multiple opportunities to see teachers' practice and provide feedback contribute to teachers' growth and development. This is particularly important during a time when many states have lowered standards for entry into the teaching profession, allowing some new teachers to take on responsibility for classrooms without demonstrating they have mastered the knowledge and skills necessary to be successful.642 If states choose to pursue policies that allow less-prepared candidates to enter, then they must simultaneously commit to policies that support and evaluate these new teachers early and regularly.

Collect and publish statewide data

Collecting and publishing effectiveness data is critical to understanding which students have access to impactful teachers (and which students do not). There is widespread and long-standing evidence that effective teachers are distributed inequitably,643 and evaluation data is critical to identifying and ameliorating these gaps.644 States can begin to right these inequities by using evaluation data to see where they exist. Following the example set by states like Colorado, Arkansas, and Kentucky, states can develop systems that support improvement. Further, when states collect evaluation data, they hold districts accountable for providing teachers with feedback and evaluation—a core responsibility of state agencies.

Measure what matters for principals

Principals' impact on both school climate and educator satisfaction has a demonstrated relationship to improved student outcomes, teacher success, and retention,645 and they should be measured accordingly. To gain a comprehensive picture of a principal's impact, states should consider multiple measures of principal effectiveness, including surveys and measures of student learning. States can also explore new methods to measure student learning in principal evaluation, such as including multiple years of data in a principal's evaluation,646 or temporarily adjusting current growth models to account for any gaps in student data.

Design systems with consequences

Evaluation systems should recognize strong performance through incentives like sizable bonuses647 and provide real opportunities for teachers and principals to improve. For teachers and principals who do not improve over time, state policies should provide a clear process to exit.
 

   
  

How can states support quality implementation?


Beyond setting overarching policy conditions, states may also be searching for additional levers to ensure that their evaluation systems are implemented well. Though much of the work of making evaluation meaningful and effective happens in schools and districts, states have a range of tools available to them to support quality implementation. Of the systems that have had meaningful success, what most had in common were sustained investment that lasted beyond one system leader; a commitment to getting all stakeholders invested in the system; and a commitment to iterating on and evaluating the evaluation system, with follow-through, and improving it over time. Below, we offer additional steps for states working to improve how their evaluation systems operate in practice:

Analyze and act on statewide data

In order to understand how their evaluation systems are working, states should collect, analyze, and report on evaluation ratings. States can use this data to answer (and ultimately, act on) key questions about the relationship between evaluation data and student growth and achievement; identify inequities for teachers (see below); and target the inequitable distribution of teacher talent. States like Colorado have made progress in collecting and analyzing evaluation data from across the state in order to better understand the distribution and assignment of teachers.

Address disproportionate impact

State and local policymakers must take issues of potential racial bias in evaluation systems seriously. Doing so is vital to fundamental fairness and equity, and to ensuring that states support and retain the diverse teacher workforce that their students deserve. Recent evidence suggests that gaps exist in evaluation scores for teachers of color. In some cases, researchers have traced these disparities to observer bias, finding that white observers systematically assign Black teachers lower ratings, or that observers were more likely to assign higher scores to teachers of the same race in general;649 others have found that racial gaps in observation scores could be traced directly back to the student populations that teachers of color were more likely to teach in the first place.650
  • Researchers have emphasized that their findings are not cause to discontinue evaluation, but to continuously improve systems, increasing fairness, equity, and trust.651 While there is not yet strong evidence on interventions that work to ameliorate systemic bias in evaluation systems as a whole, researchers have suggested that using an evaluation system with multiple measures could partially mitigate this risk.654 To address observer bias, states could explore efforts to diversify the general pool of observers, increase the overall validity and reliability of observations by using multiple observers over the course of multiple observations,655 and calibrate observations (see recommendation below).
  • Before taking action, states need to understand the existing data. They should begin by requiring that districts submit teacher evaluation data annually, disaggregated by teacher subgroups, and analyze that data at the state level to understand impact and identify any disproportionate effects related to teacher demographic characteristics.656 In service of transparency, systems can follow the lead of District of Columbia Public Schools, which has published highly detailed data on the trends in equity and mitigating implicit bias in its evaluation system.657

Collect user feedback

States should aim to understand what both principals and teachers think about the usefulness of the evaluation systems they use, and get their view of how evaluation systems are implemented. Idaho, for instance, has conducted annual surveys of K-12 administrators to gauge how well they implement evaluation requirements like timing and number of observations. While the state’s most recent survey found troubling gaps in practice, it has been able to document some improvement over several years.658

Focus on continuous improvement

Using the feedback they collect, states can adjust evaluation systems and make improvements over time to their evaluation policies and practices. Tennessee stands out as a notable example of a state that made marked improvement in building trust as it worked to refine its evaluation system. As detailed in NCTQ’s 2018 report Making a Difference, Tennessee continually refined its evaluation system in response to educator feedback, meeting with over 7,500 educators to incorporate their input. Ultimately, the state made impressive gains in teachers’ beliefs about their evaluation systems: In 2012, only 38% of Tennessee teachers surveyed said that their school’s teacher evaluation process led to improvements in their teaching, a number that rose to 72% by 2018.659

Sponsor statewide evaluator training

To influence the quality of evaluator preparation, state agencies can also provide all evaluators with access to statewide training, though this would depend on having statewide common expectations or common elements within the evaluation system. Delaware, for instance, provides annual statewide training for all evaluators.660

Certify and calibrate observer skills

To promote access to effective evaluators, states can require that observers demonstrate their knowledge and skill as observers through a state certification process, and provide resources to evaluators so that they can calibrate ratings effectively. Texas, as part of the state Teacher Incentive Allotment, partners with Texas Tech as a third-party reviewer of evaluation data to validate its accuracy. Massachusetts offers an optional resource known as the Online Platform for Teaching and Informed Calibration (OPTIC) to help educators and evaluators build a shared understanding of high-quality instruction and improve the feedback that teachers receive. OPTIC uses video and interactive displays as part of a dynamic calibration training experience for both evaluators and teachers aligned to the standards for effective teacher practice and the standards for student learning.

Link to teacher preparation

To increase alignment between expectations for pre-service teachers and in-service teachers, states can require that all teacher preparation programs in the state use measures of performance for teacher candidates that are aligned to the state’s professional teaching standards and evaluation standards. Massachusetts, for instance, uses the Massachusetts Candidate Assessment of Performance (CAP), a practice-based assessment that aligns the evaluation of pre-service candidates to the teacher evaluation of in-service teachers. In a 2019 study, the CAP was found to be predictive of teachers’ future in-service evaluation scores.661

[data]
DATA


Download full dataset

Download the full teacher and principal evaluation policy data collected by NCTQ and used in this analysis. 

[ack]
ACKNOWLEDGEMENTS


Authors
Abigail Swisher and Dr. Patricia Saenz-Armstrong

Data collection and analysis
Kelli Lakis and Lisa Staresina

Project leadership
Dr. Heather Peske, NCTQ President
Shannon Holston, Chief of Policy and Programs
Hannah Putman, Managing Director of Research

Communications and advocacy
Nicole Gerber, Ashley Kincaid, Andrea Browne Taylor, and Shayna Levitan

Reviewers
Special thanks to the following individuals for providing review and feedback on this project. Inclusion does not imply endorsement.

Dr. Matthew A. Kraft
Associate Professor of Education and Economics
Brown University

Ron Noble, Jr.
Assistant Superintendent
Methuen Public Schools

Project funders
This report is based on research funded by the following foundations. The findings and conclusions contained within are those of the authors and do not necessarily reflect positions or policies of the project funders.

Daniels Fund
The Joyce Foundation

Suggested citation: Swisher, A. & Saenz-Armstrong, P. (2022). State of the States 2022: Teacher and Principal Evaluation Policies. Washington, D.C.: National Council on Teacher Quality.


Explore these other NCTQ state policy reports
 
State of the States 2022: Teacher Compensation Strategies

How do states use strategic teacher compensation, such as differentiated pay for hard-to-staff schools and subjects, performance pay, and pay for prior work experience, to attract and retain great teachers where they are most needed?

State of the States 2021: State Reporting of Teacher Supply and Demand Data

What data do states collect and report on the teacher labor market? Do states connect data on supply and demand to better understand and address teacher shortages?

State of the States 2021: Teacher Preparation Policy
What are state policy trends that govern some of the most essential aspects of teacher preparation, from reading and content knowledge licensure exams to admissions and basic skills test requirements?

State Policy Brief 2022: Ensuring Students' Equitable Access to Qualified and Effective Teachers
How have states responded to a 2015 federal law requiring that they collect and report on the equitable distribution of teacher talent across their schools?
[endnotes]