Frequency of Evaluation and Observation:
Massachusetts

Teacher and Principal Evaluation Policy

Note

The data and analysis on this page are from 2019. View and download the most recent policy data and analysis on Frequency of Evaluation and Observation in Massachusetts from the State of the States 2022: Teacher and Principal Evaluation Policies report.

Goal

The state should require annual evaluations of all teachers. The bar for this goal was raised in 2017.

Meets goal in part

Suggested Citation:
National Council on Teacher Quality. (2017). Frequency of Evaluation and Observation: Massachusetts results. State Teacher Policy Database. [Data set].
Retrieved from: https://www.nctq.org/yearbook/state/MA-Frequency-of-Evaluation-and-Observation-77

Analysis of Massachusetts's policies

Frequency of Evaluations: Massachusetts does not ensure that all teachers are evaluated annually. Veteran teachers who receive a rating of exemplary or proficient, coupled with a moderate or high impact on student learning, need only be evaluated once every two years. All other teachers, including probationary teachers, must be evaluated annually.

Multiple Observations: Massachusetts requires observations but does not specify a required number.

Feedback for New Teachers: Massachusetts requires formative assessments, which provide feedback on performance, at midyear.

Recommendations for Massachusetts

Require annual formal evaluations for all teachers.
All teachers in Massachusetts should be evaluated annually, even those who score proficient or above with at least a moderate impact on student learning on the state's summative evaluation. Rather than being treated as mere formalities, these teacher evaluations should serve as important tools for rewarding good teachers, helping average teachers improve, and holding weak teachers accountable for poor performance.

Base evaluations on multiple observations.
To guarantee that annual evaluations are based on an adequate collection of information, Massachusetts should require multiple observations for all teachers, even those who have nonprobationary status. 

State response to our analysis

Massachusetts recognized the factual accuracy of this analysis. The state added that its Elementary and Secondary Education (ESE) Model System represents the nonnegotiables in every district, and the guidance it provides serves as best practices for local districts to reference as they make school policy decisions. Massachusetts recognizes the uniqueness of the state's Educator Evaluation System and the latitude it provides school districts as they implement policy within a local context.

Massachusetts also noted, with regard to the frequency and duration of observations, that it has steered away from being overly prescriptive. Its regulations require "unannounced observations of practice for any duration." Brief, unannounced observations followed by feedback give educators the opportunity to receive frequent, high-quality feedback. For evaluators, short observations followed by brief feedback are a realistic and efficient approach given their capacity and workload.

Updated: December 2017

Last word

NCTQ agrees that brief, unannounced observations followed by feedback can be a powerful approach to providing teachers with the feedback and support they need to improve their practice. However, Massachusetts is urged to solidify this best practice by requiring that districts conduct multiple observations for all teachers.

How we graded

7C: Frequency of Evaluation and Observation 

  • Summative Ratings for All Teachers: The state should require that all teachers receive an annual summative rating.
  • Multiple Observations for All Teachers: The state should require that all teachers receive multiple formal observations that provide feedback.
  • Timely Feedback for Probationary Teachers: The state should require that all probationary teachers receive an observation within the first few months of the school year.

Summative Ratings for Probationary Teachers
One-quarter of the total goal score is earned based on the following:

  • One-quarter credit: The state will earn one-quarter of a point if it requires that all probationary teachers receive an annual summative rating.

Summative Ratings for Nonprobationary Teachers
One-quarter of the total goal score is earned based on the following:

  • One-quarter credit: The state will earn one-quarter of a point if it requires that all nonprobationary teachers receive an annual summative rating.

Multiple Observations for All Teachers
One-quarter of the total goal score is earned based on the following:

  • One-quarter credit: The state will earn one-quarter of a point if it requires that all teachers receive multiple observations.

Timely Feedback for Probationary Teachers
One-quarter of the total goal score is earned based on the following:

  • One-quarter credit: The state will earn one-quarter of a point if it requires that all probationary teachers are observed and receive feedback during the first half of the school year.
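
The arithmetic of the rubric above can be illustrated with a minimal sketch. It assumes each of the four components is a simple yes/no requirement worth one-quarter of a point. The class name, field names, and the example inputs are hypothetical and are not taken from NCTQ's own tooling or from any state's actual score.

```python
from dataclasses import dataclass
from fractions import Fraction

# Each of the four components is worth one-quarter of the total goal score.
QUARTER = Fraction(1, 4)


@dataclass(frozen=True)
class StatePolicy:
    """Hypothetical yes/no flags, one per rubric component."""
    annual_rating_probationary: bool      # annual summative rating, probationary teachers
    annual_rating_nonprobationary: bool   # annual summative rating, nonprobationary teachers
    multiple_observations_all: bool       # multiple observations for all teachers
    early_feedback_probationary: bool     # observation and feedback in first half of year


def goal_score(policy: StatePolicy) -> Fraction:
    """Return the total goal score: one-quarter point per requirement met."""
    components = (
        policy.annual_rating_probationary,
        policy.annual_rating_nonprobationary,
        policy.multiple_observations_all,
        policy.early_feedback_probationary,
    )
    # sum() counts the True flags; multiplying by 1/4 gives the score.
    return QUARTER * sum(components)


# Illustrative input only: a made-up state meeting two of the four components.
example = StatePolicy(
    annual_rating_probationary=True,
    annual_rating_nonprobationary=False,
    multiple_observations_all=False,
    early_feedback_probationary=True,
)
print(goal_score(example))  # 1/2
```

Using Fraction rather than floating-point values keeps the quarter-point credits exact when they are added together.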

Research rationale

Observations serve several purposes, including providing actionable feedback to teachers and producing a summative rating that can be used in staffing decisions. Observations can be a rich source of information for teachers, giving them useful feedback to improve their practice.

Multiple data sources should be used in teacher evaluation, including multiple observations by more than one observer.[1] Teacher observations conducted by principals that occur once or twice a year and consist of rating teachers on observable behaviors and characteristics have not proved valid.[2] Research widely finds that the nature of their role as both instructional leaders and summative judges inhibits principals' ability to reliably serve as evaluators.[3] In contrast, observations conducted by peers and other observers with subject knowledge are valid and reliable.[4] Additionally, teacher observations are more effective when they occur in tandem with aligned professional development.[5]

Observations are especially important for new teachers. In the absence of good metrics for determining who will be an effective teacher before he or she begins to teach,[6] it is critical that schools and districts closely monitor the performance of new teachers.[7] States should specifically require that new teachers receive an observation early in the school year. Early feedback may be especially essential for new teachers, given that teachers' performance in their first year is a strong predictor of their performance in later years.[8]

Student reports of teacher quality are a unique and largely untapped source of rich data.[9] Research finds that student input on teacher quality adds value to teacher evaluation systems. Research also finds teachers prefer evaluation systems that include student survey data.[10] Students' first-hand reports of classroom elements (e.g., textbooks, homework, instruction), teacher-student communication, assignments, and daily classroom operations may provide teachers with credible information about their impact in the classroom, as well as serve as a tool for formative evaluation.[11] Student perceptions of learning environments can be reliable and predictive of learning.[12] Including student surveys in teacher evaluation systems strengthens the ability to identify teachers' effects on outcomes beyond standardized test scores.[13] In addition, teacher evaluation systems that include student survey data, which are somewhat correlated with teachers' student growth measures,[14] are stronger, more reliable, and more valid than those that rely solely on administrator reports and observations.[15]


[1] Glass, G. V. (1974). A review of three methods determining teacher effectiveness. In H. J. Walberg (Ed.), Evaluating educational performance (pp. 11-32). Beverly Hills, CA: Sage.; Travers, R. M. W. (1981). Criteria of good teaching. In J. Millman (Ed.), Handbook of teacher evaluation (pp. 14-22). Beverly Hills, CA: Sage.; Xu, S. & Sinclair, R. L. (2002). Improving teacher evaluation for increasing student learning. Paper presented at the annual meeting of the AERA, New Orleans, LA.
[2] Peterson, K. D. (2004). Research on school teacher evaluation. NASSP Bulletin, 88(639), 60-79.; The New Teacher Project. (2009). The widget effect: Our national failure to acknowledge and act on differences in teacher effectiveness. Retrieved from http://files.eric.ed.gov/fulltext/ED515656.pdf; Ellett, C. D. & Teddlie, C. (2003). Teacher evaluation, teacher effectiveness, and school effectiveness: Perspectives from the USA. Journal of Personnel Evaluation in Education, 17(1), 101-128.; Good, T. L., & Mulryan, C. (1990). Teacher ratings: A call for teacher control and self-evaluation. In J. Millman & L. Darling-Hammond (Eds.), The new handbook of teacher evaluation: Assessing elementary and secondary school teachers. Newbury Park, CA: Sage.; Darling-Hammond, L. (1986). A proposal for evaluation in the teaching profession. The Elementary School Journal, 86(4), 530-551.; Hazi, H. M., & Arredondo Rucinski, D. (2009). Teacher evaluation as a policy target for improved learning: A fifty-state review of statute and regulatory action since NCLB. Education Policy Analysis Archives, 17(5).; Jacob, B. A., & Lefgren, L. (2008). Can principals identify effective teachers? Evidence on subject performance evaluation in education. Journal of Labor Economics, 26(1), 101-136.; Peterson, K. D. (2000). Teacher evaluation: A comprehensive guide to new directions and practices (2nd ed.). Thousand Oaks, CA: Corwin Press.; Stiggins, R. J., & Bridgeford, N. J. (1985). Performance assessment for teacher development. Educational Evaluation and Policy Analysis, 7(1), 85-97.
[3] Jordan School District. (1995). Jordan Performance Appraisal System. Sandy, UT: Jordan School District, Utah.; Lortie, D. (1975). Schoolteacher: A sociological study. Chicago, IL: University of Chicago Press.; Waller, W. (1932). The sociology of teaching. New York, NY: Wiley & Sons.; Popham, W. J. (1988). The dysfunctional marriage of formative and summative teacher evaluation. Journal of Personnel Evaluation in Education, 1(3), 269-273.; Hunter, M. (1988). Effecting a reconciliation between supervision and evaluation: A reply to Popham. Journal of Personnel Evaluation in Education, 1(3), 275-279.; Ellett, C. D. (1987). Emerging teacher performance assessment practices: Implications for the instructional supervision role of school principals. In W. Greenfield (Ed.), Instructional leadership: Concepts, issues, and controversies (pp. 302-327). Boston, MA: Allyn and Bacon.; Scriven, M. (1988). Duty-based teacher evaluation. Journal of Personnel Evaluation in Education, 1(4), 319-334.; Stronge, J. H., Helm, V. M., & Tucker, P. D. (1995). Evaluation handbook for professional support personnel. Michigan: CREATE, The Evaluation Center, 1-91.; Cook, M. A., & Richards, H. C. (1972). Dimensions of principal and supervisor ratings of teacher behavior. Journal of Experimental Education, 41(2), 11-14.
[4] Peterson, K. (2004). Research on school teacher evaluation. NASSP Bulletin, 88(639), 60-79.; Hill, H., & Grossman, P. (2013). Learning from teacher observations: Challenges and opportunities posed by new teacher evaluation systems. Harvard Educational Review, 83(2), 371-384.
[5] Shaha, S. H., Glassett, K. F., & Copas, A. (2015). The impact of teacher observations with coordinated professional development on student performance: A 27-state program evaluation. Journal of College Teaching & Learning, 12(1), 55.
[6] For a review of the limited data on new teachers, see: Chingos, M. M., & Peterson, P. E. (2011). It's easier to pick a good teacher than to train one: Familiar and new results on the correlates of teacher effectiveness. Economics of Education Review, 30(3), 449-465.
[7] Staiger, D. O., & Rockoff, J. E. (2010). Searching for effective teachers with imperfect information. The Journal of Economic Perspectives, 24(3), 97-117.
[8] Atteberry, A., Loeb, S., & Wyckoff, J. (2015). Do first impressions matter? Predicting early career teacher effectiveness. AERA Open, 1(4), 1-23.
[9] Aleamoni, L. M. (1999). Student rating myths versus research facts from 1924 to 1998. Journal of Personnel Evaluation in Education, 13, 153-166.; Peterson, K. D. (2000). Teacher evaluation: A comprehensive guide to new directions and practices (2nd ed.). Thousand Oaks, CA: Corwin Press.
[10] Peterson, K. D. (2000). Teacher evaluation: A comprehensive guide to new directions and practices (2nd ed.). Thousand Oaks, CA: Corwin Press.; Peterson, K. D., Wahlquist, C., & Bone, K. (2000). Student surveys for school teacher evaluation. Journal of Personnel Evaluation in Education, 14(2), 135-153.; Peterson, K. D. (2004). Research on school teacher evaluation. NASSP Bulletin, 88(639), 60-79.; Stronge, J., & Ostrander, L. (1997). Client surveys in teacher evaluation. In J. H. Stronge (Ed.), Evaluating teaching: A guide to current thinking and best practice (pp. 129-161). Thousand Oaks, CA: Corwin Press.
[11] Peterson, K. D. (2000). Teacher evaluation: A comprehensive guide to new directions and practices (2nd ed.). Thousand Oaks, CA: Corwin Press.; Aleamoni, L. M. (1981). Student ratings of instruction. In J. Millman (Ed.), Handbook of teacher evaluation (pp. 110-145). Beverly Hills, CA: Sage.; Aleamoni, L. M. (1987). Student rating myths versus research facts. Journal of Personnel Evaluation in Education, 1, 111-119.; Aleamoni, L. M. (1999). Student rating myths versus research facts from 1924 to 1998. Journal of Personnel Evaluation in Education, 13, 153-166.; McGreal, T. L. (1983). Successful teacher evaluation. Alexandria, VA: Association for Supervision and Curriculum Development.; Peterson, K. D., Stevens, D., & Driscoll, A. (1990). Primary grade student reports for teacher evaluation. Journal of Personnel Evaluation in Education, 4, 165-173.; Wallace, T. L., Kelcey, B., & Ruzek, E. (2016). What can student perception surveys tell us about teaching? Empirically testing the underlying structure of the Tripod student perception survey. American Educational Research Journal, 53(6), 1834-1868.
[12] Fauth, B., Decristan, J., Rieser, S., Klieme, E., & Büttner, G. (2014). Student ratings of teaching quality in primary school: Dimensions and prediction of student outcomes. Learning and Instruction, 29, 1-9.; Wagner, W., Göllner, R., Helmke, A., Trautwein, U., & Lüdtke, O. (2013). Construct validity of student perceptions of instructional quality is high, but not perfect: Dimensionality and generalizability of domain-independent assessments. Learning and Instruction, 28, 1-11.; Kane, T. J., & Cantrell, S. (2010). Learning about teaching: Initial findings from the measures of effective teaching project. Seattle, WA: The Bill & Melinda Gates Foundation.
[13] Kane, T. J., McCaffrey, D. F., Miller, T., & Staiger, D. O. (2013). Have we identified effective teachers? Validating measures of effective teaching using random assignment. MET Project. Seattle, WA: The Bill & Melinda Gates Foundation.
[14] Wallace, T. L., Kelcey, B., & Ruzek, E. (2016). What can student perception surveys tell us about teaching? Empirically testing the underlying structure of the Tripod student perception survey. American Educational Research Journal, 53(6), 1834-1868.
[15] Peterson, K. D., & Stevens, D. (1988). Student reports for school teacher evaluation. Journal of Personnel Evaluation in Education, 1, 259-267.; Stronge, J., & Ostrander, L. (1997). Client surveys in teacher evaluation. In J. H. Stronge (Ed.), Evaluating teaching: A guide to current thinking and best practice (pp. 129-161). Thousand Oaks, CA: Corwin Press.