Evaluation of Effectiveness: Wisconsin

2015 Identifying Effective Teachers Policy

Goal

The state should require instructional effectiveness to be the preponderant criterion of any teacher evaluation.

Meets a small part

Suggested Citation:
National Council on Teacher Quality. (2015). Evaluation of Effectiveness: Wisconsin results. State Teacher Policy Database. [Data set].
Retrieved from: https://www.nctq.org/yearbook/state/WI-Evaluation-of-Effectiveness-71

Analysis of Wisconsin's policies

Wisconsin requires some evidence of student learning in its teacher evaluations. The state has developed the Educator Effectiveness System. Districts may use an alternative model if it is equivalent to the state system, but the alternative must be approved.

Fifty percent of the total evaluation score is based on student outcomes. The student outcomes portion consists of one student learning outcome (SLO) goal (95 percent) and schoolwide value-added or graduation rate (5 percent). The SLOs are self-scored by the teacher being evaluated.

The remaining 50 percent is based on professional practice, which includes classroom observations. 
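
The weights above can be worked through as a short calculation. This is only an illustrative sketch: the percentages come from the analysis, but the variable names and the assumption that the sub-weights simply multiply through to shares of the total score are ours, not Wisconsin's published formula.

```python
# Shares as stated in the analysis. How they combine into a total is an
# assumption of this sketch, not a formula published by the state.
STUDENT_OUTCOMES_SHARE = 0.50        # half of the total evaluation score
PROFESSIONAL_PRACTICE_SHARE = 0.50   # the remaining half
SLO_SHARE_OF_OUTCOMES = 0.95         # SLO goal, within student outcomes
SCHOOLWIDE_SHARE_OF_OUTCOMES = 0.05  # value-added or graduation rate


def overall_weights():
    """Return each component's share of the total evaluation score."""
    return {
        "professional_practice": PROFESSIONAL_PRACTICE_SHARE,
        "self_scored_slo": STUDENT_OUTCOMES_SHARE * SLO_SHARE_OF_OUTCOMES,
        "schoolwide_measure": STUDENT_OUTCOMES_SHARE * SCHOOLWIDE_SHARE_OF_OUTCOMES,
    }


weights = overall_weights()
for name, share in weights.items():
    print(f"{name}: {share:.1%}")
```

Under these assumptions, the self-scored SLO accounts for 47.5 percent of the total score, while the schoolwide measure accounts for 2.5 percent.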

Wisconsin no longer utilizes multiple rating categories in its evaluation system. Instead, the state now reports a teacher's practice and student outcomes scores on a graph, with the axes representing these two scores. 

Recommendations for Wisconsin

Ensure that student growth measures are objective.
Although Wisconsin requires that student outcomes make up 50 percent of the evaluation score, 95 percent of that portion consists of a self-scored SLO. This undermines the value of incorporating student data into an evaluation score and sets aside better sources of objective evidence. The state is encouraged to rethink its SLO scoring policy and allow only objective student growth data to factor into a teacher evaluation rating.

Utilize rating categories that meaningfully differentiate among various levels of teacher performance.
To ensure that the evaluation instrument accurately differentiates among levels of teacher performance, Wisconsin should require districts to utilize multiple rating categories, such as highly effective, effective, needs improvement and ineffective. The state's new system merely provides a "general sense of effectiveness," which is inadequate and does not allow any sort of meaningful differentiation among levels of performance. 

State response to our analysis

Wisconsin was helpful in providing NCTQ with the facts necessary for this analysis. The state added that it has modified its SLO process, including the scoring of SLOs, as well as the entire Student Outcome portion of its Educator Effectiveness System. Regarding its new method of reporting teachers' scores, Wisconsin noted that "the point plotted for the individual educator illustrates graphically strengths and areas for growth." In a subsequent response, the state asserted that while teachers do self-score the SLO as part of the reflection process, this is not the score that contributes to their final evaluation. The evaluator reviews all SLO evidence of student progress and teacher process (including the accuracy of the teacher’s self-score) when providing the final, holistic score.

Last word

According to documentation regarding the SLO process submitted by the state, it will use the same data and measures as outlined in the above analysis. However, the shift in process addresses the issues of goal setting and feedback regarding the teacher's implementation progress and its impact on student progress. 

Research rationale

Value-added analysis connects student data to teacher data to measure achievement and performance.
Value-added models are an important tool for measuring student achievement and school effectiveness. These models measure individual students' learning gains, controlling for students' previous knowledge. They can also control for students' background characteristics. In the area of teacher quality, value-added models offer a fairer and potentially more meaningful way to evaluate a teacher's effectiveness than other methods schools use.
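
To make the idea of "controlling for students' previous knowledge" concrete, the following is a minimal sketch of one simple form of value-added calculation. It is not the model used by Wisconsin or any particular state: the function names, the single-predictor linear fit, and the grade-equivalent scores are all hypothetical. End-of-year scores are predicted from prior scores across all students, and each teacher's estimate is the average amount by which his or her students beat or miss that prediction.

```python
from collections import defaultdict
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def value_added(records):
    """records: (teacher, prior_score, end_score) tuples, one per student.

    Returns each teacher's mean residual: how far, on average, that
    teacher's students ended above or below the score predicted from
    their prior score alone.
    """
    priors = [prior for _, prior, _ in records]
    ends = [end for _, _, end in records]
    slope, intercept = fit_line(priors, ends)

    residuals = defaultdict(list)
    for teacher, prior, end in records:
        predicted = intercept + slope * prior
        residuals[teacher].append(end - predicted)
    return {teacher: mean(vals) for teacher, vals in residuals.items()}


# Hypothetical grade-equivalent reading scores (start of year, end of year).
records = [
    ("Teacher A", 3.0, 4.3), ("Teacher A", 4.0, 5.2), ("Teacher A", 5.0, 6.1),
    ("Teacher B", 3.0, 3.7), ("Teacher B", 4.0, 4.8), ("Teacher B", 5.0, 5.9),
]
scores = value_added(records)
for teacher, score in scores.items():
    print(f"{teacher}: {score:+.2f}")
```

In this made-up data both teachers start with similar students, so the difference in their estimates reflects differences in learning gains rather than differences in where students began. Real value-added models typically add further controls, such as student background characteristics, and use several years of data.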

For example, at one time a school might have known only that its fifth-grade teacher, Mrs. Jones, consistently had students who did not score at grade level on standardized assessments of reading. With value-added analysis, the school can learn that Mrs. Jones' students were reading on a third-grade level when they entered her class, and that they were above a fourth-grade performance level at the end of the school year. While not yet reaching appropriate grade level, Mrs. Jones' students had made more than a year's progress in her class. Because of value-added data, the school can see that she is an effective teacher.

Teachers should be judged primarily by their impact on students.
While many factors should be considered in formally evaluating a teacher, nothing is more important than effectiveness in the classroom.
Unfortunately, districts have used many evaluation instruments, including some mandated by states, that are structured so that teachers can earn a satisfactory rating without any evidence that they are sufficiently advancing student learning in the classroom. It is often enough that teachers appear to be trying, not that they are necessarily succeeding.

Many evaluation instruments give as much weight, or more, to factors that lack any direct correlation with student performance: for example, taking professional development courses, assuming extra duties such as sponsoring a club or mentoring, and getting along well with colleagues. Some instruments hesitate to hold teachers accountable for student progress. Teacher evaluation instruments should include factors that combine both human judgment and objective measures of student learning.

Evaluation of Effectiveness: Supporting Research
Reports strongly suggest that most current teacher evaluations are largely meaningless, failing to identify the strongest and weakest teachers. The New Teacher Project's report, "Hiring, Assignment, and Transfer in Chicago Public Schools," July 2007 at: http://www.tntp.org/files/TNTPAnalysis-Chicago.pdf, found that the CPS teacher performance evaluation system at that time did not distinguish strong performers and was ineffective at identifying poor performers and dismissing them from Chicago schools. See also Lars Lefgren and Brian Jacob, "When Principals Rate Teachers," Education Next, Volume 6, No. 2, Spring 2006, pp.59-69. Similar findings were reported for a larger sample in The New Teacher Project's The Widget Effect (2009) at: http://widgeteffect.org/. See also MET Project (2010). Learning about teaching: Initial findings from the measures of effective teaching project. Seattle, WA: Bill & Melinda Gates Foundation.

A Pacific Research Institute study found that in California, between 1990 and 1999, only 227 teacher dismissal cases reached the final phase of termination hearings. The authors write: "If all these cases occurred in one year, it would represent one-tenth of 1 percent of tenured teachers in the state. Yet, this number was spread out over an entire decade." In Los Angeles alone, over the same time period, only one teacher went through the dismissal process from start to finish. See Pamela A. Riley, et al., "Contract for Failure," Pacific Research Institute (2002).

That the vast majority of districts have no teachers deserving of an unsatisfactory rating is hard to square with what we know of most professions, which routinely include individuals who are not well suited to the job. Nor do these teacher ratings seem to correlate with school performance, suggesting teacher evaluations are not a meaningful measure of teacher effectiveness. For more information on the reliability of many evaluation systems, particularly the binary systems used by the vast majority of school districts, see S. Glazerman, D. Goldhaber, S. Loeb, S. Raudenbush, D. Staiger, and G. Whitehurst, "Evaluating Teachers: The Important Role of Value-Added." The Brookings Brown Center Task Group on Teacher Quality, 2010.

There is growing evidence suggesting that standards-based teacher evaluations that include multiple measures of teacher effectiveness (both objective and subjective) correlate with teacher improvement and student achievement. For example, see T. Kane, E. Taylor, J. Tyler, and A. Wooten, "Evaluating Teacher Effectiveness." Education Next, Volume 11, No. 3, Summer 2011, pp.55-60; E. Taylor and J. Tyler, "The Effect of Evaluation on Performance: Evidence from Longitudinal Student Achievement Data of Mid-Career Teachers." NBER Working Paper No. 16877, March 2011; as well as H. Heneman III, A. Milanowski, S. Kimball, and A. Odden, "CPRE Policy Brief: Standards-based Teacher Evaluation as a Foundation for Knowledge- and Skill-based Pay," Consortium for Policy Research, March 2006.