a comprehensive review of 2,500 teacher
programs in the U.S. rated on 19 standards


NCTQ Methodology for Teacher Prep Review

The NCTQ Teacher Prep Review evaluates the quality of programs that provide preservice preparation of teachers. The Review database includes both traditional and non-traditional programs.46

Traditional programs: The Teacher Prep Review evaluates a total of 2,481 undergraduate and graduate elementary, secondary, and special education programs offered by education schools in 1,117 public and private institutions of higher education.47

Non-traditional programs: Program overviews are provided for 85 non-traditional secondary programs, expanding to 146 non-traditional elementary and secondary programs in future editions.

Total sample: In total, over the six releases that comprise this edition of the Review, the Teacher Prep Review will post evaluations of 2,627 teacher preparation programs. These are the programs referred to as "the sample." The approximately 240 institutions that produce fewer than 20 teachers annually (and together produce less than 1 percent of the nation's public school teacher candidates) are not included in the sample. In past years, the Review has evaluated prep programs responsible for producing 99 percent of traditionally prepared teachers, and 85 percent of all teachers including those from non-traditional programs. We are expanding our reach to include even more traditional and non-traditional programs, and will update these numbers following the 2018 releases of the Teacher Prep Review.

The Review also includes ratings for 85 non-traditional secondary programs, and will expand this sample to include more secondary programs and non-traditional elementary programs in future editions. The methodology used for evaluation of non-traditional programs is available here.

Our methodology is largely unchanged since the Review's inception. For example, at the outset of the Review we systematically selected traditional programs for evaluation at each of the institutions in the sample. Because the sample has remained the same, no new traditional program selection has taken place, with the exception of the addition of special education prep programs. More information about program selection is included here.

Similarly, for traditional teacher preparation program evaluation, our data collection and verification methods are unchanged, although over the past several editions of the Review, we have used more open records requests of school districts.

The development of the NCTQ standards and methodology was accomplished deliberately over a period of nine years, with 13 pilot studies of 1,597 programs in all 50 states and the District of Columbia, and field tests of 41 standards in all.52 We've written a primer on traditional teacher preparation to provide some important background information. For definitions of key terms, see our glossary.

An appeal to institutions to provide data for each edition of the Review is sent months in advance of the release date, and data are accepted for several months. All institutions were invited to submit new data for all standards, regardless of whether they had been evaluated on those standards in past editions of the Review.

Standards are a crucial governing feature of every institution involved in education, including teacher preparation programs. NCTQ's standards focus on what programs should do to prepare teachers to teach to the high level required by college- and career-readiness standards, and their application is measurable.

NCTQ drew on a number of different sources to develop these criteria for policies and practices, which are intended to raise the level of training of the nation's teacher workforce.

To the extent that high-quality research can inform how teachers should be prepared, NCTQ uses that research to formulate standards. Unfortunately, research in education that connects preparation practices to teacher effectiveness is both limited and inconsistent in quality. Our standards for the Teacher Prep Review are also based on the consensus opinions of internal and external experts, the best practices of other nations, the states with the highest performing students, and, most importantly, what superintendents and principals around the country tell us they look for in the new teachers they hire. The standards have been refined over ten years by 13 national and state studies, and by consultation with experts on NCTQ's Technical Panel. Because many were developed before increasingly rigorous state student learning standards had been implemented, they have also been honed to ensure alignment with those standards.

We continue to develop new standards, including one on Rigor of Grading that was applied to undergraduate teacher preparation programs in a separate report released in fall 2014, and one on Fundamentals of Instruction, which was developed in a new report in winter 2016 and will be used to evaluate secondary programs in the future.

For each of our standards, we have developed a rationale that lays out the support found in research and other sources. These rationales can be found in the "Understanding Our Standard" documents we have created for every standard for teacher preparation used in the Teacher Prep Review.

We welcome an ongoing discussion with others--teachers, teacher educators, prep program leaders, state policymakers, and accrediting bodies--about the best way to evaluate teacher preparation program quality.

We have established a wide array of techniques to collect and validate the data we need for the Teacher Prep Review. As always, our chief concern was ensuring that we obtained valid data that accurately reflected the training these institutions provide teacher candidates.

To determine what data we needed from institutions and to gather those data for program evaluation, we began by analyzing each program, and reviewing university catalogs and other program material posted publicly by the institution. By this means we identified general education and professional course requirements, along with course descriptions.53

After a comprehensive review of this publicly posted material, we asked the institutions for materials such as syllabi for particular courses,54 information on graduate and employer surveys, and material related to student teaching placements.

These are the methods NCTQ uses to collect data:

1. Open-records requests to institutions. All 50 states and the District of Columbia have open-records laws (also known as "sunshine," "freedom of information act" or "FOIA" laws) that require public agencies to turn over documents upon request by an individual or organization. Except in Pennsylvania and Illinois, public universities are almost universally considered public agencies under these laws.55 Teacher preparation programs at private institutions are not covered by these laws, even though they too are publicly approved to prepare public school teachers. To collect data for the Review, we send an individualized request to each program, asking it to work with us. If a program declines, or does not respond after 10 days, we follow up with a formal open-records request listing the documents we require, including course syllabi. We made open-records requests of 375 public institutions that initially chose not to work with us.

2. Open-records requests to school districts. Teacher preparation programs partner with one or more school districts to arrange for student teaching as the crucial apprenticeship experience candidates need before taking the reins of a classroom. Programs often provide student teaching handbooks to districts and sign formal contracts or memoranda of understanding with districts that set forth the criteria and processes by which mentor teachers are chosen. To capture this material, we sent open-records requests to more than 1,000 districts across the country for the Review.

3. Online searches. We judiciously search online for information we need for the Teacher Prep Review. Professors post syllabi and programs put up student teaching handbooks on institutional websites—all of this material is publicly accessible. We also periodically collect information on each semester's textbook listings from institutions' online bookstores. We do not use a syllabus that is posted online unless we can confirm it is valid and current using dates, required reading that matches bookstore records, and other information.

4. Campus outreach. Because we need such an extensive array of documents for our evaluation (see Fig. 6 for a full list of the data needed for each standard) and because of the resistance we face, the methods outlined above are insufficient, particularly for private institutions. So we have reached out to people on campuses, particularly faculty, to ask them to provide us with the documents we need.

Data validation
Regardless of the source, each and every document we receive has to be carefully checked to determine whether it is valid. The syllabi should clearly list the course number and, where appropriate, section number, as well as the professor's name. For courses where we analyze textbooks (reading and elementary math), the syllabi also need to have a list of assigned textbooks.
Trained general analysts working under the supervision of our team leaders perform these thorough checks. At times we have to go back to institutions that have supplied us with documents in response to an open-records request to obtain more complete versions of documents we had requested.

Data analysis
Teacher prep programs must document their standard policies and procedures because they and their institutions need to communicate with their students, and/or because programs are regulated entities that must interact regularly with various institutions (state agencies, accrediting bodies, and local school districts, among others). Our evaluations are largely based on the documents containing policies and procedures. Descriptions of policies and procedures provided to us by institutions in lieu of the actual policy statements are not accepted as data that can satisfy any part of a standard.

For example, we often find cover letters from institutions accompanying submitted data to be very helpful in navigating through the many files provided, but statements in the letters are not used in analysis unless they are corroborated by language in official documents.

Our evaluations can be described as "low inference." Analysts are trained to look only for evidence that teacher preparation programs have particular features related to admissions, content preparation, and professional preparation. For example, when evaluating observation forms that provide feedback to teacher candidates on their use of classroom management techniques in student teaching placements, analysts determine whether the forms contain references to specific techniques. Analysts do not attempt to ascertain whether anything—for example, about the nature of rubrics or instructions to university supervisors conducting observations—will lead to valid and reliable feedback on classroom management. However, it is indisputable that a teacher candidate is more likely to receive feedback on a specific management technique if it is explicitly noted on the feedback form than if it is not noted at all. Our evaluations can therefore distinguish programs that are potentially better-designed to provide great training from those that are less well-designed.

Scoring processes
Our scoring processes place the full collection of documents relevant for evaluation at the disposal of an analyst after a very methodical and systematic process of coding and sorting. Analysts have been trained to follow a very detailed and systematic standard-specific protocol to make a "yes" or "no" decision about whether each of a standard's indicators is satisfied.56 (Scoring methodologies are included in the Understanding Our Standard documents, available here.) When an indicator is satisfied, the analyst has to identify the relevant data and document the source. If the indicator is not satisfied but there is information that bears on the indicator, the analyst has to identify the data that are "next closest" to satisfying the indicator and document the source. If there are no data related to the indicator, the analyst has to make an explicit statement to that effect. All data entered in our database are automatically annotated with the date and the analyst's name. The figure below provides a guide to possible scores by standard.
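The record-keeping rules described above can be sketched as a small data structure. This is only an illustration and not NCTQ's actual database schema: the class name `IndicatorFinding`, its fields, and the three `kind` labels are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass
class IndicatorFinding:
    """One analyst's yes/no decision on one indicator (hypothetical schema)."""
    indicator: str
    satisfied: bool
    analyst: str
    data: Optional[str] = None    # the relevant or "next closest" data, if any
    source: Optional[str] = None  # the document the data came from
    recorded_on: date = field(default_factory=date.today)  # automatic date stamp

    def __post_init__(self):
        # Any data that is cited must be traceable to a source document.
        if self.data is not None and self.source is None:
            raise ValueError("data must be accompanied by its source")
        # A "yes" decision must be backed by identified data.
        if self.satisfied and self.data is None:
            raise ValueError("a satisfied indicator needs supporting data")

    @property
    def kind(self) -> str:
        if self.satisfied:
            return "satisfied"
        if self.data is not None:
            return "next closest"
        return "no data"  # the explicit statement that nothing was found
```

A finding built with `satisfied=False` and no data reports `kind == "no data"`, which mirrors the requirement that the analyst state explicitly when nothing bears on the indicator.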

Programs are evaluated on standards by one general analyst. However, for each standard, we randomly select a sample of programs (generally twenty percent of programs evaluated on the standard)57 for evaluation by a second analyst. The figure below provides a graphic depiction of this process for the Student Teaching Standard.
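The random selection of programs for a second evaluation could look something like the following sketch. The function name, the `seed` parameter, and the rounding of the sample size are assumptions made for illustration; the source states only that roughly twenty percent of programs are selected at random.

```python
import random


def select_for_second_review(program_ids, fraction=0.20, seed=None):
    """Randomly pick a share of programs for evaluation by a second analyst."""
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    k = round(len(program_ids) * fraction)  # rounding rule is an assumption
    return sorted(rng.sample(list(program_ids), k))


# Example: 100 hypothetical program identifiers, 20 of which are drawn.
programs = [f"program_{i:03d}" for i in range(100)]
second_review = select_for_second_review(programs, seed=1)
```

Sampling without replacement ensures that no program is drawn twice for the same standard.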

In each case, based on the indicator evaluations, a standard grade between "A" and "F" corresponding to a range of scores from "meets standard" to "does not meet standard," is automatically generated. For selected standards, programs that not only meet but go above and beyond the requirements of the standard earn "strong design" on that standard, represented as an "A+."

For the sample of programs that are analyzed by two analysts, when the score produced by both analysts is identical, the analysis of one is chosen randomly by the database to represent the final score. As is explained in greater depth in the description of the RevStat management system posted here, any difference of one grade in program scores based on evaluations by two analysts (for example, one evaluation leading to a score of "B" and one leading to a score of "A") leads to "coding up," an automatic awarding of the higher of the two scores. Any difference of two or more levels in scores triggers an "exceeds variance" signal that requires team leader investigation and resolution.58 Instances of excessive variances are monitored through the RevStat process; whenever variances approach 10 percent, action is taken to improve fidelity to scoring protocols or to modify the scoring process as necessary.
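The resolution rules for double-scored programs can be expressed as a short sketch. It is an illustration under stated assumptions, not the RevStat implementation: the function names are hypothetical, the grade ordering in `GRADES` is assumed, and treating "A" and "A+" as one level apart is also an assumption.

```python
import random

# Assumed ordering of standard grades, lowest to highest. "A+" ("strong
# design") is available only on selected standards.
GRADES = ["F", "D", "C", "B", "A", "A+"]


def resolve(grade_1, grade_2, rng=random):
    """Return (final_grade, status) for a program scored by two analysts."""
    gap = abs(GRADES.index(grade_1) - GRADES.index(grade_2))
    if gap == 0:
        # Identical scores: one analyst's analysis is picked at random.
        return rng.choice([grade_1, grade_2]), "identical"
    if gap == 1:
        # "Coding up": the higher of the two grades is awarded automatically.
        return max(grade_1, grade_2, key=GRADES.index), "coded up"
    # Two or more levels apart: no grade until a team leader resolves it.
    return None, "exceeds variance"


def variance_rate(pairs):
    """Share of double-scored programs flagged as exceeding variance."""
    flagged = sum(1 for a, b in pairs if resolve(a, b)[1] == "exceeds variance")
    return flagged / len(pairs)
```

A monitoring step could compare `variance_rate(...)` for a standard against the 10 percent level mentioned above to decide when scoring protocols need attention.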

State context
States regulate teacher preparation programs extensively. A teacher preparation program must show that it meets its state's standards to earn approval to train and recommend candidates for licensure and must undergo reapproval every five to seven years thereafter.

We therefore thoroughly examine all relevant state regulations as part of our scoring processes for every standard. We begin with the findings of our comprehensive State Teacher Policy Yearbook and investigate further when necessary. In considering state regulations, we follow these general principles:

  • Give credit for building on strong regulations: We give credit to programs explicitly affirming state regulations that improve program quality. In Texas, for example, programs that affirm that they only admit applicants who achieve scores on the Texas Higher Education Assessment (THEA) that exceed by any amount the state's initial thresholds meet the Selection Criteria Standard.59

  • Hold programs responsible for ensuring candidates are prepared: The ambiguity and complexity of state regulations do not relieve programs of doing what is necessary to make sure that their graduates are well equipped to help students learn. For example, 21 states offer only K-12 certification for special education teachers. Programs in those states have an obligation to make sure that their special education candidates have adequate content knowledge, so we evaluate programs for content preparation for both the elementary and secondary grades.

The impact of state regulations on our analysis
To provide a more detailed sense of how state regulations impact our analysis, we provide examples below of two standards where context is crucial, and two standards where it has no impact.

  • State regulations on expectations for secondary teacher subject knowledge

    Ratings for two of our traditional teacher preparation standards—Secondary Content in the Sciences and Secondary Content in the Social Sciences (as well as the analogs of these standards that we apply to non-traditional programs) are deeply informed by the state regulatory context in which programs are embedded. The starting point of our analysis is the state's licensing test regime: Does it test all subject matter that a secondary science or social sciences teacher will need to know for all the subjects he or she could be assigned to teach? The more comprehensive a state's testing regime, the less possibility that a secondary teacher will be assigned to teach a course without knowing his or her subject. Where there are gaps in testing, we scrutinize the content of coursework that programs require of their candidates.

    For "unitary" subjects such as math, tests are generally an adequate guide to content preparation: Math teacher candidates who are tested only in math can generally only teach math classes. For the social sciences and the sciences, however, state licensing regimes are generally not robust enough. In some states, teachers earning a license in "general science" can teach high school physics without ever having to demonstrate that they know physics. In other states, a person who majored in anthropology could teach U.S. history classes without ever taking more than one or two courses in the subject. In these cases, we take a closer look at whether programs in these states are doing what they should to prepare teachers for the courses they could be assigned to teach.

    A general consequence of our approach for these standards is that a state's licensing regime provides a ratings backstop for its programs: Programs generally can do no worse than the strength of their state's licensing test system, and can take steps to do better.

    (To learn more about how state context impacts these standards, see this infographic and the scoring methodologies for the Secondary Content in the Sciences and Secondary Content in the Social Sciences Standards.)

  • State expectations for elementary teacher preparation in early reading and elementary mathematics

    State context plays virtually no role in our analysis for these two standards. States do generally articulate expectations for what elementary teachers need to know in these subjects, and some states have good tests for them. Nonetheless, we decided to carefully examine the preparation that programs provide candidates without regard to the regulatory framework in which programs were embedded.

    The logic behind taking an approach so different from the one taken with regard to secondary content is simple: Preparation in these subjects is a core responsibility of teacher preparation programs themselves. No liberal arts faculty members can deliver courses in how to teach children how to read. And although elementary math courses can and should be delivered by math faculty, these courses have to be specifically designed with the needs of elementary teachers in mind. A math department at an institution without an elementary teacher preparation program would not offer any courses like the ones elementary teacher candidates need to take.

Because of the limited cooperation from institutions, not all standards factor into the scores for programs. This section explains which standards are applied to which programs and how standard scores and program rankings are reported. Scores on "key standards" are used to develop the base for program rankings; scores on a "booster standard" can move a program up in rankings from this base.

Overall elementary, secondary, and special education program rankings are based only on "key" and "booster" standards, even for the programs for which we were able to score on more standards. We made this decision so that the rankings for any given type of program would be based on scores on the same standards.

Program rankings include weighted scores on individual key standards.60 In elementary program rankings, the weights of scores on the Selection Criteria Standard are heaviest, with scores on the Student Teaching Standard next heaviest, and scores on the Early Reading, Elementary Math, and Elementary Content weighted least but equally.61 In secondary program rankings, the weights of scores on the Selection Criteria Standard are heaviest, followed by Secondary Content in the Sciences and Secondary Content in the Social Sciences, both equally weighted, and then by Student Teaching, and finally by the subject-specific methods course component of the Secondary Methods Standard weighted least.62 In special education program rankings, the Selection Criteria Standard is weighted most heavily, followed by Student Teaching and then Instructional Design; Early Reading, Elementary Math, and Content for Special Education are weighted least but equally.

Program rankings can be increased, or "boosted," by scores on the Classroom Management Standard (for all programs) and the subject-specific practice component of the Secondary Methods Standard (for secondary programs).

When we lacked the clear, adequate data we needed to evaluate a program on a particular standard (in most instances, because the program failed to provide it), we did not score it on the standard. There are, however, instances in which the program did supply the material we requested but a score could not be determined because the materials were not clear. In both cases, the program is removed from the set of programs evaluated on the standard and the score is left blank. In no instance is a program given a score on the basis of whether it did or did not provide data.

For two standards, Early Reading and Elementary Mathematics, a method of imputing scores was developed after extensive fieldwork to ensure that a lack of complete sets of syllabi due to resistance to NCTQ's requests would not preclude a score on these critical standards.63

NCTQ's priority in all of its studies of teacher preparation has been to conduct its evaluations with integrity and to produce reliable results. Because of the scale of the Teacher Prep Review and the vast number of decision points involved in data collection, processing, and analysis, continuing to produce reliable results demanded new mechanisms and safeguards. With the development of a scoring management system component in our database, we have been able to make quality control an integral, ongoing feature of our evaluation.

RevStat: RevStat, a scoring management system that is designed to be an integral part of NCTQ's teacher preparation database, manages a variety of aspects of analysis reliability. Using RevStat, the Teacher Prep Review team tracks each standard's reliability of scores across pairs and teams of analysts at any given time and across various time periods. If reliability issues emerge, the scoring protocols and training are recalibrated appropriately.

To develop RevStat, NCTQ partnered with UPD Consulting, a national expert on education management. NCTQ and UPD modeled RevStat on the same principles as the Baltimore CitiStat and the New York City CompStat processes, which have proven effective in managing institutional performance.

Audit Panel: Although RevStat provides invaluable data on scoring processes, we wanted to ensure that we had the advice of experts who could have the broadest possible vantage point on the reliability of our work. For that reason, we invited a group of eminent education researchers to join an Audit Panel to provide technical assistance, critique our evaluation processes to date, and recommend improvements in subsequent Teacher Prep Reviews. Discussion with the panel has reassured us regarding the utility of the steps we have taken to date to ensure reliability and suggested some refinements we have adopted.

Forum for appeals of scores: While we rigorously ensure accuracy throughout our ratings process, given the thousands of rating decisions analysts make and the frequency of change by teacher prep programs, some rare errors may occur. When programs believe a rating is in error or does not reflect their current practice, they can appeal that rating through our forum process. In advance of the release of new rankings, we will provide programs with an overview sheet that includes their scores and scoring comments for each standard. After the publication of each edition of the Review, we open up an appeals process that extends for several months, through which programs can discuss or contest our findings.

In the appeals process, programs are invited to send in objections to our findings along with supporting evidence.64 We will update programs' standard scores and overall rankings on a regular schedule, and programs' Overview sheets will indicate when new material has been submitted in order to revise a score.

In the first Teacher Prep Review Forum, 49 institutions sent in direct appeals. We also addressed the objections to our findings posted by five institutions on the website of the American Association of Colleges for Teacher Education. In the second Forum, 62 teacher prep programs were involved in the appeals process. The majority of changes, summarized here, were due to programs submitting new material.
