Teacher development: Measuring what matters

Perspective Article

Written by: Chris Larvin, Jenny Griffiths and Luke Bocock

Published on: January 31, 2023

Professional learning


Chris Larvin, Research Specialist, Teach First, UK

Jenny Griffiths, Knowledge & Research Manager, Teach First, UK

Luke Bocock, Head of Research & Learning, Teach First, UK

Recent research suggests that the impact of high-quality teacher development on pupil outcomes can be equivalent to the difference between being taught by a teacher with 10 years of experience and being taught by a new graduate (EPI, 2020). Claims about the impact of professional development are abundant, yet they are rarely scrutinised with respect to their validity and reliability. Anecdotes and testimonials from past participants, while an effective marketing tool, can be highly subjective and unreliable (Guskey, 2009). Similarly, traditional approaches to measuring and evaluating the impact of programmes generally rely heavily on self-reported impact. So how can we know whether or not a teacher development programme is effective?

Measuring teacher development

Most sources of information on the impact of professional development on teachers and their pupils rely on snapshots of teachers’ performance – for example, classroom observations and quantitative measures of pupils’ outcomes through summative assessment. Both of these measures are problematic. Classroom observations, frequently used to evaluate performance for the award of QTS or performance management, tell us little about development and come with demonstrable concerns over reliability and validity (Coe et al., 2014; Mihaly et al., 2013; Strong et al., 2011). As a measure of teacher performance, pupil outcomes fail to control for out-of-school contextual factors or to account for the varied impact of multiple teachers over time (Cambridge CEM, 2019; Shakeshaft et al., 2013). Studies have also failed to demonstrate the impact of classroom observation on pupil outcomes (EEF, 2017).

As a result, neither of these approaches is particularly helpful for evaluating teacher development courses, especially at scale. Hybrid teacher development programmes have also made gauging impact more complex. Metrics generated by learning management systems for online units of study can give insight into teachers’ experience of a programme, but tell us little about the complexity of translating teacher development content into classroom practice.

For example:

Measures of engagement, such as attendance, are poor proxies for what teachers may have learnt
Teachers’ interest and participation in their learning can provide insight into their motivation for learning but are subjective and may have no impact in the classroom
Satisfaction and other factors not related to the programme’s effectiveness are inadequate measures for evaluating the impact of a programme, and surveys can be subject to selection bias
Teacher knowledge is complex, and while recall of curriculum content is valuable, as a measure it fails to account for the application of that knowledge and so must be validated with other measures of learning (Singhghutaura, 2017)
A teacher’s demonstration of competence in applying a skill during a training episode is insufficient evidence that they can recall and apply it in their classroom later

Acknowledging the complexity of teacher development and the difficulty of evaluating its impact on pupil outcomes, it is apparent that the evaluation process requires triangulation of multiple sources of evidence. Evaluation of teacher development programmes should therefore consider the intended development of teachers’ knowledge, competence and decision-making and what they can apply in their classroom. These are attainable measures that can support a greater understanding of the effective mechanisms of teachers’ professional growth by programme designers and evaluators.

Developing new tools

Evaluating large-scale teacher development programmes poses a particular challenge in ensuring high levels of validity. To combat this challenge, Teach First and researchers at the University of York, led by Professor Robert Klassen, have been developing three assessment tools to use with teacher development programmes: knowledge checks to evaluate the development of teachers’ knowledge; bespoke self-beliefs inventories to track shifts in self-efficacy; and scenario-based learning (SBL) to explore decision-making. While we recognise the limitations of each approach, the combination is designed to help us better understand the effectiveness of our programmes and the development of the teachers participating in them.

To understand changes in teachers’ knowledge of key aspects of the curriculum, robust knowledge checks are integrated at the start and end of a learning sequence, serving both formative and summative purposes. The automatically graded multiple-choice quizzes provide feedback to move teachers forward in their learning, while ensuring a level of challenge that continues to motivate teachers to engage with their online learning. Rather than simplistic recall questions, items have been developed that reflect greater depths of knowledge, such as requiring teachers to evaluate or assess the most appropriate options. Teachers must possess in-depth knowledge and a degree of discriminating judgement to select the correct option from a list that includes plausible distractors. As shown in Figure 1, teachers’ aggregate performances can provide insight into their starting points and development across a whole year of a two-year programme. The tool also signposts areas of the curriculum where teachers do not retain key learning and where programme improvements are needed.

Figure 1: Changes in mean pre- and post-scores across six modules of the Teach First Early Career Framework programme (n = 2,500 to 5,000+)
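To make the aggregation behind a chart like Figure 1 concrete, the following is a minimal sketch of how mean pre- and post-scores per module might be computed. The data, the module labels and the function name `mean_change_by_module` are hypothetical illustrations, not Teach First data or the authors' actual analysis.

```python
from statistics import mean

# Hypothetical (module, pre-score %, post-score %) records, one per teacher
# attempt. These values are invented for illustration only.
records = [
    ("Module 1", 50, 70),
    ("Module 1", 60, 80),
    ("Module 2", 40, 55),
    ("Module 2", 70, 75),
]


def mean_change_by_module(rows):
    """Return {module: (mean_pre, mean_post, mean_change)} for paired scores."""
    grouped = {}
    for module, pre, post in rows:
        grouped.setdefault(module, []).append((pre, post))

    summary = {}
    for module, pairs in grouped.items():
        pre_mean = mean(pre for pre, _ in pairs)
        post_mean = mean(post for _, post in pairs)
        summary[module] = (pre_mean, post_mean, post_mean - pre_mean)
    return summary


summary = mean_change_by_module(records)
for module, (pre_mean, post_mean, change) in summary.items():
    print(f"{module}: pre={pre_mean} post={post_mean} change={change}")
```

A summary of this kind describes the cohort rather than any individual teacher, and a positive change on its own does not separate learning from familiarity with the quiz items.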

The teacher self-beliefs inventory seeks to understand changes in teachers’ beliefs about their capabilities throughout a programme. Teachers reflect and record judgements against a scale of teacher self-efficacy, a measure of their perceived capabilities in specific areas of their teaching (Bandura, 1986). While there are existing teacher self-efficacy scales (e.g. Skaalvik and Skaalvik, 2007; Tschannen-Moran and Woolfolk Hoy, 2001), we sought to develop a novel scale that incorporates contemporary conceptions of effective teaching. Our scale was informed by professional frameworks and the structure of Evidence Based Education’s ‘Great Teaching Toolkit’ (Coe et al., 2020). Self-efficacy is of particular interest, given its influence on teacher development and its established empirical relationships with teachers’ practices, enthusiasm and commitment (Klassen and Tze, 2014). The results from this tool are analysed against demographic and group information, such as teaching phase or subject, to understand the differences in perceived capabilities and to inform subsequent programme improvements.

Teacher development requires not only that teachers know more and believe that they are effective, but also that they make successful decisions in the classroom. SBL (sometimes called case-based learning or near-world simulation) is a promising area of development in teacher education, whereby programme members engage with realistic scenarios of critical incidents related to a specific area of teaching (Klassen et al., 2021). As a component within a module, a teacher is presented with a complex classroom scenario and asked to evaluate the appropriateness of several options to take in response. A panel of expert teachers has previously rated how appropriate each option is in the classroom, and each option generates additional feedback to the teacher. In addition to providing insight into teachers’ decision-making and how this may shift throughout a programme, this tool represents an integrated reflection-feedback cycle that enables teachers to think deeply about an area of their practice, gain insight into the experts’ reasoning and re-evaluate their own perspective. This tool is built into the programme at strategic points to provide opportunities for programme members to monitor their professional development and build on these experiences in future SBL components.
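One way to picture how a teacher's responses to a scenario could be compared with the expert panel's ratings is sketched below. The article does not describe the scoring rule used, so everything here is an assumption made for illustration: the 1 to 4 appropriateness scale, the option labels, the function name `sbl_agreement` and the distance-based agreement score.

```python
def sbl_agreement(teacher_ratings, expert_ratings, scale_min=1, scale_max=4):
    """Compare a teacher's ratings of response options with expert ratings.

    Both arguments map an option label to a rating on the same scale.
    Returns (overall, per_option), where each value lies between 0
    (maximally distant from the experts) and 1 (identical to the experts).
    The scoring rule is an illustrative assumption, not the published method.
    """
    if teacher_ratings.keys() != expert_ratings.keys():
        raise ValueError("teacher and expert ratings must cover the same options")

    span = scale_max - scale_min
    per_option = {
        option: 1 - abs(teacher_ratings[option] - expert_ratings[option]) / span
        for option in expert_ratings
    }
    overall = sum(per_option.values()) / len(per_option)
    return overall, per_option


# Hypothetical scenario with three response options (invented values).
expert = {"A": 4, "B": 1, "C": 2}
teacher = {"A": 4, "B": 2, "C": 2}

overall, per_option = sbl_agreement(teacher, expert)
print(round(overall, 3), per_option)
```

Per-option values like these could be used to decide which expert feedback to show, while the overall value could be tracked across SBL components to see whether a cohort's judgements move closer to the experts' over time.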

From data to conclusions

When drawing conclusions from these tools, we must acknowledge their limitations. For example, part of the improvement in teachers’ performance shown in Figure 1 may reflect a practice effect arising from familiarity with quiz questions. The self-report nature of the self-beliefs inventory may introduce social desirability bias, as teachers report levels of confidence that they feel would be viewed favourably. Therefore, inferences drawn from these tools are made at cohort and subgroup levels rather than at the level of the individual. This enables us to analyse potential biases relating to, for example, gender or ethnicity. Findings can then be incorporated into the iterative development, refinement and validation cycle of programmes. Further explanatory qualitative evaluation can also be carried out, such as on the new NPQ for ‘Leading Teacher Development’, where focus group discussions were used to help understand whether considerable self-efficacy growth in ‘professional development’ was due to programme design or expertise gained through practical leadership experience.

Early indications are that these tools provide far greater insight into the impact of our teacher development programmes, with statistical correlations supporting the principle of triangulation. They will contribute to an increased understanding of how teachers develop across programmes throughout differing stages of their careers. Ultimately, we hope that this work will pave the way for ITT and teacher development providers – schools, trainees and mentors – to have a more objective, granular understanding of what trainees have learned and what they can do, and therefore enable them to better identify further development needs.
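As an illustration of what a correlation between two of these measures involves, the sketch below computes a Pearson coefficient between knowledge gains and self-efficacy changes. The article does not state which statistics were used, so the choice of Pearson's r, the variable names and all values are assumptions for illustration only.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least two values")

    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a measure does not vary")
    return cov / sqrt(var_x * var_y)


# Hypothetical per-teacher values (invented): change in knowledge-check score
# (percentage points) and change in self-efficacy rating (scale points).
knowledge_gain = [5, 10, 15, 20, 25]
self_efficacy_change = [0.1, 0.3, 0.2, 0.5, 0.6]

r = pearson(knowledge_gain, self_efficacy_change)
print(f"r = {r:.2f}")
```

A positive coefficient would indicate that the two measures tend to move together across the cohort. It would not show that either measure is valid on its own, which is why such results are read alongside the other sources of evidence.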

Bandura A (1986) The explanatory and predictive scope of self-efficacy theory. Journal of Social and Clinical Psychology 4(3): 359.
Cambridge CEM (2019) Measuring progress in education. In: CEMblog. Available at: www.cem.org/blog/measuring-progress-in-education (accessed 11 November 2022).
Coe R, Aloisi C, Higgins S et al. (2014) What makes great teaching? Review of the underpinning research. Sutton Trust. Available at: www.suttontrust.com/wp-content/uploads/2014/10/What-Makes-Great-Teaching-REPORT.pdf (accessed 11 November 2022).
Coe R, Rauch CJ, Kime S et al. (2020) Great Teaching Toolkit: Evidence review. Evidence Based Education. Available at: www.cambridgeinternational.org/Images/584543-great-teaching-toolkit-evidence-review.pdf (accessed 11 November 2022).
Education Endowment Foundation (EEF) (2017) Teacher observation: Evaluation report and executive summary. Available at: https://educationendowmentfoundation.org.uk/projects-and-evaluation/projects/teacher-observation (accessed 11 November 2022).
Education Policy Institute (EPI) (2020) Evidence review: The effects of high-quality professional development on teachers and students. Available at: https://epi.org.uk/publications-and-research/effects-high-quality-professional-development (accessed 11 November 2022).
Guskey TR (2009) Closing the knowledge gap on effective professional development. Educational Horizons 87(4): 224–233.
Klassen RM, Rushby JV, Maxwell L et al. (2021) The development and testing of an online scenario-based learning activity to prepare preservice teachers for teaching placements. Teaching and Teacher Education 104: 103385.
Klassen RM and Tze VM (2014) Teachers’ self-efficacy, personality, and teaching effectiveness: A meta-analysis. Educational Research Review 12: 59–76.
Mihaly K, McCaffrey DF, Staiger DO et al. (2013) A composite estimator of effective teaching. MET Project research paper 1–51. Available at: https://usprogram.gatesfoundation.org/-/media/dataimport/resources/pdf/2016/12/met-composite-estimator-of-effective-teaching-research-paper.pdf (accessed 11 November 2022).
Shakeshaft NG, Trzaskowski M, McMillan A et al. (2013) Strong genetic influence on a UK nationwide test of educational achievement at the end of compulsory education at age 16. PLoS ONE 8(12): e80341.
Singhghutaura D (2017) Making validation clearer for teachers. Impact 1. Available at: https://my.chartered.college/impact_article/making-validation-clearer-for-classroom-teachers (accessed 11 November 2022).
Skaalvik EM and Skaalvik S (2007) Dimensions of teacher self-efficacy and relations with strain factors, perceived collective teacher efficacy, and teacher burnout. Journal of Educational Psychology 99(3): 611–625.
Strong M, Gargani J and Hacifazlioğlu Ö (2011) Do we know a successful teacher when we see one? Experiments in the identification of effective teachers. Journal of Teacher Education 62(4): 367–382.
Tschannen-Moran M and Hoy AW (2001) Teacher efficacy: Capturing an elusive construct. Teaching and Teacher Education 17(7): 783–805.

From this issue

Issue 17: Teacher effectiveness and teacher development
