Opportunities and challenges in assessing teaching: Lessons from Germany

Written by: Stuart Kime

Teachers should be able to meet the demands of the classroom; in the language of teaching quality research, they should have the required ‘competences’ to be effective. But what are these competences, and can they be assessed accurately? This article explores some possible approaches and challenges in defining these competences and in assessing against them through a review of a project with mathematics teachers in Germany known as the COACTIV study (Kunter et al., 2013).

It is worth noting that both teacher competence and its assessment are emotive issues. It should never be assumed that all the competences required for effective teaching can be assessed reliably: not everything can be measured. Nonetheless, teaching makes a difference to student learning, and certain key characteristics are found repeatedly in studies of teaching effectiveness and, as such, deserve consideration.

This article is written with the following caution: ‘The more any quantitative social indicator is used for social decision-making, the more subject it will be to corruption pressures and the more apt it will be to distort and corrupt the social processes it is intended to monitor’ (Campbell, 1976). Moreover, it acknowledges the fact that a review of a single study can only ever provide food for thought, rather than definitive conclusions.

Is it possible to assess core aspects of teacher competence?

The answer is a guarded ‘yes’, and the caution comes from a simple truth: the type and timing of an assessment are rightly decided by its purpose. If we are to create measures of teaching effectiveness that are valid and reliable for a particular purpose, we must begin with a clear understanding of why we are measuring it in the first place. And that is generally the hard part.

However, evidence suggests it is possible to assess aspects of teacher competence. In the project considered in this article, researchers attempted to do just this, so it is worth understanding what they did, the associated limitations and some recommendations for how gaps might be filled.

Competences for what?

The core business of teaching may be summarised as the preparation, implementation and evaluation of effective classroom instruction. In the literature, instructional quality is most often divided into three core dimensions (Kime, 2017; Burdsal and Harrison, 2008; Abrami and d’Apollonia, 1991):

  • Cognitive activation
  • Supportive climate
  • Classroom management

The COACTIV study found that cognitive activation of students by means of content-rich learning environments requires sophisticated pedagogical content knowledge, subject content knowledge, and general pedagogical knowledge.

Pedagogical content knowledge (PCK) and content knowledge (CK)

PCK is the knowledge of ‘making content comprehensible’ (Shulman, 1986). According to Shulman’s widely accepted explanation, this includes knowledge of the most regularly taught topics in a subject area, useful forms of representation of these ideas, and powerful analogies and explanations that make the subject comprehensible to others. Shulman added that teachers should understand students’ subject-related thinking, what makes the learning of topics easy or difficult, common conceptions and misconceptions, and strategies for tackling these. Teachers should also possess knowledge of task design (Kunter et al., 2013), the sequencing of curriculum content, and its relationship to topics taught in other subject areas.

Content knowledge (CK) – knowledge of the facts and concepts of a particular domain – is also key, as is a robust understanding of how these are organised. A teacher must know not simply that a concept, idea or principle exists; they must also know why (Shulman, 1986).

The assessment of PCK and CK is best informed by subject specialists. In some fields (such as mathematics) instruments for this purpose exist (Hill et al., 2008), but in others they do not. To fill this gap, agreement is needed on the generic skills of PCK and CK that are considered important for a classroom teacher, before instruments are developed to help assess the constructs identified.

Pedagogical and psychological knowledge (PPK)

PPK can be defined as the knowledge required to create and optimise teaching and learning situations. Kunter et al. (2013) suggest its components include knowledge of classroom assessment, teaching methods and classroom management, as well as students’ learning processes, characteristics and the challenges these may present in the classroom.

Standards-based assessment instruments exist for PPK, such as the Praxis Series (ETS, USA), but no tests of their psychometric properties have been published yet, so they are something of an unknown quantity. Results from previous PPK instruments have been mixed, so this is an area in which more work must be done. The COACTIV project, however, broke new ground by creating an assessment with multiple item formats: multiple-choice, short-answer and video-based items. It is suggested that this could form a useful basis for further work in this area.

Professional beliefs and professional motivation

Beliefs structure the way that people interact with the world and those around them. Definitions of ‘beliefs’ vary, but Kunter et al. (2013) adopt Voss and colleagues’ concept of ‘understandings and assumptions about phenomena or objects of the world that are felt to be true, have both implicit and explicit aspects, and influence people’s interactions with the world’. Teachers’ beliefs can include those about learning, teaching, the subject, learning to teach and the self (Calderhead, 1996).

Motivation also matters. Teaching demands focus, resilience, stamina, readiness to engage in new activities and the ability to maximise opportunities to learn. It is plausible that willingness to cope with the demands of teaching is a necessary but insufficient component of effective teaching, just as cognitive attributes alone are not enough. Research also indicates that enthusiasm – for the subject taught, for teaching in general and for interacting with students – is an important aspect of high-quality instruction.

The most efficient way of capturing information about teachers’ professional beliefs and motivation is to use a self-report measure (e.g. a Likert scale), which seeks agreement with statements about how teachers believe teaching and learning work effectively for student outcomes. However, reliance on self-report measures brings problems, notably the potential for teachers to game the assessment, especially in contexts where its purpose (for instance, a high-stakes assessment) unintentionally incentivises such behaviour.

Professional self-regulation

Professional self-regulation is the ability of teachers to manage their personal resources in their professional context. Those teachers who have robust self-regulatory capacity are highly engaged in their work yet maintain a sensible and healthy detachment from work; they are able to conserve their personal resources (Klusmann in Kunter et al., 2013). Hobfoll (1989) suggests that people strive to protect, conserve and expand their resources (time, self-esteem, energy) and when these are threatened, or when investment of personal resources does not lead to a desired outcome, psychological stress is experienced.

One suggestion for assessing self-regulation is through an instrument such as the AVEM occupational stress and coping inventory (Schaarschmidt and Fischer, 1996). This was developed and validated in Germany (there is an English version) and used in the COACTIV project, and it offers one means of capturing information about this aspect of teacher competence.

Limitations to the COACTIV model

Although the terminology varies, teacher competence frameworks internationally frequently adopt three broad domains: knowledge, practice and values. The competences covered within the COACTIV framework encompass many areas of teachers’ professional knowledge and values. However, the classroom interactions typically covered under professional practice, which are closely aligned with the core dimensions indicated by the evidence on effective teaching, require further evidence (from sources such as student perception surveys) to yield usable knowledge on specific dimensions of effective instruction (Fauth et al., 2014).

In contrast to the COACTIV model, a central premise of the CLASS conceptual framework (Pianta and Hamre, 2009) is that the observation of classroom interactions – including behaviour management, quality of feedback and language modelling – should form the core of assessing teaching quality, whether for formative or summative purposes. The authors propose that ‘teachers’ behavioral [sic] interactions with students can be (a) assessed observationally using standardized protocols, (b) analyzed [sic] systematically with regard to sources of error, (c) validated for predicting student learning, and (d) changed (improved) as a function of specific and aligned supports provided to teachers; exposure to such supports is predictive of greater student learning gains’.


Ultimately, a ‘multiple measures’ approach (adopting several well-developed instruments) may help to ensure the validity and dependability of inferences made about a teacher’s competence on whatever dimensions are identified as important within a society. Berk (2005) suggests multiple sources of evidence of teaching effectiveness which may be used:

  • Student, peer, alumni, employer and administrator ratings
  • Self-evaluation
  • Videos
  • Student interviews
  • Teaching scholarship
  • Teaching awards
  • Learning outcome measures
  • Teaching portfolio

Different measures provide different perspectives (of the same target area in the best cases – classroom management, for instance) and increase the potential for more reliable assessment judgements to be made, but only if the inferences intended to be drawn are clearly defined and the assessment mode and timing are fit for purpose.



Abrami PC and d’Apollonia S (1991) Multidimensional students’ evaluations of teaching effectiveness – generalizability of “N=1” research: comment on Marsh (1991). Journal of Educational Psychology 83(3): 411–415.

Berk RA (2005) Survey of 12 strategies to measure teaching effectiveness. International Journal of Teaching and Learning in Higher Education 17(1): 48–62.

Burdsal C and Harrison P (2008) Further evidence supporting the validity of both a multidimensional profile and an overall evaluation of teaching effectiveness. Assessment & Evaluation in Higher Education 33(5): 567–576.

Calderhead J (1996) Teachers: Beliefs and knowledge. In Berliner DC and Calfee RC (eds) Handbook of Educational Psychology. New York: Macmillan, pp. 709–725.

Campbell DT (1976) Assessing the impact of planned social change. Available at: (accessed 21 August 2017).

Fauth B, Decristan J, Rieser S, Klieme E and Büttner G (2014) Student ratings of teaching quality in primary school: Dimensions and prediction of student outcomes. Learning and Instruction 29: 1–9.

Hill HC, Blunk ML, Charalambous CY, Lewis JM, Phelps GC, Sleep L and Ball DL (2008) Mathematical knowledge for teaching and the mathematical quality of instruction: an exploratory study. Cognition and Instruction 26(4): 430–511.

Hobfoll SE (1989) Conservation of resources: a new attempt at conceptualizing stress. American Psychologist 44(3): 513.

Kime S (2017) Student evaluation of teaching: Can it raise attainment in secondary schools? PhD Thesis, Durham University, UK.

Kunter M, Baumert J, Blum W, Klusmann U, Krauss S and Neubrand M (2013) Cognitive Activation in the Mathematics Classroom and Professional Competence of Teachers: Results from the COACTIV Project. New York: Springer Science + Business Media.

Pianta RC and Hamre BK (2009) Conceptualization, measurement, and improvement of classroom processes: standardized observation can leverage capacity. Educational Researcher 38(2): 109–119.

Schaarschmidt U and Fischer A (1996) AVEM: Arbeitsbezogene Verhaltens- und Erlebensmuster. Frankfurt: Swets Test Services.

Shulman LS (1986) Those who understand: Knowledge growth in teaching. Educational Researcher 15(2): 4–14.
