A pair of ears in every lesson? Using AI to give trainee teachers richer feedback

Perspective

Published on: May 18, 2026

MILES BERRY, PROFESSOR OF COMPUTING EDUCATION, UNIVERSITY OF ROEHAMPTON, UK

AI can now generate detailed, evidence-referenced feedback from a transcript of any lesson, but how does this fit with professional formation?

When ChatGPT launched in November 2022, one of the first things that many teachers did was ask it to write a lesson plan. The results were typically disappointing. ‘Write a lesson plan for Year 9 on the periodic table’ produced something generic, often shallow and rarely something that a teacher would use. My concern was what trainee teachers might do with plans such as these.

That early encounter pointed to a tension that runs through AI in initial teacher training: the risk of cognitive outsourcing. If a trainee can generate a lesson plan in seconds, then do they learn the craft of planning? Does the hard slog of constructing a bespoke lesson for a particular class, with particular pupils, bringing particular knowledge and enthusiasms, still matter?

I think that it does. But I also think that there is a role for AI in initial teacher training that is quite different from plan generation and more powerful: using AI not to do the thinking for trainees, but to reflect back on what happened when their thinking met reality, in both the lesson that they planned and the lesson that they taught.

The lesson planning question

The Teachers’ Standards (DfE, 2011) make it clear that trainees must plan and teach well-structured lessons. There is a parallel between downloading a ready-made lesson plan from a publisher and asking a language model to generate one. Both can serve as starting points, but neither constitutes lesson planning as a professional act. Just as we would be reluctant to allow trainees to teach exclusively from others’ plans, we should be cautious about AI-generated plans that short-circuit the thinking that makes planning valuable. This is the sort of cognitive outsourcing that seems detrimental to the development of a trainee’s craft skills.

The research supports this caution. Dornburg and Davin (2024) found that ChatGPT-generated foreign language lesson plans were often of reasonable surface quality but showed troubling variability and embedded historical biases, reflecting outdated pedagogical approaches. Kalenda et al. (2024) found that pre-service teachers’ confidence in ChatGPT as a planning tool decreased once they engaged in careful analysis of its outputs, suggesting that the capacity to evaluate and adapt AI-generated material critically should be a core component of teacher education. Prompt engineering helps: grounding requests in curriculum documents or exam specifications yields better results than a zero-shot prompt. But the generated plan still belongs to the model, and not to the teacher.

The research case for AI feedback on teaching

The research is more encouraging when AI is used not to generate teaching but to analyse it. A growing body of work has examined whether language models and natural language processing tools can provide specific feedback on classroom practice.

Demszky et al. (2025) conducted a pre-registered randomised controlled trial with 224 mathematics and science teachers. Those who received automated feedback on their use of questions that press pupils to explain and reflect, rather than simply recall, increased their use of such questions by 20 per cent compared to a control group. The feedback came from an AI system working from classroom audio; no human observer was required. Where both human mentors and AI provided feedback on the same lessons, the AI tended to be more comprehensive, attending to the whole lesson rather than to salient moments alone. Jacobs et al. (2025) found similar results in their work on automating feedback on classroom discourse patterns, noting that AI could surface patterns in teacher talk that post-lesson conversations rarely reached.

This picture is echoed by Sert et al. (2025), who found that automatic question-detection tools, embedded in teacher education programmes, prompted reflection on classroom interaction when accompanied by structured discussion and mentoring support. AI data, on its own, is not sufficient. Nygren et al. (2025), comparing AI and expert human mentoring in simulations, found that AI feedback was more consistent and broader in scope, while human feedback was better attuned to pedagogical moments and the emotional texture of teaching. The emerging consensus is that AI feedback and human mentoring are complementary rather than competing, and that the most productive use of AI positions it as a prompt for professional conversation, and not a substitute for one. While AI feedback might be more objective, and is often more detailed, human feedback is more likely to be acted on, for both pupils and trainees (Zhang et al., 2026).

AI as critical friend

This is the role that I have been exploring with my PGCE (Postgraduate Certificate in Education) trainees at Roehampton: AI as a critical friend – not just on the lesson plan, but also on the lesson as taught.

We already expect trainees to share their lesson plans with mentors before teaching, typically a day ahead. The mentor reads the plan, offers questions and suggestions, and the trainee refines their thinking. The cognitive work of planning remains theirs. We would not expect the mentor to write the plan.

A trainee who sits with a thoughtful commentary on their lesson – ‘here is where you introduced new vocabulary’, ‘here is where questioning was concentrated on a small number of pupils’, ‘here is one thing to try differently next time’ – is engaged in professional reflection. They did the teaching. The AI has been, in effect, a pair of ears in the room, listening carefully to the whole lesson, and is now offering what a mentor might offer, if mentors had the time.

Grounded in specific statements from the ITTECF (Initial Teacher Training and Early Career Framework; DfE, 2024) and organised against the Teachers’ Standards (DfE, 2011), the feedback can point to a particular moment in the lesson and say: ‘Here is where a little more wait time, as described in ITTECF 4n, might have given more pupils the chance to formulate a response’ or ‘This is a good example of 1h, acknowledging pupil effort and emphasising progress’. The feedback is developmental, evidence-referenced and entirely non-judgemental. Very few mentors have the time to give feedback of this depth and specificity on every lesson.

The system in practice

The tool that I have developed for this is a straightforward web application, designed to work on a phone or laptop browser. A trainee uploads or records audio of their lesson; the app reduces background noise and transcribes it using OpenAI’s Whisper. The original audio is then discarded immediately. The transcript is shown to the trainee, who can edit it, removing personally identifiable information or anything raising safeguarding concerns, before submitting it for feedback. No account or login is required. Nothing is stored.
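The transcribe-then-discard step and the redaction pass described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code behind the tool: `transcribe_then_discard`, `redact_names` and `fake_transcribe` are hypothetical names, the transcriber is passed in as a plain function so that no particular speech-to-text library is assumed, and the automatic name replacement is an added convenience, whereas the article describes trainees editing the transcript by hand.

```python
import os
import re
import tempfile
from typing import Callable, Iterable


def transcribe_then_discard(audio_path: str, transcribe: Callable[[str], str]) -> str:
    """Return the transcript and delete the audio file, even if transcription fails."""
    try:
        return transcribe(audio_path)
    finally:
        # The audio is never kept: remove it whether or not transcription succeeded.
        if os.path.exists(audio_path):
            os.remove(audio_path)


def redact_names(transcript: str, names: Iterable[str], placeholder: str = "[pupil]") -> str:
    """Replace whole-word occurrences of the given names, ignoring case."""
    for name in names:
        pattern = r"\b" + re.escape(name) + r"\b"
        transcript = re.sub(pattern, placeholder, transcript, flags=re.IGNORECASE)
    return transcript


def fake_transcribe(path: str) -> str:
    # Hypothetical stand-in for a real speech-to-text call such as Whisper.
    return "Well done, Amira. Can you explain that to Tom?"


# Example run: a temporary file stands in for a lesson recording.
fd, demo_path = tempfile.mkstemp(suffix=".wav")
os.close(fd)
demo_transcript = transcribe_then_discard(demo_path, fake_transcribe)
demo_redacted = redact_names(demo_transcript, ["Amira", "Tom"])
print(demo_redacted)
```

Putting the deletion in a `finally` clause means that the recording is removed even when transcription raises an error, which is one way of honouring the promise that nothing is stored.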

The feedback is generated by prompting Anthropic’s Claude Haiku model, chosen for its speed and cost-effectiveness and for Anthropic’s approach to responsible AI development (Anthropic, 2026), using a structured knowledge base drawn from the ITTECF and Teachers’ Standards. The response opens with a summary of the lesson and its phases, identifies two or three strengths with specific ITTECF references, names areas for development with concrete suggestions and closes with a single priority next step. The whole thing is clearly caveated: it is AI-generated, unreviewed by any human and intended as the basis of a discussion with the mentor, and not a replacement for that conversation.
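One way of assembling such a prompt is sketched below. It is an assumption-laden illustration rather than the prompt that the tool actually uses: `FrameworkStatement`, `KNOWLEDGE_BASE` and `build_feedback_prompt` are hypothetical names, the two framework entries paraphrase examples given in this article and do not quote the ITTECF, and the call to the model itself is deliberately left out.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class FrameworkStatement:
    code: str     # statement reference, e.g. "4n"
    summary: str  # short paraphrase, not the official framework wording


# Hypothetical two-entry knowledge base; a real one would cover the whole framework.
KNOWLEDGE_BASE = (
    FrameworkStatement("1h", "acknowledging pupil effort and emphasising progress"),
    FrameworkStatement("4n", "allowing wait time so that more pupils can formulate a response"),
)

# The four-part response structure described in the article.
RESPONSE_SECTIONS = (
    "A summary of the lesson and its phases",
    "Two or three strengths, each with a specific ITTECF reference",
    "Areas for development, each with a concrete suggestion",
    "A single priority next step",
)

CAVEAT = (
    "State clearly that the feedback is AI-generated, unreviewed by any human "
    "and intended as the basis of a discussion with the mentor."
)


def build_feedback_prompt(
    transcript: str,
    statements: Sequence[FrameworkStatement] = KNOWLEDGE_BASE,
) -> str:
    """Combine framework statements, response structure, caveat and transcript."""
    framework = "\n".join(f"- ITTECF {s.code}: {s.summary}" for s in statements)
    sections = "\n".join(
        f"{i}. {title}" for i, title in enumerate(RESPONSE_SECTIONS, start=1)
    )
    return (
        "You are a critical friend to a trainee teacher. "
        "Be developmental and non-judgemental.\n\n"
        f"Framework statements you may cite:\n{framework}\n\n"
        f"Structure the feedback as:\n{sections}\n\n"
        f"{CAVEAT}\n\n"
        f"Lesson transcript:\n{transcript}"
    )


demo_prompt = build_feedback_prompt("Teacher: What do you notice about group 1? ...")
print(demo_prompt)
```

Keeping the framework statements in a separate data structure, rather than writing them into the prompt text, would make it straightforward to update the knowledge base when the framework is revised.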

There is no access to transcripts or to the feedback generated, other than for the trainee at the time of use. Qualitative responses from trainees have been positive: the feedback is described consistently as more detailed than mentor feedback, and usefully specific. In my own trials, running transcripts of lessons that I have observed against the same system, the AI’s commentary has rarely contradicted the human observer’s judgement, but has regularly gone further, attending to aspects of the lesson that post-lesson conversations did not reach.

Limitations and open questions

Working from a transcript, the AI cannot see the room. Behaviour management incidents that a mentor would notice in seconds may not appear in the audio at all. Non-verbal communication, whiteboard work and the physical arrangement of the classroom are all absent. However, in most classroom subjects, a transcript is a rich record, as much teaching happens through the spoken word. Regardless, feedback built on a transcript should never be mistaken for a full lesson observation.

There is also a subtler concern. One thing that I value about the post-lesson conversation – the mentor and trainee with a cup of coffee, talking through what happened – is precisely that it requires the trainee to stop and think. The cognitive work of reflecting on a lesson is itself part of professional formation. There is a risk that reading a detailed AI commentary becomes a substitute for that thinking, rather than a prompt for it. The feedback is most valuable, I suspect, when it is the starting point for a professional conversation, and not the end of one. I worry about outsourcing the reflection itself, even to a machine that may, in some respects, be better at listening than any of us.

These are questions that the field is only beginning to address (Demszky et al., 2021; Wang and Demszky, 2023). What I have so far is only proof-of-concept work; assurance that it does no harm and robust evidence of its effectiveness are needed before it can be recommended at scale.

Conclusion

AI is not going to replace the craft of teaching or the work needed in learning to teach. What it may be able to do is ensure that every lesson that a trainee teacher teaches is also, in a meaningful sense, attended to – not in the threatening, high-stakes way of a formal observation, but quietly, non-judgementally, as a critical friend who listens carefully and has some constructive thoughts. The emerging research suggests that this kind of feedback, grounded in good criteria and offered without grades or grades-adjacent language, can improve practice. Whether it does so for trainee teachers, in the conditions of initial teacher education in England, is a question worth pursuing carefully.

The examples of AI use and specific tools in this article are for context only. They do not imply endorsement or recommendation of any particular tool or approach by the Department for Education or the Chartered College of Teaching and any views stated are those of the individual. Any use of AI also needs to be carefully planned, and what is appropriate in one setting may not be elsewhere. You should always follow the DfE’s Generative AI In Education policy position and product safety standards in addition to aligning any AI use with the DfE’s latest Keeping Children Safe in Education guidance. You can also find teacher and leader toolkits on gov.uk.

References

Anthropic (2026) Claude’s constitution: Our vision for Claude’s character. Available at: www.anthropic.com/constitution (accessed 27 March 2026).
Demszky D, Liu J, Hill HC et al. (2025) Automated feedback improves teachers’ questioning quality in brick-and-mortar classrooms: Opportunities for further enhancement. Computers and Education 227(1): 105183.
Demszky D, Liu J, Mancenido Z et al. (2021) Measuring conversational uptake: A case study on student–teacher interactions. In: Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, online, August 2021, pp. 1638–1653. Association for Computational Linguistics.
Department for Education (DfE) (2011) Teachers’ Standards. Available at: https://assets.publishing.service.gov.uk/media/61b73d6c8fa8f50384489c9a/Teachers__Standards_Dec_2021.pdf (accessed 31 March 2026).
Department for Education (DfE) (2024) Initial Teacher Training and Early Career Framework. Available at: https://assets.publishing.service.gov.uk/media/661d24ac08c3be25cfbd3e61/Initial_Teacher_Training_and_Early_Career_Framework.pdf (accessed 31 March 2026).
Dornburg A and Davin KJ (2024) ChatGPT in foreign language lesson plan creation: Trends, variability, and historical biases. ReCALL 37(3). Available at: www.cambridge.org/core/journals/recall/article/chatgpt-in-foreign-language-lesson-plan-creation-trends-variability-and-historical-biases/0AEBE95C6587E6ACB0D401B6FC6F5385 (accessed 31 March 2026).
Jacobs J, Suresh A, Booth BM et al. (2025) Automating feedback from recorded instructional observations: Using AI to detect and support dialogic teaching. In: Kelly S (ed) Research Handbook on Classroom Observation. Cheltenham: Edward Elgar Publishing, pp. 341–365.
Kalenda PJ, Rath L, Heidt MA et al. (2024) Pre-service teacher perceptions of ChatGPT for lesson plan generation. Journal of Educational Technology Systems 55(3). DOI: 10.1177/00472395241301388.
Nygren T, Samuelsson M, Hansson P-O et al. (2025) AI versus human feedback in mixed reality simulations: Comparing LLM and expert mentoring in preservice teacher education on controversial issues. International Journal of Artificial Intelligence in Education 35: 2856–2888.
Sert O, Aşık A and Miller P (2025) Partnering with AI in teacher education? Using an automatic question detection tool to reflect on classroom interaction. Journal of Research on Technology in Education. DOI: 10.1080/15391523.2025.2504355.
Wang R and Demszky D (2023) Is ChatGPT a good teacher coach? Measuring zero-shot performance for scoring and providing actionable insights on classroom instruction. arXiv. DOI: 10.48550/arXiv.2306.03090.
Zhang C, Hu M, Wu W et al. (2026) AI versus teacher feedback in developing pre-service teachers’ teaching self-efficacy: A quasi-experimental study in simulated teaching. Teaching and Teacher Education 172: 105355.

From this issue

Issue 27: Innovative and creative pedagogy
