AI and assessment: Rethinking assessment strategies and supporting students in appropriate use of AI

Original Research

Published on: May 13, 2024

Feedback, marking and assessment

LYNSEY ANNE MEAKIN, SENIOR LECTURER IN EDUCATION, INSTITUTE OF EDUCATION, UNIVERSITY OF DERBY, UK

Traditional assessment methods often fall short in capturing students’ true abilities and fostering authentic learning experiences, and the introduction of AI (artificial intelligence) tools further complicates the task of accurately assessing genuine student learning.

This article briefly explains what generative artificial intelligence is and considers which types of assessment it can support and where it can hinder assessment. It then details how AI can be incorporated into assessments when appropriate, and suggests how assessments can be designed to reduce the likelihood of AI being used when its use is not appropriate.

What is generative AI, what types of assessment can it support and where can it hinder assessment?

Generative artificial intelligence (GenAI) is a natural language-processing artificial intelligence system. It is a machine-learning system that has been trained on a massive dataset of text from the internet, including books, articles and websites (Bessette, 2023). GenAI uses algorithms that draw on patterns learned from this dataset to make predictions about how to string words together, putting one word after another based on statistical probability, much like an enhanced predictive text or the autocomplete function of a search engine (Floridi, 2023).

This ability to generate brand-new text in response to user prompts means that its use can hinder assessment of learning, especially if students are using it to answer an assessment question for them. Equally, though, GenAI can support learning when used legitimately – for example, to help with grammar and spelling or as a search tool to research assignment topics. GenAI can also be a support tool to help students to understand explanations of concepts and even to plan and develop an outline structure or generate ideas for a written assessment.

Considerations for assessment design

To ensure that it is our students completing assessments and demonstrating their learning, assignments and assessments should be constructed in such a way that they are challenging to complete using AI tools or by copying from external sources (Rudolph et al., 2023).

The evolving nature of AI tools and technologies requires open and ongoing communication with students around the permissibility of AI tools to ensure responsible use. It is possible to categorise AI use in assessments into three areas:

AI is not permitted: AI tools may not be used to complete any portion of the assignment.
AI can be used in specific ways: AI tools may be used by students in certain ways, but not in others; this will have to be clarified to students.
AI is permitted: Students are permitted and/or encouraged to use AI tools to support their learning as they complete the assignment (for example, brainstorming, planning, drafting, revising).

These expectations should be clearly communicated so that students know whether AI tools can be used, and if so, how they can be used. Educators should clearly communicate details about student responsibilities, including whether and how citation should be completed. To avoid academic misconduct, students should also be made aware of potential consequences of the misuse of AI. Perkins et al. (2023) have developed an AI Assessment Scale (AIAS), which allows for the integration of GenAI tools into educational assessment by detailing different levels of AI usage in assessments, based on the assessment learning outcomes. Use of the AIAS provides increased clarity and transparency for both students and educators, and offers an approach to assessment that embraces the opportunities of GenAI while recognising that there are situations where such tools may not be pedagogically appropriate or necessary (Perkins et al., 2023).

When it is not appropriate to use AI

Assessments can be developed to reduce the likelihood of AI being used – for example, by avoiding assessment formats that are easily addressed by GenAI, such as essays without personalised application; questions that require closed answers, such as those used in multiple-choice questions or short-answer exam questions that ask students to define, list or reproduce; and assessments requiring information that can be found on the internet.

Students could be asked to write about a highly specific and niche topic, on which GenAI may find it difficult to find relevant information, or students could be tasked with including personal experiences or perspectives in their writing, as it is difficult for AI systems to replicate this (Nowik, 2022).

Assessment design

Emphasise critical thinking. Assignments that require critical thinking, problem-solving and analysis, and projects that require students to apply their knowledge in unique ways, are less likely to be generated by AI. Assignments should be created that foster students’ creative and critical-thinking abilities (Rudolph et al., 2023). For example, assessments could be created that require students to deliver presentations or performances.
Project-based learning, group projects and presentations can encourage collaborative learning and may therefore discourage individual cheating. Tasks can be assigned within projects that require each team member to contribute their unique skills and expertise.
Conducting in-class assessments, both written and oral, reduces opportunities for AI-assisted cheating and can include analysis that draws on class discussions (Mills, 2023). These in-class assessments are formative – or assessment for learning (AfL), the process of gathering evidence through assessment to inform and support next steps for a student’s teaching and learning – enabling educators to measure real-time understanding of the material and to monitor student progress.
Implementing timed exams limits the time available for students to consult external sources or use AI-generated content. Ensure that the time allocated is reasonable for the scope of the assessment.
Open-book exams with a twist include questions that require higher-order thinking or apply concepts in novel ways. This makes it less advantageous for students to rely solely on reference materials.
Oral assessments mean that if students are able to speak intelligently on the topic and defend their position or arguments, then they have demonstrated learning regardless of whether AI has been used or not (Kumar et al., 2023). Oral assessments can test a student’s understanding, gauging their ability to respond to questions and prompts in conversation.
Adaptive testing techniques, in which questions are tailored to a student’s performance, can be used. This makes it difficult for students to share answers, since each test may be unique.
Peer review and evaluation allows students the opportunity to ‘teach-back’ (Sharples, 2022), a communication confirmation method whereby students demonstrate their understanding in speech. The incorporation of peer review and evaluation processes for assignments and projects can help to identify discrepancies in group-work, lead to revisions of work and encourage students to hold each other accountable.

There are several strategies that may make it more difficult for AI technologies to produce satisfactory results – for example, asking students to use sources that require an institutional subscription within the assignment or ensuring that questions are applied more to each student’s context. You could also have staged submission of the final assessment, requiring an outline, research notes and drafts, where students explicitly respond to feedback, or you could have students submit a portfolio of the work that they have done on the way to their final submitted work.

Supporting students in appropriate use of AI

Students need to be made aware of the limitations of using GenAI, including:

Incorrect or inaccurate information and misinformation: GenAI responses may not always be accurate or factually correct, especially when the AI ‘hallucinates’, generating content that cannot be grounded in any of its training data or the source content provided (Fitzpatrick et al., 2023). GenAI cannot filter relevant information or distinguish between credible and less-credible sources, so inaccuracies can result.
Biased, prejudiced or stereotypical responses: Generated responses may replicate historical and societal prejudices and stereotypes that were present in the set of text data used to build the GenAI (Eke, 2023; Fitzpatrick et al., 2023).
No understanding of the context: GenAI will process and generate natural language, but it has no understanding of the content of the queries, the context of the prompts or the physical world (Alvero, 2023).

GenAI is not able to assess the value or accuracy of the information that it provides, meaning that students should be critical of AI-generated content. They should confirm all information, fact-checking and cross-referencing information from multiple sources to ensure informed decision-making (Alvero, 2023; Fitzpatrick et al., 2023). Students could also be tasked with marking or reviewing a piece of AI writing, exploring the limitations of GenAI. Providing a clear rationale for assessment conditions will help students to see the value in responsible uses of AI technologies.

Concluding thoughts

A key purpose of assessment is to provide evidence of student learning. This evidence depends on submitted work accurately representing each student’s contribution. This article has considered how traditional assessments can be redesigned to accommodate the arrival of AI in the educational arena, allowing educators to feel assured that learning is being assessed and that individual students are demonstrating their knowledge and application of that knowledge.

McMurtrie (2022) argues that GenAI tools will become part of everyday writing in some shape or form, in the same way that calculators and computers have become part of maths and science. It is, therefore, very possible that human–AI collaboration will become the norm in assessments, so that we adopt ‘assessment as learning’. The AI Assessment Scale has been included to cater for this potential future of assessments in education. The impact of GenAI provides an opportunity to rethink assessment strategies so that students demonstrate their knowledge, skill and learning, whilst accommodating the potentially prolific use of GenAI.

References

Alvero R (2023) ChatGPT: Rumors of human providers’ demise have been greatly exaggerated. Fertility and Sterility 119(6): 930–931.
Bessette LS (2023) This isn’t another piece on ChatGPT. The National Teaching and Learning Forum 32(2): 11–12.
Eke DO (2023) ChatGPT and the rise of generative AI: Threat to academic integrity? Journal of Responsible Technology 13: 1–4.
Fitzpatrick D, Fox A and Weinstein B (2023) The AI Classroom: The Ultimate Guide to Artificial Intelligence in Education. Beech Grove, IN: TeacherGoals Publishing.
Floridi L (2023) AI as agency without intelligence: On ChatGPT, large language models, and other generative models. Philosophy & Technology 36: 15.
Kumar R, Eaton SE, Mindzak M et al. (2023) Academic integrity and artificial intelligence: An overview. In: Eaton SE (ed) Handbook of Academic Integrity. Cham: Springer, pp. 1583–1596.
McMurtrie B (2022) AI and the future of undergraduate writing. The Chronicle of Higher Education, 13 December, 22. Available at: www.chronicle.com/article/ai-and-the-future-of-undergraduate-writing (accessed 1 April 2024).
Mills A (2023) AI text generators: Sources to stimulate discussion among teachers. Available at: https://docs.google.com/document/d/1V1drRG1XlWTBrEwgGqd-cCySUB12JrcoamB5i16-Ezw/edit#heading=h.sot8caygc8jr (accessed 1 April 2024).
Nowik C (2022) The robots are coming! The robots are coming! Nah, the robots are here. Change is Hard Podcast. Available at: https://christinenowik.substack.com/p/the-robots-are-coming-the-robots#detail (accessed 1 April 2024).
Perkins M, Furze L, Roe J et al. (2023) Navigating the generative AI era: Introducing the AI assessment scale for ethical GenAI assessment. arXiv preprint. Available at: https://arxiv.org/ftp/arxiv/papers/2312/2312.07086.pdf (accessed 1 April 2024).
Rudolph J, Tan S and Tan S (2023) ChatGPT: Bullshit spewer or the end of traditional assessments in higher education? Journal of Applied Learning and Teaching 6(1): 342–363.
Sharples M (2022) Automated essay writing: An AIED opinion. International Journal of Artificial Intelligence in Education 32(4): 1119–1126.

From this issue

Issue 21: Approaches to assessment
