AI and assessment: Rethinking assessment strategies and supporting students in appropriate use of AI

Written by: Lynsey Meakin
7 min read
LYNSEY ANNE MEAKIN, SENIOR LECTURER IN EDUCATION, INSTITUTE OF EDUCATION, UNIVERSITY OF DERBY, UK

Traditional assessment methods often fall short in capturing students’ true abilities and fostering authentic learning experiences, and the introduction of AI (artificial intelligence) tools further complicates the task of accurately assessing genuine student learning.  

This article briefly explains what generative artificial intelligence is and considers where it can support assessment and where it can hinder it. It outlines how AI can be incorporated into assessments when its use is appropriate, and suggests how assessments can be designed to reduce the likelihood of AI being used when it is not.

What is generative AI, what types of assessment can it support and where can it hinder assessment?

Generative artificial intelligence (GenAI) is a natural-language-processing system: a machine-learning model trained on a massive dataset of text from the internet, including books, articles and websites (Bessette, 2023). GenAI uses algorithms that draw on the statistical patterns learned from this data to predict how to string words together, putting one word in front of another based on statistical probability, much like an enhanced predictive text or the autocomplete function of a search engine (Floridi, 2023).
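To make ‘statistical probability’ concrete, the sketch below is a deliberately simple, hypothetical bigram model in Python: it counts which word tends to follow which in a handful of invented sentences and then generates text one most-likely word at a time. Real GenAI systems rely on large neural networks trained on billions of words rather than simple counts, so this is only an analogy for the word-by-word prediction described above.

```python
# Toy illustration of next-word prediction by statistical probability.
# The "corpus" sentences are invented for this example; real GenAI systems
# use neural networks over vast datasets, not simple word counts.
from collections import Counter, defaultdict

corpus = (
    "assessment for learning supports students . "
    "assessment of learning measures students . "
    "assessment for learning informs teaching ."
).split()

# Count which word follows each word (a simple bigram model).
following = defaultdict(Counter)
for current_word, next_word in zip(corpus, corpus[1:]):
    following[current_word][next_word] += 1

def predict_next(word):
    """Return the most frequently observed next word from the toy corpus."""
    candidates = following.get(word)
    return candidates.most_common(1)[0][0] if candidates else "."

# Generate a short continuation, one most-likely word at a time.
word, output = "assessment", ["assessment"]
for _ in range(4):
    word = predict_next(word)
    output.append(word)

print(" ".join(output))  # e.g. "assessment for learning supports students"
```

Even this toy version shows why fluent output is not the same as understanding: the model simply reproduces statistically common word sequences.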

This ability to generate brand-new text in response to user prompts means that its use can hinder the assessment of learning, especially if students use it to answer an assessment question for them. Equally, though, GenAI can support learning when used legitimately – for example, to help with grammar and spelling or as a search tool when researching assignment topics. GenAI can also help students to understand explanations of concepts, to plan and develop an outline structure, or to generate ideas for a written assessment.

Considerations for assessment design

To ensure that it is our students who are completing assessments and demonstrating their learning, assignments and assessments should be constructed in such a way that they are challenging to complete using AI tools or by copying from external sources (Rudolph et al., 2023).

The evolving nature of AI tools and technologies requires open and ongoing communication with students around the permissibility of AI tools to ensure responsible use. It is possible to categorise AI use in assessments into three areas: 

  1. AI is not permitted: AI tools may not be used to complete any portion of the assignment.
  2. AI can be used in specific ways: AI tools may be used in certain ways but not in others; the permitted uses must be made clear to students.
  3. AI is permitted: students are permitted and/or encouraged to use AI tools to support their learning as they complete the assignment (for example, brainstorming, planning, drafting, revising).

 

These expectations should be clearly communicated so that students know whether AI tools can be used, and if so, how they can be used. Educators should clearly communicate details about student responsibilities, including whether and how citation should be completed. To avoid academic misconduct, students should also be made aware of potential consequences of the misuse of AI. Perkins et al. (2023) have developed an AI Assessment Scale (AIAS), which allows for the integration of GenAI tools into educational assessment by detailing different levels of AI usage in assessments, based on the assessment learning outcomes. Use of the AIAS provides increased clarity and transparency for both students and educators, and offers an approach to assessment that embraces the opportunities of GenAI while recognising that there are situations where such tools may not be pedagogically appropriate or necessary (Perkins et al., 2023).

When it is not appropriate to use AI

Assessments can be developed to reduce the likelihood of AI being used – for example, by avoiding assessment formats that are easily addressed by GenAI, such as essays without personalised application; closed questions, such as multiple-choice or short-answer exam questions that ask students to define, list or reproduce; and assessments requiring only information that can readily be found on the internet.

Students could be asked to write about a highly specific or niche topic, on which GenAI may struggle to produce relevant and accurate content, or they could be tasked with including personal experiences or perspectives in their writing, as these are difficult for AI systems to replicate (Nowik, 2022).

Assessment design

  • Emphasise critical thinking. Assignments that require critical thinking, problem-solving and analysis, and projects that require students to apply their knowledge in unique ways, are less likely to be generated by AI. Assignments should be created that foster students’ creative and critical-thinking abilities (Rudolph et al., 2023). For example, assessments could require students to deliver presentations or performances.
  • Project-based learning, group projects and presentations can encourage collaborative learning and may therefore discourage individual cheating. Tasks can be assigned within projects that require each team member to contribute their unique skills and expertise.
  • Conducting in-class assessments, both written and oral, reduces opportunities for AI-assisted cheating and can include analysis that draws on class discussions (Mills, 2023). These in-class assessments are formative – or assessment for learning – enabling educators to measure real-time understanding of the material and to monitor student progress. 
  • Implementing timed exams limits the time available for students to consult external sources or use AI-generated content. Ensure that the time allocated is reasonable for the scope of the assessment.
  • Open-book exams with a twist include questions that require higher-order thinking or apply concepts in novel ways. This makes it less advantageous for students to rely solely on reference materials.
  • Oral assessments mean that if students are able to speak intelligently on the topic and defend their position or arguments, then they have demonstrated learning regardless of whether AI has been used or not (Kumar et al., 2023). Oral assessments can test a student’s understanding, gauging their ability to respond to questions and prompts in conversation. 
  • Adaptive testing techniques should be used where questions are tailored to a student’s performance. This makes it difficult for students to share answers, since each test may be unique.
  • Peer review and evaluation gives students the opportunity to ‘teach-back’ (Sharples, 2022), a communication confirmation method whereby students demonstrate their understanding in speech. Incorporating peer review and evaluation processes for assignments and projects can help to identify discrepancies in group work, lead to revisions of work and encourage students to hold each other accountable.

 

There are several strategies that may make it more difficult for AI technologies to produce satisfactory results – for example, asking students to draw on sources that require an institutional subscription, or ensuring that questions are applied to each student’s own context. You could also stage submission of the final assessment, requiring an outline, research notes and drafts in which students explicitly respond to feedback, or have students submit a portfolio of the work completed on the way to their final submission.

Supporting students in appropriate use of AI

Students need to be made aware of the limitations of using GenAI, including: 

  • Incorrect, inaccurate or misleading information: GenAI responses may not always be accurate or factually correct, especially when the AI ‘hallucinates’, generating content that is not grounded in its training data or in the source content provided (Fitzpatrick et al., 2023). GenAI cannot filter relevant information or distinguish between credible and less credible sources, so inaccuracies can result.
  • Biased, prejudiced or stereotypical responses: Generated responses may replicate historical and societal prejudices and stereotypes that were present in the set of text data used to build the GenAI (Eke, 2023; Fitzpatrick et al., 2023). 
  • No understanding of the context: GenAI will process and generate natural language, but it has no understanding of the content of the queries, the context of the prompts or the physical world (Alvero, 2023).

 

GenAI is not able to assess the value or accuracy of the information that it provides, meaning that students should be critical of AI-generated content. They should verify all information, fact-checking and cross-referencing from multiple sources to ensure informed decision-making (Alvero, 2023; Fitzpatrick et al., 2023). Students could also be tasked with marking or reviewing a piece of AI writing, exploring the limitations of GenAI. Providing a clear rationale for assessment conditions will help students to see the value in responsible uses of AI technologies.

Concluding thoughts 

A key purpose of assessment is to provide evidence of student learning. This evidence depends on submitted work accurately representing each student’s contribution. This article has considered how traditional assessments can be redesigned to accommodate the advent of AI into the educational arena, allowing educators to feel assured that learning is being assessed and that individual students are demonstrating their knowledge and application of that knowledge.

McMurtrie (2022) argues that GenAI tools will become part of everyday writing in some shape or form, in the same way that calculators and computers have become part of maths and science. It is, therefore, very possible that human–AI collaboration will become the norm in assessments, so that we adopt ‘assessment as learning’. The AI Assessment Scale has been included to cater for this potential future of assessments in education. The impact of GenAI provides an opportunity to rethink assessment strategies so that students demonstrate their knowledge, skill and learning, whilst accommodating the potentially prolific use of GenAI.
