
Opening the Black Box: Can AI Repair the Student–Teacher Feedback Loop?

Dr Michael Smith Exclusive

The rhythm of the further education calendar is punctuated by the recurring ritual of the GCSE English and maths mock exam. Far from being confined to the traditional silence of a sports hall, these assessments frequently spill into classrooms and canteens, transforming everyday spaces into sites of high-stakes practice.

Mock examinations are intended as a diagnostic milestone for GCSE resit students. Yet they often trigger something else entirely: a period of pedagogical stasis. While teachers disappear beneath mountains of scripts, students often wait days, if not weeks, before receiving any meaningful response to their efforts. By the time feedback arrives, the moment has passed. Errors have hardened, motivation has waned, and the learning opportunity has slipped away.

It is in this gap, between a student’s action and a teacher’s response, that the promise of assessment for learning (AfL) is most frequently lost.

The AfL paradox

The concept of the “black box” of learning, famously articulated by Paul Black and Dylan Wiliam in 1998, described a system where inputs and outputs were visible, but the processes in between remained opaque. AfL emerged as a response: using evidence of student learning to adapt teaching in ways that meet learners’ needs.

Today, AfL is widely accepted as a cornerstone of effective teaching practice. Its principles shape initial teacher education, inspection frameworks and professional discourse. And yet, more than two decades on, even its strongest advocates concede that there are very few institutions where AfL is implemented consistently and well. The issue is not a lack of theoretical clarity. It is a problem of enactment.

Why feedback remains so hard to get right

Feedback sits at the heart of AfL and remains one of its most stubborn challenges. Research has long established what effective feedback looks like:

  • It prioritises improvement over judgement, focusing on next steps rather than grades.
  • It requires cognitive effort from the learner, not just labour from the teacher.
  • It is most powerful when it is timely, arriving close to the point of action.

In practice, however, these principles collide with the realities of further education, particularly in GCSE English and maths resit contexts.

Mock examinations do important work. They provide practice under exam conditions, surface learning gaps, inform curriculum planning and support accountability processes. But when delivered at scale, their feedback function is routinely compromised. Grading crowds out evaluation. Actionable guidance becomes generic or retrospective. And marking delays are unavoidable when hundreds, or thousands, of scripts land on teachers’ desks at once. The result is an assessment system that is diagnostic in intent but summative in effect.

A role for AI in restoring responsiveness

It is against this backdrop that artificial intelligence is increasingly being explored across the FE sector. Advances in large language models now make it possible to process substantial volumes of student work rapidly, identify patterns of misunderstanding, and generate targeted, curriculum-aligned feedback. In practice, this can reintroduce one of AfL’s core conditions: responsiveness.

Evidence from early pilots suggests that when students receive immediate, task-level feedback, they are more likely to reattempt questions, address misconceptions and engage in independent improvement. In resit settings, where time is short and confidence often fragile, this immediacy is particularly significant.

Importantly, this is not about removing teachers from assessment. In effective implementations, teachers remain central: setting success criteria, standardising judgements and using emerging insights to adapt teaching in real time. What changes is the distribution of effort. Time previously consumed by repetitive marking can be redirected towards diagnosis, explanation and responsive instruction.

Re-entering the black box

AfL depends on contingent teaching: noticing what students can do, interpreting what it means, and responding accordingly. For too long, the scale of manual assessment in further education has made this ideal difficult to realise consistently.

The opportunity presented by AI-supported assessment lies in reducing the mechanical burden of marking, allowing teachers to re-engage with learning as it unfolds. When feedback becomes timely, actionable and iterative, assessment begins to function as a genuine engine of progress rather than a retrospective audit.

The challenges facing GCSE resit education have never been a failure of theory. They are a failure of feasibility. If new approaches can help teachers step out from behind the marking pile and back into the learning process itself, then perhaps the black box need not remain closed.

Dr Michael Smith is a specialist in educational assessment, an academic researcher and co-founder of Markus. He has worked in further education for over 17 years, most recently as Vice Principal at a London college.

References

Black, P. and Wiliam, D. (1998) Inside the black box: raising standards through classroom assessment. London: School of Education, King’s College London.
