A group of UK technologists and academics have demonstrated that a new method of assessment, similar to that used in optometry testing, can improve exam grading accuracy by up to 30% from the current reported average of 56.2%.
The model developed by education technology company Digital Assess, with consultancy from exam research experts Professor Richard Kimbell (Goldsmiths, University of London) and Dr Alastair Pollitt (Cambridge), is based on Adaptive Comparative Judgement (ACJ), and allows examiners to more precisely identify the standard of an exam script. It has been proven to significantly increase marking accuracy in open-ended assessment.
Based on the Law of Comparative Judgement, ACJ exploits the finding that people are better at making comparative, paired judgements than absolute ones. The same logic is used by opticians to accurately determine a patient's prescription. Rather than asking patients to state precisely how much clearer a lens makes their sight, they are instead asked to repeatedly compare one lens against another and say which helps them to see more clearly. After a number of comparisons, an extremely accurate conclusion is reached.
Professor Richard Kimbell explains: “In the current exam system, examiners make an absolute judgement by marking every paper according to a mark scheme. We have shown that asking the same markers to compare one paper to another side-by-side and declare which is better can produce more consistent results. Repeating this process multiple times with multiple assessors enables a mathematical algorithm to produce a reliable ranking of all candidates. Grade boundaries can then be applied against this.”
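The ranking step Professor Kimbell describes can be illustrated with the Bradley-Terry model, a standard statistical approach to turning paired "which is better" judgements into a single ranked list. Digital Assess has not published the exact algorithm behind ACJ, so the function name, the MM update used here, and the sample data are illustrative assumptions only:

```python
# Illustrative sketch: rank scripts from paired judgements with the
# Bradley-Terry model (a common basis for comparative judgement; the
# actual ACJ algorithm may differ).
from collections import defaultdict

def bradley_terry(comparisons, iterations=100):
    """Estimate a 'quality' strength for each script from a list of
    (winner, loser) judgement pairs, then return the scripts ranked
    best-first. Uses the standard minorisation-maximisation update."""
    scripts = {s for pair in comparisons for s in pair}
    strength = {s: 1.0 for s in scripts}
    wins = defaultdict(int)      # wins[i]: judgements script i won
    meetings = defaultdict(int)  # meetings[(i, j)]: times i faced j
    for winner, loser in comparisons:
        wins[winner] += 1
        meetings[(winner, loser)] += 1
        meetings[(loser, winner)] += 1
    for _ in range(iterations):
        new = {}
        for i in scripts:
            # Expected exposure of i, given current strengths.
            denom = sum(n / (strength[i] + strength[j])
                        for (a, j), n in meetings.items() if a == i)
            # Floor keeps never-winning scripts from reaching exactly 0.
            new[i] = max(wins[i] / denom, 1e-9)
        total = sum(new.values())
        strength = {s: v / total for s, v in new.items()}  # normalise
    return sorted(scripts, key=strength.get, reverse=True)
```

With judgements such as A beating B twice, B beating C twice, and A beating C once, the estimated strengths order the scripts A, B, C; grade boundaries could then be applied to the ranked list, as the quote describes.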
ACJ has been proven to significantly improve accuracy in as few as 8-10 sets of paired comparisons. A recent Ofqual report showed that in 2016, 43.8% of all reviewed marks for GCSE, AS and A-level exam scripts were returned with a different mark. Grade changes were more prevalent in subjects where assessment is more subjective.
Soft skills such as ‘critical thinking’ or ‘original interpretation’ are difficult to measure consistently, and mark schemes are less applicable to these than to hard skills such as factual or theoretical knowledge, leaving them open to the discretion of one marker.
Matt Wingfield, Chief Business Development Officer at Digital Assess, explains: “Mark schemes lend themselves well to regurgitating facts and ticking boxes, but assessing soft skills is much harder, as they are inherently more subjective and mean different things to different markers. Assessment that fails to capture soft skills is of little use to employers, who actively seek them when recruiting.”
“It is unrealistic to assume all examiners will form the same judgements, and ACJ takes this requirement away by automatically incorporating lots of expert opinions into the final ranking. It also makes it easier for examiners to clearly see what good work looks like, because they can compare an entire cohort of work, instead of looking at one in isolation and using only a mark scheme as a gauge.”
“The outcome of an eye test is by definition ‘high stakes’ - where people’s health is involved, it needs to be highly accurate - but exam results can determine people’s future, and should be no less accurate. Why not apply the logic of one to the other? The traditional assessment method has been historically unfair, and no doubt we will again see lots of exam appeals following results day this year.”
ACJ has consistently proven its accuracy in tests with awarding bodies and leading institutions worldwide, and is also applicable to coursework and portfolios of work, as well as end-point assessment.