Ensuring the correct level in end-point assessment design

Jacqui Molkenthin

One of the most common areas of difficulty I find when working with End-point Assessment Organisations (EpAOs) is in the development (design and testing) of their tools and materials. Some EpAOs lack the confidence or ability to design assessment materials appropriate to the level of the apprenticeship assessment plan they intend to deliver.

When I discuss ‘levels’ with EpAOs, I get a mixed response, usually one of the following:

  • the level has already been set by the Institute for Apprenticeships and Technical Education (IfATE) and trailblazer group, so if the EpAO follows the assessment plan the assessment will be at the correct level;
  • Ofqual condition E9 (qualification levels) does not apply to EpAOs;
  • the assurance of accuracy in the level rests with the External Quality Assurance (EQA) provider, whose remit during the readiness checks, as detailed in tables 3 and 4 of the EQA framework, is to check that materials are appropriate to the standard and level of the apprenticeship;
  • we don’t know how to.

So, let me tackle each of those responses in turn:

1) Level set by the IfATE and Trailblazer group

Apprenticeship standards and assessment plans are, rightly, designed by industry experts through the trailblazer groups, using the IfATE guidance and associated Ofqual level descriptors for qualifications and occupational competence. However, the inclusion of assessment experts within the trailblazer groups is not mandated, which means that the groups do not have guaranteed expertise in what levels mean in practice, or in how to apply levels consistently across the design, descriptors and language of the standard and assessment plan.

The other factor to remember is that assessment plans are written at a high level: for example, they do not set the questions and answers for tests or professional discussions / interviews, or the detail of the tasks for observations or projects. Under these conditions, simply following the assessment plan does not guarantee the correct level in end-point assessment design. I have worked with over 50 assessment plans during the past 6 years, and have seen a huge variation in their quality and detail, which has a direct knock-on effect on the EpAOs who use them to design their assessment tools and materials.

2) Ofqual Condition E9 does not apply

It is correct that Ofqual Condition E9 does not apply to EpAOs. However, the EPA Qualification Level Condition EPA4.1 refers specifically to levels and must be adhered to when designing assessment tools and materials. One could argue that Ofqual Condition E9 actually applies to the IfATE and trailblazer design groups (assigning, reviewing and managing level). This could explain the rationale behind the design of IfATE’s EQA framework and the details of the IfATE’s consideration of levels for assessment delivered by EpAOs.

3) Role of the EQA provider

EQA providers are there to check compliance with requirements, including setting assessments at an appropriate level; it is not their role to show an EpAO how to design and set assessments at the right level. IfATE’s EQA framework establishes how apprenticeship end-point assessments must be externally quality assured. EQA providers must test the readiness of the EpAO to deliver assessments; conduct appropriate monitoring; provide feedback and an annual summary report on each EpAO, per standard, to IfATE’s Quality Assurance Committee; help EpAOs to be compliant; and support consistency and continuous improvement, escalating issues to IfATE if mitigation has not achieved the expected outcome.

4) Knowledge and understanding of how to design to the correct level

Just like trailblazer design groups, many EpAOs have been born out of industry expertise and demand, and are not traditional awarding organisations (of the 325 EpAOs, only 51 are Ofqual recognised awarding organisations). This means that they may not have been exposed to the concept and descriptors of levels when designing assessments, resulting in a lack of understanding of how to apply levels to assessment design.

So why is it so important to understand levels?

End-point assessment must be robust, valid, reliable, comparable and consistent. If an EpAO designs and tests their assessment tools and materials based on the assessment plan alone, without a clear understanding of levels, there is a significant risk of inconsistency in assessment design which in turn impacts the validity of assessment.

Now you understand the risk, I’ll introduce you to some of the methods for understanding and embedding the correct level in the design of assessment tools. To do this I have teamed up with David Jenkins-Handy of Agora Business Consulting, who has 20 years’ experience in qualification and assessment design. We will follow this up with further articles and training.

One of the best-known models for setting learning outcomes is Bloom’s Taxonomy. Replicated here are its categories for the cognitive levels, alongside some common verbs used for each category:

  • Remembering: label, recite, recognise, define, recall, list, quote
  • Understanding: describe, annotate, search, interpret, compare, summarise, explain, restate
  • Applying: demonstrate, present, apply, solve, calculate
  • Analysing: plan, organise, associate, classify, analyse, critique, prioritise
  • Evaluating: evaluate, assess, critique, justify, compare and contrast, relate, recommend, validate
  • Creating: develop, solve, modify, generate, negotiate, formulate, create, compose
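As a rough illustration only, the verb lists above can be turned into a simple lookup that suggests which cognitive category a draft question might sit in, based on its opening verb. This is a minimal sketch, not an established tool: the function names and the sample questions are hypothetical, and an opening verb is only a cue. The actual level of a question also depends on its context and on the response expected, so anything ambiguous or unmatched is flagged for human review.

```python
import re
from collections import Counter

# Verb lists based on the categories above. Some verbs sit under more than
# one category (e.g. "solve", "critique"), so a question can have several
# candidate categories.
BLOOM_VERBS = {
    "Remembering": ["label", "recite", "recognise", "define", "recall", "list", "quote"],
    "Understanding": ["describe", "annotate", "search", "interpret", "compare",
                      "summarise", "explain", "restate"],
    "Applying": ["demonstrate", "present", "apply", "solve", "calculate"],
    "Analysing": ["plan", "organise", "associate", "classify", "analyse",
                  "critique", "prioritise"],
    "Evaluating": ["evaluate", "assess", "critique", "justify", "compare and contrast",
                   "relate", "recommend", "validate"],
    "Creating": ["develop", "solve", "modify", "generate", "negotiate",
                 "formulate", "create", "compose"],
}


def candidate_categories(question: str) -> list[str]:
    """Return the categories whose verb opens the question.

    The longest matching verb phrase wins, so "Compare and contrast ..."
    is treated as "compare and contrast" rather than "compare".
    """
    text = question.strip().lower()
    best_len = 0
    matches: list[str] = []
    for category, verbs in BLOOM_VERBS.items():
        for verb in verbs:
            # \b stops "plan" from matching "planning", and so on.
            if re.match(rf"{re.escape(verb)}\b", text):
                if len(verb) > best_len:
                    best_len, matches = len(verb), [category]
                elif len(verb) == best_len and category not in matches:
                    matches.append(category)
    return matches


def tally_by_category(questions: list[str]) -> Counter:
    """Count questions per category; ambiguous or unmatched ones need review."""
    counts: Counter = Counter()
    for question in questions:
        categories = candidate_categories(question)
        key = categories[0] if len(categories) == 1 else "Needs review"
        counts[key] += 1
    return counts


# Hypothetical draft questions, for illustration only.
draft_questions = [
    "Define the term 'risk assessment'.",
    "Explain why a vehicle check is carried out before each shift.",
    "Calculate the stopping distance at the stated speed.",
    "Identify the reports you must complete.",  # "identify" is not in the lists
]
print(tally_by_category(draft_questions))
```

A tally like this gives a quick first view of the balance of a draft question bank across cognitive categories, but it is no substitute for an assessment expert reviewing each question and its expected answer against the level descriptors.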

Using Bloom’s Taxonomy helps assessment designers decide how many questions in an examination should sit at each cognitive level; for instance, how many recall/remember questions versus how many application questions. Translating this model into end-point assessment helps us to understand what it means for different types of assessment method, for example:

Multiple Choice Questions (MCQs)

  • At level 3 the focus of MCQ tests/examinations concerns the apprentice being given opportunities to demonstrate knowledge and understanding relevant to their occupation.
  • At level 4 the focus of MCQ tests/examinations shifts to knowledge within a context (so, for example, what health and safety legislation/regulation applies in a specific scenario) or how a particular activity should be carried out under given circumstances (so, a question might be based on what would you do if…).


  • At level 3 an Assessor should be able to find evidence that an apprentice exhibits sound knowledge and has a good level of understanding and can suggest ways to apply knowledge effectively.
  • At level 4 an Assessor should be able to find evidence that an apprentice can articulate in detail how to apply knowledge and understanding (so, what they know and what they do are closely linked in the execution of tasks). They should also be familiar with at least one analytical method to gather and scrutinise information that facilitates the production of evidence for more senior individuals to base their decisions on (so, accuracy, validity and authenticity become important at this level).

Professional discussion

  • At level 3 the discussion would allow the apprentice to express in some detail how they go about their job, the processes carried out, the procedures applied and the specific working practices that are involved.
  • At level 4 the apprentice would be expected not only to know the processes, procedures and working practices in detail, but also to explain clearly the methods and techniques applied in practice, what works well and what works less well in a given context, and how this shapes their ability to produce the evidence or outputs necessary for success in their role.

To aid understanding further, here is one example of why getting the level right is so important in assessment design to ensure reliability, validity and consistency:

In the passenger transport driver level 2 apprenticeship (ST0338), there are two parts to the end-point assessment: an observation and a professional review. Skill S27 is assessed via the professional review and requires the apprentice to “Prepare and submit documents, reports and logs containing performance, incident and technical information”.

The assessment plan gives further detail on what the pass criteria look like, but unfortunately it does not provide any real exemplification of the skill detailed in the standard (pass = “Able to prepare and submit documentation containing performance, incident and technical information”). If you applied Bloom’s Taxonomy to Ofqual’s level descriptors, what could a professional review question look like?

  • List the reports you must complete at the start/end of your shift.
  • Identify the reports you must complete, and describe the reason you must complete them, at the start/end of your shift.
  • Identify how reports completed at the start/end of your shift contribute to the management of risks.

In broad terms, the first question would be a level 1 question, the second a level 2 question, and the third a level 3 question. But it is important to remember that it is not just the question that must be set at the right level: the EpAO must also set out, within the assessor guidance, the responses expected of learners at that level. If the EpAO designed and used a question at the wrong level, and/or set the expectation of the answer at the wrong level, then the assessment would not be valid or fair, risking adverse effects on the apprentice and reputational damage to the EpAO and to apprenticeships.

The example provided above relates to the theory of design, but there are also practical methods of ensuring the correct level of assessment and assessment judgement, and thus the validity and reliability of assessment. Examples include: (a) inter-rater reliability, where the scoring is tested to ensure that the assessment judgements, as well as the questions themselves, are set at the right level; and (b) parallel-forms reliability, where two tests covering the same knowledge, skills and behaviours (KSBs) are developed and their results compared. Another article will be written to explain these in more detail.
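To make the two checks above a little more concrete, here is a minimal sketch of how each might be quantified. The choice of statistics is an assumption on our part rather than anything prescribed by an assessment plan or EQA framework: Cohen's kappa is one common measure of agreement between two assessors, and a Pearson correlation is one common way of comparing scores on two parallel test forms. All grades and scores shown are invented for illustration.

```python
from collections import Counter
from math import sqrt


def cohens_kappa(rater_a: list[str], rater_b: list[str]) -> float:
    """Agreement between two assessors grading the same apprentices,
    corrected for the agreement expected by chance."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both assessors must grade the same, non-empty set")
    n = len(rater_a)
    # Proportion of apprentices given the same grade by both assessors.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement, from how often each assessor awards each grade.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[grade] * freq_b[grade] for grade in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both assessors awarded one identical grade throughout
    return (observed - expected) / (1 - expected)


def pearson_r(scores_x: list[float], scores_y: list[float]) -> float:
    """Correlation between the same apprentices' scores on two test forms."""
    n = len(scores_x)
    if n != len(scores_y) or n < 2:
        raise ValueError("Need paired scores for at least two apprentices")
    mean_x, mean_y = sum(scores_x) / n, sum(scores_y) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(scores_x, scores_y))
    var_x = sum((x - mean_x) ** 2 for x in scores_x)
    var_y = sum((y - mean_y) ** 2 for y in scores_y)
    if var_x == 0 or var_y == 0:
        raise ValueError("Scores on each form must vary")
    return cov / sqrt(var_x * var_y)


# Hypothetical data: two assessors grading the same ten professional reviews.
assessor_1 = ["pass", "pass", "fail", "pass", "fail",
              "pass", "pass", "fail", "pass", "pass"]
assessor_2 = ["pass", "pass", "fail", "pass", "pass",
              "pass", "pass", "fail", "pass", "fail"]
print(f"Inter-rater agreement (kappa): {cohens_kappa(assessor_1, assessor_2):.2f}")

# Hypothetical data: six apprentices' scores on two forms of the same test.
form_a = [14, 18, 11, 20, 16, 9]
form_b = [15, 17, 12, 19, 14, 10]
print(f"Parallel-forms correlation: {pearson_r(form_a, form_b):.2f}")
```

A kappa or correlation well below 1 would not in itself show which assessor or which test form is pitched at the wrong level, but it would signal that the assessor guidance or the test design needs a closer look.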

We hope that this has provided a useful introduction. You may also have realised, when reading this article, that the concept of levels is equally relevant to apprenticeship training providers when designing their apprenticeship programme / curriculum based on the apprenticeship standard. We will therefore be developing and delivering training and further articles for both EpAOs and training providers on levels during the spring and summer of 2021.

Jacqui Molkenthin, JEML Consulting

Published in Exclusive to FE News