6.3.13 University Policy and Guidance on Scaling by Boards of Examiners at the Assessment Level
1. This document sets out University policy and associated guidance on scaling at assessment level by boards of examiners (approved by Senate in February 2012). The policy articulates a transparent, but flexible, approach that recognises the differences in assessment methods across the University.
What is scaling?
2. Scaling is the adjustment of marks for an entire cohort carried out on an assessment item so that the marks better reflect the achievement of the students as defined by the University Mark Descriptors. The need to scale typically arises from a problem with an assessment resulting in student outcomes that do not map onto the University Mark Descriptors. In addition, the requirement for scaling may arise from an optional part of an assessment where one group of students appears to have been disadvantaged simply by their choice of option. In both cases, the outcomes of an assessment are deemed not to reflect accurately what other sources of evidence would show to be an expected level of student achievement.
Deciding when to apply scaling
3. Scaling of marks should only be done in exceptional circumstances. There is no specific expectation that scaling should be done, and University policy does not mandate an approach to formulate new marks. However, the policy mandates that whenever scaling is applied to an assessment it must always maintain the ranked position of each student within a specific assignment. Where scaling is applied to an optional part of an assessment, the ranked position of each student within a treatment set should be maintained (a treatment set referring to all those students who selected a specific option in an assignment that is being scaled).
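As an illustration of the rank requirement in paragraph 3, the following minimal sketch checks whether a proposed set of scaled marks keeps every student in the same ranked position. The function name `preserves_rank` and the sample marks are assumptions made for this example, and the check takes a strict reading of the requirement: tied students must remain tied and no two students may swap or merge positions. Where only an optional part is scaled, the check would be run separately on each treatment set.

```python
from itertools import combinations


def preserves_rank(original, scaled):
    """Return True if no pair of students changes relative order after scaling.

    `original` and `scaled` are lists of marks in the same student order.
    This is a hypothetical helper, not a prescribed University method.
    """
    for (o1, s1), (o2, s2) in combinations(zip(original, scaled), 2):
        # Sign of the difference before and after scaling: -1, 0 or 1.
        before = (o1 > o2) - (o1 < o2)
        after = (s1 > s2) - (s1 < s2)
        if before != after:
            return False
    return True


original_marks = [35, 48, 52, 61]   # made-up data
uniform_uplift = [45, 58, 62, 71]   # every mark raised by 10
reordered = [45, 62, 58, 71]        # second and third students swapped

ok_uplift = preserves_rank(original_marks, uniform_uplift)
ok_reordered = preserves_rank(original_marks, reordered)
```

In this sketch the uniform uplift passes the check and the reordered marks fail it.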
4. Assessment tasks should be designed so that they map onto the University Mark Descriptors. Assuming this is achieved, students' outcomes on an assessment task will accurately reflect their performance according to the anticipated outcomes of that module.
5. Thus, scaling should only be required when there are acknowledged problems in the assessment process. For instance, scaling may be needed where an error or ambiguity in an examination question is discovered or an item of coursework turns out to be harder or easier than intended.
6. In such cases these issues affect an entire cohort (everyone who sat that particular assignment) and therefore action should be applied to anyone who submitted that assignment. When action is taken it is important to apply the same treatment process to all students.
7. Scaling is therefore fundamentally different from the routine adjustment of marks. The process of moderation, which may find addition errors on scripts, and double marking, which requires negotiation between staff to resolve differences between individuals' marks, are not examples of scaling.
8. Since the need to scale is an acknowledgement of an assessment’s failure to map marks to the University Mark Descriptors, the approaches used to re-map the marks should be discussed and clearly documented in examination board minutes.
Examples of when it may be appropriate to scale
9. Typical mark ranges vary across disciplines, so it is not practicable to define precise institutional guidelines. However, a module’s mean will generally fall within certain limits. These should be roughly comparable across modules on which the same students are registered. There are, however, explainable differences between modules that might result in significantly different module means. Examples include cases where students have engaged with a module and have therefore performed exceptionally well in an assessment, where the students registered on the module are unrepresentative of the cohort of a particular programme, or where poor results are potentially explained by other factors such as poor attendance.
10. On rare occasions, it might be necessary to consider scaling marks down. For instance, a new member of staff working with qualitative evaluation may have misjudged an academic level of study. Such changes are unlikely to be necessary with quantitative / template marking since such issues should have been resolved through standard shadowing / mentoring approaches.
11. Scaling may be considered when:
a) there is a significant, known and clearly identifiable issue with an assessment such as an error or ambiguity; or
b) the range of marks significantly fails to match student performance, for instance failing to fit onto the University Mark Descriptors, which might be evidenced by one or more of the following:
- An atypical mean, distribution (i.e. unusual patterns of high or low marks) or overall mark spread
- The range of marks is not in line with that which would be expected from past performance on this module
- The range of marks is not in line with that which has been achieved by the same students registered on other modules at that level
- The number of fails is not in line with that which has been achieved by the same students registered on other modules at that level
- The mark profile is not that which would be expected from students’ past performance on this module
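The indicators listed under b) are comparisons of summary figures. As a sketch only, the example below summarises a run of marks by mean, spread and number of fails so that an assessment can be set alongside the marks the same students achieved on other modules at that level. The function name `mark_profile`, the pass mark of 40 and the sample marks are all assumptions made for this illustration; the policy sets no numerical thresholds, and any judgement on whether the gap is significant remains with the board of examiners.

```python
from statistics import mean, pstdev


def mark_profile(marks, pass_mark=40):
    """Summarise a run of marks: mean, spread and number of fails.

    `pass_mark` is an assumed value for illustration only.
    """
    return {
        "mean": mean(marks),
        "spread": pstdev(marks),  # population standard deviation
        "fails": sum(1 for m in marks if m < pass_mark),
    }


# Made-up marks for the same six students.
this_assessment = [22, 31, 35, 38, 41, 44]
other_modules = [48, 55, 58, 62, 66, 71]

current = mark_profile(this_assessment)
reference = mark_profile(other_modules)

# A positive gap means the assessment under review has the lower mean.
mean_gap = reference["mean"] - current["mean"]
```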
12. It should be noted that the above criteria are not defined as a means to require a board of examiners to scale. Rather, they are guidelines as to when it might be appropriate to consider scaling. Any final decision as to whether or not to scale should always focus on whether there is a significant misalignment of the student outcomes with the University Mark Descriptors.
Outcomes of scaling
13. Typically, scaling is used to increase the spread of students across the University Mark Descriptors (see Figure 1) or to re-align a high or low average attainment (see Figure 2). The exact approach adopted to achieve this depends very much on the nature of the issue with the assessment; however, in all cases the exam board should be assured that the resulting range of marks provides an accurate mapping to the University Mark Descriptors.
Figure 1: Scaling to use the full range of attainment
Figure 2: Scaling to realign Mark Descriptors
14. To understand how scaling may be used, consider the following scenario. An examination paper is set requiring a calculation that took students longer to complete than expected. This resulted in all but a few students failing to complete the paper. In this scenario, the assessment’s mark profile is likely to be that represented in Figure 2 (left curve). The module team may wish, given the problems with the paper, to scale attainment upwards (i.e. increase the module average). Since module teams are required to map marks to the University Mark Descriptors, they should use a range of scripts (i.e. those from the top, middle and bottom ranges of the assessment rank) and come to a conclusion as to an appropriate re-mapping of the marks so that the outcomes accurately represent student performance. In the above example, the team may reason that a proportion of the paper was not accessible to the students due to time constraints and therefore re-assess outcomes on the basis of the percentage of the paper it was reasonable for students to have completed within the allotted time. Please note, however, that in this example any student who achieved full marks prior to the re-mapping would be disadvantaged by scaling, as their excellent performance cannot be distinguished from that of students who have been scaled up to the maximum. Consequently, it is important that scaling is used only by exception and that module teams endeavour to resolve any issues in future years.
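The scenario in paragraph 14 can be sketched in code. The example below is one possible re-mapping, not a prescribed method: it assumes the module team has judged that 80% of the paper could reasonably be completed in the time allowed, and re-expresses each raw mark as a percentage of that accessible portion, capped at the maximum mark. The function name `rescale_to_accessible`, the 80% figure and the sample marks are assumptions made for this illustration.

```python
def rescale_to_accessible(raw_marks, accessible_percent, max_mark=100):
    """Re-express marks as a percentage of the accessible part of the paper.

    `accessible_percent` is the share of the paper (0-100) that the module
    team judges students could reasonably have completed. Marks are capped
    at `max_mark`. Hypothetical helper for illustration only.
    """
    if not 0 < accessible_percent <= 100:
        raise ValueError("accessible_percent must be in the range (0, 100]")
    return [
        min(max_mark, round(m * 100 / accessible_percent, 1))
        for m in raw_marks
    ]


raw = [28, 40, 52, 80, 100]                   # made-up raw marks out of 100
scaled = rescale_to_accessible(raw, 80)

# The cap creates a tie at the top: the students on 80 and 100 both end on 100.
top_two_tied = scaled[-1] == scaled[-2]
```

Here the lower marks are lifted in proportion, but the two strongest students can no longer be told apart after scaling, which is the disadvantage noted at the end of paragraph 14.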
Examples of when not to scale
15. Scaling is difficult to do accurately when a cohort is small, i.e. fewer than 15 students. This is because statistical comparisons are unlikely to be valid. In such cases all scripts should ideally be re-moderated/remarked, but it is recognised that for certain assessment types (such as multiple-choice questions) scaling may be the only means of changing the distribution of marks. In this case scaling may be used, but only if the issue is deemed to be significant and re-moderation/remarking would not resolve it.
Scaling at various assessment levels
16. If scaling is to take place it should be applied to a run of marks for a specific assessment (e.g. an item of coursework or single examination). Normally it should not be applied to marks resulting from a collection of assignments (i.e. a module mark). It is also acceptable to scale parts of an assessment where students are required to select to respond to optional questions.
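Where only an optional part of an assessment is scaled, the adjustment applies to one treatment set and leaves the other students' marks untouched. The sketch below assumes, purely for illustration, that the module team has agreed a fixed uplift for students who chose option B; the function name `scale_treatment_set`, the record layout, the uplift of 8 marks and the sample data are all assumptions and not part of the policy.

```python
def scale_treatment_set(records, option, uplift, max_mark=100):
    """Add `uplift` marks to every student who chose `option`.

    Students who chose a different option are returned unchanged, and the
    input list is not modified. Hypothetical helper for illustration only.
    """
    scaled = []
    for record in records:
        new_record = dict(record)  # copy so the original marks are preserved
        if record["option"] == option:
            new_record["mark"] = min(max_mark, record["mark"] + uplift)
        scaled.append(new_record)
    return scaled


# Made-up marks for the optional part of an assessment.
records = [
    {"student": "S1", "option": "A", "mark": 62},
    {"student": "S2", "option": "B", "mark": 41},
    {"student": "S3", "option": "B", "mark": 55},
    {"student": "S4", "option": "A", "mark": 70},
]

scaled_records = scale_treatment_set(records, option="B", uplift=8)
scaled_marks = [r["mark"] for r in scaled_records]
```

A fixed uplift keeps the order of students within the treatment set, provided no mark reaches the cap.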
Timing of scaling
17. Scaling should take place prior to the final board of examiners’ meeting, as regulations do not permit marks to be changed once they have been confirmed by the Board.
18. When considering scaling it is important to remember that, in accordance with the University Principles of Assessment, marks are an important form of feedback to students on their progress.
19. Ideally, scaling should be applied before assessment marks are returned to students, but only once the appropriate quality procedures have been completed. A long delay in returning feedback to students on their course is undesirable, so it will sometimes be necessary to release provisional marks to students before scaling has taken place. Furthermore, it is recognised that assessment outcomes sometimes balance over a module (e.g. a module may have one assessment where the mean mark is high and another on which the mean mark is lower). Any marks returned to students should always include the statement that marks are provisional until approved by the BoE. However, any changes made to marks through the application of scaling must at some point be communicated to students.
20. As noted above, the need for scaling is a clear indication of an issue with an assessment, so where such cases occur it is anticipated that some form of investigation will be carried out to mitigate the issue in future years.
Scaling and the relationship to discretion
21. Scaling is not the only approach available to Boards to consider the impact of varying module averages on student classification. It is recognised that members of the Board may express concerns regarding the advantage of students registered on modules with high module averages compared to those registered on modules with low averages when these modules are defined as optional. An alternative to scaling may be to consider these cases as a criterion for applying discretion. For instance, any student at a borderline whose performance may have been affected by a low module average may be considered for promotion.
Procedure for the Application of Scaling
22. Staff who intend to scale should:
a) check that one or more of the conditions in paragraph 11 above applies;
b) ensure that the chair of the board of examiners in consultation with the module coordinator agrees that the initial assessment marks do not fairly reflect student performance and that left unchanged they inappropriately represent student performance;
c) ensure that, where the assessment forms a significant part of the module (25% or greater), the external examiner has been consulted and agrees the proposed re-mapping of marks;
d) ensure that any new mapping of marks appropriately relates to the University Mark Descriptors;
e) document any decisions made and ensure these are reported to the board of examiners and noted in the board’s minutes;
f) notify students of the reasons for scaling and the approach used to scale their marks;
g) identify how the issue with the assessment is going to be mitigated in future years to ensure that scaling is not routinely required.