When results from public exams and assessments are published, the first response should, of course, be to congratulate students on their achievements. Every student deserves to enjoy their success; they worked hard; their teachers worked hard. Celebrations are due! Even though the circumstances in the last two years have been radically different to any other before, a student who has done all that’s asked of them has a right to feel that their grades have as much value as anyone else’s. Every student is entitled to believe their grades are legitimate, arrived at through a fair process. I hope most students feel that way and have negotiated the progression paths they’ve earned access to.
However, after that pause for celebration, acknowledging the turbulence that students have endured, it should be perfectly ok to discuss the issues surrounding the nature of the assessments we use to award qualifications and to consider the issue of grade inflation. It’s not some kind of mean-spirited denigration of people’s achievements – it’s a technical discussion about the way we determine standards and the value we give to outcomes.
I also think it’s important not to make it a party political issue or consider it as unique to our country. Grade inflation is discussed in Ireland, Sweden, the US, Canada; it’s discussed in relation to Higher Education as well as GCSEs and A levels. Why? Because it’s an inherent issue in any large-scale assessment system: the mechanisms we use to determine the standards that students have reached and maintain them over time are complex. Our attitudes and beliefs around standards and access to qualifications are complex.
It’s not a question of trust in teachers’ professionalism or knowledge – I see it as a major error when people react against concerns about grade inflation by seeking to defend teachers as if their integrity is under attack. This is so common – and it prevents us from having the technical discussions we need. It’s not about cheating. It’s not ‘nonsense’ – as if we can just sweep it away. As I explore in this post, the trust we need is in systems that teacher operate in, not the teachers themselves.
This year, the outcomes have been determined in a way that locks in ‘benefit of the doubt’; the ‘best of three’ effect has fed into teachers’ judgements so a significant increase in top grades was inevitable, especially when any cohort-wide counterbalancing moderation seems to have been negligible. To some this represents a fairer representation of the standards students have reached compared to the exam-dominant system from 2019 and before. That’s hugely debatable in itself but, in any case, it’s important not to confuse this year’s rise with the steady creep of inflation that can happen even when the assessment format is identical year to year.
Why is grade inflation a problem? It matters if we want the grades students receive to have value and credibility – all the grades, not just the ones at the top. Even people who want to dismiss this issue would have a limit. Let’s say the normal pattern is that 25% of grades are A/A* or 7-9 but then it shifts to 30%. Some people will say the usual ‘testament to everyone’s hard work, trust teachers’ stuff and not want to acknowledge the inflation. They will see it as more students having reached the standard, rather than the standard dropping. But what if it then goes to 35%, then 40, 45, 50. At some point everyone commenting on this would reach a point where they’d think – hang on, that’s rather too many people getting the top grades.
Once people doubt the value of grades, everyone loses. In our system, if we are to have a set of graded outcomes, we need some confidence that a grade means a standard – and like it or not, a bell-curve reference is inherently part of that. In my assessment training I often refer to this curve:
It comes from a trial Maths exam taken before new GCSE rolled out. The point is that the shape of the distribution is a property of the performance of the cohort – some 5000 students who took this test. The 0-100 range is scaled from the original lowest to highest raw scores. This is no conspiracy. The question then is how we use this to determine a set of standards and then award credit for reaching them. The grade boundaries are entirely arbitrary. If we run two sets of assessments with similar large cohorts, and want to align the standards of those assessments, the most secure assumption we have is that the distributions of performance would be similar – so we shift the boundaries to make that happen. We invest in extensive moderation. If we let the number of people achieving the higher grades increase year on year, we can be deluded into thinking standards are rising – ie more students are learning more. In fact, it’s more likely we are letting standards slip because we’re making it easier to get the higher grades. This is such an important distinction and if we conflate, ignore, dismiss…. we have a system that people lose faith in – so they start setting their own extra tests and nobody benefits.
Of course, we need to allow for underlying standards to rise – so it’s not rigid. However the pace of that is likely to be gradual. I remember being told by an Ofqual bigwig that for a whole cohort of Maths GCSE passes at 4+ to rise by 0.7% in one year was ‘pretty generous’; this was in line with evidence from national reference tests that deal with type of problem. That means that rises of 5% or more do not represent standards rising – they mean the test was easier or the grading more generous/lenient – depending on your perspective. This isn’t then about students working hard and getting their rewards – students work hard every year!
I don’t think this should be controversial. What frustrates me is how many people involved in education don’t even seem to understand how the system works in normal times, never mind when we’ve had a two-year aberration where we’re not comparing like with like. I don’t even think ‘inflation’ is the right word for what happened this year – it’s a step-change resulting from using an entirely different system and it means there’s a risk we’ve partially lost track of the underlying standards the grades were originally pegged to. If we kept the teacher assessment system, it would require total recalibration. If we’re going to return to exams next year, we can treat 2020 and 2021 as anomalous years in the trend. However, at some point, we’ll need to reestablish the baseline where people know and understand what constitutes the standards for all the grades and for that we need consistency and minimal change.
Of course, we’re still left with the issue of the ‘forgotten’ third. That’s a big problem. Lower grades have been utterly devalued even though it’s statistically not possible in all likelihood for everyone to ‘pass’ with grade 4 or above. We have the bell curve cage problem. The answer to that lies in finding different ways of measuring success (yes, I mean the National Bacc for example) and investing in the long-run struggle of teaching students more effectively from day one – but it does not lie in trying to force more students higher up the bell-curve by gerrymandering the outcomes. That’s a lose-lose situation.