Some recent conversations have made me want to return to this theme. I really hope we can build some momentum around this issue. Here goes:
I reckon that in 50 years’ time, we will look back at the current Ofsted-grading era as one of the big educational blackspots of history. Serious educationalists and policy makers will laugh in knowing horror (much as they do now about VAK learning styles) at the extraordinary folly of a defunct inspection regime that involved sending a tiny team of people to schools they’d never been to before for a day or two to evaluate them against a massively long list of criteria and give them an overall one-word judgement. All of this while also projecting a national illusion that these judgements made by different people were fair, accurate, reliable and consistent across time and across the nation. And all of that alongside the delusion that this actually made for an ever-improving education system. Ho ho.
But, for now, when we’re in the thick of the not-so-funny reality of it, let’s explore reasons for ditching the grades:
1. They are a ludicrous over-simplification
School A: Good. School B: Requires Improvement. School C: Outstanding. It’s extraordinary really. However much you fudge it or dress it up, the effect of this system is to suggest that any School B must be worse than any School A; that the Bs require improvement where a School A does not; a School A is Good – just not as good as any School C. School Cs are definitely better than School Bs. They are outstanding.
The rationale seems to be that parents, who MUST HAVE CHOICE, need a simple system to understand (because they’re a bit basic and love a good rating system – see also films and restaurants…). But this is massively misleading and patronising at the same time. Aside from the ugly reality of the choice issue, the truth is that schools are just too complicated to sum up in this way. There is always in-school variation, fluctuations over time and a long list of strengths and areas for development. There is also the big factor of context: some schools could hardly fail to be Outstanding – everything is in their favour; in other schools, where there are many more students down the other end of the attainment bell-curve, it will always be a struggle to even scrape into Good.
The list of lines of enquiry in the Ofsted framework is massive – take a look: learning, teaching, assessment, leadership, safeguarding, SEND, behaviour, curriculum… in lots of detail across departments and year groups. Isn’t it just CRAZY to think we can sum all this up in a single judgement? It’s no less ridiculous than suggesting that Leeds is Good but Coventry ‘Requires Improvement’ – as if we could or would ever make those judgements or give them any official status.
2. Inspection is fundamentally too subjective and unreliable
I could cite numerous examples of neighbouring sets of schools that I’ve known quite well where their grade did not tell the school’s story – the Good school arguably more effective in what it did than the Outstanding school, for example. To date, there have been no reliability trials in secondary schools to test whether different teams would arrive at similar judgements or would interpret observations and data in similar ways. I would argue that the whole process is massively flawed: largely an expensive exercise in confirmation bias and groupthink.
I know of numerous cases – recent ones – where inspectors tell Heads that they were ‘on the cusp’ of Good, but still RI, or ‘Good with Outstanding Features’ or that, on balance they had ‘just crept into Outstanding’ – these spurious cliff-edge decisions turning more or less on the general hunch of the lead inspector, sometimes after a bit of a tussle over whether one subset of data meant they would go up or down. That’s how it works. There is just no way that these judgements can be consistent or that the borderline cases that fall one way or the other could be shown to be objectively secure.
Then there is the validity of all the constituent processes: looking at books, talking to tiny samples of students and observing lessons. I observe lots of lessons every week and I know that it makes a big difference whether you know what the typical outcomes are for a particular group or teacher. If you know a teacher to be very successful – or the opposite – it helps to contextualise the activities and interactions in a room. But inspectors do not know the outcomes for a given class when they observe – and assume special powers in their capacity to ‘see learning’ as they zoom round flicking through a few books and having a few rushed chats with a few students.
10-minute walk-through snapshots are massively flawed; doing 30 flawed snapshots in a day doesn’t make each bit any less flawed. I’ve been with a Lead Inspector who made sniffy comments about lessons from teachers who secured superb outcomes… she (quite literally) didn’t know what she was talking about; she was guessing, projecting her bias – quite blatantly. That will happen a lot. And how many books do you need to see to judge standards? What might be a fair and reliable sample size? Inspections of big schools don’t get anywhere near the level required. Inspection is arguably a statistical disaster in terms of valid sampling. Sigh… I could go on. Inspection of schools by strangers involves too much guessing. Guessing with consequences.
3. The public interest illusion: schools needing the most help are damaged even more.
I’ve discussed the consequences of grading with several Heads recently. In schools judged to be RI, it’s quite common for the Head to agree with many of the actual findings in the report – especially when they’ve had a good inspector inviting open dialogue. There might not be any dispute about the areas that need improvement – but the problems come with the whole status of the judgement. One Head accepted the agenda for improvement but was bitter that the nearby school had a Good, not RI, when the data and various other indicators were identical. It had consequences. Other Heads said their RI had immediate consequences for student recruitment and teacher recruitment, fuelled by the typical shallow, inflammatory reporting in the local press.
In a totally direct and obvious way, the lower school grades have the effect of shaming schools and their leaders. This makes it harder for them to do their work, not easier. How is this in the public interest? It just isn’t. Especially given the reliability issues. People are being sold a lie – the false certainty of a super-crude overall judgement. What kind of system puts its school leaders’ heads on a stick for public vilification in this way – in the name of standards? It’s madness. Isn’t it?
4. The hideous hubris of Outstanding – and the albatross effect.
I’ll admit to finding the public big-ups around Outstanding judgements irritating. In particular I hate the over-used claim that ‘it’s testament to the hard work and dedication of staff and students’. No question, staff in RI schools are probably working just as hard, if not harder – and some Outstandings are like falling off a log compared to some Goods. This idea of people ‘deserving’ Outstanding is pernicious given how much contextual variation there is across the country.
In other discussions I’ve met a number of heads in the last year who inherited schools with the Outstanding label only to find that this was not the truth of their school. In many cases, the Outstanding is based on an ancient framework and exam regime and nobody has been back since. What do you do? It’s a big risk to burst the bubble for your school community – this school isn’t as good as you think it is. So you start to paper over the cracks; you keep up the facade and suffer the consequences when a new inspection adjusts the grade to the new reality. I’ve lost count of how many people have shared this experience with me.
I’ve also seen the terrible personal impact on Heads that lose their Outstanding status – dropping down even to a respectable Good – having invested so much in that grade, allowing it to define their success and sense of professional worth. I also know several Heads who pitched for a ‘Good’ and not Outstanding purely to protect themselves from landing in the albatross situation. It’s just healthier all round to be Good, with room to improve and no overblown status to lose with all the painful public scrutiny that entails. (The school is the same whatever some team or other says about it and they’re gone after a day or two so why dance the dance?)
It’s certainly my sense that, where Heads decide not to push at a grade boundary, they perceive the inspection process to be much more satisfactory – fair, helpful, sensible – because the grading BS has been removed from the equation and everyone can get on with talking about the issues. They even report that the inspectors hint at this too. Everyone wins once the grading is a non-issue.
5. Every School Requires Improvement.
Finally, isn’t this just the most obvious thing: all schools require improvement? Wouldn’t it just be so much better if we took all the labels off the reports, forced people to read them and left all schools with a record of their areas of strength and areas for development? Sure, we need a category for ‘below the line’ – and a separate process for dealing with urgent safeguarding failures – but even here I would argue that it should be called something that suggested maximum support was on its way, recognising the challenges at work – not the pejorative jackboots of ‘inadequate’ that just kicks everyone in the teeth.
In a healthy future system, schools will be evaluated in the round, over time, in depth, in context, by teams that include people who know them, with processes that stand up to validity testing – and the public will be educated to understand the level of complexity and nuance inherent in any sound comparison of schools or measure of improvement. That’s the future – it will come, I’m sure of it – because, ultimately, bad ideas die in the face of sense and evidence. We just need to make sure it happens – and not hang about. Let’s do it now. #ditchthegrades