Evidence of our deep ‘judge and rate’ culture can be seen and heard everywhere. Nearly everyone does it – it’s in us!
Great teacher; great lesson. Ah, that was a crap lesson… Hmm, not bad, a solid lesson… Oh wow, what a wonderful lesson. Lovely books; shoddy books. In old money – probably a 3 on the cusp of a 2. He’s an outstanding teacher. I was graded Outstanding in my last observation. She is a superb teacher. He is a weak teacher. That had some elements of ‘good’ but not quite enough. The quality of education this school provides is Good. Mr Jones is way better than Mr Smith. School X is better than Academy Y.
We judge. We rate. It’s part of the language.
But – what does all this mean? The assumption implicit in all of this is that, somehow, we know! There’s an agreed background scale of quality that is discernible and commonly understood so that ‘good’ has meaning. We know what the standards are; we know it when we see it. We just know! But… do we? And if we do, so what?
If I meet some people discussing their fitness, they might all express a wish to be fitter. We could spend time worrying about ‘how fit’ each person is, comparing various metrics – there are all kinds of fitness tests with their own scales. Each metric comes with a value judgement about how much we care about it, but we could agree on a way to aggregate them into some kind of holistic scale of fitness. Then we could line up all the people and decide if they might be ‘unfit’, ‘fit enough’, ‘very fit’ or ‘super fit’ – or something! We could then decide the best kind of exercise for each person to do to move up the scale.
But, actually, even though definable measures exist, we can get fitter without any of them. ‘How fit are you?’ is normally answered based on a feeling we have about how our bodies function during exercise. We don’t have the stopwatch out.
There are lots of areas of life where we make an evaluation that something could be improved or enhanced without needing to arrive at a measured judgement on a quality scale of some kind. Nearly everyone could be healthier, better at playing piano or more fluent in another language… and we might discuss the strategies that might help to move up those scales. Crucially, we can usually identify the strategies without really needing to say how healthy, how good on the piano or how fluent we are at any time: there are things we can do to improve. If someone asks me – are you any good on the guitar? – I say, hmm, ‘quite good I suppose?’ It’s meaningless really. But I know I can play better; I know where I have to apply effort, what to practise. ‘How good’ means nothing; playing better means a lot.
In teaching, for multiple reasons (bad science, ignorance, power cultures, accountability obsession), we’ve been through an extended period where people have tried to treat teaching like fitness. Our system has been infused with the delusional and toxic idea that teaching can be evaluated on a scale – not just overall, but during individual lessons. I’ve met inspectors and leaders who, even when challenged and presented with research to the contrary, will assert that they personally can ‘just tell’ how good a lesson is; they assert that from day to day, school to school, teacher to teacher, they are able to maintain a consistent and coherent scale of quality in their heads that enables them, in their mighty wisdom, to make judgements which are reliable, valid, consistent, fair. It’s tragic.
Just for instance, here’s Dylan Wiliam on the subject:

I think schools that still grade individual lessons (and book scrutinies!!) are in the minority; that culture is changing. Intelligent school leaders have long since dropped the folly of grading. (See this post for more on reasons to do so: The delusional voodoo of grading lessons has got to stop.) But the culture of judgement remains strong. It runs deep. Most school leaders have come up through a system where quality assurance and accountability systems have more or less demanded that they judge everything: lessons, teachers, books, each other. Judge, judge, judge. It’s a badge of self-respect – that, as a leader, you can ‘tell’ quality when you see it; you set standards and know when you see them. Even if you no longer grade officially – you do so unofficially. You will have teachers you regard as ‘excellent’, ‘sound’, ‘a bit crap but the best we could find’ – or whatever!
Now, these judgements are more or less unavoidable. Judging is human! In everyday life we evaluate people all the time. We size people up; we decide if we trust them, if we like them, if they are funny, wise, interesting, witty, boring, selfish, generous and so on. But, crucially, unless we’ve taken some truth drugs, we keep these opinions to ourselves, recognising their subjectivity and the potentially damaging social consequences of telling people what we think of them. In a professional teaching context, we should apply the same filters: our evidence base is also deeply subjective and there are professional consequences for being wrong.
In my experience as a leader and now trainer of leaders, far too much time is spent thinking about the judgement-making – even in the absence of grades. It’s still common practice for a leader to observe a series of lessons and then spend hours writing up all their observations on a complex proforma, carefully crafting their WWWs and EBIs. God knows how many hours of my life I’ve wasted doing the same in the past. There are leaders who pride themselves on writing good reports or on their systems of learning walks where teachers get slips or emails telling them the range of judgements that were made on their 10-minute fly-by visits. (There are teachers who, sadly, still crave that affirmation of a big-up from their observers, who still write ‘Rated Outstanding’ on their application forms.)
None of this helps; none of it is necessary. I think leaders need to re-think their entire role when it comes to lesson observations or other forms of scrutiny. Instead of thinking – ‘how good is this?’ (where does it sit on the scale) – they just need to think: ‘in what specific ways could this be more effective?’ (how could this be shifted further up the scale?).
To me, this is not just semantics. It’s a major shift. And it’s a lot harder! I can measure how tall the plant is easily… but can I help it grow taller? That is the question. And in teaching, even though we can’t even measure the ‘height’ before and after, we can still think about how to foster growth.
What does this mean in practice?
It means that, instead of deciding that James is a good teacher but querying whether he is as ‘awesome’ as Jo, I engage with the various processes going on and the way students respond, focusing on things James might do differently to be more effective. When I’m watching James teach, I’m thinking of specific possible actions he might take to support more students to learn more – and those will form part of my discussion with him later. It’s got nothing to do with Jo. I also need to think of things she might do to be more effective: ‘awesome’ as she is, some of her learners could still be doing better.
It means, instead of putting Jenny and Jules in the camp of ‘a bit weak, can’t control their Year 9s’, I’m working with them to think of specific actions they can take, routines they can create and reinforce, that will help them to manage behaviour more effectively. It’s no good to them to hear my internal rating on my internal made-up scale. Even if I do actually think they are ‘a bit weak’ – I’m no use to them at all if I can’t help them to become stronger; if I can’t identify actions they personally can take given where they are as people, as teachers.
It means that I stop writing down judgements and ratings or using the language of quality. I just look to see where problems lie and what the solutions might be. I use those evaluations in my discussions with the teachers. If they can see those things for themselves, that’s great, we’re away. If not, I can be more direct about it… but it’s still about the specifics of how to improve; not the spuriousness of any rating.
Happily, it turns out that quite often the specifics we discuss are common to various teachers. That helps us design training programmes. It means that ‘The five forms of feedback I give to teachers most often…’ has value as a generic set of options because lots of teachers can recognise themselves there. But, in practice, I don’t give this feedback cold; I don’t use it to judge. I involve teachers in a discussion where these issues arise and we agree on some action steps. I don’t judge. I just help.
It’s hard at first to shake off the quality judgement language – but it’s so refreshing. It’s liberating.
Of course, I would say the same should apply to how schools are evaluated… but that’s another story.
Totally agree, Tom. I know of one MAT in the north which has an 18-point observation grid which line managers use to assess observed lessons. It’s part of the surveillance and scrutiny culture that leads to a loss of integrity, illegal off-rolling, maladministration (cheating) in the primaries and a drain of staff from the profession as their passion for teaching is systematically undermined by this performativity culture. Good leadership teams don’t do this.