Despite my reservations about some of the big data measures that are used to judge schools, I am hopeful that our discourse is shifting the debate on assessment in a very positive direction. If this continues it will represent an important paradigm shift with positive consequences for students’ learning and their overall school experience.
The Old Paradigm: Macro Summative Attainment Tracking
This has reigned supreme for the last 10-15 years or possibly more. The focus has been on trying to represent students’ attainment across all disciplines in terms of generic ladders of grades and levels supported by descriptors so that attainment and progress could be tracked. The value given to data tracking has been the driving force – the idea that summative macro data is a requirement for driving up standards.
The elements of this paradigm – which is still absolutely dominant – include the following:
- Confusing bell-curve ranking with absolute standards. A C grade or a Level 4 has never been an indication of an absolute standard; it only makes sense in reference to a cohort. It’s never been true that, say, ‘explaining how a motor works’ is Grade B or ‘solving simultaneous equations’ is Grade 8. If a student has a grade 5 on their report, we know nothing about what they know. The very worst examples of this are where schools use GCSE grades as a ladder – the horrible notion of ‘Working At’ grades that has now penetrated into KS3 in some schools. In most subjects there is no meaningful sense in which you are at Grade 3, then a 4, then a 5; or a D then a C then a B. Bell-curve markers do not work in that way; it’s a nonsense. (I’ve even seen examples of 3+ and 4- being used – as if they are definitely, definably distinct.)
- The illusion of ‘progress’ as something that can be measured via reference to bell-curve grades or data points in general. ‘Levels of progress’ was the worst example of this – with the absurd but pervasive misconception that a jump from a 3 to a 5 and a jump from a 4 to a 6 are broadly equal, without any reference to the content of what is being learned. When we were forced to talk about students making ‘five sub-levels of progress’ I really thought we had lost our minds… it’s pure data idiocy. This has now gone (Gove’s greatest legacy in my view), but the very idea that ‘progress’ in learning has a measurable size, ludicrous as it is, is still widely held onto.
- The target-grade culture: the idea that, by setting systematic attainment targets in the language of summative grades, students will learn more. This is a deeply flawed notion. ‘I am on a C. My target is a B. I must work harder to get a B.’ This sequence is devoid of learning content and rarely translates into anything actionable for students, because ‘being on a C’ doesn’t mean enough at the level of what they know and can do. All that is left is ‘I must work harder and learn more to reach a higher grade’, which is universal at any level.
- The attempt to use summative tests formatively. This is very common with students sitting endless mock exams and using ‘GCSE-style questions’ prematurely. It’s akin to getting students to play a whole piano piece over and over before they’ve learned their scales – or asking someone to play a match without learning the basic passing skills.
- Centralised data tracking machinery: schools across the land have spreadsheets where data is entered for the purposes of tracking. You can stare at the numbers and grades all day long, but unless this feeds back into different actions in the classroom, no child learns anything more.
One of the reasons I don’t like the mechanics of Progress 8 is that they reinforce the macro data-tracking behaviours of leaders at the expense of the micro learning-focused shifts that are more important… But that’s another story.
I enjoyed the response to this Twitter ‘thought experiment’. Try it:
The trouble with macro data-tracking is that the journey from the spreadsheet to the classroom is circuitous at best, and normally doesn’t happen at all. All we are doing is generating data that might tell a story about relative student performance – it might, at best, give a picture of where students are – but it does nothing to take them further. Even here, the grades themselves do not actually tell us anything at all about what students know or can do; literally nothing.
Finally, a part of this paradigm has been our approach to marking. For years now, marking has been largely a PR exercise where how the marking looks has been more important than what it achieves, with an appallingly low impact-to-workload ratio. Most marking is wasted. It doesn’t lead to students knowing more, understanding things better or producing better quality work. This is in part due to the eternal marking paradox: those students who need the most help are the least able to interpret the real meaning of marking in order to adjust their thinking or performance. However, school control cultures (insert reference to external accountability pressure if you want) have demanded levels of marking – of red pen – that bear no relation to students’ progress in their learning.
Across the land, marking is peppered with things like ‘B-’, ‘Good effort’, ‘Your ending is rather vague; explain this in more detail’ – that kind of thing: comments that very many students cannot use to actually improve their work. Then there is the use of WWW/EBI (‘what went well’ / ‘even better if’) self-evaluation, where students write things like ‘I must avoid making silly mistakes’ or ‘I need to revise more’. These generic wish-statements cannot actually make a difference to what students know, understand or can do. Again, the approach is based on the need to satisfy an external machine that requires compliance with a set of institutional expectations (here, that WWW/EBI is ‘a good thing’) rather than being focused on learning itself.
The New Paradigm: Authentic Learning-Focused Formative Assessment
Luckily, there is light. Things are changing, stirring, evolving. A new dawn beckons… (or are we going back to a previous dawn that faded years ago?) There are several key influences that I think are helping to shift us towards a new paradigm:
- Dylan Wiliam retains his high-profile influencer status and is continually reinforcing the importance of ‘responsive teaching’ – the true meaning of AfL: the vital role of ‘minute by minute’ formative assessment where teachers check for understanding, adjust their teaching and continually seek to deepen students’ understanding and knowledge. This remains at the heart of real assessment: it’s in the moment, with tight feedback loops leading to immediate actions focused on specific elements of learning.
- Daisy Christodoulou’s Making Good Progress has exploded our fixed ideas about assessment practice, showing how problematic our use of summative testing is and how rare it is to see genuine formative assessment: high-frequency, low-stakes, narrowly focused testing using raw marks, owned only by a teacher and her/his students, feeding directly back into the teaching and learning process. She has also promoted important ideas around comparative judgement as a truly reliable means of gauging relative standards – mainly (but not only) in relation to writing in English – free from the flaws of descriptors and rubrics that are so hard to use consistently.
- Cognitive science is growing in its reach into our consciousness as a profession. The lessons from cognitive load theory – amongst others – encourage us to employ much more direct means of improving students’ knowledge by using effective instructional methods and regular retrieval practice through knowledge reviews and low-stakes recall testing. This then allows us to gauge how well students are doing in terms of how much they actually know about specific topics. The rise of knowledge organisers and personal learning checklists is helping to frame this work: being more explicit about what students should know and then helping them to learn it. Here the assessment is purely in terms of what they know, in the absence of any proxy grading system.
- The popularity of Ron Berger’s An Ethic of Excellence and the fabulous metaphor-packed Austin’s Butterfly is hugely influential, in my view. Here we see that we need to define our butterflies. We need to spell out or at least exemplify what excellence might look like and then devise iterative feedback processes that allow students to see the steps from where they are to where they could be in the detail of their learning goals. Austin and his teacher do not need a graded ladder or tracking system to help him improve; it’s all about the detail of the work itself linked to a clear idea of what excellence might look like, modelled by an exemplar.
- Schools such as Michaela are showing that, if you think hard about what you do and are firmly focused on maximum impact on learning, and not answering to the external machine, you engineer paradigm shifts around practices such as feedback and marking. Ideas such as whole-class feedback instead of traditional book marking are catching on. Jo Facer’s blog on this is superb; it’s a game-changer. Instead of slavishly marking books, we should be giving whole-class feedback that is prompt, immediately actioned, workload-efficient and effective in securing improvement. This is what matters, not the red pen – and NOT the ‘verbal feedback given’ stamp. FFS.
The challenge with all these elements of the ‘new paradigm’ is that they do not produce neatly aligned datasets for leaders to scrutinise on their Management Information Systems. The devil is all in the detail. It’s more precise and simultaneously more organic. Instead of seeing that a student is on a 6 or scored 57% on a mock exam, you need to see the work in their books and the test scores on very specific topics – knowing and understanding that this does not have an associated level or grade on a bell-curve. This is authentic assessment, focused on learning, not on creating false codified meaning to facilitate comparisons outside the learning arena.
In this context, leaders would need to do much more to triangulate information so that they build up a picture of what is going on for any student without placing undue emphasis on any one part.
There are areas that I think we need to develop further still – or avoid:
Avoid the development of massive unwieldy centralised statement banks: the horror of the ‘Can Do’ statement machine. Once you’ve got a hundred tick-boxes, you’ve created something you can’t monitor closely enough to feed back into your teaching. The statements themselves are often too vague (I can explain photosynthesis; I can add and multiply fractions; I can evaluate the use of metaphors in a poem), because what they mean depends entirely on the context of the questions asked. You need concrete examples. Replacing an amorphous generic grade with a massively unwieldy list of vague statements is not an improvement. The key is to keep formative information as close as possible to the site of the learning – in students’ books and folders, not on a teacher’s computer.
Let’s give a reprieve to Grades 1-3 and get them out of the box marked FAIL. Until all the points on a bell curve – one that forces 30% of students into the bottom 30% (shocking, I know!) – are seen for what they are in neutral terms, we’re writing off thousands of children in an unacceptable fashion.
Let’s do more to share exemplar work nationally so that we can define standards at any given level against concrete examples. This could be through the content expected in difficulty-based subjects like maths and science and through examples of work – in line with the comparative judgement process – in English and History. The work done by NoMoreMarking in this area is a great example of this. It’s the start of what could become a national database of standards, built through comparative judgement, that all could access and use.
I do think the changes I’ve highlighted here would represent a paradigm shift if we could take the whole system on this journey. It feels like it’s still at an embryonic stage but at least we’ve made a start.