I’ve explored this issue extensively in some previous posts that might be worth revisiting for additional context:
- Assessment, Standards and the Bell Curve
- Defining the Butterfly: knowing the standards to set the standards
- The Assessment Uncertainty Principle
- Authentic Assessment and Progress: Keeping it Real
- KS3 Assessment: 8 steps to a workable system
Over the last two years schools have been forced to rethink their ideas about assessment across the curriculum. This has largely been driven by the removal of KS3 levels but also by the introduction of new specifications at GCSE. Lots of basic assumptions about assessment and standards are up for discussion. At my last school we were working on numerous, sometimes competing, priorities:
- Making assessment formative; it needs to support feedback that in turn leads to improved standards.
- Assessment must help to define standards for students and teachers to aim at – blending comparative judgements against school and national bell-curves with some absolute benchmarks where they exist.
- Assessment must be authentic to each discipline; an organic element of the curriculum and learning process in a subject; not confined by the demands of external accountability.
- The last priority is to morph the different assessment models into a system that links it all for various purposes: comparison across subjects, communications to parents, tracking and monitoring at whole school level, giving governors the tools needed to fulfil their accountability duties.
My former Head of School for KS3 and I met with Heads of Department in small subject clusters to discuss this issue and share ideas. We wanted to be sure that each department had an approach rooted in an understanding of the curriculum and standards in the subject; an approach that each teacher shared. This is more important than having data, but we also wanted to explore how data could be generated.
We decided to use a 1-9 scale across KS3 and, now that FFT has produced 1-9 targets based on the latest KS2 starting points, we adopted these in place of our home-grown starting profiles. It’s a useful common reference frame for the bell-curve of relative standards that every subject can use as a gauge for pitching how to generate 1-9 data from their assessments. Crucially, we did not tell students their targets; they sat in the background for teachers to use as guidelines. Even when we issued 1-9 attainment grades, we stressed that the real meaning lies in the original raw scores and progress grades used within each subject.
In this context, what does authentic assessment look like?
Maths: Tests! Very simply, students take tests and these produce marks. The challenge for maths is to make sure the tests match the curriculum tightly for each set. It’s hard to compare outcomes from different tests but, at the same time, the content of each test has to vary between sets. Turning the test scores into a 1-9 scale is another challenge and there is a high degree of professional judgement required. How do we know what 23 out of 40 or 80% is worth on that scale? The FFT targets help here but we also need to use some tests that link to national schemes so that we develop the expertise to set grade boundaries in a meaningful way. We also have a system of Independent Learning Tasks, essentially mini-tests on specific topics that provide formative feedback. For sure the ILT feedback and test scores dominate discussions, not the 1-9 grades.
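To make the idea of grade boundaries concrete, here is a minimal sketch in Python of how a raw mark could be mapped onto a 1-9 scale once thresholds have been agreed. The boundary percentages and the function name `to_grade` are hypothetical, invented purely for illustration; they are not the boundaries any department actually used, and in practice the thresholds come from moderation and professional judgement rather than a formula.

```python
from bisect import bisect_right

# Hypothetical boundaries: the minimum percentage needed for grades 2 to 9.
# Illustrative numbers only; real boundaries would be agreed by teachers
# per test and per set, using FFT targets and national reference points.
BOUNDARIES = [15, 25, 35, 45, 55, 65, 75, 85]


def to_grade(mark, out_of, boundaries=BOUNDARIES):
    """Convert a raw mark into a 1-9 grade using percentage thresholds."""
    if out_of <= 0 or not 0 <= mark <= out_of:
        raise ValueError("mark must lie between 0 and out_of")
    percent = 100 * mark / out_of
    # bisect_right counts how many boundaries the percentage has reached.
    return 1 + bisect_right(boundaries, percent)


if __name__ == "__main__":
    for mark in (10, 23, 32):
        print(f"{mark}/40 -> grade {to_grade(mark, 40)}")
```

With these made-up boundaries, 23 out of 40 lands in one band and 80% in another, but moving a single threshold by a few percentage points would change the grade. That is why the raw score and the feedback carry more meaning than the number.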
Science: Again, we use tests. At HGS we adopted the Pearson Exploring Science scheme and used their tests. This makes it nice and easy as the outcomes have a national scheme to reference against. Some tests are simply factual recall and others are longer-answer questions sampling the curriculum. There are mark schemes that staff can use to generate 1-9 grades after some internal moderation. The tests match the scheme’s knowledge organisers tightly. Examples are available for free from the Pearson website.
In English: Here the department was exploring a highly formative approach based on progress within their writing units. Extensive departmental moderation is required to agree relative standards against internal and external samples. The improvement process informs teachers’ assessment of progress and then FFT targets provide an anchor for where students lie on the bell-curve given the progress they’ve made. The key to the accuracy of this is the rigour of the moderation.
They are not using any knowledge tests (nothing as simple as a basic recall test on key facts from textbooks) but we’ve discussed whether that might be useful in some units of work to supplement assessment of writing.
In MFL: Here there’s a combination of skills to assess. The writing assessment again looks at drafting and improvement, with a broad-brush assessment using our E,G,S,P system.
Students are also assessed in speaking and, separately, in reading and listening comprehension. Each of these components generates different outputs: some grades, some test scores and lots of formative feedback. Students have different strengths and can achieve very different outcomes, so morphing this into a single overall data point is an approximate averaging process. The standards for each component are benchmarked using some national exemplar material and textbooks as well as GCSE criteria. Again, departmental moderation, especially of the writing, is essential.
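As a rough illustration of what an "approximate average" might look like, the sketch below combines component grades into one overall figure. It assumes, purely for the example, that each component has already been expressed on the 1-9 scale and that components are weighted equally unless stated otherwise; the function name `overall_grade`, the component names and the weights are all hypothetical, and a real department would adjust the result by judgement.

```python
import math


def overall_grade(components, weights=None):
    """Combine per-component 1-9 grades into one approximate 1-9 grade."""
    if not components:
        raise ValueError("at least one component grade is needed")
    if weights is None:
        # Assumption for illustration: every component counts equally.
        weights = {name: 1 for name in components}
    total_weight = sum(weights[name] for name in components)
    mean = sum(grade * weights[name] for name, grade in components.items()) / total_weight
    # Round half up, then keep the result inside the 1-9 range.
    return max(1, min(9, math.floor(mean + 0.5)))


if __name__ == "__main__":
    # Hypothetical student with uneven strengths across the four skills.
    student = {"writing": 6, "speaking": 4, "reading": 7, "listening": 5}
    print(overall_grade(student))
```

The single number hides the spread: a student with 4 in speaking and 7 in reading gets the same overall figure as one with steady middling grades throughout, which is exactly why the component-level feedback matters more.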
In Geography, the situation is similar to Science, where tests with a mix of answer lengths are set. Here we were devising our own using a range of resources and exemplars. The challenge with this is to know whether the tests are pitched at the right standard, which is the same issue for any internally set tests. Tests are common across the department so scores can be compared and, through discussion of thresholds, linked to the FFT 20 targets. Detailed knowledge organisers, including GCSE examples, are part of the package used to set standards and guide assessment.
History is assessed in a similar manner to English in that the emphasis is on synoptic essays, rather than knowledge tests. Factual content is assessed through the way it is used in the writing. At the lower end of the assessment range, key knowledge is the basic requirement but, to score beyond the lowest levels, factual recall has to be evidenced through historical analysis. The department has used the Pearson 12-step scale to shape their shared understanding of standards. This isn’t used with students or parents; it’s a benchmarking tool. History knowledge organisers are also very detailed and give students clear guidance about what they’re supposed to reference in their writing.
With Music, there is a complicated array of assessment tools: listening assessments that yield numerical test scores; practical performance assessments scored against a detailed schema; and, finally, composition assessments that are criteria-based. As with languages, there’s a significant element of judgement needed to aggregate these into a single outcome.
In Art, Drama and PE, departments have each taken their GCSE specification criteria and adapted them for KS3. The exam boards use different bands to describe levels of performance, with criteria that teachers need to lift off the page through examples and moderation. They’re all keen to use the language of assessment used at GCSE, and each subject has different sub-elements and terminology. The challenge with this approach is to ensure that our 1-9 scale for KS3 (which is more a flight path or bell-curve marker than a set of steps) works without suggesting step progression through the numbers. PE teachers find huge variation across different sports and, in Drama, written and practical elements are often vastly different. So, here the overall 1-9 scale is very much secondary to the success criteria and feedback used within each component.
It’s hard to capture the detail because authentic assessment varies hugely from discipline to discipline. I’ve missed out lots of details, and some departments altogether, but hopefully this gives a flavour.
We updated our KS3 Assessment booklet so that parents get a sense of what we’re doing at the overview level. The idea that 1-9 is actually a set of overlapping regions rather than a set of precise measurements is important, albeit hard to convey.
The FFT 20 targets are certainly highly aspirational for many students and this is reflected in the progress table. Students have to be pushing the boundaries to make Exceptional Progress. Students will see something like 6, G or 7, S, depending on the prior attainment that determines their FFT targets. With the range of students we have, it’s important for all students to have the possibility of exceptional progress from a range of starting points.
To finish, I do have to stress that I would not expect students to fuss too much about these grades. They might not even remember them. I’d want them to know what mark they got on their latest science and maths tests and where they went wrong, what their English progress grade was and what the feedback was on their history essay.