Big Questions: Assessment


I’ve explored this issue extensively in some previous posts that might be worth revisiting for additional context.

Over the last two years schools have been forced to rethink their ideas about assessment across the curriculum. This has largely been driven by the removal of KS3 levels but also by the introduction of new specifications at GCSE. Lots of basic assumptions about assessment and standards are up for discussion. At my last school we were working on numerous, sometimes competing, priorities:

  • Making assessment formative; it needs to support feedback that in turn leads to improved standards.
  • Assessment must help to define standards for students and teachers to aim at – blending comparative judgements against school and national bell-curves with some absolute benchmarks where they exist.
  • Assessment must be authentic to each discipline; an organic element of the curriculum and learning process in a subject; not confined by the demands of external accountability.
  • The last priority is to morph the different assessment models into a system that links it all for various purposes: comparison across subjects, communications to parents, tracking and monitoring at whole school level, giving governors the tools needed to fulfil their accountability duties.
Screen shot 2014-10-19 at 08.59.16
This diagram captures our approach. We focus on working out where students are and then decide what this means in terms of progress.

My former Head of School for KS3 and I met with Heads of Department in small subject clusters to discuss this issue and share ideas. We wanted to be sure that each department had an approach rooted in an understanding of the curriculum and standards in the subject; an approach that each teacher shared. This is more important than having data, but we also wanted to explore how data could be generated.

We decided to use a 1-9 scale across KS3 and, once FFT had produced 1-9 targets based on the latest KS2 starting points, we adopted these in place of our home-grown starting profiles. It’s a useful common reference frame for the bell-curve of relative standards, which every subject can use as a gauge when working out how to generate 1-9 data from their assessments. Crucially, we did not tell students their targets; they sat in the background for teachers to use as guidelines. Even when we issued 1-9 attainment grades, we stressed that the real meaning lies in the original raw scores and progress grades used within each subject.

In this context, what does authentic assessment look like?

Maths: Tests! Very simply, students take tests and these produce marks. The challenge for maths is to make sure the tests match the curriculum tightly for each set. It’s hard to compare outcomes from different tests, yet the content of each test varies between sets. Turning the test scores into a 1-9 scale is another challenge and there is a high degree of professional judgement required. How do we know what grade 23 out of 40, or 80%, represents? The FFT targets help here but we also need to use some tests that link to national schemes so that we develop the expertise to set grade boundaries in a meaningful way. We also have a system of Independent Learning Tasks – essentially mini-tests on specific topics that provide formative feedback. Certainly, the ILT feedback and test scores dominate discussions, not the 1-9 grades.
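
As a rough sketch of what setting grade boundaries involves, the Python below maps a raw mark to a 1-9 grade. The boundary percentages and the function name are invented purely for illustration; real thresholds would come out of professional judgement, moderation and the national reference points described above.

```python
from bisect import bisect_right

# Hypothetical boundaries: the minimum percentage needed for grades 2 to 9.
# These numbers are placeholders, not thresholds used by any school or scheme.
EXAMPLE_BOUNDARIES = [15, 25, 35, 45, 55, 65, 75, 85]


def grade_from_mark(mark, out_of, boundaries=EXAMPLE_BOUNDARIES):
    """Return a 1-9 grade for a raw mark, given ascending percentage boundaries."""
    if out_of <= 0:
        raise ValueError("out_of must be positive")
    if not 0 <= mark <= out_of:
        raise ValueError("mark must lie between 0 and out_of")
    percentage = 100 * mark / out_of
    # Count how many boundaries the percentage meets or exceeds, then add 1
    # because the lowest grade on the scale is 1 rather than 0.
    return 1 + bisect_right(boundaries, percentage)


if __name__ == "__main__":
    print(grade_from_mark(23, 40))   # 57.5% against the example boundaries
    print(grade_from_mark(80, 100))  # 80% against the example boundaries
```

Changing the boundary list changes every grade, which is exactly why the raw marks and the feedback matter more than the number that comes out at the end.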

Science: Again, we use tests. At HGS we adopted the Pearson Exploring Science scheme and used their tests. This makes it nice and easy as the outcomes have a national scheme to reference against. Some tests are simple factual recall and others use longer-answer questions that sample the curriculum. There are mark schemes that staff can use to generate 1-9 grades after some internal moderation. The tests match the scheme’s knowledge organisers tightly. Examples are available for free from the Pearson website.

 

In English: Here the department was exploring a highly formative approach based on progress within their writing units. Extensive departmental moderation is required to agree relative standards against internal and external samples. The improvement process informs teachers’ assessment of progress and then FFT targets provide an anchor for where students lie on the bell-curve given the progress they’ve made. The key to the accuracy of this is the rigour of the moderation.

 

They are not using any knowledge tests – nothing as simple as a basic recall test on key facts from textbooks – but we’ve discussed whether that might be useful in some units of work to supplement assessment of writing.

In MFL: Here there’s a combination of skills to assess. The writing assessment, again, looks at drafting and improvement, with a broad-brush assessment using our E,G,S,P system.


Students are also assessed in speaking and, separately, in reading and listening comprehension. Each of these components generates different outputs: some grades, some test scores and lots of formative feedback. Students have different strengths and can achieve very different outcomes, so morphing this into a single overall data point is an approximate averaging process. The standards for each component are benchmarked using some national exemplar material and textbooks as well as GCSE criteria. Again, departmental moderation – especially of the writing – is essential.

In Geography, the situation is similar to Science where tests with a mix of answer lengths  are set. Here we were devising our own using a range of resources and exemplars.  The challenge with this is to know whether the tests are pitched at the right standard – the same issue for any internally-set tests.  Tests are common across the department so scores can be compared and, through discussion of thresholds, linked to the FFT 20 targets. Detailed knowledge organisers like these GCSE examples are part of the package used to set standards and guide assessment:

screen-shot-2016-12-11-at-22-14-33

History is assessed in a similar manner to English in that the emphasis is on synoptic essays, rather than knowledge tests. Factual content is assessed through the way it is used in the writing. At the lower end of the assessment range, key knowledge is the basic requirement but, to score beyond the lowest levels, factual recall has to be evidenced through historical analysis. The department has used the Pearson 12-step scale to shape their shared understanding of standards. This isn’t used with students or parents; it’s a benchmarking tool. History knowledge organisers are also very detailed and give students clear guidance about what they’re supposed to reference in their writing.

With Music, there is a complicated array of assessment tools: listening assessments – yielding numerical test scores; practical performance assessment scores based on a detailed schema and, finally, composition assessments that are criteria based.  As with languages, there’s a significant element of judgement needed to aggregate this together neatly.

In Art, Drama and PE, departments have each taken their GCSE specification criteria and adapted them for KS3. The exam boards use different bands to describe levels of performance, with criteria that teachers need to get off the page through examples and moderation. They’re all keen to use the language of assessment used at GCSE – and each subject has different sub-elements and terminology. The challenge with this approach is to ensure that our 1-9 scale for KS3 (which is more a flight-path, bell-curve marker than a set of steps) works without suggesting step progression through the numbers. PE teachers find huge variation across different sports and, in Drama, written and practical elements are often vastly different. So, here the overall 1-9 scale is very much secondary to the success criteria and feedback used within each component.

It’s hard to capture the detail because authentic assessment varies hugely from discipline to discipline. I’ve missed out lots of details, and some departments entirely, but hopefully this gives a flavour.

We updated our  KS3 Assessment booklet so that parents get a sense of what we’re doing at the overview level.  The idea that 1-9 is actually a set of overlapping regions rather than a set of precise measurements is important – albeit hard to convey.  Assessment at KS3

Screen shot 2015-10-03 at 13.56.34

The FFT 20 targets are certainly highly aspirational for many students and this is reflected in the progress table. Students have to be pushing the boundaries to make Exceptional Progress. Students will see something like 6, G or 7, S depending on their prior attainment, which determines the FFT targets. With the range of students we have, it’s important for all students to have the possibility of exceptional progress from a range of starting points.

screen-shot-2016-12-12-at-00-01-02

To finish, I do have to stress that I would not expect students to fuss too much about these grades. They might not even remember them.   I’d want them to know what mark they got on their latest science and maths tests and where they went wrong, what their English progress grade was and what the feedback was on their history essay.

15 comments

  1. Wow, when it came to assessment and grading, I thought we complicated the simple in Australia. This seems like a whole other level of complexity.
    I like how you have allowed faculties to decide their grading methods. One size does not fit all when it comes to assessment.
    From my experience of assessment in schools, teacher judgements are largely useless when it comes to accurately assessing students’ level of performance. Much of the research that suggests there is greater variability in quality between classrooms within schools than there is between schools is based on teacher judgements of student progress. I have often wondered if this variation is in fact due to teacher effectiveness or just due to the lack of consistency in teacher assessment. Teacher X’s students all achieve B+ standard and Teacher Y’s achieve C-, yet the students in each class produce the same quality of work. Teacher X is held up to be doing a better job than Teacher Y, which is clearly not the case.
    It seems you have steps in place to combat this. Common tests and moderation are the only ways to generate consistency in teacher judgements, yet even they are no guarantee. In our system teachers still have the freedom to adjust their students’ final assessments up or down, so consistency, as generated by common assessment tasks, can still be overridden.
    Have your English dept looked at the “no more marking” comparative judgements software? https://www.nomoremarking.com/ It looks promising to generate consistency but it does seem lacking in the feedback department.
    Assessment serves many masters.
    It is interesting that the world education community has not come up with a standard system for grading and recording student assessments. We haven’t worked out the best way to do that yet.


  2. Hi Tom

    I’m in charge of Assessment/Outcomes at The Kingston Academy, a new school with years 7 and 8 currently. Our model looks similar. Age-related 1 to 9 grades. It would be interesting to have a chat with somebody from your school to swap notes!

    Here are a few key features of our model:
    – we do set pupil targets (Flight Paths), but these are flexible and can be changed on an annual basis, putting pupils in control.
    – we also have a fixed internal accountability target (MET – Minimum Expected Target) linked to KS2 outcomes which is for staff only. The Flight Path is generally one grade above the MET at the outset.
    – we assess summatively 3 times per year, and because learning is not linear we calculate a GPA (Grade Point Average) where the most recent grade is double-weighted. The GPA is used as the main indicator of progress over time.
    – assessments are quality assured for robustness before pupils take them
    – standardisation of grades across all subject areas, including our in-house TKA Grade Descriptors which are taken from the common strands observable across all GCSE descriptors.
    – Pupils use their Chromebooks to update a Pupil Achievement Tracker (PAT), a spreadsheet which automatically graphs their grades, GPA and flight path so they can visualise their progress. They also set learning targets on this.
    – dashboards for subject and senior leaders to visualise key indicators
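
A minimal sketch of the double-weighted GPA described in the list above, written in Python. The function name and the example grades are illustrative assumptions rather than the actual formula inside the school's tracker.

```python
def double_weighted_gpa(grades):
    """Average a chronological list of grades, counting the most recent one twice.

    `grades` is ordered oldest first, so the last item is the latest assessment.
    """
    if not grades:
        raise ValueError("at least one grade is required")
    # Every grade has weight 1 except the most recent, which has weight 2.
    weights = [1] * (len(grades) - 1) + [2]
    weighted_total = sum(g * w for g, w in zip(grades, weights))
    return weighted_total / sum(weights)


if __name__ == "__main__":
    # Hypothetical pupil with three summative grades across a year.
    print(double_weighted_gpa([5, 6, 7]))
```

With grades of 5, 6 and 7 the plain mean is 6, whereas the double-weighted version gives (5 + 6 + 7 + 7) / 4 = 6.25, so recent improvement pulls the indicator up a little faster.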

    This is pretty rushed, I’m sure I’m leaving out a few things, but you get the idea. It’s in a constant state of review and refinement, but it’s fair to say that visitors have been impressed by the model we’ve developed.

    Do get in touch.

    Alex Deveson
    Assistant Headteacher
    @MathsAlex


    • Hi Alex,

      Do you find, even though you only have Years 7 and 8, that having pupils update their own PATs and seeing their grades and flight paths change helps them to take ownership of their learning? Do students discuss (or are they encouraged to discuss) how close they are to their “original” flight paths and the long-term effects of their, hopefully, hard work?

      I ask because although we currently use marksheets which students update, they don’t really understand their flight paths and long-term targets. Students often appear to see filling in their progress grids as a “box-ticking” exercise – I wonder whether implementing a similar practice to yours, or even just explaining the flight paths on Go4Schools, would encourage more ownership of learning with our students.

      Adam.


      • Hi Adam

        We’ve only just gone through the process of pupils completing their PAT, so it’s too early to say conclusively. Pupils understand their flight paths and grades because they are age-related; they know grades 8 and 9 are fantastic and they know grade 5 is the expected standard.

        We had really interesting conversations at Y8 parents’ evening last week where pupils were saying that they really wanted to raise their flight path, or wanted to keep it the same for now but raise it next year. Definitely some good signs that they are starting to take ownership. The PAT shows their flight path at each particular assessment point so they can see where they have changed it over time.

        Have a look at this example of a PAT in English: https://docs.google.com/spreadsheets/d/1ta6t3F3Sw0ubjOnV0UEoa-7ez0r_uC2IWigiGzk3BEI/edit?usp=sharing

        The beauty of separating out the flexible flight path target from the fixed MET is that there is no incentive for teachers to suppress targets.

        I’ve not heard of Go4Schools. Is it worth checking out?


  3. Thank you so much for sharing your AwL journey. We are using similar concepts and I am wondering if you are planning to use any form of external testing (GL assessment ‘Insight’ or FFT ‘PoP’)?


  4. Hi everyone
    We launched the 9-1 assessment framework in September and as with the other comments it’s reassuring to hear that other schools are working on the same principles.
    Alex – we have completed a ‘paper version’ of your PAT. Very interesting to see an electronic version of this and how it works.
    We used GL assessment English, maths and science baseline tests as our starting points with Y7. A little frustrating, as science was given as GCSE predictors but English and maths were current levels of attainment. It will be interesting to see how GL develop this next year, and we are also investigating FFT.
    Thank you everyone!
    Claire
    Q


  5. “Crucially, we don’t tell students their targets; they sit in the background for teachers to use as guidelines.”

    Each time I read that you do not tell students their targets, I am reassured that it is possible to run a school where students aren’t prescribed target grades.

    Do you ever show students their individual figures from FFT – namely the % of students ‘like them’ who achieved each grade in a subject?

    I have found this a useful way to show students what could be possible, so they can come up with their own target for that subject. I always spell out what ‘like them’ means, and that this shows how many students have achieved these grades in the past – it is not their chances of achieving a grade!

