In various blog posts and Twitter exchanges I have critiqued several widely used approaches to assessment tracking and reporting. Reasons for my critique include the following:
- Forcing teachers across very different disciplines to morph their organic, authentic, subject-specific assessments – including wide-ranging quality and difficulty models – into a common grading system, at an excessive number of points in the year determined by the needs of the system, not the teachers.
- Using measures and scales that give a false impression of being reliable, valid, secure, absolute measures of standards or progress but are actually subjective, arbitrary or even totally spurious.
- Basing ideas of attainment and progress on an imagined ladder of absolute standards when, usually, all that’s being measured are bell-curve positions within a cohort. This includes the widespread, inappropriate use of GCSE grades as milestone markers on a ladder: 5 to 6 to 7 etc.
- Projecting from small-scale, teacher-devised assessments to comparisons with national cohort distributions and national examination grades without the appropriate level of caution and acknowledgement of the margin of error.
- A general failure to accept margins of error within assessment including the totally spurious use of decimalised sub-levelling for GCSE grades – bad at any point but especially bad a long way out from completing the course.
- The false but widespread assumption that teachers are not only able to be consistent teacher to teacher within a subject team but also between subjects, even though no moderation processes exist to align Art, Maths, History and PE and there is no real sense in which learning in these domains can be compared.
- The overly optimistic assumption that teachers can meaningfully evaluate student attitudes consistently across subjects so that their ‘attitude to learning grades’ say more about variation in student attitudes than teacher perceptions.
- The attempt to track progression through a massive curriculum by checklisting against massively long statement banks using dubious mastery level descriptors, can-do statements, RAG coding and so on as if it is meaningful to state “I can explain how… I know that…” without reference to the depth of answers and the knowledge content included. Turning statement tracking into sub-levelled GCSE grade flightpaths is possibly the worst data-crime there is: spuriousness upon spuriousness. I’ve been hectored by email and online by sellers of these systems for ‘dissing’ their products but I’m sorry – they’re Emperor’s New Clothes.
- Excessive workload-to-impact. Assessment should help teachers, not burden them. Far too much of the data that is collected does not support any form of response – its collection serves no purpose beyond creating false patterns and trajectories at a granularity that isn’t related to the reality of what students know and can do and the actions teachers take to deepen learning.
- The idea that targets are inherently a good thing that provide a lever to secure better outcomes. This links to the confusion of terms for reported grades: predicted, forecast, realistic, estimated, aspirational, working at, most likely to achieve, minimum target, FFT5, FFT20, current... all laced with the horrible psychology of teachers second-guessing how they will be judged according to how accurate or optimistic they are.
Let’s face it – tracking systems and parents’ reports are riddled with spurious data noise. My son’s teachers have always been wonderful, wonderful people; he’s been in capable hands. But some of his reports have made little sense: too much about the system, not enough about the child. Bad reports are unactionable. They fail the simplest of tests. Is my child doing well or not? Are they working hard enough or not? Are they on track to achieve the grades they should – or not? It’s possible to receive a report which leaves you none the wiser.
So… what should be done!?
I really do think there are multiple ways that sensible assessment and reporting can be done, provided some key principles are adhered to:
- Maximum value is given to authentic formative assessment expressed in the form most useful to teachers – not anyone else.
- We report attainment based on what we actually know for sure – things like test scores – or we are absolutely explicit that we’re making a judgement with a margin of error.
- We answer parents’ questions honestly and directly whilst acknowledging that the answers are judgements, not absolutes: How is my child doing? What else can they do to improve still further? Are they on track for future success given their starting point? We are absolutely explicit that teacher judgement about attitudes, progress and sometimes even attainment, is subjective and will vary teacher to teacher.
- We do not collect more data than teachers can usefully act on to lever improvement: ie the frequency and nature of assessment data is driven by the needs of teacher-student teaching-learning feedback and improvement interactions, not the macro accountability systems. Leaders go to where the small-scale data is; as far as possible, we don’t distort data into common formats.
- Where we can, we reference discussions about attainment and progress to detailed information about what students need to know in order to achieve success away from the semantics of nebulous qualifying descriptors.
- Processes are in place to benchmark internal data against national cohort data or, failing that, local reference data, in every subject for every programme of study.
- We devise systems that support leaders, pastoral teams and parents to glean actionable information from the assessment data.
What can this look like in practice? There are multiple permutations, but here is a suggestion:
Baseline tests using widely used national systems: eg CATs or MidYIS.
FFT data is for internal use only: use FFT5 to challenge any low expectations. Stare at it; embrace the possibility! FFT20 might be more reasonable in schools with lower prior attainment.
To track cohorts, if the curriculum matches, national standardised tests like GL Assessments can also be useful. However, every department needs an external reference mechanism to gauge standards: establish it and use it to evaluate standards at KS3.
Set up internal assessment regimes with multiple low-stakes tests and assessments that track understanding of the curriculum and incremental improvement in performance. Teachers form a rounded view of student attainment in a format that suits them, in a timeframe that is meaningful. Senior leaders develop an understanding of internal departmental data: test scores, essay marks, portfolio assessments.
Ensure that a rigorous system of ‘gap closing’ happens whereby students use formative assessment feedback to improve their work and deepen their understanding. This is part of a feedback policy that is supported by excellent information about standards and knowledge-requirements so students, teachers and parents can engage in conversations about the actual learning content, not the numbers.
Ensure attitude grades are handled with caution and are supported by more detailed statements from a statement bank. Statement banks have been available to schools for 25 years and are underused. The emphasis is on describing the tangible actions students should take – not on trying to describe subjective features of their character.
At one mid-year point and one end-of-year point, gather information in a centralised data-drop to form a picture of where students are. In examination courses, there is value in linking this to terminal grades, but not by using a ‘current grade’, which is an invalid notion. Instead, teacher judgement is used to forecast a likely terminal grade or grade range based on what is known to date, including FFT projections. However, nothing more granular than whole grades should be used; anything finer provides illusory precision. No school or MAT really needs more centralised data than this, especially if they are keeping close to the real formative information.
At KS3, you do not need exam-linked grades; they are false and premature. We can either report raw scores in some of the more major assessments with a reference point alongside – eg class average – or convert raw scores into scaled, standardised scores for the year group. Or we simply use a simple descriptor system: Excellent, Very Good, Average etc, or another defined scale that simply reports the teacher’s view of a student’s attainment compared to the cohort. It’s honest in its qualitative nature. There are no target grades; we refer parents to the formative feedback given to students about tangible improvement steps.
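For anyone wondering what converting raw scores into standardised scores for the year group might involve, here is a minimal sketch. It is an illustration only, not any school's actual method: the function name, the student labels and the marks are hypothetical, and the mean-100, spread-15 scale is simply an assumed convention.

```python
from statistics import mean, pstdev


def standardise(raw_scores, target_mean=100, target_sd=15):
    """Rescale one year group's raw scores to a common mean and spread.

    raw_scores maps a student label to that student's raw mark.
    target_mean and target_sd are assumed conventions, not a set policy.
    """
    marks = list(raw_scores.values())
    cohort_mean = mean(marks)
    cohort_sd = pstdev(marks)  # population SD: the year group is the whole cohort
    if cohort_sd == 0:
        # Everyone scored the same, so nobody sits above or below the average.
        return {student: target_mean for student in raw_scores}
    return {
        student: round(target_mean + target_sd * (mark - cohort_mean) / cohort_sd)
        for student, mark in raw_scores.items()
    }


# Hypothetical raw marks for three students in one year group.
year_group = {"Student A": 40, "Student B": 50, "Student C": 60}
print(standardise(year_group))
```

A student on the cohort average comes out at 100, and everyone else is placed relative to that. The result still only describes position within this cohort on this assessment, so the usual caution about margins of error applies, especially with small groups.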
Is there an example?
Well, yes there is. I found most of what I’ve described here in action at Saffron Walden County High School in Essex. They have a very sensible set of systems.
- Strong use of formative assessment: lots of feedback and ‘close the gap’ responses; students focus on continual improvement. It’s strongly evident in books and lessons.
- Use of a SWCHS grade that typically falls between FFT5 and FFT20, informed by teacher judgements. Progress reports use simple descriptors at KS3 – not false GCSE grade projections. At KS4, they use the very sensible terminology of ‘forecast’. Forecasting implies margin of error, balance of judgement with evidence, not being set in stone.
- Progress checks are run and reported to parents twice yearly for KS3 and three times at KS4. Attitude grades are used, accompanied by a comprehensive set of statements in a statement bank, all run through configuring the Go4Schools system. Teachers have a wide range of options to describe where students need to focus in order to improve, all through a set of codes.
Rather brilliantly, their data manager has devised an Excel-based system for exporting codes for the statements into a condensed format that pastoral staff can access through their VLE. It looks like this:
At a glance, a year leader can see the issues via the coding and, where needed, can quickly refer to the origins by subject. This is rich info from a data drop: it goes beyond crude attainment averages or subjective comments about ‘working harder’. It goes beyond making major inferences from the attainment grade data – which is impossible without knowing what the formative data and curriculum mapping are. It indicates homework, work rate, attendance, in-class contributions – whatever the specifics are, neatly condensed to inform conversations.
So – there you are. No ludicrously big knowledge statement trackers, no spurious can-do statements, no spurious flight paths, no false-granularity GCSE grades. Just sensible organic authentic assessment packaged up to be useful, informative and, where appropriate, linked to exam outcomes, with high expectations built-in. It’s not rocket science. But it comes from a place of intelligent understanding of curriculum and assessment and a spirit of giving value to professional expertise.
Is there a template on Go4Schools that models this approach, Tom?
No. The system has an export function of the codes given to each student but the school devised a way to concatenate the data into that easy-read format for pastoral staff.
Do you have an example of some of the statements used Tom? Are they personalised to each subject or is it a generic list that is applicable to all students?
Would love to know what the codes are…
I like this summary of the key issues re assessment and the challenges of data collections for both parents and pastoral purposes. BUT I don’t understand that summary at all. What does it show? I prepare data summaries via comment banks and teacher forecasts too, and am always looking to improve. If I knew a little more about this summary, it could help.
Hi Jo. Each subject teacher can give up to three actions from a statement bank, coded A-Z. The summary compiles those actions into a string, eg FFMMPTT. Each of those letters means something – everyone can just refer to the list of statements. It tells the tutor the range of actions relevant to that student. F could mean something about reading or a revision strategy or punctuality or whatever. The more often a code is repeated, the more teachers think it is relevant for that student.
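To make the mechanics concrete, here is a rough sketch of how codes like these could be compiled. It is not the school's actual Excel system, which isn't shown here: the function names, the subjects and the codes each teacher gives are all hypothetical, and sorting the letters alphabetically is an assumption made so that repeated codes sit together.

```python
from collections import Counter

MAX_ACTIONS_PER_TEACHER = 3  # each subject teacher gives up to three actions


def compile_summary(actions_by_subject):
    """Join every teacher's action codes for one student into a single string."""
    for subject, codes in actions_by_subject.items():
        if len(codes) > MAX_ACTIONS_PER_TEACHER:
            raise ValueError(f"{subject} gave more than three actions")
    all_codes = [code for codes in actions_by_subject.values() for code in codes]
    return "".join(sorted(all_codes))  # sorting groups repeated codes together


def code_counts(summary):
    """Count how many teachers chose each code."""
    return dict(Counter(summary))


def origins(actions_by_subject, code):
    """List the subjects in which a given code was chosen."""
    return sorted(s for s, codes in actions_by_subject.items() if code in codes)


# Hypothetical actions for one student; the letters carry no real meaning here.
student = {
    "Maths": ["F", "M", "T"],
    "English": ["F", "M", "P"],
    "History": ["T"],
}
summary = compile_summary(student)
print(summary)
print(code_counts(summary))
print(origins(student, "F"))
```

The tutor reads the string for the overall picture, the counts show which actions come up most often, and the origins lookup traces a code back to the subjects that raised it.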
I am trying to start a new tracking system, but after reading this I think I’m on the wrong track. I would like to set up something similar to Saffron Walden County High School, but without seeing clear examples this is difficult to implement. Any help trying to set up a system that works better than using grade descriptions would be brilliant.