Authentic Assessment and Progress. Keeping it Real.

There are many progress paths. The bell curve helps to define standards at any given point but does not fix the path that follows.

This post is based on the ideas that I outlined during my workshop at #TLT14 in Southampton.  It forms part of the process of rethinking assessment at KS3 now that levels have gone.  This is a live discussion at my school and is very much a work in progress.

A good starting point is to revisit the many very good reasons for moving away from levels.  A recent TES post by Tim Oates explains this very well:

[Screenshot: extract from the TES post]

I’ve explored a lot of these ideas in previous posts:

In replacing levels, we should be seeking to implement a system that tackles some of the problems levels created.  Here is a re-cap of some of the problems that I see:

  • Levels create the impression that learning follows a linear progress path in equal-sized steps.  This is an illusion – though widely held as true and enshrined in the levels-of-progress concept.
  • Levels suggest precise parallel standards between subject areas within a school – a 5a in History is as good as a 5a in Science – even though almost no work is done in schools to measure this, beyond checking distributions on a bell-curve model.
  • In reality, levels and sub-levels have become general bell-curve indicators for a cohort, not statements of absolute attainment – so the detail of what has been learned and understood is largely absent from the discourse between teachers and with parents.
  • The moderation needed to ensure that a 5a in English in School X in Birmingham means the same as a 5a in School Y in Exeter doesn’t happen.  Again, it is largely an illusion that this level of national standardisation is meaningful.
  • It requires precious time and effort to explain how a piece of work can be assessed on a level scale; meaning and detail are lost in the process.  Similarly, it takes precious time and effort to explain how the next level might manifest itself in a real piece of work; more detail and meaning are lost.  Using levels does not help to explain the next steps in a child’s learning in most situations; it’s far more effective to explain the steps in the context of the work itself.
  • Very often, the demand to show progress through incremental steps through the levels forces teachers to make arbitrary decisions and to concoct perverse attainment statements that do not fit the organic nature of their discipline.

A possible solution:   Authentic assessment and progress reporting

What is authentic assessment?

In practice, there are just a few different ways to measure performance from which teachers can make deductions about learning:

  • Tests. Right and wrong answers or extended answers evaluated for quality. This generates an aggregated score.
  • Qualitative evaluation of a product against some criteria – a piece of writing, a painting, a piece of design, a performance. These can generate a wide range of outcomes: marks, scores, broad overall grades or levels. Teachers’ professional judgement is critical.
  • Absolute benchmarks: A straight-forward assessment that a student can do something – or can’t do it yet. I’d suggest that there is a very limited set of learning goals that are simple enough to be reduced to can do/can’t do assessment; in most cases there is a proficiency scale of some kind.

Across the range of disciplines at KS3, different situations in different subjects lend themselves to being assessed using a particular combination of these measures. There is usually an authentic, natural, common-sense mode of assessment that teachers choose with an outcome that fits the intrinsic characteristics of the discipline. My suggestion is that we simply report how students have performed in these assessments, with data in the rawest possible state, without trying to morph the outcomes into a code where the meaning is lost.

Let’s explore an example:

In science, students learn about balancing chemical equations in Year 9. They take a test with several questions of increasing difficulty.  Each question is assigned marks based on the number of elements that can be right or wrong. Some or all could be multiple choice questions.  The marking generates a score which indicates the level of a student’s performance.  It could be expressed in raw terms – say a mark out of 30 – but a percentage would also help to make comparisons with other tests.

If consistent tests are used over time, the range of marks for any cohort will tell teachers about the performance of each student in the context of that specific topic.  Over time, a series of tests allows teachers to build up a profile of a student’s learning and progress.  Some tests might be harder than others but teachers can see this from the pattern of performance of the whole cohort.   The more tightly focused each test is on a specific set of concepts, the more precise the information will be about any student’s learning.
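Thinking aloud in code, here is a minimal sketch of what "reading a score in the context of the cohort" might look like. The student names, marks and the second test are all made up for illustration, and the helper names `to_percent` and `cohort_summary` are hypothetical; the only idea taken from the paragraph above is that raw marks become percentages and that a harder test shows up as a lower cohort average.

```python
from statistics import mean

def to_percent(raw_mark, max_mark):
    """Convert a raw mark into a percentage so different tests can be compared."""
    return 100 * raw_mark / max_mark

# Hypothetical raw marks for two topic tests (illustrative data only).
TESTS = {
    "balancing equations": {"max": 30, "marks": {"Asha": 21, "Ben": 15, "Cal": 27}},
    "rates of reaction": {"max": 40, "marks": {"Asha": 20, "Ben": 14, "Cal": 30}},
}

def cohort_summary(tests):
    """For each test, report the cohort mean and each student's distance from it."""
    summary = {}
    for name, test in tests.items():
        percents = {
            student: to_percent(mark, test["max"])
            for student, mark in test["marks"].items()
        }
        cohort_mean = mean(percents.values())
        summary[name] = {
            "cohort_mean": cohort_mean,  # a low mean suggests a harder test
            "relative": {s: p - cohort_mean for s, p in percents.items()},
        }
    return summary

if __name__ == "__main__":
    for test_name, result in cohort_summary(TESTS).items():
        print(test_name, round(result["cohort_mean"], 1), result["relative"])
```

On this made-up data, Ben's 50% on balancing equations sits 20 points below the cohort mean of 70%, which says rather more than a sub-level would. The detail of which questions he got wrong still has to come from the test paper itself.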

Teachers would know that a score of, say, 70% is an exceptional score for a student with a low starting point, representing excellent progress.  For a High Starter (to borrow from John Tomsett), 70% might represent progress below the expected level.  For both students, the feedback can focus on the details of balancing equations and the wrong answers. This is miles away from the nebulousness of a 6c.   At the end of each term or year, the cumulative data from tests would represent a strong basis for a discussion with students and parents and for making an overall statement about attainment and progress in a report.

This will work if the tests are well designed to sample the curriculum and to span the range of likely performance levels.  It’s no good if lots of students gain full marks in every test because that would suggest that there is a ceiling on their potential attainment in that area of the curriculum.  The details of all the tests could be shared with parents and students (perhaps online) so that it is clear and transparent. Eg High Starters should be aiming to achieve at least 80% on the unit tests and in the practical assessment. The tests cover the topic with questions like these…..

There is a case for exemplifying standards more explicitly with samples of writing. Not all of science is made up of right and wrong answers; there is always the question of depth:

A:   When someone is running they need to pump more oxygen to their muscles and take the carbon dioxide to the lungs so their heart has to beat faster.

B:  During exercise, energy is released from respiration in muscle cells as they contract repeatedly.  The heart rate increases in order to regulate the supply of oxygen to the cells and the rate at which the waste product carbon dioxide can be expelled via diffusion from the blood into the air via alveoli in the lungs… Etc.

Without the obfuscation of a level ladder, it is possible to illustrate different levels of depth in an extended answer.  This may link to the number of marks given in an assessment and could be used as an exemplar for parents and students.  It is expected that Middle Starters making excellent progress will be writing answers like Example B by the end of Year 9. 

I could make up a similar example for maths.  There is likely to be a series of topic-specific tests and, in conjunction with some exemplars of the increasing level of challenge of content areas through the curriculum, this would give all the information needed.  In History and Geography, each unit could have specific outcomes described with success criteria for a synoptic assessment, allowing progress to be measured relative to a starting point. Exemplars for written work could be produced and the students’ books would serve as an organic record of progress for all to see.  In Art or DT, success criteria could be used, referenced to some exemplar work for students to benchmark their own work against.  Grading or levelling might work here at the impressionistic level that NC Levels were originally designed for – not the basket-case of sub-levelling that we ended up with.

It might be too confusing for parents to engage with 10 very different modes of assessment across the curriculum.  (One reason levels are held onto by some is because of the illusion of simplicity – an opiate for the masses that masks the underlying house of cards). At KEGS, we devised a generic *, 1,2,3 system that was explained in detail for each subject with specific attainment criteria defined and shared with students and parents.  At Highbury Grove I think a similar system could work but we’d need to add in another dimension to account for the broader range of starting points.  The principle would  be the same: students with starting point X, should  be aiming to reach standard Y by the end of the year, with the standards defined and exemplified by subject.  We haven’t started work on this yet but it is the direction of travel.

Progress will be relatively easy to report, focusing on attainment relative to the starting point and the progress of the cohort.   We’re going to use the simple four-stage code: Excelling, Good Progress, Some Concerns, Poor Progress.

A parent at KS3 could be told that, in Science, a Middle Starter child’s progress level is S (Some Concerns) because the assessments (eg a test average of 48%) indicate that progress isn’t yet in line with that expected for a student starting at that point.  A similar assessment for a Low Starter might warrant a progress level G (Good Progress) and for a High Starter it would be P (Poor Progress).  The combination of progress and attainment is critical to understanding the full picture but the progress measure is the most important.
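To make the logic concrete, here is a rough sketch of how the same test average could map to different progress codes depending on starting point. The expected averages and the cut-offs are entirely hypothetical, chosen only so that the example of 48% comes out as G, S and P for Low, Middle and High Starters respectively; in practice the judgement would rest with teachers and the cohort data, not a formula. The code letter E for Excelling is also my assumption.

```python
# Hypothetical expected test averages (%) by starting point. Assumptions, not policy.
EXPECTED_AVERAGE = {"Low": 45, "Middle": 60, "High": 80}

def progress_code(starting_point, test_average):
    """Return E, G, S or P from the gap between actual and expected average.

    The thresholds (+15, 0, -20 percentage points) are illustrative assumptions.
    """
    gap = test_average - EXPECTED_AVERAGE[starting_point]
    if gap >= 15:
        return "E"  # Excelling
    if gap >= 0:
        return "G"  # Good Progress
    if gap >= -20:
        return "S"  # Some Concerns
    return "P"      # Poor Progress

if __name__ == "__main__":
    for start in ("Low", "Middle", "High"):
        print(start, progress_code(start, 48))
```

With these made-up numbers, a 70% average would also read as Excelling for a Low Starter and Some Concerns for a High Starter, which matches the earlier example. The point is only that the code is a function of two things, attainment and starting point, and never of attainment alone.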

If I was told my son was Excelling – I wouldn’t necessarily need to know precisely how – I’d trust the teachers to know what they are doing.  However, if I needed more information, I’d expect the teacher to say “your son is Excelling, because for his age and starting point, his score of 82% in the science assessment represents excellent progress”.  In History, it might be a question of showing me my son’s books or an essay at parents’ evening so I could see the progress (or lack of it) with my own eyes. During lessons I’d expect my son to be informed of his areas for development in some depth; he should know which 18% he got wrong and why.   Similarly, he should know where his writing in English needs improvement based on an authentic assessment that suits the process of assessing English.  Levels? Marks out of 20? Approximate GCSE Grade? Whatever is the most natural and retains the most detail.

(See: Formative use of summative tests.)

Standards and Moderation

An important reinforcement to this approach will be the routine moderation of work between teachers within departments and between schools.  If there was a national database of tests and samples of work that exemplified standards for children of different ages then schools could cross-reference their own standards easily.  In the short term this needs to happen through school-to-school collaboration.  Teachers in next-door classrooms ought to have a shared understanding of what ‘exceptional work’ might look like for their parallel Year 8 classes.  Moderation should create upward pressure; if one school is getting much better work out of the Year 8s who came in with Level 6 in English, then it would lead to a review of standards.  Currently, because everyone’s version of a level varies, that discussion is often reduced to an exchange of mutual suspicion about the validity of other people’s assessments.  If we ‘keep it real’, that won’t happen.  It will just fuel an upward spiral of challenge.  That’s the theory in any case.  Let’s see!

As I said, this is a work in progress… and, as ever, I’m more or less thinking aloud.


  1. Thanks for sharing. I have some questions. Apologies if I’ve missed the answers.

    In this model, how would a student respond to the classic question ‘what do you need to do to move forwards?’. Would it be only in context of that topic or would there be common features across the topics? Or would you expect them to say ‘get 78% in the next test’?

    To clarify the tests, do you mean we should design several tests that cover the same content but differently? For example if the topic was weather systems, in one test the question would be to label the water cycle and in another there might be a question ‘what happens to rainfall?’.

    Especially in the case of GCSE, what would you do if you have a student who genuinely ‘knows it all’ and ‘can do it all’ ? Do you add into the standardised tests work that is above GCSE level that for some is completely unneeded? I’m trying to think about how you extend without making it too much for some.

    Thanks in advance


    • Hi. To move forward, it’s all about the formative use of summative tests/assessments. You can’t just urge someone to get a higher score – they need to know the detailed reasons for their previous errors or the specific nature of their room to improve. Levels have been a barrier to this, not an aid. Sharply focused topic tests make it easier to give precise feedback. With tests, I’m not sure about your question but I’m increasingly drawn to the idea of lots of short tests on specifics rather than huge end-of-topic tests that cover everything. The same tests can be used each year to help build up a good understanding of what constitutes a good score. With top end students exceeding A* at GCSE, it’s difficult to re-write tests but you can certainly teach beyond GCSE in the classroom and possibly make up some ad hoc assessments just for them.


      • Tom, you make a very good point about what I call “progress tests”, the shorter focused topic tests. The key is in the formulation of the question, ensuring that the student can demonstrate the three levels of mastery. Also, the idea of using the tests and almost (well, from the student’s perspective) ignoring the levels means the student focuses on developing an understanding, the conversion to GCSE grades coming as a result of this and not as a target. If you did not read my article “Assessment Without Levels. Is it Possible?” then please take a moment to do so; I think it underpins what you are saying.


  2. An interesting article reflecting your “thinking out aloud”, may I reply in the same manner.

    Designing tests should start with what the teacher wants the students to demonstrate in terms of knowledge or understanding. Few do; in my experience, most start with writing the question and not the answer. This means they get caught out when an unexpected answer is provided. There is a good example of this in Ellen Langer’s work on mindful learning and the way questions are framed (i). Tests based on the question and not the answer more often prevent the learner demonstrating what they know or understand and tend to reflect what they do not know. This is related to the issue of depth to which you refer.

    Success criteria against which work is bench-marked may only account for the outcome and may exclude the process that is, or rather should be, so much part of D&T. For example a project may fail because of an earlier design decision, recognising this during the evaluation stage and suggesting a retrospective alternative could result in a greater understanding of the issues than somebody who achieved a good outcome but “played it safe”.
    Whilst I like the recognition of a starting point I am reminded that this is flawed when added to targeted progress. A starting point is based on past performance and not potential. This has always been the problem with any algorithm that claims to predict future performance. There needs to be some form of objective assessment of potential which is allowed to challenge past performance. Unlocking the story at this point has the ability to unlock potential. In my experience the problem with objective assessments such as CATs is that they get hijacked for predictive use rather than for diagnostic purposes. Where they are used to challenge present performance they have much greater value, in part because they are diagnostic of the learning hurdles and because of their objectivity if read correctly. With predicting performance over time we have little to go on other than a form of “norm” which we know is a forced association with the way in which learners develop. I am not in favour of providing interim or final targets for they are limiting; they set artificial ceilings. A study of boys and their motivation has shown me they often do just enough based on expectations and rarely base their aims on their true potential (ii).

    Progress is an arbitrary term and we need to consider what good progress looks like if constantly achieved set against a target grade. In my view progress should be measured against understanding. If something is easy to report it does not necessarily mean it is appropriate or easy to understand. If a student is recognised for what they have achieved in terms of understanding then I believe this is of greater benefit. It certainly paves the way for target setting – not a grade or percentage but subject matter, knowledge and understanding.
    Assessment and reporting requires a great deal of resources and schools need to consider their return on investment. In my view too often the process leads to a document that is little more than “shelf furniture” or a “drawer bottom dweller” and what is required is a document which can be referred to between reporting cycles. In my view we need to confirm what a student knows and understands, what they need to revise and what they need to fully revisit. For me getting students to move away from targets and grades or levels until the point of final assessment promotes a better understanding of their progress and assists with personal learning objectives (iii). Further there is greater motivation to achieve and it promotes self-efficacy.

    I like your sentiment of keeping it real. Real in a number of senses, it has value, promotes learning, requires an appropriate amount of time and resources, is accessible by teachers, learners and parents, and leads to real improvements in teaching and learning.

    i)Ellen J Langer 1998 The Power of Mindful Learning Perseus Books Massachusetts

    ii) Advocating Creativity 2013 Why many boys only do just enough.

    iii) Advocating Creativity 2014 Assessment Without Levels. Is it Possible?


  3. Hi Tom,

    Maybe you could look at this web site – To quote from the site – The Australian Curriculum sets consistent national standards to improve learning outcomes for all young Australians. It sets out, through content descriptions and achievement standards, what students should be taught and achieve, as they progress through school. It is the base for future learning, growth and active participation in the Australian community. ACARA develops the Australian Curriculum through rigorous, consultative national processes.

    If you go to F-10 Curriculum/general capabilities/literacy/continuum then you can read, for example, what is expected by the end of each year for literacy for each year level. This can be done for each learning area.

    Maybe this might help you? Cheers, Anne


    • Thanks Anne. I guess this is what the national curriculum in England was meant to achieve but the assessment element became corrupted along the way. I’ll take a good look at this. Thanks.


  4. Hi Tom. Interesting reading and links to what we are aiming for in Maths, except that I am struggling with how we would accurately define a student’s “starting point”. We all know what they are anecdotally, but I feel we need a clear set of rules to work from. Any thoughts?


    • You could easily aggregate KS2 levels, a CATS test and your own in-house baseline maths test given in the first week. That would give you a sense of where students are. Or work with feeder schools so that you trust their assessments and use them.
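As a rough illustration of the aggregation idea, here is one way it could be sketched. Everything specific is an assumption: the three measures are treated as equally weighted, each is standardised across the cohort before averaging, the cohort is then split into equal thirds, and the student labels and scores are invented. A real scheme would need its own rules and a much larger cohort.

```python
from statistics import mean, pstdev

def z_scores(values):
    """Standardise a list of scores so different measures share one scale."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:  # every student scored the same, so the measure tells us nothing
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

def starting_points(cohort):
    """Band students as Low, Middle or High from equally weighted measures."""
    names = list(cohort)
    # Regroup the data as one list per measure (KS2, CATs, in-house baseline).
    measures = [list(column) for column in zip(*(cohort[n] for n in names))]
    standardised = [z_scores(column) for column in measures]
    combined = {
        name: mean([column[i] for column in standardised])
        for i, name in enumerate(names)
    }
    ranked = sorted(names, key=combined.get)  # lowest combined score first
    third = len(ranked) / 3
    labels = ("Low", "Middle", "High")
    return {
        name: labels[min(2, int(position // third))]
        for position, name in enumerate(ranked)
    }

# Invented cohort: (KS2 level, CATs score, baseline test %) for each student.
COHORT = {
    "A": (3, 85, 30), "B": (4, 95, 45), "C": (4, 100, 55),
    "D": (5, 105, 60), "E": (5, 115, 75), "F": (6, 125, 90),
}

if __name__ == "__main__":
    print(starting_points(COHORT))
```

Banding by rank within the cohort is only one choice; fixed cut-offs agreed with feeder schools would be another, and would keep the bands stable from year to year.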


      • My school is currently pursuing a rank-led system whereby pupils are allocated a * / 1 / 2 / 3 based on their percentile position within the year group for key assessments. They get a progress descriptor to accompany this and the teacher can add a comment for the tutor to pass on. Is this a solution from your point of view?


  5. The issue of progress and attainment is an interesting one. Your combination is useful as not only does it allow a recognition of poor progress by a high attainer even if test results “look” good, but, as important (if not more so), it allows a low attainer to recognise there is more to achieve, even if progress is “good”. We must do everything to challenge any built-in lower aspirations for children with lower starting points.


  6. When my kids were in school there weren’t levels (or, at least, they weren’t shared). The feedback from teachers was more qualitative, but pretty much fit for purpose. It was possible to leave a parents’ evening knowing where your child was doing well, and where they might need some support…what their level of effort was, their attitude, their direction of travel. I venture that this is sufficient for parents.


  7. I struggle with the likely disappointment/confusion of parents. What will the parent of a low starter who has been told their child is making excellent progress think when they realise that means their child will finally achieve a grade ‘5’? Or the parent of a high starter being told their child is not making good progress, meaning they finally achieve a 7?

