Study reveals Top Marks AI achieving 0.90 correlation with Edexcel for GCSE History: 16 Mark Question

November 24, 2025

"How reliable is your GCSE History AI marking system?" We encounter this question regularly when speaking with teachers and educational institutions.

So we've conducted extensive testing to show exactly how accurate Top Marks' GCSE History AI marking tools really are.

In this study, we're analysing Edexcel History -- specifically, the GCSE History: 16 Mark Question.

Edexcel makes available numerous exemplar essays for its exam papers, and we've put our tool to the test using 85 of those exam-board-approved standardisation materials. These exemplars showcase a broad spectrum of answer quality and are provided for standardisation purposes, so teachers can see what different levels of response actually look like in practice.

We ran these 85 essays through our dedicated marking tool. Then we measured the correlation between the official marks the board awarded each essay and the marks Top Marks AI assigned to those same essays.

We employed the Pearson correlation coefficient. In short:

  • A value of 1 would mean perfect correlation -- when one marker assigns a high score, the other always does too, and when one assigns a low score, the other always does too.
  • A value of 0 means no correlation whatsoever -- knowing one marker's score tells you nothing about what the other marker awarded.
  • Negative values would mean the markers systematically disagree -- when one assigns high scores, the other assigns low scores.
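To make those three cases concrete, here is a minimal sketch of the Pearson calculation in plain Python. The `pearson` helper and the score lists are illustrative assumptions, not the code or data used in the study.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations from each marker's own mean.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("undefined when one marker gives the same score every time")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical marks out of 16, invented purely for illustration.
marker_a = [4, 8, 12, 16]
agree = pearson(marker_a, [5, 9, 13, 17])    # rises and falls in lockstep with A
unrelated = pearson(marker_a, [9, 7, 7, 9])  # no linear relationship with A
opposed = pearson(marker_a, [16, 12, 8, 4])  # high exactly where A is low

print(agree, unrelated, opposed)
```

Note that the first comparison reaches a correlation of 1 even though the second marker is always one mark higher. Correlation measures whether two markers rank and space essays the same way, not whether they award identical marks, which is why error measures are worth reporting alongside it.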

For context, how do humans perform?

What sort of correlation do experienced human markers achieve when marking essays already marked by a lead examiner?

Cambridge Assessment conducted a rigorous study to measure precisely this. 200 GCSE English scripts - which had already been marked by a chief examiner - were sent to a team of experienced human markers. These experienced markers were not told what the chief examiner had given these scripts. Nor were they shown any annotations.

The Pearson correlation coefficient between the scores these experienced examiners gave and those given by the chief examiner was just below 0.7. This indicates a positive correlation, though far from perfect. You can find the study here.

How did Top Marks AI perform?

Across the 85 essays, Top Marks AI achieved a correlation of 0.90 -- a very strong positive correlation, and well above the figure of just below 0.7 for the experienced human markers in the Cambridge study. (Top Marks AI was likewise not privy to the "correct marks" or any annotations.)

Moreover, 76.47% of the marks we gave were within 2 marks of the mark awarded by the exam board.
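As a rough sketch of how a "within 2 marks" figure can be computed, the snippet below counts the essays whose AI mark lands within a tolerance of the board mark. The `share_within` helper is a hypothetical name, and the sample uses only the first ten rows of the results table further down, so its output is not the headline figure.

```python
def share_within(board_marks, ai_marks, tolerance=2.0):
    """Percentage of essays where the AI mark is within `tolerance` of the board mark."""
    if len(board_marks) != len(ai_marks) or not board_marks:
        raise ValueError("need two non-empty lists of equal length")
    hits = sum(1 for b, a in zip(board_marks, ai_marks) if abs(a - b) <= tolerance)
    return 100 * hits / len(board_marks)


# First ten rows of the results table (board mark, then Top Marks AI mark).
board_sample = [13.0, 16.0, 16.0, 8.0, 13.0, 16.0, 14.0, 7.0, 8.0, 7.0]
ai_sample = [14.8, 15.7, 15.1, 9.9, 12.1, 13.8, 14.4, 10.5, 8.0, 9.3]

sample_share = share_within(board_sample, ai_sample)
print(sample_share)  # 7 of these 10 essays fall within 2 marks

# The headline 76.47% corresponds to 65 of the 85 essays.
headline_share = round(100 * 65 / 85, 2)
print(headline_share)
```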

Another interesting metric is the Mean Absolute Error, for which our system scored 1.29. On average, the AI differed from the board by 1.29 marks, which is comfortably within 1.6 marks (10% of the 16 marks available). As a percentage of the total marks, that's an average difference of 8.1%.

In contrast, in that same Cambridge study, experienced examiners marking a 40-mark question showed a Mean Absolute Error of 5.64 marks, a difference of 14.1%. These results highlight the accuracy of Top Marks AI compared to traditional marking practices.
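The arithmetic behind these two percentages can be sketched as follows. The helper names are assumptions made for illustration, and the five-essay sample is just the first five rows of the results table, so its Mean Absolute Error differs from the full-study value of 1.29.

```python
def mean_absolute_error(board_marks, ai_marks):
    """Average size of the gap between two sets of marks, ignoring direction."""
    if len(board_marks) != len(ai_marks) or not board_marks:
        raise ValueError("need two non-empty lists of equal length")
    return sum(abs(a - b) for b, a in zip(board_marks, ai_marks)) / len(board_marks)


def as_percent_of_scale(error_in_marks, max_marks):
    """Express an error in marks as a percentage of the marks available."""
    return 100 * error_in_marks / max_marks


# First five rows of the results table (board mark, then Top Marks AI mark).
board_sample = [13.0, 16.0, 16.0, 8.0, 13.0]
ai_sample = [14.8, 15.7, 15.1, 9.9, 12.1]
sample_mae = mean_absolute_error(board_sample, ai_sample)
print(round(sample_mae, 2))

# The reported figures, each scaled by its own question's maximum mark.
ai_percent = round(as_percent_of_scale(1.29, 16), 1)     # 16-mark History question
human_percent = round(as_percent_of_scale(5.64, 40), 1)  # 40-mark question in the Cambridge study
print(ai_percent, human_percent)
```

Dividing by the maximum mark is what allows a 16-mark question and a 40-mark question to be compared on the same footing.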

We don't claim that Top Marks is infallible, but when it does get things wrong, just how bad is it? Let's turn to the Root Mean Square Error (RMSE) to find out. RMSE is sensitive to the severity of large errors: each error is squared, the squares are averaged, and the square root of that average is taken. When you square the number 1, you still get 1, and when you square 2, you only make a small jump to 4. But square 5, and you're suddenly all the way up at 25. That's how RMSE works - it (essentially!) highlights large errors by squaring them.

Top Marks AI's Root Mean Square Error was 1.70, meaning even when larger errors occur, they remain remarkably small relative to the 16-mark scale.
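To see why RMSE is a useful companion to the Mean Absolute Error, here is a small sketch comparing two hypothetical sets of marking errors. The error lists are invented for illustration and are not taken from the study.

```python
import math


def mean_absolute_error(errors):
    """Average error size, treating every mark of error equally."""
    return sum(abs(e) for e in errors) / len(errors)


def root_mean_square_error(errors):
    """Square each error, average the squares, then take the square root."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


# Two invented markers with the same total error, spread differently.
steady = [2, 2, 2, 2]  # always two marks out
spiky = [0, 0, 0, 8]   # usually exact, occasionally eight marks out

print(mean_absolute_error(steady), root_mean_square_error(steady))
print(mean_absolute_error(spiky), root_mean_square_error(spiky))
```

Both markers have a Mean Absolute Error of 2, but the second marker's RMSE is twice as large because its single big miss dominates once squared. An RMSE that sits close to the Mean Absolute Error, as 1.70 does to 1.29, suggests that very large misses are not dominating the results.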

You can see the full side-by-side exam board and AI scores below.

| Essay ID | Board Score | Top Marks AI Score | Difference (AI - Board) |
| --- | --- | --- | --- |
| Migrants in Britain 6 -16 Marks 1 (-) (13).pdf | 13.0 | 14.8 | +1.8 |
| 2019 Paper 1 medicine 16 Marks 1 (-) (16).pdf | 16.0 | 15.7 | -0.3 |
| 2019 Paper 1 crime 16 Marks 1 (-) (16).pdf | 16.0 | 15.1 | -0.9 |
| Migrants in Britain 5 -16 Marks 3 (-) (8).pdf | 8.0 | 9.9 | +1.9 |
| 2019 Paper 1 crime 16 Marks 3 (-) (13).pdf | 13.0 | 12.1 | -0.9 |
| 2019 Paper 1 medicine 16 Marks 3 (-) (16).pdf | 16.0 | 13.8 | -2.2 |
| 2019 Paper 1 warfare 16 Marks 1 (-) (14).pdf | 14.0 | 14.4 | +0.4 |
| Migrants in Britain 6 -16 Marks 4 (-) (7).pdf | 7.0 | 10.5 | +3.5 |
| 2019 Paper 1 medicine 16 Marks 2 (-) (8).pdf | 8.0 | 8.0 | +0.0 |
| 2019 Paper 1 medicine 16 Marks 4 (-) (7).pdf | 7.0 | 9.3 | +2.3 |
| 2019 Paper 1 crime 16 Marks 2 (-) (7).pdf | 7.0 | 11.3 | +4.3 |
| 2019 Paper 1 crime 16 Marks 4 (-) (10).pdf | 10.0 | 11.0 | +1.0 |
| Migrants in Britain 5 -16 Marks 4 (-) (16).pdf | 16.0 | 15.7 | -0.3 |
| Migrants in Britain 6 -16 Marks 7 (-) (7).pdf | 7.0 | 9.8 | +2.8 |
| 2019 Paper 2 Elizabethan 16 Marks 1 (-) (8).pdf | 8.0 | 9.4 | +1.4 |
| 2019 Paper 1 warfare 16 Marks 2 (-) (9).pdf | 9.0 | 8.5 | -0.5 |
| 2019 Paper 2 Elizabethan 16 Marks 3 (-) (16).pdf | 16.0 | 16.0 | +0.0 |
| Migrants in Britain 5 -16 Marks 1 (-) (16).pdf | 16.0 | 16.0 | +0.0 |
| 2019 Paper 2 King Richard I and King John 16 Marks 2 (-) (9).pdf | 9.0 | 10.7 | +1.7 |
| Exemplars 5c -16 Marks 3 (-) (12).pdf | 12.0 | 8.0 | -4.0 |
| June _ Summer 2024 5 -16 Marks 2 (-) (10).pdf | 10.0 | 6.6 | -3.4 |
| June Summer 2024 5 -16 Marks 1 (-) (14).pdf | 14.0 | 12.9 | -1.1 |
| Exemplars 5 -16 Marks 2 (-) (16).pdf | 16.0 | 15.5 | -0.5 |
| 2019 Paper 1 warfare 16 Marks 3 (-) (6).pdf | 6.0 | 8.2 | +2.2 |
| 2019 Paper 2 Henry VIII 16 Marks 3 (-) (12).pdf | 12.0 | 10.2 | -1.8 |
| 2019 Paper 2 Henry VIII 16 Marks 1 (-) (16).pdf | 16.0 | 15.4 | -0.6 |
| June Summer 2024 6 -16 Marks 2 (-) (12).pdf | 12.0 | 8.1 | -3.9 |
| Migrants in Britain 5 -16 Marks 2 (-) (8).pdf | 8.0 | 10.4 | +2.4 |
| June _ Summer 2024 6 -16 Marks 1 (-) (3).pdf | 3.0 | 3.8 | +0.8 |
| June 2024 6 -16 Marks 1 (-) (14).pdf | 14.0 | 10.4 | -3.6 |
| 2019 Paper 2 Anglo Saxon 16 Marks 2 (-) (13).pdf | 13.0 | 14.5 | +1.5 |
| Exemplars 5c -16 Marks 2 (-) (5).pdf | 5.0 | 7.7 | +2.7 |
| June _ Summer 2024 6 -16 Marks 2 (-) (8).pdf | 8.0 | 7.1 | -0.9 |
| Paper 1 Exemplars Summer June 2022 Sixteen Marker 2 (-) (16).pdf | 16.0 | 15.5 | -0.5 |
| June 2024 6 -16 Marks 2 (-) (16).pdf | 16.0 | 13.8 | -2.2 |
| June Summer 2024 6 -16 Marks 1 (-) (15).pdf | 15.0 | 14.4 | -0.6 |
| June 2024 5 -16 Marks 1 (-) (10.5).pdf | 10.5 | 10.8 | +0.3 |
| 2019 Paper 2 Anglo Saxon 16 Marks 1 (-) (16).pdf | 16.0 | 14.4 | -1.6 |
| Paper 1 Exemplars Summer June 2022 Sixteen Marker 4 (-) (15).pdf | 15.0 | 14.3 | -0.7 |
| Migrants in Britain 6 -16 Marks 5 (-) (14).pdf | 14.0 | 14.5 | +0.5 |
| Paper 1 Exemplars Summer June 2022 Sixteen Marker 3 (-) (7).pdf | 7.0 | 5.8 | -1.2 |
| Paper 1 Exemplars Summer June 2022 Sixteen Marker 1 (-) (7).pdf | 7.0 | 6.3 | -0.7 |
| Paper 1 Exemplars Summer June 2022 Sixteen Marker 5 (-) (8).pdf | 8.0 | 8.8 | +0.8 |
| June 2024 5 -16 Marks 1 (-) (8).pdf | 8.0 | 9.2 | +1.2 |
| June 2024 5 -16 Marks 2 (-) (15).pdf | 15.0 | 14.4 | -0.6 |
| Exemplars 5 -16 Marks 1 (-) (13).pdf | 13.0 | 14.1 | +1.1 |
| 2019 Paper 2 Henry VIII 16 Marks 2 (-) (7).pdf | 7.0 | 8.5 | +1.5 |
| June _ Summer 2024 5 -16 Marks 1 (-) (16).pdf | 16.0 | 15.1 | -0.9 |
| 2019 Paper 2 King Richard I and King John 16 Marks 1 (-) (16).pdf | 16.0 | 14.5 | -1.5 |
| 2019 Paper 2 Elizabethan 16 Marks 2 (-) (11).pdf | 11.0 | 10.8 | -0.2 |
| Paper 1 Exemplars Summer June 2022 Sixteen Marker 6 (-) (14).pdf | 14.0 | 14.4 | +0.4 |
| Summer 2022 6 -16 Marks 1 (-) (8).pdf | 8.0 | 5.8 | -2.2 |
| Paper 1 Exemplars Summer June 2022 Sixteen Marker 7 (-) (12).pdf | 12.0 | 11.6 | -0.4 |
| Summer 2022 Q1c (Elizabethan England) -16 Marks 1 (-) (6).pdf | 6.0 | 9.6 | +3.6 |
| Summer 2022 Q1c (Anglo-Saxons & Normans) -16 Marks 1 (-) (7).pdf | 7.0 | 7.2 | +0.2 |
| Summer 2022 Q1c (Henry VIII) -16 Marks 1 (-) (7).pdf | 7.0 | 6.4 | -0.6 |
| summer June 2024 6 -16 Marks 1 (-) (11).pdf | 11.0 | 9.5 | -1.5 |
| Paper 1 Exemplars Summer June 2022 Sixteen Marker 8 (-) (15).pdf | 15.0 | 12.8 | -2.2 |
| Summer 2022 Q1c (Elizabethan England) -16 Marks 2 (-) (15).pdf | 15.0 | 12.8 | -2.2 |
| Summer 2022 Q1c (Richard & John) -16 Marks 1 (-) (6).pdf | 6.0 | 9.4 | +3.4 |
| Summer 2022 6 -16 Marks 2 (-) (10).pdf | 10.0 | 9.7 | -0.3 |
| Summer 2022 Q1c (Anglo-Saxons & Normans) -16 Marks 2 (-) (15).pdf | 15.0 | 13.5 | -1.5 |
| summer June 2024 6 -16 Marks 2 (-) (16).pdf | 16.0 | 14.8 | -1.2 |
| Summer 2022 Q1c (Henry VIII) -16 Marks 2 (-) (14).pdf | 14.0 | 13.7 | -0.3 |
| summer June 2024 5 -16 Marks 2 (-) (15).pdf | 15.0 | 15.1 | +0.1 |
| Summer 2022 Q1c (Richard & John) -16 Marks 2 (-) (15).pdf | 15.0 | 15.5 | +0.5 |
| Examiner's Report - B4 Paper 2 June 2024 Q1ci -16 Marks 2 (-) (16).pdf | 16.0 | 15.2 | -0.8 |
| Examiner's Report - B4 Paper 2 June 2024 Q1cii -16 Marks 1 (-) (12).pdf | 12.0 | 12.4 | +0.4 |
| Examiner's Report - B3 Paper 2 June 2024 Q1ci -16 Marks 2 (-) (16).pdf | 16.0 | 16.0 | +0.0 |
| Examiner's Report - B3 Paper 2 June 2024 Q1cii -16 Marks 1 (-) (10.5).pdf | 10.5 | 13.5 | +3.0 |
| Examiner's Report - B2 Paper 2 June 2024 Q1ci -16 Marks 2 (-) (6).pdf | 6.0 | 5.5 | -0.5 |
| Examiner's Report - B4 Paper 2 June 2024 Q1ci -16 Marks 1 (-) (6).pdf | 6.0 | 6.7 | +0.7 |
| Examiner's Report - B4 Paper 2 June 2024 Q1cii -16 Marks 2 (-) (14.5).pdf | 14.5 | 14.7 | +0.2 |
| Examiner's Report - B2 Paper 2 June 2024 Q1cii -16 Marks 1 (-) (16).pdf | 16.0 | 16.0 | +0.0 |
| Examiner's Report - B1 Paper 2 June 2024 Q1ci -16 Marks 2 (-) (7).pdf | 7.0 | 7.5 | +0.5 |
| Examiner's Report - B1 Paper 2 June 2024 Q1cii -16 Marks 2 (-) (12).pdf | 12.0 | 11.6 | -0.4 |
| Examiner's Report - B3 Paper 2 June 2024 Q1ci -16 Marks 1 (-) (6.5).pdf | 6.5 | 5.2 | -1.3 |
| Examiner's Report - B3 Paper 2 June 2024 Q1cii -16 Marks 2 (-) (16).pdf | 16.0 | 15.2 | -0.8 |
| Examiner's Report - B2 Paper 2 June 2024 Q1ci -16 Marks 1 (-) (10).pdf | 10.0 | 9.3 | -0.7 |
| Examiner's Report - B2 Paper 2 June 2024 Q1cii -16 Marks 2 (-) (8).pdf | 8.0 | 6.2 | -1.8 |
| Examiner's Report - B1 Paper 2 June 2024 Q1cii -16 Marks 1 (-) (5).pdf | 5.0 | 5.7 | +0.7 |
| Examiner's Report - B1 Paper 2 June 2024 Q1ci -16 Marks 1 (-) (11).pdf | 11.0 | 10.5 | -0.5 |
| Medicine June 2022 5 \| 16 Marks 2 (-) (16).pdf | 16.0 | 14.4 | -1.6 |
| Medicine June 2022 6 \| 16 Marks 2 (-) (15).pdf | 15.0 | 15.0 | +0.0 |
| Medicine June 2022 5 \| 16 Marks 1 (-) (7).pdf | 7.0 | 10.8 | +3.8 |

Can I see a graph to help me visualise this?

Absolutely.

First, here's a scatter graph to show you what a theoretical perfect correlation of 1 would look like:

[Figure: Perfect Correlation Graph]

Now, let's look at the real-life graph, drawn from the data above:

[Figure: Actual Correlation Graph for Edexcel GCSE History: 16 Mark Question]

On the horizontal axis, we have the mark given by the exam board. On the vertical, the mark given by Top Marks AI. Each dot is an essay -- its position tells us both the mark given by the exam board and the mark given by Top Marks AI. You can see how closely the real-life graph resembles the theoretical graph depicting perfect correlation.

Discover how Top Marks AI can revolutionise assessment in education. Contact us at info@topmarks.ai.