This year was a special year for many teachers in Austria. We were "blessed" with the new Matura (a final exam for 18-year-olds).

We did have a Matura before, but this year it took place on the same day across the whole country, and all schools got the same assignments.

I couldn't help but wonder whether you could analyze the exam texts and maybe find an algorithm that can explain or predict a student's grade based on the text itself. So I played around with a few things, and these are my findings:

Words per sentence

First I calculated the WPS (words per sentence) value of each exam text and compared it to the grade of the exam.

[Figure: Words per sentence graph]

Interesting! It seems like shorter sentences go along with better grades. Let's continue.
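For anyone who wants to try this on their own texts, here is a minimal sketch of how a WPS value could be computed. It assumes that sentences end with ".", "!" or "?" and that a word is any run of letters or digits; the function name `words_per_sentence` is made up for this example, and abbreviations such as "z.B." will be miscounted as sentence ends.

```python
import re


def words_per_sentence(text: str) -> float:
    """Average number of words per sentence (WPS) of a text."""
    # Assumption: a sentence ends at a run of '.', '!' or '?'.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not sentences:
        return 0.0
    # \w+ is Unicode-aware in Python 3, so umlauts and ß count as letters.
    word_count = sum(len(re.findall(r"\w+", s)) for s in sentences)
    return word_count / len(sentences)


print(words_per_sentence("Das ist ein Test. Er ist kurz."))  # 3.5
```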

Syllables per word

Next I analyzed the SPW (syllables per word). Words with more syllables tend to be more complex (in German at least).

[Figure: Syllables per word graph]

Again it seems like simpler words can result in better grades. Bear in mind that the texts with the lowest grade (5) are not necessarily the worst written; that grade could also be the result of textual or grammatical mistakes.
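Counting syllables exactly needs a hyphenation dictionary, so the following is only a rough sketch under one assumption: every run of consecutive vowels (including ä, ö, ü and y) counts as one syllable. That treats diphthongs like "au" correctly but undercounts words such as "Theater". The names `count_syllables` and `syllables_per_word` are made up for this example.

```python
import re

# Assumption: one syllable per run of consecutive vowels.
VOWEL_GROUP = re.compile(r"[aeiouyäöü]+")


def count_syllables(word: str) -> int:
    """Heuristic syllable count; every word has at least one syllable."""
    return max(1, len(VOWEL_GROUP.findall(word.lower())))


def syllables_per_word(text: str) -> float:
    """Average number of syllables per word (SPW) of a text."""
    # [^\W\d_]+ matches runs of letters only (no digits, no underscores).
    words = re.findall(r"[^\W\d_]+", text)
    if not words:
        return 0.0
    return sum(count_syllables(w) for w in words) / len(words)


print(count_syllables("Matura"))          # 3
print(syllables_per_word("Haus Schule"))  # 1.5
```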

Readability Index

The "Readability Index" is an algorithmic approach to analyzing the readability of a text. Texts with longer sentences and higher syllable counts are usually harder to read. The formula is different for every language, since the structure of sentences and words in German is very different from English.

But in all languages the Readability Index spits out a score for each text that states how readable the text is computed to be (higher scores mean easier to read).
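Which exact formula sits behind the index is not spelled out here. As an assumption, the sketch below uses the Flesch Reading Ease, which produces scores on a 0 to 100 scale like the one in this post: the standard English formula and Amstad's adaptation for German. Both take the WPS and SPW values as input; the function names are made up for this example.

```python
def flesch_reading_ease_en(wps: float, spw: float) -> float:
    """Flesch Reading Ease for English (assumed formula, see lead-in)."""
    return 206.835 - 1.015 * wps - 84.6 * spw


def flesch_reading_ease_de(wps: float, spw: float) -> float:
    """Amstad's German adaptation of the Flesch Reading Ease."""
    return 180.0 - wps - 58.5 * spw


# Same input, different scores: the German formula penalizes long words
# less, since German words have more syllables on average.
print(flesch_reading_ease_de(10.0, 1.5))  # 82.25
```

Longer sentences (higher WPS) and longer words (higher SPW) both pull the score down, which is why harder texts end up with lower numbers.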

Scores and their meaning

0 - 30: Very hard to read
30 - 50: Hard
50 - 60: Moderately hard
60 - 70: Moderate
80 - 90: Easy
90 - 100: Very easy

Students' scores

[Figure: Student performance in readability]

The texts with the best grades score in the "moderately hard" to "moderate" range. If a text is too complicated, the grade drops.

TL;DR

More simply written texts with shorter sentences and fewer syllables per word have a higher chance of getting a good grade in the Matura.