Instructors returning to high school and college classes this fall, take note: Grading your students’ work is about to get a lot easier.

A UC Berkeley professor and three former graduate students are putting the finishing touches on an artificial intelligence technology that groups answers and allows them to be graded en masse.

The AI-boosted capability, now wrapping up beta testing before becoming available this fall, will be the newest feature of the online grading application Gradescope. The team launched the app as a company two years ago, in part to stem cheating. Having a digital record of a graded paper makes it hard to alter written answers and argue the paper was incorrectly graded.

Gradescope has accumulated a sample of 10 million answers to around 100,000 questions asked in a wide range of college courses, and it has already shortened the grading process by 50 percent. That gain comes from a friendly interface and support for multiple teaching assistants grading papers in parallel.

The addition of AI promises to slash grading times by as much as 90 percent, said Sergey Karayev, a Gradescope co-founder who finished his Ph.D. in computer science in 2014. Fellow co-founder and Ph.D. recipient Arjun Singh is Gradescope’s CEO.

Highly Repeatable

The AI isn’t used to directly grade the papers; rather, it turns grading into an automated, highly repeatable exercise by learning to identify and group answers, and thus treat them as batches.

Using an interface similar to a photo manager, instructors ensure that the automatically suggested answer groups are correct, and then score each answer with a rubric. In this way, input from users lets the AI continually improve its future predictions.

“Traditionally, if you were to give a test to 100 students and they all write the correct answer, you would have to go through all 100 and mark them correct,” said Karayev. “With AI-assisted grading, you could grade one answer and it would apply to all 100 students.”
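The batching idea Karayev describes can be illustrated with a minimal Python sketch. It is not Gradescope's implementation: the learned grouping model is replaced here by a simple text normalization, and the function names, student ids and scores are hypothetical.

```python
from collections import defaultdict


def normalize(answer):
    # Hypothetical stand-in for the learned grouping step:
    # treat answers as identical after lowercasing and collapsing whitespace.
    return " ".join(answer.lower().split())


def group_answers(submissions):
    """Map each normalized answer to the list of student ids who gave it."""
    groups = defaultdict(list)
    for student, answer in submissions.items():
        groups[normalize(answer)].append(student)
    return dict(groups)


def apply_group_scores(groups, rubric_scores):
    """Propagate the single score given to each group to all of its students."""
    grades = {}
    for answer, students in groups.items():
        for student in students:
            grades[student] = rubric_scores[answer]
    return grades


# Four submissions collapse into two groups, so the instructor scores twice.
submissions = {"s1": "42", "s2": " 42 ", "s3": "41", "s4": "42"}
groups = group_answers(submissions)
rubric_scores = {"42": 1.0, "41": 0.0}  # one instructor decision per group
grades = apply_group_scores(groups, rubric_scores)
```

In this toy case the instructor makes two scoring decisions instead of four. With 100 identical correct answers, the same logic reduces the work to a single decision, as in the example Karayev gives.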

Karayev said the AI feature attempts to address three challenges: identifying question types, such as multiple choice, fill in the blank or written answers; distinguishing between different written marks, including when a student crosses out a multiple choice answer and chooses another; and, perhaps the toughest of the three, recognizing handwriting.

That last one required a recurrent neural network, trained on Tesla K40 and GeForce GTX 980 Ti GPUs, that takes in images and outputs words because, as Karayev pointed out, “there’s not a good handwriting recognition engine out there.”

GPUs and AI a Powerful Combination

The GPU-powered AI approach has proven highly effective, based on initial returns. In a blog post published on the Gradescope site in June, company co-founder Pieter Abbeel, an associate professor of electrical engineering and computer science at Berkeley’s AI Lab, said he used an early version of the company’s AI-boosted grading feature for a computer science final given to more than 600 students. The early version cut grading time by 75 percent.

Time savings on that scale should be welcome news for educators weary of the burden of grading.

“While crucial, it’s unfortunately one of the least fun instructional responsibilities,” Abbeel wrote. “Grading fairly and consistently without AI assistance can be extremely time-consuming.”

Down the line, the team intends to apply the machine learning approach it has used for handwriting recognition to group and grade complex diagrams, such as those drawn in chemistry and engineering courses. That capability would build on the extensive work researchers around the world have done using GPUs to train models to identify people, animals and objects in photos.

“There are networks able to recognize an image of a Dalmatian and distinguish it from images of beagles,” Karayev said. “It’s the same kind of thing for the diagrams people have to draw.”
